IJCATR Volume 3 Issue 3

Re-enactment of Newspaper Articles

Thilagavathi N., Archanaa S.R., Lavanya K., Valarmathi S.
10.7753/IJCATR0303.1007
Keywords: Page segmentation, TF-IDF weighting, Cosine similarity, Clustering, K-Means algorithm, Keyword extraction

Documents are increasingly digitized, which makes them easier to store, retrieve, and protect, and provides a backup for paper records. Digitization grows in importance as more work goes paper-free, and digitizing newspapers contributes greatly to the preservation of, and access to, newspaper archives. This paper presents an integrated mechanism that combines document image analysis with the K-means clustering algorithm to digitize news articles and to retrieve efficiently the article a user requests. In the first stage, each news article is segmented from the newspaper page and pre-processed. In the second stage, the pre-processed articles are clustered with the K-means algorithm, and keywords are extracted for each cluster. In the third stage, the cluster containing the key phrase supplied by the user is selected, and the requested news article is returned to the user.
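The second and third stages can be illustrated with a minimal sketch, assuming the articles have already been segmented and converted to plain text. It is not the authors' implementation: the smoothed IDF formula, the cosine-based (spherical) variant of K-means, the fixed seed articles, and all function names and sample texts are assumptions made for illustration only.

```python
import math
import re
from collections import Counter

def tokenize(text):
    # Lower-case the text and keep alphabetic words only.
    return re.findall(r"[a-z]+", text.lower())

def normalise(vec):
    # Scale a vector to unit length; a zero vector is returned unchanged.
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def tfidf_vectors(docs):
    # Build unit-length TF-IDF vectors over a shared, sorted vocabulary.
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({w for toks in tokenized for w in toks})
    n = len(docs)
    df = Counter(w for toks in tokenized for w in set(toks))
    # Smoothed IDF (an assumption; the abstract does not give a formula).
    idf = {w: math.log((1 + n) / (1 + df[w])) + 1.0 for w in vocab}
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append(normalise([tf[w] * idf[w] for w in vocab]))
    return vocab, vectors

def cosine(a, b):
    # For unit-length vectors the dot product equals the cosine similarity.
    return sum(x * y for x, y in zip(a, b))

def kmeans(vectors, k, seeds, iterations=20):
    # K-means with cosine similarity; seeds are indices of initial centroids.
    centroids = [vectors[i][:] for i in seeds]
    labels = None
    for _ in range(iterations):
        new_labels = [
            max(range(k), key=lambda c: cosine(v, centroids[c])) for v in vectors
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                mean = [sum(col) / len(members) for col in zip(*members)]
                centroids[c] = normalise(mean)
    return labels, centroids

def top_keywords(centroid, vocab, n=3):
    # The highest-weighted centroid terms serve as the cluster's keywords.
    ranked = sorted(zip(centroid, vocab), key=lambda p: (-p[0], p[1]))
    return [w for weight, w in ranked[:n] if weight > 0]

def select_cluster(query, vocab, centroids):
    # Pick the cluster whose centroid is most similar to the key phrase.
    words = set(tokenize(query))
    q = normalise([1.0 if w in words else 0.0 for w in vocab])
    return max(range(len(centroids)), key=lambda c: cosine(q, centroids[c]))

# Hypothetical pre-processed article texts.
docs = [
    "election vote parliament minister",
    "minister election campaign vote",
    "cricket match team score",
    "team score football match",
]
vocab, vectors = tfidf_vectors(docs)
labels, centroids = kmeans(vectors, k=2, seeds=(0, 2))
chosen = select_cluster("cricket score", vocab, centroids)
matching = [d for d, lab in zip(docs, labels) if lab == chosen]
print(top_keywords(centroids[chosen], vocab), matching)
```

Normalising every vector and centroid to unit length lets a plain dot product stand in for cosine similarity, which keeps the assignment step short. A real system would choose the number of clusters and the initial centroids from the data rather than fixing them by hand.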
@article{t332014ijcatr03031007,
Title = "Re-enactment of Newspaper Articles",
Journal = "International Journal of Computer Applications Technology and Research (IJCATR)",
Volume = "3",
Issue = "3",
Pages = "165--168",
Year = "2014",
Author = "{Thilagavathi N.} and {Archanaa S. R.} and {Lavanya K.} and {Valarmathi S.}"}