A step towards Interactive Document Clustering
Published in
Volume: 3
Issue: 4
Pages: 305 - 309
Document clustering has been implemented in innovative ways, but to date it has made little use of the data and information that can be extracted from the World Wide Web. Most research depends on static databases such as the well-investigated Reuters-21578 news corpus for news article categorization. This paper presents the use of a dynamic news database that is obtained with a web crawler and updated daily. Clustering is performed on the fly, with categories supplied as input by the user. Preprocessing techniques are used to reduce the volume of text being analyzed and to avoid misleading results. TF-IDF is then applied to understand the context of the articles and categorize them further.
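The pipeline described in the abstract (preprocessing, TF-IDF weighting, user-supplied categories) can be illustrated with a minimal sketch. It is not the authors' implementation: the stop-word list, the function names, the sample headlines, and the rule that assigns an article to the category whose name carries the highest TF-IDF weight are all assumptions made for illustration, and the web crawler is replaced by a hard-coded list of texts.

```python
import math
import re
from collections import Counter

# Assumed, deliberately tiny stop-word list; a real system would use a fuller one.
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "on", "to", "for", "is", "was"}


def preprocess(text):
    """Lowercase, keep alphabetic tokens only, and drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    tokenized = [preprocess(d) for d in docs]
    n_docs = len(tokenized)
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))  # count each term once per document
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # guard against empty documents
        vectors.append(
            {t: (c / total) * math.log(n_docs / doc_freq[t]) for t, c in counts.items()}
        )
    return vectors


def assign_category(vector, categories):
    """Hypothetical rule: pick the category whose name terms weigh most in the article."""
    scores = {
        cat: sum(vector.get(term, 0.0) for term in preprocess(cat))
        for cat in categories
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None  # None means no category matched


# Stand-in for crawled articles (made-up headlines, not data from the paper).
articles = [
    "The team won the football match and sports fans cheered.",
    "Parliament passed the budget and politics dominated the news.",
    "The sports league announced a new season schedule.",
]
user_categories = ["sports", "politics"]  # categories supplied by the user

labels = [assign_category(v, user_categories) for v in tfidf_vectors(articles)]
```

Because the inverse document frequency is recomputed from whatever articles are passed in, the same functions can be rerun each time the crawled database is refreshed. A term that occurs in every article receives a weight of zero under this formula, so a category name that appears everywhere would not discriminate between articles.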
About the journal
Journal: International Journal of Emerging Technologies and Innovative Research (www.jetir.org)
Open Access: 0