Conference Proceedings

International Conference on Advances in Computer Science and Electronics Engineering - CSEE 2012

Clustering of Text Document using Kea-means with F-measure

Author(s) : M.J.YEOLA

Abstract

Document clustering is the unsupervised grouping of text documents into meaningful groups, usually representing topics in the document collection. It is one way to organize information without requiring prior knowledge about the classification of documents. The well-known K-means clustering algorithm requires users to specify the number of clusters in advance. However, if the pre-specified number of clusters is changed, the precision of each result also changes. To solve this problem, this paper proposes a new clustering algorithm based on the Kea keyphrase extraction algorithm. As in K-means, documents are grouped into several clusters, but the number of clusters is determined automatically from the similarities between documents and the extracted keyphrases. The algorithm also computes an F-measure value from precision and recall, which yields better clusters.
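The abstract does not state the exact F-measure formula used in the paper. As a minimal sketch, assuming the common class-weighted clustering F-measure (for each reference class, take the best F score over all clusters, then weight by class size), the computation from precision and recall could look as follows. The function names `f_measure` and `clustering_f_measure` and the sample labels are hypothetical and chosen only for illustration.

```python
from collections import Counter


def f_measure(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def clustering_f_measure(true_labels, cluster_ids):
    """Class-weighted F-measure of a clustering against reference labels.

    Assumption: this is the widely used formulation, not necessarily
    the one in the paper. For class i and cluster j:
        precision = n_ij / |cluster j|
        recall    = n_ij / |class i|
    where n_ij is the number of documents of class i in cluster j.
    """
    n = len(true_labels)
    class_sizes = Counter(true_labels)
    cluster_sizes = Counter(cluster_ids)
    overlap = Counter(zip(true_labels, cluster_ids))

    total = 0.0
    for cls, cls_size in class_sizes.items():
        best = 0.0
        for clu, clu_size in cluster_sizes.items():
            n_ij = overlap[(cls, clu)]
            best = max(best, f_measure(n_ij / clu_size, n_ij / cls_size))
        # Weight the best score for this class by its share of documents.
        total += (cls_size / n) * best
    return total


# Hypothetical example: four documents, two reference topics.
topics = ["sports", "sports", "politics", "politics"]
perfect = clustering_f_measure(topics, [0, 0, 1, 1])
merged = clustering_f_measure(topics, [0, 0, 0, 0])
```

Under this formulation, a clustering that matches the reference topics exactly scores 1.0, while merging everything into one cluster lowers precision and therefore the score. Such a score can be compared across different cluster counts.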

Conference Title : International Conference on Advances in Computer Science and Electronics Engineering - CSEE 2012
Conference Date(s) : February 2-3, 2012
Place : VITS - Luxury Business Hotel, Delhi-NCR, India
No. of Author(s) : 1
DOI : 10.15224/978-981-07-1403-1-185
Page(s) : 6 - 9
Electronic ISBN : 978-981-07-1403-1