Topical Concept Based Text Clustering Method
Text clustering typically involves clustering in a high-dimensional space, which is difficult in virtually all practical settings. In addition, given a particular clustering result, it is typically very hard to explain why the text clusters were constructed the way they are. To address these problems, this paper proposes a topic-concept-based method for Chinese document clustering called Document Features Indexing Clustering (DFIC), which can identify topics accurately and cluster documents according to these topics. In DFIC, topic elements are defined and extracted for indexing base clusters. Additionally, document features are investigated and exploited. Experimental results show that DFIC achieves a higher precision (92.76%) than some widely used traditional clustering methods.
document clustering; clusters indexing; topical concept
Yi Ding, Xian Fu
College of Computer Science and Technology, Hubei Normal University, Huangshi, China
International conference
Xi'an
English
939-943
2012-08-24 (date first posted online on the Wanfang platform; not the paper's publication date)