A Topic Detection Approach Based on Multi-level Clustering
Text clustering is the main approach to topic detection. The major shortcoming of current algorithms is their high computational complexity and long running time when the number of instances is large. We introduce a new algorithm that clusters the text corpus in two steps: in the C-process we divide the corpus into overlapping subsets using Canopy clustering; in the K-process we apply the X-means algorithm to generate rough clusters from the canopies that share common instances. Experiments show that this text clustering technique reveals the true number of clusters in the corpus and runs faster than the Single-pass and K-means clustering algorithms.
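The C-process described in the abstract can be illustrated with a minimal sketch of canopy clustering, followed by one possible reading of "canopies which share common instances" as groups of overlapping canopies. This is not the authors' implementation: the cosine distance, the thresholds `t1` and `t2`, the deterministic choice of centers, and the helper names `canopy_clustering` and `merge_overlapping` are all assumptions made for illustration. The K-process (X-means within each group) is not shown.

```python
from math import sqrt


def cosine_distance(a, b):
    """Cosine distance between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # treat an all-zero vector as maximally distant
    return 1.0 - dot / (norm_a * norm_b)


def canopy_clustering(points, t1, t2, distance=cosine_distance):
    """Return a list of (center_index, member_indices) canopies.

    Assumed variant: t1 is the loose threshold and t2 the tight one
    (t1 >= t2). Members are drawn from all points, so canopies may overlap.
    """
    if t1 < t2:
        raise ValueError("t1 (loose) must be >= t2 (tight)")
    remaining = list(range(len(points)))  # candidate canopy centers
    canopies = []
    while remaining:
        center = remaining[0]  # deterministic pick; random is also common
        members = [
            i for i in range(len(points))
            if i == center or distance(points[center], points[i]) <= t1
        ]
        canopies.append((center, members))
        # Points tightly bound to this center can no longer become centers.
        remaining = [
            i for i in remaining
            if i != center and distance(points[center], points[i]) > t2
        ]
    return canopies


def merge_overlapping(canopies):
    """Group canopies that share at least one instance (transitively)."""
    groups = []  # invariant: the sets in `groups` are pairwise disjoint
    for _, members in canopies:
        current = set(members)
        kept = []
        for group in groups:
            if group & current:
                current |= group  # absorb every group that overlaps
            else:
                kept.append(group)
        kept.append(current)
        groups = kept
    return [sorted(group) for group in groups]
```

In a pipeline of this shape, each merged group would then be handed to a second-stage clusterer, so the expensive step only ever sees a subset of the corpus rather than all instances at once.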
topic detection; multi-level; canopy clustering; K-means clustering
Yang Song; Junping Du; Lisha Hou
Beijing Key Lab of Intelligent Telecommunication Software and Multimedia, School of Computer Science
International conference
The 31st Chinese Control Conference (第三十一届中国控制会议)
Hefei
English
3834-3838
2012-07-01 (date first made available online on the Wanfang platform; not the paper's publication date)