Implication Intensity: Randomized F-measure for Cluster Evaluation
The ever-growing volume of information and services on the World Wide Web has given a welcome boost to research in information retrieval. Text clustering groups a set of documents into subsets, or clusters, so that large sets of retrieved documents can be browsed selectively and efficiently. Many cluster validation measures, such as the F-measure, have been introduced to evaluate clustering quality. In this paper, however, we demonstrate that the widely adopted F-measure suffers from the so-called increment effect, which may mislead the comparison of clustering results with different numbers of clusters. To meet this challenge, we propose a novel “implication intensity” (IMI) measure based on the F-measure and a random clustering perspective. Experimental results on real-world data sets demonstrate that IMI alleviates the increment effect introduced by the F-measure.
cluster evaluation; increment effect; F-measure; implication intensity
Limin Li; Junjie Wu; Shiwei Zhu
School of Economics and Management, Beihang University, Beijing 100083, China
International conference
2009 6th International Conference on Service Systems and Service Management
Xiamen
English
510-515
2009-06-08 (date first posted online on the Wanfang platform; not the paper's publication date)