Mining High-Quality Clusters in Pattern-based Clustering
Pattern-based clustering, which captures the similarity of the patterns exhibited by objects in a subset of dimensions, has broad applications in DNA microarray data analysis, customer segmentation, e-business data analysis, and other areas. However, pattern-based clustering often returns a large number of highly overlapping clusters, which makes it hard for users to identify interesting patterns among the huge volume of mining results. Moreover, there is no general measure for evaluating the quality of the clusters that pattern-based clustering produces. In this paper, we discuss the factors that cause high overlap, perform error analysis and pattern weighting, and propose qScore as a key parameter for evaluating cluster quality. An algorithm based on qScore is presented to solve the problem of high overlap and obtain better-quality clustering results.
Pattern-based clustering; pattern similarity; error analysis; qScore; highly overlapping clusters
Qian Ma; Jingfeng Guo
Modern Education Technology Management Center, HengShui College, HengShui, Hebei, China; The College of Information Science and Engineering, YanShan University, Qin Huangdao, Hebei, China
International conference
Shanghai
English
1200-1204
2011-07-26 (date first made available online on the Wanfang platform; not the paper's publication date)