Star-Scan: A Stable Clustering by Statistically Finding Centers and Noises
In this paper, we present a new clustering algorithm called Star-Scan (A Stable Clustering by Statistically Finding Centers and Noises). Star-Scan is a density-based clustering algorithm that can find clusters of arbitrary shape and is robust to noise in the dataset. It borrows from Rodriguez's Clustering by Fast Search and Find of Density Peaks (CFSFDP) the idea that cluster centers are points with both a higher density than their neighbors and a larger distance to other centers. Unlike CFSFDP, which relies on manual selection, Star-Scan uses a statistical method, the box plot, to select cluster centers automatically. Furthermore, because the selection of cluster centers in CFSFDP can be inadequate, we apply a merging post-process to the produced clusters to obtain stable and correct results. Finally, we also use the box plot to filter out noise in each of the final clusters, which addresses the over-filtering problem in CFSFDP. We demonstrate the good performance of Star-Scan on several synthetic datasets.
Density-based clustering; Box plot; Statistics
Nan Yang, Qing Liu, Yaping Li, Lin Xiao, Xiaoqing Liu
Information School, Renmin University of China, No. 59, Zhongguancun Street, Haidian District, Beijing, China
International conference
18th International Asia-Pacific Web Conference
Suzhou
English
456-467
2016-09-23 (date first posted online on the Wanfang platform; not the paper's publication date)
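
The abstract describes selecting cluster centers automatically with a box plot instead of by hand. The record gives no details of the actual procedure, so the following is only a minimal sketch under stated assumptions: density `rho` is computed with a Gaussian kernel and cutoff `dc`, `delta` is the distance to the nearest point of higher density (as in CFSFDP), and a point is taken as a candidate center when the product `gamma = rho * delta` lies above the usual box-plot upper fence `Q3 + 1.5 * IQR`. The function names, the choice of `gamma`, and the fence multiplier `k` are illustrative and are not taken from the paper.

```python
import numpy as np

def density_and_delta(X, dc):
    """Return (rho, delta) for each row of X, in the spirit of CFSFDP."""
    # Pairwise Euclidean distances between all points.
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    # Gaussian-kernel local density; subtract 1 to drop each point's self term.
    rho = np.exp(-(d / dc) ** 2).sum(axis=1) - 1.0
    delta = np.empty(len(X))
    for i in range(len(X)):
        higher = rho > rho[i]
        # Distance to the nearest denser point; the densest point gets
        # its largest distance to any other point instead.
        delta[i] = d[i, higher].min() if higher.any() else d[i].max()
    return rho, delta

def boxplot_upper_fence(values, k=1.5):
    """Upper whisker limit of a box plot: Q3 + k * IQR."""
    q1, q3 = np.percentile(values, [25, 75])
    return q3 + k * (q3 - q1)

def select_centers(X, dc, k=1.5):
    """Indices of points whose gamma = rho * delta is a box-plot outlier.

    Assumption: using gamma as the score is an illustrative choice.
    """
    rho, delta = density_and_delta(X, dc)
    gamma = rho * delta
    return np.flatnonzero(gamma > boxplot_upper_fence(gamma, k))

# Two well-separated synthetic blobs of 50 points each.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.5, size=(50, 2)),
               rng.normal(10.0, 0.5, size=(50, 2))])
centers = select_centers(X, dc=0.5)
print(centers)
```

A rule of this kind may flag more than one candidate per cluster, which is consistent with the abstract's remark that a merging post-process is applied to the produced clusters.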