Memory Effect in DBSCAN Algorithm

Abstract:

As a density-based clustering algorithm, DBSCAN plays an important role in data mining. However, DBSCAN is computationally expensive, which limits its performance on large-scale data sets, especially high-dimensional ones. The high complexity stems from region queries, a very common operation in density-based algorithms, which bring the complexity of these algorithms to O(n^2), where n is the number of database objects. With the help of an index structure the complexity can be reduced to O(n log n); however, building the index structure is itself inefficient, especially for high-dimensional data sets or large-scale databases. In this paper we propose a new concept named memory effect (ME). ME can be used to shrink the scope of region queries to neighboring objects. Based on ME we improve the DBSCAN algorithm, and empirical experiments show gains in both effectiveness and efficiency. Finally, we give a theoretical analysis of the resulting MEDBSCAN algorithm and discuss the influence of its parameters.
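To make the O(n^2) claim concrete, the following is a minimal sketch of an unindexed region query. It is not taken from the paper and does not implement ME or MEDBSCAN; the names `region_query`, `count_distance_evaluations`, and `eps` are illustrative assumptions. Each query compares one object against all n objects, and a DBSCAN-style pass issues one query per object, giving n * n distance evaluations.

```python
import math


def region_query(points, i, eps):
    """Return indices of all points within eps of points[i] (i itself included).

    Hypothetical brute-force helper: it scans every point, so one call
    costs len(points) distance evaluations.
    """
    neighbors = []
    for j, q in enumerate(points):
        if math.dist(points[i], q) <= eps:
            neighbors.append(j)
    return neighbors


def count_distance_evaluations(points, eps):
    """Issue one region query per point and total the distance evaluations."""
    evaluations = 0
    for i in range(len(points)):
        region_query(points, i, eps)
        evaluations += len(points)  # each query scans all n points
    return evaluations


# Small illustrative data set (assumed, not from the paper).
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0)]
```

With these four points, `count_distance_evaluations(points, 1.0)` performs 4 * 4 = 16 distance evaluations, which is the quadratic growth the abstract attributes to region queries.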

Keywords: Density-based clustering; DBSCAN algorithm; MEDBSCAN algorithm; Memory effect

Authors: LI Jian, YU Wei, YAN Bao-Ping

Affiliations: Computer Network Information Center, Chinese Academy of Sciences, Beijing, China; Software College, Bei Hang University, Beijing, China

Conference type: International conference

Conference name: 2009 4th International Conference on Computer Science & Education

Conference location: Nanjing

Conference language: English

Pages: 31-36

Online publication date: 2009-07-25 (the date the record first went online on the Wanfang platform; not the paper's publication date)
