Conference Topic

Using DBSCAN Clustering Algorithm in Spam Identifying

Anti-spam mechanisms are currently a focus of Internet research, and spam identification plays an important role in them. Identifying spam efficiently usually requires the ability to identify similar emails, i.e., spam clustering. With existing clustering methods, many similar emails end up split across several groups. To improve the accuracy of spam identification, we present a new clustering method based on the DBSCAN clustering algorithm and the nilsimsa digest algorithm. With this method, all emails that were manually judged to be similar are clustered together. The simulation results show that the clustering method based on DBSCAN and nilsimsa achieves higher clustering accuracy than the other clustering methods. From the simulation results, we also conclude that the shape of the spam digest subspace is irregular.
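The idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes each email has already been reduced to a 256-bit nilsimsa-style digest held as a Python integer, that distance is the Hamming distance between digests, and that `eps` and `min_pts` are hypothetical values chosen only for the toy data. The names `hamming`, `region_query`, and `dbscan` are illustrative.

```python
from typing import List, Optional

NOISE = -1  # label for points that belong to no cluster


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two digests stored as integers."""
    return bin(a ^ b).count("1")


def region_query(digests: List[int], i: int, eps: int) -> List[int]:
    """Indices of all digests within Hamming distance eps of digest i (i included)."""
    return [j for j, d in enumerate(digests) if hamming(digests[i], d) <= eps]


def dbscan(digests: List[int], eps: int, min_pts: int) -> List[int]:
    """Plain DBSCAN over digests; returns one cluster label per digest."""
    labels: List[Optional[int]] = [None] * len(digests)
    cluster = -1
    for i in range(len(digests)):
        if labels[i] is not None:
            continue  # already assigned or marked as noise
        neighbors = region_query(digests, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: reachable but not a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(digests, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point, keep expanding
    return [NOISE if label is None else label for label in labels]


# Toy digests (assumed data): a chain of three near-duplicates and one outlier.
digests = [0, (1 << 10) - 1, (1 << 20) - 1, (1 << 256) - 1]
labels = dbscan(digests, eps=12, min_pts=2)
print(labels)
```

In this toy data the first and third digests differ by 20 bits, more than `eps`, yet they land in the same cluster because the second digest links them. Density-based chaining of this kind lets a cluster take an elongated, irregular shape, which is one plausible reason a density-based method would suit a digest space that the abstract describes as irregular.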

DBSCAN; cluster; nilsimsa; spam

Wu Ying, Yang Kai, Zhang Jianzhong

Department of Computer Science, Nankai University, Tianjin, P.R. China

International conference

2010 2nd International Conference on Education Technology and Computer (2nd IEEE International Conference on Education Technology and Computer, ICETC 2010)

Shanghai

English

398-402

2010-06-22 (date first posted online on the Wanfang platform; not the paper's publication date)