Sample Clustering for Fast Classification by Using the Mean Shift Procedure

Abstract:

Most classification methods, such as artificial neural networks (ANNs) and support vector machines (SVMs), are limited by speed, particularly when the training data set is large. In this article, we explore the possibility of using the Mean Shift algorithm, a mode-seeking procedure that estimates the gradient of the data density, to decrease the sample size. We found that when a large number of samples is to be trained, most samples can be clustered into a small number of mode centroids (local maxima of the density); the original samples can therefore be reduced by using the results of the Mean Shift procedure. To verify the validity of this method, several classifiers, including linear discriminant analysis (LDA), k-nearest neighbor (kNN), and SVMs, were tested. Experimental results show that, when the parameters are selected appropriately, the proposed method can reduce the computational complexity of these classification methods with minimal effect on classification accuracy.
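The sample-reduction idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the flat (uniform) kernel, running Mean Shift separately on each class, the mode-merging radius of half the bandwidth, and the names `shift_point`, `find_modes`, and `reduce_samples` are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def shift_point(point, data, bandwidth):
    """One mean shift step with a flat kernel (an assumption):
    move the point to the mean of all data within `bandwidth` of it."""
    neighbors = [q for q in data if math.dist(point, q) <= bandwidth]
    if not neighbors:  # defensive guard against floating-point edge cases
        return point
    dim = len(point)
    return tuple(sum(q[d] for q in neighbors) / len(neighbors) for d in range(dim))


def find_modes(data, bandwidth, max_iter=100, tol=1e-6):
    """Run mean shift from every sample and collect the distinct modes."""
    modes = []
    for start in data:
        p = tuple(start)
        for _ in range(max_iter):
            nxt = shift_point(p, data, bandwidth)
            moved = math.dist(p, nxt)
            p = nxt
            if moved < tol:
                break
        # Merge converged points closer than half the bandwidth
        # (hypothetical merging rule, not taken from the paper).
        if not any(math.dist(p, m) <= bandwidth / 2 for m in modes):
            modes.append(p)
    return modes


def reduce_samples(samples, labels, bandwidth):
    """Replace each class's samples by that class's mode centroids.
    Per-class clustering is an assumption of this sketch."""
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(tuple(x))
    reduced_x, reduced_y = [], []
    for y, pts in by_class.items():
        for mode in find_modes(pts, bandwidth):
            reduced_x.append(mode)
            reduced_y.append(y)
    return reduced_x, reduced_y
```

The reduced set returned by `reduce_samples` would then be passed to a classifier such as kNN or an SVM in place of the full training set. The bandwidth controls how aggressively samples are merged, which is presumably one of the parameters the abstract says must be selected appropriately.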

Keywords: sample reduction; mean shift; classification methods; mode seeking; sample selection

Authors: Liang Lie-quan, Liang Ying-hong

Affiliation: Guangdong Provincial Key Lab of E-Commerce Marketing Application Technology, Guangdong Commerce College, Guangzhou 510320, P.R. China

Conference type: International conference

Conference name: Second International Symposium on Electronic Commerce and Security (ISECS 2009)

Conference location: Nanchang

Conference language: English

Pages: 835-839

Online publication date: 2009-05-22 (date first posted on the Wanfang platform; not the paper's publication date)
