A K-Nearest Neighbor Algorithm Based on Cluster in Text Classification

Abstract:

The K-Nearest Neighbor (K-NN) algorithm is an important approach to automatic text classification. In this paper, clustering is applied to overcome the disadvantages of the traditional K-NN algorithm. First, the training set is clustered with an improved K-means approach, and the most representative samples are selected as cluster centers. Then the similarity between each testing sample and the central vector of each cluster is computed. On this basis, a cluster-based K-NN algorithm is presented. The experimental results verify that this classification algorithm is much faster than the traditional K-NN algorithm and that it can raise accuracy.
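The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the "improved K-means" step, so plain K-means run separately within each class stands in for it, cosine similarity is assumed as the similarity measure, and all function names, parameters, and the toy term-frequency vectors are hypothetical.

```python
import math
import random
from collections import Counter


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def kmeans(vectors, n_clusters, n_iter=20, seed=0):
    """Plain K-means using cosine similarity (assumed stand-in for the
    paper's unspecified improved variant). Returns the centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(vectors, min(n_clusters, len(vectors)))
    for _ in range(n_iter):
        groups = [[] for _ in centroids]
        for v in vectors:
            best = max(range(len(centroids)), key=lambda i: cosine(v, centroids[i]))
            groups[best].append(v)
        # An empty cluster keeps its previous centroid.
        centroids = [mean_vector(g) if g else c for g, c in zip(groups, centroids)]
    return centroids


def fit_cluster_centres(train_vectors, train_labels, clusters_per_class=2):
    """Cluster each class separately; return a list of (centroid, label) pairs."""
    by_class = {}
    for v, y in zip(train_vectors, train_labels):
        by_class.setdefault(y, []).append(v)
    centres = []
    for y, vs in by_class.items():
        centres.extend((c, y) for c in kmeans(vs, clusters_per_class))
    return centres


def classify(x, centres, k=3):
    """K-NN over cluster centres instead of over all training samples."""
    ranked = sorted(centres, key=lambda cy: cosine(x, cy[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy term-frequency vectors, made up for illustration only.
train_vectors = [[3, 0, 1], [2, 0, 0], [4, 1, 0], [0, 3, 1], [0, 2, 0], [1, 4, 0]]
train_labels = ["sports", "sports", "sports", "finance", "finance", "finance"]
centres = fit_cluster_centres(train_vectors, train_labels)
print(classify([5, 0, 1], centres))
```

Each test sample is compared against a handful of cluster centres rather than against every training document, which is the source of the speed-up claimed in the abstract. Whether accuracy also improves depends on how well the centres represent their classes.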

Keywords: text classification; K-Nearest Neighbor; cluster

Authors: Chun-Yan WANG, Kuo Zhang, Yu-Guang YAN, Jian-Gang Li

Affiliations: Department of Computer Science and Technology, Changchun Normal College, Changchun, China; Institute of Mechanical Science and Engineering, Jilin University, Changchun, China; Department of Computer Science and Technology, Changchun Normal College, Changchun, China

Conference type: International conference

Conference name: 2010 International Conference on Computer, Mechatronics, Control and Electronic Engineering (CMCE 2010)

Conference location: Changchun

Conference language: English

Pages: 225-228

Online publication date: 2010-08-24 (date first posted on the Wanfang platform; not the paper's publication date)
