A NEW DENSITY-BASED METHOD FOR REDUCING THE AMOUNT OF TRAINING DATA IN K-NN TEXT CLASSIFICATION

Abstract:

With the rapid development of the WWW, text classification has become a key technology for organizing and processing large amounts of text data. As a simple, effective, nonparametric classification method, k-NN is widely used in text classification. However, a k-NN classifier not only has large computational demands, but its classification precision may also decrease when the density of the training data is uneven. This paper presents a new density-based method for reducing the amount of training data, which both reduces the computational demands of the k-NN classifier and improves classification precision. Experiments show that the new method performs better than the traditional k-NN method.
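The abstract does not spell out the reduction algorithm, so the following is only a rough sketch of the general idea, not the authors' method. It assumes documents are sparse term-weight dictionaries compared by cosine similarity, and it uses a hypothetical thinning rule (`thin_dense_regions`, with made-up parameters `sim_threshold` and `max_neighbors`) that drops a training document when enough very similar documents of the same class have already been kept. All names and values here are illustrative assumptions.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def knn_predict(train, query, k=3):
    """Majority vote among the k training documents most similar to query."""
    ranked = sorted(train, key=lambda item: cosine(item[0], query), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def thin_dense_regions(train, sim_threshold=0.9, max_neighbors=1):
    """Hypothetical density-based reduction (an assumption, not the paper's rule):
    skip a document when max_neighbors or more already-kept documents of the
    same class lie within sim_threshold of it."""
    kept = []
    for vec, label in train:
        close = sum(
            1 for v, l in kept
            if l == label and cosine(v, vec) >= sim_threshold
        )
        if close < max_neighbors:
            kept.append((vec, label))
    return kept


# Toy training set: three near-duplicate "sport" documents form a dense region.
train = [
    ({"ball": 1.0, "team": 1.0}, "sport"),
    ({"ball": 1.0, "team": 0.9}, "sport"),
    ({"ball": 0.9, "team": 1.0}, "sport"),
    ({"goal": 1.0, "ball": 0.2}, "sport"),
    ({"stock": 1.0, "market": 1.0}, "finance"),
    ({"stock": 1.0, "bank": 1.0}, "finance"),
]

reduced = thin_dense_regions(train)
query = {"ball": 1.0, "goal": 0.5}
prediction = knn_predict(reduced, query, k=3)
```

In this toy run the dense cluster of near-duplicate documents is collapsed to a single representative, so the classifier compares each query against fewer training documents. Whether precision improves, as the paper reports for its own method, would depend on the actual density criterion and data.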

Keywords: Text classification; K-Nearest Neighbor; Density; Training data

Authors: FANG YUAN; LIU YANG; GE YU

Affiliations: College of Mathematics and Computer Science, Hebei University, Baoding, Hebei, 071002 China; College of Information Science and Engineering, Northeastern University, Shenyang, 110004 China

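Conference type: International conference

Conference name: 2007 International Conference on Machine Learning and Cybernetics (6th IEEE International Conference on Machine Learning and Cybernetics)

Conference location: Hong Kong

Conference language: English

Pages: 3372-3376

Online publication date: 2007-08-19 (date first posted on the Wanfang platform; does not represent the paper's publication date)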
