Conference Topic

A NEW DENSITY-BASED METHOD FOR REDUCING THE AMOUNT OF TRAINING DATA IN K-NN TEXT CLASSIFICATION

With the rapid development of the WWW, text classification has become a key technology for organizing and processing large amounts of text data. As a simple, effective, and nonparametric classification method, the k-NN method is widely used in text classification. However, the k-NN classifier not only has large computational demands, but its classification precision may also decrease because of the uneven density of the training data. In this paper, a new density-based method for reducing the amount of training data is presented, which not only reduces the computational demands of the k-NN classifier but also improves classification precision. The experiments show that the new method performs better than the traditional k-NN method.

Text classification; K-Nearest Neighbor; Density; Training data

FANG YUAN, LIU YANG, GE YU

College of Mathematics and Computer Science, Hebei University, Baoding, Hebei, 071002 China; College of Information Science and Engineering, Northeastern University, Shenyang, 110004 China

International conference

2007 International Conference on Machine Learning and Cybernetics (IEEE 6th International Conference on Machine Learning and Cybernetics)

Hong Kong

English

3372-3376

2007-08-19 (date first made available online on the Wanfang platform; not the paper's publication date)