A NEW DENSITY-BASED METHOD FOR REDUCING THE AMOUNT OF TRAINING DATA IN K-NN TEXT CLASSIFICATION
With the rapid development of the WWW, text classification has become a key technology for organizing and processing large amounts of text data. As a simple, effective, nonparametric classification method, k-NN is widely used in text classification. However, the k-NN classifier not only has large computational demands, but its classification precision may also decrease when the density of the training data is uneven. In this paper, a new density-based method for reducing the amount of training data is presented, which not only reduces the computational demands of the k-NN classifier but also improves classification precision. Experiments show that the new method performs better than the traditional k-NN method.
Text classification; K-Nearest Neighbor; Density; Training data
FANG YUAN; LIU YANG; GE YU
College of Mathematics and Computer Science, Hebei University, Baoding, Hebei, 071002 China; College of Information Science and Engineering, Northeastern University, Shenyang, 110004 China
International conference
2007 International Conference on Machine Learning and Cybernetics (IEEE 6th International Conference on Machine Learning and Cybernetics)
Hong Kong
English
3372-3376
2007-08-19 (date first posted online on the Wanfang platform; does not represent the paper's publication date)