Conference Special Topic

An Improved KNN Text Categorization on Skew Sort Condition

KNN is one of the most frequently used methods for text categorization. High feature dimensionality and a skewed class distribution both degrade the classifier's performance. This paper introduces an improved KNN for skewed class distributions, addressing the problem that large classes, which contain more texts, are more likely to be selected during the K-nearest-neighbor search. First, text features are selected with an improved information gain method that makes more effective use of the class distribution information in the training set. Then an improved, class-based KNN classifier is used for categorization, which counters the bias toward large classes in the training set. Experiments show that this method improves KNN classification performance.
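The abstract does not spell out how the classifier is modified, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes cosine similarity over sparse term-weight vectors and assumes that each neighbor's vote is scaled by the inverse of its class size in the training set. The function names, the `balance` flag, and the toy data are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def cosine(a, b):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def knn_classify(query, train, k=3, balance=True):
    """Classify `query` from `train`, a list of (vector, label) pairs.

    With balance=True each neighbor's similarity vote is divided by the
    size of its class in the training set (an assumed correction, not
    taken from the paper). With balance=False this is plain
    similarity-weighted KNN.
    """
    class_size = Counter(label for _, label in train)
    # Keep the k training texts most similar to the query.
    neighbors = sorted(
        train, key=lambda item: cosine(query, item[0]), reverse=True
    )[:k]
    scores = defaultdict(float)
    for vec, label in neighbors:
        weight = 1.0 / class_size[label] if balance else 1.0
        scores[label] += weight * cosine(query, vec)
    return max(scores, key=scores.get)


# Toy skewed training set: four "sports" texts, one "finance" text.
train = [
    ({"game": 1, "market": 1}, "sports"),
    ({"team": 1, "market": 1}, "sports"),
    ({"game": 1, "team": 1, "market": 1}, "sports"),
    ({"game": 1, "team": 1}, "sports"),
    ({"market": 1, "stock": 1}, "finance"),
]
query = {"market": 1, "stock": 1}

plain_label = knn_classify(query, train, k=4, balance=False)
balanced_label = knn_classify(query, train, k=4, balance=True)
```

In this toy set the query matches the single "finance" text exactly, yet three weaker "sports" neighbors outvote it under plain KNN; dividing each vote by class size removes that advantage.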

KNN; feature reduction; feature selection; text categorization

Liu Haifeng; Liu Shousheng; Su Zhan

Institute of Sciences, PLA University of Science and Technology, Nanjing, China

International conference

The 2010 International Conference on Computer Application and System Modeling (ICCASM 2010)

Taiyuan

English

182-186

2010-10-22 (date first posted online on the Wanfang platform; not the paper's publication date)