Conference topic

An improved method of term weighting for text classification

In text classification, term weighting methods assign appropriate weights to the given terms to improve classification performance. Traditional term weighting algorithms consider only factors such as tf (term frequency) and idf (inverse document frequency). This approach simply treats low-frequency terms as important and high-frequency terms as unimportant, so it often assigns higher weights to rare terms. In this paper, we present an effective term weighting approach that avoids this deficiency of the traditional approach, and we use kNN classifiers to evaluate it on the widely used benchmark data set Reuters-21578. The experimental results show that the new approach can improve classification accuracy.
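The traditional scheme that the abstract criticizes can be illustrated with a minimal sketch. This is plain textbook tf-idf, not the paper's improved method, which this record does not describe. The function name `tf_idf` and the toy documents are assumptions made for illustration only.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Textbook tf-idf: weight = (count / doc length) * log(N / df).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append(
            {
                term: (count / len(doc)) * math.log(n_docs / df[term])
                for term, count in counts.items()
            }
        )
    return weights


# Hypothetical toy corpus: "oil" occurs in every document, "rise" in one.
docs = [
    ["oil", "price", "rise"],
    ["oil", "export", "fall"],
    ["oil", "price", "cartel"],
]
w = tf_idf(docs)
```

In this toy corpus the term that appears in every document ("oil") receives weight zero, while the term that appears in a single document ("rise") receives the highest weight in the first document. This is the bias toward rare terms that the abstract describes.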

Text classification; tf-idf; term weighting; kNN

Hua Jiang, Ping Li, Xin Hu, Shuyan Wang

School of Computer Science, Northeast Normal University, Changchun, Jilin Province, China

International conference

2009 IEEE International Conference on Intelligent Computing and Intelligent Systems

Shanghai

English

294-298

2009-11-20 (date first posted online on the Wanfang platform; not the paper's publication date)