Analysis of Text Classifier and the Improvement of KNN
Text classifiers are widely used tools in data mining. The Naive Bayes Classifier (NBC), K-Nearest Neighbors (KNN), and the Support Vector Machine (SVM) are among the most mature algorithms. This paper briefly introduces the working principles of these three classifiers and the basic KNN algorithm. In particular, we propose two measures to improve the performance of KNN: an optimized algorithm that determines the class of an unknown text by calculating means, and a method for classifying texts that do not belong to any class in the training set. Both measures are validated by experiments.
text classifier; distance; KNN
Yuqing Zhang Kexian Wu Xin Chen
School of Information Engineering, China University of Geosciences, Beijing, China
International conference
Harbin
English
307-311
2012-05-19 (date first posted online on the Wanfang platform; not the paper's publication date)
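The abstract names two KNN measures without giving their details: deciding a text's class by calculating means, and handling a text that belongs to no class in the training set. The sketch below is one possible reading, not the authors' algorithm. It assumes texts are already represented as numeric feature vectors, that "means" refers to the mean distance from the query to each class among its k nearest neighbors, and that an out-of-set text is detected with a distance threshold. The names `knn_mean_classify` and `reject_distance` are hypothetical.

```python
import math
from collections import defaultdict


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_mean_classify(train, query, k=3, reject_distance=None):
    """Classify `query` from its k nearest training vectors.

    train: list of (vector, label) pairs.
    Returns the label whose neighbors have the smallest mean distance
    to the query, or None when even that mean exceeds `reject_distance`
    (assumed here to mean "belongs to no known class").
    """
    # Keep the k training points closest to the query.
    scored = [(euclidean(vec, query), label) for vec, label in train]
    neighbors = sorted(scored, key=lambda pair: pair[0])[:k]

    # Group neighbor distances by class label.
    by_label = defaultdict(list)
    for dist, label in neighbors:
        by_label[label].append(dist)

    # Assumption: the class is chosen by mean distance, not by vote count.
    means = {label: sum(d) / len(d) for label, d in by_label.items()}
    best_label = min(means, key=means.get)

    # Assumption: a large mean distance signals an unknown class.
    if reject_distance is not None and means[best_label] > reject_distance:
        return None
    return best_label


# Toy two-dimensional vectors standing in for text features.
train = [
    ([0.0, 0.0], "sports"), ([0.0, 1.0], "sports"), ([1.0, 0.0], "sports"),
    ([5.0, 5.0], "finance"), ([5.0, 6.0], "finance"), ([6.0, 5.0], "finance"),
]

near_sports = knn_mean_classify(train, [0.2, 0.2], k=3, reject_distance=2.0)
near_finance = knn_mean_classify(train, [5.5, 5.5], k=3, reject_distance=2.0)
far_away = knn_mean_classify(train, [20.0, 20.0], k=3, reject_distance=2.0)
```

With this toy data, the first two queries fall inside a known cluster and the third lies far from both, so it is returned as `None` rather than forced into the nearest class. The threshold value would need tuning on real data.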