Design of Chinese Text Categorization Classifier Based on Attribute Bagging
To improve the precision and recall of a Chinese text classifier, this paper uses attribute bagging, an improved variant of the bagging algorithm. Documents are represented with the vector space model, and information gain is used for feature selection. Multiple training sets are obtained by re-sampling the attributes, and kNN is selected as the individual classifier. The final classification is decided by voting. Experiments show that attribute bagging achieves lower error and better performance than both bagging and kNN in Chinese text categorization.
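The pipeline the abstract describes (re-sample attributes, train one kNN per attribute subset, combine by voting) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy term-weight vectors, the squared Euclidean distance, sampling attributes without replacement, and the parameter values are all assumptions made for illustration, and the information gain feature selection step is taken as already done.

```python
import random
from collections import Counter


def knn_predict(train_X, train_y, x, features, k=3):
    """Majority vote among the k nearest training vectors, with squared
    Euclidean distance measured only over the given feature indices."""
    dists = []
    for row, label in zip(train_X, train_y):
        d = sum((row[j] - x[j]) ** 2 for j in features)
        dists.append((d, label))
    dists.sort(key=lambda pair: pair[0])
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def attribute_bagging_predict(train_X, train_y, x, n_classifiers=11,
                              subset_size=2, k=3, seed=0):
    """Hypothetical attribute-bagging ensemble: each member is a kNN
    restricted to a random attribute subset; members vote on the label."""
    rng = random.Random(seed)
    n_features = len(train_X[0])
    ballots = Counter()
    for _ in range(n_classifiers):
        # Assumption: attributes are drawn without replacement per member.
        features = rng.sample(range(n_features), subset_size)
        ballots[knn_predict(train_X, train_y, x, features, k)] += 1
    return ballots.most_common(1)[0][0]


# Toy vector space model: each row holds weights for four selected terms.
X = [
    [0.9, 0.8, 0.1, 0.0],
    [0.8, 0.9, 0.0, 0.1],
    [0.7, 0.9, 0.1, 0.1],
    [0.1, 0.0, 0.9, 0.8],
    [0.0, 0.1, 0.8, 0.9],
    [0.1, 0.1, 0.7, 0.9],
]
y = ["sports", "sports", "sports", "finance", "finance", "finance"]

print(attribute_bagging_predict(X, y, [0.85, 0.8, 0.05, 0.1]))
```

Because each ensemble member sees a different attribute subset, the members make partly independent errors, which is the usual motivation for combining them by vote rather than relying on a single kNN over all attributes.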
Chinese text categorization; attribute bagging; vector space model; information gain
Xiang Zhang, Mingquan Zhou, Lili Dong, Na Ye
College of Information Science and Technology, Northwest University, Xian, China; College of Information Science and Technology, Beijing Normal University, Beijing, China; College of Information and Control Engineering, Xian University of Architecture and Technology, Xian
International conference
Beijing
English
201-204
2009-07-24 (date first posted online on the Wanfang platform; not the paper's publication date)