Automatic Text Classification using Modified Centroid Classifier
This work proposes an approach to reduce the inductive bias, or model misfit, incurred by the assumptions of the centroid classifier, with the aim of improving automatic text classification. The approach is a trainable classifier that uses tf-idf as its text feature. Its main idea is to use the training errors that are most similar to the classification model to update that model successively, according to a chosen threshold. The approach is simple to implement and flexible. Its performance is measured at several threshold values on the Reuters-21578 text categorization test collection. The experimental results show that the proposed approach can improve the performance of the centroid classifier.
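The idea summarized in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual update rule, so everything below is an assumption made for illustration: documents are taken to be precomputed tf-idf vectors, similarity is cosine similarity, and a misclassified training document is added to its true class centroid only when its similarity to that centroid reaches the threshold. The function names (`train_centroids`, `predict`, `refine_centroids`) and the parameters `threshold` and `max_epochs` are hypothetical and are not taken from the paper.

```python
import math


def normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v] if n > 0 else list(v)


def dot(a, b):
    """Dot product; equals cosine similarity for unit-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def train_centroids(docs, labels, n_classes):
    """Plain centroid classifier: one normalized sum vector per class."""
    dim = len(docs[0])
    sums = [[0.0] * dim for _ in range(n_classes)]
    for d, c in zip(docs, labels):
        for j, x in enumerate(normalize(d)):
            sums[c][j] += x
    return [normalize(s) for s in sums]


def predict(doc, centroids):
    """Return the index of the centroid most similar to the document."""
    d = normalize(doc)
    sims = [dot(d, c) for c in centroids]
    return max(range(len(centroids)), key=sims.__getitem__)


def refine_centroids(docs, labels, centroids, threshold=0.5, max_epochs=5):
    """Assumed update rule (not the paper's): add each misclassified training
    document to its true class centroid if it is similar enough to it."""
    cents = [list(c) for c in centroids]
    unit_docs = [normalize(d) for d in docs]
    for _ in range(max_epochs):
        updated = False
        new = [list(c) for c in cents]
        for d, c in zip(unit_docs, labels):
            sims = [dot(d, k) for k in cents]
            pred = max(range(len(cents)), key=sims.__getitem__)
            if pred != c and sims[c] >= threshold:
                for j, x in enumerate(d):
                    new[c][j] += x
                updated = True
        if not updated:
            break  # no qualifying training errors remain
        cents = [normalize(c) for c in new]
    return cents
```

In this sketch a higher `threshold` admits fewer training errors into the update, so the refined centroids stay closer to the plain centroid classifier; a lower value lets more errors reshape them. Whether the paper uses this exact rule, or also adjusts the wrongly predicted class, cannot be determined from the abstract alone.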
Text classification; text categorization; centroid classifier; data mining
Mahmoud Elmarhumy, Mohamed Abdel Fattah, Fuji Ren
Faculty of Engineering, University of Tokushima, 2-1 Minamijosanjima, Tokushima, Japan 770-8506; FIE, Helwan University, Cairo, Egypt; Beijing University of Posts & Telecommunications, Beijing, 100088, China
International conference
Dalian
English
1-4
2009-09-24 (date first posted online on the Wanfang platform; not the publication date of the paper)