Divergence-Based Feature Selection for Naïve Bayes Text Classification
A new divergence-based approach to feature selection for naïve Bayes text classification is proposed in this paper. In this approach, the discrimination power of each feature is used directly to rank features via a criterion named overall-divergence, which is based on divergence measures evaluated between pairs of class density functions. Compared with other state-of-the-art algorithms (e.g. IG and CHI), the proposed approach shows greater discrimination power for classifying confusing classes, and achieves better or comparable performance on the evaluation data sets.
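The abstract does not give the exact form of the overall-divergence criterion, but the idea it describes (scoring each feature by summing divergences between class-conditional distributions over all class pairs, then ranking) can be sketched as follows. This is an illustrative assumption, using the symmetric Kullback-Leibler divergence between per-class Bernoulli feature distributions; the paper's actual divergence measure and estimation details may differ.

```python
import math

def kl_bernoulli(p: float, q: float) -> float:
    """KL divergence between two Bernoulli distributions with parameters p, q."""
    eps = 1e-12  # clip to avoid log(0) / division by zero
    p = min(max(p, eps), 1.0 - eps)
    q = min(max(q, eps), 1.0 - eps)
    return p * math.log(p / q) + (1.0 - p) * math.log((1.0 - p) / (1.0 - q))

def overall_divergence(cond_probs: dict) -> list:
    """Score each feature by summing symmetric KL divergences over all
    class pairs (a hypothetical reading of the 'overall-divergence' idea).

    cond_probs: class label -> list of P(feature_i = 1 | class),
                one entry per feature, same length for every class.
    Returns one score per feature; higher = more discriminative.
    """
    classes = list(cond_probs)
    num_features = len(next(iter(cond_probs.values())))
    scores = []
    for i in range(num_features):
        s = 0.0
        for a in range(len(classes)):
            for b in range(a + 1, len(classes)):
                pa = cond_probs[classes[a]][i]
                pb = cond_probs[classes[b]][i]
                # symmetric KL between the two class-conditional distributions
                s += kl_bernoulli(pa, pb) + kl_bernoulli(pb, pa)
        scores.append(s)
    return scores
```

Features whose conditional probabilities differ sharply between classes receive high scores and are kept; features distributed identically across classes score near zero and are pruned, which matches the abstract's claim of favoring features that separate confusing class pairs.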
Keywords: divergence-based feature selection; text classification; overall-divergence
Huizhen Wang Jingbo Zhu Keh-Yih Su
Natural Language Processing Laboratory, Northeastern University, Shenyang, Liaoning, China; Behavior Design Corporation, Hsinchu, Taiwan, R.O.C.
International conference
Beijing
English
2008-10-19 (date first posted on the Wanfang platform; does not represent the paper's publication date)