An Algorithm for Selecting Chinese features based on TF-NIDF weight
This article discusses the problem of selecting Chinese features based on TF-IDF weight in text categorization. TF-IDF weight is commonly used in text categorization for its simplicity. However, it cannot express the relationship between a feature's frequency of appearance in one class and its frequency of appearance in other classes. To solve this problem, we designed the TF-NIDF weighting method to express that relationship and compute feature weights. We also incorporated the weight into a Naïve Bayesian classifier and tested it on Chinese text data. Experiments showed that the Naïve Bayesian classifier with feature selection based on TF-NIDF weight has higher categorization precision than the Naïve Bayesian classifier with feature selection based on traditional TF-IDF weight.
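The limitation of traditional TF-IDF described in the abstract can be illustrated with a minimal Python sketch. It uses the common textbook form tf × log(N / df), which may differ from the exact variant used in the paper. The toy corpus, its class labels, and the function names `tf_idf` and `idf` are hypothetical and chosen only for illustration. The TF-NIDF formula itself is not given in this record, so it is not reproduced here.

```python
import math
from collections import Counter

# Hypothetical pre-segmented toy corpus: (tokens, class label).
# The labels are shown only to make the point; TF-IDF never reads them.
CORPUS = [
    (["football", "match", "news"], "sports"),
    (["football", "team", "report"], "sports"),
    (["stock", "market", "news"], "finance"),
    (["stock", "bank", "report"], "finance"),
]


def idf(docs, term):
    """Inverse document frequency log(N / df) over the whole corpus."""
    df = sum(1 for doc in docs if term in doc)
    return math.log(len(docs) / df)


def tf_idf(docs):
    """Return one {term: weight} dict per non-empty tokenized document."""
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append(
            {t: (c / len(doc)) * idf(docs, t) for t, c in counts.items()}
        )
    return weights


docs = [tokens for tokens, _label in CORPUS]

# "football" occurs only in the sports class, "news" occurs once in each
# class, yet both appear in two documents and so receive the same IDF.
idf_football = idf(docs, "football")
idf_news = idf(docs, "news")
weights = tf_idf(docs)
```

In this sketch `idf_football` and `idf_news` are equal even though only "football" separates the two classes, which is the kind of per-class information the abstract says TF-IDF does not capture.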
Text Categorization; Feature Weight; TF-IDF; TF-NIDF
Li Yongli, Liu Yanheng, Shi Mo, Dong Liyan, Li Zhen, Liu Lixiang, Yan Pengfei
College of Computer Science and Technology, Jilin University, Changchun, 130012, China; School of Compute; College of Computer Science and Technology, Jilin University, Changchun, 130012, China
International conference
2010 IEEE International Conference on Information and Automation (ICIA 2010)
Harbin
English
1-6
2010-06-20 (date first posted online on the Wanfang platform; not the paper's publication date)