Conference Topic

An Algorithm for Selecting Chinese features based on TF-NIDF weight

This article discusses the problem of selecting Chinese features based on TF-IDF weight in text categorization. TF-IDF weight is commonly used in text categorization for its simplicity. However, it cannot express the relationship between how often a feature appears in one class and how often it appears in other classes. To solve this problem, we designed the TF-NIDF weighting method to express that relationship and compute feature weights. We also incorporated the weight into a Naïve Bayesian classifier and tested it on Chinese text data. Experiments showed that a Naïve Bayesian classifier with feature selection based on TF-NIDF weight achieves higher categorization precision than one with feature selection based on traditional TF-IDF weight.
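The limitation of TF-IDF described in the abstract can be illustrated with a minimal sketch. It uses the textbook form idf = log(N / df) on a hypothetical four-document toy corpus; the corpus, the function names, and the class labels are assumptions made for illustration and are not taken from the paper. The TF-NIDF formula itself is not given in this record, so it is not reproduced here.

```python
import math
from collections import Counter

# Hypothetical toy corpus of (tokens, class label) pairs; not from the paper.
# 足球 = football, 报道 = report, 新闻 = news, 股票 = stock
DOCS = [
    (["足球", "比赛", "报道"], "sports"),
    (["足球", "联赛", "新闻"], "sports"),
    (["股票", "市场", "报道"], "finance"),
    (["股票", "基金", "新闻"], "finance"),
]

def idf(term, docs):
    """Textbook IDF: log(N / df), where df is the number of documents containing the term."""
    df = sum(1 for tokens, _label in docs if term in tokens)
    return math.log(len(docs) / df)

def tf_idf(term, tokens, docs):
    """Relative term frequency within one document, multiplied by the corpus-level IDF."""
    tf = Counter(tokens)[term] / len(tokens)
    return tf * idf(term, docs)

def class_counts(term, docs):
    """Number of documents containing the term, broken down by class label."""
    return Counter(label for tokens, label in docs if term in tokens)

if __name__ == "__main__":
    first_doc = DOCS[0][0]
    for term in ("足球", "报道"):
        print(term, tf_idf(term, first_doc, DOCS), dict(class_counts(term, DOCS)))
```

In this toy corpus, 足球 occurs only in the sports class while 报道 is split evenly across both classes, yet both occur in two of four documents and therefore receive the same IDF and the same TF-IDF weight in the first document. This is the class-blindness that the abstract says TF-NIDF was designed to address.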

Text Categorization; Feature Weight; TF-IDF; TF-NIDF

Li Yongli, Liu Yanheng, Shi Mo, Dong Liyan, Li Zhen, Liu Lixiang, Yan Pengfei

College of Computer Science and Technology, Jilin University, Changchun, 130012, China

International conference

2010 IEEE International Conference on Information and Automation (ICIA 2010)

Harbin

English

1-6

2010-06-20 (date first posted online on the Wanfang platform; not the paper's publication date)