Conference topic

Utilizing Category Relevancy Factor for Text Categorization

Feature weighting is one of the main preprocessing steps in building a high-performance text classifier. Commonly used feature weighting methods, such as TF and IDF-based methods, consider only the distribution of a feature in the document(s) and ignore class information. In this paper, we present the TFCRF (Term Frequency and Category Relevancy Factor) method, in which the weight of a feature depends on its power to discriminate the classes from each other, as determined from class information. The results show a significant improvement in the performance of the SVM algorithm when the TFCRF feature weighting method is used, compared with the other standard feature weighting methods that were implemented.

Feature weighting; SVM; Text categorization; Text mining

Mina Maleki

Iran Telecommunication Research Center, Tehran, Iran

International conference

The 2nd International Conference on Software Engineering and Data Mining (IEEE, SEDM 2010)

Chengdu

English

263-268

2010-06-23 (date first made available online on the Wanfang platform; does not represent the paper's publication date)