Conference Special Topic

A Feature Weight Algorithm for Text Classification Based on Class Information

  The TFIDF algorithm is commonly used for feature weighting in text classification, but classification results were unsatisfactory because the feature weights lacked class information. The known class information in the training set was used to improve the traditional TFIDF feature weight algorithm. Class distinction ability and class description ability were introduced, expressed respectively by inverse class frequency and by term frequency in class together with document frequency in class. A new feature weight algorithm based on class information, TF_IDT, was proposed. A Naive Bayes classifier was used to test the algorithm. Precision, recall and the F1 measure increased significantly; the macro F1 measure rose by 6.46%. Using class information in feature weighting was thus shown to improve text classification. In addition, the computational complexity of the proposed algorithm was lower, making it more suitable for fields with limited computing capability.
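The three class-based quantities named in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the formulas or say how the quantities are combined, so everything here is an assumption made for illustration: the function names are hypothetical, inverse class frequency is taken as log(number of classes / number of classes containing the term) by analogy with IDF, and the final weight is simply the product of the three factors. It should not be read as the paper's TF_IDT formula.

```python
import math
from collections import Counter, defaultdict


def class_statistics(docs, labels):
    """Collect per-class counts from tokenized documents and their labels."""
    term_count = defaultdict(Counter)  # class -> term -> total occurrences
    doc_count = defaultdict(Counter)   # class -> term -> documents containing it
    n_docs = Counter(labels)           # class -> number of documents
    for tokens, label in zip(docs, labels):
        term_count[label].update(tokens)
        doc_count[label].update(set(tokens))
    return dict(term_count), dict(doc_count), n_docs


def inverse_class_frequency(term, term_count):
    """Assumed form: log(#classes / #classes containing the term)."""
    containing = sum(1 for counts in term_count.values() if counts[term] > 0)
    if containing == 0:
        return 0.0
    return math.log(len(term_count) / containing)


def class_weight(term, label, term_count, doc_count, n_docs):
    """Assumed combination: product of the three class-based factors."""
    counts = term_count.get(label, Counter())
    total_terms = sum(counts.values())
    # Term frequency in class: share of the class's tokens that are this term.
    tf_in_class = counts[term] / total_terms if total_terms else 0.0
    # Document frequency in class: share of the class's documents with the term.
    docs_in_class = n_docs[label]
    df_in_class = (
        doc_count.get(label, Counter())[term] / docs_in_class
        if docs_in_class else 0.0
    )
    return inverse_class_frequency(term, term_count) * tf_in_class * df_in_class


# Toy corpus, invented for this example only.
docs = [
    ["goal", "match", "team"],
    ["team", "goal", "goal"],
    ["stock", "market", "team"],
    ["market", "price"],
]
labels = ["sports", "sports", "finance", "finance"]

term_count, doc_count, n_docs = class_statistics(docs, labels)
w_goal = class_weight("goal", "sports", term_count, doc_count, n_docs)
w_team = class_weight("team", "sports", term_count, doc_count, n_docs)
```

In this toy corpus "goal" occurs only in the sports class and so receives a positive weight there, while "team" occurs in both classes and its inverse class frequency, and hence its weight, is zero under the assumed formula.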

text classification; feature weight; inverse class frequency; term frequency in class; document frequency in class

LI Yong-fei

Department of Computer, North China Institute of Science and Technology, Beijing, China

International conference

2012 2nd International Conference on Computer and Information Applications (ICCIA2012)

Taiyuan

English

930-932

2012-12-08 (date first posted online on the Wanfang platform; does not represent the paper's publication date)