Conference Topic

AN EFFICIENT TEXT CLASSIFICATION RULE EXTRACTION METHOD BASED ON χ² VALUE AND ROUGH SET

In this paper we propose a text classification rule extraction method that is more efficient and more practical than existing similar methods. A proximate rule is first defined based on the characteristics of text classification rule extraction. Features of the text set are then selected according to their χ² values, which also provide feature significance information for further feature selection. Next, rough set theory is used to further reduce the features on the discrete decision table. Finally, precise rules or proximate rules are extracted using rough set theory. The method fully combines an improved χ² value feature selection with rough set theory, so as to avoid both feature reduction on a large-scale decision table and discretization of the decision table. The method greatly improves the efficiency and practicability of text rule extraction. Experimental results demonstrate the effectiveness of the method.

CHI value; feature selection; rough set; text classification rule

YE WANG, MING-CHUN WANG

Management School, Tianjin University, Tianjin 300072, China; Sports School, Tianjin Normal University; Mathematics Department, Tianjin University of Education and Technology, Tianjin 300222, China

International conference

2006 International Conference on Machine Learning and Cybernetics (IEEE 5th Forum on Machine Learning and Cybernetics)

Dalian

English

1552-1557

2006-08-13 (date first posted online on the Wanfang platform; not the paper's publication date)