A Hybrid Feature Selection Method for Data Sets of thousands of Variables

Abstract:

Feature selection has become a research focus in applications whose datasets contain thousands of variables. In this study we present a hybrid feature selection (HFS) method that combines the filter and wrapper models of feature subset selection. In the first stage, a filter model ranks the features by the mutual information (MI) between each feature and each class, and the k features most relevant to the classes are retained. In the second stage, a wrapper-based feature selection algorithm uses the Shapley value to evaluate the contribution of each feature in a feature subset to the classification task. Experimental results show that the HFS method achieves better classification performance than feature selection based on the Shapley value alone or on MI alone.
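
The two stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the classifier, the value function, or how the Shapley value is computed. The sketch assumes discrete features, measures MI against the class label as a single variable, uses a hypothetical lookup-table classifier's training accuracy as the coalition value, and estimates Shapley values by sampling random permutations. All function names and the parameters `k`, `m`, and `n_permutations` are illustrative.

```python
import math
import random
from collections import Counter


def mutual_information(xs, ys):
    """MI (in nats) between two discrete sequences of equal length."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * math.log(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())


def top_k_by_mi(X, y, k):
    """Filter stage: indices of the k features with the highest MI with y."""
    n_features = len(X[0])
    scores = [mutual_information([row[j] for row in X], y)
              for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)[:k]


def subset_accuracy(X, y, subset):
    """Assumed value function: training accuracy of a lookup-table classifier
    that predicts the majority class for each combination of subset values.
    An empty subset gives the majority-class baseline."""
    groups = {}
    for row, label in zip(X, y):
        key = tuple(row[j] for j in subset)
        groups.setdefault(key, Counter())[label] += 1
    return sum(max(c.values()) for c in groups.values()) / len(y)


def shapley_estimates(X, y, features, n_permutations=200, seed=0):
    """Wrapper stage: permutation-sampling estimate of each feature's
    Shapley value with respect to subset_accuracy."""
    rng = random.Random(seed)
    phi = {j: 0.0 for j in features}
    for _ in range(n_permutations):
        order = list(features)
        rng.shuffle(order)
        coalition, prev = [], subset_accuracy(X, y, [])
        for j in order:
            coalition.append(j)
            cur = subset_accuracy(X, y, coalition)
            phi[j] += cur - prev  # marginal contribution of feature j
            prev = cur
    return {j: total / n_permutations for j, total in phi.items()}


def hybrid_select(X, y, k, m, n_permutations=200, seed=0):
    """Keep the top-k features by MI, then the top-m of those by Shapley value."""
    candidates = top_k_by_mi(X, y, k)
    phi = shapley_estimates(X, y, candidates, n_permutations, seed)
    return sorted(candidates, key=lambda j: phi[j], reverse=True)[:m]


# Toy data (invented for illustration): feature 0 equals the label,
# feature 1 is independent of it, feature 2 is constant, feature 3 is noisy.
y_toy = [0, 0, 0, 0, 1, 1, 1, 1]
X_toy = [[0, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 0], [0, 1, 0, 1],
         [1, 0, 0, 1], [1, 1, 0, 1], [1, 0, 0, 1], [1, 1, 0, 0]]
selected = hybrid_select(X_toy, y_toy, k=2, m=1)
```

Filtering first keeps the wrapper stage tractable: each Shapley estimate needs many evaluations of the value function, so it is applied only to the k candidates that survive the MI ranking rather than to all variables.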

Keywords: feature selection; Shapley value; mutual information

Authors: Jihong Liu, Guoxiong Wang

Affiliations: College of Information Science and Engineering, Northeastern University, Shenyang, China; Political Department, Liaoning Military Region, Shenyang, China

Conference type: International conference

Conference name: The 2nd IEEE International Conference on Advanced Computer Control (ICACC 2010)

Conference location: Shenyang

Conference language: English

Pages: 288-291

Online publication date: 2010-03-27 (date first posted on the Wanfang platform; not the paper's publication date)
