Conference topic

The Imbalanced Problem in Mass-spectrometry Data Analysis

In many cases, protein mass-spectrometry data are imbalanced, i.e., positive examples are far fewer than negative ones, which generally degrades the performance of classifiers used for protein recognition. Despite the importance of this problem, few studies have addressed it. In this paper, we present a new method that uses the EasyEnsemble algorithm to cope with the imbalance problem in mass-spectrometry data. Furthermore, two feature selection algorithms, namely PREE (Prediction Risk based feature selection for EasyEnsemble) and PRIEE (Prediction Risk based feature selection for Individuals of EasyEnsemble), are proposed to select informative features and improve the performance of the EasyEnsemble classifier. Experimental results on three mass spectra data sets show that the proposed methods outperform two existing filter feature selection methods, which demonstrates their effectiveness.

Mass-spectrometry; Feature selection; Ensemble

Hao-Hua Meng, Guo-Zheng Li, Rui-Sheng Wang, Xing-Ming Zhao, Luonan Chen

School of Computer Engineering and Science, Shanghai University, Shanghai 200072, China; Department of Control Science and Engineering, Tongji University, Shanghai 201804, China; School of Information, Renmin University of China, Beijing 100872, China; Institute of System Biology, Shanghai University, Shanghai 200444, China

International conference

The Second International Symposium (OSB08) (The Second International Symposium on Optimization and Systems Biology)

Lijiang, Yunnan

English

136-143

2008-10-31 (date first posted online on the Wanfang platform; this is not necessarily the paper's publication date)