Conference Special Topic

Missing Data Processing Based on Neural Network and AdaBoost

Missing data is a common data-quality problem. In classification, records with missing values are generally ignored or filled by simple substitution, which degrades classifier performance. This paper presents RBPAdaBoost, a new framework for handling missing feature values in classification. The framework has two parts: predicting the missing values, and classifying the data that include the predicted values. A back-propagation (BP) neural network first predicts the missing values; Adaptive Boosting (AdaBoost), a method that aggregates many weak classifiers into one strong classifier, then classifies the completed data. We run experiments on nine UCI datasets to compare the classification error rate obtained with four general methods against that obtained with the BP prediction model. The results show that RBPAdaBoost increases the classification rate by 6.4% to 23.69% compared with the other methods, which indicates that the missing-data treatment model is effective.
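The abstract gives no implementation details, so the following is only a minimal sketch of the two-stage idea it describes: a small network trained by back-propagation fills in missing values, and AdaBoost then classifies the completed data. Every function name, the network size, the learning rate, the use of decision stumps as weak learners, and the synthetic data are assumptions made for illustration; none of them come from the paper.

```python
import math
import random


def train_bp_regressor(X, y, hidden=4, lr=0.02, epochs=200, seed=0):
    """Train a one-hidden-layer network by back-propagation (SGD, squared error)."""
    rng = random.Random(seed)
    d = len(X[0])
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(d)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    W2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    b2 = 0.0

    def forward(x):
        h = [math.tanh(sum(w * v for w, v in zip(W1[j], x)) + b1[j]) for j in range(hidden)]
        return h, sum(W2[j] * h[j] for j in range(hidden)) + b2

    for _ in range(epochs):
        for x, target in zip(X, y):
            h, out = forward(x)
            err = out - target  # derivative of 0.5 * (out - target)^2 w.r.t. out
            for j in range(hidden):
                grad_h = err * W2[j] * (1.0 - h[j] * h[j])  # propagate back through tanh
                W2[j] -= lr * err * h[j]
                b1[j] -= lr * grad_h
                for i in range(d):
                    W1[j][i] -= lr * grad_h * x[i]
            b2 -= lr * err
    return lambda x: forward(x)[1]


def impute_with_bp(rows, col):
    """Replace None in column `col` with network predictions from the other columns.

    Assumption for this sketch: all other columns are complete.
    """
    def others(r):
        return [v for i, v in enumerate(r) if i != col]

    complete = [r for r in rows if r[col] is not None]
    predict = train_bp_regressor([others(r) for r in complete], [r[col] for r in complete])
    return [list(r) if r[col] is not None else r[:col] + [predict(others(r))] + r[col + 1:]
            for r in rows]


def stump_predict(x, f, thr, pol):
    """Decision stump: predict `pol` when feature f exceeds thr, otherwise -pol."""
    return pol if x[f] > thr else -pol


def fit_stump(X, y, w):
    """Return (weighted error, feature, threshold, polarity) of the best stump."""
    best = None
    for f in range(len(X[0])):
        for thr in sorted({x[f] for x in X}):
            for pol in (1, -1):
                err = sum(wi for x, yi, wi in zip(X, y, w) if stump_predict(x, f, thr, pol) != yi)
                if best is None or err < best[0]:
                    best = (err, f, thr, pol)
    return best


def adaboost_fit(X, y, rounds=10):
    """Discrete AdaBoost with labels in {-1, +1}."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        err, f, thr, pol = fit_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the logarithm finite
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, f, thr, pol))
        # Increase the weight of misclassified samples, then renormalise.
        w = [wi * math.exp(-alpha * yi * stump_predict(x, f, thr, pol))
             for x, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model


def adaboost_predict(model, x):
    score = sum(alpha * stump_predict(x, f, thr, pol) for alpha, f, thr, pol in model)
    return 1 if score >= 0 else -1


# Synthetic data (hypothetical): feature 1 is roughly 2 * feature 0, with every
# fourth value of feature 1 missing; the label depends on feature 0.
rng = random.Random(1)
rows, labels = [], []
for i in range(40):
    x0 = rng.random()
    rows.append([x0, None if i % 4 == 0 else 2 * x0 + rng.gauss(0, 0.05)])
    labels.append(1 if x0 > 0.5 else -1)

filled = impute_with_bp(rows, col=1)
model = adaboost_fit(filled, labels)
accuracy = sum(adaboost_predict(model, x) == y for x, y in zip(filled, labels)) / len(labels)
```

The sketch measures only training accuracy on toy data, so it says nothing about the error rates reported for the UCI datasets. Keeping imputation and classification as separate functions mirrors the two-part structure described in the abstract and makes it possible to swap in a different imputation method for comparison.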

Miao Zhi-Min, Pan Zhi-Song, Hu Gu-Yu

International conference

2007 IEEE International Conference on Grey Systems and Intelligent Services

Nanjing

English

2007-11-18 (date first made available online on the Wanfang platform; not the paper's publication date)