Conference topic

IMBALANCED DISTRIBUTION ANALYSIS BASED ON TELECOM DATA

Churning customers make up only a very small proportion of a telecom customer database, so a customer-churn data mining model built directly on such a data set forecasts inaccurately. To address this, this paper proposes a sampling algorithm that combines over-sampling and under-sampling. First, an initial sample set is created; then the misclassified training samples and a portion of the correctly classified minority-class samples are added to this set, the training set is completed according to a fixed rule, and the procedure is repeated. Finally, actual customer data from the telecom industry are used to compare the new algorithm with the over-sampling and under-sampling methods. The experimental results show that the new algorithm performs better than both.
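The abstract does not give the details of the proposed algorithm, so the following is only a minimal sketch of the general idea of mixing the two baseline methods: random under-sampling of the majority class combined with random over-sampling of the minority class. It is not the authors' method. The function name `hybrid_resample`, its parameters, and the toy labels `"churn"` and `"stay"` are all assumptions made for illustration.

```python
import random
from collections import Counter


def hybrid_resample(rows, labels, minority_label, target_per_class, seed=0):
    """Generic random over-/under-sampling (illustrative, not the paper's algorithm).

    Assumes a two-class problem: every label other than `minority_label`
    is treated as the majority class.
    """
    rng = random.Random(seed)
    minority = [r for r, y in zip(rows, labels) if y == minority_label]
    majority = [(r, y) for r, y in zip(rows, labels) if y != minority_label]
    if not minority or not majority:
        raise ValueError("both classes must be present")

    # Under-sampling: keep at most `target_per_class` majority rows,
    # drawn without replacement.
    kept_majority = rng.sample(majority, min(target_per_class, len(majority)))

    # Over-sampling: keep every original minority row, then add random
    # duplicates (drawn with replacement) until the target size is reached.
    # If the minority class is already at or above the target, nothing is added.
    extra = max(0, target_per_class - len(minority))
    boosted = minority + [rng.choice(minority) for _ in range(extra)]

    combined = [(r, minority_label) for r in boosted] + kept_majority
    rng.shuffle(combined)
    new_rows = [r for r, _ in combined]
    new_labels = [y for _, y in combined]
    return new_rows, new_labels


# Toy data: 5 churners among 100 customers (hypothetical, not from the paper).
rows = list(range(100))
labels = ["churn" if i < 5 else "stay" for i in rows]
new_rows, new_labels = hybrid_resample(rows, labels, "churn", 20, seed=42)
counts = Counter(new_labels)
print(counts["churn"], counts["stay"])
```

In this sketch both classes end up with 20 samples each. Plain duplication of minority rows can encourage overfitting and random under-sampling discards majority information, which is presumably part of the motivation for the more selective, iterative scheme the abstract describes.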

unbalanced distribution; over-sampling; under-sampling

Wenting Zhang, Hong Li

School of Economics and Management, Beihang University, Beijing 100191, China

International conference

The Tenth International Conference on Industrial Management (ICIM 2010)

Beijing

English

392-394

2010-09-16 (date first posted online on the Wanfang platform; not the paper's publication date)