A COMPREHENSIVE STUDY OF THE EFFECT OF CLASS IMBALANCE ON THE PERFORMANCE OF CLASSIFIERS
Class imbalance is a significant issue affecting the performance of classifiers. In this paper we systematically analyze the effect of class imbalance on several standard classification algorithms. The study is performed on benchmark datasets and relates the effect of imbalance to concept complexity, the size of the training set, and the ratio between the number of instances and the number of attributes in the training data. The evaluation uses six different metrics. The results indicate that the multilayer perceptron is the most robust to imbalance in the training data, while the performance of support vector machines is the most affected. We also found that unpruned C4.5 models perform better than their pruned versions.
Class imbalance; Metrics; Classifiers; Comprehensive study
Rodica Potolea, Camelia Lemnaru
Computer Science Department, Technical University of Cluj-Napoca, 26 Baritiu st., Cluj-Napoca, Romania
International conference
13th International Conference on Enterprise Information Systems (ICEIS 2011)
Beijing
English
1914-1921
2011-06-08 (date first posted online on the Wanfang platform; not the paper's publication date)