Development and Design of General Data Mining System
In this paper, we focus on top-down discretization methods and propose a new method for supervised discretization based on class-feature correlation by defining a class-feature contingency factor. The proposed method takes into consideration the distribution of all samples to generate an ideal discretization scheme. The method maintains a high interdependence between the target class and the discretized attribute, and avoids overfitting. Empirical evaluation of seven discretization algorithms on UCI real datasets shows that the novel algorithm can yield a better discretization scheme that improves the accuracy of decision tree classification. As to the execution time of discretization and the number of generated rules, our approach also achieves promising results.
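The abstract does not define the class-feature contingency factor, so the following is only a minimal sketch of the general top-down supervised scheme it describes, not the authors' method. As a stand-in criterion it assumes the well-known CAIM measure (the mean over intervals of max_r^2 / M_r, computed from the class-by-interval contingency table); the function names `caim` and `top_down_discretize` are hypothetical.

```python
from collections import Counter


def caim(values, labels, cuts):
    """CAIM score of a cut-point scheme (stand-in criterion, an assumption)."""
    bounds = sorted(cuts)
    # One class-count table per interval: the class-by-interval contingency table.
    tables = [Counter() for _ in range(len(bounds) + 1)]
    for v, y in zip(values, labels):
        idx = sum(v > b for b in bounds)  # index of the interval containing v
        tables[idx][y] += 1
    score = 0.0
    for t in tables:
        total = sum(t.values())
        if total:
            # Squared count of the dominant class over the interval size.
            score += max(t.values()) ** 2 / total
    return score / len(tables)


def top_down_discretize(values, labels):
    """Greedily add the cut point that most improves the criterion."""
    distinct = sorted(set(values))
    # Candidate cuts are midpoints between adjacent distinct values.
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    n_classes = len(set(labels))
    cuts, best = [], caim(values, labels, [])
    while candidates:
        score, cut = max((caim(values, labels, cuts + [c]), c) for c in candidates)
        # Stop once no cut improves the score and there are already
        # at least as many intervals as classes (CAIM-style stopping rule).
        if score <= best and len(cuts) + 1 >= n_classes:
            break
        cuts.append(cut)
        candidates.remove(cut)
        best = score
    return sorted(cuts)


if __name__ == "__main__":
    print(top_down_discretize([1, 2, 3, 10, 11, 12], list("aaabbb")))
```

Starting from a single interval and splitting only while the criterion improves is what makes the scheme top-down; swapping in a different contingency-based factor would only require replacing `caim`.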
Data Mining; Optimal Design; Discretization
Chen Baowen
Department of College of Computer Science & Software Engineering, Shenzhen University, Shenzhen, China, 518060
International conference
2015 Information Technology and Mechatronics Engineering Conference (ITOEC 2015)
Chongqing
English
120-123
2015-03-28 (date first posted online on the Wanfang platform; not the paper's publication date)