Asymmetric measure for supervised learning models assessment, application to breast cancer detection
Supervised learning models are evaluated, and decision trees are built, mostly with symmetrical criteria. In practice, this means that each class of the target attribute is given the same importance. However, this is not the case in many practical situations. Obvious examples are strongly imbalanced datasets (computer-aided diagnosis, identification of unusual phenomena such as fraud or equipment failures), where the main aim is to identify objects belonging to the minority class. In these situations, assigning the same importance to each kind of prediction error is not the best solution. In this paper we propose a criterion, usable both for evaluating supervised learning models and for building decision trees, that takes into account the asymmetric importance of each class of the target attribute. We then propose a variant of random forests that uses this criterion and is better adapted to strongly imbalanced datasets. Our experiments cover classical imbalanced datasets as well as experimental evaluations obtained within an industrial application dealing with breast cancer diagnosis. The needs of this latter application guided the design of this adaptation of random forests.
Artificial intelligence; supervised learning models assessment; breast cancer detection
Julien THOMAS, Pierre-Emmanuel JOUVE, Nicolas NICOLOYANNIS
Laboratory ERIC, University Lumière Lyon 2, France; Company Fenics, Lyon, France | Company Fenics, Lyon, France | Laboratory ERIC, University Lumière Lyon 2, France
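International conference
Beijing
English
2007-05-30 (date first posted online on the Wanfang platform; does not represent the paper's publication date)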