The Study of KPCA Active Learning Method based on the ROC Curve
In ship machinery condition monitoring, normal-condition samples usually outnumber fault samples, because fault testing is difficult or expensive. In passive learning, the classifier is trained on all of the normal samples. This lengthens training time and, for some machine learning methods, can even become an NP problem. In addition, noise and other factors give each sample a different influence on the classifier model, so the generalization performance of the classifier degrades when there are too many poor training samples. This paper proposes a KPCA active learning method based on the Receiver Operating Characteristic (ROC) curve. In this method, the significance of each training sample is evaluated by the KPCA method, and all training samples are then ordered into a sequence according to their significance. The leading samples of this sequence are selected step by step to form training sets; each set is used to train the classifier, and the classifier's performance is evaluated with the ROC curve. Finally, the training set that gives the largest area under the ROC curve (AUC) is selected as the optimal training sample set, which is the final result of the KPCA active learning method based on the ROC curve. Experimental results on a 1:1 cabin model show that the method is feasible and effective.
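The selection loop described in the abstract can be illustrated with a minimal sketch. The abstract does not state how significance is computed or which classifier is used, so everything specific here is an assumption made for illustration: significance is taken as the negative KPCA reconstruction error in feature space, the classifier is a one-class KPCA novelty scorer with an RBF kernel, and the data are synthetic. The names `KPCANovelty`, `auc`, and `select_training_set` are hypothetical and do not come from the paper.

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    """RBF kernel matrix between the rows of A and the rows of B."""
    d2 = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

class KPCANovelty:
    """Assumed one-class scorer: KPCA reconstruction error in feature space."""
    def __init__(self, gamma=0.5, n_components=3):
        self.gamma, self.q = gamma, n_components

    def fit(self, X):
        self.X = X
        K = rbf_kernel(X, X, self.gamma)
        self.k_mean_vec = K.mean(axis=0)      # per-sample mean of the kernel
        self.k_mean = K.mean()                # grand mean of the kernel
        Kc = K - self.k_mean_vec[None, :] - self.k_mean_vec[:, None] + self.k_mean
        w, V = np.linalg.eigh(Kc)             # eigenvalues in ascending order
        top = np.argsort(w)[::-1][:self.q]    # indices of the q largest
        w, V = w[top], V[:, top]
        keep = w > 1e-10                      # drop numerically null components
        self.alpha = V[:, keep] / np.sqrt(w[keep])
        return self

    def score(self, Z):
        """Higher score = farther from the KPCA subspace of the training set."""
        Kz = rbf_kernel(Z, self.X, self.gamma)
        Kzc = (Kz - Kz.mean(axis=1, keepdims=True)
               - self.k_mean_vec[None, :] + self.k_mean)
        proj = Kzc @ self.alpha
        # Squared norm of the centered feature vector; k(z, z) = 1 for RBF.
        norm2 = 1.0 - 2.0 * Kz.mean(axis=1) + self.k_mean
        return norm2 - (proj**2).sum(axis=1)

def auc(scores, labels):
    """Area under the ROC curve via the rank (Mann-Whitney) formulation."""
    pos, neg = scores[labels == 1], scores[labels == 0]
    diff = pos[:, None] - neg[None, :]
    return ((diff > 0).sum() + 0.5 * (diff == 0).sum()) / (len(pos) * len(neg))

def select_training_set(X_normal, X_val, y_val, sizes, **kw):
    """Rank samples by significance, grow the training set, keep the best AUC."""
    # Assumption: low reconstruction error means a more significant sample.
    significance = -KPCANovelty(**kw).fit(X_normal).score(X_normal)
    order = np.argsort(-significance)         # most significant first
    best_auc, best_idx = -1.0, None
    for m in sizes:
        subset = order[:m]                    # leading m samples of the sequence
        model = KPCANovelty(**kw).fit(X_normal[subset])
        a = auc(model.score(X_val), y_val)
        if a > best_auc:
            best_auc, best_idx = a, subset
    return best_auc, best_idx

# Synthetic stand-in data (not the cabin-model measurements from the paper).
rng = np.random.default_rng(0)
X_normal = rng.normal(0.0, 1.0, size=(60, 2))
X_normal[:8] += rng.normal(0.0, 3.0, size=(8, 2))   # a few noisy normal samples
X_val = np.vstack([rng.normal(0.0, 1.0, size=(40, 2)),
                   rng.normal(4.0, 1.0, size=(15, 2))])
y_val = np.r_[np.zeros(40, dtype=int), np.ones(15, dtype=int)]  # 1 = fault
sizes = list(range(10, 61, 10))

best_auc, best_idx = select_training_set(X_normal, X_val, y_val, sizes,
                                         gamma=0.5, n_components=3)
print(f"best AUC = {best_auc:.3f} with {len(best_idx)} training samples")
```

In this sketch the candidate set sizes grow in steps of ten; a finer step would follow the step-by-step evaluation more closely at a higher computational cost.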
Active learning; Kernel principal component analysis (KPCA); Receiver Operating Characteristic curve
Cui Li-Lin, Zhu Hai-chao, Zhang Lin-ke, Ma Chao
Institution of Noise & Vibration, Naval University of Engineering, Wuhan, China
International conference
Shanghai
English
284-287
2010-06-22 (date first posted online on the Wanfang platform; not the publication date of the paper)