An Enhanced EM Method of Semi-supervised Classification Based on Naive Bayesian
Semi-supervised learning (SSL) based on Naive Bayesian and Expectation Maximization (EM) combines a small amount of labeled data with a large amount of unlabeled data to train a classifier and increase classification accuracy. To improve the efficiency of the basic EM algorithm, an enhanced EM method is proposed. First, a feature selection function based on strong category information is constructed to control the dimension of the feature vector while preserving useful feature terms. Second, during each iteration of the EM algorithm, an intermediate classifier gradually transfers the unlabeled documents with the maximum posterior category probability to the labeled collection. As a result, the enhanced EM needs clearly fewer iterations than the basic EM. Finally, experiments show that the improved method performs well in terms of both macro-average accuracy and algorithm efficiency.
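The transfer step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (train_nb, posterior, enhanced_em), the per_iter parameter, the use of hard labels, and Laplace smoothing are all assumptions made for illustration, and the vocabulary is assumed to have already been reduced by some feature selection step, which the abstract does not specify.

```python
import math
from collections import Counter

def train_nb(docs, labels, vocab, classes):
    """Multinomial Naive Bayes with Laplace smoothing (an assumed choice)."""
    prior, cond = {}, {}
    for c in classes:
        class_docs = [d for d, y in zip(docs, labels) if y == c]
        prior[c] = math.log((len(class_docs) + 1) / (len(docs) + len(classes)))
        counts = Counter(w for d in class_docs for w in d if w in vocab)
        total = sum(counts.values()) + len(vocab)
        cond[c] = {w: math.log((counts[w] + 1) / total) for w in vocab}
    return prior, cond

def posterior(doc, prior, cond):
    """Return (most probable class, its posterior probability) for one document."""
    scores = {c: prior[c] + sum(cond[c][w] for w in doc if w in cond[c])
              for c in prior}
    top = max(scores.values())
    norm = sum(math.exp(s - top) for s in scores.values())
    best = max(scores, key=scores.get)
    return best, math.exp(scores[best] - top) / norm

def enhanced_em(labeled, labels, unlabeled, vocab, classes, per_iter=1):
    """Each round, retrain the intermediate classifier and move the per_iter
    most confidently classified unlabeled documents into the labeled set.
    per_iter is a hypothetical parameter, not taken from the paper."""
    labeled, labels, pool = list(labeled), list(labels), list(unlabeled)
    iterations = 0
    while pool:
        prior, cond = train_nb(labeled, labels, vocab, classes)
        scored = [(posterior(d, prior, cond), i) for i, d in enumerate(pool)]
        scored.sort(key=lambda item: item[0][1], reverse=True)
        chosen = scored[:per_iter]
        for (label, _), i in chosen:
            labeled.append(pool[i])
            labels.append(label)
        moved = {i for _, i in chosen}
        pool = [d for i, d in enumerate(pool) if i not in moved]
        iterations += 1
    return train_nb(labeled, labels, vocab, classes), iterations

# Toy usage with made-up documents (lists of tokens).
vocab = {"ball", "goal", "team", "vote", "law", "senate"}
classes = ["sport", "politics"]
model, iterations = enhanced_em(
    [["ball", "goal"], ["vote", "law"]], ["sport", "politics"],
    [["goal", "ball", "team"], ["law", "vote", "senate"]],
    vocab, classes)
```

Because the unlabeled pool shrinks every round, the loop in this sketch terminates after at most len(unlabeled) / per_iter rounds, which is one way to read the abstract's claim about a reduced iteration count.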
Semi-supervised classification; feature selection; enhanced EM; Naive Bayesian
WEN Han; XIAO Nan-feng; LI Zhao
School of Computer Science and Engineering, South China University of Technology, Guangzhou 510006, P; School of Computer Science and Engineering, South China University of Technology, Guangzhou 510006, P; Department of Computer Science, University of Vermont, Burlington, 05405, United States
International conference
Shanghai
English
1027-1031
2011-07-26 (date first made available online on the Wanfang platform; not the publication date of the paper)