Conference Topic

A Good All-around Semi-supervised Learning Algorithm for Information Categorization

The paper reports a study on information categorization based on highly efficient feature selection and a comprehensive semi-supervised learning algorithm. Feature selection and feature conversion, both linear and nonlinear, are performed by maximizing mutual information. Entropy is used and extended so that machine learning methods can identify suitable features. A Fuzzy Partition Clustering Method is presented and used to obtain a few labeled samples and some external clusters automatically by measuring the similarity of documents correlated through clustering. These provide the categorization basis for supervised learning. Furthermore, Naive Bayes augmented learning is incorporated to design and train the categorizers, and estimating the loss from classification errors helps balance the selection of candidates. The all-around learning algorithm can greatly improve the precision and efficiency of web information categorization.

component; web information categorization; dimensionality reduction; fuzzy clustering

Lizhen Liu, Hai Chen, Chao Du

Information Engineering College, CNU, Beijing, China

International conference

2009 IEEE International Conference on Intelligent Computing and Intelligent Systems

Shanghai

English

299-302

2009-11-20 (date first posted online on the Wanfang platform; not the paper's publication date)