Conference Topic

Machine Learning based Blog Classification

Because blog services provide a wealth of information resources, blog search and classification have considerable research value. This paper focuses on classifying blogs along the personal vs. official facet. Our system adopts a two-stage strategy: in the training model, lexicons are built automatically; in the classification model, scoring and ranking are carried out in sequence. Our experimental results show that feature selection and Mutual Information weighting benefit the lexicons and yield significant improvements, whereas sentiment words improve the results only slightly.

Blog Classification; Machine Learning; Feature selection; Lexicons; Sentiment

Xueji Sun, Si Li, Weiran Xu, Guang Chen, Jun Guo

School of Information and Communication Engineering, Beijing University of Posts and Telecommunications, Beijing, China

International conference

2011 3rd IEEE International Conference on Computer Research and Development (ICCRD 2011)

Shanghai

English

31-34

2011-03-11 (date first made available online on the Wanfang platform; not the paper's publication date)