Research on SVM-Based Automatic Classification of Chinese Web Page

Abstract:

This paper addresses Chinese web page classification based on the support vector machine (SVM). It proposes methods for text extraction and Chinese word segmentation, and discusses how text from different locations on the page contributes differently to classification. An SVM classifier is then applied to classify the pages. The results show that classification performance improves further when noisy blocks are removed during text extraction and when Chinese word segmentation is highly accurate. In addition, extracting the title, keywords, and description and increasing their weights also improves classification accuracy.
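The field-weighting idea in the abstract can be illustrated with a minimal sketch. It assumes the page text has already been extracted and segmented into tokens, and the weight values, the field names, and the `weighted_term_frequencies` helper are all hypothetical; this record does not state the weights or the implementation the authors used.

```python
from collections import defaultdict

# Hypothetical field weights (assumption: not taken from the paper).
FIELD_WEIGHTS = {"title": 3.0, "keywords": 2.0, "description": 2.0, "body": 1.0}


def weighted_term_frequencies(fields, weights=FIELD_WEIGHTS):
    """Sum token counts over all fields, scaling each occurrence by its field weight."""
    scores = defaultdict(float)
    for field, tokens in fields.items():
        w = weights.get(field, 1.0)  # unknown fields count once
        for tok in tokens:
            scores[tok] += w
    return dict(scores)


# Toy page: tokens are assumed to come from a separate word segmentation step.
page = {
    "title": ["篮球", "新闻"],
    "keywords": ["篮球"],
    "description": ["体育", "新闻"],
    "body": ["篮球", "比赛", "体育"],
}
features = weighted_term_frequencies(page)
```

In this sketch a term that appears in the title or meta fields contributes more to the feature vector than the same term in the body. The resulting vector could then be passed to an SVM classifier of the reader's choice.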

Authors: Jie Song, Yanque Liu, Nana Li, Junhua Gu

Affiliation: College of Computer Science and Software, Hebei University of Technology, Tianjin 300401, China

Conference type: International conference

Conference name: Third International Symposium on Intelligence Computation and Applications (ISICA 2008) (第三届智能自动化、计算与制造国际研讨会)

Conference location: Wuhan

Conference language: English

Pages: 160-164

Online publication date: 2008-12-19 (the date the record first went online on the Wanfang platform; not the paper's publication date)
