Research on SVM-Based Automatic Classification of Chinese Web Page
This paper addresses Chinese web page classification based on the support vector machine (SVM). Methods are proposed for text extraction and Chinese word segmentation, and the paper discusses how text from different locations on the page contributes differently to classification. An SVM classifier is used for classification. The results show that classification performance improves when noisy blocks are removed from the text during extraction and when Chinese word segmentation is highly accurate. In addition, extracting the title, keywords, and description and increasing their weights also improves classification accuracy.
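The idea of weighting page locations can be illustrated with a minimal sketch. It assumes the page has already been cleaned and segmented into tokens, and it builds a single weighted bag of words that could then be fed to an SVM. The weight values, the field names, and the function `weighted_term_counts` are hypothetical and chosen only for illustration; the abstract does not state the weights or the feature representation the authors used.

```python
from collections import Counter

# Hypothetical field weights: the abstract says the title, keywords and
# description are weighted up, but gives no actual values.
FIELD_WEIGHTS = {"title": 3.0, "keywords": 2.0, "description": 2.0, "body": 1.0}


def weighted_term_counts(page, weights=FIELD_WEIGHTS):
    """Merge per-field term counts into one weighted bag of words.

    `page` maps a field name to a list of already-segmented tokens.
    Fields missing from `weights` default to a weight of 1.0.
    """
    counts = Counter()
    for field, tokens in page.items():
        w = weights.get(field, 1.0)
        for token in tokens:
            counts[token] += w
    return dict(counts)


# Toy page with pre-segmented Chinese tokens
# (足球 = football, 新闻 = news, 体育 = sports, 比赛 = match).
page = {
    "title": ["足球", "新闻"],
    "keywords": ["足球"],
    "description": ["体育", "新闻"],
    "body": ["足球", "比赛", "体育"],
}
features = weighted_term_counts(page)
```

In this toy page, a term that appears in the title and keywords ends up with a larger feature value than a term that appears only in the body, which is the effect the abstract attributes to increasing the weights of those fields.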
Jie Song, Yanque Liu, Nana Li, Junhua Gu
College of Computer Science and Software, Hebei University of Technology, Tianjin 300401, China
International conference
Wuhan
English
160-164
2008-12-19 (date first made available online on the Wanfang platform; not the paper's publication date)