Conference topic

Web Document Classification based on SVM

With the rapid growth of web information, web document classification has become an important research field for the management of Internet information. Most existing methods are based on traditional statistics and are effective only as the sample size tends to infinity. They may not work well in practical cases with limited samples, where they are prone to over-fitting. To classify web pages effectively, the paper studies web document classification in the Vector Space Model together with feature extraction, and analyzes the selection of kernel functions. Based on the Support Vector Machine (SVM), a web document classification model and algorithm are proposed. The experiment shows that the approach not only improves training efficiency but also achieves good precision.
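As a rough illustration of two ingredients the abstract names, the Vector Space Model and the kernel function, the following minimal Python sketch builds TF-IDF vectors for tokenized documents and evaluates a Gaussian (RBF) kernel between them. It is not the paper's model or algorithm: the TF-IDF weighting, the choice of RBF kernel, the `gamma` value, the function names, and the toy documents are all assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Map tokenized, non-empty documents to sparse TF-IDF vectors (term -> weight)."""
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append(
            {t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return vectors


def rbf_kernel(u, v, gamma=1.0):
    """Gaussian kernel exp(-gamma * ||u - v||^2) on sparse vectors."""
    terms = set(u) | set(v)
    sq_dist = sum((u.get(t, 0.0) - v.get(t, 0.0)) ** 2 for t in terms)
    return math.exp(-gamma * sq_dist)


# Hypothetical toy corpus: two related pages and one unrelated page.
docs = [
    "svm kernel margin classification".split(),
    "svm kernel feature selection".split(),
    "football match score league".split(),
]
vecs = tfidf_vectors(docs)
```

In an SVM, kernel values of this kind stand in for inner products between document vectors, so pages that share terms (the first two above) receive a larger kernel value than unrelated pages.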

Web document classification; Support vector machines; Kernel function; Feature selection; Statistical learning theory

Qiang Niu, Zhixiao Wang, Dai Chen

School of Computer Science and Technology, China University of Mining and Technology, Xuzhou, Jiangsu, 221008, China

International conference

2006 International Symposium on Distributed Computing and Applications to Business, Engineering and Science

Hangzhou

English

619-622

2006-10-12 (date first posted online on the Wanfang platform; not necessarily the paper's publication date)