Conference topic

Application of Web Page Classification in a Domain-specific Search Engine

  Automatic web page classification can be used in domain-specific search engines to help users find specific information on the Internet more conveniently and precisely. Semantic similarity and noisy data in domain-specific web pages make traditional classifiers perform poorly on them. In this paper, a dictionary-based multilingual web page classification method is proposed to improve classification performance. The method constructs a domain-specific dictionary to reinforce the domain-specific knowledge in the pages. An automatic encoding detection and integration method is also introduced into the classifier to extract Chinese and English information precisely from multilingual pages. After being verified in experiments, the method is integrated into a real domain-specific search engine, where it proves effective.
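The two ideas named in the abstract, encoding detection for mixed Chinese and English pages and a domain dictionary that emphasizes domain terms, can be illustrated with a minimal sketch. This is not the paper's method: the candidate encoding list, the dictionary entries and weights, the scoring rule, and the threshold are all assumptions made here for illustration.

```python
# Hypothetical domain dictionary (term -> weight); the entries are invented
# for this sketch and are not taken from the paper.
DOMAIN_TERMS = {"power grid": 2.0, "transformer": 1.5, "电力": 2.0, "变压器": 1.5}


def decode_page(raw: bytes, encodings=("utf-8", "gb18030")) -> str:
    """Return the first strict decode among the candidate encodings.

    Trial decoding is an assumed stand-in for the paper's encoding
    detection step, whose details the abstract does not give.
    """
    for enc in encodings:
        try:
            return raw.decode(enc)
        except UnicodeDecodeError:
            continue
    # Last resort: keep what can be read and mark undecodable bytes.
    return raw.decode("utf-8", errors="replace")


def domain_score(text: str, terms=DOMAIN_TERMS) -> float:
    """Sum weight * occurrence count over all dictionary terms."""
    lowered = text.lower()
    return sum(weight * lowered.count(term) for term, weight in terms.items())


def is_domain_page(raw: bytes, threshold: float = 3.0) -> bool:
    """Decode the page, then compare its dictionary score to a threshold."""
    return domain_score(decode_page(raw)) >= threshold
```

A page stored as GB18030 bytes fails strict UTF-8 decoding, falls through to the second candidate, and is then scored against both the Chinese and the English dictionary entries. A real classifier would feed such dictionary-weighted features into a trained model rather than a fixed threshold.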

Web page classification; Search engine; Domain-specific knowledge; Dictionary

Chunyan Liang

North China Electric Power University

International conference

2012 2nd International Conference on Computer Application and System Modeling (ICCASM-2012)

Shenyang

English

568-570

2012-07-27 (date first posted online on the Wanfang platform; not the paper's publication date)