Automatic Web Query Classification Using Large Unlabeled Web Pages

Abstract:

This paper presents a simple, novel method for automatically constructing a domain knowledge base for query classification from large-scale web pages. In addition, a resource of relevant words is built automatically, using context as the feature of each word, in order to extend the user's query. Combining the domain knowledge base with query extension through relevant words yields satisfactory query classification performance. Experimental results show that the method achieves a precision of 77.68% and a recall of 75.34% in Chinese query classification. In the English experiments, despite the scarcity of English web pages and the absence of stemming, precision reaches 58.83% and recall reaches 54.13%, which is a great improvement compared to state-of-the-art query classification algorithms.

Authors: Yu Jingbo Ye Na

Affiliations: Beijing Institute of Graphic Communication, Beijing, P.R. China, 102600; Institute of Computer Software and Theory, College of Information Science and Engineering, Northeaster

Conference type: International conference

Conference name: The Ninth International Conference on Web-Age Information Management (WAIM 2008)

Conference location: Zhangjiajie

Conference language: English

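Online publication date: 2008-07-20 (the date the record first went online on the Wanfang platform; it does not represent the paper's publication date)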
