Web Mining Based on VIPS in Intention-Based Information Retrieval
This paper introduces a web mining method based on VIPS (Vision-based Page Segmentation) that targets retrieval based on user intent. It first collects information from the web through large search engines such as Baidu, and then clusters the retrieved web pages according to intention-related features of the page text. The main algorithm is described in detail, and experiments are designed that retrieve results for Chinese queries from the Baidu and Ask search engines. The results show that the VIPS-based method achieves a significant improvement over some previous work.
VIPS; web mining; HTML structure; information retrieval
Qiang Zhang, Xiaoxiao Jiang, Jiashen Sun
Beijing University of Posts and Telecommunications, Beijing, China
International conference
Dalian
English
1-5
2009-09-24 (date first made available online on the Wanfang platform; does not represent the paper's publication date)