Conference paper

HAWK: A Focused Crawler with Content and Link Analysis

Keeping search engine indices current by exhaustive crawling is rapidly becoming impossible as the web grows. Focused crawlers search only the subset of the web related to a specific topic and so offer a potential solution. They have problems of their own, however; the major one is how to retrieve the maximal set of relevant, high-quality pages. To address this problem we design a focused crawler, called HAWK, that uses not only the content of web pages to improve page relevance but also the link structure to improve coverage of a specific topic.

search engine; focused crawler; content; link structure

Xiaoyun Chen, Xin Zhang

School of Information Science & Engineering, Lanzhou University, PRC 730000

International conference

AiR08, EM2108, SOAIC08, SIOKM08, BIMA08, DKEEE08 (2008 IEEE International Conference on e-Business Engineering)

Xi'an

English

677-680

2008-10-22 (date first made available online on the Wanfang platform; not the paper's publication date)