SAILER: An Effective Search Engine for Unified Retrieval of Heterogeneous XML and Web Documents
This paper studies the problem of unified ranked retrieval over heterogeneous XML documents and Web data. We propose a search engine called Sailer that adaptively answers keyword queries over such heterogeneous data. We model Web pages and XML documents as graphs, introduce the concept of pivotal trees to answer keyword queries, and present a method to identify the top-k highest-ranked pivotal trees in these graphs. We also propose indexes to support unified ranked retrieval. An extensive experimental study on real datasets shows that Sailer achieves both high search efficiency and high accuracy, and significantly outperforms existing approaches.
Keyword Search; XML; Web Pages; Unified Keyword Search
Guoliang Li, Jianhua Feng, Jianyong Wang, Xiaoming Song, Lizhu Zhou
Department of Computer Science and Technology, Tsinghua University, Beijing 100084, P. R. China
International conference
The 17th International World Wide Web Conference (WWW08)
Beijing
English
2008-04-21 (date first posted online on the Wanfang platform; not the paper's publication date)