Conference Topic

RACE: Finding and Ranking Compact Connected Trees for Keyword Proximity Search over XML Documents

In this paper, we study the problem of keyword proximity search over XML documents, addressing both efficiency and effectiveness. We take the disjunctive semantics among input keywords into consideration and identify meaningful compact connected trees as the answers to keyword proximity queries. We introduce the notions of Compact Lowest Common Ancestor (CLCA) and Maximal CLCA (MCLCA), and propose Compact Connected Trees (CCTrees) and Maximal CCTrees (MCCTrees) to answer keyword queries efficiently and effectively. We propose a novel ranking mechanism, RACE, to Rank compAct Connected trEes, which takes into consideration both structural similarity and textual similarity. Our extensive experimental study shows that our method achieves both high search efficiency and effectiveness, and significantly outperforms existing approaches.

Lowest Common Ancestor (LCA); Compact LCA (CLCA); Maximal CLCA (MCLCA)

Guoliang Li, Jianhua Feng, Jianyong Wang, Bei Yu, Yukai He

Department of Computer Science and Technology, Tsinghua University, Beijing, China; School of Computing, National University of Singapore, Singapore

International conference

The 17th International World Wide Web Conference (WWW08)

Beijing

English

2008-04-21 (date first posted online on the Wanfang platform; this is not the paper's publication date)