Efficient Maintenance Scheme of Inverted Index for Large-scale Full-Text Retrieval
Inverted index is the mainstay of modern full-text retrieval systems, and it is a promising way to improve time and space efficiencies with appropriately maintenance scheme of inverted files for huge amount of information management and retrieval. In order to improve the retrieval performance of inverted index in large-scale fulltext systems, a time and space efficient random access blocked inverted index (RABI) and an efficient dynamic maintenance scheme (DMS) are proposed in this paper. RABI divides inverted list into blocks and compresses different part of each block with the corresponding compression method to decrease space consumption. Based on RABI, DMS distinguishes between long and short posting lists. Then short posting lists are updated by remerge strategy and long posting lists are updated by hybrid in-place and remerge strategy. Experimental results show that, compared with existed schemes, the proposed scheme greatly averagely reduces space cost, conjunctive Boolean query time, and the cost of on-line index construction.
information retrieval inverted index index maintenance
Xiaozhu Liu
State Key Lab of Software Engineering Wuhan University Wuhan 430072, China School of Automation Wuhan University of Technology Wuhan 430070, China
国际会议
武汉
英文
565-570
2010-05-21(万方平台首次上网日期,不代表论文的发表时间)