Conference Topic

Entropy-based Clustering for Improving Document Re-ranking

Document re-ranking sits between initial retrieval and query expansion in an information retrieval system. In this paper, we propose an entropy-based clustering approach for document re-ranking. The value of the within-cluster entropy determines whether two clusters should be merged, and the value of the between-cluster entropy determines how many clusters are reasonable. The next step is to select a suitable cluster from the clustering result to construct pseudo-labeled documents, and then to conduct document re-ranking as in our previous method. We focus on the clustering strategy for documents after initial retrieval. Experiments with NTCIR-5 data show that the approach can improve on the performance of initial retrieval and helps improve the quality of document re-ranking.

Information retrieval; document re-ranking; clustering; within-cluster entropy; between-cluster entropy

Chong Teng, Yanxiang He, Donghong Ji, Cheng Zhou, Yixuan Geng, Shu Chen

Computer School, Wuhan University, Wuhan, China; School of Mathematics and Statistics, Wuhan University, Wuhan, China; International School of Software, Wuhan University, Wuhan, China

International conference

2009 IEEE International Conference on Intelligent Computing and Intelligent Systems

Shanghai

English

2477-2481

2009-11-20 (date first posted online on the Wanfang platform; not the paper's publication date)