Conference paper

A Novel Composite Kernel for Finding Similar Questions in CQA Services

Finding similar questions in Community Question Answering (CQA) services plays an increasingly important role in current web and IR applications. The task aims to retrieve historical questions that are similar or relevant to new questions posed by users. However, traditional bag-of-words models fail to measure the similarity between question sentences, as they usually ignore sequential and syntactic information. In this paper, we propose a novel composite kernel to improve the accuracy of question matching. Our study illustrates that the composite kernel can efficiently capture both lexical semantics and syntactic information in a question sentence by leveraging a word sequence kernel, a POS tag sequence kernel, and a syntactic tree kernel. Experimental results on real-world datasets show that our proposed method significantly outperforms state-of-the-art models.

question answering; similarity measure; tree kernel; string kernel

Jun Wang, Zhoujun Li, Xia Hu, Biyun Hu

School of Computer Science and Engineering, Beihang University, 100191 Beijing, China; School of Computing, National University of Singapore, 117590, Singapore

International conference

11th International Conference, WAIM 2010 (11th International Conference on Web-Age Information Management)

Jiuzhaigou

English

608-619

2010-07-14 (date first made available online on the Wanfang platform; not the paper's publication date)