A Novel Composite Kernel for Finding Similar Questions in CQA Services
Finding similar questions in Community Question Answering (CQA) services plays an increasingly important role in current web and IR applications. The task aims to retrieve historical questions that are similar or relevant to new questions posed by users. However, traditional bag-of-words models often fail to measure the similarity between question sentences, as they usually ignore sequential and syntactic information. In this paper, we propose a novel composite kernel to improve the accuracy of question matching. Our study shows that the composite kernel can efficiently capture both lexical semantics and syntactic information in a question sentence by combining a word sequence kernel, a POS tag sequence kernel, and a syntactic tree kernel. Experimental results on real-world datasets show that our proposed method significantly outperforms state-of-the-art models.
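The abstract does not specify how the component kernels are defined or combined, so the following is only an illustrative sketch, not the authors' method. It assumes a simple contiguous n-gram (spectrum) kernel as a stand-in for both the word sequence kernel and the POS tag sequence kernel, combines the normalized kernels by a weighted sum, and leaves out the syntactic tree kernel entirely. The function names, the question dictionary layout, and the weights are all hypothetical.

```python
from collections import Counter
from math import sqrt


def ngram_counts(tokens, n):
    """Count contiguous n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def spectrum_kernel(a, b, n=2):
    """Inner product of n-gram count vectors (assumed stand-in for a sequence kernel)."""
    ca, cb = ngram_counts(a, n), ngram_counts(b, n)
    return sum(ca[g] * cb[g] for g in ca if g in cb)


def normalized(kernel, a, b):
    """Normalize a kernel so that any non-degenerate item has self-similarity 1."""
    denom = sqrt(kernel(a, a) * kernel(b, b))
    return kernel(a, b) / denom if denom else 0.0


def composite_kernel(q1, q2, weights=(0.5, 0.5)):
    """Weighted sum of word-level and POS-level kernels (weights are hypothetical).

    A syntactic tree kernel would be added as a third weighted term; it is
    omitted here because the abstract gives no definition for it.
    """
    w_word, w_pos = weights
    return (w_word * normalized(spectrum_kernel, q1["words"], q2["words"])
            + w_pos * normalized(spectrum_kernel, q1["pos"], q2["pos"]))


# Hypothetical questions with hand-assigned POS tags.
q_a = {"words": ["how", "do", "i", "reset", "my", "password"],
       "pos": ["WRB", "VBP", "PRP", "VB", "PRP$", "NN"]}
q_b = {"words": ["how", "can", "i", "reset", "my", "password"],
       "pos": ["WRB", "MD", "PRP", "VB", "PRP$", "NN"]}
similarity = composite_kernel(q_a, q_b)
```

A nonnegative weighted sum of valid kernels is itself a valid kernel, which is why a sum is a natural choice for a sketch like this; whether the paper combines its kernels this way is not stated in the abstract.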
question answering; similarity measure; tree kernel; string kernel
Jun Wang; Zhoujun Li; Xia Hu; Biyun Hu
School of Computer Science and Engineering, Beihang University, 100191 Beijing, China; School of Computing, National University of Singapore, 117590, Singapore
International conference
11th International Conference, WAIM 2010 (The 11th International Conference on Web-Age Information Management)
Jiuzhaigou
English
608-619
2010-07-14 (date first posted online on the Wanfang platform; not the paper's publication date)