Conference Paper

Exploiting External Knowledge Sources to Improve Kernel-based Word Sense Disambiguation

This paper proposes a novel approach to improving kernel-based Word Sense Disambiguation (WSD). We first explain why linear kernels are better suited than translation-invariant kernels to WSD and many other natural language processing problems. Building on the linear kernel, we integrate two external knowledge sources. One is a set of linguistic rules for identifying the crucial features. The other is a distributional-similarity thesaurus, used to alleviate data sparseness by generalizing crucial features when they do not match a word form exactly. Experiments show that our method outperforms the state-of-the-art system on the benchmark data from the English lexical sample task of SemEval-2007, and the improvement is statistically significant.
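The two ideas in the abstract can be sketched as follows. This is not the authors' implementation: the thesaurus entries and vocabulary are hypothetical toy data, and the linear kernel shown is simply the dot product an SVM would use over sparse bag-of-words context features, with thesaurus backoff generalizing features that do not match the training vocabulary exactly.

```python
# Minimal sketch (assumed, not the paper's code) of: (1) a linear kernel over
# sparse bag-of-words context features, and (2) generalizing unmatched
# features via a distributional-similarity thesaurus.

from collections import Counter

# Hypothetical distributional thesaurus: word -> {similar word: similarity}.
THESAURUS = {
    "bank": {"lender": 0.8, "institution": 0.6},
    "river": {"stream": 0.9, "water": 0.7},
}

def features(context_words, vocab):
    """Bag-of-words features; words unseen in the training vocabulary are
    backed off to thesaurus neighbours that do appear in it."""
    vec = Counter()
    for w in context_words:
        if w in vocab:
            vec[w] += 1.0
        else:
            for sim, weight in THESAURUS.get(w, {}).items():
                if sim in vocab:
                    vec[sim] += weight  # similarity-weighted generalized feature
    return vec

def linear_kernel(x, y):
    """Dot product over sparse feature vectors (the linear kernel)."""
    return sum(v * y.get(k, 0.0) for k, v in x.items())

vocab = {"money", "lender", "deposit", "stream", "fishing"}
train = features(["money", "lender", "deposit"], vocab)
test = features(["money", "bank"], vocab)  # "bank" backs off to "lender"
print(linear_kernel(train, test))  # prints 1.8
```

Here "bank" is absent from the training vocabulary, so its thesaurus neighbour "lender" contributes a weighted feature, letting the test context still match the training example under the linear kernel.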

word sense disambiguation; kernel-based method; support vector machine

Peng Jin, Fuxin Li, Danqing Zhu, Yunfang Wu, Shiwen Yu

Institute of Computational Linguistics, Peking University, Beijing, China; Institute of Automation, Chinese Academy of Sciences, Beijing, China

International Conference

The 2008 IEEE International Conference on Natural Language Processing and Knowledge Engineering (IEEE NLP-KE 2008)

Beijing

English

2008-10-19 (date first posted on the Wanfang platform; not necessarily the paper's publication date)