Conference Topic

An Approach Based on Tongyici Cilin and Word Similarity for Chinese Word Sense Induction

This paper presents a new unsupervised approach to automatic Word Sense Induction based on Tongyici Cilin, a Chinese synonym dictionary. First, we extract the neighboring words, taking their POS tags into account. Then we calculate word similarity from the semantic codes that the words map to in Tongyici Cilin (Extended). Finally, we design a greedy algorithm to perform the clustering. Experimental results on the benchmark data set provided by CIPS and SIGHAN indicate that the approach is promising. The work is significant for the use of this thesaurus in word similarity calculation and Word Sense Disambiguation.
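The three steps in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the similarity measure (the fraction of the five Cilin code levels that two codes share from the root), the averaging scheme, the threshold value, and every function name here are assumptions made for illustration. The `cilin` argument is a hypothetical word-to-codes mapping, and POS-filtered neighbor extraction is assumed to have been done already, so each context is simply a list of neighbor words.

```python
from typing import Dict, List

# Tongyici Cilin (Extended) codes have five levels, e.g. "Aa01A01=":
# characters [0], [1], [2:4], [4], [5:7], followed by a marker (=, #, @).
LEVEL_ENDS = (1, 2, 4, 5, 7)


def code_similarity(a: str, b: str) -> float:
    """Fraction of the five levels two codes share from the root (assumed measure)."""
    shared = 0
    for end in LEVEL_ENDS:
        if a[:end] != b[:end]:
            break
        shared += 1
    return shared / len(LEVEL_ENDS)


def word_similarity(w1: str, w2: str, cilin: Dict[str, List[str]]) -> float:
    """Best similarity over all code pairs; 0.0 if either word has no code."""
    codes1, codes2 = cilin.get(w1, []), cilin.get(w2, [])
    return max((code_similarity(a, b) for a in codes1 for b in codes2), default=0.0)


def context_similarity(c1: List[str], c2: List[str], cilin: Dict[str, List[str]]) -> float:
    """Average, over the words of c1, of the best match found in c2."""
    if not c1 or not c2:
        return 0.0
    total = sum(max(word_similarity(u, v, cilin) for v in c2) for u in c1)
    return total / len(c1)


def greedy_cluster(contexts: List[List[str]], cilin: Dict[str, List[str]],
                   threshold: float = 0.5) -> List[int]:
    """Assign each context to the most similar existing cluster, or open a new one.

    The threshold is a hypothetical parameter, not a value from the paper.
    """
    clusters: List[List[int]] = []
    labels: List[int] = []
    for i, ctx in enumerate(contexts):
        best, best_score = -1, threshold
        for k, members in enumerate(clusters):
            score = sum(context_similarity(ctx, contexts[j], cilin)
                        for j in members) / len(members)
            if score >= best_score:
                best, best_score = k, score
        if best == -1:
            clusters.append([i])
            labels.append(len(clusters) - 1)
        else:
            clusters[best].append(i)
            labels.append(best)
    return labels
```

Because the pass is greedy, the result depends on the order in which contexts are processed; a real system would need to account for that.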

Word Sense Induction; Tongyici Cilin; co-occurrence word; word similarity

Rui Sun, Peng Jin, Yihao Zhang

Laboratory of Intelligent Information Processing and Application, Leshan Teachers College, Leshan, China

International conference

2010 International Conference on Information Security and Artificial Intelligence (ISAI 2010)

Chengdu

English

159-162

2010-12-17 (date first posted online on the Wanfang platform; not the paper's publication date)