An Approach Based on Tongyici Cilin and Word Similarity for Chinese Word Sense Induction
This paper presents a new approach to automatic unsupervised Word Sense Induction based on Tongyici Cilin (a Chinese synonym dictionary). First, we extract the neighbor words, taking their POS tags into account. Then we calculate word similarity from the semantic codes assigned in Tongyici Cilin (Extended). Finally, we design a greedy algorithm to perform the clustering. Experimental results on the benchmark data set provided by CIPS and SIGHAN indicate that the approach is promising. The work is significant for the use of this thesaurus in word similarity calculation and Word Sense Disambiguation.
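The abstract does not give the similarity formula or the clustering rule, so the following is only a minimal sketch of how the second and third steps could fit together, not the authors' method. It assumes that each Tongyici Cilin (Extended) code is an 8-character string with five hierarchical levels plus a trailing flag, that similarity is the fraction of levels shared from the root, and that clustering assigns each instance to the first sufficiently similar cluster. The function names, the threshold, and the toy lexicon (English placeholder words with made-up codes) are all hypothetical.

```python
from typing import Dict, List

# Assumed layout of an 8-character code such as "Aa01A01=":
# five nested levels ending at these offsets; the trailing flag is ignored.
LEVEL_ENDS = (1, 2, 4, 5, 7)


def code_similarity(a: str, b: str) -> float:
    """Fraction of the five code levels that two codes share from the root."""
    shared = 0
    for end in LEVEL_ENDS:
        if a[:end] != b[:end]:
            break
        shared += 1
    return shared / len(LEVEL_ENDS)


def word_similarity(w1: str, w2: str, cilin: Dict[str, List[str]]) -> float:
    """Best similarity over all code pairs; 0.0 if either word is unlisted."""
    codes1, codes2 = cilin.get(w1, []), cilin.get(w2, [])
    return max((code_similarity(a, b) for a in codes1 for b in codes2),
               default=0.0)


def context_similarity(c1: List[str], c2: List[str],
                       cilin: Dict[str, List[str]]) -> float:
    """Symmetric average of each word's best match in the other context."""
    if not c1 or not c2:
        return 0.0

    def directed(src: List[str], dst: List[str]) -> float:
        return sum(max(word_similarity(s, d, cilin) for d in dst)
                   for s in src) / len(src)

    return (directed(c1, c2) + directed(c2, c1)) / 2


def greedy_cluster(contexts: List[List[str]], cilin: Dict[str, List[str]],
                   threshold: float = 0.5) -> List[List[int]]:
    """Put each instance in the first cluster whose members are similar
    enough on average; otherwise open a new cluster."""
    clusters: List[List[int]] = []
    for i, ctx in enumerate(contexts):
        for members in clusters:
            avg = sum(context_similarity(ctx, contexts[j], cilin)
                      for j in members) / len(members)
            if avg >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


# Toy lexicon: placeholder words and made-up codes, for illustration only.
toy_cilin = {
    "bank": ["Dm04A01="], "money": ["Dm04A02="], "loan": ["Dm04A03="],
    "river": ["Be05A01="], "water": ["Be05B01="],
}
# Each inner list stands for the neighbor words of one target-word instance.
toy_contexts = [["money", "loan"], ["river", "water"], ["loan", "bank"]]
toy_clusters = greedy_cluster(toy_contexts, toy_cilin)
```

In this toy run the first and third instances share financial neighbors and end up in one induced sense, while the second forms its own. A single pass like this depends on the order of the instances and on the threshold, both of which would need tuning on real data.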
Word Sense Induction; Tongyici Cilin; co-occurrence word; word similarity
Rui Sun, Peng Jin, Yihao Zhang
Laboratory of Intelligent Information Processing and Application, Leshan Teachers College, Leshan, China
International conference
Chengdu
English
159-162
2010-12-17 (date first posted online on the Wanfang platform; not the paper's publication date)