Natural-Annotation-based Unsupervised Construction of Korean-Chinese Domain Dictionary
Large-scale bilingual parallel resources are important for statistical learning and deep learning in natural language processing. This paper addresses the automatic construction of Korean-Chinese domain dictionaries and presents a novel unsupervised construction method based on natural annotations in the raw corpus. We first extract all Korean-Chinese word pairs from Korean texts according to natural annotations, then convert the traditional Chinese characters into simplified ones, and finally distill a bilingual domain dictionary by looking up the simplified Chinese words in an additional Chinese domain dictionary. The experimental results show that our method can automatically build multiple Korean-Chinese domain dictionaries efficiently.
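The three steps described in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that the abstract does not confirm: that a natural annotation takes the form of a Korean word followed by its Chinese-character equivalent in parentheses, that the traditional-to-simplified conversion is a per-character table lookup (the `T2S` table here is a toy, hypothetical mapping), and that the Chinese domain dictionary is a plain set of simplified words. All function names are illustrative, not the authors'.

```python
import re

# Assumed annotation form: a Hangul word directly followed by Chinese
# characters in (half- or full-width) parentheses, e.g. 한국(韓國).
ANNOTATION = re.compile(r"([\uac00-\ud7a3]+)\s*[\((]([\u4e00-\u9fff]+)[\))]")

# Toy traditional-to-simplified table (hypothetical; a real system
# would need a complete conversion table).
T2S = {"經": "经", "濟": "济", "學": "学", "韓": "韩", "國": "国"}


def extract_pairs(text):
    """Step 1: collect (Korean, Chinese) pairs from annotated Korean text."""
    return ANNOTATION.findall(text)


def to_simplified(word):
    """Step 2: map each traditional character to its simplified form."""
    return "".join(T2S.get(ch, ch) for ch in word)


def build_domain_dictionary(text, chinese_domain_words):
    """Step 3: keep only pairs whose simplified Chinese side occurs in
    the given Chinese domain dictionary (assumed to be a set of words)."""
    dictionary = {}
    for korean, chinese in extract_pairs(text):
        simplified = to_simplified(chinese)
        if simplified in chinese_domain_words:
            dictionary[korean] = simplified
    return dictionary


# Illustrative input: one in-domain pair and one out-of-domain pair.
sample = "경제학(經濟學)은 한국(韓國)에서 중요한 학문이다."
economics_terms = {"经济学", "市场"}
result = build_domain_dictionary(sample, economics_terms)
print(result)
```

In this sketch, supplying a different Chinese domain dictionary to the same extracted pairs would yield a dictionary for a different domain, which is one plausible reading of how multiple domain dictionaries could be built from a single corpus.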
Wuying Liu, Lin Wang
Laboratory of Language Engineering and Computing, Guangdong University of Foreign Studies, Guangzhou; Xianda College of Economics and Humanities, Shanghai International Studies University, Shanghai 20008
International conference
Shanghai
English
1-7
2017-12-28 (date first posted online on the Wanfang platform; does not represent the paper's publication date)