Neural Chinese Word Segmentation with Dictionary Knowledge
Chinese word segmentation (CWS) is an important task for Chinese NLP. Recently, many neural-network-based methods have been proposed for CWS. However, these methods require a large number of labeled sentences for model training, and usually cannot utilize the useful information in Chinese dictionaries. In this paper, we propose two methods to exploit dictionary information for CWS. The first is based on pseudo-labeled data generation, and the second is based on multi-task learning. Experimental results on two benchmark datasets validate that our approach can effectively improve the performance of Chinese word segmentation, especially when training data is insufficient.
Chinese word segmentation; Dictionary; Neural network
Junxin Liu, Fangzhao Wu, Chuhan Wu, Yongfeng Huang, Xing Xie
Department of Electronic Engineering, Tsinghua University, Beijing, China; Microsoft Research Asia, Beijing, China
International conference
2018 International Conference on Natural Language Processing and Chinese Computing (NLPCC 2018)
Hohhot
English
80-91
2018-08-26 (date first posted online on the Wanfang platform; not the paper's publication date)