Study on Tibetan Word Segmentation as Syllable Tagging
Tibetan word segmentation (TWS) is a fundamental problem in Tibetan natural language processing. This paper reformulates segmentation as a syllable tagging problem and studies the performance of TWS with different sequence labeling models. Experimental results show that, with the current 4-tag set, the TWS system based on conditional random fields achieves the best performance, while the other models also achieve good results. These results indicate that treating segmentation as a syllable tagging problem is an effective approach to TWS.
Tibetan word segmentation; sequence labeling
Yachao Li, Hongzhi Yu
Key Lab of Chinese National Linguistic Information Technology, Northwest University for Nationalities, Lanzhou, China 730030
International conference
Second CCF Conference, NLPCC 2013 (2nd Conference on Natural Language Processing and Chinese Computing)
Chongqing
English
363-369
2013-11-15 (date first posted online on the Wanfang platform; not the paper's publication date)
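
The reformulation described in the abstract, word segmentation as syllable tagging, can be illustrated with a minimal sketch. The abstract mentions a 4-tag set but does not name the tags; the B/M/E/S scheme below (begin, middle, end, single-syllable word) is an assumption, as are the function names and the placeholder syllables. The sketch shows only the conversion between a segmented sentence and a per-syllable tag sequence, not the sequence labeling models themselves.

```python
from typing import List, Tuple

# Assumed 4-tag set (not specified in the abstract):
# B = first syllable of a multi-syllable word, M = middle syllable,
# E = last syllable, S = single-syllable word.


def words_to_tags(words: List[List[str]]) -> List[Tuple[str, str]]:
    """Turn a segmented sentence (list of words, each a list of syllables)
    into a flat list of (syllable, tag) pairs."""
    tagged: List[Tuple[str, str]] = []
    for word in words:
        if not word:
            continue  # skip empty words defensively
        if len(word) == 1:
            tagged.append((word[0], "S"))
            continue
        tagged.append((word[0], "B"))
        for syllable in word[1:-1]:
            tagged.append((syllable, "M"))
        tagged.append((word[-1], "E"))
    return tagged


def tags_to_words(tagged: List[Tuple[str, str]]) -> List[List[str]]:
    """Recover word boundaries from (syllable, tag) pairs.
    A word is closed after each E or S tag."""
    words: List[List[str]] = []
    current: List[str] = []
    for syllable, tag in tagged:
        current.append(syllable)
        if tag in ("E", "S"):
            words.append(current)
            current = []
    if current:  # trailing syllables without a closing tag
        words.append(current)
    return words


# Placeholder syllables stand in for Tibetan syllables.
sentence = [["s1"], ["s2", "s3"], ["s4", "s5", "s6"]]
pairs = words_to_tags(sentence)
tags = [tag for _, tag in pairs]
restored = tags_to_words(pairs)
```

Under this scheme a sequence labeling model predicts one tag per syllable, and the predicted tags are decoded back into words with a function such as `tags_to_words`.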