A Feature-Rich CRF Segmenter for Chinese Micro-blog
This paper describes our system for Chinese word segmentation of micro-blog text, one of the NLPCC-ICCPOL 2016 Shared Tasks. A CRF (Conditional Random Field) model is employed to treat word segmentation as a sequence labeling problem, and 7 sets of features are selected to train it. Under the weighted measures, the system achieves an fb score of 0.798144 on the closed track, 0.81968 on the semi-open track, and 0.82217 on the open track.
Chinese word segmentation on Micro-blog; Sequence labeling; CRF
Yabin Leng, Weiwei Liu, Sheng Wang, Xiaojie Wang
School of Computer Science, Beijing University of Posts and Telecommunications, China, 100876
International conference
The 5th Conference on Natural Language Processing and Chinese Computing (NLPCC-ICCPOL 2016)
Kunming
English
1-8
2016-12-02 (date first posted online on the Wanfang platform; not the paper's publication date)