Five-Stroke Based CNN-BiRNN-CRF Network for Chinese Named Entity Recognition
Identifying entity boundaries and eliminating entity ambiguity are two major challenges in Chinese named entity recognition research. This paper proposes a five-stroke based CNN-BiRNN-CRF network for Chinese named entity recognition. For the input embeddings, we apply the five-stroke input method to obtain stroke-level representations, which are concatenated with pre-trained character embeddings in order to explore both the morphological and the semantic information of characters. Moreover, a convolutional neural network is used to extract n-gram features, without involving hand-crafted features or domain-specific knowledge. The proposed model is evaluated and compared with state-of-the-art results on the third SIGHAN bakeoff corpora. The experimental results show that our model achieves F1-scores of 91.67% and 90.68% on the MSRA and CityU corpora, respectively.
CNN-BiRNN-CRF network; Stroke-level representations; N-gram features; Chinese named entity recognition
Fan Yang, Jianhu Zhang, Gongshen Liu, Jie Zhou, Cheng Zhou, Huanrong Sun
School of Electric Information and Electronic Engineering, Shanghai Jiaotong University, Shanghai, China; SJTU-Shanghai Songheng Content Analysis Joint Lab, Shanghai, China
International conference
2018 International Conference on Natural Language Processing and Chinese Computing (NLPCC2018)
Hohhot
English
184-195
2018-08-26 (date first posted online on the Wanfang platform; does not represent the paper's publication date)