Conference topic

Recognizing Biomedical Named Entities Based on the Sentence Vector/Twin Word Embeddings Conditioned Bidirectional LSTM

As a fundamental step in biomedical information extraction, biomedical named entity recognition remains challenging. In recent years, neural networks have been applied to entity recognition to avoid complex hand-designed features derived from various linguistic analyses. However, conventional neural network systems are limited in their ability to exploit long-range dependencies in sentences. In this paper, we adopt a bidirectional recurrent neural network with LSTM units to identify biomedical entities, adding twin word embeddings and a sentence vector to enrich the input information. As a result, complex feature extraction can be skipped. In the testing phase, the Viterbi algorithm is also used to filter out illogical label sequences. Experimental results on the BioCreative II GM corpus show that our system achieves an F-score of 88.61%, which outperforms CRF models using complex hand-designed features and is 6.74% higher than RNNs.
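The filtering step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the tag scheme or the scoring details, so everything below is an assumption for illustration: a three-tag BIO scheme (`O`, `B`, `I`), per-token log-scores such as a tagger might output, and a hard rule that `I` may only follow `B` or `I`. The names `TAGS`, `allowed`, and `viterbi_decode` are hypothetical and are not taken from the paper.

```python
import math

# Hypothetical BIO tag set; the paper's actual label set is not given in the abstract.
TAGS = ["O", "B", "I"]


def allowed(prev, curr):
    """Return True if tag `curr` may follow tag `prev` (None means sentence start).

    Assumed rule: an inside tag "I" is only legal after "B" or "I".
    """
    if curr == "I":
        return prev in ("B", "I")
    return True


def viterbi_decode(scores):
    """Find the highest-scoring tag sequence that contains no illegal transition.

    `scores` is a list with one dict per token, mapping each tag to a log-score.
    """
    if not scores:
        return []
    neg_inf = -math.inf
    # Best score of any legal path ending in each tag at the first token.
    best = {t: (scores[0][t] if allowed(None, t) else neg_inf) for t in TAGS}
    backpointers = []
    for step in scores[1:]:
        new_best, pointers = {}, {}
        for curr in TAGS:
            # Only consider predecessors from which `curr` is a legal next tag.
            candidates = [(best[p], p) for p in TAGS if allowed(p, curr)]
            prev_score, prev_tag = max(candidates)
            new_best[curr] = prev_score + step[curr]
            pointers[curr] = prev_tag
        best = new_best
        backpointers.append(pointers)
    # Trace the best path backwards from the best final tag.
    last = max(TAGS, key=lambda t: best[t])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Picking the best tag per token independently would give O, I, O, which is illegal.
example = [
    {"O": -0.1, "B": -1.0, "I": -2.5},
    {"O": -2.0, "B": -2.5, "I": -0.2},
    {"O": -0.1, "B": -3.0, "I": -3.0},
]
print(viterbi_decode(example))  # ['B', 'I', 'O']
```

In this sketch the constraint is applied as a hard mask on transitions; a system could instead use learned or soft transition scores, and the abstract does not say which variant the authors use.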

LSTM; twin word embeddings; sentence vector; Viterbi algorithm

Lishuang Li, Liuke Jin, Yuxin Jiang, Degen Huang

School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, Liaoning, China

Domestic conference

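The 15th China National Conference on Computational Linguistics (CCL2016) and the 4th International Symposium on Natural Language Processing Based on Naturally Annotated Big Data (NLP-NABD-2016)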

Yantai

English

1-12

2016-10-14 (date first posted online on the Wanfang platform; not the paper's publication date)