Conference topic

A Span-Based Distantly Supervised NER with Self-learning

  The lack of labeled data is one of the major obstacles for named entity recognition (NER). Distant supervision, which automatically generates annotated training datasets from dictionaries, is often used to alleviate this problem. However, as far as we know, existing distant-supervision-based methods do not consider latent entities that are not in the dictionaries. Intuitively, entities of the same type have similar contextualized features, so we can use these features to extract the latent entities within corpora into the corresponding dictionaries and thereby improve the performance of distant-supervision-based methods. Thus, in this paper, we propose a novel span-based self-learning method, which employs span-level features to update the corresponding dictionaries. Specifically, the proposed method directly takes all possible spans into account and scores them for each label, then picks latent entities from the candidate spans into the corresponding dictionaries based on both local and global features. Extensive experiments on two public datasets show that our proposed method performs better than the state-of-the-art baselines.

Named entity recognition; Distant supervision; Span-level; Self-learning

Hongli Mao, Hanlin Tang, Wen Zhang, Heyan Huang, Xian-Ling Mao

School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China; Huazhong University of Science and Technology, Wuhan, China

International conference

9th CCF International Conference on Natural Language Processing and Chinese Computing (NLPCC 2020)

Zhengzhou

English

192-203

2020-10-14 (date first made available online on the Wanfang platform; not the paper's publication date)