Conference paper

Bilingual Parallel Active Learning between Chinese and English

  Active learning is an effective machine learning paradigm that can significantly reduce the manual labor needed to annotate NLP corpora while achieving competitive performance. Previous studies on active learning have focused on corpora in a single language or in two languages translated from each other. This paper proposes a Bilingual Parallel Active Learning (BPAL) paradigm, in which an instance-level parallel Chinese and English corpus adapted from OntoNotes is augmented for relation extraction, and both the seeds and the unlabeled instances jointly selected at each iteration are parallel between the two languages, in order to enhance active learning. Experimental results on relation classification over this corpus demonstrate that BPAL can significantly outperform monolingual active learning. Moreover, the success of BPAL suggests a new way of annotating parallel corpora for NLP tasks so as to induce two high-performance classifiers, one in each language.

Active Learning; Parallel Corpus; Relation Classification

Longhua Qian, JiaXin Liu, Guodong Zhou, Qiaoming Zhu

Natural Language Processing Lab, Soochow University, Suzhou, Jiangsu, 215006; School of Computer Science & Technology, Soochow University, Suzhou, Jiangsu, 215006

International conference

The 5th Conference on Natural Language Processing and Chinese Computing (NLPCC-ICCPOL 2016)

Kunming

English

1-12

2016-12-02 (date first made available online on the Wanfang platform; not the paper's publication date)