Building Powerful Dependency Parsers for Resource-Poor Languages

Abstract:

In this paper, we present an approach to building dependency parsers for resource-poor languages without any annotated resources on the target side. Compared with previous studies, our approach requires fewer human-annotated resources. We first train a POS tagger and a parser on the source treebank, and then use them to parse the source sentences in bilingual data. Projection techniques then yield auto-parsed sentences (with POS tags and dependencies) on the target side. From the fully projected sentences, we train a base POS tagger and a base parser on the target side. However, most sentence pairs are not fully projected, which leaves many partially projected sentences. To make full use of these partially projected sentences, we implement a learning algorithm to train POS taggers, which leads to better parsing performance. We further exploit a set of features from large-scale monolingual data to help parsing. Finally, we evaluate the proposed approach on the Google Universal Treebank (v2.0, standard). The experimental results show that the proposed approach can significantly improve parsing performance.
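The projection step in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the projection rules, so the code below assumes a one-to-one word alignment and copies a tag or head only when the aligned token exists. The names `project_sentence`, `ROOT`, and the alignment format are hypothetical and introduced only for illustration.

```python
from typing import Dict, List, Optional, Tuple

ROOT = -1  # illustrative convention: head index of the root token


def project_sentence(
    src_tags: List[str],
    src_heads: List[int],
    alignment: Dict[int, int],
    tgt_len: int,
) -> Tuple[List[Optional[str]], List[Optional[int]], bool]:
    """Project POS tags and dependency heads from source to target.

    `alignment` maps a source token index to a target token index and is
    assumed to be one-to-one (an assumption of this sketch, not a claim
    about the paper). Target tokens that receive nothing stay None.
    """
    tgt_tags: List[Optional[str]] = [None] * tgt_len
    tgt_heads: List[Optional[int]] = [None] * tgt_len

    for s, t in alignment.items():
        # Copy the POS tag of the aligned source token.
        tgt_tags[t] = src_tags[s]
        head = src_heads[s]
        if head == ROOT:
            tgt_heads[t] = ROOT
        elif head in alignment:
            # The source head is aligned too, so the arc can be projected.
            tgt_heads[t] = alignment[head]
        # Otherwise the head is unaligned and the arc is left unprojected.

    # A sentence is fully projected when every target token has a tag and a head.
    fully_projected = all(tag is not None for tag in tgt_tags) and all(
        head is not None for head in tgt_heads
    )
    return tgt_tags, tgt_heads, fully_projected


# A fully projected sentence and a partially projected one.
full = project_sentence(["DET", "NOUN", "VERB"], [1, 2, ROOT], {0: 0, 1: 1, 2: 2}, 3)
partial = project_sentence(["DET", "NOUN", "VERB"], [1, 2, ROOT], {1: 0, 2: 1}, 3)
```

Under these assumptions, sentences flagged as fully projected would feed the base tagger and base parser, while the remaining ones keep their `None` gaps and correspond to the partially projected sentences that the abstract says are used in a later learning step.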

Keywords: dependency parsing; POS tagging; resource-poor languages; bilingual data

Authors: Junjie Yu, Wenliang Chen, Zhenghua Li, Min Zhang

Affiliation: School of Computer Science and Technology, Soochow University, Suzhou 215006, Jiangsu, China

Conference type: International conference

Conference name: The 5th Conference on Natural Language Processing and Chinese Computing (NLPCC-ICCPOL 2016)

Conference location: Kunming

Conference language: English

Pages: 1-12

Online publication date: 2016-12-02 (date first posted on the Wanfang platform; not the paper's publication date)
