Building Powerful Dependency Parsers for Resource-Poor Languages
In this paper, we present an approach to building dependency parsers for resource-poor languages without any annotated resources on the target side. Compared with previous studies, our approach requires fewer human-annotated resources. We first train a POS tagger and a parser on the source treebank, then use them to tag and parse the source sentences in the bilingual data. Using projection techniques, we obtain auto-parsed sentences (with POS tags and dependencies) on the target side. From the fully projected sentences, we train a base POS tagger and a base parser for the target language. Most sentence pairs, however, are not fully projected, which leaves a large number of partially projected sentences. To make full use of these partially projected sentences, we implement a learning algorithm to train POS taggers, which leads to better parsing performance. We further exploit a set of features from large-scale monolingual data to help parsing. Finally, we evaluate the proposed approach on the Google Universal Treebank (v2.0, standard). The experimental results show that the proposed approach significantly improves parsing performance.
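The projection step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the projection technique, so the following assumes one-to-one word alignments and a simple direct-transfer rule; the function name `project_sentence`, its parameters, and the example data are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Optional, Tuple


def project_sentence(
    src_tags: List[str],
    src_heads: List[int],
    alignment: Dict[int, int],
    tgt_len: int,
) -> Tuple[List[Optional[str]], List[Optional[int]], bool]:
    """Project POS tags and dependency heads from source to target.

    Assumptions (illustrative only, not the paper's exact method):
    - tokens are 1-indexed and head 0 denotes the artificial root;
    - `alignment` maps a source token index to one target token index,
      with no two source tokens sharing a target token.
    """
    tgt_tags: List[Optional[str]] = [None] * tgt_len
    tgt_heads: List[Optional[int]] = [None] * tgt_len

    for s, t in alignment.items():
        # An aligned target token inherits the source token's POS tag.
        tgt_tags[t - 1] = src_tags[s - 1]
        h = src_heads[s - 1]
        if h == 0:
            # The source root maps to the target root.
            tgt_heads[t - 1] = 0
        elif h in alignment:
            # The arc is projected only if the source head is aligned too.
            tgt_heads[t - 1] = alignment[h]

    # Fully projected: every target token received both a tag and a head.
    fully_projected = all(tag is not None for tag in tgt_tags) and all(
        head is not None for head in tgt_heads
    )
    return tgt_tags, tgt_heads, fully_projected


# Hypothetical source sentence "I see dogs", already tagged and parsed.
src_tags = ["PRON", "VERB", "NOUN"]
src_heads = [2, 0, 2]

# Every target token is aligned: the sentence is fully projected.
full = project_sentence(src_tags, src_heads, {1: 1, 2: 2, 3: 3}, tgt_len=3)

# The third target token is unaligned: the sentence is partially projected.
partial = project_sentence(src_tags, src_heads, {1: 1, 2: 2}, tgt_len=3)
```

Under these assumptions, sentences for which the returned flag is true would serve as training data for the base tagger and parser, while the others keep `None` in their unprojected positions and correspond to the partially projected sentences that the abstract's learning algorithm is meant to exploit.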
dependency parsing; POS tagging; resource-poor languages; bilingual data
Junjie Yu, Wenliang Chen, Zhenghua Li, Min Zhang
School of Computer Science and Technology, Soochow University, Suzhou 215006, Jiangsu, China
International conference
The 5th Conference on Natural Language Processing and Chinese Computing (NLPCC-ICCPOL2016)
Kunming
English
1-12
2016-12-02 (date first posted online on the Wanfang platform; not the paper's publication date)