Improving Unsupervised Neural Machine Translation with Dependency Relationships
Neural networks are now widely used in machine translation (MT) and achieve good results. Neural machine translation (NMT) models require large bilingual parallel corpora for training; however, such corpora are scarce for many languages and domains. Unsupervised neural machine translation (UNMT), which does not need bilingual parallel corpora, has therefore attracted wide interest. State-of-the-art UNMT models are trained with the Transformer and cannot learn syntactic knowledge from the corpora. In this paper, we propose a method to improve UNMT by using dependency relationships extracted through dependency parsing. The extracted dependency relationships are concatenated with the original training data after Byte Pair Encoding (BPE) to obtain new sentence representations for UNMT training. Models that incorporate dependency relationships gain a better understanding of the underlying syntactic structure of sentences, which in turn improves UNMT quality. We leverage linearized parse trees of the training sentences to incorporate syntax into the Transformer architecture without modifying it. Compared with the state-of-the-art UNMT method, our method increases BLEU scores by 5.11 and 9.41 on the WMT 2019 English-French and German-English monolingual news corpora, respectively, with 5 million sentence pairs.
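A minimal, hypothetical sketch of the idea described in the abstract: extract dependency relations with an off-the-shelf parser (spaCy is used here purely for illustration) and append their linearized form to a BPE-segmented sentence. The linearization format, separator token, and parser are assumptions for this sketch; the paper's exact concatenation scheme may differ.

```python
# Illustrative sketch only: linearize dependency relations and concatenate
# them with BPE tokens to form an augmented sentence representation.
import spacy

nlp = spacy.load("en_core_web_sm")  # assumes this spaCy model is installed


def linearize_dependencies(sentence: str) -> str:
    """Flatten the dependency parse into a token sequence of
    relation-label:head-index pairs (one illustrative linearization)."""
    doc = nlp(sentence)
    return " ".join(f"{tok.dep_}:{tok.head.i}" for tok in doc)


def augment_with_dependencies(bpe_sentence: str, raw_sentence: str) -> str:
    """Concatenate the BPE-segmented sentence with its linearized
    dependency relations; "<sep>" is a hypothetical marker token."""
    return bpe_sentence + " <sep> " + linearize_dependencies(raw_sentence)


# Example: "s@@" marks a BPE-split subword in the segmented input.
print(augment_with_dependencies("the cat s@@ at", "the cat sat"))
```

Since the syntactic information enters only through the input token sequence, the Transformer architecture itself is left unchanged, matching the abstract's claim that syntax is incorporated without modifying the model.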
Keywords: Unsupervised neural machine translation · Dependency parsing · Dependency relationship
Jia Xu, Na Ye, GuiPing Zhang
Human-Computer Intelligence Research Center, Shenyang Aerospace University, Shenyang, Liaoning, China
International conference
9th CCF International Conference on Natural Language Processing and Chinese Computing (NLPCC 2020)
Zhengzhou
English
Pages 429-440
2020-10-14 (date the paper first went online on the Wanfang platform; not the paper's publication date)