Conference Paper

Improving Unsupervised Neural Machine Translation with Dependency Relationships

  Nowadays, neural networks have been widely used in the domain of machine translation (MT) and have achieved good results. Neural machine translation (NMT) models need large bilingual parallel corpora for training. However, in many languages or domains, such corpora are scarce. Therefore, the technology of unsupervised neural machine translation (UNMT), which does not need bilingual parallel corpora, has attracted wide interest. State-of-the-art UNMT models use the Transformer for training and cannot learn syntactic knowledge from the corpora. In this paper, we propose a method to improve UNMT by using dependency relationships extracted from dependency parsing. The extracted dependency relationships are concatenated with the original training data after Byte Pair Encoding (BPE) to obtain new sentence representations for UNMT training. Models that incorporate dependency relationships allow for a better understanding of the underlying syntactic structure of sentences and thus improve the quality of UNMT. We leverage linearized parsing trees of the training sentences in order to incorporate syntax into the Transformer architecture without modifying it. Compared with a state-of-the-art UNMT method, our method increased the BLEU scores by 5.11 and 9.41 on the WMT 2019 English-French and German-English monolingual news corpora, respectively, with 5 million sentence pairs.
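The concatenation step the abstract describes can be sketched as follows. This is a minimal illustration, not the authors' exact format: the toy dependency triples, the relation labels, the `<dep>` separator token, and the `@@` BPE continuation marker are all assumptions made for the example.

```python
# Hedged sketch: concatenating linearized dependency relations with
# BPE-encoded tokens to form an augmented sentence representation.
# The parse triples, relation tags, and separator are illustrative only.

def linearize_dependencies(deps):
    """Flatten (head, relation, dependent) triples into a flat token list."""
    tokens = []
    for head, rel, dep in deps:
        tokens.extend([head, rel, dep])
    return tokens

def augment_sentence(bpe_tokens, deps, sep="<dep>"):
    """Append linearized dependency relations after the BPE subwords,
    joined by a separator token, yielding the new input sequence."""
    return bpe_tokens + [sep] + linearize_dependencies(deps)

# Toy example: "the cat sat" after BPE, with a hand-written parse.
bpe_tokens = ["the", "c@@", "at", "sat"]
deps = [("sat", "nsubj", "cat"), ("cat", "det", "the")]

augmented = augment_sentence(bpe_tokens, deps)
print(" ".join(augmented))
# the c@@ at sat <dep> sat nsubj cat cat det the
```

The augmented sequence can then be fed to a standard Transformer UNMT pipeline unchanged, which matches the abstract's claim that syntax is incorporated without modifying the architecture.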

Keywords: Unsupervised neural machine translation · Dependency parsing · Dependency relationship

Jia Xu, Na Ye, GuiPing Zhang

Human-Computer Intelligence Research Center, Shenyang Aerospace University, Shenyang, Liaoning, China

International Conference

9th CCF International Conference on Natural Language Processing and Chinese Computing (NLPCC 2020)

Zhengzhou, China

English

429-440

2020-10-14 (date first posted on the Wanfang platform; not the paper's publication date)