Cost-aware Learning Rate for Neural Machine Translation
Neural Machine Translation (NMT) has drawn much attention in recent years due to its promising translation performance. The conventional optimization algorithm for NMT sets a unified learning rate for every gold target word during training. However, words under different probability distributions should be handled differently. Thus, we propose a cost-aware learning rate method, which produces different learning rates for words with different costs. Specifically, for a gold word that ranks very low or has a large probability gap with the best candidate, the method produces a larger learning rate, and vice versa. Extensive experiments demonstrate the effectiveness of the proposed method.
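The abstract does not give the exact weighting formula, so the following is only a minimal sketch of the general idea in PyTorch: it derives a per-token cost from the gold word's rank and its probability gap to the best candidate, and uses that cost to rescale the per-token loss, which effectively scales the learning rate for that gold word. The additive combination of relative rank and gap, and the function names, are illustrative assumptions, not the authors' formulation.

```python
# Hedged sketch only: the specific cost function (1 + relative rank + gap)
# is an assumption for illustration, not the method from the paper.
import torch
import torch.nn.functional as F


def cost_aware_token_weights(logits, gold):
    """
    logits: (batch, vocab) unnormalized next-token scores
    gold:   (batch,) indices of the gold target words
    Returns per-token weights that grow when the gold word ranks low
    or lags far behind the current best candidate.
    """
    probs = F.softmax(logits, dim=-1)                        # (batch, vocab)
    gold_prob = probs.gather(1, gold.unsqueeze(1)).squeeze(1)
    best_prob, _ = probs.max(dim=-1)
    gap = best_prob - gold_prob                              # in [0, 1)

    # rank of the gold word among all candidates (0 = best)
    rank = (probs > gold_prob.unsqueeze(1)).sum(dim=-1).float()
    rel_rank = rank / probs.size(-1)                         # in [0, 1)

    return 1.0 + rel_rank + gap                              # assumed combination


def cost_aware_nll(logits, gold):
    """Per-token NLL reweighted by the cost factor; scaling the loss of a
    gold word is equivalent to scaling its effective learning rate."""
    weights = cost_aware_token_weights(logits, gold).detach()
    nll = F.cross_entropy(logits, gold, reduction="none")
    return (weights * nll).mean()


if __name__ == "__main__":
    torch.manual_seed(0)
    logits = torch.randn(4, 1000)           # toy batch with a 1000-word vocab
    gold = torch.randint(0, 1000, (4,))
    print(cost_aware_nll(logits, gold))
```

In this sketch the weights are detached so that only the magnitude of each token's gradient changes, mirroring a per-word learning rate rather than altering the loss surface itself.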
Neural Machine Translation; Cost-aware Learning Rate
Yang Zhao, Yining Wang, Jiajun Zhang, Chengqing Zong
National Laboratory of Pattern Recognition, Institute of Automation, CAS; University of Chinese Academy of Sciences
Domestic conference
The 16th China National Conference on Computational Linguistics and the 5th International Symposium on Natural Language Processing Based on Naturally Annotated Big Data
Nanjing
English
1-9
2017-10-13 (date the paper first appeared on the Wanfang platform; not the paper's publication date)