A Chinese text corrector based on seq2seq model
In this paper, we build a Chinese text corrector that precisely corrects spelling mistakes in Chinese text. Our work is inspired by the recently proposed seq2seq model, which treats text correction as a sequence learning problem. First, we propose a biased-decoding method to improve the bilingual evaluation understudy (BLEU) score of our model. Second, we adopt a more reasonable out-of-vocabulary (OOV) token scheme, which enhances the robustness of our correction mechanism. Moreover, to test the performance of the proposed model thoroughly, we establish a corpus of 600,000 sentences drawn from the news data of Sogou Labs. Experiments on this corpus show that our corrector model achieves better correction results.
natural language processing; Chinese text corrector; seq2seq model; biased-decoding
Sunyan Gu; Fei Lang
College of Automation, Nanjing University of Posts and Telecommunications, Nanjing, China; College of Telecommunications and Information Engineering, Nanjing University of Posts and Telecommun
International conference
Nanjing
English
322-325
2017-10-12 (date first posted online on the Wanfang platform; not the paper's publication date)