Conference Topic

A Chinese text corrector based on seq2seq model

In this paper, we build a Chinese text corrector that can precisely correct spelling mistakes in Chinese texts. Our approach is inspired by the recently proposed seq2seq model, which treats text correction as a sequence learning problem. First, we propose a biased-decoding method to improve the bilingual evaluation understudy (BLEU) score of our model. Second, we adopt a more reasonable out-of-vocabulary (OOV) token scheme, which enhances the robustness of our correction mechanism. Moreover, to test the performance of our proposed model thoroughly, we establish a corpus of 600,000 sentences drawn from the news data of Sogou Labs. Experiments on this corpus show that our corrector model achieves better correction results.

natural language processing; Chinese text corrector; seq2seq model; biased-decoding

Sunyan Gu; Fei Lang

College of Automation, Nanjing University of Posts and Telecommunications, Nanjing, China; College of Telecommunications and Information Engineering, Nanjing University of Posts and Telecommun

International conference

The 9th International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery (2017 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery)

Nanjing

English

322-325

2017-10-12 (date first made available online on the Wanfang platform; not the paper's publication date)