Conference Topic

Text Window Denoising Autoencoder: Building Deep Architecture for Chinese Word Segmentation

  Deep learning is the new frontier of machine learning research and has led to many recent breakthroughs in English natural language processing. However, there are inherent differences between Chinese and English, and little work has been done to apply deep learning techniques to Chinese natural language processing. In this paper, we propose a deep neural network model, the text window denoising autoencoder, together with a complete pre-training solution, as a new way to solve classical Chinese natural language processing problems. This method does not require any linguistic knowledge or manual feature design, and can be applied to various Chinese natural language processing tasks, such as Chinese word segmentation. On the PKU dataset of the Chinese word segmentation bakeoff 2005, applying this method decreases the F1 error rate by 11.9% for deep neural network based models. To the best of our knowledge, we are the first to apply deep learning methods to Chinese word segmentation.

Deep Learning; Word Segmentation; Denoising Autoencoder; Chinese Natural Language Processing

Ke Wu, Zhiqiang Gao, Cheng Peng, Xiao Wen

School of Computer Science & Engineering, Southeast University, Nanjing 210096, China

International conference

Second CCF Conference, NLPCC 2013 (第二届自然语言处理与中文计算会议, the 2nd Conference on Natural Language Processing and Chinese Computing)

Chongqing

English

1-12

2013-11-15 (date first posted online on the Wanfang platform; not the paper's publication date)