Multi-view LSTM Language Model with Word-synchronized Auxiliary Feature for LVCSR
Recently, the long short-term memory language model (LSTM LM) has received tremendous interest from both the language and speech communities, owing to its superiority in modeling long-term dependencies. Moreover, integrating auxiliary information, such as context features, into the LSTM LM has been shown to improve perplexity (PPL). However, improperly fed auxiliary information does not yield consistent gains in word error rate (WER) on a large vocabulary continuous speech recognition (LVCSR) task. To solve this problem, this paper proposes a multi-view LSTM LM architecture that incorporates a tagging model. First, an online unidirectional LSTM-RNN is built as a tagging model, which generates word-synchronized auxiliary features. Then, the auxiliary features from the tagging model are combined with the word sequence to train a multi-view unidirectional LSTM LM. Different training modes for the tagging model and the language model are explored and compared. The new architecture is evaluated on the PTB, Fisher English, and SMS Chinese data sets, and the results show that not only is LM PPL improved, but the improvements also transfer well to WER reduction in an ASR rescoring task.
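A minimal sketch of the described two-model pipeline may help make the architecture concrete. The paper does not publish code, so everything below is an illustrative assumption under the abstract's description: a unidirectional LSTM tagger emits a word-synchronized auxiliary feature (here, tag posteriors) at each step with no look-ahead, and the LM consumes that feature concatenated with the word embedding. All class names, dimensions, and the frozen-tagger training mode are hypothetical choices, not the authors' exact configuration.

import torch
import torch.nn as nn

class TaggingLSTM(nn.Module):
    # Online unidirectional LSTM tagger: for each input word it emits a
    # word-synchronized auxiliary feature (tag posteriors), with no look-ahead.
    def __init__(self, vocab_size, emb_dim, hidden_dim, num_tags):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hidden_dim, batch_first=True)
        self.proj = nn.Linear(hidden_dim, num_tags)

    def forward(self, words):                        # words: (B, T)
        h, _ = self.lstm(self.embed(words))          # (B, T, hidden_dim)
        return torch.softmax(self.proj(h), dim=-1)   # (B, T, num_tags)

class MultiViewLSTMLM(nn.Module):
    # LM view: the word embedding is concatenated with the tagger's
    # word-synchronized auxiliary feature at every time step before the LSTM.
    def __init__(self, vocab_size, emb_dim, aux_dim, hidden_dim):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim + aux_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, words, aux):                   # aux: (B, T, aux_dim)
        x = torch.cat([self.embed(words), aux], dim=-1)
        h, _ = self.lstm(x)
        return self.out(h)                           # next-word logits (B, T, V)

# One possible training mode: tagger pre-trained and frozen, LM trained on top.
tagger = TaggingLSTM(vocab_size=10000, emb_dim=128, hidden_dim=256, num_tags=45)
lm = MultiViewLSTMLM(vocab_size=10000, emb_dim=128, aux_dim=45, hidden_dim=256)
words = torch.randint(0, 10000, (4, 20))             # dummy batch of word ids
with torch.no_grad():
    aux = tagger(words)                              # word-synchronized features
logits = lm(words, aux)
# Predict word t+1 from words (and features) up to t, so the model stays causal
# and can be used online for lattice/N-best rescoring in ASR.
loss = nn.functional.cross_entropy(
    logits[:, :-1].reshape(-1, 10000), words[:, 1:].reshape(-1))

Because both views are unidirectional and synchronized per word, rescoring remains strictly left-to-right; training the tagger jointly with the LM, rather than freezing it as above, would be one of the alternative training modes the abstract says are compared.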
LSTM language model, speech recognition, multi-view, auxiliary feature, tagging model
Yue Wu, Tianxing He, Zhehuai Chen, Yanmin Qian, Kai Yu
Key Laboratory of Shanghai Education Commission for Intelligent Interaction and Cognitive Engineering, SpeechLab, Department of Computer Science and Engineering, Brain Science and Technology Research Center, Shanghai Jiao Tong University, Shanghai, China
Domestic conference
The 16th China National Conference on Computational Linguistics and the 5th International Symposium on Natural Language Processing Based on Naturally Annotated Big Data
Nanjing
English
1-13
2017-10-13 (date the paper first appeared on the Wanfang platform; not the paper's publication date)