Deep Neural Network based Feature Extraction Using Convex-nonnegative Matrix Factorization for Low-resource Speech Recognition

Abstract:

Bottleneck features (BNF), together with Gaussian mixture models, have achieved great success compared with acoustic features in low-resource speech recognition. However, the existence of a bottleneck (BN) layer decreases the classification accuracy of deep neural networks (DNN). In this paper, we investigate a better way of extracting DNN-based low-dimensional features using convex-nonnegative matrix factorization (CNMF). First, a DNN is trained without a BN layer. Second, CNMF is applied to the weight matrix of a hidden layer to form a low-dimensional feature extraction layer. Finally, a new type of high-level feature is extracted by forward-passing the input acoustic features. Experiments show that the new feature yields a 1.6-4.6% gain over the BNF baseline system in English and Czech low-resource tasks. When dropout and maxout are introduced, a 3.1-5.6% additional gain over the BNF baseline system is observed, while the training time is reduced.
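The abstract gives no implementation details, so the following is only a minimal sketch of the second and third steps under stated assumptions. It uses the standard multiplicative updates for convex-NMF (Ding, Li and Jordan), which approximate a mixed-sign matrix X as X W G^T with nonnegative W and G. The function names `convex_nmf` and `low_dim_features`, the random initialization, the iteration count, and the way the factor X W is used as a projection layer are all assumptions made for illustration, not the authors' procedure.

```python
import numpy as np

def convex_nmf(X, k, n_iter=200, eps=1e-9, seed=0):
    """Approximate X (d x n, mixed sign) as X @ W @ G.T with W, G >= 0.

    Standard convex-NMF multiplicative updates; the initialization and
    iteration count are illustrative choices, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[1]
    W = rng.random((n, k)) + 0.1   # strictly positive start
    G = rng.random((n, k)) + 0.1
    Y = X.T @ X
    Yp = (np.abs(Y) + Y) / 2       # positive part of X^T X
    Yn = (np.abs(Y) - Y) / 2       # negative part of X^T X
    for _ in range(n_iter):
        # Update G with W fixed.
        YpW, YnW = Yp @ W, Yn @ W
        GWt = G @ W.T
        G *= np.sqrt((YpW + GWt @ YnW + eps) / (YnW + GWt @ YpW + eps))
        # Update W with G fixed.
        GtG = G.T @ G
        W *= np.sqrt((Yp @ G + Yn @ W @ GtG + eps)
                     / (Yn @ G + Yp @ W @ GtG + eps))
    return W, G

def low_dim_features(inputs, weights, W):
    """Hypothetical feature extraction: project inputs through F = weights @ W.

    Assumes the hidden layer computes inputs @ weights, so that
    inputs @ weights is approximated by (inputs @ F) @ G.T and
    inputs @ F serves as a k-dimensional feature.
    """
    return inputs @ (weights @ W)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    weights = rng.standard_normal((20, 30))   # stand-in for a trained layer
    W, G = convex_nmf(weights, k=5)
    frames = rng.standard_normal((4, 20))     # stand-in for input activations
    print(low_dim_features(frames, weights, W).shape)
```

In this sketch the factorized layer replaces one 20 x 30 weight matrix with a 20 x 5 projection followed by a 5 x 30 expansion, and the 5-dimensional projection output plays the role of the low-dimensional feature.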

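Keywords: convex-nonnegative matrix factorization; deep neural network; low-dimensional features; low-resource speech recognition

Authors: Chuxiong Qin, Lianhai Zhang

Affiliation: Zhengzhou Information Science and Technology Institute, Zhengzhou, China

Conference type: International conference

Conference name: 2016 IEEE 2nd Information Technology, Networking, Electronic and Automation Control Conference

Conference location: Chongqing

Conference language: English

Pages: 1082-1086

Online publication date: 2016-03-20 (date first posted on the Wanfang platform; this is not the paper's publication date)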
