Deep Neural Network based Feature Extraction Using Convex-nonnegative Matrix Factorization for Low-resource Speech Recognition
Bottleneck feature (BNF),together with Gaussian mixture models,has achieved great success compared with acoustic features in low-resource speech recognition.However,the existing of BN layer decreases classification accuracy of deep neural networks (DNN).In this paper,we investigate a better way of extracting DNN based low-dimensional features using convex-nonnegative matrix factorization (CNMF).Firstly a DNN is trained without setting the BN layer.Secondly CNMF is applied on the weights matrix of a hidden layer to form a low-dimensional feature extraction layer.Finally a new type of high-level feature is extracted by forward passing input acoustic feature.Experiments show that the new feature produces 1.6-4.6% gain over BNF baseline system in English and Czech low-resource tasks.When dropout and maxout are introduced,3.1-5.6% additional gain over BNF baseline system is observed while the training time reduces.
convex-nonnegative matrix factorization deep neural network low-dimensional features low-resource speech recognition
Chuxiong Qin Lianhai Zhang
Zhengzhou Information Science and Technology Institute Zhengzhou,China
国际会议
重庆
英文
1082-1086
2016-03-20(万方平台首次上网日期,不代表论文的发表时间)