Asymmetric Acoustic Model for Accented Speech Recognition
We propose to improve accented speech recognition performance by using an asymmetric acoustic model. The proposed model is built from reliable accent-specific units and acoustic model reconstruction. The reliable units are extracted with time alignment recognition to cover accent variations at both the acoustic and phonetic levels. The asymmetric acoustic model is obtained through selective decision tree merging together with dynamic Gaussian component selection during model reconstruction. The improved resolution of the proposed model allows it to handle accent variations at different levels and to different degrees. The effectiveness of our approach was evaluated on a typical Chinese accent. Our system outperforms traditional acoustic model reconstruction and MAP adaptation approaches, with relative Syllable Error Rate (SER) reductions of 8.28% and 7.14%, respectively, without sacrificing performance on standard Mandarin speech.
Chao Zhang, Yi Liu, Thomas Fang Zheng
Center for Speech and Language Technologies, Division of Technology Innovation and Development, Tsin
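International conference
2011 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC 2011)
Xi'an
English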
1-5
2011-10-18 (date first posted online on the Wanfang platform; not the paper's publication date)