Parameter Mask Speech Enhancement for Robust Automatic Speech Recognition

Abstract:

This paper proposes and analyzes a parameter mask for speech enhancement in robust automatic speech recognition (ASR). Within the framework of computational auditory scene analysis (CASA), the ideal binary mask (IBM) improves the signal-to-noise ratio (SNR) but does not yield a corresponding improvement in ASR performance; the gap between the SNR and ASR improvements is large. For a conventional ASR system, the main goal is to provide an energy distribution similar to that of the clean target speech, regardless of whether the energy comes from speech or noise. We use the SNR in each time-frequency (T-F) unit to generate the parameter mask (PM), which is used to estimate the clean speech energy from the mixture signal. Experimental results show that the proposed method achieves higher ASR performance than the IBM, with only a very small decrease in SNR performance.
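The contrast between a binary mask and an SNR-driven soft mask can be illustrated with a minimal sketch. The abstract does not give the paper's actual PM formula, so the soft mask below is an assumed stand-in: a Wiener-like ratio SNR/(1+SNR) per T-F unit. The function names, the 0 dB criterion `lc_db`, and the toy energies are all hypothetical, and speech and noise are assumed uncorrelated so that mixture energy is the sum of the two.

```python
import math

def ideal_binary_mask(speech_energy, noise_energy, lc_db=0.0):
    """IBM: 1 where the local SNR (in dB) exceeds the criterion lc_db, else 0.
    Assumes strictly positive energies in every T-F unit."""
    mask = []
    for s, n in zip(speech_energy, noise_energy):
        snr_db = 10.0 * math.log10(s / n)
        mask.append(1.0 if snr_db > lc_db else 0.0)
    return mask

def snr_ratio_mask(speech_energy, noise_energy):
    """Assumed soft mask: SNR / (1 + SNR), which equals s / (s + n).
    This is a stand-in, not the paper's parameter mask definition."""
    return [s / (s + n) for s, n in zip(speech_energy, noise_energy)]

def apply_mask(mask, mixture_energy):
    """Scale the mixture energy in each T-F unit by the mask value."""
    return [m * x for m, x in zip(mask, mixture_energy)]

# Toy energies for three T-F units (hypothetical values).
speech = [4.0, 1.0, 0.25]
noise = [1.0, 1.0, 1.0]
# Assumption: speech and noise are uncorrelated, so energies add.
mixture = [s + n for s, n in zip(speech, noise)]

ibm_estimate = apply_mask(ideal_binary_mask(speech, noise), mixture)
soft_estimate = apply_mask(snr_ratio_mask(speech, noise), mixture)
```

In this toy case the binary mask either passes the whole mixture energy of a unit (speech plus noise) or discards the unit entirely, while the ratio mask scales each unit so that the retained energy matches the clean speech energy. That is the sense in which an SNR-based mask can give an energy distribution closer to clean speech.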

Keywords: Parameter mask; Speech enhancement; Automatic speech recognition (ASR); Computational auditory scene analysis (CASA); Time-frequency (T-F)

Authors: Yi JIANG, Xi LU, Ying-Ze WANG

Affiliation: The Quartermaster Equipment Research Institute, CPLA, Beijing, P.R. China

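Conference type: Domestic conference

Conference name: 2014 International Conference on Computer Science and Software Engineering

Conference location: Hangzhou

Conference language: English

Pages: 1-6

Online publication date: 2014-10-18 (date first posted on the Wanfang platform; not necessarily the paper's publication date)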
