Conference Topic

Parameter Mask Speech Enhancement for Robust Automatic Speech Recognition

  A parameter mask is proposed and analyzed in this paper for speech enhancement in robust automatic speech recognition (ASR). Within the framework of computational auditory scene analysis (CASA), the ideal binary mask (IBM) is used to improve the signal-to-noise ratio (SNR), but it does not bring a corresponding improvement in ASR performance; the gap between SNR improvement and ASR improvement is large. For a conventional ASR system, the main goal is to provide an energy distribution similar to that of the clean target speech, regardless of whether the energy comes from speech or noise. We use the SNR in each time-frequency (T-F) unit to generate a parameter mask (PM), which is used to estimate the clean speech energy from the mixture signal. Experimental results show that the proposed method achieves higher ASR performance than the IBM, with only a very small decrease in SNR performance.
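The contrast drawn in the abstract can be illustrated with a small sketch. This record does not give the paper's actual PM formula, so the soft mask below is an assumption: a Wiener-like ratio, SNR / (1 + SNR), standing in for "a mask generated from the per-unit SNR". The function names, the 0 dB local criterion, and the toy energies are all hypothetical, and the mixture energy is assumed to be the sum of speech and noise energies (uncorrelated sources).

```python
import math


def local_snr_db(speech_energy, noise_energy, eps=1e-12):
    """Per-unit SNR in dB from premixed speech and noise energies (T-F grids)."""
    return [
        [10.0 * math.log10((s + eps) / (n + eps)) for s, n in zip(s_row, n_row)]
        for s_row, n_row in zip(speech_energy, noise_energy)
    ]


def ideal_binary_mask(snr_db, lc_db=0.0):
    """IBM: keep a T-F unit (1) if its local SNR exceeds the criterion, else drop it (0)."""
    return [[1.0 if v > lc_db else 0.0 for v in row] for row in snr_db]


def snr_ratio_mask(snr_db):
    """Assumed soft mask (NOT the paper's PM): share of mixture energy due to speech."""
    mask = []
    for row in snr_db:
        mask_row = []
        for v in row:
            snr_lin = 10.0 ** (v / 10.0)  # dB back to a linear energy ratio
            mask_row.append(snr_lin / (1.0 + snr_lin))
        mask.append(mask_row)
    return mask


def apply_mask(mixture_energy, mask):
    """Weight each T-F unit of the mixture energy by the mask value."""
    return [[m * e for m, e in zip(m_row, e_row)]
            for m_row, e_row in zip(mask, mixture_energy)]


# Toy 2x3 T-F grid of energies (made-up numbers, for illustration only).
speech = [[4.0, 1.0, 0.5], [9.0, 0.1, 2.0]]
noise = [[1.0, 1.0, 2.0], [1.0, 0.9, 2.0]]
# Assumes speech and noise are uncorrelated, so their energies add.
mixture = [[s + n for s, n in zip(s_row, n_row)]
           for s_row, n_row in zip(speech, noise)]

snr = local_snr_db(speech, noise)
ibm_est = apply_mask(mixture, ideal_binary_mask(snr))
soft_est = apply_mask(mixture, snr_ratio_mask(snr))
```

In this toy case the binary mask either passes the whole mixture energy of a unit (speech plus noise) or zeroes it, so the resulting energy distribution departs from that of the clean speech, while the assumed ratio mask recovers the clean speech energy in every unit. That is the kind of mismatch the abstract attributes to the IBM, shown here only under the stated assumptions.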

Parameter mask; Speech enhancement; Automatic speech recognition (ASR); Computational auditory scene analysis (CASA); Time-frequency (T-F)

Yi JIANG, Xi LU, Ying-Ze WANG

The Quartermaster Equipment Research Institute, CPLA, Beijing, P.R. China

Domestic conference

2014 International Conference on Computer Science and Software Engineering

Hangzhou

English

1-6

2014-10-18 (date first posted online on the Wanfang platform; not the paper's publication date)