Conference topic

MTF-based Sub-band Power-envelope Restoration for Robust Speech Recognition in Noisy Reverberant Environments

Many speech enhancement methods have been proposed to suppress the effect of either background noise or reverberation on automatic speech recognition (ASR) systems. However, most of these methods cannot reduce both effects simultaneously, and none reduces both within a unified strategy for ASR systems in noisy reverberant environments. We previously proposed a method for restoring the speech power envelope from noisy reverberant speech based on a simple modulation transfer function (MTF) concept. The method does not require the room impulse response or the noise conditions to be measured. In this study, we further tested the proposed method as a front-end for ASR systems in noisy reverberant environments. Noisy reverberant speech signals were obtained by adding white noise to reverberant speech, which was produced by convolving clean speech signals (from AURORA-2J, a continuous Japanese digit speech corpus) with artificially generated room impulse responses. Recognition performance with the conventional Mel-frequency cepstral coefficient (MFCC) feature was used as the baseline. Compared with the baseline, the proposed method achieved a 12.19% relative improvement in the error reduction rate, averaged over all tested noisy reverberant environments.
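The data-generation step described in the abstract (convolve clean speech with an artificial impulse response, then add white noise) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the abstract does not specify the impulse-response model, so an exponentially decaying white-noise carrier is assumed here, and the function names, the reverberation time `rt60`, the SNR value, and the sampling rate `fs` are all hypothetical.

```python
import numpy as np

def make_impulse_response(rt60, fs, rng):
    """Assumed model (not from the abstract): exponentially decaying white noise."""
    n = int(rt60 * fs)
    t = np.arange(n) / fs
    # exp(-6.9 * t / rt60) has decayed by roughly 60 dB at t = rt60
    h = np.exp(-6.9 * t / rt60) * rng.standard_normal(n)
    return h / np.sqrt(np.sum(h ** 2))  # normalize to unit energy

def add_white_noise(x, snr_db, rng):
    """Add white Gaussian noise scaled to the requested SNR (in dB)."""
    noise = rng.standard_normal(len(x))
    sig_pow = np.mean(x ** 2)
    noise_pow = np.mean(noise ** 2)
    scale = np.sqrt(sig_pow / (noise_pow * 10 ** (snr_db / 10)))
    return x + scale * noise

def simulate(clean, rt60, snr_db, fs, seed=0):
    """Reverberate the clean signal, then add white noise at snr_db."""
    rng = np.random.default_rng(seed)
    h = make_impulse_response(rt60, fs, rng)
    # Truncate the convolution tail so the output matches the input length
    reverberant = np.convolve(clean, h)[: len(clean)]
    return add_white_noise(reverberant, snr_db, rng)

if __name__ == "__main__":
    fs = 8000  # hypothetical sampling rate
    clean = np.sin(2 * np.pi * 440 * np.arange(fs) / fs)  # stand-in for speech
    noisy = simulate(clean, rt60=0.5, snr_db=10.0, fs=fs)
    print(noisy.shape)
```

In this sketch the SNR is defined relative to the reverberant signal rather than the clean one; the abstract does not say which convention was used.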

Shota Morita, Xugang Lu, Masashi Unoki, Masato Akagi, Rüdiger Hoffmann

School of Information Science, Japan Advanced Institute of Science and Technology, Japan; National Institute of Information and Communications Technology, Japan; Laboratory of Acoustics and Speech Communication, Dresden University of Technology, Germany

International conference

2011 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC 2011)

Xi'an

English

1-4

2011-10-18 (date first posted online on the Wanfang platform; not the paper's publication date)