Conference Topic

Voice Activity Detection Based on a Sequential Gaussian Mixture Model

Voice activity detection (VAD) is a basic component of noise reduction algorithms. In this paper, we propose a voice activity detector based on a sequential Gaussian mixture model (SGMM) in the log-spectral domain. The model comprises two Gaussian components, which describe the speech and nonspeech log-power distributions, respectively. The initial distributions are first estimated with the EM algorithm and then sequentially updated in an on-line manner. From the SGMM, a self-regulating discrimination threshold is derived for each subband. The proposed VAD does not rely on the assumption, common to most VADs, that the first several frames of an utterance are nonspeech. Moreover, the speech presence probability in the time-frequency domain is obtained as a byproduct. We tested the detector on speech from the TIMIT database and noise from the NOISEX-92 database. The evaluations showed promising performance.
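The pipeline described in the abstract (batch EM initialization, on-line updating, and a threshold derived from the model) can be illustrated for a single subband with the minimal Python sketch below. It is not the authors' implementation: the abstract does not give the update equations, so the exponential forgetting factor `alpha`, the variance floor `VAR_FLOOR`, the choice of the equal-posterior point as the threshold, and all function names are assumptions made for illustration.

```python
import math
import random

VAR_FLOOR = 1e-3  # assumed floor that keeps variances strictly positive


def log_gauss(x, mean, var):
    """Log-density of a 1-D Gaussian."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def posteriors(model, x):
    """Posterior of [nonspeech, speech] for one log-power value x."""
    weights, means, varis = model
    logs = [math.log(max(weights[k], 1e-12)) + log_gauss(x, means[k], varis[k])
            for k in range(2)]
    top = max(logs)  # subtract the maximum for numerical stability
    exps = [math.exp(v - top) for v in logs]
    total = exps[0] + exps[1]
    return [exps[0] / total, exps[1] / total]


def fit_em(samples, iters=50):
    """Batch EM for two components; component 0 starts at the lower mean."""
    lo, hi = min(samples), max(samples)
    spread = max(hi - lo, 1e-3)
    model = ([0.5, 0.5],
             [lo + 0.25 * spread, lo + 0.75 * spread],
             [(0.25 * spread) ** 2, (0.25 * spread) ** 2])
    weights, means, varis = model
    for _ in range(iters):
        resp = [posteriors(model, x) for x in samples]  # E-step
        for k in range(2):                              # M-step
            nk = sum(r[k] for r in resp) + 1e-12
            means[k] = sum(r[k] * x for r, x in zip(resp, samples)) / nk
            var = sum(r[k] * (x - means[k]) ** 2
                      for r, x in zip(resp, samples)) / nk
            varis[k] = max(var, VAR_FLOOR)
            weights[k] = nk / len(samples)
    return model


def sequential_update(model, x, alpha=0.05):
    """One on-line step with an assumed forgetting factor alpha.

    Updates the model in place and returns the speech posterior of x.
    """
    weights, means, varis = model
    post = posteriors(model, x)
    for k in range(2):
        step = alpha * post[k]
        weights[k] = (1.0 - alpha) * weights[k] + alpha * post[k]
        delta = x - means[k]
        means[k] += step * delta
        varis[k] = max((1.0 - step) * varis[k] + step * delta * delta, VAR_FLOOR)
    return post[1]


def decision_threshold(model, steps=40):
    """Bisect between the two means for the point where the speech posterior is 0.5."""
    lo, hi = model[1][0], model[1][1]
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if posteriors(model, mid)[1] < 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Synthetic log-power values: a low "nonspeech" mode and a high "speech" mode.
    random.seed(0)
    frames = ([random.gauss(0.0, 1.0) for _ in range(300)]
              + [random.gauss(8.0, 2.0) for _ in range(200)])
    random.shuffle(frames)
    model = fit_em(frames)
    print("threshold:", decision_threshold(model))
    print("speech probability of a new frame:", sequential_update(model, 7.5))
```

In this sketch the value returned by `sequential_update` plays the role of the speech presence probability for one time-frequency bin, and the threshold moves with the model as new frames arrive, which is one plausible reading of "self-regulating".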

Dongwen Ying, Junfeng Li, Qiang Fu, Yonghong Yan, Jianwu Dang

The Key Laboratory of Speech Acoustics and Content Understanding, Chinese Academy of Sciences; Tianjin University, School of Computer Science and Technology

International conference

2011 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC 2011)

Xi'an

English

1-6

2011-10-18 (date first posted online on the Wanfang platform; not the publication date of the paper)