A Robust Voice Activity Detector Based on Weibull and Gaussian Mixture Distribution

Abstract:

In this paper, we focus on the observation and state duration distributions in hidden semi-Markov model (HSMM)-based voice activity detection (VAD). To perform robustly in noisy environments, acoustic features of the noisy speech are first extracted by a Mel-frequency cepstrum processor after the raw speech has been filtered with a modified Wiener filter. Based on statistics gathered from the TIMIT database, we use Gaussian mixture distributions (GMD) for both the speech and non-speech states to relate the MFCC feature vectors to the state sequences. Unlike in an HMM, the transition probability in an HSMM is not constant but depends on the time elapsed in the last state; it is modeled by a Weibull distribution (WD) in this paper. The final VAD decision is made by a likelihood ratio test (LRT) incorporating state prior knowledge. An adaptive threshold is also used to achieve better detection results. Experiments on noisy speech data show that the proposed method performs more robustly and accurately than the standard ITU-T G.729B, AMR2, HMM-based VAD, and VAD using a Laplacian-Gaussian model.
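The duration-dependent transition described in the abstract can be illustrated with a small sketch. It is not the authors' formulation, which the abstract does not give. It assumes a discrete-time hazard derived from the Weibull survival function S(d) = exp(-(d/scale)^shape), and the function name and parameter values are hypothetical, chosen only for illustration.

```python
import math

def leave_probability(d, shape, scale):
    """Probability that a state ends within the next frame, given that it
    has already lasted d frames, under a Weibull duration model.

    Assumption (not from the paper): the discrete hazard is
    1 - S(d + 1) / S(d), with S(d) = exp(-(d / scale) ** shape).
    """
    # Difference of the cumulative hazards at d + 1 and d.
    delta = ((d + 1) / scale) ** shape - (d / scale) ** shape
    # 1 - exp(-delta), computed stably for small delta.
    return -math.expm1(-delta)

if __name__ == "__main__":
    # Hypothetical parameters: scale of 20 frames, two different shapes.
    for d in (1, 10, 50):
        constant = leave_probability(d, shape=1.0, scale=20.0)
        growing = leave_probability(d, shape=2.0, scale=20.0)
        print(d, round(constant, 4), round(growing, 4))
```

With shape equal to 1 the Weibull distribution reduces to an exponential one, so the leave probability does not depend on d, which matches the constant transition probability of an ordinary HMM. With shape greater than 1 the probability of leaving grows the longer the state has lasted.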

Keywords: Voice Activity Detection; Gaussian Mixture Distribution; Weibull Distribution

Authors: Yuan Liang, Xianglong Liu, Mi Zhou, Yihua Lou, Baosong Shan

Affiliation: State Key Laboratory of Software Development Environment, Beihang University, Beijing, China

Conference type: International conference

Conference name: 2010 2nd International Conference on Signal Processing System (ICSPS 2010)

Conference location: Dalian

Conference language: English

Pages: 866-870

Online publication date: 2010-07-05 (date first posted on the Wanfang platform; not the paper's publication date)
