Conference Topic

A Robust Voice Activity Detector Based on Weibull and Gaussian Mixture Distribution

In this paper, we focus on the observation and state duration distributions in hidden semi-Markov model (HSMM)-based voice activity detection (VAD). To perform robustly in noisy environments, the raw speech is first filtered with a modified Wiener filter, and acoustic features of the noisy speech are then extracted by a Mel-frequency cepstrum processor. Based on statistics gathered from the TIMIT database, we use Gaussian mixture distributions (GMD) for both the speech and non-speech states to relate the MFCC feature vectors to the state sequences. Unlike in an HMM, the transition probability in an HSMM is not constant but depends on the time elapsed in the last state; in this paper it is modeled by a Weibull distribution (WD). The final VAD decision is made by a likelihood ratio test (LRT) incorporating state prior knowledge, and an adaptive threshold is used to achieve better detection results. Experiments on noisy speech data show that the proposed method performs more robustly and accurately than the standard ITU-T G.729B, AMR2, an HMM-based VAD, and a VAD using a Laplacian-Gaussian model.
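The duration-dependent transition described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the frame-level discretization, the function names, and the shape and scale values are all assumptions chosen for illustration. It computes the probability of leaving a state at frame d, given that the state has already lasted d - 1 frames, from the Weibull survival function.

```python
import math


def weibull_survival(d, shape, scale):
    """P(duration > d) under a Weibull distribution (hypothetical helper)."""
    return math.exp(-((d / scale) ** shape))


def leave_probability(d, shape, scale):
    """Probability of leaving the state at frame d (d >= 1), given that
    the state has already lasted d - 1 frames.

    This is a discrete-time hazard derived from the Weibull survival
    function; the discretization is an assumption, not taken from the paper.
    """
    return 1.0 - weibull_survival(d, shape, scale) / weibull_survival(d - 1, shape, scale)


# Illustrative parameters only (not values reported in the paper).
durations = (1, 10, 50)
probs = [leave_probability(d, shape=2.0, scale=30.0) for d in durations]

# With shape = 1 the Weibull reduces to an exponential distribution, so the
# leave probability no longer depends on the elapsed time (the HMM-like case).
flat = [leave_probability(d, shape=1.0, scale=30.0) for d in durations]
```

With a shape parameter above 1, the leave probability grows the longer the state has lasted, which is the behavior a constant HMM transition probability cannot express.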

Voice Activity Detection; Gaussian Mixture Distribution; Weibull Distribution

Yuan Liang, Xianglong Liu, Mi Zhou, Yihua Lou, Baosong Shan

State Key Laboratory of Software Development Environment, Beihang University, Beijing, China

International conference

2010 2nd International Conference on Signal Processing System (ICSPS 2010)

Dalian

English

866-870

2010-07-05 (date first posted online on the Wanfang platform; not the paper's publication date)