Conference paper

COMBINING MAP AND MLLR APPROACHES FOR SVM BASED SPEAKER RECOGNITION WITH A MULTI-CLASS MLLR TECHNIQUE

Gaussian mixture models with a universal background model (UBM) have been the standard method for speaker recognition. Typically, maximum a posteriori (MAP) adaptation or maximum likelihood linear regression (MLLR) is used to adapt the means of the UBM. Combined with support vector machine (SVM) modeling, these approaches can achieve excellent performance. MLLR is efficient when the amount of adaptation data is limited, but has poor asymptotic properties as the amount of data increases. MAP estimation has good asymptotic properties, but provides only a moderate improvement when the amount of adaptation data is small. To take advantage of both approaches and improve recognition performance, this paper presents a new speaker adaptation approach consisting of MAP adaptation followed by MLLR adaptation. The approach is extended with a multi-class MLLR technique, which clusters the Gaussian components into regression classes and applies a different transform to each class. Experiments on the NIST 2006 SRE corpus show that the proposed approach improves on both the MLLR and the MAP adaptation systems.
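The two adaptation steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a one-dimensional GMM, the common relevance-factor form of MAP mean adaptation, and per-class affine transforms (a, b) that are supplied by hand rather than estimated by maximum likelihood. The function names, the `relevance` parameter, and the `class_of` mapping are hypothetical and chosen only for illustration.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def map_adapt_means(data, weights, means, variances, relevance=16.0):
    """Relevance-MAP update of the means of a 1-D GMM.

    Weights and variances stay fixed; `relevance` is an assumed
    hyperparameter, not a value taken from the paper.
    """
    counts = [0.0] * len(means)
    sums = [0.0] * len(means)
    for x in data:
        likes = [w * gaussian_pdf(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
        total = sum(likes)
        for k, like in enumerate(likes):
            post = like / total          # posterior of component k given x
            counts[k] += post
            sums[k] += post * x
    adapted = []
    for k, mu in enumerate(means):
        # alpha grows with the soft count, so well-observed components
        # move further from the UBM mean toward the data mean.
        alpha = counts[k] / (counts[k] + relevance)
        data_mean = sums[k] / counts[k] if counts[k] > 0 else mu
        adapted.append(alpha * data_mean + (1.0 - alpha) * mu)
    return adapted


def apply_multiclass_mllr(means, class_of, transforms):
    """Apply mean' = a * mean + b per component.

    class_of[k] is the regression class of component k and
    transforms[c] = (a, b) is that class's transform. Estimating
    the transforms is omitted in this sketch.
    """
    return [transforms[class_of[k]][0] * m + transforms[class_of[k]][1]
            for k, m in enumerate(means)]
```

In this sketch the MAP-adapted means would be passed to `apply_multiclass_mllr`, mirroring the MAP-then-MLLR order stated in the abstract; how the resulting parameters are turned into SVM features is not covered here.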

Speaker recognition; maximum a posteriori; maximum likelihood linear regression; support vector machine

Haipeng Wang, Xiang Zhang, Xiang Xiao, Jianping Zhang, Yonghong Yan

ThinkIT Speech Lab, Institute of Acoustics, Chinese Academy of Sciences, Beijing, P.R. China

International conference

Second International Symposium on Information Science and Engineering

Shanghai

English

447-450

2009-12-26 (date first posted online on the Wanfang platform; not the paper's publication date)