A Multi-Model Method for Short-Utterance Speaker Recognition

Abstract:

The length of the test speech greatly influences the performance of GMM-UBM based text-independent speaker recognition systems. For example, when the valid speech is as short as 1 to 5 seconds, performance decreases significantly, because the GMM-UBM based method is statistical and depends on sufficient data. Considering that text information is helpful to speaker recognition, a multi-model method is proposed to improve short-utterance speaker recognition (SUSR) in Chinese. We build a few phoneme class models for each speaker to represent different parts of the characteristic space, and fuse the scores of the test data on these models, with the purpose of increasing the matching degree between the training models and the test utterance. Experimental results show that the proposed method achieves a relative EER reduction of about 26% compared with the traditional GMM-UBM method.
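The scoring idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not state the fusion rule, so a frame-count-weighted average of per-class log-likelihood ratios is assumed here. The class names (`vowel`, `nasal`), the per-class UBMs, the toy one-dimensional models, and the function names are all hypothetical, and the assignment of test frames to phoneme classes is assumed to be given.

```python
import math

def gmm_loglik(frame, gmm):
    """Log-likelihood of one feature frame under a diagonal-covariance GMM.

    gmm is a list of (weight, means, variances) tuples, one per component.
    """
    log_terms = []
    for weight, means, variances in gmm:
        log_p = math.log(weight)
        for x, mu, var in zip(frame, means, variances):
            log_p += -0.5 * (math.log(2 * math.pi * var) + (x - mu) ** 2 / var)
        log_terms.append(log_p)
    peak = max(log_terms)  # log-sum-exp for numerical stability
    return peak + math.log(sum(math.exp(t - peak) for t in log_terms))

def class_llr(frames, speaker_gmm, ubm):
    """Average per-frame log-likelihood ratio of a speaker model against a UBM."""
    total = sum(gmm_loglik(f, speaker_gmm) - gmm_loglik(f, ubm) for f in frames)
    return total / len(frames)

def fused_score(frames_by_class, speaker_models, ubms):
    """Fuse per-class scores, weighting each class by its frame count.

    The weighting is an assumption; the abstract only says scores are fused.
    """
    total, n_frames = 0.0, 0
    for cls, frames in frames_by_class.items():
        if not frames:
            continue  # a short utterance may not cover every phoneme class
        total += len(frames) * class_llr(frames, speaker_models[cls], ubms[cls])
        n_frames += len(frames)
    if n_frames == 0:
        raise ValueError("no frames to score")
    return total / n_frames

# Toy data: single-component, one-dimensional models (hypothetical values).
ubms = {"vowel": [(1.0, [0.0], [1.0])], "nasal": [(1.0, [0.0], [1.0])]}
target = {"vowel": [(1.0, [1.0], [1.0])], "nasal": [(1.0, [-1.0], [1.0])]}
impostor = {"vowel": [(1.0, [-1.0], [1.0])], "nasal": [(1.0, [1.0], [1.0])]}
test_frames = {"vowel": [[1.0], [1.0]], "nasal": [[-1.0]]}

target_score = fused_score(test_frames, target, ubms)
impostor_score = fused_score(test_frames, impostor, ubms)
```

In this toy setup the matching speaker scores above the impostor. A real system would compare the fused score against a threshold tuned on development data.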

Authors: Chenhao Zhang, Xiaojun Wu, Linlin Wang, Gang Wang, Jyh-Shing Roger Jang, Thomas Fang Zheng

Author affiliations: Center for Speech and Language Technologies, Division of Technical Innovation and Development, Tsingh Department of Computer Science, Tsing Hua University, Hsin-chu

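Conference type: International conference

Conference name: 2011 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC 2011)

Conference location: Xi'an

Conference language: English

Pages: 1-4

Online publication date: 2011-10-18 (date first made available on the Wanfang platform; not the paper's publication date)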
