Training Robust Acoustic Models Using Features of Pseudo-Speakers Generated by Inverse CMLLR Transformations
In this paper a novel speech feature generationbased acoustic model training method is proposed. For decades, speaker adaptation methods have been widely used. All existing adaptation methods need adaptation data. However, our proposed method creates speaker-independent acoustic models that cover not only known but also unknown speakers. We do this by adopting inverse maximum likelihood linear regression (MLLR) transformation-based feature generation, and then train our models using these features. First we obtain MLLR transformation matrices from a limited number of existing speakers. Then we extract the bases of the MLLR transformation matrices using PCA. The distribution of the weight parameters to express the MLLR transformation matrices for the existing speakers are estimated. Next we generate pseudo-speaker MLLR transformations by sampling the weight parameters from the distribution, and apply the inverse of the transformation to the normalized existing speaker features to generate the pseudospeakers’ features. Finally, using these features, we train the acoustic models. Evaluation results show that the acoustic models which are created are robust for unknown speakers.
Arata Itoh Sunao Hara Norihide Kitaoka Kazuya Takeda
Nagoya University, Nagoya, Japan
国际会议
2011亚太信号与信息处理协会年度峰会(APSIPAASC 2011)
西安
英文
1-5
2011-10-18(万方平台首次上网日期,不代表论文的发表时间)