Voice Conversion Using Structured Gaussian Mixture Model

Abstract:

The Gaussian Mixture Model (GMM) is commonly used in voice conversion. However, traditional GMM-based voice conversion usually extracts the conversion function from a parallel corpus, which greatly limits the application of the technology. To overcome this drawback, a structured Gaussian Mixture Model (SGMM) is applied to model each speaker's acoustic feature distribution. In particular, the two speakers' isolated SGMMs are aligned based on Acoustic Universal Structure (AUS) theory. The conversion function is then extracted from the two aligned SGMMs in a manner similar to the conventional method. Subjective listening tests indicate that the proposed method achieves speech quality and speaker individuality equivalent to those of the conventional method.
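For orientation, the conventional GMM conversion function that the abstract refers to is commonly written as a posterior-weighted sum of per-component linear regressions. The sketch below illustrates that general form only. It is not taken from the paper, it does not cover the SGMM/AUS alignment step, and it assumes diagonal covariances for simplicity. The names `convert_frame`, `mu_x`, `mu_y`, `var_x`, and `cov_yx` are hypothetical.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at vector x."""
    return -0.5 * sum(
        math.log(2 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def convert_frame(x, weights, mu_x, mu_y, var_x, cov_yx):
    """Map one source feature vector x to the target speaker's space.

    Assumed (hypothetical) parameter layout, one entry per mixture component:
      weights[i] : mixture weight
      mu_x[i]    : source mean vector
      mu_y[i]    : target mean vector
      var_x[i]   : diagonal source variances
      cov_yx[i]  : diagonal target/source cross-covariances
    """
    # Posterior probability of each component given x, computed in the
    # log domain and shifted by the maximum for numerical stability.
    log_lik = [
        math.log(w) + log_gauss_diag(x, m, v)
        for w, m, v in zip(weights, mu_x, var_x)
    ]
    peak = max(log_lik)
    lik = [math.exp(value - peak) for value in log_lik]
    total = sum(lik)
    post = [value / total for value in lik]

    # Posterior-weighted sum of per-component linear regressions.
    y = [0.0] * len(mu_y[0])
    for p, mx, my, vx, cyx in zip(post, mu_x, mu_y, var_x, cov_yx):
        for d in range(len(y)):
            y[d] += p * (my[d] + cyx[d] / vx[d] * (x[d] - mx[d]))
    return y
```

In the conventional setting these parameters come from a joint GMM trained on time-aligned parallel utterances. According to the abstract, the proposed method instead derives the conversion function from two aligned SGMMs, so no parallel corpus is needed.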

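Keywords: voice conversion; SGMM; AUS

Authors: Daojian Zeng, Yibiao Yu

Affiliation: School of Electronic and Information Engineering, Soochow University, Suzhou, China

Conference type: International conference

Conference: 2010 IEEE 10th International Conference on Signal Processing (ICSP 2010)

Conference location: Beijing

Conference language: English

Pages: 541-544

Online publication date: 2010-08-24 (date first posted on the Wanfang platform; not the paper's publication date)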
