An Improved Method for Voice Conversion Based on Gaussian Mixture Model

Abstract:

Voice conversion is a technology that modifies a source speaker's voice characteristics to mimic the voice of a target speaker. It has broad prospects and potential value in both technical and entertainment applications, such as text-to-speech (TTS) and toys. This paper develops an improved voice conversion method. For voiced frames, the conversion is implemented with a Gaussian mixture model (GMM) built on the speech transformation and representation using adaptive interpolation of weighted spectrum (STRAIGHT) algorithm. For unvoiced frames, the spectral envelope is stretched or compressed according to the ratio of the vocal tract lengths (VTL) of the source and the target. A subjective experiment shows that introducing VTL improves the quality of the converted voice.
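The unvoiced-frame step described in the abstract can be illustrated with a minimal sketch. It assumes a simple linear frequency warp of a sampled spectral envelope; the function name `warp_envelope`, the factor `alpha`, and the choice `alpha = source VTL / target VTL` are illustrative assumptions, not details taken from the paper. The direction of the ratio follows the usual uniform-tube approximation, in which formant frequencies scale inversely with vocal tract length.

```python
import numpy as np

def warp_envelope(envelope, alpha):
    """Linearly warp a spectral envelope along the frequency axis.

    alpha > 1 stretches the envelope (features move to higher bins);
    alpha < 1 compresses it (features move to lower bins).
    Hypothetical helper: the paper's exact warping rule is not given here.
    """
    envelope = np.asarray(envelope, dtype=float)
    bins = np.arange(envelope.shape[0])
    # Output bin k reads its value from source bin k / alpha.
    source_positions = bins / alpha
    # np.interp holds the edge value for positions past the last bin.
    return np.interp(source_positions, bins, envelope)

# Assumed example values: source VTL 17 cm, target VTL 14 cm.
alpha = 17.0 / 14.0
envelope = np.hanning(64)           # stand-in for one frame's envelope
converted = warp_envelope(envelope, alpha)
```

With `alpha = 1` the envelope is returned unchanged, which is a convenient sanity check when the source and target are assumed to have the same vocal tract length.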

Keywords: voice conversion; GMM; STRAIGHT; vocal tract length (VTL)

Authors: Xie Chen, Wei-Qiang Zhang, Jia Liu, Xiuguo Bao

Affiliations: Tsinghua National Laboratory for Information Science and Technology, Department of Electronic Enginee; CNCERT/CC, Yumin Road, Chaoyang District, Beijing 100029, China

Conference type: International conference

Conference name: The 2010 International Conference on Computer Application and System Modeling (ICCASM 2010)

Conference location: Taiyuan

Conference language: English

Pages: 404-407

Online publication date: 2010-10-22 (date first posted on the Wanfang platform; not the paper's publication date)
