An HMM-based Vietnamese Speech Synthesis System

Abstract:

This paper describes an approach to building a Vietnamese speech synthesis system in which speech is synthesized directly from hidden Markov models (HMMs). Spectrum, pitch, and phone duration are modeled simultaneously in the HMMs, and their parameter distributions are clustered independently using decision-tree-based context clustering algorithms. Several contextual factors, such as tone types, syllables, words, phrases, and utterances, are identified and taken into account when generating the spectrum, pitch, and state duration. The resulting system yields significant correctness for a tonal language and a fair reproduction of the prosody.

Authors: Thang Tat Vu, Mai Chi Luong, Satoshi Nakamura

Affiliations: NICT (National Institute of Information and Communications Technology), Japan; IOIT (Institute of Information Technology), Vietnam; NICT (National Institute of Information and Communications Technology), Japan

Conference type: International conference

Conference name: 2009 Oriental COCOSDA International Conference on Speech Database and Assessments (2009 国际语音交互标准数据评估技术大会)

Conference location: Beijing

Conference language: English

Pages: 116-121

Online publication date: 2009-08-10 (date first posted on the Wanfang platform; not the paper's publication date)
