Conference topic

Emphasized Speech Synthesis Based on Hidden Markov Models

This paper presents a statistical approach to synthesizing emphasized speech based on hidden Markov models (HMMs). Context-dependent HMMs are trained on emphasized speech data, recorded by having the speaker intentionally emphasize an arbitrary accentual phrase in a sentence. To model the acoustic characteristics of emphasized speech, new contextual factors describing the emphasized accentual phrase are added in model training. Moreover, to build HMMs that synthesize both normal and emphasized speech, we investigate two training methods: one trains individual models for normal and emphasized speech, using each type of speech data separately; the other trains a mixed model using both types simultaneously. The experimental results demonstrate that 1) HMM-based speech synthesis is effective for synthesizing emphasized speech and 2) compared with the individual models, the mixed model yields a more compact HMM set that generates more natural-sounding but slightly less emphasized speech.

Kumiko Morizane, Keigo Nakamura, Tomoki Toda, Hiroshi Saruwatari, Kiyohiro Shikano

Graduate School of Information Science, Nara Institute of Science and Technology

International conference

2009 Oriental COCOSDA International Conference on Speech Database and Assessments (2009 国际语音交互标准数据评估技术大会)

Beijing

English

76-81

2009-08-10 (date first posted online on the Wanfang platform; not the paper's publication date)