Control of Prosodic Focus in Corpus-based Generation of Fundamental Frequency Contours Based on the Generation Process Model
HMM-based speech synthesis is known to be a possible solution for realizing “flexibility” in speech synthesis. However, its frame-by-frame processing of acoustic features is not appropriate for prosodic features. Prosodic features cover a wider time span than segmental features and should be handled differently. From this point of view, a method has been developed for generating sentence F0 contours based on the generation process model, which represents sentence F0 contours on a logarithmic scale as superpositions of phrase and accent components. These components are in turn represented as responses to discrete commands, which are closely related to the linguistic and para-/non-linguistic information of sentences. By predicting the model commands instead of frame-by-frame F0 values, flexible and robust F0 control can be realized. As an example of flexible control, a method is developed for generating sentence F0 contours of Japanese when a focus is placed on one of the “bunsetsus” of an utterance. The method first predicts the differences in the F0 model commands between utterances with and without focus, and then applies them to the F0 model commands predicted beforehand by the baseline method without focus assignment. The baseline method is trained using a large corpus, while the corpus for training the command differences can be small and need not be uttered by the same speaker as the large corpus. The validity of the method was confirmed by experiments on F0 contour generation and speech synthesis, including interpolation/extrapolation of the F0 model commands for focus level control.
Generation process model; F0 contour; Corpus-based method; Speech synthesis; Prosodic focus
Keikichi Hirose, Keiko Ochi, Nobuaki Minematsu
Department of Information and Communication Engineering, the University of Tokyo, Tokyo
International conference
2010 IEEE 10th International Conference on Signal Processing (ICSP 2010)
Beijing
English
629-632
2010-08-24 (date first posted online on the Wanfang platform; not necessarily the publication date of the paper)
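
The superposition model and the focus-control step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The response functions follow the commonly published form of the generation process (Fujisaki) model, and the constants `ALPHA`, `BETA`, and `GAMMA` are typical textbook values assumed here. The function `apply_focus`, the linear scaling by `level`, the clamping of amplitudes at zero, and the restriction to accent command amplitudes are all assumptions made for illustration; the abstract does not specify these details.

```python
import math

# Typical constants for the generation process model (assumed, not from the abstract).
ALPHA, BETA, GAMMA = 3.0, 20.0, 0.9


def phrase_response(t, alpha=ALPHA):
    """Impulse response of the phrase control mechanism (zero before onset)."""
    return alpha * alpha * t * math.exp(-alpha * t) if t >= 0 else 0.0


def accent_response(t, beta=BETA, gamma=GAMMA):
    """Step response of the accent control mechanism, capped at gamma."""
    if t < 0:
        return 0.0
    return min(1.0 - (1.0 + beta * t) * math.exp(-beta * t), gamma)


def log_f0(t, fb, phrases, accents):
    """ln F0(t) as baseline plus phrase and accent components.

    phrases: list of (onset_time, magnitude)
    accents: list of (onset_time, offset_time, amplitude)
    """
    value = math.log(fb)
    for t0, ap in phrases:
        value += ap * phrase_response(t - t0)
    for t1, t2, aa in accents:
        value += aa * (accent_response(t - t1) - accent_response(t - t2))
    return value


def apply_focus(accents, deltas, level=1.0):
    """Add scaled amplitude differences to baseline accent commands.

    level = 0 keeps the baseline, level = 1 applies the predicted difference,
    values in between interpolate and values above 1 extrapolate.
    Linear scaling and clamping at zero are assumptions of this sketch.
    """
    return [
        (t1, t2, max(aa + level * d, 0.0))
        for (t1, t2, aa), d in zip(accents, deltas)
    ]


# Hypothetical command values, chosen only to exercise the functions.
baseline_accents = [(0.2, 0.5, 0.4), (0.9, 1.3, 0.3)]
deltas = [0.2, -0.1]
phrases = [(0.0, 0.5)]
focused_accents = apply_focus(baseline_accents, deltas, level=1.0)
```

Evaluating `log_f0` on a grid of time points with `baseline_accents` and with `focused_accents` gives two contours that differ only where the accent amplitudes were changed, which is the sense in which command-level control avoids frame-by-frame F0 prediction.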