Conference topic

Intonation and Prosody Conversion for Expressive Mandarin Speech Synthesis

  Expressive speech synthesis has a wide variety of applications. In contrast to general speech synthesis for Chinese, this paper focuses on prosody and intonation. Prosody is described in terms of three aspects: accent, pause, and speaking speed. Accent is stressed by modifying the fundamental frequency and amplitude. A pause is produced by inserting frames whose parameter values are zero. Speaking speed is controlled by copying or deleting frames at specified locations. Mandarin is a tonal language, so intonation is significant in synthesis. Four intonation patterns are considered: rising, falling, flat, and sinuate. Each pattern is modeled with a polynomial fitting function, and the resulting intonation model is applied to convert one pattern into another. Experimental results show that the proposed method achieves good quality in tune conversion and greatly improves the naturalness of the speech.
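The frame-level operations and the polynomial intonation model named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the NumPy frame layout (one row per frame), the gain values, the cubic degree, and the choice to convert a contour by swapping its fitted trend while keeping the residual are all assumptions made here for illustration. The F0 contour is assumed to contain voiced frames only.

```python
import numpy as np

def stress_accent(f0, amp, start, stop, f0_gain=1.2, amp_gain=1.5):
    """Stress frames start..stop by scaling F0 and amplitude (gains are hypothetical)."""
    f0 = np.array(f0, dtype=float)    # np.array copies, so inputs stay untouched
    amp = np.array(amp, dtype=float)
    f0[start:stop] *= f0_gain
    amp[start:stop] *= amp_gain
    return f0, amp

def insert_pause(frames, index, n_pause):
    """Insert n_pause all-zero parameter frames before row `index`."""
    silence = np.zeros((n_pause, frames.shape[1]), dtype=frames.dtype)
    return np.concatenate([frames[:index], silence, frames[index:]], axis=0)

def change_speed(frames, start, stop, factor):
    """Copy (factor > 1, slower) or delete (factor < 1, faster) frames in start..stop."""
    seg = frames[start:stop]
    n_out = max(1, int(round(len(seg) * factor)))
    # Map each output frame back to a source frame index within the segment.
    idx = np.minimum((np.arange(n_out) / factor).astype(int), len(seg) - 1)
    return np.concatenate([frames[:start], seg[idx], frames[stop:]], axis=0)

def fit_intonation(f0, degree=3):
    """Least-squares polynomial fit of an F0 contour over normalized time [0, 1]."""
    t = np.linspace(0.0, 1.0, len(f0))
    return np.polyfit(t, f0, degree)

def convert_intonation(f0, target_coeffs, degree=3):
    """Swap the fitted trend of f0 for a target pattern, keeping the residual.

    Keeping the residual (local tone movement) is an assumption of this sketch.
    """
    f0 = np.asarray(f0, dtype=float)
    t = np.linspace(0.0, 1.0, len(f0))
    source_trend = np.polyval(fit_intonation(f0, degree), t)
    target_trend = np.polyval(target_coeffs, t)
    return f0 - source_trend + target_trend
```

For example, a falling contour can be pushed toward a rising pattern by passing coefficients fitted to a rising reference utterance as `target_coeffs`; because only the slow trend is replaced, the syllable-level movement left in the residual is carried over unchanged.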

speech synthesis; intonation; prosody; polynomial fitting

Jing Zhu, Yibiao Yu

School of Electronic and Information Engineering, Soochow University, Suzhou, China

International conference

2012 IEEE 11th International Conference on Signal Processing

Beijing

English

549-552

2012-10-21 (date first posted online on the Wanfang platform; not the paper's publication date)