An Emotional Text-Driven 3D Visual Pronunciation System for Mandarin Chinese
This paper proposes an emotional text-driven 3D visual pronunciation system for Mandarin Chinese. First, based on an articulatory speech corpus collected by electromagnetic articulography (EMA), the articulatory features are trained with hidden Markov models (HMMs), and fully context-dependent modeling is adopted by making full use of the rich linguistic features. Second, considering that emotion is expressed more prominently in the articulatory domain owing to the independent manipulation of the articulators, the differences between articulatory movements under different emotions are investigated. Third, emotional speech is generated by adjusting speech parameters such as fundamental frequency (F0), duration, and intensity with Praat. Finally, while the generated emotional speech is played, the corresponding articulatory movements are synthesized simultaneously by the HMM prediction rules and used to drive the head mesh model in synchrony with the emotional speech. Experiments demonstrate that the system can synthesize emotional speech with accurately synchronized articulator animation at the phoneme level.
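The abstract states that emotional speech is produced by adjusting F0, duration, and intensity in Praat. The following is a minimal sketch of that style of prosody modification, using the parselmouth Python interface to Praat; the input file name, scaling factors, and target intensity are illustrative assumptions, not values reported in the paper.

# Hypothetical sketch: prosody modification (F0, duration, intensity) via Praat,
# accessed through the parselmouth Python bindings. File names and scaling
# factors are illustrative assumptions, not values from the paper.
import parselmouth
from parselmouth.praat import call

snd = parselmouth.Sound("neutral_utterance.wav")  # assumed neutral synthesis output

# Build a Manipulation object (time step 0.01 s, pitch range 75-600 Hz).
manipulation = call(snd, "To Manipulation", 0.01, 75, 600)

# Raise F0 by 20% over the whole utterance (e.g. toward a "happy" rendering).
pitch_tier = call(manipulation, "Extract pitch tier")
call(pitch_tier, "Multiply frequencies", snd.xmin, snd.xmax, 1.2)
call([pitch_tier, manipulation], "Replace pitch tier")

# Slow the utterance down by 10% with a constant duration factor.
duration_tier = call(manipulation, "Extract duration tier")
call(duration_tier, "Add point", (snd.xmin + snd.xmax) / 2, 1.1)
call([duration_tier, manipulation], "Replace duration tier")

# Resynthesize, then rescale the average intensity to 72 dB and save.
emotional = call(manipulation, "Get resynthesis (overlap-add)")
emotional.scale_intensity(72.0)
emotional.save("emotional_utterance.wav", "WAV")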
Articulatory movement; Hidden Markov model; Fully context-dependent modeling; Emotional speech
Lingyun Yu, Changwei Luo, Jun Yu
Department of Automation, University of Science and Technology of China, Hefei, China; State Key Laborat
International conference
The 7th Chinese Conference on Pattern Recognition (CCPR 2016)
Chengdu
English
93-104
2016-11-03 (date first posted on the Wanfang platform; does not represent the paper's publication date)