A Study on Tibetan Prosodic Model of Speech and Respiratory Signals
The prosodic model is an important component of a TTS system, and respiratory rhythm is an important factor affecting prosodic features. Based on the speech characteristics of Tibetan, the paper studies the correspondence between respiratory signals and Tibetan prosodic features, and determines which parameters of the speech and respiratory signals affect the prosodic feature parameters. Drawing on research experience with Chinese prosodic models, the paper establishes two Tibetan prosodic models with an RBF neural network: a speech prosodic model and a prosodic model of speech and respiratory signals, thereby introducing physiological signals into the construction of the prosodic model. A news corpus is used to train both models, and their outputs are compared in a test; the results show that the prosodic model of speech and respiratory signals generates fundamental frequency and duration parameters that are closer to natural speech. Listening and phonetic identification tests show that the MOS score of its synthesized speech is 3.37, with high naturalness.
Tibetan; speech signal; respiratory signal; neural network; prosodic model
Chen Qi, Yu Hongzhi, Chen Chen, Shi Jing
Key Lab of National Linguistic Information Technology, Northwest University for Nationalities, Lanzhou, Gansu 730030, China
International conference
2010 IEEE International Conference on Information and Automation (ICIA 2010)
Harbin
English
1-6
2010-06-20 (date first posted online on the Wanfang platform; does not represent the paper's publication date)