Experimental Study of Structure to Speech Conversion -- An implementation of Infant-like Vocal Imitation on a Machine

Abstract:

Most speech synthesizers have been developed as text (phoneme sequence) to speech converters, and in this framework text input is a precondition for speech production. However, no child acquires spoken language by reading a given text aloud. Children are said to acquire spoken language by imitating the utterances of their parents, yet they never imitate their parents' voices. Developmental psychology claims that they extract a holistic and speaker-invariant sound pattern embedded in a given utterance, called the word Gestalt, and realize that pattern acoustically using their short vocal tubes. In our previous studies, we mathematically defined this holistic and speaker-invariant pattern and used it for ASR [1,2,3,4]. Here, we experimentally implement its inverse process, i.e., Gestalt-to-utterance conversion, on a computer.

Authors: Nobuaki Minematsu, Daisuke Saito, Keikichi Hirose

Affiliation: The University of Tokyo

Conference type: International conference

Conference name: 9th International Conference on Signal Processing (ICSP08)

Conference location: Beijing

Conference language: English

Online publication date: 2008-10-26 (the date the record first appeared on the Wanfang platform, not the paper's publication date)
