Speech timing and cross-linguistic studies towards computational human modeling
In this paper, we introduce Japanese segmental duration characteristics and their computational modeling, which we have studied for around three decades in speech synthesis. We also present a series of experimental results on the loudness dependence of duration perception. The computational duration modeling and the perceptual studies on how sensitivity to duration errors depends on loudness give some insights into computational human modeling of spoken-language capability. As a first attempt to determine how these findings could be employed efficiently in other fields such as language learning, we introduce our current efforts on the objective evaluation of second-language speaking skill and the AESOP (Asian English Speech cOrpus Project) research consortium, in which researchers in Asian countries have started to work together.
Yoshinori Sagisaka, Hiroaki Kato, Minoru Tsuzaki, Shizuka Nakamura, Chatchawarn Hansakunbuntheung
GITI / Language and Speech Science Research Laboratories, Waseda University; NICT / ATR Spoken Langua; NICT / ATR Media Information Science Laboratories; Kyoto City University of Arts; GITI / Language and Speech Science Research Laboratories, Waseda University
International conference
Beijing
English
1-8
2009-08-10 (date first posted online on the Wanfang platform; not the paper's publication date)