An Investigation on the Mandarin Prosody of a Parallel Multi-Speaking Rate Speech Corpus
Abstract: In this paper, the prosody of a parallel multi-speaking rate Mandarin read speech corpus is investigated. The corpus contains four parallel speech datasets uttered by a female professional announcer at four speech rates (SRs): 4.40 (fast), 3.82 (normal), 2.97 (median) and 2.45 (slow) syllables/second. Using the previously proposed unsupervised joint prosody labeling and modeling (PLM) method, the relationship between SR and various prosodic features, including pause duration, patterns of three high-level prosodic constituents, and the break labels, is investigated. The analyses reported in this study could inform the development of prosody generation mechanisms for text-to-speech and of prosody modeling for automatic speech recognition at various SRs.
Chen-Yu Chiang, Cheng-Chang Tang, Hsiu-Min Yu, Yih-Ru Wang, Sin-Horng Chen
Department of Communication Engineering, National Chiao Tung University, Taiwan; Language Center, Chung Hua University, Taiwan
International conference
Beijing
English
148-153
2009-08-10 (date first made available online on the Wanfang platform; not the paper's publication date)