HMM-based Phonemic Distance in Different Speaking Styles and Its Influence on Substitutions in Mandarin Speech Recognition

Abstract:

Statistical confusability between acoustic models is an important factor in the character substitution error rate of large-vocabulary continuous speech recognition. In this paper, we take gender and speaking style into account in Mandarin speech recognition. We model phonemes in different speaking styles: read speech by female speakers, read speech by male speakers, and spontaneous dialogue. Minimum Gaussian Distances between pairs of Chinese Initial/Final models are then given, and average phoneme distances, which denote pronunciation variety, are calculated. The effect of speaking style on average phonemic distance is studied, and relative articulation is given for the three databases. The qualitative relationship between phone size and recognition error rate is analyzed, showing that, for a particular phoneme, pronunciation variety is one of the reasons for misidentification during recognition, which suggests a new way to reduce substitution errors.
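
The distance computation mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not define the Minimum Gaussian Distance, so everything below is an assumption: each model is taken to be a list of diagonal-covariance Gaussian components, the component-level distance is the Bhattacharyya distance, the model-pair distance is the minimum over all component pairs, and a phoneme's average distance is the mean of its distances to every other model. The function names and the toy one-dimensional models are hypothetical and are not taken from the paper.

```python
import math


def bhattacharyya(mean_a, var_a, mean_b, var_b):
    """Bhattacharyya distance between two diagonal-covariance Gaussians.

    Assumption: the paper's component-level distance is not given in the
    abstract; this is a stand-in chosen for illustration.
    """
    dist = 0.0
    for ma, va, mb, vb in zip(mean_a, var_a, mean_b, var_b):
        v = 0.5 * (va + vb)  # averaged variance for this dimension
        dist += 0.125 * (ma - mb) ** 2 / v
        dist += 0.5 * math.log(v / math.sqrt(va * vb))
    return dist


def min_gaussian_distance(model_a, model_b):
    """Smallest component-to-component distance between two models.

    Each model is a list of (mean_vector, variance_vector) tuples.
    """
    return min(
        bhattacharyya(ma, va, mb, vb)
        for ma, va in model_a
        for mb, vb in model_b
    )


def average_distances(models):
    """Mean distance from each model to all other models."""
    result = {}
    for name, model in models.items():
        others = [
            min_gaussian_distance(model, other)
            for other_name, other in models.items()
            if other_name != name
        ]
        result[name] = sum(others) / len(others)
    return result


# Hypothetical 1-D toy models (made-up values, not real acoustic data).
models = {
    "a": [([0.0], [1.0])],
    "b": [([2.0], [1.0])],
    "c": [([0.0], [1.0]), ([4.0], [1.0])],
}

avg = average_distances(models)
for name in sorted(avg):
    print(name, round(avg[name], 3))
```

Under these assumptions, a model with a small average distance lies close to its neighbours in acoustic space and would be expected to be more confusable, which is the kind of relationship to substitution errors that the abstract describes.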

Keywords: phonemic distance; articulation; pronunciation variety; error rate

Authors: Zhanlei YANG, Wenju LIU, Zhenyu LV

Affiliation: Institute of Automation, Chinese Academy of Sciences, Beijing, China

Conference type: International conference

Conference name: International Conference on Natural Language Processing and Knowledge Engineering (IEEE NLP-KE 2009)

Conference location: Dalian

Conference language: English

Pages: 1-5

Online publication date: 2009-09-24 (date first posted on the Wanfang platform; not the paper's publication date)
