Adaptive Frequency Cepstral Coefficients for Word Mispronunciation Detection

Abstract:

Systems based on automatic speech recognition (ASR) technology can provide important functionality in computer-assisted language learning applications. This is a young but growing area of research, motivated by the large number of students studying foreign languages. Here we propose a Hidden Markov Model (HMM)-based method to detect mispronunciations. Exploiting the specific dialog scripting employed in language learning software, HMMs are trained for different pronunciations. New adaptive features are obtained by adaptively warping the frequency scale prior to computing the cepstral coefficients. The warping function is optimized to maximize the separation of two major groups of pronunciations (native and non-native) in terms of classification rate. Experimental results show that the adaptive frequency scale yields a better coefficient representation, leading to higher classification rates than conventional HMMs using Mel-frequency cepstral coefficients.
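
The abstract does not specify the warping function, so the following is only a minimal sketch of the general idea of computing cepstral coefficients on a tunable frequency scale. It assumes a hypothetical one-parameter warp, log(1 + f/alpha); the function names, the parameter `alpha`, and the filter and coefficient counts are illustrative choices, not the authors' method. With alpha = 700 Hz the filter placement matches the conventional Mel scale, so other values of alpha move the features away from standard MFCCs.

```python
import numpy as np

def warp(f_hz, alpha):
    """Hypothetical warp (an assumption, not from the paper): log(1 + f/alpha)."""
    return np.log1p(np.asarray(f_hz, dtype=float) / alpha)

def unwarp(w, alpha):
    """Inverse of warp(): map warped values back to Hz."""
    return alpha * np.expm1(w)

def triangular_filterbank(n_filters, n_fft, sample_rate, alpha):
    """Triangular filters whose edges are equally spaced on the warped axis."""
    w_max = warp(sample_rate / 2, alpha)
    edges_hz = unwarp(np.linspace(0.0, w_max, n_filters + 2), alpha)
    bin_freqs = np.arange(n_fft // 2 + 1) * sample_rate / n_fft
    fb = np.zeros((n_filters, bin_freqs.size))
    for i in range(n_filters):
        lo, mid, hi = edges_hz[i], edges_hz[i + 1], edges_hz[i + 2]
        rising = (bin_freqs - lo) / (mid - lo)
        falling = (hi - bin_freqs) / (hi - mid)
        # Triangle = pointwise minimum of the two slopes, floored at zero.
        fb[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fb

def cepstral_coefficients(frame, sample_rate, alpha, n_filters=24, n_ceps=13):
    """Cepstral coefficients of one frame on the alpha-warped frequency scale."""
    n_fft = frame.size
    power = np.abs(np.fft.rfft(frame * np.hamming(n_fft))) ** 2
    fb = triangular_filterbank(n_filters, n_fft, sample_rate, alpha)
    log_energies = np.log(fb @ power + 1e-10)  # small floor avoids log(0)
    # Unnormalized DCT-II of the log filterbank energies.
    n = np.arange(n_filters)
    k = np.arange(n_ceps)[:, None]
    dct = np.cos(np.pi * k * (2 * n + 1) / (2 * n_filters))
    return dct @ log_energies

# Example: one 25 ms frame of a 440 Hz tone sampled at 16 kHz.
frame = np.sin(2 * np.pi * 440 * np.arange(400) / 16000)
mel_like = cepstral_coefficients(frame, 16000, alpha=700.0)
warped = cepstral_coefficients(frame, 16000, alpha=2000.0)
```

Under these assumptions, one way to adapt the scale would be to evaluate a set of candidate `alpha` values, train the native and non-native HMMs on each resulting feature set, and keep the value that gives the highest classification rate on held-out data. The paper's actual optimization procedure may differ.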

Keywords: ASR, frequency scale, MFCC, AFCC, mispronunciation detection

Authors: Zhenhao Ge, Sudhendu R. Sharma, Mark J.T. Smith

Affiliation: School of Electrical and Computer Engineering, Purdue University, West Lafayette, Indiana 47907, USA

Conference type: International conference

Conference: 2011 4th International Congress on Image and Signal Processing (CISP 2011)

Conference location: Shanghai

Conference language: English

Pages: 2414-2417

Online publication date: 2011-10-15 (date first posted on the Wanfang platform; not the paper's publication date)
