Key Technologies of Pre-processing and Post-processing Methods for Embedded Automatic Speech Recognition Systems
Signal pre-processing and post-processing are becoming two key factors in moving embedded speech recognition systems from the laboratory to practical application. Speech endpoint detection and out-of-vocabulary rejection are the most important parts of speech pre-processing and post-processing, respectively. The performance of traditional speech endpoint detection based on short-term energy and zero-crossing rate degrades dramatically in noisy environments. Frequency-domain methods require complex computation and are poorly suited to embedded systems. In this paper, we present a new endpoint detection algorithm for isolated words that is based on statistical theory. With this method, the correct endpoint detection rate reaches 97.40%. We also introduce one-class support vector machine theory to solve out-of-vocabulary rejection. With this algorithm, the true recognition fraction (TRF) is up to 96%, and the false recognition fraction (FRF) is about 95%.
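As an illustration of the traditional baseline named in the abstract (not the paper's statistical algorithm, which the abstract does not describe), the following is a minimal Python sketch that computes short-term energy and zero-crossing rate per frame and marks endpoints with a single energy threshold. The function names, the frame length, the hop size, and the threshold ratio are all assumptions chosen for illustration.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames (rows). Hypothetical helper."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def short_term_energy(frames):
    """Sum of squared samples in each frame."""
    return np.sum(frames.astype(np.float64) ** 2, axis=1)

def zero_crossing_rate(frames):
    """Fraction of adjacent sample pairs in each frame whose signs differ."""
    signs = np.sign(frames)
    signs[signs == 0] = 1  # treat exact zeros as positive
    return np.mean(signs[:, 1:] != signs[:, :-1], axis=1)

def detect_endpoints(x, frame_len=200, hop=100, energy_ratio=0.1):
    """Return (start_sample, end_sample) of the active region, or None.

    Assumed decision rule: a frame is active when its energy exceeds
    energy_ratio times the maximum frame energy. Zero-crossing rate is
    computed separately above and is not used in this simple rule.
    """
    frames = frame_signal(np.asarray(x, dtype=np.float64), frame_len, hop)
    energy = short_term_energy(frames)
    active = np.flatnonzero(energy > energy_ratio * energy.max())
    if active.size == 0:
        return None
    return int(active[0] * hop), int(active[-1] * hop + frame_len)
```

A fixed energy threshold of this kind is cheap enough for embedded targets, but additive noise raises the energy of silent frames toward the threshold, which is consistent with the degradation in noisy environments that the abstract reports for this class of method.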
speech recognition; endpoint detection; out-of-vocabulary rejection; support vector machine
Dongzhi He, Yibin Hou, Yuanyuan Li, Zhi-Hao Ding
Institute of Embedded Software and System, Beijing University of Technology, Beijing 100124, China
International conference
Qingdao
English
76-80
2010-07-15 (date first posted online on the Wanfang platform; does not represent the paper's publication date)