Key Technologies of Pre-processing and Post-processing methods for Embedded Automatic Speech Recognition Systems

Abstract:

Signal pre-processing and post-processing are becoming two key factors in moving embedded speech recognition systems from the laboratory to practical application. Speech endpoint detection and out-of-vocabulary rejection are the most important parts of speech pre-processing and post-processing, respectively. The performance of traditional speech endpoint detection based on short-term energy and zero-crossing rate degrades dramatically in noisy environments. Frequency-domain methods require complex computation and are poorly suited to embedded systems. In this paper, we present a new endpoint detection algorithm for isolated words that is based on statistical theory. The correct endpoint detection rate reaches 97.40% using this method. One-class support vector machine theory is also introduced to solve out-of-vocabulary rejection. Using this algorithm, the true recognition fraction (TRF) is up to 96%, and the false recognition fraction (FRF) is about 95%.
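For readers unfamiliar with the traditional baseline the abstract mentions, the following is a minimal sketch of endpoint detection from short-term energy and zero-crossing rate. It is not the statistical algorithm proposed in the paper, which the abstract does not describe. The function names, frame sizes, and thresholds (`energy_thresh`, `low_energy_thresh`, `zcr_thresh`) are illustrative assumptions, and the test signal is a synthetic tone rather than real speech.

```python
import math

def frame_signal(samples, frame_len, hop):
    """Split samples into overlapping frames; a trailing partial frame is dropped."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]

def short_term_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def detect_endpoints(samples, frame_len=80, hop=40,
                     energy_thresh=0.01, low_energy_thresh=0.001,
                     zcr_thresh=0.3):
    """Return (start_sample, end_sample) of the detected segment, or None.

    A frame counts as active if its energy is high, or if its energy is
    moderate and its zero-crossing rate is high (a common heuristic for
    unvoiced sounds). All thresholds are assumed values, not tuned ones.
    """
    active = []
    for idx, frame in enumerate(frame_signal(samples, frame_len, hop)):
        energy = short_term_energy(frame)
        zcr = zero_crossing_rate(frame)
        if energy > energy_thresh or (energy > low_energy_thresh and zcr > zcr_thresh):
            active.append(idx)
    if not active:
        return None
    return active[0] * hop, active[-1] * hop + frame_len

# Synthetic example: silence, a 200 Hz tone, then silence, sampled at 8 kHz.
sample_rate = 8000
tone = [0.5 * math.sin(2 * math.pi * 200 * k / sample_rate) for k in range(1600)]
signal = [0.0] * 800 + tone + [0.0] * 800
endpoints = detect_endpoints(signal)
print(endpoints)
```

With clean input, fixed thresholds like these locate the tone to within about one frame. Additive noise raises both the energy and the zero-crossing rate of non-speech frames, which is the weakness in noisy environments that the abstract points to.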

Keywords: speech recognition; endpoint detection; out-of-vocabulary rejection; support vector machine

Authors: Dongzhi He, Yibin Hou, Yuanyuan Li, Zhi-Hao Ding

Affiliation: Institute of Embedded Software and System, Beijing University of Technology, Beijing, CO 100124 China

Conference type: International conference

Conference name: 2010 IEEE/ASME International Conference on Mechatronic and Embedded System and Applications

Conference location: Qingdao

Conference language: English

Pages: 76-80

Online publication date: 2010-07-15 (date first made available on the Wanfang platform; not the paper's publication date)
