Speech Attribute Recognition using Context-Dependent Modeling
Speech attributes, such as place and manner of articulation, are robust to cross-speaker variation and environmental distortion. They have been used in speech processing applications such as spoken language identification, speaker recognition, and speech recognition. In this paper, we propose recognizing speech attributes with context-dependent models of the attributes, called bi-attributes. Experimental results on the TIMIT database show that context-dependent modeling reduces frame classification error by 13.2% and 16.1% relative to context-independent modeling for manner and place classification, respectively. In addition, when fused with phone posteriors to improve phone recognition accuracy, context-dependent attribute modeling gives a 9.9% relative reduction in phone error rate over context-independent attribute modeling.
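As a rough illustration of the two ideas in the abstract, the sketch below builds context-dependent labels from a sequence of attribute segments and computes a relative error reduction. The abstract does not define the bi-attribute label format, so the left-context convention, the `sil` boundary symbol, and the function names `to_bi_attributes` and `relative_reduction` are assumptions made for this example, as are the error rates in the usage lines. It should not be read as the authors' implementation.

```python
from typing import List


def to_bi_attributes(segments: List[str], boundary: str = "sil") -> List[str]:
    """Pair each attribute segment with its left neighbour.

    Assumption: a bi-attribute is written "left-current", by analogy with
    left-context biphones. The paper may use a different convention.
    """
    labels = []
    prev = boundary  # hypothetical symbol for the utterance start
    for cur in segments:
        labels.append(f"{prev}-{cur}")
        prev = cur
    return labels


def relative_reduction(baseline_error: float, new_error: float) -> float:
    """Relative error reduction in percent: 100 * (baseline - new) / baseline."""
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return 100.0 * (baseline_error - new_error) / baseline_error


# Manner-of-articulation segments for a made-up utterance.
manner_segments = ["stop", "vowel", "nasal", "fricative"]
bi_labels = to_bi_attributes(manner_segments)

# Hypothetical error rates, chosen only to show the arithmetic.
example_reduction = relative_reduction(baseline_error=50.0, new_error=45.0)
```

A relative reduction is measured against the baseline error, so the 13.2% figure quoted above means the context-dependent error is 13.2% lower than the context-independent error, not 13.2 percentage points lower.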
Van Hai Do, Xiong Xiao, Ville Hautamäki, Eng Siong Chng
School of Computer Engineering, Nanyang Technological University, Singapore; Temasek Laboratories@NTU, Nanyang Technological University, Singapore; Institute for Infocomm Research, Singapore; School of Computing, University of Eastern Finland, Finla
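International conference
2011 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC 2011)
Xi'an
English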
1-5
2011-10-18 (date first made available online on the Wanfang platform; not the paper's publication date)