Two-level Approach for Detecting Non-lexical Audio Events in Spontaneous Speech
Based on analyses of characteristic differences between various audio events, a two-level approach is proposed for detecting three non-lexical audio events (filled pause, laugh, and applause) in spontaneous speech [...] model-based decision. The experiments give an average precision of 87.3%, a recall of 93.77%, and an F-measure of 90.42%. Compared with the sliding-window-based approach, the average F-measure is improved by 7.52%. Moreover, the approach determines the boundaries of non-lexical audio events in spontaneous speech more accurately.
Yan-Xiong Li, Qian-Hua He, Wei Li, Zhi-Feng Wang
School of Electronic and Information Engineering, South China University of Technology, 381 Wushan Road, Tianhe District, Guangzhou City, Guangdong Province, China
International conference
Shanghai
English
771-777
2010-10-20 (date first made available online on the Wanfang platform; not the publication date of the paper)