Spotting Chief Speaker from Press Conference Recordings Based on Silence Detection

Abstract:

This paper presents a method for spotting the chief speaker in press conference recordings, where the durations of the silence segments within an utterance differ markedly between the chief speaker and the other speakers. In the proposed method, speech endpoint detection is first performed on the conference audio recordings to obtain the durations of the silence segments (i.e., the Si sequence). The Si sequence is then converted into a 1-0 sequence, in which outliers are revised. Finally, the speech segments bounded by runs of consecutive 1s are extracted as the chief speaker's voice. Experiments are conducted on two data sets, with different silence-segment durations in the chief speakers' utterances, to compare the proposed method with the conventional approach (speaker segmentation using BIC followed by spectral clustering). The experimental results show that, for spotting the chief speaker in press conference recordings, the proposed method achieves a higher F-measure (the harmonic mean of precision and recall) and runs faster than the conventional approach.
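The three steps named in the abstract (silence durations, 1-0 conversion with outlier revision, extraction of runs of 1s) can be illustrated with a minimal Python sketch. The abstract does not give the threshold, the outlier rule, or which speaker has the longer silences, so everything specific here is an assumption made for illustration: the function names, the fixed `threshold`, the `chief_is_short` flag, the "flip an isolated bit when both neighbours disagree with it" rule, and the `min_len` run length are all hypothetical and are not taken from the paper. The input is assumed to be a list of silence durations in seconds already produced by some endpoint detector.

```python
from typing import List, Tuple


def binarize(silences: List[float], threshold: float,
             chief_is_short: bool = True) -> List[int]:
    """Map each silence duration to 1 (chief-speaker-like) or 0.

    Assumption: a single fixed threshold separates the two groups, and
    chief_is_short says which side of it the chief speaker falls on.
    """
    return [int((s < threshold) == chief_is_short) for s in silences]


def revise_outliers(bits: List[int]) -> List[int]:
    """Flip every isolated bit whose two neighbours agree with each other
    but not with it (a 3-point majority vote; an assumed outlier rule)."""
    out = bits[:]
    for i in range(1, len(bits) - 1):
        if bits[i - 1] == bits[i + 1] != bits[i]:
            out[i] = bits[i - 1]
    return out


def runs_of_ones(bits: List[int], min_len: int = 2) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) index pairs of runs of 1s that are
    at least min_len long."""
    runs = []
    start = None
    for i, b in enumerate(bits + [0]):  # trailing 0 closes a final run
        if b == 1 and start is None:
            start = i
        elif b != 1 and start is not None:
            if i - start >= min_len:
                runs.append((start, i - 1))
            start = None
    return runs


def spot_chief(silences: List[float], threshold: float,
               chief_is_short: bool = True,
               min_len: int = 2) -> List[Tuple[int, int]]:
    """Chain the three steps and return index ranges into the silence list."""
    bits = revise_outliers(binarize(silences, threshold, chief_is_short))
    return runs_of_ones(bits, min_len)


if __name__ == "__main__":
    # Made-up durations in seconds, for illustration only.
    durations = [0.2, 0.3, 1.5, 0.25, 0.2, 1.8, 2.0, 0.3, 1.9, 2.1]
    print(spot_chief(durations, threshold=1.0))  # [(0, 4)]
```

The returned pairs index into the silence list; mapping them back to the speech segments they bound depends on the endpoint detector's output format, which the abstract does not describe.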

Keywords: press conference recordings; chief speaker; speech endpoint detection; speaker segmentation; speaker clustering

Authors: Wu Wei, Li Yanxiong, Wang Zili, Chen Zhuyun

Affiliation: School of Electronic and Information Engineering, South China University of Technology, Guangzhou, China

Conference type: International conference

Conference name: 2013 IEEE 11th International Conference on Electronic Measurement & Instruments

Conference location: Harbin

Conference language: English

Pages: 147-150

Online publication date: 2013-08-16 (the date the record first went online on the Wanfang platform; not the paper's publication date)
