A Visual Silence Detector Constraining Speech Source Separation

Abstract:

We propose an audiovisual source separation algorithm for speech signals. The algorithm first extracts, from synchronous video recordings, the time segments in which the mouth region shows low activity. An automatically selected optimal classifier then detects silent intervals within these segments of low visual mouth activity. Finally, the source separation problem is formulated and solved for the entire signal duration. The approach was tested on two challenging speech corpora, each with two speakers and two microphones: in the first corpus, separate source signals were mixed in a simulated room; the second corpus contains recorded conversations. The results are promising on both corpora: with the visual silence detector, the performance of the source separation algorithm, measured by the signal to noise interference ratio, increases.
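The first step described in the abstract, extracting time segments with low mouth activity, can be illustrated with a minimal sketch. It is not the authors' method: it assumes a per-frame scalar mouth-activity score is already available, and the function name `low_activity_segments` and the parameters `threshold` and `min_len` are hypothetical, chosen only for illustration.

```python
import numpy as np

def low_activity_segments(activity, threshold, min_len):
    """Return (start, end) frame-index pairs, end exclusive, for runs where
    the activity score stays below `threshold` for at least `min_len` frames.

    Assumption: `activity` is a 1-D sequence with one mouth-activity score
    per video frame. How that score is computed is outside this sketch.
    """
    low = np.asarray(activity) < threshold
    # Pad with False on both sides so every run has a start and an end edge.
    padded = np.concatenate(([False], low, [False]))
    # Indices where the low/not-low state changes; they alternate start, end.
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    starts, ends = edges[0::2], edges[1::2]
    return [(int(s), int(e)) for s, e in zip(starts, ends) if e - s >= min_len]
```

In a pipeline like the one the abstract outlines, the returned segments would be the candidates passed on to the silence classifier; a simple threshold is used here only as a stand-in for whatever visual criterion the paper applies.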

Authors: Isabel Gonzalez, Ilse Ravyse, Henk Brouckxon, Werner Verhelst, Dongmei Jiang, Hichem Sahli

Affiliations: VUB-NPU Joint Research Group on Audio Visual Signal Processing (AVSP), Vrije Universiteit Brussel, De; VUB-NPU Joint Research Group on Audio Visual Signal Processing (AVSP), Northwestern Polytechnic Unive

Conference type: International conference

Conference name: The Fifth International Conference on Image and Graphics (ICIG 2009)

Conference location: Xi'an

Conference language: English

Pages: 463-470

Online publication date: 2009-09-20 (date first posted on the Wanfang platform; not the paper's publication date)
