Conference Proceedings

Real-Time Speech-Driven Lip Synchronization

Speech-driven lip synchronization, an important component of facial animation, animates a face model so that its lip movements are synchronized with the acoustic speech signal. It has many applications in human-computer interaction. In this paper, we present a framework that systematically addresses multimodal database collection and processing, as well as real-time speech-driven lip synchronization using collaborative filtering, a data-driven approach used by many online retailers to recommend products. Mel-frequency cepstral coefficients (MFCCs) with their delta and acceleration coefficients are used as acoustic features, and the Facial Animation Parameters (FAPs) defined by MPEG-4 serve as the visual representation of speech. The proposed system is speaker independent and real-time capable. Subjective experiments show that the proposed approach generates natural facial animation.
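To illustrate the general idea of collaborative filtering applied to lip synchronization, the following Python sketch is not the authors' implementation but a minimal, assumed realization: given a precollected database of aligned (MFCC-feature, FAP) pairs, the FAPs for a new audio frame are synthesized as a similarity-weighted average of the FAPs of its most similar database frames. All names here (predict_faps, feature_db, fap_db, k) are hypothetical.

    import numpy as np

    def predict_faps(query, feature_db, fap_db, k=5):
        """Estimate FAPs for one audio frame by collaborative filtering.

        query      : (d,)   MFCC + delta + acceleration vector for the frame
        feature_db : (n, d) acoustic features of the multimodal database
        fap_db     : (n, m) MPEG-4 FAP vectors aligned with feature_db
        """
        # Cosine similarity between the query frame and every database frame.
        sims = feature_db @ query / (
            np.linalg.norm(feature_db, axis=1) * np.linalg.norm(query) + 1e-8)
        # Keep the k most similar frames (the query's "neighbors").
        idx = np.argsort(sims)[-k:]
        # Non-negative, normalized similarity weights.
        w = np.clip(sims[idx], 0.0, None)
        w /= w.sum() + 1e-8
        # Similarity-weighted average of the neighbors' FAP vectors.
        return w @ fap_db[idx]

In a real-time setting, such a lookup would be run per audio frame (with the database indexed for fast nearest-neighbor search) and the resulting FAP sequence smoothed before driving the face model.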

real-time speech-driven lip synchronization; FAP; MFCC; collaborative filtering

Kaihui Mu, Jianhua Tao, Jianfeng Che, Minghao Yang

National Laboratory of Pattern Recognition (NLPR), Hi-tech Innovation Center, Institute of Automation, Chinese Academy of Sciences, Beijing, China

International Conference

2010 4th International Universal Communication Symposium (IUCS 2010)

Beijing

English

377-381

2010-10-18 (date first posted on the Wanfang platform; does not necessarily reflect the paper's publication date)