Conference Proceedings

Real-Time Speech-Driven Lip Synchronization

Speech-driven lip synchronization, an important component of facial animation, animates a face model so that its lip movements are synchronized with the acoustic speech signal. It has many applications in human-computer interaction. In this paper, we present a framework that systematically addresses multimodal database collection and processing, as well as real-time speech-driven lip synchronization using collaborative filtering, a data-driven approach used by many online retailers to recommend products. Mel-frequency cepstral coefficients (MFCCs) with their delta and acceleration coefficients are used as acoustic features, and the Facial Animation Parameters (FAPs) defined by MPEG-4 serve as the visual representation of speech. The proposed system is speaker independent and real-time capable. Subjective experiments show that the proposed approach generates natural facial animation.
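To illustrate the general idea of collaborative filtering applied to lip synchronization, the following Python sketch is not the authors' implementation but a minimal, assumed realization: given a precollected database of aligned (MFCC-feature, FAP) pairs, the FAPs for a new audio frame are synthesized as a similarity-weighted average of the FAPs of its most similar database frames. All names here (predict_faps, feature_db, fap_db, k) are hypothetical.

    import numpy as np

    def predict_faps(query, feature_db, fap_db, k=5):
        """Estimate FAPs for one audio frame by collaborative filtering.

        query      : (d,)   MFCC + delta + acceleration vector for the frame
        feature_db : (n, d) acoustic features of the multimodal database
        fap_db     : (n, m) MPEG-4 FAP vectors aligned with feature_db
        """
        # Cosine similarity between the query frame and every database frame.
        sims = feature_db @ query / (
            np.linalg.norm(feature_db, axis=1) * np.linalg.norm(query) + 1e-8)
        # Keep the k most similar frames (the query's "neighbors").
        idx = np.argsort(sims)[-k:]
        # Non-negative, normalized similarity weights.
        w = np.clip(sims[idx], 0.0, None)
        w /= w.sum() + 1e-8
        # Similarity-weighted average of the neighbors' FAP vectors.
        return w @ fap_db[idx]

In a real-time setting, such a lookup would be run per audio frame (with the database indexed for fast nearest-neighbor search) and the resulting FAP sequence smoothed before driving the face model.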

real-time speech-driven lip synchronization; FAP; MFCC; collaborative filtering

Kaihui Mu, Jianhua Tao, Jianfeng Che, Minghao Yang

National Laboratory of Pattern Recognition (NLPR), Hi-tech Innovation Center, Institute of Automation, Chinese Academy of Sciences, Beijing, China

International Conference

2010 4th International Universal Communication Symposium (IUCS 2010)

Beijing

English

377-381

2010-10-18 (date first posted on the Wanfang platform; does not necessarily reflect the paper's publication date)