Emotion Recognition in Videos via Fusing Multimodal Features
Emotion recognition is a challenging task with a wide range of applications. In this paper, we present our system in the CCPR 2016 multimodal emotion recognition challenge. Multimodal features from acoustic signals, facial expressions, and speech contents are extracted to recognize the emotion of the character in the video. Among them, the facial CNN feature is the most discriminative feature for emotion recognition. We train SVM and random forest classifiers based on each type of feature and utilize early and late fusion to combine the different modality features. To deal with the class imbalance issue, we propose to adapt the probability thresholds for each emotion class. The macro precision of our best multimodal fusion system achieves 50.34% on the testing set, which significantly outperforms the baseline of 30.63%.
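The abstract mentions adapting per-class probability thresholds to cope with class imbalance. A minimal sketch of one common way such adaptation can work, assuming a scaling rule in which each class probability is divided by its threshold before taking the argmax (the rule and the `predict_with_thresholds` helper are illustrative assumptions, not the paper's exact method):

```python
# Hedged sketch: per-class probability thresholds for imbalanced classes.
# Dividing each class probability by its threshold and taking the argmax
# is one common implementation of threshold adaptation; the paper does
# not spell out its exact rule, so treat this as an assumption.

def predict_with_thresholds(probs, thresholds):
    """Pick the class whose probability most exceeds its threshold.

    probs: per-class probabilities, e.g. from an SVM or random forest
           with probability outputs.
    thresholds: per-class thresholds; lowering the threshold of a rare
                class makes it easier for that class to be predicted.
    """
    scores = [p / t for p, t in zip(probs, thresholds)]
    return max(range(len(scores)), key=scores.__getitem__)

# Toy example with three emotion classes.
probs = [0.40, 0.25, 0.35]
uniform = [1 / 3, 1 / 3, 1 / 3]   # equal thresholds reduce to plain argmax
adapted = [0.45, 0.35, 0.20]      # lower threshold favours the rare class 2

print(predict_with_thresholds(probs, uniform))   # prints 0 (plain argmax)
print(predict_with_thresholds(probs, adapted))   # prints 2 (rare class wins)
```

With equal thresholds the rule reduces to the usual argmax; lowering a rare class's threshold biases predictions toward it, which is one way to raise macro precision on imbalanced data.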
Emotion recognition · Multimodal feature fusion · CNN features
Shizhe Chen, Yujie Dian, Xiaozhu Lin, Qin Jin, Haibo Liu, Li Lu
Multimedia Computing Laboratory, School of Information, Renmin University of China, Beijing, People's Republic of China; Tencent Inc., Beijing, People's Republic of China
International conference
The 7th Chinese Conference on Pattern Recognition (CCPR 2016)
Chengdu
English
632-644
2016-11-03 (date first posted on the Wanfang platform; not necessarily the paper's publication date)