Conference Special Topic

The University of Passau Open Emotion Recognition System for the Multimodal Emotion Challenge

This paper presents the University of Passau's approaches for the Multimodal Emotion Recognition Challenge 2016. For audio signals, we exploit Bag-of-Audio-Words techniques combining Extreme Learning Machines and Hierarchical Extreme Learning Machines. For video signals, we use not only the information from the cropped face of a video frame, but also the broader contextual information from the entire frame. This information is extracted via two Convolutional Neural Networks pre-trained for face detection and object classification. Moreover, we extract facial action units, which reflect facial muscle movements and are known to be important for emotion recognition. Long Short-Term Memory Recurrent Neural Networks are deployed to exploit temporal information in the video representation. Average late fusion of the audio and video systems is applied to make predictions for multimodal emotion recognition. Experimental results on the challenge database demonstrate the effectiveness of our proposed systems when compared to the baseline.

Multimodal emotion recognition; Bag-of-audio-words; Transfer learning; Long short-term memory; Convolutional neural networks

Jun Deng, Nicholas Cummins, Jing Han, Xinzhou Xu, Zhao Ren, Vedhas Pandit, Zixing Zhang, Björn Schuller

Complex and Intelligent Systems, University of Passau, Passau, Germany
Complex and Intelligent Systems, University of Passau, Passau, Germany; Technische Universität Münch
Northwestern Polytechnical University, Xi'an, People's Republic of China

International conference

The 7th Chinese Conference on Pattern Recognition (CCPR 2016)

Chengdu

English

652-666

2016-11-03 (date first posted online on the Wanfang platform; not the paper's publication date)