Conference topic

Emotion Recognition in Spontaneous Speech within Work and Family Environments

The speech signal is an important means of conveying information between humans; at the same time, it is an indicator of a speaker's emotions. This paper investigates the automatic identification of affect from speech containing spontaneously expressed (not acted) emotions recorded in different environments. Teager Energy Operator-Perceptual Wavelet Packet (TEO-PWP) features and Mel Frequency Cepstral Coefficients (MFCC) were used to model the emotions with two classifiers: the Gaussian mixture model (GMM) and the probabilistic neural network (PNN). The classification experiments were conducted on two data sets: SUSAS with three classes (high stress, moderate stress and neutral) and ORI with five classes (angry, happy, anxious, dysphoric and neutral). Depending on the feature/classifier combination, the average classification rates for the SUSAS data ranged from 61% to 95%, whereas the ORI data yielded lower average rates, ranging from 37% to 57%. The best overall performance was achieved with TEO-PWP features in combination with the GMM classifier, giving an average of 94.75% correct classifications for the SUSAS data and 56.6% for the ORI data. Differences in arousal level between the SUSAS and ORI emotional classes are suggested as the most likely cause of the difference in classification rates between the two data sets.

emotion recognition; speech classification; MFCC; TEO analysis; perceptual wavelet packets; GMM; PNN

Ling He, Margaret Lech, Namunu Maddage, Sheeraz Memon, Nicholas Allen

School of Electrical and Computer Engineering, RMIT University, Melbourne, Australia; Department of Psychology, The University of Melbourne, Melbourne, Australia

International conference

The 3rd International Conference on Bioinformatics and Biomedical Engineering (iCBBE 2009)

Beijing

English

1-4

2009-06-11 (date first posted online on the Wanfang platform; not the paper's publication date)