Conference topic

SPORTS AUDIO CLASSIFICATION BASED ON MFCC AND GMM

Audio segmentation and classification can provide useful information for multimedia content analysis. In this paper, we present an approach to segment and categorize sports audio into speech, music and other environmental sounds for sports video classification and highlight detection. We investigate the performance of Mel Frequency Cepstral Coefficients (MFCC) in a Gaussian Mixture Model (GMM) framework, and compare it to the traditional short-time energy and zero-crossing rate features. We achieve a correct identification rate close to 90% using MFCC with its first and second derivatives.
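The abstract does not give implementation details, so the following is only a minimal sketch of how a GMM-based decision of this kind is commonly made: each class (speech, music, other) gets its own diagonal-covariance GMM, and a segment is assigned to the class whose model gives the highest total log-likelihood over the segment's feature frames. The function names, the two-dimensional toy features, and every parameter value in `TOY_MODELS` are illustrative assumptions, not taken from the paper; in practice the frames would be MFCC vectors (with derivatives) and the models would be trained on labelled audio.

```python
import math


def frame_log_likelihood(x, weights, means, variances):
    """Log-density of one feature vector under a diagonal-covariance GMM."""
    component_logs = []
    for w, mu, var in zip(weights, means, variances):
        log_p = math.log(w)
        for xi, mi, vi in zip(x, mu, var):
            # Log of a 1-D Gaussian density, summed over dimensions.
            log_p += -0.5 * (math.log(2.0 * math.pi * vi) + (xi - mi) ** 2 / vi)
        component_logs.append(log_p)
    # Log-sum-exp over components for numerical stability.
    peak = max(component_logs)
    return peak + math.log(sum(math.exp(c - peak) for c in component_logs))


def classify_segment(frames, models):
    """Return (best_label, scores): the class with the highest summed log-likelihood."""
    scores = {
        label: sum(frame_log_likelihood(x, *params) for x in frames)
        for label, params in models.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical models: (weights, means, variances) per class. Toy values only.
TOY_MODELS = {
    "speech": ([0.5, 0.5], [[0.0, 0.0], [1.0, 1.0]], [[1.0, 1.0], [1.0, 1.0]]),
    "music": ([1.0], [[5.0, 5.0]], [[1.0, 1.0]]),
    "other": ([1.0], [[-5.0, 5.0]], [[2.0, 2.0]]),
}

# A made-up segment of three 2-D feature frames lying near the "music" mean.
segment = [[4.8, 5.1], [5.2, 4.9], [5.0, 5.3]]
label, scores = classify_segment(segment, TOY_MODELS)
print(label)
```

Summing per-frame log-likelihoods treats frames as independent given the class, which is the usual simplification in GMM classifiers of this type.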

audio classification; MFCC; GMM; sports audio

Liu Jiqing, Dong Yuan, Huang Jun, Zhao Xianyu, Wang Haila

Beijing University of Posts and Telecommunications, Beijing; France Telecom Research & Development Center, Beijing

International conference

2009 2nd IEEE International Conference on Broadband Network & Multimedia Technology (IEEE IC-BNMT2009)

Beijing

English

482-485

2009-10-18 (date first posted on the Wanfang platform; not the paper's publication date)