SPORTS AUDIO CLASSIFICATION BASED ON MFCC AND GMM
Audio segmentation and classification can provide useful information for multimedia content analysis. In this paper, we present an approach to segment and categorize sports audio into speech, music, and other environmental sounds for sports video classification and highlight detection. We investigate the performance of Mel Frequency Cepstral Coefficients (MFCC) in a Gaussian Mixture Model (GMM) framework and compare it with the traditional short-time energy and zero-crossing rate features. We achieve a correct identification rate close to 90% using MFCC with its first and second derivatives.
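The baseline features named in the abstract, short-time energy and zero-crossing rate, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the frame length, and the hop size are assumptions made for illustration, and the derivative step uses a plain finite difference (`np.gradient`) as a stand-in for whatever delta formula the paper applies to MFCC.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames, one frame per row."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def short_time_energy(frames):
    """Mean squared amplitude of each frame."""
    return np.mean(frames ** 2, axis=1)

def zero_crossing_rate(frames):
    """Fraction of adjacent sample pairs in each frame whose signs differ."""
    signs = np.sign(frames)
    signs[signs == 0] = 1  # treat exact zeros as positive
    return np.mean(signs[:, 1:] != signs[:, :-1], axis=1)

def add_deltas(feats):
    """Append first and second time derivatives (finite differences,
    an assumed stand-in) to a (n_frames, n_features) matrix."""
    d1 = np.gradient(feats, axis=0)
    d2 = np.gradient(d1, axis=0)
    return np.hstack([feats, d1, d2])

# Hypothetical settings: 8 kHz audio, 25 ms frames, 10 ms hop.
sample_rate = 8000
signal = np.random.default_rng(0).standard_normal(sample_rate)
frames = frame_signal(signal, frame_len=200, hop=80)
baseline = np.column_stack([short_time_energy(frames),
                            zero_crossing_rate(frames)])
features = add_deltas(baseline)  # columns: static, first, second derivative
```

The same `add_deltas` step could be applied to a matrix of MFCC vectors to obtain the kind of feature set the abstract describes; one GMM per class (speech, music, other) would then be trained on such vectors, and a segment would be assigned to the class whose model gives the highest likelihood.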
audio classification; MFCC; GMM; sports audio
Liu Jiqing, Dong Yuan, Huang Jun, Zhao Xianyu, Wang Haila
Beijing University of Posts and Telecommunications, Beijing; France Telecom Research & Development Center, Beijing
International conference
Beijing
English
482-485
2009-10-18 (date first posted online on the Wanfang platform; does not represent the paper's publication date)