A STATISTICS-BASED METHOD FOR VIDEO SEMANTIC ANALYSIS
Based on statistical theory, this paper proposes a generic framework for video semantic content analysis. Multilayer semantic analysis and multimodal information fusion are unified in a single model. First, a frame-segment key-frame strategy and an attention selection model are used to represent video content concisely. Basic visual semantics are then recognized with pattern classification techniques, and a multilayer structural model is used to extract multi-level visual semantics. Next, an audio semantic analysis scheme is presented that uses spectral features extracted by the Fourier transform. Finally, a bionic multimodal fusion method with a two-level structure is proposed for video semantic concept analysis. Experimental results demonstrate that the framework can fuse multimodal features, extract semantics at different granularities, and bridge the semantic gap to some extent.
Video semantic analysis; Video semantic concept; HHMM; Multimodal fusion
WEI WEI; ZHEN-XIA YUE; MIN HUANG
Department of Computer Science and Technology, Chengdu University of Information Technology, Chengdu, China
International conference
2007 International Conference on Machine Learning and Cybernetics (6th IEEE International Conference on Machine Learning and Cybernetics)
Hong Kong
English
1620-1625
2007-08-19 (date first posted online on the Wanfang platform; not the paper's publication date)