A STATISTICS-BASED METHOD FOR VIDEO SEMANTIC ANALYSIS

Abstract:

This paper proposes a generic framework, based on statistical theory, for video semantic content analysis. Multilayer semantic analysis and multimodal information fusion are unified in a single model. First, a frame-segment key-frame strategy and an attention selection model are used to represent video content concisely. Basic visual semantics are then recognized with pattern classification techniques, and a multilayer structural model is used to extract multi-level visual semantics. Next, an audio semantic analysis scheme is presented that uses spectral features extracted by the Fourier transform. Finally, a bionic multimodal fusion method with a two-level structure is proposed for video semantic concept analysis. Experimental results demonstrate that the framework can fuse multimodal features, extract semantics at different granularities, and bridge the semantic gap to some extent.

Keywords: Video semantic analysis; Video semantic concept; HHMM; Multimodal fusion

Authors: WEI WEI, ZHEN-XIA YUE, MIN HUANG

Affiliation: Department of Computer Science and Technology, Chengdu University of Information Technology, Chengdu, China

Conference type: International conference

Conference name: 2007 International Conference on Machine Learning and Cybernetics (IEEE 6th International Conference on Machine Learning and Cybernetics)

Conference location: Hong Kong

Conference language: English

Pages: 1620-1625

Online publication date: 2007-08-19 (date first posted on the Wanfang platform; not the paper's publication date)
