Learning Video Primitives from Natural Video Sequence
Block-based video modeling is an active topic in video information processing. In previous work, the block size has been set to a fixed value. In contrast, we find that the optimal primitive size depends on the video content rather than on any fixed value. In this paper, to model natural video sequences, we segment a video sequence into a number of spatial-temporal neighborhoods and categorize them into two types: structural video primitives, which represent structural pixels and their motion, and textural video primitives, which represent texture neighborhoods and their motion. We learn the size of the video primitives using a genetic algorithm together with the entropy of the spatial-temporal neighborhoods, and then map the spatial-temporal neighborhoods to primitives using the structural similarity (SSIM) index. The experimental results demonstrate that the primitive size depends on the content of the video rather than being fixed, that the structural and textural video primitives are separated better with our method than with a fixed size, and that the computational time for learning primitives is greatly reduced. The learned primitives can be used for video reconstruction, video segmentation, and other applications.
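A minimal sketch (not the authors' implementation) of two steps the abstract describes: measuring the entropy of spatial-temporal neighborhoods, which a genetic algorithm could use as a fitness signal when searching over primitive sizes, and mapping a neighborhood to its most similar primitive with SSIM. The block tiling, the 256-bin histogram, the mean-entropy fitness, and the greedy nearest-primitive matching are illustrative assumptions; the paper's actual GA fitness function is not reproduced here.

```python
import numpy as np
from skimage.metrics import structural_similarity

def neighborhood_entropy(block):
    """Shannon entropy of the gray-level histogram of one t x w x w block."""
    hist, _ = np.histogram(block, bins=256, range=(0, 256), density=True)
    p = hist[hist > 0]
    return float(-np.sum(p * np.log2(p)))

def split_neighborhoods(video, w, t):
    """Tile a (T, H, W) gray-level volume into non-overlapping t x w x w blocks."""
    T, H, W = video.shape
    return [video[k:k + t, i:i + w, j:j + w]
            for k in range(0, T - t + 1, t)
            for i in range(0, H - w + 1, w)
            for j in range(0, W - w + 1, w)]

def mean_entropy(video, w, t):
    """Illustrative GA fitness for a candidate size (w, t): the average
    neighborhood entropy. A genetic algorithm would evolve a population of
    (w, t) candidates and keep those with the best fitness."""
    return float(np.mean([neighborhood_entropy(b)
                          for b in split_neighborhoods(video, w, t)]))

def map_to_primitive(block, primitives):
    """Index of the primitive most similar to `block`, scored by the mean
    per-frame SSIM (assumes spatial size w >= 7, SSIM's default window)."""
    def score(p):
        return np.mean([structural_similarity(block[f], p[f], data_range=255)
                        for f in range(block.shape[0])])
    return max(range(len(primitives)), key=lambda i: score(primitives[i]))

# Usage on synthetic data: 16 frames of 64 x 64 noise, 8 x 8 x 4 neighborhoods.
video = np.random.randint(0, 256, size=(16, 64, 64)).astype(np.float64)
blocks = split_neighborhoods(video, w=8, t=4)
print("fitness(w=8, t=4):", mean_entropy(video, w=8, t=4))
print("best primitive:", map_to_primitive(blocks[0], blocks[:10]))
```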
structural video primitives; textural video primitives; genetic algorithm; spatial-temporal neighborhoods
Lu Li, Hong Zhang, Wenyan Jia, Zhi-Hong Mao, Yuhu You, Mingui Sun
School of Astronautics, Beihang University, Beijing, China; Departments of Neurosurgery / Electrical Engineering, University of Pittsburgh, Pittsburgh, PA 15213
International conference
2011 4th International Congress on Image and Signal Processing (CISP 2011)
Shanghai
English
pp. 502-506
2011-10-15 (date first posted online on the Wanfang platform; not necessarily the paper's publication date)