Conference topic

Robust Segmentation for Video Captions with Complex Backgrounds

Caption text contains rich information that can be used for video indexing and summarization. In this paper, we propose an effective caption text segmentation approach to improve OCR accuracy. First, an AlexNet CNN is trained with path signature for text tracking. Then, an improved adaptive thresholding method is used to segment caption text in individual frames. Finally, multi-frame integration is conducted with gamma correction and region growing. In contrast to conventional methods, which extract video text from each frame independently, we exploit the specific temporal characteristics of videos to perform segmentation. Moreover, the proposed method can effectively remove complex backgrounds whose intensity is similar to that of the text. Experimental results on different videos and comparisons with other methods show the efficiency of our approach.

Caption text segmentation; Convolutional neural networks; Path signature; Multi-frame integration

Zong-Heng Xing, Fang Zhou, Shu Tian, Xu-Cheng Yin

Department of Computer Science and Technology, School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing, China

International conference

The 7th Chinese Conference on Pattern Recognition (CCPR2016)

Chengdu

English

89-100

2016-11-03 (date first made available online on the Wanfang platform; not the paper's publication date)