Robust Segmentation for Video Captions with Complex Backgrounds

Abstract:

Caption text contains rich information that can be used for video indexing and summarization. In this paper, we propose an effective caption text segmentation approach to improve OCR accuracy. First, an AlexNet CNN is trained with path signature for text tracking. Next, an improved adaptive thresholding method is used to segment the caption text in individual frames. Finally, multi-frame integration is performed with gamma correction and region growing. In contrast to conventional methods, which extract video text from each frame independently, we exploit the temporal characteristics of videos to perform segmentation. Moreover, the proposed method can effectively remove complex backgrounds whose intensity is similar to that of the text. Experimental results on different videos and comparisons with other methods show the efficiency of our approach.

Keywords: Caption text segmentation; Convolutional neural networks; Path signature; Multi-frame integration

Authors: Zong-Heng Xing, Fang Zhou, Shu Tian, Xu-Cheng Yin

Affiliation: Department of Computer Science and Technology, School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing, China

Conference type: International conference

Conference name: The 7th Chinese Conference on Pattern Recognition (CCPR2016)

Conference location: Chengdu

Conference language: English

Pages: 89-100

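Online publication date: 2016-11-03 (the date the record first went online on the Wanfang platform; it does not indicate when the paper was published)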
