Learning Non-local Representation for Visual Tracking

Abstract:

Discriminative Correlation Filter (DCF) based trackers have tremendously improved tracking performance. They use the first frame of a video sequence to initialize the tracker and provide a fast solution thanks to their formulation in the Fourier domain. Previous work that applies a DCF layer on top of a pretrained CNN, however, has not taken full advantage of the CNN feature maps. In this paper, we propose a tracking architecture that fuses local and global response maps for accurate and robust visual tracking. The feature map extracted from the pretrained CNN is fed to a fully-convolutional DCF layer and a non-local layer to capture local and global response maps. Experiments show that our method achieves state-of-the-art performance on three popular benchmarks: OTB-2013, OTB-2015 and VOT2016.
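
The abstract attributes the speed of DCF trackers to their Fourier-domain formulation. As a rough illustration of that idea only, the sketch below trains a single-channel correlation filter in closed form on one patch and locates a shifted copy of it. It is not the architecture proposed in the paper (which works on CNN feature maps and adds a non-local layer); the function names, the Gaussian label, and the regularization value `lam` are all assumptions made for this example.

```python
import numpy as np

def gaussian_label(shape, sigma=2.0):
    """Desired response: a Gaussian whose peak sits at index (0, 0)."""
    h, w = shape
    ys = np.arange(h) - h // 2
    xs = np.arange(w) - w // 2
    g = np.exp(-(ys[:, None] ** 2 + xs[None, :] ** 2) / (2.0 * sigma ** 2))
    # The Gaussian is built with its peak at the centre; move it to the origin.
    return np.fft.ifftshift(g)

def train_dcf(x, y, lam=1e-4):
    """Ridge regression over all cyclic shifts of x, solved per frequency.

    lam is a hypothetical regularization weight chosen for this toy example.
    """
    X = np.fft.fft2(x)
    Y = np.fft.fft2(y)
    return (Y * np.conj(X)) / (X * np.conj(X) + lam)

def respond(W, z):
    """Response map of the learned filter W on a search patch z."""
    return np.real(np.fft.ifft2(W * np.fft.fft2(z)))

# Toy data: a random "template" patch and a cyclically shifted copy of it.
rng = np.random.default_rng(0)
x = rng.standard_normal((32, 32))
W = train_dcf(x, gaussian_label(x.shape))

z = np.roll(x, shift=(3, 5), axis=(0, 1))
response = respond(W, z)
peak = np.unravel_index(np.argmax(response), response.shape)
print(peak)  # location of the response peak, i.e. the estimated shift
```

Training and detection each cost a few FFTs and element-wise operations, which is the source of the speed mentioned above. The peak of the response map gives the estimated displacement of the target between the two patches.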

Keywords: Visual tracking; DCF; Non-local; Feature pyramid

Authors: Peng Zhang, Zengfu Wang

Affiliations: Institute of Intelligent Machines, Chinese Academy of Sciences, Hefei, China; University of Science and Technology of China, Hefei, China; National Engineering Laboratory for Speech and Language Information Processing, Hefei, China

Conference type: International conference

Conference name: Chinese Conference on Pattern Recognition and Computer Vision (PRCV 2018)

Conference location: Guangzhou

Conference language: English

Pages: 209-220

Online publication date: 2018-11-23 (date first posted on the Wanfang platform; not the paper's publication date)
