Conference Topic

Learning Non-local Representation for Visual Tracking

  Discriminative Correlation Filter (DCF) based trackers have tremendously improved tracking performance. They use the first frame of a video sequence to initialize the tracker and provide a fast solution thanks to their formulation in the Fourier domain. Previous work that applies a DCF layer on top of a pretrained CNN, however, has not taken full advantage of the CNN feature maps. In this paper, we propose a tracking architecture that fuses local and global response maps for accurate and robust visual tracking. The feature map extracted from a pretrained CNN is fed to a fully convolutional DCF layer and a non-local layer to capture the local and global response maps, respectively. Experiments show that our method achieves state-of-the-art performance on three popular benchmarks: OTB-2013, OTB-2015 and VOT2016.

Visual tracking; DCF; Non-local; Feature pyramid

Peng Zhang, Zengfu Wang

Institute of Intelligent Machines, Chinese Academy of Sciences, Hefei, China; University of Science and Technology of China, Hefei, China; National Engineering Laboratory for Speech and Language Information Processing, Hefei, China

International conference

Chinese Conference on Pattern Recognition and Computer Vision (PRCV 2018)

Guangzhou

English

209-220

2018-11-23 (date first posted online on the Wanfang platform; not the paper's publication date)