Pixel Saliency Based Encoding for Fine-Grained Image Classification

Abstract:

Fine-grained image classification concerns categorization at subordinate levels, where the differences between objects of different classes are very subtle and highly local. Recently, Convolutional Neural Networks (CNNs) have achieved nearly the best results on basic image classification tasks. In a CNN, a direct pooling operation is usually used to resize the last convolutional feature maps from n × n × c to 1 × 1 × c for feature representation. However, such a pooling operation may lead to extreme saliency compression of the feature maps, especially in fine-grained image classification. In this paper, to explore the representation ability of the feature maps more deeply, we propose a Pixel Saliency based Encoding method, called PS-CNN. First, in PS-CNN, a saliency matrix is obtained by evaluating the saliency of each pixel in the feature maps. Then, we segment the original feature maps into multiple ones using multiple binary masks generated by thresholding the saliency matrix, and subsequently squeeze the masked feature maps into encoded ones. Finally, a fine-grained feature representation is generated by concatenating the original feature maps with the encoded ones. Experimental results show that our simple yet powerful PS-CNN outperforms state-of-the-art classification approaches. Specifically, we achieve 89.1% classification accuracy on Aircraft, 92.3% on Stanford Car, and 81.9% on NABirds.
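The encoding steps described in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not specify how pixel saliency is measured, how thresholds are chosen, or how the masked maps are squeezed, so everything below is an assumption made for illustration: saliency is taken as the channel-wise mean activation, thresholds are fixed fractions of the maximum saliency, and the squeeze is an average over the pixels each mask keeps. The function name `pixel_saliency_encode` and the parameter `fractions` are hypothetical, and the sketch pools each map to a vector before concatenating, which may differ from the order used in the paper.

```python
import numpy as np


def pixel_saliency_encode(feature_map, fractions=(0.25, 0.5)):
    """Encode an (n, n, c) feature map into a vector of length c * (1 + len(fractions)).

    Assumes non-negative activations (e.g. after ReLU).
    """
    # Assumed saliency measure: mean activation across channels, shape (n, n).
    saliency = feature_map.mean(axis=2)

    # Direct pooling of the original feature map (n x n x c -> c).
    parts = [feature_map.mean(axis=(0, 1))]

    for frac in fractions:
        # Assumed threshold rule: a fixed fraction of the maximum saliency.
        threshold = frac * saliency.max()
        mask = saliency >= threshold  # binary mask, shape (n, n)

        # Zero out the pixels that fall below the threshold.
        masked = feature_map * mask[:, :, None]

        # Assumed squeeze: average over the kept pixels only,
        # guarding against an empty mask.
        kept = max(int(mask.sum()), 1)
        parts.append(masked.sum(axis=(0, 1)) / kept)

    # Concatenate the pooled original with the encoded (masked) descriptors.
    return np.concatenate(parts)


# Toy usage: a 2 x 2 feature map with 3 channels and two active pixels.
fmap = np.zeros((2, 2, 3))
fmap[0, 0] = 4.0
fmap[1, 1] = 1.0
descriptor = pixel_saliency_encode(fmap)
print(descriptor.shape)
```

In this toy input the plain average is diluted by the two zero pixels, while the masked averages retain the stronger activations, which is the saliency compression effect the abstract attributes to direct pooling.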

Keywords: Pixel saliency; Feature encoding; Fine-grained; Image classification

Authors: Chao Yin, Lei Zhang, Ji Liu

Affiliation: College of Communication Engineering, Chongqing University, No. 174 Shazheng Street, Shapingba District, Chongqing 400044, China

Conference type: International conference

Conference name: Chinese Conference on Pattern Recognition and Computer Vision (PRCV2018)

Conference location: Guangzhou

Conference language: English

Pages: 274-285

Online publication date: 2018-11-23 (date first posted on the Wanfang platform; not the paper's publication date)
