Generalized Max Pooling

Authors: Naila Murray, Florent Perronnin

DOI: 10.1109/CVPR.2014.317

Abstract: State-of-the-art patch-based image representations involve a pooling operation that aggregates statistics computed from local descriptors. Standard pooling operations include sum- and max-pooling. Sum-pooling lacks discriminability because the resulting representation is strongly influenced by frequent yet often uninformative descriptors, but only weakly influenced by rare yet potentially highly informative ones. Max-pooling equalizes the influence of frequent and rare descriptors but is only applicable to representations that rely on count statistics, such as the bag-of-visual-words (BOV) and its soft- and sparse-coding extensions. We propose a novel pooling mechanism that achieves the same effect as max-pooling but is applicable beyond the BOV, and especially to the state-of-the-art Fisher Vector, hence the name Generalized Max Pooling (GMP). It involves equalizing the similarity between each patch and the pooled representation, which is shown to be equivalent to re-weighting the per-patch statistics. We show on five public image classification benchmarks that the proposed GMP can lead to significant performance gains with respect to heuristic alternatives.
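The equalization idea in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation: it assumes that "similarity" is a dot product and that the pooled vector is found by ridge-regularized least squares, neither of which the abstract states. The function names, the regularizer `lam`, and the toy data are all hypothetical.

```python
import numpy as np


def sum_pool(Phi):
    """Sum-pooling: add up the per-patch embeddings (rows of Phi)."""
    return Phi.sum(axis=0)


def generalized_max_pool(Phi, lam=1e-3):
    """Pool the rows of Phi so that every patch has (nearly) the same
    dot-product similarity with the pooled vector.

    Assumption (not taken from the abstract): solve the ridge problem
        min_xi ||Phi @ xi - 1||^2 + lam * ||xi||^2
    in its dual form, which gives xi = Phi.T @ alpha, that is, a
    re-weighted sum of the per-patch embeddings.
    """
    n = Phi.shape[0]
    K = Phi @ Phi.T  # patch-to-patch similarities
    alpha = np.linalg.solve(K + lam * np.eye(n), np.ones(n))
    return Phi.T @ alpha, alpha


# Toy data: 9 identical "frequent" patches and 1 "rare" patch.
frequent = np.tile([1.0, 0.0, 0.0], (9, 1))
rare = np.array([[0.0, 1.0, 0.0]])
Phi = np.vstack([frequent, rare])

xi_sum = sum_pool(Phi)
xi_gmp, alpha = generalized_max_pool(Phi)

sim_sum = Phi @ xi_sum  # frequent patches: 9.0, rare patch: 1.0
sim_gmp = Phi @ xi_gmp  # close to 1.0 for every patch

print("sum-pooling similarities:", sim_sum)
print("GMP-style similarities:  ", np.round(sim_gmp, 3))
print("per-patch weights:       ", np.round(alpha, 3))
```

In this toy setting the sum-pooled vector is nine times more similar to the frequent patches than to the rare one, whereas the re-weighted vector is about equally similar to all ten. The weights in `alpha` show the re-weighting view: the rare patch receives roughly nine times the weight of each frequent patch.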
