High-dimensional signature compression for large-scale image classification

Authors: Jorge Sanchez, Florent Perronnin

DOI: 10.1109/CVPR.2011.5995504

Keywords: Pattern recognition, Handwriting recognition, Data compression, Kernel (image processing), Curse of dimensionality, Lossy compression, Computer science, Hash function, Contextual image classification, Artificial intelligence, Dimensionality reduction

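Abstract: We address image classification on a large scale, i.e. when a large number of images and classes are involved. First, we study classification accuracy as a function of the image signature dimensionality and the training set size. We show experimentally that the larger the training set, the higher the impact of the dimensionality on the accuracy. In other words, high-dimensional signatures are important to obtain state-of-the-art results on large datasets. Second, we tackle the problem of data compression on very large signatures (on the order of 10^5 dimensions) using two lossy compression strategies: a dimensionality reduction technique known as the hash kernel and an encoding technique based on product quantizers. We explain how the gain in storage can be traded against a loss in accuracy and/or an increase in CPU cost. We report results on two large databases, ImageNet and a dataset of 1M Flickr images, showing that we can reduce the storage of our signatures by a factor of 64 to 128 with little loss in accuracy. Integrating the decompression in the classifier learning yields an efficient and scalable training algorithm. On ILSVRC2010 we report a 74.3% accuracy at top-5, which corresponds to a 2.5% absolute improvement with respect to the state-of-the-art. On a subset of 10K classes of ImageNet we report a top-1 accuracy of 16.7%, a relative improvement of 160% with respect to the state-of-the-art.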
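
The two lossy strategies named in the abstract can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names, the randomly drawn hash tables, and the toy codebooks are all assumptions made for illustration, and codebook learning (typically done with k-means) is left out. The sub-vector sizes in the usage lines are chosen only to show how factors such as 64 or 128 can arise when 32-bit floats are replaced by short codes; they are not settings taken from the paper.

```python
import random

def make_hash_tables(in_dim, out_dim, seed=0):
    """Draw a bucket index h(i) and a sign xi(i) for every input dimension.

    Hypothetical helper: real systems usually use a fixed hash function
    rather than stored tables.
    """
    rng = random.Random(seed)
    buckets = [rng.randrange(out_dim) for _ in range(in_dim)]
    signs = [rng.choice((-1.0, 1.0)) for _ in range(in_dim)]
    return buckets, signs

def hash_kernel(x, buckets, signs, out_dim):
    """Hash-kernel style reduction: y[h(i)] += xi(i) * x[i]."""
    y = [0.0] * out_dim
    for value, bucket, sign in zip(x, buckets, signs):
        y[bucket] += sign * value
    return y

def pq_encode(x, codebooks):
    """Product quantization: replace each sub-vector of x by the index
    of its nearest centroid. Assumes len(x) is divisible by len(codebooks)."""
    sub_dim = len(x) // len(codebooks)
    codes = []
    for g, centroids in enumerate(codebooks):
        sub = x[g * sub_dim:(g + 1) * sub_dim]
        dists = [sum((a - b) ** 2 for a, b in zip(sub, c)) for c in centroids]
        codes.append(dists.index(min(dists)))
    return codes

def pq_decode(codes, codebooks):
    """Lossy reconstruction: concatenate the selected centroids."""
    out = []
    for code, centroids in zip(codes, codebooks):
        out.extend(centroids[code])
    return out

def pq_compression_factor(sub_dim, bits_per_code, bits_per_float=32):
    """Storage ratio between raw floats and one code per sub-vector."""
    return sub_dim * bits_per_float / bits_per_code

# Toy usage with two hand-written codebooks of two centroids each.
toy_codebooks = [[[0.0, 0.0], [1.0, 1.0]], [[0.0, 0.0], [5.0, 5.0]]]
print(pq_encode([0.9, 1.2, 4.0, 4.5], toy_codebooks))  # [1, 1]
print(pq_compression_factor(16, 8))  # 64.0
print(pq_compression_factor(32, 8))  # 128.0
```

The sketch shows the trade-off the abstract describes: longer sub-vectors per code give a larger storage gain but a coarser reconstruction, and decoding adds CPU cost whenever the classifier needs the approximate signature back.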
