Scaling Object Detection by Transferring Classification Weights

Authors: Jason Kuen, Federico Perazzi, Zhe Lin, Jianming Zhang, Yap-Peng Tan

DOI: 10.1109/ICCV.2019.00614

Keywords: Set (abstract data type), Generalization, Feature (computer vision), Object detection, Object (computer science), Normalization (statistics), Artificial intelligence, Scale (descriptive set theory), Pattern recognition, Computer science

Abstract: Large-scale object detection datasets are constantly increasing in size, in terms of both the number of classes and the annotation count. Yet the number of object-level categories annotated in detection datasets is an order of magnitude smaller than the number of image-level classification labels. State-of-the-art object detection models are trained in a supervised fashion, and this limits the number of object classes they can detect. In this paper, we propose a novel weight transfer network (WTN) to effectively and efficiently transfer knowledge from a classification network's weights to a detection network's weights, allowing detection of novel classes without box supervision. We first introduce input and feature normalization schemes to curb under-fitting during the training of a vanilla WTN. We then propose the autoencoder-WTN (AE-WTN), which uses a reconstruction loss to preserve the classification network's information over all classes in the target latent space and so ensure generalization to novel classes. Compared to the vanilla WTN, AE-WTN obtains absolute performance gains of 6% on two Open Images evaluation sets with 500 seen and 57 novel classes respectively, and 25% on a Visual Genome evaluation set with 200 novel classes.
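The abstract gives no architectural details, so the following is only a minimal sketch of the general idea: a class's classification weight vector is normalized, encoded into a detection weight vector, and decoded back so that a reconstruction loss can be computed. The single linear encoder and decoder, the L2 input normalization, the dimensions, and every function name here are assumptions made for illustration, not the authors' implementation.

```python
import math
import random


def l2_normalize(v, eps=1e-12):
    """Scale a vector to unit L2 norm (one possible input normalization; an assumption)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / (norm + eps) for x in v]


def linear(matrix, v):
    """Multiply a matrix, stored as a list of rows, by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in matrix]


def random_matrix(rows, cols, rng, scale=0.1):
    """Small random weights standing in for learned parameters."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def ae_wtn_forward(cls_weight, encoder, decoder):
    """Map one class's classification weight to a detection weight and back."""
    x = l2_normalize(cls_weight)
    det_weight = linear(encoder, x)               # latent code, used as the detector's class weight
    reconstruction = linear(decoder, det_weight)  # attempt to recover the normalized input
    return x, det_weight, reconstruction


def reconstruction_loss(x, reconstruction):
    """Mean squared error between the normalized input and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, reconstruction)) / len(x)


# Hypothetical sizes; real classifier and detector weights are far larger.
CLS_DIM, DET_DIM = 8, 4
rng = random.Random(0)
encoder = random_matrix(DET_DIM, CLS_DIM, rng)
decoder = random_matrix(CLS_DIM, DET_DIM, rng)

cls_weight = [rng.gauss(0.0, 1.0) for _ in range(CLS_DIM)]
x, det_weight, recon = ae_wtn_forward(cls_weight, encoder, decoder)
loss = reconstruction_loss(x, recon)
```

In this sketch the reconstruction loss can be evaluated for any class that has a classification weight, including classes with no box annotations, which is how the abstract's "all classes" is read here. A detection loss, which the sketch omits, would be available only for the seen classes.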
