Improving Spatiotemporal Self-supervision by Deep Reinforcement Learning

Authors: Uta Büchler, Biagio Brattoli, Björn Ommer

DOI: 10.1007/978-3-030-01267-0_47

Abstract: Self-supervised learning of convolutional neural networks can harness large amounts of cheap unlabeled data to train powerful feature representations. As a surrogate task, we jointly address the ordering of visual data in the spatial and temporal domain. The permutations of training samples, which are at the core of self-supervision by ordering, have so far been sampled randomly from a fixed preselected set. Based on deep reinforcement learning, we propose a sampling policy that adapts to the state of the network that is being trained. New permutations are therefore sampled according to their expected utility for updating the feature representation. Experimental evaluation on unsupervised and transfer learning tasks demonstrates competitive performance on standard benchmarks for image and video classification and nearest neighbor retrieval.

References (46)
Amir Roshan Zamir, Khurram Soomro, Mubarak Shah. UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild. arXiv: Computer Vision and Pattern Recognition (2012).
Xiaolong Wang, Abhinav Gupta. Unsupervised Learning of Visual Representations Using Videos. 2015 IEEE International Conference on Computer Vision (ICCV), pp. 2794-2802 (2015). DOI: 10.1109/ICCV.2015.320
Carl Doersch, Abhinav Gupta, Alexei A. Efros. Unsupervised Visual Representation Learning by Context Prediction. International Conference on Computer Vision, pp. 1422-1430 (2015). DOI: 10.1109/ICCV.2015.167
Jonathan Long, Evan Shelhamer, Trevor Darrell. Fully convolutional networks for semantic segmentation. Computer Vision and Pattern Recognition, pp. 3431-3440 (2015). DOI: 10.1109/CVPR.2015.7298965
Ronald J. Williams, Jing Peng. Function Optimization using Connectionist Reinforcement Learning Algorithms. Connection Science, vol. 3, pp. 241-268 (1991). DOI: 10.1080/09540099108946587
Sepp Hochreiter, Jürgen Schmidhuber. Long short-term memory. Neural Computation, vol. 9, pp. 1735-1780 (1997). DOI: 10.1162/NECO.1997.9.8.1735
Olga Russakovsky, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, Andrej Karpathy, Aditya Khosla, Michael Bernstein, Alexander C. Berg, Li Fei-Fei. ImageNet Large Scale Visual Recognition Challenge. International Journal of Computer Vision, vol. 115, pp. 211-252 (2015). DOI: 10.1007/S11263-015-0816-Y
A.G. Barto, R.S. Sutton. Reinforcement Learning: An Introduction (1988).
H. Kuehne, H. Jhuang, E. Garrote, T. Poggio, T. Serre. HMDB: A large video database for human motion recognition. International Conference on Computer Vision, pp. 2556-2563 (2011). DOI: 10.1109/ICCV.2011.6126543
Daphne Koller, M. P. Kumar, Benjamin Packer. Self-Paced Learning for Latent Variable Models. Neural Information Processing Systems, vol. 23, pp. 1189-1197 (2010).