Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour

Authors: Aapo Kyrola, Piotr Dollár, Lukasz Wesolowski, Yangqing Jia, Andrew Tulloch


Abstract: … offers a potential solution to this problem by dividing SGD … nontrivial growth in the SGD minibatch size. In this paper, we … loss of accuracy when training with large minibatch sizes up to …


References (33)

William Gropp, Ewing Lusk, Anthony Skjellum. Using MPI: Portable Parallel Programming with the Message-Passing Interface. (1994)

Rolf Rabenseifner. Optimization of Collective Reduction Operations. International Conference on Computational Science, pp. 1-9 (2004). DOI: 10.1007/978-3-540-24685-5_1

Alex Krizhevsky. One weird trick for parallelizing convolutional neural networks. arXiv: Neural and Evolutionary Computing (2014)

Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun. Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification. International Conference on Computer Vision, pp. 1026-1034 (2015). DOI: 10.1109/ICCV.2015.123

M. Gürbüzbalaban, A. Ozdaglar, P. A. Parrilo. Why Random Reshuffling Beats Stochastic Gradient Descent. arXiv: Optimization and Control (2015). DOI: 10.1007/S10107-019-01440-W

Jonathan Long, Evan Shelhamer, Trevor Darrell. Fully convolutional networks for semantic segmentation. Computer Vision and Pattern Recognition, pp. 3431-3440 (2015). DOI: 10.1109/CVPR.2015.7298965

Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, Andrew Rabinovich. Going deeper with convolutions. Computer Vision and Pattern Recognition, pp. 1-9 (2015). DOI: 10.1109/CVPR.2015.7298594

Ross Girshick, Jeff Donahue, Trevor Darrell, Jitendra Malik. Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation. Computer Vision and Pattern Recognition, pp. 580-587 (2014). DOI: 10.1109/CVPR.2014.81

M. Barnett, L. Shuler, R. van de Geijn, S. Gupta, D.G. Payne, J. Watts. Interprocessor collective communication library (InterCom). IEEE International Conference on High Performance Computing Data and Analytics, pp. 357-364 (1994). DOI: 10.1109/SHPCC.1994.296665


Olga Russakovsky, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, Andrej Karpathy, Aditya Khosla, Michael Bernstein, Alexander C. Berg, Li Fei-Fei. ImageNet Large Scale Visual Recognition Challenge. International Journal of Computer Vision, vol. 115, pp. 211-252 (2015). DOI: 10.1007/S11263-015-0816-Y
