DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs

Authors: Liang-Chieh Chen, George Papandreou, Iasonas Kokkinos, Kevin Murphy, Alan L. Yuille

DOI: 10.1109/TPAMI.2017.2699184

Keywords: Pyramid, Computer vision, Conditional random field, Upsampling, Deep learning, CRFs, Computer science, Graphical model, Convolutional neural network, Pattern recognition, Artificial intelligence, Scale-space segmentation, Convolution, Test set

Abstract: In this work we address the task of semantic image segmentation with Deep Learning and make three main contributions that are experimentally shown to have substantial practical merit. First, we highlight convolution with upsampled filters, or ‘atrous convolution’, as a powerful tool in dense prediction tasks. Atrous convolution allows us to explicitly control the resolution at which feature responses are computed within Deep Convolutional Neural Networks (DCNNs). It also allows us to effectively enlarge the field of view of filters to incorporate larger context without increasing the number of parameters or the amount of computation. Second, we propose atrous spatial pyramid pooling (ASPP) to robustly segment objects at multiple scales. ASPP probes an incoming convolutional feature layer with filters at multiple sampling rates and effective fields-of-view, thus capturing objects as well as image context at multiple scales. Third, we improve the localization of object boundaries by combining methods from DCNNs and probabilistic graphical models. The commonly deployed combination of max-pooling and downsampling in DCNNs achieves invariance but has a toll on localization accuracy. We overcome this by combining the responses at the final DCNN layer with a fully connected Conditional Random Field (CRF), which is shown both qualitatively and quantitatively to improve localization performance. Our proposed “DeepLab” system sets the new state-of-art at the PASCAL VOC-2012 semantic image segmentation task, reaching 79.7 percent mIOU in the test set, and advances the results on three other datasets: PASCAL-Context, PASCAL-Person-Part, and Cityscapes. All of our code is made publicly available online.
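
The claim that atrous convolution enlarges the field of view without adding parameters can be illustrated with a minimal 1-D NumPy sketch. It assumes the usual definition y[i] = sum_k x[i + r*k] * w[k] with sampling rate r, under which a filter of length k covers k + (k - 1)(r - 1) input samples. The function names are illustrative only and are not taken from the DeepLab code release.

```python
import numpy as np


def effective_kernel_size(k, rate):
    """Input samples spanned by a length-k filter applied at the given rate."""
    return k + (k - 1) * (rate - 1)


def atrous_conv1d(x, w, rate=1):
    """Valid-mode 1-D atrous convolution: y[i] = sum_k x[i + rate*k] * w[k].

    The filter w keeps the same number of weights for every rate; only the
    spacing between the input samples it reads changes.
    """
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    span = effective_kernel_size(len(w), rate)
    out_len = len(x) - span + 1
    if out_len <= 0:
        raise ValueError("input is shorter than the effective kernel size")
    # x[i : i + span : rate] picks exactly len(w) samples spaced `rate` apart.
    return np.array([np.dot(x[i:i + span:rate], w) for i in range(out_len)])


signal = np.arange(10)
weights = [1.0, 1.0, 1.0]                        # three parameters in both cases
dense = atrous_conv1d(signal, weights, rate=1)   # each output sees 3 inputs
atrous = atrous_conv1d(signal, weights, rate=2)  # each output sees a span of 5
```

With rate 2 the same three weights cover a window of five input samples, which is the sense in which the field of view grows while the parameter count and the multiply-adds per output stay fixed.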
