Global Deconvolutional Networks for Semantic Segmentation

Authors: Jaesik Choi, Janghoon Ju, Vladimir Nekrasov

Abstract: Semantic image segmentation is a principal problem in computer vision, where the aim is to correctly classify each individual pixel of an image into a semantic label. Its widespread use in many areas, including medical imaging and autonomous driving, has fostered extensive research in recent years. Empirical improvements in tackling this task have primarily been motivated by the successful exploitation of Convolutional Neural Networks (CNNs) pre-trained for classification and object recognition. However, pixel-wise labelling with CNNs has its own unique challenges: (1) an accurate deconvolution, or upsampling, of low-resolution output into a higher-resolution mask and (2) an inclusion of global information, or context, within locally extracted features. To address these issues, we propose a novel architecture to conduct the equivalent of the deconvolution operation globally and acquire dense predictions. We demonstrate that it leads to improved performance of state-of-the-art models on the PASCAL VOC 2012 benchmark, reaching 74.0% mean IU accuracy on the test set.
