Semantic Understanding of Scenes Through the ADE20K Dataset

Authors: Bolei Zhou, Hang Zhao, Xavier Puig, Tete Xiao, Sanja Fidler

DOI: 10.1007/S11263-018-1140-0

Abstract: Semantic understanding of visual scenes is one of the holy grails of computer vision. Despite efforts of the community in data collection, there are still few image datasets covering a wide range of scenes and object categories with pixel-wise annotations for scene understanding. In this work, we present a densely annotated dataset, ADE20K, which spans diverse annotations of scenes, objects, parts of objects, and in some cases even parts of parts. In total, there are 25k images of complex everyday scenes containing a variety of objects in their natural spatial context. On average, there are 19.5 instances and 10.5 object classes per image. Based on ADE20K, we construct benchmarks for scene parsing and instance segmentation. We provide baseline performances on both benchmarks and re-implement state-of-the-art models for open source. We further evaluate the effect of synchronized batch normalization and find that a reasonably large batch size is crucial for semantic segmentation performance. We show that networks trained on ADE20K are able to segment a wide variety of scenes and objects.
