Incremental and Multi-Task Learning Strategies for Coarse-To-Fine Semantic Segmentation

Authors: Mazen Mel, Umberto Michieli, Pietro Zanuttigh

DOI: 10.3390/TECHNOLOGIES8010001

Keywords:

Abstract: The semantic understanding of a scene is a key problem in the computer vision field. In this work, we address the multi-level segmentation task, where a deep neural network is first trained to recognize an initial, coarse set of a few classes. Then, in an incremental-like approach, it is adapted to segment and label new object categories hierarchically derived by subdividing the classes of the initial set. We propose strategies in which the output of the coarse classifiers is fed to the architectures performing the finer classification. Furthermore, we investigate the possibility of predicting the different levels together, which also helps achieve higher accuracy. Experimental results on the New York University Depth v2 (NYUDv2) dataset show promising insights into scene understanding.
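
The coarse-to-fine idea in the abstract can be illustrated with a minimal sketch. This is not the authors' architecture, which feeds coarse outputs into a neural network; it is a simplified probabilistic stand-in for a single pixel, in which each coarse class's probability is split among its subclasses according to the scores of a finer classifier. The hierarchy, the class names, and the functions `refine_pixel` and `predict_fine_label` are hypothetical and chosen only for illustration.

```python
# Hypothetical two-level hierarchy: coarse class -> fine subclasses.
# These names are illustrative assumptions, not taken from the paper.
HIERARCHY = {
    "furniture": ["chair", "table"],
    "structure": ["wall", "floor"],
}


def refine_pixel(coarse_probs, fine_scores):
    """Combine a coarse prediction with finer scores for one pixel.

    coarse_probs: dict mapping coarse class -> probability (sums to 1).
    fine_scores:  dict mapping fine class -> non-negative score.
    Returns a dict mapping fine class -> probability, where the mass of
    each coarse class is shared among its children in proportion to
    their fine scores.
    """
    fine_probs = {}
    for coarse, children in HIERARCHY.items():
        total = sum(fine_scores[c] for c in children)
        for child in children:
            # Uniform split if the finer classifier gives no evidence.
            share = fine_scores[child] / total if total > 0 else 1.0 / len(children)
            fine_probs[child] = coarse_probs[coarse] * share
    return fine_probs


def predict_fine_label(coarse_probs, fine_scores):
    """Return the fine class with the highest combined probability."""
    fine_probs = refine_pixel(coarse_probs, fine_scores)
    return max(fine_probs, key=fine_probs.get)


if __name__ == "__main__":
    coarse = {"furniture": 0.8, "structure": 0.2}
    fine = {"chair": 1.0, "table": 3.0, "wall": 5.0, "floor": 5.0}
    print(refine_pixel(coarse, fine))
    print(predict_fine_label(coarse, fine))
```

In this toy example, "wall" has the largest raw fine score, yet the combined prediction favours a subclass of "furniture" because the coarse stage assigns most of the probability there. That is the sense in which a coarse output can guide a finer classification.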

References (39)
Nathan Silberman, Derek Hoiem, Pushmeet Kohli, Rob Fergus. Indoor Segmentation and Support Inference from RGBD Images. Computer Vision – ECCV 2012, pp. 746-760 (2012). DOI: 10.1007/978-3-642-33715-4_54
Clément Farabet, Camille Couprie, Yann LeCun, Laurent Najman. Convolutional nets and watershed cuts for real-time semantic Labeling of RGBD videos. Journal of Machine Learning Research, vol. 15, pp. 3489-3511 (2014).
Anran Wang, Jiwen Lu, Gang Wang, Jianfei Cai, Tat-Jen Cham. Multi-modal Unsupervised Feature Learning for RGB-D Scene Labeling. Computer Vision – ECCV 2014, pp. 453-467 (2014). DOI: 10.1007/978-3-319-10602-1_30
Clément Farabet, Camille Couprie, Yann LeCun, Laurent Najman. Indoor Semantic Segmentation using depth information. arXiv: Computer Vision and Pattern Recognition (2013).
Saurabh Gupta, Ross Girshick, Pablo Arbeláez, Jitendra Malik. Learning Rich Features from RGB-D Images for Object Detection and Segmentation. European Conference on Computer Vision, pp. 345-360 (2014). DOI: 10.1007/978-3-319-10584-0_23
Geoffrey Hinton, Oriol Vinyals, Jeff Dean. Distilling the Knowledge in a Neural Network. arXiv: Machine Learning (2015).
Jonathan Long, Evan Shelhamer, Trevor Darrell. Fully convolutional networks for semantic segmentation. Computer Vision and Pattern Recognition, pp. 3431-3440 (2015). DOI: 10.1109/CVPR.2015.7298965
César Cadena, Jana Košecká. Semantic parsing for priming object detection in indoors RGB-D scenes. The International Journal of Robotics Research, vol. 34, pp. 582-597 (2015). DOI: 10.1177/0278364914549488
Saurabh Gupta, Pablo Arbeláez, Ross Girshick, Jitendra Malik. Indoor Scene Understanding with RGB-D Images: Bottom-up Segmentation, Object Detection and Semantic Segmentation. International Journal of Computer Vision, vol. 112, pp. 133-149 (2015). DOI: 10.1007/S11263-014-0777-6
Mark Everingham, Luc Van Gool, Christopher K. I. Williams, John Winn, Andrew Zisserman. The Pascal Visual Object Classes (VOC) Challenge. International Journal of Computer Vision, vol. 88, pp. 303-338 (2010). DOI: 10.1007/S11263-009-0275-4