Segmentation and semantic labelling of RGBD data with convolutional neural networks and surface fitting

Authors: Giampaolo Pagnutti, Ludovico Minto, Pietro Zanuttigh

DOI: 10.1049/IET-CVI.2016.0502

Abstract: We present an approach for segmentation and semantic labelling of RGBD data that exploits geometrical cues together with deep learning techniques. An initial over-segmentation is performed using spectral clustering, and a set of non-uniform rational B-spline (NURBS) surfaces is fitted on the extracted segments. Then a convolutional neural network (CNN) receives as input colour and geometry data together with the surface fitting parameters. The network is made of nine stages followed by a softmax classifier and produces a vector of descriptors for each sample. In the next step, an iterative merging algorithm recombines the output of the over-segmentation into larger regions matching the various elements of the scene. The pairs of adjacent segments with higher similarity according to the CNN features are candidates to be merged, and the surface fitting accuracy is used to detect which pairs of segments belong to the same surface. Finally, a set of labelled segments is obtained by combining the segmentation output with the descriptors from the CNN. Experimental results show that the proposed approach outperforms state-of-the-art methods and provides an accurate segmentation and labelling.
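The iterative merging step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation and rests on several assumptions. A least-squares plane fit stands in for the NURBS surface fitting. Cosine similarity stands in for the descriptor similarity measure. The names `plane_fit_rmse`, `cosine_similarity` and `merge_segments` and the thresholds `sim_threshold` and `fit_tolerance` are hypothetical.

```python
import numpy as np

def plane_fit_rmse(points):
    """RMS distance of 3D points to their least-squares plane.
    Assumption: a simple stand-in for the NURBS surface fitting accuracy."""
    centred = points - points.mean(axis=0)
    # The smallest singular value measures the spread along the plane normal.
    singular_values = np.linalg.svd(centred, compute_uv=False)
    return float(singular_values[-1] / np.sqrt(len(points)))

def cosine_similarity(a, b):
    """Assumed similarity measure between two descriptor vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def merge_segments(points, descriptors, adjacency,
                   sim_threshold=0.9, fit_tolerance=0.05):
    """Greedily merge adjacent segments.

    points:      dict segment id -> (N, 3) array of 3D samples
    descriptors: dict segment id -> descriptor vector (e.g. CNN output)
    adjacency:   set of frozenset({id_a, id_b}) for touching segments
    """
    points, descriptors = dict(points), dict(descriptors)
    adjacency, rejected = set(adjacency), set()
    while True:
        # Candidate pairs: adjacent, not yet rejected, similar descriptors.
        candidates = []
        for a, b in (sorted(pair) for pair in adjacency - rejected):
            sim = cosine_similarity(descriptors[a], descriptors[b])
            if sim >= sim_threshold:
                candidates.append((sim, a, b))
        if not candidates:
            return points, descriptors
        _, a, b = max(candidates)  # most similar pair first
        merged = np.vstack([points[a], points[b]])
        # Geometric check: both segments must lie on one well-fitted surface.
        if plane_fit_rmse(merged) > fit_tolerance:
            rejected.add(frozenset((a, b)))
            continue
        # Accept the merge: fold b into a, averaging descriptors by size.
        n_a, n_b = len(points[a]), len(points[b])
        descriptors[a] = (n_a * descriptors[a] + n_b * descriptors[b]) / (n_a + n_b)
        points[a] = merged
        del points[b], descriptors[b]
        # Redirect b's neighbours to a and drop the collapsed pair.
        relabelled = (frozenset(a if s == b else s for s in p) for p in adjacency)
        adjacency = {p for p in relabelled if len(p) == 2}
        # Pairs touching the changed segment must be evaluated again.
        rejected = {p for p in rejected if a not in p and b not in p}
```

In this sketch two coplanar floor patches with similar descriptors would be merged, while a floor patch and an adjoining wall would stay separate even with similar descriptors, because the joint surface fit is poor.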
