Deep Hough Voting for 3D Object Detection in Point Clouds

Authors: Charles R. Qi, Or Litany, Kaiming He, Leonidas Guibas

DOI: 10.1109/ICCV.2019.00937

Abstract: Current 3D object detection methods are heavily influenced by 2D detectors. In order to leverage architectures in 2D detectors, they often convert 3D point clouds to regular grids (i.e., to voxel grids or to bird's eye view images), or rely on detection in 2D images to propose 3D boxes. Few works have attempted to directly detect objects in point clouds. In this work, we return to first principles to construct a 3D detection pipeline for point cloud data that is as generic as possible. However, due to the sparse nature of the data (samples from 2D manifolds in 3D space), we face a major challenge when directly predicting bounding box parameters from scene points: a 3D object centroid can be far from any surface point and is thus hard to regress accurately in one step. To address the challenge, we propose VoteNet, an end-to-end 3D object detection network based on a synergy of deep point set networks and Hough voting. Our model achieves state-of-the-art 3D detection on two large datasets of real 3D scans, ScanNet and SUN RGB-D, with a simple design, compact model size and high efficiency. Remarkably, VoteNet outperforms previous methods by using purely geometric information without relying on color images.
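The voting idea in the abstract can be illustrated with a toy sketch. This is not the authors' VoteNet implementation: there is no learned network here, and every name (`sample_sphere_surface`, `cast_votes`, `aggregate`) is hypothetical. It assumes a single object, modeled as points on a sphere surface, and it simulates the per-point offset prediction as the true offset plus Gaussian noise. The point it tries to show is that every surface point lies far from the centroid, while the votes cast by those points land near it and can be aggregated.

```python
import math
import random


def sample_sphere_surface(center, radius, n, rng):
    """Sample n points on a sphere surface, a stand-in for a scanned object."""
    points = []
    for _ in range(n):
        # Normalizing a Gaussian vector yields a uniformly random direction.
        d = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(x * x for x in d)) or 1.0
        points.append(tuple(c + radius * x / norm for c, x in zip(center, d)))
    return points


def cast_votes(points, center, noise, rng):
    """Each seed point votes for the centroid: vote = point + offset.

    ASSUMPTION: the offset is the true offset plus Gaussian noise, standing in
    for what a learned point set network would regress.
    """
    votes = []
    for p in points:
        offset = [c_i - p_i + rng.gauss(0.0, noise) for p_i, c_i in zip(p, center)]
        votes.append(tuple(p_i + o_i for p_i, o_i in zip(p, offset)))
    return votes


def aggregate(votes):
    """Aggregate a single cluster of votes by averaging them."""
    n = len(votes)
    return tuple(sum(v[i] for v in votes) / n for i in range(3))


def dist(a, b):
    """Euclidean distance between two 3D points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


rng = random.Random(0)  # fixed seed so the sketch is reproducible
true_center = (1.0, 2.0, 0.5)
seeds = sample_sphere_surface(true_center, radius=1.0, n=200, rng=rng)
votes = cast_votes(seeds, true_center, noise=0.1, rng=rng)
estimate = aggregate(votes)

# No surface point is close to the centroid, but the aggregated votes are.
min_surface_dist = min(dist(p, true_center) for p in seeds)
vote_error = dist(estimate, true_center)
print(f"closest surface point to centroid: {min_surface_dist:.3f}")
print(f"aggregated vote error:             {vote_error:.3f}")
```

In a real scene there are many objects, so the votes would first have to be grouped into clusters before aggregation; the averaging step above assumes that grouping has already been done.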
