Learning to decompose for object detection and instance segmentation

Authors: Eunbyung Park, Alexander C. Berg

Abstract: Although deep convolutional neural networks (CNNs) have achieved remarkable results on object detection and segmentation, pre- and post-processing steps such as region proposals and non-maximum suppression (NMS) have been required. These steps result in high computational complexity and sensitivity to hyperparameters, e.g., thresholds for NMS. In this work, we propose a novel end-to-end trainable deep neural network architecture, which consists of convolutional and recurrent layers, that generates the correct number of object instances and their bounding boxes (or segmentation masks) given an image, using only a single network evaluation without any pre- or post-processing steps. We have tested on detecting digits in multi-digit images synthesized using MNIST, automatically segmenting digits in these images, and detecting cars in the KITTI benchmark dataset. The proposed approach outperforms a strong CNN baseline on the synthesized digits datasets and shows promising results on KITTI car detection.
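
The sensitivity to NMS thresholds mentioned in the abstract can be illustrated with a minimal sketch of conventional greedy NMS. This is the kind of post-processing step the proposed network is meant to avoid; it is not the authors' method. The function names `iou` and `greedy_nms`, the box format `(x1, y1, x2, y2)`, and the example boxes and scores are all assumptions made for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_threshold):
    """Return indices of kept boxes, visiting candidates by descending score."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap any already-kept box too much.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


# Hypothetical candidates: two heavily overlapping boxes and one separate box.
boxes = [(0, 0, 10, 10), (2, 0, 12, 10), (30, 30, 40, 40)]
scores = [0.9, 0.8, 0.7]

# The first two boxes have IoU = 80 / 120, roughly 0.667, which lies
# between the two thresholds below.
kept_strict = greedy_nms(boxes, scores, iou_threshold=0.5)
kept_loose = greedy_nms(boxes, scores, iou_threshold=0.7)
print(len(kept_strict), len(kept_loose))
```

With these made-up inputs, changing the threshold from 0.5 to 0.7 changes the number of reported instances, which is the kind of hyperparameter dependence a network that directly outputs the instance count would not have.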
