Authors: Eunbyung Park, Alexander C. Berg
DOI:
Keywords:
Abstract: Although deep convolutional neural networks (CNNs) have achieved remarkable results on object detection and segmentation, pre- and post-processing steps such as region proposals and non-maximum suppression (NMS) have been required. These steps result in high computational complexity and sensitivity to hyperparameters, e.g., thresholds for NMS. In this work, we propose a novel end-to-end trainable deep neural network architecture, which consists of convolutional and recurrent layers, that generates the correct number of object instances and their bounding boxes (or segmentation masks) given an image, using only a single network evaluation without any pre- or post-processing steps. We tested the approach on detecting digits in multi-digit images synthesized using MNIST, automatically segmenting digits in these images, and detecting cars in the KITTI benchmark dataset. The proposed approach outperforms a strong CNN baseline on the synthesized digit datasets and shows promising results on KITTI car detection.