Authors: Vijay Badrinarayanan, Alex Kendall, Roberto Cipolla
DOI:
Keywords:
Abstract: We present a novel and practical deep fully convolutional neural network architecture for semantic pixel-wise segmentation termed SegNet. This core trainable segmentation engine consists of an encoder network, a corresponding decoder network, and a final pixel-wise classification layer. The architecture of the encoder network is topologically identical to the 13 convolutional layers in the VGG16 network. The role of the decoder network is to map the low resolution encoder feature maps to full input resolution feature maps for pixel-wise classification. The novelty of SegNet lies in the manner in which the decoder upsamples its lower resolution input feature map(s). Specifically, the decoder uses pooling indices computed in the max-pooling step of the corresponding encoder to perform non-linear upsampling. This eliminates the need for learning to upsample. The upsampled maps are sparse and are then convolved with trainable filters to produce dense feature maps. We compare our proposed architecture with the widely adopted FCN and also with the well known DeepLab-LargeFOV and DeconvNet architectures. This comparison reveals the memory versus accuracy trade-off involved in achieving good segmentation performance. SegNet was primarily motivated by scene understanding applications. Hence, it is designed to be efficient both in terms of memory and computational time during inference. It also has a significantly smaller number of trainable parameters than other competing architectures. We also performed a controlled benchmark of SegNet and other architectures on both road scenes and SUN RGB-D indoor scene segmentation tasks. We show that SegNet provides good performance with competitive inference time and more efficient inference memory-wise as compared to other architectures. We also provide a Caffe implementation of SegNet and a web demo at this http URL.
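The index-based upsampling described in the abstract can be illustrated with a minimal sketch. It is not the authors' Caffe implementation: the function names `max_pool_2x2_with_indices` and `unpool_with_indices` are hypothetical, the feature map is a single 2D list with even height and width, and the pooling window is assumed to be 2x2 with stride 2. The subsequent convolution with trainable filters that turns the sparse map into a dense one is omitted.

```python
def max_pool_2x2_with_indices(fmap):
    """2x2, stride-2 max pooling on a 2D list (even dimensions assumed).

    Returns the pooled map and, for each pooled value, the (row, col)
    position it came from in the input. These positions play the role
    of the "pooling indices" mentioned in the abstract.
    """
    rows, cols = len(fmap), len(fmap[0])
    pooled, indices = [], []
    for r in range(0, rows, 2):
        pooled_row, index_row = [], []
        for c in range(0, cols, 2):
            # Collect the four values in the window with their positions.
            window = [(fmap[r + dr][c + dc], (r + dr, c + dc))
                      for dr in (0, 1) for dc in (0, 1)]
            value, position = max(window, key=lambda item: item[0])
            pooled_row.append(value)
            index_row.append(position)
        pooled.append(pooled_row)
        indices.append(index_row)
    return pooled, indices


def unpool_with_indices(pooled, indices, out_rows, out_cols):
    """Place each pooled value back at its stored position; zeros elsewhere.

    No parameters are learned in this step, and the result is sparse.
    """
    sparse = [[0] * out_cols for _ in range(out_rows)]
    for pooled_row, index_row in zip(pooled, indices):
        for value, (r, c) in zip(pooled_row, index_row):
            sparse[r][c] = value
    return sparse


# Illustrative 4x4 feature map (made-up values).
feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 5],
    [0, 1, 7, 2],
    [6, 2, 3, 4],
]
pooled, indices = max_pool_2x2_with_indices(feature_map)
sparse = unpool_with_indices(pooled, indices, 4, 4)
```

In this toy case each 2x2 window contributes one non-zero entry to `sparse`, at the location where its maximum was found. Only the indices need to be stored, not the full encoder feature map, which is the kind of memory saving the abstract points to.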