Authors: Jaesik Choi, Janghoon Ju, Vladimir Nekrasov
DOI:
Keywords:
Abstract: Semantic image segmentation is a principal problem in computer vision, where the aim is to correctly classify each individual pixel of an image into a semantic label. Its widespread use in many areas, including medical imaging and autonomous driving, has fostered extensive research in recent years. Empirical improvements in tackling this task have primarily been motivated by the successful exploitation of Convolutional Neural Networks (CNNs) pre-trained for classification and object recognition. However, pixel-wise labelling with CNNs has its own unique challenges: (1) an accurate deconvolution, or upsampling, of the low-resolution output into a higher-resolution mask, and (2) the inclusion of global information, or context, within locally extracted features. To address these issues, we propose a novel architecture to conduct the equivalent of the deconvolution operation globally and acquire dense predictions. We demonstrate that it leads to improved performance of state-of-the-art models on the PASCAL VOC 2012 benchmark, reaching 74.0% mean IU accuracy on the test set.