作者: Golnaz Ghiasi , Charless C. Fowlkes
DOI: 10.1007/978-3-319-46487-9_32
关键词:
摘要: CNN architectures have terrific recognition performance but rely on spatial pooling which makes it difficult to adapt them tasks that require dense, pixel-accurate labeling. This paper two contributions: (1) We demonstrate while the apparent resolution of convolutional feature maps is low, high-dimensional representation contains significant sub-pixel localization information. (2) describe a multi-resolution reconstruction architecture based Laplacian pyramid uses skip connections from higher and multiplicative gating successively refine segment boundaries reconstructed lower-resolution maps. approach yields state-of-the-art semantic segmentation results PASCAL VOC Cityscapes benchmarks without resorting more complex random-field inference or instance detection driven architectures.