Higher Order Potentials in End-to-End Trainable Conditional Random Fields

Authors: Sadeep Jayasumana, Shuai Zheng, Philip H. S. Torr, Anurag Arnab

DOI:

Keywords:

Abstract: We tackle the problem of semantic segmentation using deep learning techniques. Most semantic segmentation systems include a Conditional Random Field (CRF) model to produce a structured output that is consistent with the visual features of the image. With recent advances in deep learning, it is becoming increasingly common to perform CRF inference within a deep neural network to facilitate joint learning of the CRF with a pixel-wise Convolutional Neural Network (CNN) classifier. While basic CRFs use only unary and pairwise potentials, it has been shown that the addition of higher order potentials defined on cliques with more than two nodes can result in a better segmentation outcome. In this paper, we show that two types of higher order potential, namely object detection based potentials and superpixel based potentials, can be included in a CRF embedded within a deep network. We design these higher order potentials to allow inference with the efficient and differentiable mean-field algorithm, making it possible to implement our CRF model as a stack of layers in a deep network. As a result, all parameters of our richer CRF model can be jointly learned with a CNN classifier during the end-to-end training of the entire network. We find significant improvement in the results with the introduction of these trainable higher order potentials.
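
The abstract's central idea, mean-field inference written as a sequence of differentiable steps, can be illustrated with a minimal sketch. This is not the authors' implementation: it covers only unary and pairwise terms (the higher order potentials would contribute further terms to the energy), it uses a small dense graph with a Potts compatibility matrix, and every name in it (`mean_field`, `unary`, `weights`, `compat`) is hypothetical. Each iteration consists only of weighted sums and a softmax, which is why such an update could plausibly be unrolled as network layers.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def mean_field(unary, weights, compat, num_iters=5):
    """Approximate CRF marginals with parallel mean-field updates.

    unary[i][l]   : cost of assigning label l to node i (lower is better)
    weights[i][j] : pairwise affinity between nodes i and j (assumed fixed)
    compat[l][m]  : cost of labels l and m appearing on connected nodes
    """
    n, k = len(unary), len(unary[0])
    # Initialise the marginals Q from the unary costs alone.
    q = [softmax([-u for u in unary[i]]) for i in range(n)]
    for _ in range(num_iters):
        new_q = []
        for i in range(n):
            # Message passing: affinity-weighted sum of the other nodes' marginals.
            msg = [sum(weights[i][j] * q[j][m] for j in range(n) if j != i)
                   for m in range(k)]
            # Compatibility transform, then add the unary cost.
            energy = [unary[i][l] + sum(compat[l][m] * msg[m] for m in range(k))
                      for l in range(k)]
            # Normalise back to a distribution over labels.
            new_q.append(softmax([-e for e in energy]))
        q = new_q
    return q

# Toy example: a 3-node chain with 2 labels. The middle node weakly prefers
# label 1, while both of its neighbours strongly prefer label 0.
unary = [[0.0, 3.0], [1.0, 0.8], [0.0, 3.0]]
weights = [[0.0, 1.0, 0.0],
           [1.0, 0.0, 1.0],
           [0.0, 1.0, 0.0]]
compat = [[0.0, 1.0],
          [1.0, 0.0]]  # Potts model: penalise differing labels

marginals = mean_field(unary, weights, compat)
labels = [row.index(max(row)) for row in marginals]
print(labels)  # prints [0, 0, 0]
```

In this toy case the pairwise term pulls the middle node over to its neighbours' label, which is the kind of structured consistency the CRF is meant to provide. In an end-to-end setting, `weights` and `compat` would be learned parameters rather than constants.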
