RAID: a relation-augmented image descriptor

Authors: Paul Guerrero, Niloy J. Mitra, Peter Wonka

DOI: 10.1145/2897824.2925939

Keywords: Segmentation, RAID, Artificial intelligence, Image retrieval, Computer science, Visual descriptors, Bridging (programming), Sketch, Computer vision

Abstract: As humans, we regularly interpret scenes based on how objects are related, rather than based on the objects themselves. For example, we see a person riding an object X or a plank bridging two objects. Current methods provide limited support to search for content based on such relations. We present RAID, a relation-augmented image descriptor that supports queries based on inter-region relations. The key idea of our descriptor is to encode region-to-region relations as the spatial distribution of point-to-region relationships between the two regions. RAID allows sketch-based retrieval and requires minimal training data, thus making it suited even for querying uncommon relations. We evaluate the proposed descriptor by querying into large image databases and successfully extract non-trivial images demonstrating complex inter-region relations, which are easily missed or erroneously classified by existing methods. We assess the robustness of RAID on multiple datasets, even when the region segmentation is computed automatically or is very noisy.
