Mining semantic affordances of visual object categories

Authors: Yu-Wei Chao, Zhan Wang, Rada Mihalcea, Jia Deng

DOI: 10.1109/CVPR.2015.7299054

Abstract: Affordances are fundamental attributes of objects. They reveal the functionalities of objects and the possible actions that can be performed on them. Understanding affordances is crucial for recognizing human activities in visual data and for robots to interact with the world. In this paper we introduce the new problem of mining the knowledge of semantic affordance: given an object, determining whether an action can be performed on it. This is equivalent to connecting verb nodes and noun nodes in WordNet, or to filling an affordance matrix encoding the plausibility of each action-object pair. We introduce a new benchmark with crowdsourced ground truth for affordances on 20 PASCAL VOC object classes and 957 action classes. We explore a number of approaches, including text mining, visual mining, and collaborative filtering. Our analyses yield significant insights into the most effective ways of collecting knowledge of affordances.
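
To make the "affordance matrix" idea concrete, here is a minimal sketch of filling one missing action-object entry with a simple neighborhood-based form of collaborative filtering. It is an illustration only, not the method used in the paper: the verbs, objects, labels, and the helper names `cosine`, `column`, and `predict` are all hypothetical.

```python
from math import sqrt

# Toy affordance matrix (made-up values): rows are verbs, columns are objects.
# 1.0 = plausible, 0.0 = implausible, None = unknown.
VERBS = ["ride", "repair", "feed", "sit_on"]
OBJECTS = ["bicycle", "horse", "motorbike", "chair"]
MATRIX = [
    [1.0, 1.0, 1.0, 0.0],   # ride
    [1.0, 0.0, 1.0, 1.0],   # repair
    [0.0, 1.0, None, 0.0],  # feed (motorbike is unknown)
    [1.0, 1.0, 1.0, 1.0],   # sit_on
]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def column(matrix, j):
    """Return column j (all verb labels for one object)."""
    return [row[j] for row in matrix]


def predict(matrix, verb, obj):
    """Estimate matrix[verb][obj] as a similarity-weighted average over other objects."""
    target = column(matrix, obj)
    num = den = 0.0
    for other in range(len(matrix[0])):
        if other == obj or matrix[verb][other] is None:
            continue
        cand = column(matrix, other)
        # Compare the two objects only on verbs labeled for both, excluding the target verb.
        pairs = [
            (a, b)
            for i, (a, b) in enumerate(zip(target, cand))
            if i != verb and a is not None and b is not None
        ]
        if not pairs:
            continue
        sim = cosine([a for a, _ in pairs], [b for _, b in pairs])
        num += sim * matrix[verb][other]
        den += sim
    return num / den if den else None


score = predict(MATRIX, VERBS.index("feed"), OBJECTS.index("motorbike"))
print(f"plausibility of (feed, motorbike): {score:.2f}")
```

On this toy data the estimate comes out at about 0.31, below 0.5, because the motorbike column most resembles the bicycle column, for which "feed" is labeled implausible.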
