Authors: Scott Satkin, Maheen Rashid, Jason Lin, Martial Hebert
DOI: 10.1007/S11263-014-0734-4
Keywords: Rendering (computer graphics), Computer vision, Affordance, Leverage (statistics), Computer science, Similarity measure, Object detection, Artificial intelligence, Machine learning, Viewpoints, k-nearest neighbors algorithm, Segmentation
Abstract: In this paper, we describe a data-driven approach to leverage repositories of 3D models for scene understanding. Our ability to relate what we see in an image to a large collection of 3D models allows us to transfer information from these models, creating a rich understanding of the scene. We develop a framework for auto-calibrating a camera, rendering 3D models from the viewpoint an image was taken, and computing a similarity measure between each 3D model and an input image. We demonstrate this data-driven approach in the context of geometry estimation and show the ability to find the identities, poses and styles of objects in a scene. The true benefit of 3DNN compared to a traditional 2D nearest-neighbor approach is that by generalizing across viewpoints, we free ourselves from the need to have training examples captured from all possible viewpoints. Thus, we are able to achieve comparable results using orders of magnitude less data, and recognize objects from never-before-seen viewpoints. In this work, we describe the 3DNN algorithm and rigorously evaluate its performance for the tasks of geometry estimation and object detection/segmentation, as well as two novel applications: affordance estimation and photorealistic object insertion.