作者: Renaud Marlet , Mathieu Aubry , Francisco Massa
DOI:
关键词:
摘要: In this paper we study the application of convolutional neural networks for jointly detecting objects depicted in still images and estimating their 3D pose. We identify different feature representations oriented objects, energies that lead a network to learn representations. The choice representation is crucial since pose an object has natural, continuous structure while its category discrete variable. evaluate approaches on joint detection estimation task Pascal3D+ benchmark using Average Viewpoint Precision. show classification approach discretized viewpoints achieves state-of-the-art performance estimation, significantly outperforms existing baselines benchmark.