Shifting the Baseline: Single Modality Performance on Visual Navigation & QA

Authors: Yonatan Bisk, Jesse Thomason, Daniel Gordon

DOI:

Keywords: Modality (human–computer interaction), Machine learning, Computer science, Artificial intelligence, Visual navigation, Baseline (configuration management)

Abstract: … We investigate visual navigation and question answering tasks, where agents move through simulated environments using egocentric (first-person) vision. We find that unimodal …

References (36)
J. O'Kane, S. LaValle. On Comparing the Power of Mobile Robots. Robotics: Science and Systems, vol. 2, pp. 65-72 (2006). DOI: 10.15607/RSS.2006.II.009
Brian Stankiewicz, Benjamin Kuipers, Matt MacMahon. Walk the talk: connecting language, knowledge, and action in route instructions. National Conference on Artificial Intelligence, pp. 1475-1482 (2006).
C. Lawrence Zitnick, Margaret Mitchell, Saurabh Gupta, Jacob Devlin, Ross B. Girshick. Exploring Nearest Neighbor Approaches for Image Captioning. arXiv: Computer Vision and Pattern Recognition (2015).
C. Lawrence Zitnick, Piotr Dollár, Ramakrishna Vedantam, Saurabh Gupta, Tsung-Yi Lin, Hao Fang, Xinlei Chen. Microsoft COCO Captions: Data Collection and Evaluation Server. arXiv: Computer Vision and Pattern Recognition (2015).
Stanislaw Antol, Aishwarya Agrawal, Jiasen Lu, Margaret Mitchell, Dhruv Batra, C. Lawrence Zitnick, Devi Parikh. VQA: Visual Question Answering. 2015 IEEE International Conference on Computer Vision (ICCV), pp. 2425-2433 (2015). DOI: 10.1109/ICCV.2015.279
Felix Duvallet, Thomas Kollar, Anthony Stentz. Imitation learning for natural language direction following through unknown environments. International Conference on Robotics and Automation, pp. 1047-1053 (2013). DOI: 10.1109/ICRA.2013.6630702
David L. Chen, Raymond J. Mooney. Learning to interpret natural language navigation instructions from observations. National Conference on Artificial Intelligence, pp. 859-865 (2011).
Vladimir J. Lumelsky, Alexander A. Stepanov. Path-planning strategies for a point mobile automaton moving amidst unknown obstacles of arbitrary shape. Algorithmica, vol. 2, pp. 403-430 (1987). DOI: 10.1007/BF01840369
K. Taylor, S.M. LaValle. I-Bug: An intensity-based bug algorithm. International Conference on Robotics and Automation, pp. 3466-3471 (2009). DOI: 10.1109/ROBOT.2009.5152728
Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun. Deep Residual Learning for Image Recognition. Computer Vision and Pattern Recognition, pp. 770-778 (2016). DOI: 10.1109/CVPR.2016.90