Authors: Girish Kulkarni, Visruth Premraj, Sagnik Dhar, Siming Li, Yejin Choi
DOI: 10.1109/CVPR.2011.5995466
Keywords: Natural language, Computer science, Image (mathematics), Baby talk, Simple (philosophy), Natural language processing, Text mining, Parsing, Artificial intelligence
Abstract: We posit that visually descriptive language offers computer vision researchers both information about the world and information about how people describe the world. The potential benefit from this source is made more significant due to the enormous amount of language data easily available today. We present a system to automatically generate natural language descriptions from images that exploits both statistics gleaned from parsing large quantities of text data and recognition algorithms from computer vision. The system is very effective at producing relevant sentences for images. It also generates descriptions that are notably more true to the specific image content than previous work.