Studying Relationships between Human Gaze, Description, and Computer Vision

Authors: Kiwon Yun, Yifan Peng, Dimitris Samaras, Gregory J. Zelinsky, Tamara L. Berg

DOI: 10.1109/CVPR.2013.101

Abstract: We posit that user behavior during natural viewing of images contains an abundance of information about the content of images as well as information related to user intent and user-defined content importance. In this paper, we conduct experiments to better understand the relationship between images, the eye movements people make while viewing images, and how people construct natural language to describe images. We explore these relationships in the context of two commonly used computer vision datasets. We then further relate human cues with the outputs of current visual recognition systems and demonstrate prototype applications for gaze-enabled detection and annotation.

References (33)
Alfred Lukianovich Yarbus, Eye Movements and Vision (1967)
Jia Deng, Alexander C. Berg, Kai Li, Li Fei-Fei, What does classifying more than 10,000 image categories tell us? European Conference on Computer Vision, pp. 71-84 (2010), 10.1007/978-3-642-15555-0_6
Derrick Parkhurst, Klinton Law, Ernst Niebur, Modeling the role of salience in the allocation of overt visual attention. Vision Research, vol. 42, pp. 107-123 (2002), 10.1016/S0042-6989(01)00250-4
Myung Jin Choi, Joseph J. Lim, Antonio Torralba, Alan S. Willsky, Exploiting hierarchical context on a large database of object categories. Computer Vision and Pattern Recognition, pp. 129-136 (2010), 10.1109/CVPR.2010.5540221
Pierre Baldi, Laurent Itti, Of bits and wows: A Bayesian theory of surprise with applications to attention. Neural Networks, vol. 23, pp. 649-666 (2010), 10.1016/J.NEUNET.2009.12.007
J Henderson, Human gaze control during real-world scene perception. Trends in Cognitive Sciences, vol. 7, pp. 498-504 (2003), 10.1016/J.TICS.2003.09.006
Laura Walker Renninger, Preeti Verghese, James Coughlan, Where to look next? Eye movements reduce local uncertainty. Journal of Vision, vol. 7, p. 6 (2007), 10.1167/7.3.6
Teófilo de Campos, Gabriela Csurka, Florent Perronnin, Images as sets of locally weighted features. Computer Vision and Image Understanding, vol. 116, pp. 68-85 (2012), 10.1016/J.CVIU.2011.07.011
Mark Everingham, S. M. Ali Eslami, Luc Van Gool, Christopher K. I. Williams, John Winn, Andrew Zisserman, The Pascal Visual Object Classes Challenge: A Retrospective. International Journal of Computer Vision, vol. 111, pp. 98-136 (2015), 10.1007/S11263-014-0733-5