Visual Question Reasoning on General Dependency Tree

Authors: Xiaodan Liang, Liang Lin, Bailing Li, Guanbin Li, Qingxing Cao

DOI:

Keywords:

Abstract: … The key to this task is the capability of co-reasoning over both image and language domains. However, most previous methods [21, 20, 16] operate in a black-box manner, i.e., …

References (33)
Joakim Nivre, Filip Ginter, Timothy Dozat, Marie-Catherine de Marneffe, Katri Haverinen, Natalia Silveira, Christopher D. Manning. Universal Stanford dependencies: A cross-linguistic typology. Language Resources and Evaluation, pp. 4585-4592 (2014).
M. Schuster, K.K. Paliwal. Bidirectional recurrent neural networks. IEEE Transactions on Signal Processing, vol. 45, pp. 2673-2681 (1997). doi:10.1109/78.650093
Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun. Deep Residual Learning for Image Recognition. Computer Vision and Pattern Recognition, pp. 770-778 (2016). doi:10.1109/CVPR.2016.90
Marcus Rohrbach, Trevor Darrell, Jacob Andreas, Dan Klein. Learning to Compose Neural Networks for Question Answering. arXiv: Computation and Language (2016).
Danqi Chen, Christopher Manning. A Fast and Accurate Dependency Parser using Neural Networks. Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 740-750 (2014). doi:10.3115/V1/D14-1082
Stephen Merity, Richard Socher, Caiming Xiong. Dynamic Memory Networks for Visual and Textual Question Answering. arXiv: Neural and Evolutionary Computing (2016).
Shuicheng Yan, Jiashi Feng, Ilija Ilievski. A Focused Dynamic Attention Model for Visual Question Answering. arXiv: Computer Vision and Pattern Recognition (2016).
Anthony Dick, Chunhua Shen, Anton van den Hengel, Qi Wu, Peng Wang. FVQA: Fact-based Visual Question Answering. arXiv: Computer Vision and Pattern Recognition (2016).
Dhruv Batra, Devi Parikh, Jiasen Lu, Jianwei Yang. Hierarchical Question-Image Co-Attention for Visual Question Answering. arXiv: Computer Vision and Pattern Recognition (2016).
Yash Goyal, Tejas Khot, Douglas Summers-Stay, Dhruv Batra, Devi Parikh. Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering. 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 6325-6334 (2017). doi:10.1109/CVPR.2017.670