Curiosity-driven Reinforcement Learning for Diverse Visual Paragraph Generation

Authors: Zi Huang, Yang Yang, Zheng Zhang, Yadan Luo, Jingjing Li

DOI:

Keywords:

Abstract: Visual paragraph generation aims to automatically describe a given image from different perspectives and organize sentences in a coherent way. In this paper, we address three …

References (56)
Kelvin Xu, Jimmy Ba, Ryan Kiros, Kyunghyun Cho, Aaron Courville, Ruslan Salakhudinov, Rich Zemel, Yoshua Bengio, Show, Attend and Tell: Neural Image Caption Generation with Visual Attention. International Conference on Machine Learning, vol. 3, pp. 2048-2057 (2015)
Oriol Vinyals, Alexander Toshev, Samy Bengio, Dumitru Erhan, Show and tell: A neural image caption generator. Computer Vision and Pattern Recognition, pp. 3156-3164 (2015), 10.1109/CVPR.2015.7298935
Andrej Karpathy, Li Fei-Fei, Deep visual-semantic alignments for generating image descriptions. Computer Vision and Pattern Recognition, pp. 3128-3137 (2015), 10.1109/CVPR.2015.7298932
Ramakrishna Vedantam, C. Lawrence Zitnick, Devi Parikh, CIDEr: Consensus-based image description evaluation. Computer Vision and Pattern Recognition, pp. 4566-4575 (2015), 10.1109/CVPR.2015.7299087
Diederik P. Kingma, Max Welling, Auto-Encoding Variational Bayes. International Conference on Learning Representations (2014)
Sepp Hochreiter, Jürgen Schmidhuber, Long short-term memory. Neural Computation, vol. 9, pp. 1735-1780 (1997), 10.1162/NECO.1997.9.8.1735
Generative Adversarial Nets. Neural Information Processing Systems, vol. 27, pp. 2672-2680 (2014), 10.3156/JSOFT.29.5_177_2
Kishore Papineni, Salim Roukos, Todd Ward, Wei-Jing Zhu, BLEU. Proceedings of the 40th Annual Meeting on Association for Computational Linguistics - ACL '02, pp. 311-318 (2002), 10.3115/1073083.1073135
Jiwei Li, Thang Luong, Dan Jurafsky, A Hierarchical Neural Autoencoder for Paragraphs and Documents. Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), vol. 1, pp. 1106-1115 (2015), 10.3115/V1/P15-1107