Authors: Chunpu Xu, Wei Zhao, Min Yang, Xiang Ao, Wangrong Cheng
Keywords:
Abstract: Recent image captioning methods are typically either generation-based or retrieval-based. Each kind has its advantages but is also limited by its own drawbacks. In this paper, we propose a Unified Generation-Retrieval framework for Image Captioning (UGRIC) using adversarial learning. Different from previous methods, the proposed UGRIC model leverages the informative content of the N-best candidates provided by the retrieval-based method to enhance the generation-based method. In addition, to further improve the informativeness of the generated caption, we employ a copying mechanism to choose words from the retrieved candidate captions and put them into the proper positions of the output sequence. Experiments on the MSCOCO dataset demonstrate the effectiveness of UGRIC across various evaluation metrics. (Code and data are available at: http://tinyurl.com/y6z2x6ho)