Authors: Jason J. Corso, Caiming Xiong, Ran Xu, Wei Chen
DOI:
Keywords:
Abstract: … represent the child nodes, we reconstruct the child nodes … to build the video-language space and compare video retrieval/text … deep video feature and average of word vector to learn the …