Authors: Aditya Krishna Menon, Didi Surian, Sanjay Chawla
DOI: 10.1137/1.9781611974010.23
Keywords:
Abstract: Content is increasingly available in multiple modalities (such as images, text, and video), each of which provides a different representation of some entity. The cross-modal retrieval problem is: given the representation of an entity in one modality, find its best representation in all other modalities. We propose a novel approach to this problem based on pairwise classification. The approach seamlessly applies to both settings where ground-truth annotations for entities are absent and present. In the former case, the approach considers both positive and unlabelled links that arise in standard datasets. Empirical comparisons show improvements over state-of-the-art methods for cross-modal retrieval.
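The abstract frames retrieval as classification over cross-modal pairs. Below is a minimal sketch of that general idea, not the authors' exact method: each (image-feature, text-feature) pair becomes one training example, observed links are positives, and random cross pairings stand in for unlabelled examples treated as negatives, a common positive-unlabelled heuristic. All names, the synthetic data, and the logistic regression model are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n, d_img, d_txt = 200, 32, 16

# Synthetic paired data: each entity has an image and a text representation.
img = rng.normal(size=(n, d_img))
txt = img[:, :d_txt] + 0.1 * rng.normal(size=(n, d_txt))  # correlated views

def pair_features(x, y):
    # Represent a cross-modal pair by concatenation; richer interaction
    # features (e.g. bilinear terms) are possible but omitted for brevity.
    return np.concatenate([x, y])

# Positives: true (image, text) links. Unlabelled-as-negative: shuffled
# links (a few may collide with true pairs; the PU heuristic tolerates this).
perm = rng.permutation(n)
X = np.array([pair_features(img[i], txt[i]) for i in range(n)]
             + [pair_features(img[i], txt[perm[i]]) for i in range(n)])
y = np.array([1] * n + [0] * n)

clf = LogisticRegression(max_iter=1000).fit(X, y)

# Retrieval: given a query image, rank all texts by predicted link probability.
query = img[0]
scores = clf.predict_proba(
    np.array([pair_features(query, t) for t in txt]))[:, 1]
print("top-5 retrieved text indices:", np.argsort(-scores)[:5])
```

The same pairwise scoring applies in either direction (text-to-image or image-to-text), since a pair classifier scores any candidate pairing regardless of which modality is the query.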