Learning descriptive visual representation by semantic regularized matrix factorization

Authors: Zhiwu Lu, Yuxin Peng

DOI:

Keywords: Machine learning, Semantic gap, Feature vector, Contextual image classification, Representation (mathematics), Mathematics, Matrix decomposition, Semantics, Pattern recognition, Benchmark (computing), Cluster analysis, Artificial intelligence

Abstract: This paper presents a novel semantic regularized matrix factorization method for learning a descriptive visual bag-of-words (BOW) representation. Although very influential in image classification, the traditional BOW representation has one distinct drawback. That is, for efficiency purposes, this representation is often generated by directly clustering the low-level feature vectors extracted from local keypoints or regions, without considering the high-level semantics of images. In other words, it still suffers from the semantic gap and may lead to significant performance degradation in more challenging tasks (e.g., classification of community-contributed images with large intra-class variations). To overcome this drawback, we develop a semantic regularized matrix factorization method by adding Laplacian regularization, defined with the tags of images (easy to access although noisy), into matrix factorization. Experimental results on two benchmark datasets show the promising performance of the proposed method.
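The abstract does not give the objective function, so the following is only a minimal sketch of the general idea, assuming a graph-regularized non-negative matrix factorization: minimize ||X - UV||_F^2 + lam * tr(U^T L U), where L is the graph Laplacian of a tag-similarity graph. The cosine tag affinity, the multiplicative update rules, and all names (`tag_affinity`, `laplacian_regularized_nmf`, `lam`, `k`) are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def tag_affinity(tags):
    """Cosine similarity between per-image tag vectors (rows); zero diagonal.

    Assumption: tag similarity is measured by cosine; the paper may differ.
    """
    norms = np.linalg.norm(tags, axis=1, keepdims=True)
    unit = tags / np.maximum(norms, 1e-12)  # images without tags stay all-zero
    W = unit @ unit.T
    np.fill_diagonal(W, 0.0)
    return W

def laplacian_regularized_nmf(X, W, k, lam=1.0, n_iter=200, seed=0, eps=1e-9):
    """Factor X (images x features) as U @ V with U, V >= 0.

    The penalty lam * tr(U.T @ L @ U), with L = D - W, pulls the codes of
    images with similar tags toward each other. Uses multiplicative updates
    in the style of graph-regularized NMF (an assumed solver).
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    U = rng.random((n, k)) + eps
    V = rng.random((k, d)) + eps
    D = np.diag(W.sum(axis=1))  # degree matrix of the tag graph
    for _ in range(n_iter):
        U *= (X @ V.T + lam * (W @ U)) / (U @ (V @ V.T) + lam * (D @ U) + eps)
        V *= (U.T @ X) / ((U.T @ U) @ V + eps)
    return U, V

def objective(X, W, U, V, lam):
    """Reconstruction error plus the Laplacian smoothness penalty."""
    L = np.diag(W.sum(axis=1)) - W
    return np.linalg.norm(X - U @ V) ** 2 + lam * np.trace(U.T @ L @ U)
```

In this sketch, each row of `X` would be a non-negative BOW histogram and each row of `tags` a binary tag indicator vector; the rows of the returned `U` serve as the learned image representation, and `lam` trades reconstruction fidelity against agreement with the (noisy) tag graph.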

References (29)
Ajit P. Singh, Geoffrey J. Gordon. A Unified View of Matrix Factorization Models. European Conference on Machine Learning, pp. 358-373 (2008), 10.1007/978-3-540-87481-2_24
A. Barla, F. Odone, A. Verri. Histogram intersection kernel for image classification. International Conference on Image Processing, vol. 3, pp. 513-516 (2003), 10.1109/ICIP.2003.1247294
Aude Oliva, Antonio Torralba. Modeling the Shape of the Scene: A Holistic Representation of the Spatial Envelope. International Journal of Computer Vision, vol. 42, pp. 145-175 (2001), 10.1023/A:1011139631724
Joost van de Weijer, Cordelia Schmid. Coloring local feature extraction. European Conference on Computer Vision, vol. 3952, pp. 334-348 (2006), 10.1007/11744047_26
Daniel D. Lee, H. Sebastian Seung. Learning the parts of objects by non-negative matrix factorization. Nature, vol. 401, pp. 788-791 (1999), 10.1038/44565
Pavan K. Mallapragada, Rong Jin, Anil K. Jain. Online visual vocabulary pruning using pairwise constraints. Computer Vision and Pattern Recognition, pp. 3073-3080 (2010), 10.1109/CVPR.2010.5540062
Matthieu Guillaumin, Jakob Verbeek, Cordelia Schmid. Multimodal semi-supervised learning for image classification. Computer Vision and Pattern Recognition, pp. 902-909 (2010), 10.1109/CVPR.2010.5540120
Mark Everingham, Luc Van Gool, Christopher K. I. Williams, John Winn, Andrew Zisserman. The Pascal Visual Object Classes (VOC) Challenge. International Journal of Computer Vision, vol. 88, pp. 303-338 (2010), 10.1007/S11263-009-0275-4
Mark Everingham, S. M. Ali Eslami, Luc Van Gool, Christopher K. I. Williams, John Winn, Andrew Zisserman. The Pascal Visual Object Classes Challenge: A Retrospective. International Journal of Computer Vision, vol. 111, pp. 98-136 (2015), 10.1007/S11263-014-0733-5
Zhiwu Lu, Yuxin Peng. Combining latent semantic learning and reduced hypergraph learning for semi-supervised image categorization. Proceedings of the 19th ACM International Conference on Multimedia (MM '11), pp. 1409-1412 (2011), 10.1145/2072298.2072027