A Bayesian Framework for Learning Shared and Individual Subspaces from Multiple Data Sources

Authors: Sunil Kumar Gupta, Dinh Phung, Brett Adams, Svetha Venkatesh

DOI: 10.1007/978-3-642-20841-6_12

Keywords: Data mining, Mutual knowledge, Context (language use), Gibbs sampling, Data set, Machine learning, Bayesian probability, Linear subspace, Metadata, Social media, Factorization, Computer science, Artificial intelligence

Abstract: This paper presents a novel Bayesian formulation to exploit shared structures across multiple data sources, constructing foundations for effective mining and retrieval across disparate domains. We jointly analyze diverse data sources using a unifying piece of metadata (textual tags). We propose a method based on Bayesian Probabilistic Matrix Factorization (BPMF) which is able to explicitly model the partial knowledge common to the datasets using shared subspaces and the knowledge specific to each dataset using individual subspaces. For the proposed model, we derive an efficient algorithm for learning the joint factorization based on Gibbs sampling. The effectiveness of the model is demonstrated by social media retrieval tasks across single and multiple media. The proposed solution is applicable to a wider context, providing a formal framework suitable for exploiting individual as well as mutual knowledge present across heterogeneous data sources of many kinds.
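The joint factorization described in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation. It assumes two tag-by-item matrices `X1` and `X2` that share the same tag rows, zero-mean Gaussian priors on all factors, and fixed noise and prior precisions (`alpha`, `lam`); the paper's actual priors and hyperparameter treatment may differ. All function and parameter names here are hypothetical.

```python
import numpy as np


def sample_rows(R, F, alpha, lam, rng):
    """Draw W (M x K) from its Gaussian conditional, given R ~= W @ F.

    Assumes i.i.d. N(0, 1/lam) priors on the entries of W and Gaussian
    noise with precision alpha (both fixed: a simplifying assumption).
    """
    K = F.shape[0]
    prec = lam * np.eye(K) + alpha * (F @ F.T)           # shared by all rows
    mean = alpha * np.linalg.solve(prec, F @ R.T).T      # M x K
    L = np.linalg.cholesky(prec)                         # prec = L @ L.T
    Z = rng.standard_normal(mean.shape)
    # Each row is mean + L^{-T} z, which has covariance prec^{-1}.
    return mean + np.linalg.solve(L.T, Z.T).T


def gibbs_joint_factorization(X1, X2, k_shared, k1, k2,
                              alpha=10.0, lam=1.0,
                              n_iter=200, burn_in=100, seed=0):
    """Gibbs sampler sketch for X1 ~= [W12 W1] @ H1 and X2 ~= [W12 W2] @ H2.

    W12 spans the shared subspace; W1 and W2 span the individual ones.
    Returns reconstructions of X1 and X2 averaged over post-burn-in samples.
    """
    rng = np.random.default_rng(seed)
    M = X1.shape[0]
    W12 = rng.standard_normal((M, k_shared))
    W1 = rng.standard_normal((M, k1))
    W2 = rng.standard_normal((M, k2))
    est1 = np.zeros_like(X1, dtype=float)
    est2 = np.zeros_like(X2, dtype=float)
    kept = 0
    for it in range(n_iter):
        # 1) Item representations given the current bases.
        A1, A2 = np.hstack([W12, W1]), np.hstack([W12, W2])
        H1 = sample_rows(X1.T, A1.T, alpha, lam, rng).T
        H2 = sample_rows(X2.T, A2.T, alpha, lam, rng).T
        H1s, H1i = H1[:k_shared], H1[k_shared:]
        H2s, H2i = H2[:k_shared], H2[k_shared:]
        # 2) Individual bases explain what the shared part leaves over.
        W1 = sample_rows(X1 - W12 @ H1s, H1i, alpha, lam, rng)
        W2 = sample_rows(X2 - W12 @ H2s, H2i, alpha, lam, rng)
        # 3) Shared basis is informed by the residuals of both sources.
        R = np.hstack([X1 - W1 @ H1i, X2 - W2 @ H2i])
        W12 = sample_rows(R, np.hstack([H1s, H2s]), alpha, lam, rng)
        if it >= burn_in:
            est1 += np.hstack([W12, W1]) @ H1
            est2 += np.hstack([W12, W2]) @ H2
            kept += 1
    return est1 / kept, est2 / kept
```

Calling `gibbs_joint_factorization(X1, X2, k_shared=2, k1=2, k2=2)` returns one averaged reconstruction per source. The point of the sketch is step 3: only the shared basis `W12` receives evidence from both datasets, which is how knowledge common to the sources is separated from knowledge specific to each.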
