Scaling MCMC Inference and Belief Propagation to Large, Dense Graphical Models

Author: Sameer Singh

DOI: 10.7275/HHC2-RG97

Abstract: With the physical constraints of semiconductor-based electronics becoming increasingly limiting in the past decade, single-core CPUs have given way to multi-core and distributed computing platforms. At the same time, access to large data collections is becoming commonplace due to the falling cost of storage and bandwidth. Traditional machine learning paradigms designed to operate sequentially on single-processor architectures seem destined to become obsolete in this world of multi-core, multi-node systems and massive data sets. Inference for graphical models is one such example: most existing algorithms are sequential in nature and difficult to scale using parallel computation. Further, modeling large datasets leads to an escalation in the number of variables, factors, and domains, and in the density of the model, all of which substantially affect the computational and storage complexity of inference. To achieve scalability, existing techniques impose strict independence assumptions on the model, resulting in tractable inference at the expense of expressiveness, and therefore of the accuracy and utility, of the model. Motivated by the need to scale inference to large, dense graphical models, in this thesis we explore approximations to Markov chain Monte Carlo (MCMC) and belief propagation (BP) that induce dynamic sparsity in the model in order to exploit parallelism. In particular, because computations over some factors, variables, and values matter more than others at different stages of inference, the proposed approximations prioritize and parallelize exactly those computations to enable efficient inference. First, we show that a synchronously distributed MCMC algorithm using dynamic partitioning of the model achieves scalable inference. We then identify bottlenecks in the synchronous architecture and demonstrate that a collection of MCMC techniques using asynchronous updates addresses these drawbacks.
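The MCMC building block being scaled here is a single-site sampling move over a factor graph. A minimal sketch of one such move, using a hypothetical binary chain model with an attractive pairwise coupling (the model, names, and coupling value are illustrative assumptions, not taken from the thesis):

```python
import math
import random

# Toy pairwise factor graph: a chain of N binary variables with
# attractive pairwise factors (illustrative, not the thesis's model).
N = 5
COUPLING = 1.0  # log-potential rewarding equal neighbours

def score(assignment):
    """Sum of log-factors over the chain's edges."""
    return sum(COUPLING * (assignment[i] == assignment[i + 1])
               for i in range(N - 1))

def gibbs_step(assignment, rng):
    """Resample one variable conditioned on the rest (one MCMC move)."""
    i = rng.randrange(N)
    weights = []
    for v in (0, 1):
        assignment[i] = v
        weights.append(math.exp(score(assignment)))
    z = sum(weights)
    assignment[i] = 0 if rng.random() < weights[0] / z else 1
    return assignment

rng = random.Random(0)
state = [rng.randrange(2) for _ in range(N)]
for _ in range(1000):  # burn-in
    gibbs_step(state, rng)
```

Each move touches only the factors adjacent to the resampled variable, which is what makes it possible to partition the model and run many such moves in parallel, synchronously or asynchronously.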
For large domains and high-order factors, we find that dynamically inducing sparsity in variable domains results in scalable belief propagation that enables joint inference. We also show that formulating distributed BP and joint inference as generalized BP on cluster graphs, combined with cluster-message approximations, yields significantly lower communication cost and running time. With these tools for inference in hand, we are able to tackle entity tagging, relation extraction, entity resolution, cross-document coreference, joint inference, and other information extraction tasks over large text corpora.
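The sparse-domain idea can be illustrated by truncating a BP-style message computation to the sender's k most probable values. This is a hedged sketch under simplifying assumptions: the function name, the toy potential, and the top-k selection rule are illustrative, not the thesis's actual algorithm.

```python
def truncated_message(belief, phi, out_domain, k):
    """Outgoing BP-style message that sums only over the k highest-
    belief values of the sending variable (dynamic domain sparsity).

    belief:     dict mapping sender value -> current marginal estimate
    phi:        pairwise potential phi(x, y)
    out_domain: values of the receiving variable
    """
    top = sorted(belief, key=belief.get, reverse=True)[:k]  # keep top-k
    msg = {y: sum(belief[x] * phi(x, y) for x in top) for y in out_domain}
    z = sum(msg.values())
    return {y: m / z for y, m in msg.items()}  # normalize

# Toy usage: a 3-value domain with an attractive potential.
belief = {0: 0.6, 1: 0.3, 2: 0.1}
phi = lambda x, y: 1.0 if x == y else 0.5
approx = truncated_message(belief, phi, [0, 1, 2], k=2)  # drops value 2
exact = truncated_message(belief, phi, [0, 1, 2], k=3)   # full sum
```

With a peaked belief, the truncated message closely tracks the exact one while costing only k potential evaluations per output value instead of the full domain size, which is the source of the speedup for large domains.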
