Fast collapsed Gibbs sampling for latent Dirichlet allocation

Authors: Ian Porteous, David Newman, Alexander Ihler, Arthur Asuncion, Padhraic Smyth

DOI: 10.1145/1401890.1401960

Keywords: Pattern recognition, Mathematics, Sampling (statistics), Gibbs sampling, Artificial intelligence, Algorithm, Computation, Latent Dirichlet allocation, Sample (statistics), Range (statistics), Slice sampling, Speedup

Abstract: … collapsed Gibbs sampling method for the widely used latent Dirichlet … Conventional Gibbs sampling schemes for LDA require O(K) … Our proposed method draws equivalent samples but …
