作者: Scott W. Linderman , Ryan P. Adams , Matthew J. Johnson
DOI:
关键词:
摘要: Many practical modeling problems involve discrete data that are best represented as draws from multinomial or categorical distributions. For example, nucleotides in a DNA sequence, children's names given state and year, text documents all commonly modeled with In of these cases, we expect some form dependency between the draws: nucleotide at one position strand may depend on preceding nucleotides, highly correlated year to topics be dynamic. These dependencies not naturally captured by typical Dirichlet-multinomial formulation. Here, leverage logistic stick-breaking representation recent innovations Polya-gamma augmentation reformulate distribution terms latent variables jointly Gaussian likelihoods, enabling us take advantage host Bayesian inference techniques for models minimal overhead.