Authors: David Mimno, Andrew McCallum
DOI:
Keywords:
Abstract: Although fully generative models have been successfully used to model the contents of text documents, they are often awkward to apply to combinations of text data and document metadata. In this paper we propose a Dirichlet-multinomial regression (DMR) topic model that includes a log-linear prior on document-topic distributions that is a function of observed features of the document, such as author, publication venue, references, and dates. We show that by selecting appropriate features, DMR topic models can meet or exceed the performance of several previously published topic models designed for specific data.
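
The log-linear prior mentioned in the abstract can be illustrated with a minimal sketch. The abstract gives no formula, so this assumes one common reading of "log-linear": each topic's Dirichlet parameter is the exponential of a dot product between the document's feature vector and a per-topic weight vector. The function names, feature layout, and weight values below are hypothetical and chosen only for illustration; this is not the authors' implementation.

```python
import math
import random


def dmr_alpha(features, weights):
    """Assumed log-linear form: alpha_k = exp(x . lambda_k) for each topic k."""
    return [
        math.exp(sum(x * w for x, w in zip(features, topic_weights)))
        for topic_weights in weights
    ]


def sample_topic_distribution(alpha, rng):
    """Draw theta ~ Dirichlet(alpha) by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [g / total for g in draws]


# Hypothetical document features: [bias, is_venue_A, is_author_B]
x_d = [1.0, 1.0, 0.0]

# Hypothetical weights, one row per topic (3 topics x 3 features)
lam = [
    [0.0, 2.0, -1.0],
    [0.0, -1.0, 1.5],
    [-0.5, 0.0, 0.0],
]

alpha_d = dmr_alpha(x_d, lam)  # document-specific Dirichlet parameters
theta_d = sample_topic_distribution(alpha_d, random.Random(0))
```

Under these assumed weights, a document from the hypothetical venue A receives a much larger prior weight on the first topic than on the others, which is the sense in which observed metadata shapes the document-topic distribution.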