Authors: Xiangfeng Luo, Junyu Xuan, Jie Lu, Guangquan Zhang, Richard Yi Da Xu
DOI:
Keywords:
Abstract: Incorporating the side information of a text corpus, i.e., authors, time stamps, and emotional tags, into traditional text mining models has gained significant interest in the areas of information retrieval, statistical natural language processing, and machine learning. One branch of these works is the so-called Author Topic Model (ATM), which incorporates authors' interests as side information into the classical topic model. However, the existing ATM needs to predefine the number of topics, which is difficult and inappropriate in many real-world settings. In this paper, we propose an Infinite Author Topic (IAT) model to resolve this issue. Instead of assigning a discrete probability to a fixed number of topics, we use a stochastic process to determine the number of topics from the data itself. To be specific, we extend the gamma-negative binomial process to three levels in order to capture the author-document-keyword hierarchical structure. Furthermore, each document is assigned a mixed gamma process that accounts for multiple authors' contributions towards this document. An efficient Gibbs sampling inference algorithm, with each conditional distribution being in closed form, is developed for the IAT model. Experiments on several real-world datasets show the capabilities of our IAT model to learn the hidden topics and authors' interests simultaneously.