Authors: Zoubin Ghahramani, Sara Wade
DOI: 10.1214/17-BA1073
Keywords: Maximum a posteriori estimation, Data mining, Variation of information, Cluster analysis, Credible interval, Bayesian probability, Mixture model, Directed acyclic graph, Point estimation, Mathematics
Abstract: Clustering is widely studied in statistics and machine learning, with applications in a variety of fields. As opposed to popular algorithms such as agglomerative hierarchical clustering or k-means, which return a single clustering solution, Bayesian nonparametric models provide a posterior over the entire space of partitions, allowing one to assess statistical properties, such as uncertainty on the number of clusters. However, an important problem is how to summarize the posterior; the huge dimension of partition space and difficulties in visualizing it add to this problem. In a Bayesian analysis, the posterior of a real-valued parameter of interest is often summarized by reporting a point estimate such as the posterior mean along with 95% credible intervals to characterize uncertainty. In this paper, we extend these ideas to develop appropriate point estimates and credible sets to summarize the posterior of the clustering structure based on decision and information theoretic techniques.