Estimation and selection for the latent block model on categorical data

Authors: Christine Keribin, Vincent Brault, Gilles Celeux, Gérard Govaert

DOI: 10.1007/S11222-014-9472-2

Keywords: Model selection, Identifiability, Selection (genetic algorithm), Algorithm, Machine learning, Prior probability, Expectation–maximization algorithm, Artificial intelligence, Mathematics, Bayesian information criterion, Gibbs sampling, Categorical variable

Abstract: This paper deals with estimation and model selection in the Latent Block Model (LBM) for categorical data. First, after providing sufficient conditions ensuring the identifiability of this model, we generalise estimation procedures and model selection criteria derived for binary data. Secondly, we develop Bayesian inference through Gibbs sampling with a well-calibrated non-informative prior distribution, in order to get the MAP estimator: this is proved to avoid the traps encountered by the LBM with the maximum likelihood methodology. Then model selection criteria are presented. In particular, an exact expression of the integrated completed likelihood criterion requiring no asymptotic approximation is derived. Finally, numerical experiments on both simulated and real data sets highlight the appeal of the proposed estimation and model selection procedures.

References (34)
Edward Meeds, Sam Roweis. Nonparametric Bayesian Biclustering. Department of Computer Science, University of Toronto (2007).
Sylvia Frühwirth-Schnatter. Finite Mixture and Markov Switching Models (2006).
Christine Keribin, Gilles Celeux, V. Brault, Gérard Govaert. Model selection for the binary latent block model. 20th International Conference on Computational Statistics (COMPSTAT 2012), pp. 379-390 (2012).
G. J. McLachlan, D. Peel. Finite Mixture Models (2000).
Jean-Patrick Baudry. Sélection de modèle pour la classification non supervisée. Choix du nombre de classes. Université Paris Sud - Paris XI (2009).
Christine Keribin. Méthodes bayésiennes variationnelles : concepts et applications en neuroimagerie. Journal de la Société Française de Statistique & revue de statistique appliquée, vol. 151, pp. 107-131 (2011).
Jean-Claude Biscarat, Gilles Celeux, Jean Diebolt. On Stochastic Versions of the EM Algorithm. INRIA (1992). DOI: 10.21236/ADA246929
Judith Rousseau, Kerrie Mengersen. Asymptotic behaviour of the posterior distribution in overfitted mixture models. Journal of the Royal Statistical Society Series B-Statistical Methodology, vol. 73, pp. 689-710 (2011). DOI: 10.1111/J.1467-9868.2011.00781.X
Srujana Merugu, Inderjit Dhillon, Arindam Banerjee, Joydeep Ghosh, Dharmendra S. Modha. A Generalized Maximum Entropy Approach to Bregman Co-clustering and Matrix Approximation. Journal of Machine Learning Research, vol. 8, pp. 1919-1986 (2007). DOI: 10.5555/1314498.1314563