On Learning Discrete Graphical Models using Group-Sparse Regularization

Authors: Sujay Sanghavi, Vishvas Vasuki, Ali Jalali, Pradeep D. Ravikumar

DOI:

Keywords: Logistic regression; Graphical model; Combinatorics; Mathematics; Maximum size; Consistency analysis; Pairwise comparison; Regularization (mathematics); Sparse regularization; Scaling

Abstract: We study the problem of learning the graph structure associated with a general discrete graphical model (each variable can take any of m > 1 values, and the clique factors have maximum size c ≥ 2) from samples, under high-dimensional scaling where the number of variables p may be larger than the number of samples n. We provide a quantitative consistency analysis of a procedure based on node-wise multi-class logistic regression with group-sparse regularization. We first consider m-ary pairwise models, in which each factor depends on at most two variables. We show that when the sample size scales as n > K(m − 1)^2 d log((m − 1)(p − 1)), where d is the maximum degree and K a fixed constant, the method succeeds in recovering the graph with high probability. For general models with c-way factors, the natural multi-way extension of the pairwise method quickly becomes very computationally complex, so we study the effectiveness of using the pairwise method even when the true model has higher-order factors. Surprisingly, we show that under slightly more stringent conditions the pairwise procedure still recovers the graph structure, with the required sample size now scaling with d^3.
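The node-wise estimator described in the abstract can be illustrated on a toy instance. The sketch below is our own construction, not the authors' code: the Potts-style data-generating conditional, the proximal-gradient group-lasso solver, and all parameter values (m = 3, θ = 1.5, λ = 0.05) are assumptions chosen for illustration. It recovers the neighborhood of one node by fitting a multi-class logistic regression on one-hot encodings of the other variables, with one penalty group per candidate neighbor.

```python
import numpy as np

# Toy sketch (our construction, not the paper's code): recover the neighbors
# of node 0 in a small m-ary pairwise model via node-wise multi-class logistic
# regression with group-sparse regularization, solved by proximal gradient.
rng = np.random.default_rng(0)
m, p, n = 3, 4, 2000          # values per variable, number of variables, samples

def one_hot(col):
    return np.eye(m)[col]

# Nodes 1..3 are uniform; node 0 depends (Potts-style) on nodes 1 and 2 only.
X = rng.integers(0, m, size=(n, p))
theta = 1.5
gen = np.exp(theta * (one_hot(X[:, 1]) + one_hot(X[:, 2])))
gen /= gen.sum(axis=1, keepdims=True)
X[:, 0] = (gen.cumsum(axis=1) < rng.random(n)[:, None]).sum(axis=1)

F = np.hstack([one_hot(X[:, j]) for j in range(1, p)])  # n x (p-1)m features
Y = one_hot(X[:, 0])

# Minimize (1/n) * multinomial NLL + lam * sum_j ||W_j||_F, where W_j is the
# m x m weight block of candidate neighbor j (one group per neighbor).
W = np.zeros((F.shape[1], m))
b = np.zeros(m)
lam, step = 0.05, 1.0
for _ in range(400):
    Z = F @ W + b
    Z -= Z.max(axis=1, keepdims=True)        # numerically stable softmax
    P = np.exp(Z)
    P /= P.sum(axis=1, keepdims=True)
    W -= step * (F.T @ (P - Y)) / n          # gradient step on the NLL
    b -= step * (P - Y).sum(axis=0) / n
    for j in range(p - 1):                   # group soft-thresholding (prox)
        blk = W[j * m:(j + 1) * m]
        nrm = np.linalg.norm(blk)
        W[j * m:(j + 1) * m] = max(0.0, 1 - step * lam / (nrm + 1e-12)) * blk

# Nonzero groups give the estimated neighborhood of node 0.
support = [j + 1 for j in range(p - 1)
           if np.linalg.norm(W[j * m:(j + 1) * m]) > 1e-6]
print(support)
```

The group penalty zeroes out the entire weight block of the independent node 3, so the printed support contains only nodes 1 and 2; running the regression at every node and combining the estimated neighborhoods yields the full graph.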
