Structured Dimensionality Reduction for Additive Model Regression

Authors: Alhussein Fawzi, Jean-Baptiste Fiot, Bei Chen, Mathieu Sinn, Pascal Frossard

DOI: 10.1109/TKDE.2016.2525996

Keywords: Machine learning, Overfitting, Projection pursuit regression, Additive model, Curse of dimensionality, Nonparametric regression, Univariate, Identifiability, Covariate, Computer science, Artificial intelligence, Interpretability, Dimensionality reduction, Computational Theory and Mathematics, Information Systems, Computer Science Applications

Abstract: Additive models are regression methods which model the response variable as the sum of univariate transfer functions of the input variables. Key benefits of additive models are their accuracy and interpretability on many real-world tasks. Additive models are however not adapted to problems involving a large number (e.g., hundreds) of input variables, as they are prone to overfitting in addition to losing interpretability. In this paper, we introduce a novel framework for applying additive models to a large number of input variables. The key idea is to reduce the task dimensionality by deriving a small number of new covariates obtained by linear combinations of the inputs, where the linear weights are estimated with regard to the regression problem at hand. The weights are moreover constrained to prevent overfitting and facilitate the interpretation of the derived covariates. We establish identifiability of the proposed model under mild assumptions and present an efficient approximate learning algorithm. Experiments on synthetic data demonstrate that our approach compares favorably to baseline methods in terms of accuracy, while resulting in models of lower complexity and yielding practical insights into high-dimensional regression tasks. Our framework broadens the applicability of additive models to high-dimensional problems while maintaining their interpretability and potential to provide practical insights.
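
The model structure described in the abstract can be written out as a short sketch. The notation below is an assumption introduced for illustration and is not taken from the paper: $x \in \mathbb{R}^p$ is the input vector, $f_j$ and $g_k$ are univariate transfer functions, $w_k$ are the linear weights, $K$ is the number of derived covariates, and $\mathcal{C}$ stands for the constraint set on the weights, whose exact form the abstract does not state.

```latex
% Standard additive model: one univariate transfer function per input variable.
\[
  y = \beta_0 + \sum_{j=1}^{p} f_j(x_j) + \varepsilon
\]

% Reduced model (sketch): K << p derived covariates, each a linear
% combination of the inputs, each fed to one univariate transfer function.
\[
  z_k = w_k^{\top} x, \qquad
  y = \beta_0 + \sum_{k=1}^{K} g_k(z_k) + \varepsilon, \qquad
  w_k \in \mathcal{C}, \quad k = 1, \dots, K
\]
```

Under this reading, the number of transfer functions to estimate drops from $p$ to $K$, which is where the reduction in model complexity would come from. The weights $w_k$ are learned jointly with the $g_k$ rather than fixed in advance by an unsupervised step.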
