Authors: Dimitris Berberidis, Vassilis Kekatos, Georgios B. Giannakis
Keywords: Statistical inference, Approximation algorithm, Estimator, Random projection, Computer science, Dimensionality reduction, Censoring (statistics), Censoring (clinical trials), Estimation theory, Mathematical optimization, Stochastic approximation, Maximum likelihood, Online algorithm, Linear regression
Abstract: On par with data-intensive applications, the sheer size of modern linear regression problems creates an ever-growing demand for efficient solvers. Fortunately, a significant percentage of the data accrued can be omitted while maintaining a certain quality of statistical inference within an affordable computational budget. This work introduces means of identifying and omitting less informative observations in an online and data-adaptive fashion. Given streaming data, the related maximum-likelihood estimator is sequentially found using first- and second-order stochastic approximation algorithms. These schemes are well suited when data are inherently censored, or when the aim is to save communication overhead in decentralized learning setups. In a different operational scenario, the task of joint censoring and estimation is put forth to solve large-scale regressions in a centralized setup. Novel algorithms are developed enjoying simple closed-form updates and provable (non)asymptotic convergence guarantees. To attain desired patterns and levels of dimensionality reduction, thresholding rules are investigated too. Numerical tests on real and synthetic datasets corroborate the efficacy of the proposed methods compared to data-agnostic random projection-based alternatives.
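To make the censoring idea concrete, below is a minimal, illustrative sketch (not the authors' exact recursions) of a first-order stochastic-approximation update that skips observations whose normalized residual falls below a threshold, so only the more informative data trigger an update. All names, the threshold rule, and the step size are assumptions chosen for the example.

```python
import numpy as np


def online_censored_lms(stream, dim, threshold=1.0, step=0.01, noise_std=1.0):
    """Illustrative first-order (SGD-type) recursion with data-adaptive censoring.

    An observation (x, y) is "censored" (no update, nothing transmitted) when its
    normalized prediction residual is at most `threshold`; otherwise a standard
    stochastic-gradient step on the least-squares loss is taken.
    NOTE: this is a hedged sketch of the general technique, not the paper's algorithm.
    """
    w = np.zeros(dim)                        # current regression estimate
    updates = 0                              # number of uncensored observations used
    for x, y in stream:
        r = y - x @ w                        # prediction residual for this observation
        if abs(r) / noise_std <= threshold:  # small residual: deemed uninformative, censor
            continue
        w = w + step * r * x                 # stochastic-gradient update on the LS loss
        updates += 1
    return w, updates


# Usage on a synthetic stream: a sizable fraction of observations ends up censored
# once the estimate is close, with little loss in estimation accuracy.
rng = np.random.default_rng(0)
d, n = 5, 10_000
w_true = rng.standard_normal(d)
X = rng.standard_normal((n, d))
y = X @ w_true + 0.1 * rng.standard_normal(n)

w_hat, used = online_censored_lms(zip(X, y), d, threshold=0.5, step=0.05, noise_std=0.1)
print(f"used {used}/{n} observations, estimation error = {np.linalg.norm(w_hat - w_true):.3f}")
```

A design note on the sketch: the threshold trades dimensionality (data) reduction against estimation accuracy, which mirrors the abstract's point that thresholding rules must be tuned to attain a desired censoring level.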