Authors: Nicolai Meinshausen, Rajen D. Shah
DOI:
Keywords: Regression analysis, Sparse matrix, Regression, Ordinary least squares, Mathematics, Estimator, Hash function, Statistics, Algorithm, Contrast (statistics), Context (language use)
Abstract: We study large-scale regression analysis where both the number of variables, p, and the number of observations, n, may be large, on the order of millions or more. This is very different from the now well-studied high-dimensional context of "large p, small n". For example, in our "large p, large n" setting, an ordinary least squares estimator may be inappropriate for computational, rather than statistical, reasons. In order to make progress, one must seek a compromise between statistical and computational efficiency. Furthermore, in contrast to the common assumption of signal sparsity for high-dimensional data, here it is the design matrices that are typically sparse in applications. Our approach to dealing with this large, sparse data is based on b-bit min-wise hashing