Authors: L. Grigori, J. Demmel, K. Fountoulakis, M. W. Mahoney, S. Das
DOI:
Keywords:
Abstract: We are interested in parallelizing the Least Angle Regression (LARS) algorithm for fitting linear regression models to high-dimensional data. We consider two parallel and communication-avoiding versions of the basic LARS algorithm. The two algorithms have different asymptotic costs and practical performance: one offers more speedup, while the other produces more accurate output. The first is bLARS, a block version of the LARS algorithm, in which we update b columns at each iteration. Assuming that the data is row-partitioned, bLARS reduces the number of arithmetic operations, latency, and bandwidth by a factor of b. The second is Tournament-bLARS (T-bLARS), a tournament version of LARS, in which processors compete, by running several LARS computations in parallel, to choose the new columns to be added to the solution. Assuming that the data is column-partitioned, T-bLARS reduces latency by a factor of b. Similarly to LARS, our proposed methods generate a sequence of linear models. We present extensive numerical experiments that illustrate speedups of up to 4x compared to LARS without any compromise in solution quality.
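To make the setting concrete, the following is a minimal sketch of the serial baseline the paper parallelizes: classical LARS adds one column to the active set per iteration (bLARS and T-bLARS, in contrast, select b columns at a time). This uses scikit-learn's `Lars` estimator as a stand-in; the synthetic data, signal strengths, and `n_nonzero_coefs` setting are illustrative choices, not from the paper.

```python
import numpy as np
from sklearn.linear_model import Lars

# Synthetic data: 50 samples, 20 features, only 3 truly active columns.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 20))
true_coef = np.zeros(20)
true_coef[:3] = [5.0, -3.0, 2.0]
y = X @ true_coef + 0.01 * rng.standard_normal(50)

# Classical (serial) LARS: one column enters the active set per
# iteration, so stopping after 3 steps yields 3 nonzero coefficients.
model = Lars(n_nonzero_coefs=3).fit(X, y)
active = np.flatnonzero(model.coef_)
print(sorted(active))
```

With a strong, low-noise signal like this, the three planted columns are recovered; the paper's block variants would instead identify several such columns per (communication-avoiding) iteration.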