Efficient Non-oblivious Randomized Reduction for Risk Minimization with Improved Excess Risk Guarantee

Authors: Tianbao Yang, Haiqin Yang, Yi Xu, Lijun Zhang

DOI:

Keywords: Subspace topology, Approximation error, Computer science, Benchmark (computing), Matrix multiplication, Generalization, Random projection, Clustering high-dimensional data, Mathematical optimization, Matrix (mathematics), Training set, Reduction (complexity)

Abstract: In this paper, we address learning problems for high-dimensional data. Previously, oblivious random projection based approaches that project features onto a random subspace have been used in practice for tackling the high-dimensionality challenge in machine learning. Recently, various non-oblivious randomized reduction methods have been developed and deployed for solving many numerical problems such as matrix product approximation, low-rank matrix approximation, etc. However, they are less explored for machine learning tasks, e.g., classification. More seriously, the theoretical analysis of excess risk bounds for risk minimization, an important measure of generalization performance, has not been established for non-oblivious randomized reduction methods. It therefore remains an open problem what the benefit of using them over previous oblivious random projection based approaches is. To tackle these challenges, we propose an algorithmic framework employing a non-oblivious randomized reduction method for general empirical risk minimization, where the original high-dimensional features are projected onto a random subspace that is derived from the data with a small approximation error. We then derive the first excess risk bound for the proposed approach without requiring strong assumptions on the training data. The excess risk bound exhibits that the proposed approach provides much better performance, and it also sheds more insights about different randomized reduction approaches. Finally, we conduct extensive experiments on both synthetic and real-world benchmark datasets, whose dimension scales to $O(10^7)$, to demonstrate the efficacy of our proposed approach.
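Since the abstract only sketches the approach, a minimal illustration of the oblivious-vs-non-oblivious distinction may help. The Python sketch below is an assumption-laden stand-in, not the paper's method: the problem sizes (n, d, m), the variable names, the use of scikit-learn's LogisticRegression as the empirical risk minimizer, and the randomized range finder used to build the data-derived subspace are all illustrative choices, since the abstract does not specify the paper's exact construction.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Hypothetical toy problem: n examples with d features, reduced to m dims.
n, d, m = 1000, 5000, 50
X = rng.standard_normal((n, d))
w_true = rng.standard_normal(d)
y = (X @ w_true > 0).astype(int)

# --- Oblivious reduction: a random Gaussian projection drawn ---
# --- independently of the data.                               ---
A_obl = rng.standard_normal((d, m)) / np.sqrt(m)
X_obl = X @ A_obl

# --- Non-oblivious reduction: a subspace derived from the data. ---
# A randomized range finder: sketch the row space of X and
# orthonormalize, so the columns of A_nob approximately span the
# top right-singular subspace of X. This merely stands in for the
# paper's own data-dependent construction.
Omega = rng.standard_normal((n, m))
Y_sketch = X.T @ Omega             # (d, m) sketch of the row space of X
A_nob, _ = np.linalg.qr(Y_sketch)  # orthonormal (d, m) basis
X_nob = X @ A_nob

# Empirical risk minimization in each reduced space.
for name, Z in [("oblivious", X_obl), ("non-oblivious", X_nob)]:
    clf = LogisticRegression(max_iter=1000).fit(Z, y)
    print(name, "train accuracy:", clf.score(Z, y))
```

The intuition behind the non-oblivious variant is that a basis derived from the data concentrates on directions where the training examples actually have energy, which is why such reductions can achieve a smaller approximation error than a data-independent projection of the same dimension.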
