Authors: Eric P. Xing, Wittawat Jitkrittum, Masashi Sugiyama, Leonid Sigal, Makoto Yamada
DOI: 10.1162/NECO_A_00537
Keywords: Kernel (statistics), Pattern recognition (psychology), Feature (computer vision), Lasso (statistics), Mutual information, Computer science, Feature selection, Dependency (UML), Pattern recognition, Independence (probability theory), Artificial intelligence
Abstract: The goal of supervised feature selection is to find a subset of input features that are responsible for predicting output values. The least absolute shrinkage and selection operator (Lasso) allows computationally efficient feature selection based on linear dependency between input features and output values. In this letter, we consider a feature-wise kernelized Lasso for capturing nonlinear input-output dependency. We first show that with particular choices of kernel functions, nonredundant features with strong statistical dependence on output values can be found in terms of kernel-based independence measures such as the Hilbert-Schmidt independence criterion. We then show that the globally optimal solution can be efficiently computed; this makes the approach scalable to high-dimensional problems. The effectiveness of the proposed method is demonstrated through feature selection experiments for classification and regression with thousands of features.
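To make the construction in the abstract concrete, the sketch below builds the design matrix of a feature-wise kernelized Lasso: each input feature gets its own centered Gram matrix, and selecting features reduces to a nonnegative Lasso over the vectorized matrices. This is a minimal illustration, not the authors' implementation; the Gaussian kernel for both inputs and outputs, the bandwidth `sigma`, and the helper names `gaussian_gram` and `hsic_lasso_design` are assumptions for the example (the paper also discusses other kernel choices, e.g. a delta kernel for classification outputs).

```python
import numpy as np

def gaussian_gram(v, sigma=1.0):
    # Gram matrix of a Gaussian kernel on a 1-D vector (illustrative choice).
    d = v[:, None] - v[None, :]
    return np.exp(-d ** 2 / (2 * sigma ** 2))

def hsic_lasso_design(X, y, sigma=1.0):
    """Assemble (A, b) so the feature-wise kernelized Lasso objective is
    0.5 * ||b - A @ alpha||^2 + lam * ||alpha||_1 with alpha >= 0.

    X: (n, d) input matrix; y: (n,) output vector.
    Column k of A is the vectorized centered Gram matrix of feature k,
    and b is the vectorized centered Gram matrix of the outputs.
    """
    n, d = X.shape
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    b = (H @ gaussian_gram(y, sigma) @ H).ravel()
    A = np.column_stack(
        [(H @ gaussian_gram(X[:, k], sigma) @ H).ravel() for k in range(d)]
    )
    return A, b
```

With this design matrix, any nonnegative-Lasso solver yields a sparse weight per feature, and the inner products between columns of `A` correspond to the kernel-based dependence measures mentioned in the abstract.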