Deep Feature Selection using a Teacher-Student Network

Authors: Vahid Pourahmadi, Hamid Sheikhzadeh, Ali Mirzaei, Mehran Soltani

DOI:

Keywords:

Abstract: High-dimensional data in many machine learning applications leads to computational and analytical complexities. Feature selection provides an effective way of solving these problems by removing irrelevant and redundant features, thus reducing model complexity and improving the accuracy and generalization capability of the model. In this paper, we present a novel teacher-student feature selection (TSFS) method in which a 'teacher' (a deep neural network or a complicated dimension reduction method) is first employed to learn the best representation of the data in low dimension. Then a 'student' network (a simple neural network) is used to perform feature selection by minimizing the reconstruction error of the low-dimensional representation. Although the teacher-student scheme is not new, to the best of our knowledge, it is the first time that this scheme is employed for feature selection. The proposed TSFS can be used for both supervised and unsupervised feature selection. This method is evaluated on different datasets and compared with state-of-the-art existing feature selection methods. The results show that TSFS performs better in terms of classification and clustering accuracies and reconstruction error. Moreover, experimental evaluations demonstrate a low degree of sensitivity to parameter selection in the proposed method.
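The teacher-student idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify architectures or losses, so everything below is an assumption made for illustration. The hypothetical `teacher_embed` uses PCA as a stand-in for the teacher (the abstract allows "a complicated dimension reduction method"), and the hypothetical `student_select` is a linear student with a row-sparse (L2,1) penalty, fitted by proximal gradient descent, that ranks features by the norm of their weight rows. The parameters `lam` and `n_iter` are likewise illustrative.

```python
import numpy as np


def teacher_embed(X, k):
    """Assumed teacher: PCA projection of X onto its top-k principal directions."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T


def student_select(X, Z, n_select, lam=0.1, n_iter=500):
    """Assumed student: linear map X -> Z with an L2,1 (row-sparse) penalty.

    Minimizes (1 / 2n) * ||Xc W - Zc||_F^2 + lam * sum_i ||W[i, :]||_2
    by proximal gradient descent, then ranks features by row norm of W.
    """
    n, d = X.shape
    Xc = X - X.mean(axis=0)
    Zc = Z - Z.mean(axis=0)
    W = np.zeros((d, Z.shape[1]))
    # Step size 1/L, where L = sigma_max(Xc)^2 / n is the gradient's Lipschitz constant.
    step = n / (np.linalg.norm(Xc, 2) ** 2)
    for _ in range(n_iter):
        grad = Xc.T @ (Xc @ W - Zc) / n
        V = W - step * grad
        # Proximal step: shrink each row toward zero (group soft-thresholding).
        norms = np.linalg.norm(V, axis=1, keepdims=True)
        shrink = np.maximum(0.0, 1.0 - step * lam / np.maximum(norms, 1e-12))
        W = shrink * V
    scores = np.linalg.norm(W, axis=1)
    selected = np.argsort(scores)[::-1][:n_select]
    return selected, scores
```

Calling `teacher_embed(X, k)` on a data matrix and passing the result to `student_select(X, Z, n_select)` returns the indices of the features whose weights contribute most to reconstructing the teacher's low-dimensional representation. A nonlinear teacher or student could be substituted without changing this overall structure.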
