Authors: Michel Verleysen, Fabrice Rossi, Damien François
DOI: 10.1007/978-3-642-01805-3_4
Keywords: Multivariate mutual information, Feature selection, Machine learning, Context (language use), Relevance (information retrieval), Artificial intelligence, Curse of dimensionality, Smoothing, Linear discriminant analysis, Data mining, Computer science, Mutual information
Abstract: The selection of features that are relevant for a prediction or classification problem is an important problem in many domains involving high-dimensional data. Selecting features helps fighting the curse of dimensionality, improving the performances of prediction or classification methods, and interpreting the application. In a nonlinear context, the mutual information is widely used as a relevance criterion for features and sets of features. Nevertheless, it suffers from at least three major limitations: mutual information estimators depend on smoothing parameters, there is no theoretically justified stopping criterion in the greedy feature selection procedure, and the estimation itself suffers from the curse of dimensionality. This chapter shows how to deal with these problems. The first two are addressed by using resampling techniques that provide a statistical basis to select the estimator parameters and to stop the search procedure. The third one is addressed by modifying the mutual information criterion into a measure of how complementary (and not only individually informative) features are for the problem at hand.
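To make the abstract's setting concrete, the following is a minimal sketch of mutual-information-based greedy forward feature selection, using a simple histogram (plug-in) MI estimator written in NumPy. It is not the chapter's method: the bin count stands in for the smoothing parameter the abstract mentions, the ranking is univariate rather than multivariate, and there is no resampling-based stopping rule; all function names here are illustrative.

```python
import numpy as np

def mutual_information(x, y, bins=8):
    """Histogram (plug-in) estimate of I(X;Y) for two 1-D samples.
    `bins` is the smoothing parameter such estimators depend on."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    p_xy = joint / joint.sum()
    p_x = p_xy.sum(axis=1, keepdims=True)   # marginal of X
    p_y = p_xy.sum(axis=0, keepdims=True)   # marginal of Y
    nz = p_xy > 0                           # avoid log(0)
    return float((p_xy[nz] * np.log(p_xy[nz] / (p_x @ p_y)[nz])).sum())

def greedy_mi_selection(X, y, k, bins=8):
    """Greedy forward selection: repeatedly add the remaining feature
    with the largest individual MI with the target. This univariate
    simplification ignores feature complementarity, which is exactly
    the limitation the chapter addresses with multivariate criteria."""
    remaining = list(range(X.shape[1]))
    selected = []
    for _ in range(k):
        best = max(remaining,
                   key=lambda j: mutual_information(X[:, j], y, bins))
        selected.append(best)
        remaining.remove(best)
    return selected

# Synthetic check: only feature 2 carries information about y.
rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 5))
y = X[:, 2] + 0.1 * rng.normal(size=n)
print(greedy_mi_selection(X, y, k=2))  # feature 2 is picked first
```

In a nonlinear setting this MI ranking can find relevant features that a linear criterion (e.g. correlation) would miss, but the estimate degrades as the number of jointly considered features grows, which is the curse-of-dimensionality issue raised in the abstract.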