Authors: Francesca Chiaromonte, Jessica Martinelli
DOI: 10.1016/S0025-5564(01)00106-7
Keywords: Dimensionality reduction, Categorical variable, Sliced inverse regression, Mathematics, Regression analysis, Sufficient dimension reduction, Regression, Expression (mathematics), Linear combination, Data mining
Abstract: The analysis of global gene expression data from microarrays is breaking new ground in genetics research, while confronting modelers and statisticians with many critical issues. In this paper, we consider data sets in which a categorical or continuous response is recorded, along with gene expression, on a given number of experimental samples. Data of this type are usually employed to create a prediction mechanism for the response based on gene expression, and to identify a subset of relevant genes. This defines a regression setting characterized by a dramatic under-resolution with respect to the predictors (genes), whose number exceeds by orders of magnitude the number of available observations (samples). We present a dimension reduction strategy that, under appropriate assumptions, allows us to restrict attention to a few linear combinations of the original expression profiles, and thus to overcome under-resolution. These linear combinations can then be used to build and validate a regression model with standard techniques. Moreover, they can be used to rank the original predictors, and ultimately to select a subset of them through comparison with a background 'chance scenario' based on a number of independent randomizations. We apply this strategy to publicly available data on leukemia classification.