Author: Blaise Hanczar
DOI: 10.1007/11527862_19
Keywords:
Abstract: This paper describes and experimentally analyses a new dimension reduction method for microarray data. Microarrays, which allow simultaneous measurement of the expression level of thousands of genes in a given situation (tissue, cell or time), produce data that poses particular machine-learning problems. The disproportion between the number of attributes (tens of thousands) and the number of examples (hundreds) requires a reduction in dimension. While gene/class mutual information is often used to filter the genes, we propose an approach that takes into account gene-pair/class information. A gene selection heuristic based on this principle is proposed, as well as an automatic feature-construction procedure forcing the learning algorithms to make use of these gene pairs. We report significant improvements in accuracy on several public microarray databases.