Tissue-based Alzheimer gene expression markers – comparison of multiple machine learning approaches and investigation of redundancy in small biomarker sets

Authors: Lena Scheubert, Mitja Luštrek, Rainer Schmidt, Dirk Repsilber, Georg Fuellen

DOI: 10.1186/1471-2105-13-266

Abstract: Alzheimer’s disease has been known for more than 100 years, yet its underlying molecular mechanisms are not completely understood. Identifying the genes involved in the processes of the Alzheimer-affected brain is an important step towards such an understanding. Genes that are differentially expressed in diseased and healthy brains are promising candidates. Based on microarray data, we identify potential biomarkers as well as biomarker combinations using three feature selection methods: information gain, mean decrease accuracy of random forest, and a wrapper of genetic algorithm and support vector machine (GA/SVM). Information gain and random forest are two commonly used methods. We compare their output to the results obtained from GA/SVM. GA/SVM is rarely used for the analysis of microarray data, but it is able to identify genes capable of classifying tissues into different classes at least as well as the two reference methods. Compared to the other methods, GA/SVM has the advantage of finding small, less redundant sets of genes that, in combination, show superior classification characteristics. The biological significance of the genes and gene pairs is discussed.
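As an illustration of the first of the three feature selection methods named in the abstract, the following is a minimal sketch of ranking genes by information gain. It is not the authors' implementation: the gene names, the toy expression values, the function names, and the choice to binarize each gene at its median expression are all assumptions made for this example.

```python
import math
from collections import Counter
from statistics import median


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels, threshold):
    """Reduction in class entropy after splitting samples at `threshold`."""
    low = [y for x, y in zip(values, labels) if x <= threshold]
    high = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    # Weighted entropy of the two partitions; empty partitions contribute nothing.
    remainder = sum(len(part) / n * entropy(part) for part in (low, high) if part)
    return entropy(labels) - remainder


def rank_genes(expression, labels):
    """Return (gene, gain) pairs sorted by decreasing information gain.

    Assumption: each gene is binarized at its median expression value.
    """
    scores = {
        gene: information_gain(values, labels, median(values))
        for gene, values in expression.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data: six tissue samples, two genes.
labels = ["AD", "AD", "AD", "control", "control", "control"]
expression = {
    "gene_a": [8.1, 7.9, 8.4, 5.2, 5.0, 5.5],  # separates the classes cleanly
    "gene_b": [6.0, 4.1, 6.2, 5.9, 4.0, 6.1],  # carries little class information
}

ranking = rank_genes(expression, labels)
for gene, gain in ranking:
    print(f"{gene}: {gain:.3f}")
```

Because each gene is scored on its own, a ranking like this can place several highly correlated genes at the top, which is the kind of redundancy the abstract contrasts with the small gene sets found by GA/SVM.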
