Statistical Significance Assessment for Biological Feature Selection: Methods and Issues

Authors: Juntao Li, Kwok Pui Choi, Yudi Pawitan, Radha Krishna Murthy Karuturi

DOI: 10.1002/9781118617151.CH15

Keywords: Feature selection, Cluster analysis, Granularity, Feature (machine learning), Knowledge extraction, Biclustering, Data mining, Process (engineering), Psychology, Computational biology, Consensus clustering

Abstract: Biological knowledge discovery is aimed at quantifying the treatment effects and establishing the functional and mechanistic characterization of a drug, treatment, or disease [1, 2]. It involves eliciting information at multiple layers of granularity and generalization. It may be identifying gene(s) or proteins, gene–gene interactions, regulatory networks, or biological processes involved in a disease or other phenomena. The different levels of information are achieved through both supervised and unsupervised analysis frameworks. Sample attributes [3] are used to identify the genes contributing most, thereby establishing gene–disease associations. The unsupervised framework [4] is used for class discovery, involving a multitude of clustering methodologies such as hierarchical/partitional clustering [5–7], biclustering [8], and consensus clustering. Alternatively, it may even be a combination of both [9]. However, irrespective of the hypothesis and the framework, a necessary requirement is the selection of feature sets consisting of genes, single-nucleotide polymorphisms (SNPs), linkage disequilibrium (LD) blocks, pathways, or interactions involved in the process under study from the huge number of possibilities present in the data. Furthermore, it is required to analyze the implications of the identified features based on the available literature, which too needs the selection of a few features described in the literature.

References (73)
Juntao Li, Kwok Pui Choi, R. Krishna Murthy Karuturi. Iterative piecewise linear regression to accurately assess statistical significance in batch confounded differential expression analysis. International Symposium on Bioinformatics Research and Applications, pp. 153–164 (2012). DOI: 10.1007/978-3-642-30191-9_15
Venüs Ümmiye Onay, Laurent Briollais, Julia A Knight, Ellen Shi, Yuanyuan Wang, Sean Wells, Hong Li, Isaac Rajendram, Irene L Andrulis, Hilmi Ozcelik. SNP-SNP interactions in breast cancer susceptibility. BMC Cancer, vol. 6, p. 114 (2006). DOI: 10.1186/1471-2407-6-114
J Aubert, A Bar-Hen, J-J Daudin, S Robin. Determination of the differentially expressed genes in microarray experiments using local FDR. BMC Bioinformatics, vol. 5, p. 125 (2004). DOI: 10.1186/1471-2105-5-125
Joseph G. Hacia, Jian-Bing Fan, Oliver Ryder, Li Jin, Keith Edgemon, Ghassan Ghandour, R. Aeryn Mayer, Bryan Sun, Linda Hsie, Christiane M. Robbins, Lawrence C. Brody, David Wang, Eric S. Lander, Robert Lipshutz, Stephen P.A. Fodor, Francis S. Collins. Determination of ancestral alleles for human single-nucleotide polymorphisms using high-density oligonucleotide arrays. Nature Genetics, vol. 22, pp. 164–167 (1999). DOI: 10.1038/9674
Yoav Benjamini, Daniel Yekutieli. The control of the false discovery rate in multiple testing under dependency. Annals of Statistics, vol. 29, pp. 1165–1188 (2001). DOI: 10.1214/AOS/1013699998
J. D. Storey, R. Tibshirani. Statistical significance for genomewide studies. Proceedings of the National Academy of Sciences of the United States of America, vol. 100, pp. 9440–9445 (2003). DOI: 10.1073/PNAS.1530509100
Keith A Baggerly, Kevin R Coombes, Kenneth R Hess, David N Stivers, Lynne V Abruzzo, Wei Zhang. Identifying differentially expressed genes in cDNA microarray experiments. Journal of Computational Biology, vol. 8, pp. 639–659 (2001). DOI: 10.1089/106652701753307539
Cory Y McLean, Dave Bristor, Michael Hiller, Shoa L Clarke, Bruce T Schaar, Craig B Lowe, Aaron M Wenger, Gill Bejerano. GREAT improves functional interpretation of cis-regulatory regions. Nature Biotechnology, vol. 28, pp. 495–501 (2010). DOI: 10.1038/NBT.1630