Genomic integrative analysis to improve fusion transcript detection, liquid association and biclustering

Author: Shuchang Liu

Abstract: More data provide more possibilities. A growing number of genomic data sets provide new perspectives for understanding complex biological problems. Many algorithms for single-study analysis have been developed; however, their results are not stable under small sample sizes or are overwhelmed by study-specific signals. Taking advantage of high-throughput data from multiple cohorts, in this dissertation we are able to detect novel fusion transcripts, explore gene regulation, and discover disease subtypes within an integrative analysis framework. In the first project, we evaluated 15 fusion transcript detection tools on paired-end RNA-seq data. Though no single method had distinguished performance over the others, several top tools were selected according to their F-measures. We further developed a meta-caller algorithm that combines the top methods to re-prioritize candidate fusion transcripts. The results showed that our meta-caller can successfully balance precision and recall compared to any single tool. In the second project, we extended liquid association to two meta-analytic frameworks (MetaLA and MetaMLA). Liquid association is the dynamic gene-gene correlation that depends on the expression level of a third gene. Our MetaLA and MetaMLA provided stronger and more consistent signals than single-study analysis. When applied to five yeast datasets related to environmental changes, the genes in the detected triplets were highly enriched in fundamental processes corresponding to environmental changes. In the third project, we extended the plaid model to multiple cohorts for bicluster detection. Our meta-biclustering detected biclusters with higher Jaccard accuracy under large noise and small sample size. We also introduced the concept of the gap statistic for pruning parameter estimation. In addition, the biclusters detected in breast cancer mRNA expression data select genes associated with many pathways and split samples into groups with significantly different survival behaviors. In conclusion, we improved fusion transcript detection, liquid association analysis, and biclustering through integrative-analysis frameworks. These provide strong evidence of structure variation, three-way gene regulation, and disease subtypes, and thus ultimately contribute to a better understanding of disease mechanisms.
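To make the liquid association idea in the abstract concrete, the following is a minimal single-study sketch in Python. It assumes a commonly used sample estimator: standardize the two genes X and Y, convert the third gene Z to normal scores by rank, and average the product X·Y·Z. The function names (`normal_scores`, `standardize`, `liquid_association`) and the simulated data are hypothetical illustrations; this is not the MetaLA or MetaMLA procedure from the dissertation, which is not described in enough detail here to reproduce.

```python
import numpy as np
from statistics import NormalDist


def normal_scores(z):
    """Map values to standard-normal quantiles according to their ranks."""
    z = np.asarray(z, dtype=float)
    ranks = np.argsort(np.argsort(z)) + 1  # ranks 1..n
    nd = NormalDist()
    return np.array([nd.inv_cdf(r / (len(z) + 1)) for r in ranks])


def standardize(x):
    """Center to mean 0 and scale to unit standard deviation."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std()


def liquid_association(x, y, z):
    """Assumed estimator: mean of X*Y*Z with X, Y standardized and Z normal-scored."""
    return float(np.mean(standardize(x) * standardize(y) * normal_scores(z)))


# Simulated example (hypothetical data): the x-y correlation is positive
# when z is high and negative when z is low.
rng = np.random.default_rng(0)
n = 2000
z = rng.normal(size=n)
x = rng.normal(size=n)
y = z * x + 0.5 * rng.normal(size=n)

la_pos = liquid_association(x, y, z)                      # z modulates the x-y correlation
la_null = liquid_association(x, rng.normal(size=n), z)    # no three-way relationship
print(la_pos, la_null)
```

In this simulation the first score should be clearly positive and the second close to zero, which is the contrast a triplet screen would rank on. A meta-analytic version would need some rule for combining such scores across studies; that step is deliberately left out because the source does not specify it.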
