eXplainable Artificial Intelligence (XAI) for the identification of biologically relevant gene expression patterns in longitudinal human studies, insights from obesity research

Authors: Augusto Anguita-Ruiz, Alberto Segura-Delgado, Rafael Alcalá, Concepción M. Aguilera, Jesús Alcalá-Fdez

DOI: 10.1371/JOURNAL.PCBI.1007792

Abstract: To date, several machine learning approaches have been proposed for the dynamic modeling of temporal omics data. Although they have yielded impressive results in terms of model accuracy and predictive ability, most of these applications are based on "black-box" algorithms, and the research community has called for more interpretable models. The recent eXplainable Artificial Intelligence (XAI) revolution offers a solution to this issue, where rule-based approaches are highly suitable for explanatory purposes. The further integration of the data mining process with functional-annotation and pathway analyses is an additional way towards more explanatory and biologically sound models. In this paper, we present a novel rule-based XAI strategy (including pre-processing, knowledge extraction and functional validation) for finding biologically relevant sequential patterns from longitudinal human gene expression data (GED). To illustrate the performance of our pipeline, we work on in vivo temporal GED collected within the course of a long-term dietary intervention in 57 subjects with obesity (GSE77962). As validation populations, we employ three independent datasets following the same experimental design. As a result, we validate the primarily extracted gene patterns and prove the goodness of our strategy for the mining of biologically relevant gene-gene temporal relations. Our whole pipeline has been gathered under open-source software and could be easily extended to other human temporal GED applications.
