Pathway-Level Information ExtractoR (PLIER): a generative model for gene expression data

Authors: Weiguang Mao, Elena Zaslavsky, Boris M. Hartmann, Stuart C. Sealfon, Maria Chikina

DOI: 10.1101/116061

Abstract: Genome-scale molecular datasets are often highly structured, with many correlated measurements. This general phenomenon can be related to the underlying data generating process. In assays of mixed cell populations, such as blood, variation in cell-type proportion induces a complex correlation structure at the gene level. Likewise, groups of genes are co-regulated/co-expressed through shared transcription factors and signaling pathways. Many applications of gene expression analysis rely on their ability to reflect these unobserved biological processes in order to draw mechanistic conclusions. On the other hand, correlation patterns may also reflect nuisance factors, such as batch effects, which interfere with correct interpretation. The choice of analysis method is heavily dependent on which processes (nuisance or interesting-biological) are believed to account for more of the variance, and the optimal analysis strategy remains an open question. In this study we describe an approach to infer a biologically grounded data generating model that provides estimates of biological processes, including explicitly identified pathway-level effects. Specifically, we formulate a new matrix decomposition framework, PLIER (Pathway-Level Information ExtractoR), which incorporates prior knowledge. Using simulations, we demonstrate the superiority of our method in recovering the true data generating model. Using real data, we show that our approach is able to recover interpretable variables, reproduce previous findings with a simplified analysis, distinguish biological from technical variation, and provide additional biological insight. The method and auxiliary functions are compiled in an R package available at https://github.com/wgmao/PLIER.
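The abstract's claim that varying cell-type proportions induce gene-level correlation can be illustrated with a small simulation. The sketch below is not PLIER and does not use the PLIER package; it is a minimal toy generative model in plain Python. It assumes two cell types, and the gene names, signature values, and noise level are all hypothetical choices made for illustration.

```python
import random


def pearson(x, y):
    """Pearson correlation of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def simulate_mixture(n_samples=200, noise_sd=0.5, seed=0):
    """Simulate bulk expression as a proportion-weighted mix of two cell types.

    All numbers here are hypothetical. Each gene has a mean expression in
    pure T cells and in pure monocytes; a sample's bulk value is the
    mixture of the two plus independent Gaussian noise.
    """
    rng = random.Random(seed)
    # (expression in pure T cells, expression in pure monocytes)
    signatures = {
        "t_marker_1": (10.0, 1.0),
        "t_marker_2": (8.0, 0.5),
        "mono_marker_1": (1.0, 9.0),
    }
    expression = {gene: [] for gene in signatures}
    for _ in range(n_samples):
        t_fraction = rng.uniform(0.2, 0.8)  # unobserved cell-type proportion
        for gene, (in_t, in_mono) in signatures.items():
            mixed = t_fraction * in_t + (1.0 - t_fraction) * in_mono
            expression[gene].append(mixed + rng.gauss(0.0, noise_sd))
    return expression


if __name__ == "__main__":
    expr = simulate_mixture()
    print("t_marker_1 vs t_marker_2:",
          round(pearson(expr["t_marker_1"], expr["t_marker_2"]), 2))
    print("t_marker_1 vs mono_marker_1:",
          round(pearson(expr["t_marker_1"], expr["mono_marker_1"]), 2))
```

In this toy setup the genes never interact with one another, yet the two T-cell markers come out strongly positively correlated and the T-cell and monocyte markers strongly negatively correlated. The only shared factor is the unobserved proportion, which is the kind of latent variable a decomposition method aims to recover.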
