Authors: Ting Gong, Nicole Hartmann, Isaac S. Kohane, Volker Brinkmann, Frank Staedtler
DOI: 10.1371/JOURNAL.PONE.0027156
Keywords:
Abstract: Large-scale molecular profiling technologies have assisted the identification of disease biomarkers and facilitated the basic understanding of cellular processes. However, samples collected from human subjects in clinical trials possess a level of complexity, arising from multiple cell types, that can obfuscate the analysis of data derived from them. Failure to identify, quantify, and incorporate sources of heterogeneity into an analysis can have widespread and detrimental effects on subsequent statistical studies. We describe an approach that builds upon a linear latent variable model, in which expression levels from mixed cell populations are modeled as the weighted average of expression from different cell types. We solve these equations using quadratic programming, which efficiently identifies the globally optimal solution while preserving non-negativity of the fraction of the cells. We applied our method to various existing platforms to estimate proportions of different pure cell or tissue types and gene expression profilings of distinct phenotypes, with a focus on complex samples collected in clinical trials. We tested our methods on several well controlled benchmark data sets with known mixing fractions of pure cell or tissue types and on mRNA expression profiling data from samples collected in a clinical trial. Accurate agreement between predicted and actual mixing fractions was observed. In addition, our method was able to predict mixing fractions for more than ten species of circulating cells and to provide accurate estimates for relatively rare cell types (<10% of the total population). Furthermore, changes in leukocyte trafficking associated with Fingolimod (FTY720) treatment were identified that were consistent with previous results generated by both cell counts and flow cytometry. These data suggest that our method can solve one of the open questions regarding the analysis of complex transcriptional data: namely, how to identify the optimal mixing fractions in a given experiment.
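The linear mixing model in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it uses SciPy's non-negative least squares routine (`scipy.optimize.nnls`), which solves one special case of a quadratic program, and it runs on synthetic data. The function name `estimate_fractions`, the signature matrix, and the final rescaling of the weights so that they sum to one are all assumptions made for illustration.

```python
import numpy as np
from scipy.optimize import nnls


def estimate_fractions(signatures, mixture):
    """Estimate non-negative mixing fractions for one mixed sample.

    signatures: (genes x cell_types) array of pure cell-type expression.
    mixture:    (genes,) array of expression measured in the mixed sample.
    """
    # Minimize ||signatures @ w - mixture||_2 subject to w >= 0.
    weights, residual = nnls(signatures, mixture)
    # Assumption: rescale the weights so they can be read as fractions.
    total = weights.sum()
    fractions = weights / total if total > 0 else weights
    return fractions, residual


# Synthetic example (hypothetical data, not taken from the paper).
rng = np.random.default_rng(0)
signatures = rng.uniform(1.0, 10.0, size=(50, 3))  # 50 genes, 3 cell types
true_fractions = np.array([0.6, 0.3, 0.1])
mixture = signatures @ true_fractions  # noise-free weighted average

fractions, residual = estimate_fractions(signatures, mixture)
print(np.round(fractions, 3))
```

On real expression data the measurements are noisy, so the recovered fractions would only approximate the true ones. A general quadratic programming solver would also allow the sum-to-one condition to be imposed as an explicit constraint instead of a rescaling step.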