Authors: S. R. Piccolo, M. R. Withers, O. E. Francis, A. H. Bild, W. E. Johnson
Keywords: DNA microarray, Workflow, Data mining, Single sample, Biology, Gene expression profiling, Software, Data integration, Mixture model, Profiling (information science)
Abstract: Over the past two decades, many biotechnology platforms have been developed for high-throughput gene expression profiling. However, because each platform is subject to technology-specific biases and produces distinct raw-data distributions, researchers have experienced difficulty in integrating data across platforms. Data integration is crucial to data-generating consortiums, researchers transitioning to newer profiling technologies, and individuals seeking to aggregate data across experiments. We address this need with our Universal exPression Code (UPC) approach, which corrects for platform-specific background noise using models that account for the genomic base composition and length of target regions; this approach also uses a mixture model to estimate whether a gene is active in a particular sample. The latter produces standardized UPC values on a zero-to-one scale, so they can be interpreted consistently, irrespective of profiling technology, thus enabling downstream analysis pipelines to be applied in a platform-agnostic manner. The UPC method can be applied to one- and two-channel expression microarrays and to next-generation sequencing data (RNA sequencing). Furthermore, UPCs are derived using information from within a given sample only; no ancillary samples are required at processing time. Thus, UPCs are suitable for personalized-medicine workflows where samples must be processed individually rather than in batches. In a variety of analyses and comparisons, UPCs perform comparably to other methods designed specifically for microarrays or RNA sequencing in most settings. Software for calculating UPCs is freely available at www.bioconductor.org/packages/release/bioc/html/SCAN.UPC.html.
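
To illustrate how a mixture model can turn the expression values of a single sample into scores on a zero-to-one scale, here is a minimal sketch. It is not the SCAN.UPC implementation. It assumes a plain two-component Gaussian mixture fitted by expectation-maximization, leaves out the platform-specific background correction for base composition and target length, and runs on simulated data. The function name `fit_two_component_mixture` and all parameters are hypothetical.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fit_two_component_mixture(values, n_iter=200):
    """Fit a two-component Gaussian mixture by EM.

    Returns, for each value, the posterior probability of belonging to the
    higher-mean ("active") component. Illustrative only; assumes the values
    are already background-corrected and roughly bimodal.
    """
    ordered = sorted(values)
    n = len(ordered)
    # Start the "inactive" mean at the lower quartile, "active" at the upper.
    mu = [ordered[n // 4], ordered[(3 * n) // 4]]
    overall_mean = sum(values) / n
    overall_sd = math.sqrt(sum((v - overall_mean) ** 2 for v in values) / n) or 1.0
    sd = [overall_sd, overall_sd]
    weight = [0.5, 0.5]
    post = [0.5] * n

    for _ in range(n_iter):
        # E-step: posterior probability that each value is "active".
        post = []
        for v in values:
            p0 = weight[0] * normal_pdf(v, mu[0], sd[0])
            p1 = weight[1] * normal_pdf(v, mu[1], sd[1])
            total = p0 + p1
            post.append(p1 / total if total > 0 else 0.5)
        # M-step: re-estimate weight, mean, and spread of each component.
        for k in (0, 1):
            resp = [p if k == 1 else 1.0 - p for p in post]
            rsum = sum(resp)
            if rsum < 1e-12:
                continue
            mu[k] = sum(r * v for r, v in zip(resp, values)) / rsum
            var = sum(r * (v - mu[k]) ** 2 for r, v in zip(resp, values)) / rsum
            sd[k] = max(math.sqrt(var), 1e-3)
            weight[k] = rsum / n

    # Guard against label switching: "active" must be the higher-mean component.
    if mu[0] > mu[1]:
        post = [1.0 - p for p in post]
    return post


# Simulated single sample: 300 inactive genes followed by 200 active genes.
random.seed(0)
sample = [random.gauss(2.0, 0.5) for _ in range(300)]
sample += [random.gauss(8.0, 1.0) for _ in range(200)]
upc_like = fit_two_component_mixture(sample)
```

Because each score is a posterior probability computed from one sample's own distribution, no reference samples are needed, and values from different platforms share the same zero-to-one interpretation.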