Integrity, standards, and QC-related issues with big data in pre-clinical drug discovery.

Authors: John F. Brothers, Matthew Ung, Renan Escalante-Chong, Jermaine Ross, Jenny Zhang

DOI: 10.1016/J.BCP.2018.03.014

Keywords: Scalability; Field (computer science); Quality (business); Data science; Big data; Pre-clinical development; Computer science; Data analysis; Data integrity; Biological data

Abstract: The tremendous expansion of data analytics and of public and private big datasets presents an important opportunity for pre-clinical drug discovery and development. In the life sciences, the growth of genetic, genomic, transcriptomic, and proteomic data is partly driven by a rapid decline in experimental costs as biotechnology improves throughput, scalability, and speed. Yet far too many researchers tend to underestimate the challenges and consequences involving data integrity and quality standards. Given their effect on scientific interpretation, these issues have significant implications during preclinical drug development. We describe standardized approaches for maximizing the utility of publicly available or privately generated biological data and address some of the common pitfalls. We also discuss the increasing interest in integrating and interpreting cross-platform data. The principles outlined here should serve as a useful broad guide for existing analytical practices and pipelines, and as a tool for developing additional insights into therapeutics using big data.
