Design and implementation of a generalized laboratory data model.

Authors: Michael C Wendl, Scott Smith, Craig S Pohl, David J Dooling, Asif T Chinwalla

DOI: 10.1186/1471-2105-8-362

Abstract: Investigators in the biological sciences continue to exploit laboratory automation methods and have dramatically increased the rates at which they can generate data. In many environments, the methods themselves also evolve in a rapid and fluid manner. These observations point to the importance of robust information management systems in the modern laboratory. Designing and implementing such systems is non-trivial, and it appears that in many cases a database project ultimately proves unserviceable. We describe a general modeling framework for laboratory data and its implementation as an information management system. The model utilizes several abstraction techniques, focusing especially on the concepts of inheritance and meta-data. Traditional approaches commingle event-oriented data with regular entity data in ad hoc ways. Instead, we define distinct regular entity and event schemas, but fully integrate these via a standardized interface. The design allows straightforward definition of a "processing pipeline" as a sequence of events, obviating the need for separate workflow management systems. A layer above the event-oriented schema integrates events into a workflow by defining "processing directives", which act as automated project managers of items in the system. Directives can be added or modified in an almost trivial fashion, i.e., without the need for schema modification or re-certification of applications. Association between regular entities and events is managed via simple "many-to-many" relationships. We describe the programming interface, as well as techniques for handling input/output, process control, and state transitions. The implementation described here has served as the Washington University Genome Sequencing Center's primary information system for several years. It handles all transactions underlying a throughput rate of about 9 million sequencing reactions of various kinds per month and has handily weathered a number of major pipeline reconfigurations. The basic data model can be readily adapted to other high-volume processing environments.
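
The separation the abstract describes (regular entities, events, a many-to-many link between them, and directives that order events into a pipeline) can be illustrated with a small sketch. This is not the authors' schema: every table name, column name, step name, and helper function below is a hypothetical chosen for illustration, and SQLite is used only because it ships with Python.

```python
import sqlite3

# Hypothetical schema: all names are illustrative assumptions,
# not the design described in the paper.
SCHEMA = """
CREATE TABLE entity (            -- regular entity data (e.g. a plate)
    entity_id   INTEGER PRIMARY KEY,
    entity_type TEXT NOT NULL,
    name        TEXT NOT NULL
);
CREATE TABLE process_step (      -- one kind of step in a processing pipeline
    step_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL UNIQUE
);
CREATE TABLE event (             -- event-oriented data: one execution of a step
    event_id INTEGER PRIMARY KEY,
    step_id  INTEGER NOT NULL REFERENCES process_step(step_id),
    status   TEXT NOT NULL
);
CREATE TABLE entity_event (      -- many-to-many link between entities and events
    entity_id INTEGER NOT NULL REFERENCES entity(entity_id),
    event_id  INTEGER NOT NULL REFERENCES event(event_id),
    PRIMARY KEY (entity_id, event_id)
);
CREATE TABLE directive (         -- "processing directive": which step follows which
    pipeline      TEXT NOT NULL,
    prior_step_id INTEGER REFERENCES process_step(step_id),  -- NULL = pipeline start
    next_step_id  INTEGER NOT NULL REFERENCES process_step(step_id)
);
"""


def build_demo():
    """Create an in-memory database with one entity and a three-step pipeline."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO process_step (step_id, name) VALUES (?, ?)",
        [(1, "pick"), (2, "prep"), (3, "sequence")],
    )
    # The pipeline is plain data: rows, not code or schema.
    conn.executemany(
        "INSERT INTO directive (pipeline, prior_step_id, next_step_id) VALUES (?, ?, ?)",
        [("shotgun", None, 1), ("shotgun", 1, 2), ("shotgun", 2, 3)],
    )
    conn.execute(
        "INSERT INTO entity (entity_id, entity_type, name) VALUES (1, 'plate', 'plate-0001')"
    )
    return conn


def record_event(conn, step_name, entity_ids, status="completed"):
    """Insert one event for a step and link it to every entity it touched."""
    step_id = conn.execute(
        "SELECT step_id FROM process_step WHERE name = ?", (step_name,)
    ).fetchone()[0]
    event_id = conn.execute(
        "INSERT INTO event (step_id, status) VALUES (?, ?)", (step_id, status)
    ).lastrowid
    conn.executemany(
        "INSERT INTO entity_event (entity_id, event_id) VALUES (?, ?)",
        [(entity_id, event_id) for entity_id in entity_ids],
    )
    return event_id


def next_step(conn, pipeline, entity_id):
    """Return the name of the next step for an entity, or None if the pipeline is done."""
    row = conn.execute(
        """SELECT ev.step_id FROM event ev
           JOIN entity_event ee ON ee.event_id = ev.event_id
           WHERE ee.entity_id = ? AND ev.status = 'completed'
           ORDER BY ev.event_id DESC LIMIT 1""",
        (entity_id,),
    ).fetchone()
    last_step_id = row[0] if row else None
    # SQLite's IS operator compares like '=' but also matches NULL with NULL.
    nxt = conn.execute(
        """SELECT ps.name FROM directive d
           JOIN process_step ps ON ps.step_id = d.next_step_id
           WHERE d.pipeline = ? AND d.prior_step_id IS ?""",
        (pipeline, last_step_id),
    ).fetchone()
    return nxt[0] if nxt else None
```

In this sketch, reconfiguring a pipeline means inserting or updating rows in `directive`; no table definition changes, which is the property the abstract attributes to processing directives. The real system's inheritance and meta-data layers are not modeled here.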
