Authors: Marylyn D. Ritchie, William S. Bush
DOI: 10.1016/B978-0-12-380862-2.00001-1
Keywords: Biological data, Environmental data, Data simulation, Biology, Machine learning, Genome, Human genomics, Artificial intelligence, Bioinformatics, Genomics, In silico, Software
Abstract: Simulated data is a necessary first step in the evaluation of new analytic methods because in simulated data the true effects are known. To successfully develop novel statistical and computational methods for genetic analysis, it is vital to simulate datasets consisting of single nucleotide polymorphisms (SNPs) spread throughout the genome at a density similar to that observed by new high-throughput molecular genomics studies. In addition, the simulation of environmental data and effects will be essential to properly formulate risk models for complex disorders. Data simulations are often criticized because they are much less noisy than natural biological data, as it is nearly impossible to simulate the multitude of possible sources of natural and experimental variability. However, simulating data in silico is the most straightforward way to test the true potential of new methods during development. Thus, advances that increase the complexity of data simulations will permit investigators to better assess new analytical methods. In this work, we briefly describe some of the current approaches for the simulation of human genomic data, describing the advantages and disadvantages of the various approaches. We also include details on software packages available for data simulation. Finally, we expand upon one particular approach for the creation of complex, human genomic datasets that uses a forward-time population simulation algorithm: genomeSIMLA. Many of the hallmark features of biological datasets can be synthesized in silico; still, much research is needed to enhance our capabilities to create datasets that capture the natural complexity of biological datasets.