Authors: Dan He, Zhanyong Wang, Laxmi Parida
DOI: 10.1186/1471-2105-16-S1-S10
Keywords: Regression analysis, Genetic marker, Quantitative trait locus, Trait, Genetic architecture, Genetics, Biology, Epistasis, Linear model, Allele, Computational biology
Abstract: Given a set of biallelic molecular markers, such as SNPs, with genotype values on a collection of plant, animal or human samples, the goal of quantitative genetic trait prediction is to predict the trait values by simultaneously modeling all marker effects. Quantitative genetic trait prediction is usually represented as linear regression models, which require quantitative encodings for the genotypes: the three distinct genotype values, corresponding to one heterozygous and two homozygous alleles, are usually coded as integers and manipulated algebraically in the model. Further, epistasis between multiple markers is modeled as multiplication between the markers: it is unclear that the regression model continues to be effective under this. In this work we investigate the effects of encodings on the quantitative genetic trait prediction problem. We first showed that different encodings lead to different prediction accuracies in many test cases. We then proposed a data-driven encoding strategy, where we encode the genotypes according to their distribution in the phenotypes and allow each marker to have different encodings. We show in our experiments that this encoding strategy is able to improve the performance of the genetic trait prediction method, and that it is more helpful for oligogenic traits, whose values rely on a relatively small set of markers. To the best of our knowledge, this paper discusses
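
The contrast between a fixed integer encoding and a data-driven, per-marker encoding can be illustrated with a minimal Python sketch. This is not the authors' method: the abstract only says that genotypes are encoded "according to their distribution in the phenotypes", so the rule used here (each genotype of each marker is replaced by the mean phenotype of the training samples that carry it) is an assumption chosen for illustration. The genotype labels `AA`/`AB`/`BB`, the function names, and the toy data are likewise hypothetical.

```python
from collections import defaultdict


def additive_encoding(genotypes):
    """Conventional fixed encoding: the same integer code for every marker."""
    codes = {"AA": 0, "AB": 1, "BB": 2}
    return [[codes[g] for g in row] for row in genotypes]


def fit_data_driven_encoding(genotypes, phenotypes):
    """Build one lookup table per marker.

    Assumed rule (not taken from the paper): a genotype is encoded as the
    mean phenotype of the training samples carrying that genotype.
    """
    n_markers = len(genotypes[0])
    tables = []
    for j in range(n_markers):
        sums = defaultdict(float)
        counts = defaultdict(int)
        for row, y in zip(genotypes, phenotypes):
            sums[row[j]] += y
            counts[row[j]] += 1
        tables.append({g: sums[g] / counts[g] for g in sums})
    return tables


def apply_encoding(genotypes, tables, default=0.0):
    """Encode samples with the per-marker tables; unseen genotypes get `default`."""
    return [
        [tables[j].get(g, default) for j, g in enumerate(row)]
        for row in genotypes
    ]


# Toy data: 4 samples, 2 markers (hypothetical values).
genotypes = [["AA", "AB"], ["AB", "BB"], ["BB", "AB"], ["AA", "BB"]]
phenotypes = [1.0, 2.0, 5.0, 3.0]

fixed = additive_encoding(genotypes)
tables = fit_data_driven_encoding(genotypes, phenotypes)
encoded = apply_encoding(genotypes, tables)
```

In this sketch the same genotype can receive a different numeric value at each marker, which is the property the abstract highlights. Either encoded matrix could then be passed to a linear regression model. The tables should be fitted on training samples only, so that phenotype information from test samples does not leak into the features.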