A novel statistical method to estimate the effective SNP size in vertebrate genomes and categorized genomic regions

Authors: Daekwan Seo, Cizhong Jiang, Zhongming Zhao

DOI: 10.1186/1471-2164-7-329

Abstract: The local environment of single nucleotide polymorphisms (SNPs) contains abundant genetic information for the study of mechanisms of mutation, genome evolution, and causes of diseases. Recent studies revealed that neighboring-nucleotide biases on SNPs were strong and that the genome-wide bias patterns could be represented by a small subset of the total SNPs. The estimation of the effective SNP size, the number of SNPs that are sufficient to represent the bias patterns observed from the whole SNP data, remains unsolved. To estimate the effective SNP size, we developed a novel statistical method, SNPKS, which considers both the statistical and biological significances. SNPKS consists of two major steps: to obtain an initial effective size by the Kolmogorov-Smirnov test (KS test) and to find an intermediate effective size by interval evaluation. The SNPKS algorithm was implemented in computer programs and applied to the real SNP data. The effective SNP size was estimated to be 38,200, 39,300, 38,000, and 38,700 in the human, chimpanzee, dog, and mouse genomes, respectively, and 39,100, 39,600, 39,200, and 42,200 in human intergenic, genic, intronic, and CpG island regions, respectively. SNPKS is the first statistical method to estimate the effective SNP size. It runs efficiently and greatly outperforms SNPNB. The application of SNPKS to the real SNP data revealed a similar small effective SNP size (38,000 to 42,200) in the human, chimpanzee, dog, and mouse genomes as well as in human genomic regions. The findings suggest a strong influence of genetic factors across vertebrate genomes.
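The abstract names the Kolmogorov-Smirnov test as the basis of the first SNPKS step but gives no implementation details. As a minimal sketch of that building block only, and not of the SNPKS algorithm itself, the following Python code computes a two-sample KS statistic and the standard large-sample critical value. The function names, the choice of one numeric value per SNP, and the significance level of 0.05 are all assumptions made for illustration.

```python
import math
import random


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: the largest gap between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Advance both pointers past every value <= x so ties are handled together.
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


def ks_critical_value(n, m, alpha=0.05):
    """Large-sample approximation of the two-sample KS critical value."""
    c_alpha = math.sqrt(-0.5 * math.log(alpha / 2.0))
    return c_alpha * math.sqrt((n + m) / (n * m))


def same_distribution(sample_a, sample_b, alpha=0.05):
    """True if the KS test does not reject equality of the two distributions."""
    d = ks_statistic(sample_a, sample_b)
    return d <= ks_critical_value(len(sample_a), len(sample_b), alpha)


if __name__ == "__main__":
    # Hypothetical data: one made-up numeric value per SNP.
    rng = random.Random(0)
    whole = [rng.gauss(0.0, 1.0) for _ in range(5000)]
    subset = rng.sample(whole, 500)
    print(ks_statistic(subset, whole), same_distribution(subset, whole))
```

A random subset drawn from the same data will usually pass this test at almost any size, so the KS comparison alone does not determine an effective size; the abstract indicates that SNPKS adds an interval evaluation step and considers biological significance as well.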
