Authors: Daekwan Seo, Cizhong Jiang, Zhongming Zhao
Keywords:
Abstract: The local environment of single nucleotide polymorphisms (SNPs) contains abundant genetic information for the study of the mechanisms of mutation, genome evolution, and the causes of diseases. Recent studies revealed that neighboring-nucleotide biases on SNPs were strong and that the genome-wide bias patterns could be represented by a small subset of the total SNPs. How to estimate the effective SNP size, that is, the number of SNPs sufficient to represent the bias patterns observed from the whole SNP data, remains unsolved. To estimate the effective SNP size, we developed a novel statistical method, SNPKS, which considers both statistical and biological significances. SNPKS consists of two major steps: to obtain an initial effective size by the Kolmogorov-Smirnov test (KS test), and to find an intermediate effective size by interval evaluation. The SNPKS algorithm was implemented in computer programs and applied to real SNP data. The effective SNP size was estimated to be 38,200, 39,300, 38,000, and 38,700 in the human, chimpanzee, dog, and mouse genomes, respectively, and 39,100, 39,600, 39,200, and 42,200 in the human intergenic, genic, intronic, and CpG island regions, respectively. SNPKS is the first statistical method to estimate the effective SNP size. It runs efficiently and greatly outperforms the algorithm implemented in SNPNB. The application of SNPKS to real SNP data revealed similarly small effective SNP sizes (38,000 – 42,200) in the human, chimpanzee, dog, and mouse genomes as well as in the human genomic regions. The findings suggest a strong influence of genetic factors across vertebrate genomes.
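
The first SNPKS step relies on the Kolmogorov-Smirnov test, which compares two empirical distributions through the largest gap between their cumulative distribution functions. The following is a minimal sketch of the generic two-sample KS statistic only, not the SNPKS implementation. The function names `ks_statistic` and `ks_rejects`, the synthetic per-SNP values, and the use of the large-sample critical value with coefficient 1.36 (significance level 0.05) are all assumptions made for illustration; the abstract does not specify which quantity SNPKS tests or how the interval evaluation step works.

```python
import bisect
import math
import random


def ks_statistic(sample, reference):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(sample), sorted(reference)
    d = 0.0
    # Both empirical CDFs are step functions, so the largest gap
    # is reached at one of the observed values.
    for x in set(a) | set(b):
        fa = bisect.bisect_right(a, x) / len(a)
        fb = bisect.bisect_right(b, x) / len(b)
        d = max(d, abs(fa - fb))
    return d


def ks_rejects(sample, reference, c_alpha=1.36):
    """True if the samples differ at the level implied by c_alpha.

    Uses the large-sample approximation of the critical value;
    c_alpha = 1.36 corresponds to a significance level of 0.05.
    """
    n, m = len(sample), len(reference)
    critical = c_alpha * math.sqrt((n + m) / (n * m))
    return ks_statistic(sample, reference) > critical


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical stand-in for a per-SNP quantity measured on the whole data set.
    whole = [rng.gauss(0.0, 1.0) for _ in range(5000)]
    for size in (50, 500, 2000):
        subset = rng.sample(whole, size)
        print(size, round(ks_statistic(subset, whole), 4), ks_rejects(subset, whole))
```

A subset that passes such a test is statistically indistinguishable from the whole data at the chosen level. As the abstract notes, SNPKS also weighs biological significance and refines the size by interval evaluation, neither of which this sketch attempts.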