NCBI Reference Sequences (RefSeq): current status, new features and genome annotation policy

Authors: K. D. Pruitt, T. Tatusova, G. R. Brown, D. R. Maglott

DOI: 10.1093/NAR/GKR1079

Abstract: The National Center for Biotechnology Information (NCBI) Reference Sequence (RefSeq) database is a collection of genomic, transcript and protein sequence records. These records are selected and curated from public sequence archives and represent a significant reduction in redundancy compared to the volume of data archived by the International Nucleotide Sequence Database Collaboration. The database includes over 16,000 organisms, 2.4 × 10^6 genomic records, 13 × 10^6 proteins and 2 × 10^6 RNA records spanning prokaryotes, eukaryotes and viruses (RefSeq release 49, September 2011). The RefSeq database is maintained by a combined approach of automated analyses, collaboration and manual curation to generate an up-to-date representation of the sequence, its features, names and cross-links to related sources of information. We report here on recent growth, the status of curating the human RefSeq data set, more extensive feature annotation and current policy for eukaryotic genome annotation via the NCBI annotation pipeline. More information about the resource is available online (see http://www.ncbi.nlm.nih.gov/RefSeq/).
