New EST trimming strategy

Authors: Christian Baudet, Zanoni Dias

DOI: 10.1007/11532323_24

Abstract: Trimming procedures are an important part of the sequence analysis pipeline in an EST sequencing project. In general, trimming is done in several phases, each one detecting and removing some kind of undesirable artifact, such as low-quality sequence, vectors or adapters, and contamination. However, this strategy often results in a phase being unable to recognize its target because it was removed during a previous phase. To remedy this drawback, we propose a new strategy, where each phase detects but does not remove its target, leaving this decision to a post-processing step occurring after all phases. Our tests show that this strategy can significantly improve the detection of artifacts.
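
The detect-then-decide idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. The detector functions, the quality threshold of 20, the exact-match adapter search, and the "keep the longest unflagged region" rule are all assumptions chosen to keep the example short. Each detector only reports intervals on the original read, and a single post-processing step decides what to keep.

```python
from typing import List, Tuple

# (start, end_exclusive, label) on the ORIGINAL read coordinates
Interval = Tuple[int, int, str]


def detect_low_quality(quals: List[int], threshold: int = 20) -> List[Interval]:
    """Report maximal runs of bases whose quality is below the threshold.

    The threshold value is an assumption made for this illustration.
    """
    hits: List[Interval] = []
    start = None
    for i, q in enumerate(quals):
        if q < threshold and start is None:
            start = i
        elif q >= threshold and start is not None:
            hits.append((start, i, "low_quality"))
            start = None
    if start is not None:
        hits.append((start, len(quals), "low_quality"))
    return hits


def detect_adapter(seq: str, adapter: str) -> List[Interval]:
    """Report exact occurrences of a hypothetical adapter string.

    Real pipelines use alignment; exact matching is a simplification.
    """
    hits: List[Interval] = []
    pos = seq.find(adapter)
    while pos != -1:
        hits.append((pos, pos + len(adapter), "adapter"))
        pos = seq.find(adapter, pos + 1)
    return hits


def longest_clean_region(length: int, hits: List[Interval]) -> Tuple[int, int]:
    """Post-processing step: longest stretch not covered by any reported hit."""
    masked = [False] * length
    for start, end, _ in hits:
        for i in range(start, end):
            masked[i] = True
    best = (0, 0)
    run_start = None
    for i in range(length + 1):
        clean = i < length and not masked[i]
        if clean and run_start is None:
            run_start = i
        elif not clean and run_start is not None:
            if i - run_start > best[1] - best[0]:
                best = (run_start, i)
            run_start = None
    return best


def trim(seq: str, quals: List[int], adapter: str) -> Tuple[str, List[Interval]]:
    """Run every detector on the untouched read, then decide once."""
    hits = detect_low_quality(quals) + detect_adapter(seq, adapter)
    start, end = longest_clean_region(len(seq), hits)
    return seq[start:end], hits


# Toy read: adapter "GATC" at the 5' end, whose first two bases are low quality.
read = "GATCAAACCCGGGTTT"
quals = [10, 10] + [30] * 14

# Phase-by-phase removal: quality trimming cuts "GA", so the adapter
# detector no longer finds "GATC" and the leftover "TC" stays in the read.
after_quality_trim = read[2:]
missed = detect_adapter(after_quality_trim, "GATC")

# Detect-only phases followed by a single post-processing decision.
trimmed, hits = trim(read, quals, "GATC")
```

In this toy read, removing the low-quality prefix first destroys the adapter match, while running both detectors on the untouched read lets the post-processing step see both intervals and discard the whole adapter.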
