Analysis and comparison of benchmarks for multiple sequence alignment.

Authors: Desmond G. Higgins, Gordon Blackshields, Iain M. Wallace, Mark A. Larkin

DOI:

Keywords: Annotation, Over training, Computer science, Software, Sequence alignment, Measure (data warehouse), Multiple sequence alignment, Data mining, Benchmark (computing), Protein superfamily

Abstract: The most popular way of comparing the performance of multiple sequence alignment programs is to use empirical testing on sets of test sequences. Several such test sets now exist, each with potential strengths and weaknesses. We apply several different alignment packages to 6 benchmark datasets, and compare their relative performances. HOMSTRAD, a collection of alignments of homologous proteins, is regularly used as a benchmark for sequence alignment though it is not designed as such, and lacks annotation of reliable regions within the alignment. We introduce this annotation into HOMSTRAD using protein structural superposition. Results on each database show that method performance is dependent on the input sequences. Alignment benchmarks are regularly used in combination to measure performance across a spectrum of alignment problems. Through combining benchmarks, it is possible to detect whether a program has been over-optimised for a single dataset, or alignment problem type.

References (5)
Timo Lassmann, Erik L.L Sonnhammer, Quality assessment of multiple alignment programs. FEBS Letters, vol. 529, pp. 126-130 (2002), 10.1016/S0014-5793(02)03189-7
R. C. Edgar, MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Research, vol. 32, pp. 1792-1797 (2004), 10.1093/NAR/GKH340
C. Lee, C. Grasso, M. F. Sharlow, Multiple sequence alignment using partial order graphs. Bioinformatics, vol. 18, pp. 452-464 (2002), 10.1093/BIOINFORMATICS/18.3.452
Cédric Notredame, Desmond G Higgins, Jaap Heringa, T-Coffee: A novel method for fast and accurate multiple sequence alignment. Journal of Molecular Biology, vol. 302, pp. 205-217 (2000), 10.1006/JMBI.2000.4042