Authors: Desmond G. Higgins, Gordon Blackshields, Iain M. Wallace, Mark A. Larkin
DOI:
Keywords: Annotation, Over training, Computer science, Software, Sequence alignment, Measure (data warehouse), Multiple sequence alignment, Data mining, Benchmark (computing), Protein superfamily
Abstract: The most popular way of comparing the performance of multiple sequence alignment programs is to use empirical testing on sets of test sequences. Several such test sets now exist, each with potential strengths and weaknesses. We apply several different alignment packages to 6 benchmark datasets, and compare their relative performances. HOMSTRAD, a collection of alignments of homologous proteins, is regularly used as a benchmark for sequence alignment though it is not designed as such, and lacks annotation of reliable regions within the alignment. We introduce this annotation into HOMSTRAD using protein structural superposition. Results on each database show that method performance is dependent on the input sequences. Alignment benchmarks are regularly used in combination to measure performance across a spectrum of alignment problems. Through combining benchmarks, it is possible to detect whether a program has been over-optimised for a single dataset, or alignment problem type.