Abstract: The example of a multiple-sequence alignment shown in Figure 3.1 is a set of amino acid sequences of globins that have been aligned so that homologous residues are arranged in columns as much as possible. The sequences have different lengths, which means that gaps (shown as hyphens in the figure) must be used in some positions to achieve the alignment. If the sequences were all the same length and differed only by substitutions, the gaps would not be needed. The generation of alignments is one of the most common tasks in computational sequence analysis because alignments are required for many other analyses, such as predicting structure or demonstrating sequence similarity within a family of sequences. Of course, one of the most common reasons for generating them is that they are an essential prerequisite for most phylogenetic analyses. Rates or patterns of change in sequences cannot be analyzed unless the sequences can be aligned. The final goal should be clearly recognized when discussing how to carry out an alignment, either manually or using a computer program. Any phylogeny inference based on molecular data begins by comparing homologous residues (i.e., those that descend from a common ancestral residue) across different deoxyribonucleic acid (DNA) or protein sequences. The best way to do this is to align the sequences one on top of another, so that homologous residues from different sequences line up in the same column. If the sequences are evolutionarily related, they began as identical sequences and diverged over time through the accumulation of substitutions, as well as insertions and deletions. Given a set of sequences, it is then …
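To make the idea of columns and gaps concrete, here is a minimal Python sketch. The three sequences are invented toy strings, not the globins of Figure 3.1, and the helper names `columns` and `ungapped_lengths` are hypothetical. It assumes the rows are already aligned, and shows only that gap padding gives every row the same length so that the alignment can be read column by column.

```python
# Toy aligned sequences (hypothetical, not real globin data); '-' marks a gap.
alignment = {
    "seq_a": "MVLSPADK",
    "seq_b": "MV-SAADK",
    "seq_c": "M-LSPA-K",
}


def columns(aln):
    """Yield each alignment column as a string, one character per sequence."""
    rows = list(aln.values())
    length = len(rows[0])
    # Gaps pad every row to a common length; anything else is not an alignment.
    if any(len(row) != length for row in rows):
        raise ValueError("aligned rows must all have the same length")
    for i in range(length):
        yield "".join(row[i] for row in rows)


def ungapped_lengths(aln):
    """Return the original (gap-free) length of each sequence."""
    return {name: len(row.replace("-", "")) for name, row in aln.items()}


cols = list(columns(alignment))
lengths = ungapped_lengths(alignment)

for index, col in enumerate(cols, start=1):
    print(index, col)
print(lengths)
```

Each printed column holds the residues treated as homologous at that position, and the gap-free lengths differ even though every aligned row has the same length.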