Authors: Inge Jonassen, Ingvar Eidhammer, William R. Taylor
DOI: 10.1002/(SICI)1097-0134(19990201)34:2<206::AID-PROT6>3.0.CO;2-N
Keywords: Data mining, Computational biology, Structural motif, Sequence motif, Protein family, Structural alignment, Amino acid, Protein Data Bank (RCSB PDB), Protein structure, PROSITE, Biology, Biochemistry, Molecular biology, Structural biology
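Abstract: We present a language for describing structural patterns of residues in protein structures and a method for the discovery of such patterns that recur in a set of protein structures. The patterns impose restrictions on the spatial position of each residue, their order along the amino acid chain, and which amino acids are allowed in each position. Unlike other methods for comparing sets of protein structures, our method is not based on the use of pairwise structure comparisons, which are often time consuming and can produce inconsistent results. Instead, the method simultaneously takes into account information from all structures in the search for conserved structure patterns, which are potential structure motifs. The method is based on describing the spatial neighborhoods of each residue in each structure as a string and applying a sequence pattern discovery method to find patterns common to subsets of these strings. Finally, it is checked whether the similarities between the neighborhood strings correspond to spatially similar substructures. We apply the method to analyze sets of very disparate proteins from four different protein families: serine proteases, cuprodoxins, cysteine proteinases, and ferredoxins. The motifs found by the method correspond well to the site and motif information given in the annotation of the proteins in PDB, Swiss-Prot, and PROSITE. Furthermore, the motifs are confirmed by using the motif data to constrain the structural alignment of the proteins obtained with the program SAP. This gave the best superposition/alignment of the proteins given the motif assignment.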
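The neighborhood-string idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `neighborhood_strings`, the use of one coordinate per residue, the distance cutoff `radius`, and the toy coordinates are all assumptions made for illustration. For each residue, the sketch collects the residues within the cutoff and writes their one-letter codes in chain order.

```python
import math


def neighborhood_strings(residues, radius=10.0):
    """Encode the spatial neighborhood of each residue as a string.

    residues: list of (one_letter_code, (x, y, z)) tuples in chain order.
    radius:   hypothetical distance cutoff; the value is an assumption.
    """
    strings = []
    for _, center in residues:
        # Keep residues within the cutoff (the central residue included),
        # preserving their order along the chain.
        members = [aa for aa, pos in residues if math.dist(center, pos) <= radius]
        strings.append("".join(members))
    return strings


# Toy chain: four residues spaced 4 units apart on the x-axis (made-up data).
toy_chain = [
    ("A", (0.0, 0.0, 0.0)),
    ("C", (4.0, 0.0, 0.0)),
    ("D", (8.0, 0.0, 0.0)),
    ("E", (12.0, 0.0, 0.0)),
]
toy_strings = neighborhood_strings(toy_chain, radius=5.0)
```

Strings of this kind, one per residue per structure, could then be passed to a sequence pattern discovery step, followed by a geometric check that matching strings correspond to spatially similar substructures, as the abstract outlines.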