Authors: Chris Bystroff, Mohammed J. Zaki, Xiaolan Shen, Yu Shao, Jingjing Hu
DOI:
Keywords:
Abstract: The 3D conformation of a protein may be compactly represented in a symmetrical, square, boolean matrix of pairwise, inter-residue contacts, or "contact map". The contact map provides a host of useful information about the protein's structure. In this paper we describe how data mining can be used to extract valuable information from contact maps. For example, clusters of contacts represent certain secondary structures, and also capture non-local interactions, giving clues to the tertiary structure. We focus on two main tasks: 1) Given a database of protein sequences, discover an extensive set of non-local (frequent) dense patterns in their contact maps, and compile a library of such non-local interactions. 2) Cluster these patterns based on their similarities and evaluate the clustering quality. We show via experiments that our techniques are effective in characterizing contact patterns across different proteins, and can be used to improve contact map prediction for unknown proteins as well as to learn protein folding pathways.
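
To make the contact map definition concrete, the following is a minimal sketch, not the authors' implementation. It assumes one 3D coordinate per residue (for example the C-alpha atom) and a 7 Å distance cutoff. The cutoff, the minimum sequence separation used to call a contact "non-local", the toy coordinates, and the function names `contact_map` and `non_local_contacts` are all illustrative assumptions and do not come from the abstract.

```python
import math


def contact_map(coords, threshold=7.0):
    """Build a symmetric, square, boolean contact matrix.

    coords: one (x, y, z) tuple per residue (assumed representation).
    threshold: distance cutoff in angstroms (assumed value).
    The diagonal is left False; self-contacts are not recorded here.
    """
    n = len(coords)
    cmap = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) < threshold:
                # Set both entries so the matrix stays symmetric.
                cmap[i][j] = cmap[j][i] = True
    return cmap


def non_local_contacts(cmap, min_separation=4):
    """Return contacts (i, j) whose sequence separation j - i is at
    least min_separation (an assumed cutoff for "non-local")."""
    n = len(cmap)
    return [
        (i, j)
        for i in range(n)
        for j in range(i + min_separation, n)
        if cmap[i][j]
    ]


# Hypothetical hairpin-like toy chain: four residues out, four back.
coords = [
    (0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0),
    (11.4, 3.8, 0.0), (7.6, 3.8, 0.0), (3.8, 3.8, 0.0), (0.0, 3.8, 0.0),
]
cmap = contact_map(coords)
pairs = non_local_contacts(cmap)
```

In this toy chain the first and last residues are far apart in sequence but close in space, so their contact appears in `pairs`. Clusters of such off-diagonal contacts are the kind of non-local pattern the abstract refers to.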