Mining protein contact maps

Authors: Chris Bystroff, Mohammed J. Zaki, Xiaolan Shen, Yu Shao, Jingjing Hu

Abstract: The 3D conformation of a protein may be compactly represented in a symmetrical, square, boolean matrix of pairwise, inter-residue contacts, or "contact map". The contact map provides a host of useful information about the protein's structure. In this paper we describe how data mining can be used to extract valuable information from contact maps. For example, clusters of contacts represent certain secondary structures, and also capture non-local interactions, giving clues to the tertiary structure. In this paper we focus on two main tasks: 1) Given a database of protein sequences, discover an extensive set of (frequent) dense patterns in their contact maps, and compile a library of such interactions. 2) Cluster these patterns based on their similarities and evaluate the clustering quality. We show via experiments that our techniques are effective in characterizing contact patterns across different proteins, and can be used to improve prediction for unknown proteins as well as to learn folding pathways.
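
To make the contact-map representation concrete, here is a minimal Python sketch. It assumes one 3D coordinate per residue and a distance cutoff of 7.0 angstroms; the cutoff, the toy coordinates, and the function names `contact_map` and `dense_windows` are illustrative assumptions, not values or code taken from the paper. The second function is only a simple density filter over off-diagonal windows, a stand-in for the idea of "dense patterns", not the authors' frequent-pattern mining method.

```python
import math


def contact_map(coords, threshold=7.0):
    """Build a symmetric boolean matrix of pairwise residue contacts.

    coords: list of (x, y, z) tuples, one per residue (assumed input format).
    threshold: assumed distance cutoff; residues at or below it are "in contact".
    The diagonal is left False, so a residue is not counted as touching itself.
    """
    n = len(coords)
    cmap = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= threshold:
                cmap[i][j] = cmap[j][i] = True  # fill both halves: symmetry
    return cmap


def dense_windows(cmap, size=2, min_contacts=3, min_separation=3):
    """List (i, j, count) for size x size windows with enough contacts.

    Only windows whose top-left corner (i, j) satisfies j - i >= min_separation
    are scanned, which restricts the search to non-local (off-diagonal) regions.
    All parameters are hypothetical defaults chosen for this toy example.
    """
    n = len(cmap)
    hits = []
    for i in range(n - size + 1):
        for j in range(i + min_separation, n - size + 1):
            count = sum(
                cmap[a][b]
                for a in range(i, i + size)
                for b in range(j, j + size)
            )
            if count >= min_contacts:
                hits.append((i, j, count))
    return hits


# Toy hairpin-like chain: residues 0-2 run one way, residues 3-5 run back.
coords = [(0, 0, 0), (4, 0, 0), (8, 0, 0), (8, 4, 0), (4, 4, 0), (0, 4, 0)]
cmap = contact_map(coords)
hits = dense_windows(cmap)

for row in cmap:
    print("".join("#" if cell else "." for cell in row))
print(hits)
```

In this toy chain the two ends fold back on each other, so contacts appear far from the diagonal; windows of that kind are the sort of non-local pattern the abstract refers to.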
