Circular clustering of protein dihedral angles by Minimum Message Length

Authors: Dix TI, Dowe DL, Hunter L, Allison L, Wallace CS

Abstract: Early work on proteins identified the existence of helices and extended sheets in protein secondary structures, a high-level classification which remains popular today. Using the Snob program for information-theoretic Minimum Message Length (MML) classification, we are able to take protein dihedral angles as determined by X-ray crystallography and cluster sets of dihedral angles into groups. Previous work by Hunter and States has applied a similar Bayesian classification method, AutoClass, to protein data with site position represented by 3 Cartesian co-ordinates for each of the alpha-Carbon, beta-Carbon and Nitrogen, totalling 9 co-ordinates. By using the von Mises circular distribution in the Snob program, we are instead able to represent local site properties by the two dihedral angles, phi and psi. Since each site can be modelled as having 2 degrees of freedom, this orientation-invariant dihedral angle representation of the data is more compact than that of nine highly-correlated Cartesian co-ordinates. Using the information-theoretic message length concepts discussed in the paper, such a more concise model is more likely to represent the underlying generating process from which the data came. We report on the results of our classification, plotting the classes in (phi, psi) space and introducing a symmetric information-theoretic distance measure to build a minimum spanning tree between the classes. We also give a transition matrix between the classes and note the existence of three classes in the region of phi approximately -1.09 rad and psi approximately -0.75 rad which are close on the spanning tree and have high inter-transition probabilities. This gives rise to a tight, abundant and self-perpetuating structure.
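
The two quantities the abstract leans on, a von Mises density over each dihedral angle and a transition matrix between classes, can be illustrated with a minimal Python sketch. This is not the Snob program or the authors' code. The function names, the dictionary layout of a class, the concentration values, and the assumption that phi and psi are modelled as independent von Mises variables within a class are all hypothetical choices made for illustration; only the centre (phi = -1.09 rad, psi = -0.75 rad) comes from the abstract.

```python
import math

def bessel_i0(x, terms=60):
    """Modified Bessel function of the first kind, order 0, from its power series."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        # term_{k+1} = term_k * (x/2)^2 / (k+1)^2
        term *= (x / 2.0) ** 2 / ((k + 1) ** 2)
    return total

def von_mises_pdf(theta, mu, kappa):
    """von Mises density on the circle; theta and mu in radians, kappa >= 0."""
    return math.exp(kappa * math.cos(theta - mu)) / (2.0 * math.pi * bessel_i0(kappa))

def site_neg_log_likelihood(phi, psi, cls):
    """Negative log-likelihood (in nits) of one (phi, psi) site under one class.

    Assumption: the two angles are independent von Mises variables within a class.
    """
    p_phi = von_mises_pdf(phi, cls["mu_phi"], cls["kappa_phi"])
    p_psi = von_mises_pdf(psi, cls["mu_psi"], cls["kappa_psi"])
    return -(math.log(p_phi) + math.log(p_psi))

def transition_matrix(labels, n_classes):
    """Row-normalised counts of moves from one site's class to the next site's class."""
    counts = [[0] * n_classes for _ in range(n_classes)]
    for a, b in zip(labels, labels[1:]):
        counts[a][b] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix

# Hypothetical class centred where the abstract reports three classes;
# the kappa values are made up for illustration.
example_class = {"mu_phi": -1.09, "kappa_phi": 8.0, "mu_psi": -0.75, "kappa_psi": 8.0}

if __name__ == "__main__":
    print(site_neg_log_likelihood(-1.0, -0.8, example_class))
    print(transition_matrix([0, 0, 1, 0], 2))
```

A smaller negative log-likelihood means a site is cheaper to encode under that class, which is the sense in which message length favours compact, well-fitting models. A class with large diagonal entries in the transition matrix tends to be followed by itself along the chain.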

References (25)
C. S. Wallace, P. R. Freeman. Estimation and Inference by Compact Coding. Journal of the Royal Statistical Society, Series B (Methodological), vol. 49, pp. 240-252 (1987). doi:10.1111/J.2517-6161.1987.TB01695.X
D. L. Dowe, J. J. Oliver, R. A. Baxter, C. S. Wallace. Bayesian Estimation of the Von Mises Concentration Parameter. Maximum Entropy and Bayesian Methods, pp. 51-60 (1996). doi:10.1007/978-94-011-5430-7_6
Ray J. Solomonoff. The Discovery of Algorithmic Probability: A Guide for the Programming of True Creativity. European Conference on Computational Learning Theory, pp. 1-22 (1995). doi:10.1007/3-540-59119-2_165
C. S. Wallace. Classification by Minimum-Message-Length Inference. ICCI '90: Proceedings of the International Conference on Advances in Computing and Information, pp. 72-81 (1991). doi:10.1007/3-540-53504-7_63
Marianne J. Rooman, Joaquin Rodriguez, Shoshana J. Wodak. Automatic Definition of Recurrent Local Structure Motifs in Proteins. Journal of Molecular Biology, vol. 213, pp. 327-336 (1990). doi:10.1016/S0022-2836(05)80194-9
Gregory J. Chaitin. On the Length of Programs for Computing Finite Binary Sequences. Journal of the ACM, vol. 13, pp. 547-569 (1966). doi:10.1145/321356.321363
Michael Levitt. Accurate Modeling of Protein Conformation by Automatic Segment Matching. Journal of Molecular Biology, vol. 226, pp. 507-533 (1992). doi:10.1016/0022-2836(92)90964-L
Peter Cheeseman, James Kelly, Matthew Self, John Stutz, Will Taylor, Don Freeman. AutoClass: A Bayesian Classification System. International Conference on Machine Learning, pp. 431-441 (1993). doi:10.1016/B978-0-934613-64-4.50011-6