Analysing the localisation sites of proteins through neural networks ensembles

Authors: Aristoklis D. Anastasiadis, George D. Magoulas

DOI: 10.1007/S00521-006-0029-Y

Abstract: Scientists involved in the area of proteomics are currently seeking integrated, customised and validated research solutions to better expedite their work in proteomics analyses and drug discoveries. Some drugs and most of their cell targets are proteins, because proteins dictate biological phenotype. In this context, the automated analysis of protein localisation is more complex than the automated analysis of DNA sequences; nevertheless, the benefits to be derived are of the same or greater importance. In order to accomplish this target, the right choice of the kind of methods for these applications, especially when the data set is drastically imbalanced, is very important and crucial. In this paper we investigate the performance of some commonly used classifiers, such as the K nearest neighbours and feed-forward neural networks with and without cross-validation, in a class of imbalanced problems from the bioinformatics domain. Furthermore, we construct ensemble-based schemes using the notion of diversity, and we empirically test them on the same problems. The experimental results favour the generation of neural network ensembles, as these are able to produce good generalisation ability and significant improvement compared to other single classifier methods.
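
The abstract does not state which combination rule or diversity measure the ensembles use. As a minimal sketch of the general idea only, the Python below assumes plurality voting over member predictions and mean pairwise disagreement as the diversity measure. The class labels, the three member prediction lists and all function names are hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import combinations


def majority_vote(predictions):
    """Combine per-member label lists into one label per sample by plurality vote."""
    combined = []
    for votes in zip(*predictions):
        # most_common(1) returns the label with the highest count for this sample
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


def pairwise_disagreement(predictions):
    """Mean fraction of samples on which two ensemble members give different labels."""
    pairs = list(combinations(predictions, 2))
    if not pairs:
        return 0.0
    total = 0.0
    for first, second in pairs:
        total += sum(x != y for x, y in zip(first, second)) / len(first)
    return total / len(pairs)


def accuracy(predicted, truth):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(p == t for p, t in zip(predicted, truth)) / len(truth)


# Hypothetical imbalanced toy problem: six samples of class "a", two of class "b".
truth = ["a", "a", "a", "a", "a", "a", "b", "b"]

# Hypothetical member outputs; each member errs on a different sample.
member_1 = ["a", "a", "a", "a", "a", "b", "b", "b"]
member_2 = ["a", "a", "a", "a", "a", "a", "a", "b"]
member_3 = ["a", "b", "a", "a", "a", "a", "b", "b"]
members = [member_1, member_2, member_3]

ensemble = majority_vote(members)
print("member accuracies:", [accuracy(m, truth) for m in members])
print("ensemble accuracy:", accuracy(ensemble, truth))
print("mean pairwise disagreement:", pairwise_disagreement(members))
```

In this toy case each member is wrong on a different sample, so the vote corrects every individual error. That is the intuition behind preferring diverse members, although real ensembles rarely have errors this cleanly separated.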

References (35)
Harvey F. Lodish, Molecular Cell Biology (1986)
Tülay Yildirim, Bülten Bolat, A Data Selection Method for Probabilistic Neural Networks. IU-Journal of Electrical & Electronics Engineering, vol. 4, pp. 1137-1140 (2004)
Kenta Nakai, Paul Horton, A Probabilistic Classification System for Predicting the Cellular Localization Sites of Proteins. Intelligent Systems in Molecular Biology, vol. 4, pp. 109-115 (1996)
David E. Rumelhart, James L. McClelland, Parallel distributed processing: explorations in the microstructure of cognition, vol. 1: foundations. Computational Models of Cognition and Perception (1986), 10.7551/MITPRESS/5236.001.0001
Ron Kohavi, A study of cross-validation and bootstrap for accuracy estimation and model selection. International Joint Conference on Artificial Intelligence, vol. 2, pp. 1137-1143 (1995)
Chris D. Nugent, Jesus A. Lopez, Ann E. Smith, Norman D. Black, Prediction models in the design of neural network based ECG classifiers: A neural network and genetic programming approach. BMC Medical Informatics and Decision Making, vol. 2, pp. 1-6 (2002), 10.1186/1472-6947-2-1
Gabriele Zenobi, Pádraig Cunningham, Using diversity in preparing ensembles of classifiers based on different feature subsets to minimize generalization error. European Conference on Machine Learning, pp. 576-587 (2001), 10.1007/3-540-44795-4_49
Amanda J. C. Sharkey, On Combining Artificial Neural Nets. Connection Science, vol. 8, pp. 299-314 (1996), 10.1080/095400996116785
Ping Liang, Bernard Labedan, Monica Riley, Physiological genomics of Escherichia coli protein families. Physiological Genomics, vol. 9, pp. 15-26 (2002), 10.1152/PHYSIOLGENOMICS.00086.2001
Daniel Neagu, Vasile Palade, A neuro-fuzzy approach for functional genomics data interpretation and analysis. Neural Computing and Applications, vol. 12, pp. 153-159 (2003), 10.1007/S00521-003-0388-6