Nonhypothesis-Driven Research: Data Mining and Knowledge Discovery

Author: Mollie R. Cummins

DOI: 10.1007/978-1-84882-448-5_15

Abstract: Clinical information, stored over time, is a potentially rich source of data for clinical research. Knowledge discovery in databases (KDD), commonly known as data mining, is a process for pattern discovery and predictive modeling in large databases. KDD makes extensive use of data mining methods, automated processes, and algorithms that enable pattern recognition. Characteristically, data mining involves the use of machine learning methods developed in the domain of artificial intelligence. These methods have been applied to healthcare and biomedical data for a variety of purposes with good success and potential or realized clinical translation. Herein, the Fayyad model of knowledge discovery in databases is introduced. The steps of the process are described with select examples from clinical research informatics. These steps range from initial data selection to interpretation and evaluation. Commonly used data mining methods are surveyed: artificial neural networks, decision tree induction, support vector machines (kernel methods), association rule induction, and k-nearest neighbor. Methods for evaluating the models that result from the KDD process are closely linked to methods used in diagnostic medicine. These include the use of measures derived from a confusion matrix and receiver operating characteristic curve analysis. Data partitioning and model validation are critical aspects of evaluation. International efforts to develop and refine clinical data repositories are critically linked to the potential of these methods for developing new knowledge.
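The evaluation measures named in the abstract can be illustrated with a minimal sketch. It is not taken from the chapter: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration. It derives sensitivity and specificity from a confusion matrix and estimates the area under the receiver operating characteristic curve as the fraction of positive-negative pairs that the model ranks correctly.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(y_true, scores):
    """Area under the ROC curve via pairwise ranking (ties count as half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical held-out labels and model scores (illustrative data only).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.6]

# Assumed decision threshold of 0.5 to turn scores into class predictions.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

sens, spec = sensitivity_specificity(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(sens, spec, auc)
```

Sensitivity and specificity depend on the chosen threshold, whereas the area under the curve summarizes ranking quality across all thresholds. In practice these measures would be computed on a data partition that was not used to fit the model.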

References (24)
Dominik Aronsky, Peter J. Haug, Wendy W. Chapman, Marcelo Fiszman, Combining decision support methodologies to diagnose pneumonia. American Medical Informatics Association Annual Symposium, pp. 12-16 (2001)
Gregory Piatetsky-Shapiro, Usama M. Fayyad, Padhraic Smyth, From data mining to knowledge discovery: an overview. Knowledge Discovery and Data Mining, pp. 1-34 (1996)
Warren S. McCulloch, Walter Pitts, A logical calculus of the ideas immanent in nervous activity. Bulletin of Mathematical Biology, vol. 52, pp. 99-115 (1990), DOI: 10.1007/BF02478259
M.E. Matheny, L. Ohno-Machado, F.S. Resnic, Discrimination and calibration of mortality risk prediction models in interventional cardiology. Journal of Biomedical Informatics, vol. 38, pp. 367-375 (2005), DOI: 10.1016/J.JBI.2005.02.007
Thomas A. Lasko, Jui G. Bhagwat, Kelly H. Zou, Lucila Ohno-Machado, The use of receiver operating characteristic curves in biomedical informatics. Journal of Biomedical Informatics, vol. 38, pp. 404-415 (2005), DOI: 10.1016/J.JBI.2005.02.008
Dominik Aronsky, Peter J. Haug, Charles Lagor, Nathan C. Dean, Accuracy of administrative data for identifying patients with pneumonia. American Journal of Medical Quality, vol. 20, pp. 319-328 (2005), DOI: 10.1177/1062860605280358
F. Cordero, M. Botta, R. A. Calogero, Microarray data analysis and mining approaches. Briefings in Functional Genomics and Proteomics, vol. 6, pp. 265-281 (2008), DOI: 10.1093/BFGP/ELM034