Method and system for cleansing training data for predictive models

Authors: Yaser I. Suleiman, Michael Zoll, Subhransu Basu, Thomas Herter, Thomas Breidt

Keywords: End user, Artificial intelligence, Machine learning, Selection (genetic algorithm), Set (abstract data type), Domain (software engineering), Training set, Computer science

Abstract: Described is an improved approach to implement selection of training data for machine learning, by presenting a designated set of specific indicators, where these indicators correspond to metrics that end users are familiar with and that are easily understood by ordinary users and DBAs within their knowledge domain. Selection of these indicators would correlate automatically to a corresponding set of other metrics/signals that are less understandable to an ordinary user. Additional analysis of the selected training data can then be performed to identify and correct any statistical problems with the selected training data.
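The selection idea in the abstract can be illustrated with a minimal sketch. It is not the patented method: the metric names (`cpu_util`, `log_sync_ms`, `gc_wait_ms`), the sample values, and both helper functions are hypothetical, and mean imputation merely stands in for whatever statistical correction the patent actually applies. The sketch assumes a user picks a range on one familiar indicator, the less familiar signals recorded in the same rows are carried along automatically, and a gap in the selected data is then repaired.

```python
from statistics import mean

# Hypothetical samples: one familiar indicator (cpu_util) plus two
# less familiar signals recorded at the same time.
SAMPLES = [
    {"cpu_util": 35.0, "log_sync_ms": 2.1, "gc_wait_ms": 0.8},
    {"cpu_util": 42.0, "log_sync_ms": None, "gc_wait_ms": 0.9},
    {"cpu_util": 97.0, "log_sync_ms": 40.0, "gc_wait_ms": 12.5},
    {"cpu_util": 50.0, "log_sync_ms": 2.5, "gc_wait_ms": 1.1},
]


def select_by_indicator(samples, indicator, low, high):
    """Keep whole rows whose familiar indicator lies in [low, high].

    Because entire rows are kept, the other signals in each row are
    selected implicitly along with the indicator.
    """
    return [row for row in samples if low <= row[indicator] <= high]


def fill_missing_with_mean(rows, signal):
    """Replace None in one signal with the mean of its observed values.

    Assumption: mean imputation is used here only as a simple example
    of correcting a statistical problem in the selected data.
    """
    observed = [row[signal] for row in rows if row[signal] is not None]
    fill = mean(observed)
    return [
        {**row, signal: fill if row[signal] is None else row[signal]}
        for row in rows
    ]


# The user selects "normal" operation as CPU utilization up to 60%.
selected = select_by_indicator(SAMPLES, "cpu_util", 0.0, 60.0)
cleaned = fill_missing_with_mean(selected, "log_sync_ms")
```

Selecting on the one metric a DBA already understands leaves the overloaded sample out of the training set, and the remaining rows bring their other signals with them without the user having to reason about those signals directly.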
