Probabilistic change detection and visualization methods for the assessment of temporal stability in biomedical data quality

Authors: Carlos Sáez, Pedro Pereira Rodrigues, João Gama, Montserrat Robles, Juan M García-Gómez

DOI: 10.1007/S10618-014-0378-6

Abstract: Knowledge discovery on biomedical data can be based on on-line, data-stream analyses, or using retrospective, timestamped, off-line datasets. In both cases, changes in the processes that generate data or in their quality features through time may hinder either the knowledge discovery process or the generalization of past knowledge. These problems can be seen as a lack of data temporal stability. This work establishes temporal stability as a data quality dimension and proposes new methods for its assessment based on a probabilistic framework. Concretely, methods are proposed for (1) monitoring changes, and (2) characterizing changes, trends and detecting temporal subgroups. First, a probabilistic change detection algorithm is proposed based on the Statistical Process Control of the posterior Beta distribution of the Jensen-Shannon distance, with a memoryless forgetting mechanism. This algorithm (PDF-SPC) classifies the degree of current change in three states: In-Control, Warning, and Out-of-Control. Second, a novel method is proposed to visualize and characterize the temporal changes of data based on the projection of a non-parametric information-geometric statistical manifold of time windows. This projection facilitates the exploration of temporal trends using the proposed IGT-plot and, by means of unsupervised learning methods, discovering conceptually-related temporal subgroups. Methods are evaluated using real and simulated data based on the National Hospital Discharge Survey (NHDS) dataset.
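The change-monitoring idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' PDF-SPC algorithm: instead of the posterior Beta distribution and the memoryless forgetting mechanism, it assumes plain Shewhart-style limits (mean plus a multiple of the standard deviation) over previously observed Jensen-Shannon distances. The function names and the multipliers `warn_k` and `out_k` are hypothetical, chosen only for illustration.

```python
import math
import statistics


def js_distance(p, q):
    """Jensen-Shannon distance between two discrete distributions.

    Uses base-2 logarithms, so the result is bounded in [0, 1].
    """
    def kl(a, b):
        # Kullback-Leibler divergence; terms with a == 0 contribute nothing.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    m = [(x + y) / 2 for x, y in zip(p, q)]  # mixture distribution
    divergence = (kl(p, m) + kl(q, m)) / 2
    return math.sqrt(max(0.0, divergence))  # guard against tiny negatives


def control_state(history, current, warn_k=2.0, out_k=3.0):
    """Classify the current distance against past distances.

    Assumption: simple mean + k * sd limits stand in for the
    Beta-distribution-based limits described in the abstract.
    """
    mean = statistics.fmean(history)
    sd = statistics.pstdev(history)
    if current > mean + out_k * sd:
        return "Out-of-Control"
    if current > mean + warn_k * sd:
        return "Warning"
    return "In-Control"
```

In use, each new time window's distribution would be compared with a reference distribution via `js_distance`, and the resulting distance passed to `control_state` together with the distances seen so far.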
