Selection Bias Tracking and Detailed Subset Comparison for High-Dimensional Data

Authors: David Borland, Wenyuan Wang, Jonathan Zhang, Joshua Shrestha, David Gotz

DOI: 10.1109/TVCG.2019.2934209

Abstract: The collection of large, complex datasets has become common across a wide variety of domains. Visual analytics tools increasingly play a key role in exploring and answering complex questions about these large datasets. However, many visualizations are not designed to concurrently visualize the large number of dimensions present in complex datasets (e.g. tens of thousands of distinct codes in an electronic health record system). This fact, combined with the ability of many visual analytics systems to enable rapid, ad-hoc specification of groups, or cohorts, of individuals based on a small subset of visualized dimensions, leads to the possibility of introducing selection bias: when the user creates a cohort based on a specified set of dimensions, differences across many other unseen dimensions may also be introduced. These unintended side effects may result in the cohort no longer being representative of the larger population intended to be studied, which can negatively affect the validity of subsequent analyses. We present techniques for selection bias tracking and visualization that can be incorporated into high-dimensional exploratory visual analytics systems, with a focus on medical data with existing data hierarchies. These techniques include: (1) tree-based cohort provenance and visualization, including a user-specified baseline cohort that all other cohorts are compared against, and visual encoding of cohort “drift”, which indicates where selection bias may have occurred, and (2) a set of visualizations, including a novel icicle-plot based visualization, to compare in detail the per-dimension differences between the baseline and a user-specified focus cohort. These techniques are integrated into a medical temporal event sequence visual analytics tool. We present example use cases and report findings from domain expert interviews.
