Author: Robert H. Carver
DOI:
Keywords:
Abstract: Statistics education reformers have for years called for the use of real data in teaching introductory statistics (Ballman, 1997; Garfield et al., 2004; Hogg, 1991). Instructors now have ready access to cases, textbook problems, and other exercises with accompanying well-documented sets of real or realistic data. Online portals and libraries provide a huge array of data sets keyed variously to substantive topics and to statistical techniques suitable for students. The vast majority of these datasets tend to have already been cleaned up by their preparers. As enriching as these resources are, relatively few of them offer students first-hand experience with the essential messiness of “real” data. There is a good case to be made that data cleaning and preparation belong in introductory courses (Burger & Leopold, 2001). Certainly, the problems of missing, dirty, and incomplete data are important topics within the field (Hoyle, 1971; Rubin, 1976; Wagner, 2002). Using data from the Wright Brothers’ 1904 experiments, this paper leads intermediate students through the process of data preparation, illustrating five common steps in data cleaning: standardizing the format of records, deciding how to treat ambiguously recorded data, converting measurements to a single standard unit, detecting and resolving issues of outliers, and imputing missing data.