Authors: Erhard Rahm, Hong Hai Do
DOI:
Keywords: Data science, Computer science, Process (engineering), Data warehouse, Data quality, Database design, Data modeling, Current (fluid), Data cleansing
Abstract: We classify data quality problems that are addressed by data cleaning and provide an overview of the main solution approaches. Data cleaning is especially required when integrating heterogeneous data sources and should be addressed together with schema-related data transformations. In data warehouses, data cleaning is a major part of the so-called ETL process. We also discuss current tool support for data cleaning.