Data Cleaning: Problems and Current Approaches

Rahm, E.; Do, H.H.
Data Cleaning: Problems and Current Approaches
IEEE Techn. Bulletin on Data Engineering, Dec. 2000

Further information: http://lips.informatik.uni-leipzig.de/pub/2000-45

Description

We classify data quality problems that are addressed by data cleaning and provide an overview of the
main solution approaches. Data cleaning is especially required when integrating heterogeneous data
sources and should be addressed together with schema-related data transformations. In data warehouses,
data cleaning is a major part of the ETL (extraction, transformation, loading) process. We also discuss current tool support
for data cleaning.