
Learning-based approaches for matching web data entities

Köpcke, H.; Thor, A.; Rahm, E.
Learning-based approaches for matching web data entities
IEEE Internet Computing 14(4), 2010
2010-07

Further information: http://doi.ieeecomputersociety.org/10.1109/MIC.2010.58

Description

Entity matching is a key task for data integration and is especially challenging for web data. Effective entity matching typically requires combining several match techniques and finding suitable configuration parameters, such as similarity thresholds. We investigate to what degree machine learning helps to semi-automatically determine suitable match strategies with a limited amount of manual training effort. We use a new framework, FEVER, to evaluate several learning-based approaches for matching different sets of web data entities. In particular, we study different approaches to selecting training data and how much training is needed to find effective combined match strategies and their configuration.
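
To make the idea of a learned match configuration concrete, the following minimal Python sketch combines two string similarity measures and picks the similarity threshold that maximizes F1 on a handful of labeled pairs. It is an illustration only and does not reproduce FEVER or any approach evaluated in the paper: the equal-weight combination, the exhaustive threshold search, the function names, and the product names in the training data are all assumptions made for this example.

```python
from difflib import SequenceMatcher


def edit_sim(a: str, b: str) -> float:
    """Character-level similarity in [0, 1], computed by difflib."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def jaccard_sim(a: str, b: str) -> float:
    """Jaccard similarity of the whitespace-separated token sets."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def combined_score(a: str, b: str) -> float:
    """Assumed combination: unweighted mean of the two matchers."""
    return (edit_sim(a, b) + jaccard_sim(a, b)) / 2


def f1(predicted, actual) -> float:
    """F1 score for two parallel lists of booleans."""
    tp = sum(p and a for p, a in zip(predicted, actual))
    fp = sum(p and not a for p, a in zip(predicted, actual))
    fn = sum(not p and a for p, a in zip(predicted, actual))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def learn_threshold(training_pairs):
    """Return (threshold, f1) maximizing F1 on labeled (a, b, is_match) pairs.

    Candidate thresholds are the observed scores; ties keep the lowest one.
    """
    scores = [combined_score(a, b) for a, b, _ in training_pairs]
    labels = [is_match for _, _, is_match in training_pairs]
    best_t, best_f1 = 1.0, -1.0
    for t in sorted(set(scores)):
        current = f1([s >= t for s in scores], labels)
        if current > best_f1:
            best_t, best_f1 = t, current
    return best_t, best_f1


# Hypothetical, manually labeled training pairs (not from the paper).
TRAINING = [
    ("Canon EOS 450D", "Canon EOS 450D Digital SLR", True),
    ("Apple iPod nano 8GB", "iPod nano 8 GB Apple", True),
    ("Sony Bravia KDL-40", "Sony Bravia KDL-40 LCD TV", True),
    ("Canon EOS 450D", "Nikon D60", False),
    ("Apple iPod nano 8GB", "Sony Bravia KDL-40", False),
    ("Nikon D60", "Sony Bravia KDL-40 LCD TV", False),
]

threshold, train_f1 = learn_threshold(TRAINING)
print(f"learned threshold: {threshold:.3f}, training F1: {train_f1:.3f}")
```

A pair would then be declared a match when `combined_score` reaches the learned threshold. With so few training pairs the threshold is fitted to the sample, which is the trade-off between labeling effort and match quality that the paper studies.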