Description
Despite the large amount of recent research on entity
resolution (matching), there has not yet been a comparative
evaluation of the relative effectiveness and efficiency of alternative
approaches. We therefore present such an evaluation of existing
implementations on challenging real-world match tasks. We
consider approaches both with and without machine
learning to find a suitable parameterization and combination of
similarity functions. In addition to approaches from the research
community, we also consider a state-of-the-art commercial entity
resolution implementation. Our results indicate significant quality
and efficiency differences between the approaches. We also
find that some challenging resolution tasks, such as matching
product entities from online shops, are not sufficiently solved with
conventional approaches based on the similarity of attribute
values.