
ScaDS Research on Scalable Privacy-preserving Record Linkage

Franke, Martin; Gladbach, Marcel; Sehili, Ziad; Rohde, Florens; Rahm, Erhard
Datenbank-Spektrum
2019-02

Further information: https://doi.org/10.1007/s13222-019-00305-y

Description

Privacy-preserving record linkage (PPRL) supports the matching and integration of person-related data, e.g., on patients or customers, without compromising privacy. It is based on encoding the sensitive attribute values needed for matching and often involves trusted parties for linkage. We report on recent research results from the Big Data center ScaDS Dresden/Leipzig to improve the efficiency, scalability and quality of PPRL, and to apply PPRL in the medical domain. In particular, we present the use of pivot-based filtering techniques and blocking based on locality-sensitive hashing (LSH) to reduce the number of comparisons. Furthermore, we report on parallel linkage implementations based on Apache Flink that scale to millions of records.
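
To illustrate how LSH-based blocking can reduce the number of comparisons, the following is a minimal sketch and not the authors' implementation. It assumes that records are encoded as Bloom filters over character bigrams and that blocking keys are built by sampling bit positions (Hamming LSH). All names and parameters (`SECRET_KEY`, `bloom_encode`, `lsh_keys`, `candidate_pairs`, the filter size, and the numbers of hash functions, tables and sampled bits) are hypothetical choices made for this example.

```python
import hashlib
import hmac
import random
from collections import defaultdict

# Hypothetical shared secret; in practice the data owners would agree on it.
SECRET_KEY = b"example-key"


def bigrams(value):
    """Split a padded, lower-cased string into its set of character bigrams."""
    padded = f"_{value.lower()}_"
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def bloom_encode(value, size=256, num_hashes=4):
    """Encode a string as a Bloom filter (list of 0/1 bits) using keyed hashes."""
    bits = [0] * size
    for gram in bigrams(value):
        for i in range(num_hashes):
            message = f"{i}:{gram}".encode()
            digest = hmac.new(SECRET_KEY, message, hashlib.sha256).digest()
            bits[int.from_bytes(digest[:8], "big") % size] = 1
    return bits


def lsh_keys(bits, num_tables=10, key_length=8, seed=42):
    """Build one blocking key per table by sampling fixed bit positions.

    The same seed yields the same positions for every record, so similar
    bit vectors are likely to agree on at least one key.
    """
    rng = random.Random(seed)
    keys = []
    for table in range(num_tables):
        positions = rng.sample(range(len(bits)), key_length)
        keys.append((table, tuple(bits[p] for p in positions)))
    return keys


def candidate_pairs(records_a, records_b, num_tables=10, key_length=8, seed=42):
    """Return only those (id_a, id_b) pairs that share at least one blocking key."""
    blocks = defaultdict(list)
    for rec_id, bits in records_a.items():
        for key in lsh_keys(bits, num_tables, key_length, seed):
            blocks[key].append(rec_id)
    pairs = set()
    for rec_id, bits in records_b.items():
        for key in lsh_keys(bits, num_tables, key_length, seed):
            for other in blocks.get(key, []):
                pairs.add((other, rec_id))
    return pairs


# Made-up example records from two hypothetical sources.
source_a = {"a1": bloom_encode("anna schmidt"), "a2": bloom_encode("peter mueller")}
source_b = {"b1": bloom_encode("anna schmidt"), "b2": bloom_encode("lena wagner")}
pairs = candidate_pairs(source_a, source_b)
```

Only the pairs returned by `candidate_pairs` would then be compared in detail, for example with a similarity measure on the Bloom filters. Records with identical encodings always share every key; similar but non-identical records share a key only with some probability, which grows with the number of tables and shrinks with the key length.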