
Web Data Integration

Most proposed approaches to data integration rely on a global schema to provide a unified and consistent view of the underlying data sources. While this has been successful for data warehouses, the effort required to integrate new sources is usually high, which makes it difficult for such approaches to scale to many sources. Furthermore, achieving good data quality is challenging for virtual data integration.

Our work on web data integration focuses on the dynamic information fusion of data sources available on the web. Similar to the idea of mashups, we aim for fast development of data integration applications by reusing existing services and entity search engines within workflow-like data integration. Integration workflows are defined in a script language that supports powerful generic operators.

Projects

Our work on web data integration comprises the following projects:

  • With iFuice we developed an approach to information fusion of data sources that uses instance-based peer-to-peer mappings between them.
  • Object matching is a crucial task in data integration systems. Our MOMA framework can be used to define object matching workflows in a mapping-based P2P environment.
  • Data integration applications: Based on the results of the above-mentioned projects, we design and implement several domain-specific applications, e.g., BioFuice for the integration of biological data. Based on bibliographic data sources, we also performed a comprehensive citation analysis of database publications.
  • We are currently working on a mashup framework that aims to support online (ad-hoc) data integration in dynamic web applications.
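
To make the idea of instance-based mappings between sources more concrete, the following Python sketch shows one way such mappings could be represented, chained, and used to fuse records. It is an illustration under assumptions only, not the iFuice script language or the MOMA API: the function names `compose` and `fuse`, the toy sources, and all identifiers are hypothetical.

```python
from collections import defaultdict


def compose(first, second):
    """Chain two instance mappings: (a, b) in first and (b, c) in second yield (a, c)."""
    targets = defaultdict(set)
    for b, c in second:
        targets[b].add(c)
    return {(a, c) for a, b in first for c in targets.get(b, ())}


def fuse(records_a, records_b, same_as):
    """Merge the attributes of corresponding objects; source A wins on conflicts."""
    fused = []
    for a_id, b_id in sorted(same_as):
        fused.append({**records_b[b_id], **records_a[a_id]})
    return fused


# Hypothetical toy data: two bibliographic sources describing the same paper.
source_a = {"a1": {"title": "Paper X", "venue": "Conf 1"}}
source_b = {"b1": {"title": "paper x", "citations": 12}}

# Instance-level mappings are plain sets of (source_id, target_id) pairs.
a_to_b = {("a1", "b1")}
b_to_c = {("b1", "c7")}

a_to_c = compose(a_to_b, b_to_c)          # derive a new mapping from existing ones
fused = fuse(source_a, source_b, a_to_b)  # combine attributes of matched objects
```

In this sketch a mapping is simply a set of correspondences between object identifiers, so a new source can be connected by adding a single mapping to one existing peer rather than by extending a global schema.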

Project Members

Publications

Thor, A.; Rahm, E.
CloudFuice: A flexible Cloud-based Data Integration System
Proc. of 11th Intl. Conference on Web Engineering (ICWE), 2011
2011-06

Rahm, E.; Thor, A.; Aumueller, D.
Dynamic Fusion of Web Data
Proc. 5th Intl. XML Database Symposium (XSym), 2007
2007-09

Thor, A.; Aumueller, D.; Rahm, E.
Data Integration Support for Mashups
Proc. 6th Intl. Workshop on Information Integration on the Web (IIWeb), 2007
2007-07

Kirsten, T.; Thor, A.; Rahm, E.
Instance-based matching of large life science ontologies
Proc. of 4th Intl. Workshop on Data Integration in the Life Sciences (DILS), 2007
2007-06

Thor, A.; Kirsten, T.; Rahm, E.
Instance-based matching of hierarchical ontologies
Proc. of 12. GI-Fachtagung für Datenbanksysteme in Business, Technologie und Web (BTW), 2007
2007-03

Köpcke, H.; Rahm, E.
Analyse von Zitierungshäufigkeiten für die Datenbankkonferenz BTW
Datenbank-Spektrum, Vol. 7, Issue 20
2007-02

Thor, A.; Rahm, E.
MOMA - A Mapping-based Object Matching System
Proc. 3rd Conference on Innovative Data Systems Research (CIDR), 2007
2007-01

Kirsten, T.; Rahm, E.
BioFuice: Mapping-based data integration in bioinformatics
Proc. of 3rd Int. Workshop on Data Integration in the Life Sciences (DILS), Springer LNCS 4075, 2006
2006-07

Rahm, E.; Thor, A.
Citation analysis of database publications
ACM SIGMOD Record 34(4), 2005
2005-12

Rahm, E.; Thor, A.; Aumueller, D.; Do, H.H.; Golovin, N.; Kirsten, T.
iFuice - Information Fusion utilizing Instance Correspondences and Peer Mappings
Proc. 8th Intl. Workshop on the Web and Databases (WebDB), 2005
2005-06

Kirsten, T.; Rahm, E.
BioFuice: A decentralized Approach to integrate molecular-biological Data
Proc. 4th Research Festival for Life Sciences, Leipzig, Dec. 2005
2005