Description
The data quality improvement of DBpedia has been in the focus of many publications in the past years with topics covering both knowledge enrichment techniques such as type learning, taxonomy generation, interlinking as well as error detection strategies such as property or value outlier detection, type checking, ontology constraints, or unit-tests,to name just a few. The concrete innovation of the DBpedia FlexiFusion workflow, leveraging the novel DBpedia PreFusion dataset, which we present in this paper, is to massively cut down the engineering workload to apply any of the vast methods available in shorter time and also make it easier to produce customized knowledge graphs or DBpedias. While FlexiFusion is flexible to accommodate other use cases, our main use case in this paper is the generation of richer, language-specific DBpedias for the 20+ DBpedia chapters, which we demonstrate on the Catalan DBpedia. In this paper, we define a set of quality metrics and evaluate them for Wikidata and DBpedia datasets of several language chapters. Moreover, we show that an implementation of FlexiFusion, performed on the proposed PreFusion dataset, increases data size, richness as well as quality in comparison to the source datasets.