Evaluation of Pipelines for Data Integration into Knowledge Graphs

2026 · cs.AI · arXiv 2605.22304

1 Pith paper cites this work. Polarity classification is still indexing.

abstract

Integrating new data into knowledge graphs (KGs) typically involves different tasks that are executed within workflows or pipelines. There are many possible pipelines for a specific integration problem, but there is not yet a general approach for evaluating the overall quality and performance of such pipelines in order to determine the best choices. We therefore propose a new benchmark, KGI-Bench, to evaluate integration pipelines that ingest different kinds of input data into an existing KG. We evaluate pipelines by analyzing their output, i.e., the updated KG, with three complementary quality metrics: coverage, correctness, and consistency. We also provide benchmark datasets (a seed KG, overlapping input data in three formats, and a reference KG as ground truth) for the movie domain. To demonstrate the applicability and usefulness of the proposed benchmark, we comparatively evaluate 12 pipelines and analyze their behavior across different input data formats and design choices.
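
The abstract names the three metrics but does not define them, so the following is only a minimal sketch of how an output KG might be scored against a reference KG. Everything in it is an assumption made for illustration: the function names, the representation of a KG as a set of (subject, predicate, object) triples, the recall-like and precision-like readings of coverage and correctness, and the functional-predicate check used for consistency. None of it is taken from KGI-Bench itself.

```python
# Illustrative sketch only. These metric definitions are ASSUMPTIONS,
# not the definitions used by KGI-Bench.
from typing import Dict, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def coverage(output_kg: Set[Triple], reference_kg: Set[Triple]) -> float:
    """Share of reference triples that appear in the output KG (recall-like)."""
    if not reference_kg:
        return 1.0
    return len(output_kg & reference_kg) / len(reference_kg)


def correctness(output_kg: Set[Triple], reference_kg: Set[Triple]) -> float:
    """Share of output triples that appear in the reference KG (precision-like)."""
    if not output_kg:
        return 1.0
    return len(output_kg & reference_kg) / len(output_kg)


def consistency(output_kg: Set[Triple], functional_predicates: Set[str]) -> float:
    """Share of (subject, functional predicate) pairs with exactly one object."""
    objects: Dict[Tuple[str, str], Set[str]] = {}
    for s, p, o in output_kg:
        if p in functional_predicates:
            objects.setdefault((s, p), set()).add(o)
    if not objects:
        return 1.0
    single_valued = sum(1 for objs in objects.values() if len(objs) == 1)
    return single_valued / len(objects)


# Toy movie-domain data, invented for this example.
reference = {
    ("m1", "title", "Alien"), ("m1", "year", "1979"),
    ("m2", "title", "Heat"), ("m2", "year", "1995"),
}
output = {
    ("m1", "title", "Alien"), ("m1", "year", "1979"),
    ("m1", "year", "1980"),  # conflicting value for a functional predicate
    ("m2", "title", "Heat"),
}

cov = coverage(output, reference)
cor = correctness(output, reference)
con = consistency(output, {"title", "year"})
```

In this toy case the output contains three of the four reference triples and one spurious triple, so the assumed coverage and correctness are both 0.75, and one of the three checked subject-predicate pairs has conflicting values. Keeping the three scores separate, rather than folding them into one number, mirrors the abstract's description of them as complementary.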

representative citing papers

MaDI-Bench: An End-to-End Data Integration Benchmark

cs.DB · 2026-06-29 · unverdicted · novelty 7.0

MaDI-Bench supplies the first end-to-end benchmark tasks for full relational data integration pipelines across domains plus a variant-generation method to slow saturation.

