arxiv: 2604.19205 · v1 · submitted 2026-04-21 · 💻 cs.DB


Demonstrating Online Schema Alignment in Decentralized Knowledge Graphs Querying

Bryan-Elliott Tam, Pieter Colpaert, Ruben Taelman

Pith reviewed 2026-05-10 01:36 UTC · model grok-4.3

classification 💻 cs.DB

keywords online schema alignment · decentralized knowledge graphs · link traversal query processing · vocabulary heterogeneity · LTQP · dynamic alignment · query execution


The pith

Online schema alignment recovers complete query results from vocabulary-mismatched decentralized knowledge graphs with low overhead.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Decentralized knowledge graphs let users integrate data from many sources without a central authority, but differing vocabularies across those sources often break queries. This paper demonstrates an online schema alignment method for link traversal query processing that finds, limits the scope of, and applies alignment rules while a query is running. The approach avoids the need to predict every vocabulary mismatch ahead of time and leaves the original link-following behavior unchanged. In a social-media example, the technique produces full results at low extra cost, supporting the possibility of web-scale decentralized querying.

Core claim

The demonstration establishes that an online schema alignment approach for link traversal query processing discovers, scopes, and applies alignment rules dynamically during query execution while preserving traversal behavior, recovering complete query results with low overhead in a decentralized social-media scenario.

What carries the argument

Online schema alignment process that discovers, scopes, and applies rules dynamically at runtime during link traversal query execution.
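The discover, scope, and apply loop can be pictured with a minimal Python sketch. It is not the authors' Comunica-based implementation: the toy in-memory web, the `ex:` and `other:` vocabularies, the `align:equivalentProperty` rule predicate, and the `traverse` function are all hypothetical, and scoping a rule to the document that publishes it is an assumption made for illustration.

```python
from collections import deque

ALIGN = "align:equivalentProperty"  # hypothetical rule predicate

# Toy "web": document URL -> triples. All URLs and vocabularies are made up.
WEB = {
    "https://alice.example/profile": [
        ("https://alice.example/#me", "ex:name", "Alice"),
        ("https://alice.example/#me", "ex:knows", "https://bob.example/profile"),
    ],
    "https://bob.example/profile": [
        ("https://bob.example/#me", "other:fullName", "Bob"),
        # Alignment rule published alongside the data it describes.
        ("other:fullName", ALIGN, "ex:name"),
    ],
}


def traverse(seed, query_predicate, align=True):
    """Breadth-first link traversal; returns (results, visited documents)."""
    queue, visited, results = deque([seed]), [], []
    while queue:
        url = queue.popleft()
        if url in visited or url not in WEB:
            continue
        visited.append(url)
        triples = WEB[url]
        # Discover rules in this document; scope them to this document only.
        rules = {s: o for s, p, o in triples if p == ALIGN} if align else {}
        for s, p, o in triples:
            if p == ALIGN:
                continue
            # Link following is decided on the original triple.
            if o in WEB:
                queue.append(o)
            # Apply: rewrite the predicate before matching the query.
            if rules.get(p, p) == query_predicate:
                results.append(o)
    return results, visited


if __name__ == "__main__":
    print(traverse("https://alice.example/profile", "ex:name"))
    print(traverse("https://alice.example/profile", "ex:name", align=False))
```

In this toy run the list of visited documents is the same with and without alignment; only the result set changes, which is the property the paper describes as preserving traversal behavior.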

If this is right

Query issuers obtain complete results without pre-defining every possible vocabulary difference.
Alignment occurs without breaking the step-by-step link traversal that defines LTQP.
The method supports web-scale reasoning because overhead remains low enough for practical use.
Existing link traversal engines can incorporate the alignment step without redesigning their core traversal logic.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same dynamic alignment pattern could apply to other decentralized domains such as scientific or open-government data.
Automated rule discovery services might further reduce the need for manual alignment definitions.
Live monitoring of alignment rule usage could reveal common mismatch patterns across the web.

Load-bearing premise

Alignment rules can be discovered, scoped, and applied dynamically at runtime without disrupting link traversal behavior or requiring anticipation of all vocabulary mismatches in advance.

What would settle it

Running the same set of decentralized queries on a social-media knowledge graph after deliberately introducing new vocabulary mismatches and checking whether complete results are still returned or whether overhead rises sharply.
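A rough sketch of such a test, in Python and under stated assumptions: the two-document dataset, the `align:equivalentProperty` predicate, and the helpers `answers`, `inject_mismatch`, and `recall` are hypothetical stand-ins, not part of the released CLI or library. A real run would swap `answers` for a call to the actual engine and would also record timing.

```python
import copy

ALIGN = "align:equivalentProperty"  # hypothetical rule predicate

# Toy dataset in which every source already uses the queried vocabulary.
WEB = {
    "https://alice.example/profile": [("alice", "ex:name", "Alice")],
    "https://bob.example/profile": [("bob", "ex:name", "Bob")],
}
BASELINE = {"Alice", "Bob"}  # complete answer before any mismatch is injected


def answers(web, predicate):
    """Objects matching `predicate`, applying each document's own rules
    to that document only."""
    found = set()
    for triples in web.values():
        rules = {s: o for s, p, o in triples if p == ALIGN}
        for s, p, o in triples:
            if p != ALIGN and rules.get(p, p) == predicate:
                found.add(o)
    return found


def inject_mismatch(web, doc, old, new, publish_rule):
    """Copy `web`, rename predicate `old` to `new` in one document, and
    optionally publish the matching alignment rule in that document."""
    mutated = copy.deepcopy(web)
    mutated[doc] = [(s, new if p == old else p, o) for s, p, o in mutated[doc]]
    if publish_rule:
        mutated[doc].append((new, ALIGN, old))
    return mutated


def recall(web, baseline, predicate):
    """Fraction of the baseline answers still returned."""
    return len(answers(web, predicate) & baseline) / len(baseline)


if __name__ == "__main__":
    doc = "https://bob.example/profile"
    for publish in (True, False):
        mutated = inject_mismatch(WEB, doc, "ex:name", "other:fullName", publish)
        print(publish, recall(mutated, BASELINE, "ex:name"))
```

The case without a published rule is included on purpose: it marks the boundary of the claim, since an online approach can only recover results for mismatches whose rules are discoverable at runtime.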

Original abstract

Decentralized Knowledge Graphs querying enables integrating distributed data without centralization, but is highly sensitive to vocabulary heterogeneity. Query issuers cannot realistically anticipate all vocabulary mismatches, especially when alignment rules are local, scoped, or discovered at runtime. We present an online schema alignment approach for Link Traversal Query Processing (LTQP) that discovers, scopes, and applies alignment rules dynamically during query execution while preserving traversal behavior. This demo paper demonstrates the approach on a decentralized social-media scenario through a web interface built on a Comunica-based LTQP engine. Source code, a CLI, and a reusable library are publicly available. The demonstration shows that online schema alignment recovers complete query results with low overhead, providing a practical foundation for web-scale reasoning in LTQP systems.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

A working demo of runtime schema alignment for LTQP with public code, but the single-scenario illustration leaves the efficiency claims unquantified.


The main takeaway is a working demonstration of online schema alignment integrated into link traversal query processing. It shows how rules can be found and applied at runtime for a decentralized social media query scenario. The new part is the dynamic discovery and scoping during execution. This avoids the need to anticipate all vocabulary issues in advance, which is realistic for web-scale decentralized graphs. The implementation on Comunica with a web interface makes the idea concrete.

They did well to release the source code, a CLI, and a reusable library. That turns the demo into something others can inspect and extend directly. The approach preserves the original traversal behavior while adding the alignment step.

The soft spot is the lack of quantitative evaluation. The abstract mentions low overhead and complete results, but without numbers, baselines, or tests on multiple cases, it's hard to see how general or efficient it really is. One scenario is a start, but more would strengthen it.

This is aimed at people in the semantic web community who work on decentralized querying and vocabulary heterogeneity. A reader looking for practical solutions in LTQP would find the implementation details useful. The paper shows honest engagement with the problem through its public artifacts. It deserves peer review to get feedback on expanding the evaluation and clarifying the scope. Recommendation: send it for review in a demo or systems track.

Referee Report

1 major / 2 minor

Summary. The paper presents an online schema alignment approach for Link Traversal Query Processing (LTQP) in decentralized knowledge graphs. It demonstrates how alignment rules can be discovered, scoped, and applied dynamically at runtime during query execution on a decentralized social-media scenario, using a web interface built on a Comunica-based LTQP engine. The central claim is that this recovers complete query results with low overhead while preserving traversal behavior, with publicly available source code, CLI, and reusable library.

Significance. If the demonstration holds, the work provides a practical foundation for addressing vocabulary heterogeneity in web-scale decentralized querying, a key barrier to LTQP adoption. The explicit provision of executable code, CLI, and library is a clear strength, enabling direct reproduction and falsification of the single-scenario behavior. This enhances the manuscript's value as a demonstration paper even if broader quantitative evaluation is limited.

major comments (1)

[Demonstration] Demonstration section: the claims that online alignment 'recovers complete query results with low overhead' rest on a single illustrative scenario without reported quantitative metrics (e.g., execution time deltas, number of alignments discovered/applied, or baseline comparison without alignment). This makes the evidence for practicality illustrative rather than rigorously quantified, though the scoped claim remains directly testable via the released code.

minor comments (2)

The abstract and introduction could more explicitly note the single-scenario scope and absence of multi-scenario benchmarks to set reader expectations.
[Demonstration] Consider adding a small table or figure in the demonstration section that tabulates query completeness and overhead numbers for the social-media example.
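The numbers the referee asks for could be gathered with a small harness along these lines. This is a hedged Python sketch, not a description of the released tooling: `measure` and `report` are hypothetical helpers, and the two lambdas are stubs standing in for a run of the same query with alignment disabled and enabled.

```python
import statistics
import time


def measure(run, repeats=5):
    """Run `run` several times; return median wall-clock time and results."""
    timings, results = [], None
    for _ in range(repeats):
        start = time.perf_counter()
        results = run()
        timings.append(time.perf_counter() - start)
    return {"median_s": statistics.median(timings), "results": set(results)}


def report(baseline_run, aligned_run, expected):
    """Completeness of each configuration plus the timing delta."""
    base = measure(baseline_run)
    aligned = measure(aligned_run)
    return {
        "baseline_completeness": len(base["results"] & expected) / len(expected),
        "aligned_completeness": len(aligned["results"] & expected) / len(expected),
        "overhead_s": aligned["median_s"] - base["median_s"],
    }


if __name__ == "__main__":
    # Stubs: replace with calls that execute the query against the engine.
    expected = {"Alice", "Bob"}
    row = report(lambda: ["Alice"], lambda: ["Alice", "Bob"], expected)
    for key, value in row.items():
        print(f"{key}\t{value}")
```

Using the median over several repeats is a design choice to dampen network and warm-up noise; a count of rules discovered and applied would have to come from the engine itself and is left out here.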

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the positive assessment of our work, the recognition of its practical foundation for addressing vocabulary heterogeneity, and the value placed on the released code, CLI, and library. We address the single major comment below.


Referee: Demonstration section: the claims that online alignment 'recovers complete query results with low overhead' rest on a single illustrative scenario without reported quantitative metrics (e.g., execution time deltas, number of alignments discovered/applied, or baseline comparison without alignment). This makes the evidence for practicality illustrative rather than rigorously quantified, though the scoped claim remains directly testable via the released code.

Authors: As a demonstration paper, the manuscript intentionally focuses on illustrating the dynamic discovery, scoping, and application of alignment rules through a single, reproducible end-to-end scenario in a decentralized social-media setting, rather than conducting a broad quantitative study. The claims are scoped accordingly to what the demonstration shows: recovery of complete results with low overhead in this case, while preserving LTQP traversal. We agree that the manuscript reports no quantitative metrics such as execution time deltas, counts of alignments, or baseline comparisons, rendering the evidence illustrative. The released artifacts directly enable the quantitative verification noted by the referee. In revision we will add a short description of the alignments discovered and applied in the scenario to give readers more process-level insight without altering the demonstration-oriented scope.

Revision: partial

Circularity Check

0 steps flagged

No significant circularity


This is a scoped demonstration paper whose central claim is limited to observable behavior of a publicly available implementation on one decentralized social-media scenario. No equations, fitted parameters, predictions, or self-citation chains appear in the provided text; the demonstration is directly executable and falsifiable via the supplied code, CLI, and library. The work is therefore self-contained against external benchmarks with no load-bearing step that reduces to its own inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The paper relies on standard domain assumptions from decentralized KG querying and LTQP; no free parameters, invented entities, or ad-hoc axioms are introduced in the abstract.

axioms (1)

Domain assumption: Link traversal query processing preserves its behavior when alignment rules are discovered and applied dynamically at runtime.
Invoked in the description of the approach preserving traversal behavior.


