Query-Aware Spreading Activation for Multi-Hop Retrieval over Knowledge Graphs

Illia Makarov; Mykola Glybovets

arXiv: 2606.30133 · v1 · pith:CLPPRJ4Knew · submitted 2026-06-29 · 💻 cs.LG · cs.AI · cs.IR

Pith reviewed 2026-06-30 06:55 UTC · model grok-4.3

classification 💻 cs.LG · cs.AI · cs.IR

keywords spreading activation · query-aware retrieval · knowledge graphs · multi-hop question answering · graph RAG · semantic gate · Cypher query

The pith

A spreading-activation method with a single semantic gate performs query-aware multi-hop retrieval over knowledge graphs as a fixed-step process inside the database.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper sets out to demonstrate that query-aware traversal for graph-based retrieval can be achieved without loading the entire graph into memory or running variable-iteration solvers. It replaces the complex flow-diffusion approach with a spreading-activation procedure whose propagation weight at each step is the cosine similarity between an entity's stored description and the input question. If this replacement works, the full retrieval pipeline becomes expressible as one database query that returns top-ranked entities after a preset number of steps. A reader would care because the change removes the memory and integration barriers that currently separate query-aware methods from ordinary graph-database usage while preserving accuracy on multi-hop questions.

Core claim

The central claim is that a spreading-activation procedure with a per-step semantic gate, defined as the cosine similarity between an entity's description and the question, achieves the same kind of query-aware traversal as a flow-diffusion solver. The procedure uses a fixed number of iterations, never moves the graph out of the database, and is written as a single Cypher statement. On MuSiQue it reaches an exact-match score close to that of the flow-diffusion baseline QAFD-RAG (32.80 vs 33.50), although no significance test is reported for that gap, and it exceeds the strongest purely structural baseline, HippoRAG, by 5.3 exact-match points. An ablation that disables the gate shows answer quality dropping and latency rising at the same time.

What carries the argument

The semantic gate, a per-step multiplier equal to the cosine similarity between the candidate entity's description and the question, which modulates propagation strength during spreading activation.
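The gate can be illustrated outside the database with a minimal Python sketch. It is not the paper's Cypher implementation: the function names, the dictionary-based graph, the `decay` parameter, the clipping of negative similarities to zero, and the normalisation of seed scores by their maximum are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def spread(graph, embeddings, question_vec, seeds, steps=2, decay=0.5, use_gate=True):
    """Fixed-step spreading activation with an optional semantic gate.

    graph:      dict mapping a node to the list of its neighbours (hypothetical layout)
    embeddings: dict mapping a node to its description embedding
    seeds:      dict mapping seed nodes to initial scores
    """
    mx = max(seeds.values())
    activation = {n: s / mx for n, s in seeds.items()}  # assumed seed normalisation
    frontier = dict(activation)
    for _ in range(steps):  # preset number of steps, no convergence test
        nxt = defaultdict(float)
        for node, act in frontier.items():
            for nb in graph.get(node, ()):
                # The gate: similarity between the candidate's description and the question.
                gate = max(0.0, cosine(embeddings[nb], question_vec)) if use_gate else 1.0
                nxt[nb] += decay * act * gate
        for n, a in nxt.items():
            activation[n] = activation.get(n, 0.0) + a
        frontier = nxt
    return activation


def top_k(activation, k):
    """Return the k highest-scoring (node, score) pairs."""
    return sorted(activation.items(), key=lambda kv: kv[1], reverse=True)[:k]


# Toy graph: A is the seed, B and D lie on the relevant path, C and E are distractors.
graph = {"A": ["B", "C"], "B": ["D"], "C": ["E"]}
embeddings = {"A": [1, 0], "B": [1, 0], "C": [0, 1], "D": [1, 0], "E": [0, 1]}
question = [1, 0]

gated = spread(graph, embeddings, question, {"A": 2.0})
print(top_k(gated, 3))  # [('A', 1.0), ('B', 0.5), ('D', 0.25)]
```

In this toy setting the distractor branch receives zero activation because its descriptions are orthogonal to the question; real embeddings would give graded rather than binary gate values.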

If this is right

Multi-hop retrieval becomes possible in a single round-trip to the graph database without materializing the full graph in application memory.
Retrieval latency decreases by a factor between 1.5 and 4.9 compared with the same procedure run with the gate disabled.
On MuSiQue, the exact-match level previously reported for the flow-diffusion solver is approached with a fixed-iteration, database-native procedure.
Disabling the gate simultaneously lowers answer quality and raises latency, confirming that the gate supplies both the query-awareness and the efficiency benefit.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The fixed-iteration design could be combined with early-stopping heuristics based on score saturation to handle graphs of varying diameter without manual tuning.
Because the gate operates on textual descriptions, the method may transfer to graphs whose nodes carry richer metadata such as temporal or provenance attributes.
Expressing the entire pipeline as one database query opens the possibility of pushing further post-processing steps, such as context assembly, inside the same transaction.
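The early-stopping idea in the first extension above could look roughly like the following sketch. It runs in application code rather than inside a single Cypher statement, so it departs from the paper's database-native design. The saturation criterion (newly added activation falling below `eps`), the precomputed `gate` dictionary, and all names are hypothetical.

```python
def spread_until_saturated(graph, gate, seeds, decay=0.5, eps=1e-3, max_steps=10):
    """Spreading activation that stops once a step adds less than eps total activation.

    graph: dict mapping a node to the list of its neighbours
    gate:  dict mapping a node to its precomputed question similarity
    Returns (activation, number_of_steps_executed).
    """
    activation = dict(seeds)
    frontier = dict(seeds)
    for step in range(1, max_steps + 1):
        nxt = {}
        for node, act in frontier.items():
            for nb in graph.get(node, ()):
                nxt[nb] = nxt.get(nb, 0.0) + decay * act * gate.get(nb, 0.0)
        added = sum(nxt.values())  # activation mass contributed by this step
        for n, a in nxt.items():
            activation[n] = activation.get(n, 0.0) + a
        if added < eps:  # scores have saturated, further steps change little
            return activation, step
        frontier = nxt
    return activation, max_steps


# A three-node chain: propagation dies out once the end of the chain is reached.
chain = {"A": ["B"], "B": ["C"], "C": []}
ones = {"A": 1.0, "B": 1.0, "C": 1.0}
scores, steps_used = spread_until_saturated(chain, ones, {"A": 1.0})
print(steps_used)  # 3
```

A looser `eps` trades depth for speed; the cap `max_steps` keeps the worst case bounded in the same way a fixed iteration count does.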

Load-bearing premise

Cosine similarity between stored entity descriptions and the question text supplies a reliable relevance signal at every propagation step, and a preset number of steps is enough to surface the needed entities.

What would settle it

Run the same retrieval task on the identical graph with the gate replaced by a constant weight of 1; if answer quality and latency remain unchanged or improve, the gate is not the source of the reported gains.
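A small harness for this test might be structured as below. This is a sketch under stated assumptions: the propagation rule, the `similarity` dictionary, and the "touched nodes" count are illustrative stand-ins, and the timing of an in-memory toy says nothing about Neo4j latency. The point is only that the two configurations differ in one line.

```python
import time


def propagate(graph, seeds, weight, steps=2, decay=0.5):
    """Fixed-step propagation; weight(node) is the per-step multiplier."""
    activation, frontier = dict(seeds), dict(seeds)
    for _ in range(steps):
        nxt = {}
        for node, act in frontier.items():
            for nb in graph.get(node, ()):
                contrib = decay * act * weight(nb)
                if contrib > 0.0:  # nodes receiving no activation are never expanded
                    nxt[nb] = nxt.get(nb, 0.0) + contrib
        for n, a in nxt.items():
            activation[n] = activation.get(n, 0.0) + a
        frontier = nxt
    return activation


def compare(graph, seeds, similarity, k=3):
    """Run the gated and constant-weight configurations on the same graph."""
    configs = {
        "gated": lambda n: max(0.0, similarity[n]),
        "constant": lambda n: 1.0,  # the gate replaced by a constant weight of 1
    }
    report = {}
    for name, weight in configs.items():
        start = time.perf_counter()
        act = propagate(graph, seeds, weight)
        elapsed = time.perf_counter() - start
        ranked = sorted(act, key=lambda n: (-act[n], n))[:k]
        report[name] = {"top_k": ranked, "touched": len(act), "seconds": elapsed}
    return report


# Toy data (made up): "noise" and its neighbours are unrelated to the question.
graph = {"q": ["rel", "noise"], "rel": ["ans"], "noise": ["x1", "x2"]}
similarity = {"rel": 0.9, "noise": 0.0, "ans": 0.8, "x1": 0.0, "x2": 0.0}
for name, row in compare(graph, {"q": 1.0}, similarity).items():
    print(name, row["top_k"], row["touched"])
```

In a real replication, answer quality would be scored downstream on the full benchmark and latency measured at the database, with both configurations run against the identical graph.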

Figures

Figures reproduced from arXiv: 2606.30133 by Illia Makarov, Mykola Glybovets.

**Figure 1.** Two-phase architecture. The indexing phase (top) extracts entities and relations from the corpus via an LLM, resolves and embeds them, and stores the result as a typed knowledge graph in Neo4j. The query phase (bottom) uses an LLM to map the question to seed nodes, runs spreading activation for T iterations in a single Cypher query, and passes the activated subgraph to an LLM to generate the answer.

**Figure 2.** Knowledge-graph schema. Document and paragraph nodes preserve the original structure of the corpus; …

**Figure 3.** Sensitivity analysis over the propagation depth (left) and the decay factor (right) at …

Original abstract

Retrieval-augmented generation built on knowledge graphs (Graph RAG) outperforms flat passage retrieval on multi-hop question answering by leveraging graph structure. In most existing systems, however, the question only sets the seed nodes; the subsequent traversal becomes "query-blind", depending solely on the graph structure. The exception is QAFD-RAG, which implements query-aware traversal via a flow-diffusion solver with combined edge re-weighting. This architecture requires loading the full graph into Python memory and an iterative solver with a variable number of iterations complicating integration with the graph database. We propose a spreading-activation method that achieves the same query-aware traversal with a single per-step semantic gate: the step weight is the cosine similarity between the candidate entity's description and the question, and the number of iterations is fixed. The whole retrieval procedure - seed mapping, propagation, top-K selection and context assembly - is expressed as a single Cypher query executed in one round-trip to Neo4j; the graph never leaves the database. On MuSiQue our method matches QAFD-RAG by exact match (32.80 vs 33.50) and outperforms the strongest purely-structural baseline in our comparison, HippoRAG, by 5.3 EM and 3.4 F1; on 2WikiMultiHopQA HippoRAG and QAFD-RAG retain an advantage due to their phrase-node architectures. An ablation with the gate disabled confirms that the gate is the source of a simultaneous F1 gain of 3.6 to 7.4 points and a retrieval-latency reduction by a factor of 1.5 to 4.9.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

A single-Cypher-query spreading activation with a cosine gate that matches QAFD-RAG on MuSiQue and cuts latency, but the gate's reliability as a query-aware filter is the open question.

The main takeaway is that this paper delivers a practical, database-resident alternative to flow-diffusion solvers for query-aware graph retrieval. Instead of loading the graph into Python and running a variable number of iterations, the authors encode seed mapping, per-step cosine gating on entity descriptions, fixed-step propagation, and top-K selection as one Cypher query against Neo4j. The graph stays in the store and the round-trip count drops to one.

What works is the engineering and the ablation. The gate produces reported F1 lifts of 3.6-7.4 points and latency reductions of 1.5-4.9x. On MuSiQue the method lands within 0.7 EM of QAFD-RAG and beats HippoRAG by 5.3 EM. The single-query design is a genuine simplification for anyone already committed to Neo4j.

The softer part is whether the cosine gate actually steers activation toward answer paths in a robust way. The ablation shows the gate helps, yet there is no per-hop similarity breakdown, no check on description length or noise, and no test of whether fixed iterations cover the needed depth when distractors are lexically close. The fact that the method trails on 2WikiMultiHopQA suggests the approach is sensitive to how nodes and descriptions are constructed.

This is for engineers building Graph RAG pipelines who want to stay inside the database rather than for theorists. It deserves peer review. The implementation is concrete, the ablation gives measurable evidence, and the latency claim is worth checking even if the gate needs more diagnostics to stand as a general solution.

Referee Report

2 major / 1 minor

Summary. The paper proposes a spreading-activation procedure for query-aware multi-hop retrieval in Graph RAG. It replaces QAFD-RAG's flow-diffusion solver with a fixed-iteration propagation whose per-step weight is the cosine similarity between each candidate entity's description and the input question; the entire pipeline (seed mapping, gated propagation, top-K selection, context assembly) is expressed as one Cypher query against Neo4j. On MuSiQue the method reports 32.80 EM (vs. 33.50 for QAFD-RAG) and outperforms HippoRAG by 5.3 EM / 3.4 F1; an ablation attributes 3.6–7.4 F1 points and a 1.5–4.9× latency reduction to the semantic gate.

Significance. If the central performance claims hold, the work supplies a practical, database-resident alternative to iterative solvers that require full-graph materialization in Python. The single-round-trip Cypher formulation and the explicit ablation of the gate are concrete strengths that could ease deployment and reproducibility.

major comments (2)

[Abstract] The claim that the method 'achieves the same query-aware traversal' as QAFD-RAG rests on the untested premise that cosine similarity between entity descriptions and the question reliably up-weights answer-containing paths at each hop. No per-hop similarity statistics, gold-path vs. distractor analysis, or description-quality breakdown is supplied, so the reported parity (32.80 vs 33.50 EM) could be coincidental rather than evidence of equivalent query-awareness.
[Experimental results] In the abstract and the ablation paragraph, the F1 gains attributed to the gate (3.6–7.4 points) and the latency reduction are presented without variance estimates, number of runs, or statistical significance tests, and without specifying the exact baseline configuration used for the latency comparison. These omissions make it impossible to judge whether the reported improvements are robust.

minor comments (1)

[Abstract] The phrase 'phrase-node architectures' is used to explain the 2WikiMultiHopQA results but is not defined or referenced to prior work in the abstract or visible method description.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. We address each major comment below with clarifications and commitments to revisions that strengthen the empirical support without altering the core claims.

Referee: [Abstract] The claim that the method 'achieves the same query-aware traversal' as QAFD-RAG rests on the untested premise that cosine similarity between entity descriptions and the question reliably up-weights answer-containing paths at each hop. No per-hop similarity statistics, gold-path vs. distractor analysis, or description-quality breakdown is supplied, so the reported parity (32.80 vs 33.50 EM) could be coincidental rather than evidence of equivalent query-awareness.

Authors: The equivalence claim is grounded in the architectural analogy (query-dependent weighting at each propagation step) together with the observed performance parity and the ablation isolating the gate's contribution. We nevertheless agree that direct validation of the premise would be stronger. The revised manuscript will add a dedicated analysis subsection reporting per-hop cosine similarity distributions, gold-path versus distractor comparisons, and a brief description-quality breakdown on MuSiQue.
Revision: yes

Referee: [Experimental results] In the abstract and the ablation paragraph, the F1 gains attributed to the gate (3.6–7.4 points) and the latency reduction are presented without variance estimates, number of runs, or statistical significance tests, and without specifying the exact baseline configuration used for the latency comparison. These omissions make it impossible to judge whether the reported improvements are robust.

Authors: We accept that the current reporting lacks these statistical details. The revised version will state the number of runs performed, report means with standard deviations, include paired significance tests where appropriate, and explicitly document the baseline configurations, hardware, and measurement protocol used for all latency figures.
Revision: yes
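The reporting promised in this response could be produced with standard-library tools along the following lines. This is a generic sketch, not the authors' protocol: the paired bootstrap is one common choice of paired test, the function names are hypothetical, and the numbers in the usage example are made-up placeholders rather than results from the paper.

```python
import random
import statistics


def mean_std(values):
    """Mean and sample standard deviation across repeated runs."""
    return statistics.mean(values), statistics.stdev(values)


def paired_bootstrap(scores_a, scores_b, n_resamples=10_000, seed=0):
    """Paired bootstrap over per-question scores.

    Returns the fraction of resamples in which system A does not beat
    system B on the summed paired difference (smaller favours A).
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("scores must be paired per question")
    rng = random.Random(seed)  # fixed seed so the report is reproducible
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    not_better = 0
    for _ in range(n_resamples):
        sample = [diffs[rng.randrange(n)] for _ in range(n)]
        if sum(sample) <= 0:
            not_better += 1
    return not_better / n_resamples


# Placeholder per-question F1 values for a gated and an ungated run (not real data).
gated_f1 = [0.8, 0.6, 1.0, 0.4, 0.7]
ungated_f1 = [0.7, 0.6, 0.9, 0.5, 0.6]
print(paired_bootstrap(gated_f1, ungated_f1, n_resamples=2_000))

# Placeholder latencies in milliseconds from three repeated runs (not real data).
print(mean_std([1.0, 2.0, 3.0]))  # (2.0, 1.0)
```

Pairing by question matters here because both configurations answer the same questions, so per-question differences carry less noise than comparing two aggregate scores.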

Circularity Check

0 steps flagged

No circularity in derivation or claims

The paper introduces a spreading-activation procedure whose core operation (per-step multiplication by cosine similarity between entity description and query) is an explicit design choice, not derived from prior results or fitted parameters within the paper. Performance numbers are reported from direct experiments on MuSiQue and 2WikiMultiHopQA; the ablation isolating the gate is likewise an empirical measurement. No equations, uniqueness theorems, or self-citations appear that would reduce the central claim to a tautology or to the inputs by construction. The method is presented as an independent, simpler alternative to QAFD-RAG's flow-diffusion solver.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the effectiveness of the cosine similarity gate and the assumption that the graph contains suitable entity descriptions; no free parameters or invented entities are introduced.

axioms (1)

Domain assumption: Cosine similarity between entity descriptions and the question is a valid measure of relevance at each propagation step.
This assumption underlies the per-step semantic gate in the spreading-activation process.
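One way to write the assumption down is sketched here. The notation is not taken from the paper: the embeddings $e_v$ and $e_q$, the decay factor $\delta$, the frontier flow $f_t$, the accumulated activation $a_t$, and the neighbourhood $N(v)$ are symbols introduced purely for illustration, and the exact update rule used by the authors may differ.

```latex
% Gate: similarity between the description embedding of node v and the question embedding.
g_q(v) = \cos(e_v, e_q) = \frac{e_v^{\top} e_q}{\lVert e_v \rVert \, \lVert e_q \rVert}

% Fixed-step propagation for t = 0, \dots, T-1, starting from normalised seed scores a_0 = f_0.
f_{t+1}(v) = \delta \, g_q(v) \sum_{u \in N(v)} f_t(u),
\qquad
a_{t+1}(v) = a_t(v) + f_{t+1}(v)
```

Under this reading the assumption amounts to saying that $g_q(v)$ is larger for nodes on answer-bearing paths than for distractors at every step $t$, and that $T$ steps suffice to reach the needed nodes.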

pith-pipeline@v0.9.1-grok · 5839 in / 1194 out tokens · 38250 ms · 2026-06-30T06:55:46.355664+00:00 · methodology
