ContextRAG: Extraction-Free Hierarchical Graph Construction for Retrieval-Augmented Generation

Roman Prosvirnin; Sergei Kuznetsov; Seungmin Jin

arxiv: 2605.19735 · v1 · pith:TQ7AST4Inew · submitted 2026-05-19 · 💻 cs.CL · cs.AI

Pith reviewed 2026-05-20 05:09 UTC · model grok-4.3

keywords: ContextRAG, graph RAG, fuzzy concept analysis, retrieval-augmented generation, multi-hop reasoning, embedding-based indexing, Lukasiewicz logic, residual quantization

The pith

ContextRAG builds a fuzzy concept graph from chunk embeddings alone, replacing LLM entity extraction with residual-quantization k-means and Lukasiewicz residuated logic to cut indexing costs while supporting multi-hop RAG.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that a hierarchical graph for retrieval-augmented generation can be derived directly from embeddings without any LLM calls for entities, relations, or summaries. It does so by clustering embeddings with residual-quantization k-means, then applying Formal Concept Analysis under Lukasiewicz logic to produce fuzzy concept nodes through soft join and meet operations. A sympathetic reader would care because conventional graph RAG systems incur token and latency costs that scale linearly with corpus size, often requiring hundreds of LLM calls even for modest task sets. ContextRAG demonstrates the approach on a 130-task UltraDomain subset, using only 30 LLM calls and 22k tokens total while recording 33.6% F1 overall and 36.8% on multi-hop questions.
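As a rough illustration of the clustering step, the sketch below runs residual-quantization k-means with NumPy: each level clusters whatever the previous levels failed to explain. It is a minimal sketch under stated assumptions, not the authors' implementation. The function names `kmeans` and `residual_quantize`, the cluster count `k`, and the number of levels are all hypothetical; the paper's actual hyper-parameters are not given on this page.

```python
import numpy as np

def kmeans(X, k, n_iter=20, seed=0):
    """Plain Lloyd's k-means; returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every centroid.
        d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # leave an empty cluster's centroid unchanged
                centroids[j] = members.mean(axis=0)
    # Final assignment against the final centroids.
    d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return centroids, d.argmin(axis=1)

def residual_quantize(X, k=4, levels=2, seed=0):
    """Cluster X, subtract the assigned centroid, then cluster the residual."""
    residual = X.astype(float).copy()
    codes, codebooks = [], []
    for level in range(levels):
        centroids, labels = kmeans(residual, k, seed=seed + level)
        codes.append(labels)
        codebooks.append(centroids)
        residual = residual - centroids[labels]
    # codes[i, l] is the cluster id of chunk i at level l.
    return np.stack(codes, axis=1), codebooks, residual

# Toy stand-in for chunk embeddings (random data, not from the paper).
X = np.random.default_rng(0).normal(size=(40, 8))
codes, codebooks, residual = residual_quantize(X, k=4, levels=2)
```

Each chunk ends up with one code per level, which gives a coarse-to-fine grouping that a later lattice step could treat as attributes.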

Core claim

ContextRAG derives a fuzzy concept graph over chunk embeddings using residual-quantization k-means and Formal Concept Analysis with Lukasiewicz residuated logic. Bridge-like and meet-derived context nodes are induced by soft fuzzy join and meet operations rather than by LLM-written graph edges. On a 130-task UltraDomain subset the resulting index requires 30 LLM calls and 22,073 tokens; the system obtains 33.6% F1 overall and 36.8% F1 on multi-hop tasks. Queries that retrieve at least one lattice-derived node in the top five show a +3.9 percentage point F1 advantage.

What carries the argument

fuzzy concept graph constructed by residual-quantization k-means followed by Formal Concept Analysis under Lukasiewicz residuated logic, with soft join and meet operations inducing bridge and meet-derived context nodes
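To make the fuzzy machinery concrete, here is a minimal sketch of the Łukasiewicz t-norm and residuum together with the standard fuzzy-FCA derivation operators and the lattice meet and join built from them. The membership matrix `I` (chunks by clusters, values in [0, 1]) and every function name are assumptions made for illustration; how the paper actually derives memberships from embeddings is not described on this page.

```python
import numpy as np

def t_norm(a, b):
    """Łukasiewicz t-norm: max(0, a + b - 1)."""
    return np.maximum(0.0, a + b - 1.0)

def residuum(a, b):
    """Łukasiewicz residuum: min(1, 1 - a + b)."""
    return np.minimum(1.0, 1.0 - a + b)

def up(A, I):
    """Intent of a fuzzy object set A: B(y) = min over x of A(x) -> I(x, y)."""
    return residuum(A[:, None], I).min(axis=0)

def down(B, I):
    """Extent of a fuzzy attribute set B: A(x) = min over y of B(y) -> I(x, y)."""
    return residuum(B[None, :], I).min(axis=1)

def concept_from_objects(A, I):
    """Close a fuzzy object set into a formal fuzzy concept (extent, intent)."""
    B = up(A, I)
    return down(B, I), B

def meet(c1, c2, I):
    """Meet: intersect the extents (pointwise min), then derive the intent."""
    A = np.minimum(c1[0], c2[0])
    return A, up(A, I)

def join(c1, c2, I):
    """Join: intersect the intents (pointwise min), then derive the extent."""
    B = np.minimum(c1[1], c2[1])
    return down(B, I), B

# Hypothetical membership degrees: 3 chunks (rows) x 2 clusters (columns).
I = np.array([[1.0, 0.2],
              [0.8, 0.6],
              [0.1, 0.9]])
c1 = concept_from_objects(np.array([1.0, 0.0, 0.0]), I)
c2 = concept_from_objects(np.array([0.0, 0.0, 1.0]), I)
m = meet(c1, c2, I)
j = join(c1, c2, I)
```

In this reading, a meet-derived node collects what two concepts share in extent, while a join plays the role of a bridge that covers both.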

If this is right

Indexing cost drops from hundreds of LLM calls and millions of tokens to 30 calls and roughly 22k tokens on the evaluated task set.
Overall F1 of 33.6% and multi-hop F1 of 36.8% are achieved on the 130-task UltraDomain subset.
Queries that surface at least one lattice-derived node among the top five retrieved items gain 3.9 percentage points F1.
Graph construction remains stable on the full task set where an extraction-based baseline fails during scaling.
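The activation comparison in the list above amounts to splitting queries by whether a lattice-derived node shows up in the top five and comparing mean F1 across the two groups. The sketch below shows that bookkeeping. The record layout (`retrieved`, `kind`, `f1`) is hypothetical and the toy numbers are invented for illustration; they are not the paper's data.

```python
def activation_gap(records, k=5):
    """Mean F1 of queries with a lattice node in the top k, minus the rest.

    Each record is assumed to look like:
      {"retrieved": [{"kind": "chunk" | "lattice"}, ...], "f1": float}
    """
    hit, miss = [], []
    for r in records:
        top = r["retrieved"][:k]
        activated = any(node["kind"] == "lattice" for node in top)
        (hit if activated else miss).append(r["f1"])

    def mean(xs):
        return sum(xs) / len(xs) if xs else float("nan")

    return mean(hit) - mean(miss), len(hit), len(miss)

# Invented toy records, only to exercise the function.
toy = [
    {"retrieved": [{"kind": "lattice"}, {"kind": "chunk"}], "f1": 0.5},
    {"retrieved": [{"kind": "chunk"}, {"kind": "lattice"}], "f1": 1.0},
    {"retrieved": [{"kind": "chunk"}], "f1": 0.25},
    {"retrieved": [{"kind": "chunk"}, {"kind": "chunk"}], "f1": 0.75},
]
gap, n_hit, n_miss = activation_gap(toy)
```

A gap computed this way is a group difference only. The two groups may differ in question difficulty or topic, which is why such a result stays diagnostic rather than causal.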

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same embedding-driven lattice construction could be applied to corpora orders of magnitude larger where extraction costs become prohibitive.
Soft fuzzy operations may surface relational patterns that crisp entity-relation extraction overlooks in noisy or domain-specific text.
The diagnostic link between lattice-node retrieval and higher F1 suggests a natural test: whether forcing retrieval of such nodes at inference time further lifts accuracy.

Load-bearing premise

The relational structure captured by the fuzzy concept lattice induced from embeddings is sufficient to support effective retrieval-augmented generation on multi-hop questions without any LLM-based entity or relation extraction.

What would settle it

A controlled run on the same UltraDomain subset in which all lattice-derived nodes are withheld from retrieval and performance falls back to the level of a plain vector baseline on multi-hop items.
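One way such a controlled run might be wired up is a retrieval function that can withhold a whole class of nodes, so the same queries are answered with and without lattice-derived nodes. This is a sketch under assumptions: the node labels `"chunk"` and `"lattice"`, the cosine scoring, and the function name `retrieve` are all hypothetical and not taken from the paper.

```python
import numpy as np

def retrieve(query, node_vecs, node_kinds, k=5, exclude_kinds=()):
    """Cosine top-k over index nodes, optionally withholding some node kinds."""
    keep = np.array([kind not in exclude_kinds for kind in node_kinds])
    idx = np.flatnonzero(keep)  # positions of the nodes still eligible
    vecs = node_vecs[idx]
    norms = np.linalg.norm(vecs, axis=1) * np.linalg.norm(query) + 1e-12
    sims = (vecs @ query) / norms
    order = np.argsort(-sims)[:k]
    return idx[order].tolist()  # indices into the original node list

# Toy index: two chunk nodes and two lattice-derived nodes (invented vectors).
node_vecs = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.7, 0.7]])
node_kinds = ["chunk", "lattice", "chunk", "lattice"]
query = np.array([1.0, 0.0])

full = retrieve(query, node_vecs, node_kinds, k=2)
ablated = retrieve(query, node_vecs, node_kinds, k=2, exclude_kinds=("lattice",))
```

Running the downstream answer generation on `full` and `ablated` contexts for the same questions, and comparing multi-hop F1, would be the comparison described above.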

Original abstract

Graph-structured retrieval-augmented generation (RAG) systems can improve answer quality on multi-hop questions, but many current systems rely on large language models (LLMs) to extract entities, relations, and summaries during indexing. These calls add token and wall-clock costs that grow with corpus size. We present ContextRAG, a graph RAG system whose graph topology is constructed without LLM-based entity or relation extraction. ContextRAG derives a fuzzy concept graph over chunk embeddings using residual-quantization k-means and Formal Concept Analysis with Lukasiewicz residuated logic. Bridge-like and meet-derived context nodes are induced by soft fuzzy join and meet operations, rather than by LLM-written graph edges. On a 130-task UltraDomain subset, ContextRAG builds its index with 30 LLM calls and 22,073 tokens. In contrast, a local HiRAG reproduction stress test required 870 indexing calls and 3.54M tokens on a 20-task subset before failing during graph construction; linear extrapolation to 130 tasks implies over 23M indexing tokens. ContextRAG obtains 33.6% F1 overall and 36.8% F1 on multi-hop tasks. An activation analysis shows that queries retrieving at least one lattice-derived node in the top five achieve +3.9 percentage points F1 over queries that do not; this association is diagnostic rather than causal.
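The linear extrapolation quoted in the abstract can be reproduced with a few lines of arithmetic. The sketch below uses only the figures stated there. Note that 3.54M is itself a rounded value, so the projected totals are approximate, and the projection assumes cost grows in proportion to the number of tasks.

```python
# Figures as stated in the abstract.
hirag_calls, hirag_tokens, hirag_tasks = 870, 3_540_000, 20
contextrag_calls, contextrag_tokens = 30, 22_073
target_tasks = 130

# Linear scaling from the 20-task stress test to the 130-task subset.
scale = target_tasks / hirag_tasks
est_tokens = hirag_tokens * scale  # projected HiRAG indexing tokens
est_calls = hirag_calls * scale    # projected HiRAG indexing calls

# Projected HiRAG tokens relative to ContextRAG's measured tokens.
token_ratio = est_tokens / contextrag_tokens

print(f"scale={scale}, est_tokens={est_tokens:,.0f}, est_calls={est_calls:,.0f}")
print(f"projected token ratio ~ {token_ratio:,.0f}x")
```

The projection gives about 23.0M tokens, consistent with the "over 23M" in the abstract. Because the HiRAG run failed before completing graph construction, this remains an estimate rather than a measurement.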

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

ContextRAG builds a fuzzy hierarchical graph for RAG directly from chunk embeddings, using residual-quantization k-means and Formal Concept Analysis under Lukasiewicz logic. This cuts indexing tokens sharply, but the paper offers no ablation showing that the lattice structure itself drives the gains.

The paper's core move is to skip LLM extraction entirely when building the graph. It clusters embeddings with residual-quantization k-means, then runs Formal Concept Analysis under Lukasiewicz residuated logic to produce a fuzzy concept lattice. Bridge and meet-derived nodes come from the soft join and meet operations on that lattice rather than from any LLM-written edges or summaries.

On the 130-task UltraDomain subset this uses only 30 LLM calls and 22k tokens total for indexing. The HiRAG reproduction they ran hit 870 calls and 3.5M tokens on a 20-task slice before failing, with linear extrapolation pointing to over 23M tokens at full size. Those efficiency numbers are the clearest practical signal in the work. The reported F1 of 33.6% overall and 36.8% on multi-hop tasks is given directly, and the activation check shows a 3.9-point lift when at least one lattice node appears in the top-five retrievals. That check is labeled diagnostic, which is honest.

The construction itself is defined from embedding operations and standard FCA constructs, so there is no obvious circularity in the performance claims. The soft spot is the missing causal test. There is no ablation that keeps the same chunks and embeddings but removes the fuzzy lattice step, so it remains possible the gains trace to embedding similarity or chunking rather than the induced hierarchy. The HiRAG comparison rests on extrapolation from a small subset, and the abstract gives no error bars, full protocol, or dataset splits. These are real gaps but not load-bearing contradictions.

The paper is for groups trying to scale graph RAG to larger corpora while keeping indexing costs manageable. A reader already working on embedding-based or lattice methods could extract the construction details and test them. It shows clear enough thinking on its own terms to go to a serious referee, though the review would need to press for ablations and tighter controls on the multi-hop claims.

Referee Report

2 major / 2 minor

Summary. ContextRAG constructs a fuzzy concept graph for RAG over chunk embeddings via residual-quantization k-means and Formal Concept Analysis using Lukasiewicz residuated logic, inducing bridge-like and meet-derived nodes through soft fuzzy join/meet operations without any LLM-based entity or relation extraction. On a 130-task UltraDomain subset the index is built with 30 LLM calls and 22,073 tokens; a HiRAG stress-test reproduction on a 20-task subset already required 870 calls and 3.54M tokens before failing. The system reports 33.6% overall F1 and 36.8% F1 on multi-hop tasks, together with a diagnostic activation analysis showing a +3.9 pp F1 lift when at least one lattice-derived node appears in the top-5 retrieved items.

Significance. If the central claim holds, the work offers a concrete route to scalable graph RAG that avoids the token and latency costs of LLM extraction at indexing time. The reported token counts, the explicit comparison with HiRAG, and the use of standard FCA constructs on top of residual-quantized embeddings constitute measurable strengths. The diagnostic character of the activation analysis, however, leaves open whether the observed gains are attributable to the induced hierarchical topology or simply to the underlying embeddings and chunking.

major comments (2)

[Abstract] The reported +3.9 pp F1 association between retrieval of lattice-derived nodes and answer quality is explicitly labeled diagnostic rather than causal. No ablation is described that retains the same chunk embeddings while removing the residual-quantization k-means + Lukasiewicz FCA construction, so it remains possible that performance gains trace to embedding similarity alone rather than to the induced bridge and meet-derived nodes.
[Evaluation] The HiRAG stress-test comparison is performed on a 20-task subset with linear extrapolation to 130 tasks. Because graph construction failed before completion, the extrapolated 23M-token figure is not a measured quantity, which weakens the efficiency claim.

minor comments (2)

The manuscript does not report error bars, full dataset splits, or the precise hyper-parameters of the residual-quantization k-means (number of clusters is listed as a free parameter).
Notation for the soft fuzzy join and meet operations should be introduced with an explicit equation rather than described only in prose.
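For reference, the kind of explicit notation this comment asks for might look like the following. It is a sketch of the standard fuzzy Formal Concept Analysis definitions over a fuzzy context with objects X, attributes Y, and membership relation I, instantiated with the Łukasiewicz operations. The symbols are assumptions chosen for illustration; the paper's own notation is not reproduced on this page and may differ.

```latex
\begin{align}
a \otimes b &= \max(0,\ a + b - 1),
  & a \rightarrow b &= \min(1,\ 1 - a + b), \\
A^{\uparrow}(y) &= \bigwedge_{x \in X} \bigl( A(x) \rightarrow I(x, y) \bigr),
  & B^{\downarrow}(x) &= \bigwedge_{y \in Y} \bigl( B(y) \rightarrow I(x, y) \bigr), \\
(A_1, B_1) \wedge (A_2, B_2) &= \bigl( A_1 \cap A_2,\ (B_1 \cup B_2)^{\downarrow\uparrow} \bigr),
  & (A_1, B_1) \vee (A_2, B_2) &= \bigl( (A_1 \cup A_2)^{\uparrow\downarrow},\ B_1 \cap B_2 \bigr).
\end{align}
```

Here a pair (A, B) is a fuzzy concept when A^↑ = B and B^↓ = A, and the intersections and unions of fuzzy sets are taken pointwise as minimum and maximum.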

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed comments. We respond to each major comment below and indicate the changes we will make to the manuscript.

Referee: [Abstract] The reported +3.9 pp F1 association between retrieval of lattice-derived nodes and answer quality is explicitly labeled diagnostic rather than causal. No ablation is described that retains the same chunk embeddings while removing the residual-quantization k-means + Lukasiewicz FCA construction, so it remains possible that performance gains trace to embedding similarity alone rather than to the induced bridge and meet-derived nodes.

Authors: We agree that the activation analysis is correlational and does not constitute a causal demonstration. An ablation that holds the chunk embeddings and retrieval pipeline fixed while removing only the residual-quantization k-means and Lukasiewicz FCA steps would provide clearer attribution. We will add this ablation to the revised manuscript, reporting the resulting F1 scores on the same 130-task subset so that readers can directly compare the contribution of the induced hierarchical nodes.
Revision: yes

Referee: [Evaluation] The HiRAG stress-test comparison is performed on a 20-task subset with linear extrapolation to 130 tasks; because graph-construction failure occurred before completion, the extrapolated 23M token figure is not a measured quantity and weakens the efficiency claim.

Authors: The referee correctly observes that the 23M token figure is an extrapolation rather than a direct measurement. In the revised evaluation section we will (i) report the exact measured call and token counts from the 20-task HiRAG run as the primary data point, (ii) explicitly label the 23M figure as a linear extrapolation, and (iii) note that the failure occurred during graph construction, thereby avoiding any implication that the extrapolated number is an observed quantity.
Revision: yes

Circularity Check

0 steps flagged

No significant circularity; ContextRAG graph construction is an explicit algorithmic procedure from embeddings and standard FCA constructs

The paper's derivation chain defines the fuzzy concept graph directly via residual-quantization k-means on chunk embeddings followed by Formal Concept Analysis using Lukasiewicz residuated logic, with bridge-like and meet-derived nodes produced by soft fuzzy join and meet operations. These steps constitute a constructive definition of the index rather than a self-referential loop or a fitted parameter renamed as a prediction. Reported results such as 33.6% F1 overall, 36.8% F1 on multi-hop tasks, and the +3.9pp diagnostic activation difference are presented as measured empirical outcomes on the UltraDomain subset, not quantities entailed by the construction equations themselves. No load-bearing self-citations, uniqueness theorems imported from prior author work, or ansatzes smuggled via citation appear in the central claims; the extraction-free property follows by design from the embedding-based operations, rendering the approach self-contained.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central construction rests on applying established mathematical frameworks to modern embeddings; no new physical entities or ad-hoc fitted constants are introduced beyond standard clustering and fuzzy-logic parameters.

free parameters (1)

number of clusters in residual-quantization k-means
Cluster count must be chosen or tuned to produce the concept lattice; value not stated in abstract.

axioms (1)

Domain assumption: Lukasiewicz residuated logic supports soft fuzzy join and meet operations that induce meaningful context nodes from embedding-derived concepts.
Invoked to replace LLM-written edges with lattice-derived nodes.

pith-pipeline@v0.9.0 · 5783 in / 1387 out tokens · 48434 ms · 2026-05-20T05:09:49.812102+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: ContextRAG derives a fuzzy concept graph over chunk embeddings using residual-quantization k-means and Formal Concept Analysis with Łukasiewicz residuated logic... soft fuzzy join and meet operations

IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean · absolute_floor_iff_bare_distinguishability
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: Łukasiewicz t-norm a⊗b = max(0, a+b−1), residuum a→b = min(1, 1−a+b)

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
