arxiv: 2509.24276 · v4 · submitted 2025-09-29 · 💻 cs.AI

G-reasoner: Foundation Models for Unified Reasoning over Graph-structured Knowledge

Linhao Luo, Zicheng Zhao, Junnan Liu, Zhangchi Qiu, Junnan Dong, Serge Panev, Chen Gong, Thuy-Trang Vu, Gholamreza Haffari, Dinh Phung, Alan Wee-Chung Liew, Shirui Pan

Pith reviewed 2026-05-18 13:18 UTC · model grok-4.3

classification 💻 cs.AI

keywords: G-reasoner, QuadGraph, graph foundation models, LLM reasoning, knowledge graphs, graph RAG, unified reasoning

The pith

A four-layer QuadGraph abstraction unifies heterogeneous knowledge sources so a small graph foundation model can enhance LLM reasoning at scale.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents G-reasoner to fix how large language models struggle with external knowledge that is fragmented or poorly structured. It standardizes many kinds of knowledge into one common graph form using QuadGraph, then trains a 34-million-parameter graph model that learns both connections and text meanings. This graph model is combined with existing LLMs to support reasoning without needing custom graph designs or expensive agent systems for each new task. Experiments across six benchmarks show gains in accuracy, speed, and the ability to work on graphs the system has not seen before.

Core claim

QuadGraph, a standardized four-layer abstraction, unifies heterogeneous knowledge sources into a common graph representation; a 34M-parameter graph foundation model trained on this representation jointly captures topology and textual semantics and integrates with LLMs to deliver scalable reasoning that outperforms ad-hoc GraphRAG approaches.

What carries the argument

QuadGraph, a four-layer abstraction that converts diverse knowledge into a single standardized graph form, allowing a graph foundation model to model both structure and semantics for downstream LLM reasoning.
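A minimal sketch of what a four-layer container of this kind might look like, assuming that every node carries a text field and a layer tag and that edges may cross layers. The class `LayeredGraph`, the helper `from_triples`, and every layer name except "attribute" are hypothetical; the excerpted Figure 2 caption names only the attribute layer, so the other three are placeholders rather than the paper's terminology.

```python
from dataclasses import dataclass, field

# Only "attribute" is named in the excerpted Figure 2 caption; the other
# three names are placeholders, not the paper's terminology.
LAYERS = ("attribute", "layer_2", "layer_3", "layer_4")


@dataclass
class LayeredGraph:
    """Hypothetical four-layer container: each node has a layer tag and text."""
    nodes: dict = field(default_factory=dict)  # node_id -> (layer, text)
    edges: set = field(default_factory=set)    # (head_id, relation, tail_id)

    def add_node(self, node_id, layer, text):
        if layer not in LAYERS:
            raise ValueError(f"unknown layer: {layer}")
        self.nodes[node_id] = (layer, text)

    def add_edge(self, head, relation, tail):
        if head not in self.nodes or tail not in self.nodes:
            raise KeyError("both endpoints must be added before the edge")
        self.edges.add((head, relation, tail))

    def layer_of(self, node_id):
        return self.nodes[node_id][0]


def from_triples(graph, triples, layer):
    """Load (head, relation, tail) triples from one source into a chosen layer."""
    for head, relation, tail in triples:
        for entity in (head, tail):
            if entity not in graph.nodes:
                graph.add_node(entity, layer, text=entity)
        graph.add_edge(head, relation, tail)
    return graph


# Two toy sources land in one graph: a triple store and a document.
g = LayeredGraph()
from_triples(g, [("aspirin", "treats", "headache")], layer="layer_2")
g.add_node("doc:1", "layer_3", text="Aspirin is commonly used for headaches.")
g.add_edge("doc:1", "mentions", "aspirin")  # an edge that crosses layers
```

The point of the sketch is only that a single node and edge schema can hold material from different sources; how the paper actually populates each layer is not recoverable from this page.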

If this is right

LLMs can reason over graph data without ad-hoc graph construction or costly agent pipelines for each application.
Mixed-precision training and distributed message passing make the graph model efficient enough to scale with additional GPUs.
The system shows strong cross-graph generalization, maintaining performance when applied to previously unseen knowledge structures.
Reasoning quality improves on knowledge-intensive tasks that currently suffer from fragmented information in standard retrieval methods.
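The efficiency point about mixed-precision training can be made concrete with simple arithmetic. The sketch below assumes the dominant cost is one dense node-state matrix of shape nodes × hidden dimension, and it ignores the full-precision master weights and optimizer state that mixed-precision training usually keeps. The function name `state_memory_gib` and the graph sizes are illustrative, not figures from the paper.

```python
# Bytes per stored value for common training precisions.
BYTES_PER_VALUE = {"fp32": 4, "fp16": 2, "bf16": 2}


def state_memory_gib(num_nodes, hidden_dim, dtype):
    """Memory for one num_nodes x hidden_dim node-state matrix, in GiB."""
    return num_nodes * hidden_dim * BYTES_PER_VALUE[dtype] / 2**30


# Illustrative sizes only; the paper's own configurations are not reproduced here.
for num_nodes, hidden_dim in [(100_000, 1024), (800_000, 8192)]:
    full = state_memory_gib(num_nodes, hidden_dim, "fp32")
    half = state_memory_gib(num_nodes, hidden_dim, "bf16")
    print(f"{num_nodes} x {hidden_dim}: fp32 {full:.2f} GiB, bf16 {half:.2f} GiB")
```

Under these assumptions half precision halves the footprint of every stored node-state matrix, which is the kind of saving that lets a fixed GPU budget hold a larger graph.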

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

If the unification holds, many existing knowledge bases could be converted once and then reused across different reasoning applications without redesign.
The approach might be extended to dynamic or streaming graphs by updating the foundation model incrementally rather than retraining from scratch.
Hybrid systems could emerge in which the graph model handles explicit relations while the LLM supplies implicit context or handles ambiguous queries.

Load-bearing premise

The four-layer QuadGraph abstraction can preserve the relational and semantic details needed for reasoning without important losses when unifying different knowledge sources.

What would settle it

Run the same set of complex relational queries on knowledge first converted to QuadGraph and then to a task-specific graph; if accuracy falls sharply on queries that depend on fine details flattened by the four layers, the unification claim is falsified.
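The test described above can be sketched as a small comparison harness. It assumes that both pipelines return one answer per query and that someone has already marked which queries depend on fine-grained details; the function names and the toy data are hypothetical.

```python
def accuracy(predictions, gold):
    """Fraction of queries answered correctly."""
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must align")
    return sum(p == g for p, g in zip(predictions, gold)) / len(gold)


def unification_gap(quad_preds, task_preds, gold, fine_detail_ids):
    """Accuracy gap (task-specific minus unified) overall and on the subset
    of queries flagged as depending on fine-grained details."""
    def subset(items):
        return [items[i] for i in fine_detail_ids]

    overall = accuracy(task_preds, gold) - accuracy(quad_preds, gold)
    fine = accuracy(subset(task_preds), subset(gold)) - accuracy(
        subset(quad_preds), subset(gold)
    )
    return {"overall_gap": overall, "fine_detail_gap": fine}


# Toy data: the unified pipeline misses exactly the two fine-detail queries.
gold = ["a", "b", "c", "d"]
task_preds = ["a", "b", "c", "d"]
quad_preds = ["a", "b", "x", "y"]
gaps = unification_gap(quad_preds, task_preds, gold, fine_detail_ids=[2, 3])
```

A fine-detail gap much larger than the overall gap would be the signature the paragraph above describes. What counts as "sharply" is a threshold the experimenter would have to fix in advance.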

Figures

Figures reproduced from arXiv: 2509.24276 by Alan Wee-Chung Liew, Chen Gong, Dinh Phung, Gholamreza Haffari, Junnan Dong, Junnan Liu, Linhao Luo, Serge Panev, Shirui Pan, Thuy-Trang Vu, Zhangchi Qiu, Zicheng Zhao.

**Figure 1.** The overall framework of G-reasoner. First, G-reasoner provides a unified graph interface, QuadGraph, that integrates diverse graph-structured knowledge from different domains into a standard format. Then, it adopts a GNN-powered foundation model to jointly reason over the graph-structured knowledge and make versatile predictions. Last, we enhance the LLMs with the graph reasoning results to improve the p…

**Figure 2.** Illustration of QuadGraph for unifying existing graph-structured knowledge. To address this limitation, G-reasoner proposes a unified graph interface called QuadGraph that standardizes diverse graph-structured knowledge from different domains into a unified format. Specifically, we design a 4-layer graph structure that consists of the following layers: (1) attribute layer that captures the common attri…

**Figure 3.** Memory and throughput gain brought by mixed precision training. [Plot omitted. Recoverable labels: panel title "Compute Scaling"; axes "Compute Cost", "Total GPU Memory Required (GB)", "# GPU (80GB)".]

**Figure 5.** The illustration of distributed message passing in G-reasoner.

**Figure 6.** Scaling of G-reasoner with different model sizes and graph sizes.

**Figure 7.** The prompt template for LLM Reasoning.

Original abstract

Large language models (LLMs) excel at complex reasoning but remain limited by static and incomplete parametric knowledge. Retrieval-augmented generation (RAG) mitigates this by incorporating external knowledge, yet existing RAGs struggle with knowledge-intensive tasks due to fragmented information and weak modeling of knowledge structure. Graphs offer a natural way to model relationships within knowledge, but LLMs are inherently unstructured and cannot effectively reason over graph-structured data. Recent graph-enhanced RAG (GraphRAG) attempts to bridge this gap by constructing tailored graphs and enabling LLMs to reason on them. However, these methods often depend on ad-hoc graph designs, heuristic search, or costly agent pipelines, which hinder scalability and generalization. To address these challenges, we present G-reasoner, a unified framework that integrates graph and language foundation models for scalable reasoning over diverse graph-structured knowledge. Central to our approach is QuadGraph, a standardized four-layer abstraction that unifies heterogeneous knowledge sources into a common graph representation. Building on this, we introduce a 34M-parameter graph foundation model (GFM) that jointly captures graph topology and textual semantics, and is integrated with LLMs to enhance reasoning in downstream applications. To ensure scalability and efficiency, mixed-precision training and distributed message-passing are implemented to scale GFM with more GPUs. Extensive experiments on six benchmarks show that G-reasoner consistently outperforms state-of-the-art baselines, significantly enhances LLM reasoning, and achieves strong efficiency and cross-graph generalization.
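The abstract mentions distributed message-passing without describing the scheme, so the following is only a toy simulation of the general idea, not the paper's implementation. It assumes nodes are split into partitions, each simulated worker aggregates messages for the nodes it owns, and source features that live on another worker are fetched as a halo. The function names and the sum aggregator are assumptions.

```python
def aggregate(edges, features, owned):
    """Sum incoming neighbour features for every node in `owned`."""
    out = {node: 0.0 for node in owned}
    for src, dst in edges:
        if dst in out:
            out[dst] += features[src]
    return out


def distributed_round(edges, features, partitions):
    """One message-passing round with nodes split across simulated workers."""
    merged = {}
    for owned in partitions:
        # A worker keeps only the edges that end at nodes it owns ...
        local_edges = [(s, d) for s, d in edges if d in owned]
        # ... and fetches the source features it needs, remote ones included.
        halo = {s: features[s] for s, _ in local_edges}
        merged.update(aggregate(local_edges, halo, owned))
    return merged


edges = [(0, 1), (1, 2), (2, 0), (3, 0)]
features = {0: 1.0, 1: 2.0, 2: 3.0, 3: 4.0}

single = aggregate(edges, features, set(features))
split = distributed_round(edges, features, [{0, 1}, {2, 3}])
```

In this toy setup the partitioned result matches the single-worker result, because each destination node still sees all of its incoming edges. A real system would additionally need a graph partitioner and an actual communication layer, neither of which is modelled here.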

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

G-reasoner introduces QuadGraph and a 34M GFM to standardize graph knowledge for LLMs, but the unification step lacks direct checks on information preservation and the abstract omits result details.

The key point is that this paper presents G-reasoner, which uses a QuadGraph abstraction to standardize knowledge graphs and a 34M-parameter GFM to jointly model graph structure and text for better LLM reasoning. It does a decent job laying out the limitations of existing GraphRAG approaches, such as their reliance on tailored graphs and expensive pipelines. The idea of a common four-layer representation to handle diverse sources is a reasonable step toward scalability and cross-graph use. Adding mixed-precision training and distributed message-passing shows attention to practical efficiency concerns. The reported experiments on six benchmarks are positioned as showing consistent gains in performance, efficiency, and generalization. If the full paper has clear baselines and ablations, that could provide useful data points for the field.

That said, the abstract gives no concrete metrics or comparisons, which makes it hard to assess how meaningful the improvements are. The central assumption that QuadGraph preserves the necessary relational and semantic details without significant loss also lacks supporting evidence like similarity scores or information loss measures. If the abstraction simplifies too much, the benefits might trace back to the GFM or integration rather than the unification.

This kind of work is for researchers focused on combining structured knowledge with language models in AI applications. A reader dealing with knowledge graphs in domains like question answering or recommendation systems could extract some practical ideas here. Overall, the paper has a clear proposal and experimental claims that justify sending it to peer review for a closer look at the methods and results.

Referee Report

2 major / 1 minor

Summary. The manuscript proposes G-reasoner, a unified framework integrating graph and language foundation models for scalable reasoning over diverse graph-structured knowledge. Central components are QuadGraph, a standardized four-layer abstraction claimed to unify heterogeneous knowledge sources into a common representation, and a 34M-parameter graph foundation model (GFM) that jointly models topology and textual semantics. The GFM is integrated with LLMs, supported by mixed-precision training and distributed message-passing for scalability. The paper claims that extensive experiments on six benchmarks demonstrate consistent outperformance over state-of-the-art baselines, enhanced LLM reasoning capabilities, efficiency gains, and strong cross-graph generalization.

Significance. If the empirical claims and the information-preservation properties of QuadGraph hold, the work would represent a meaningful advance in graph-enhanced RAG and LLM reasoning. A standardized, learnable abstraction plus a compact dedicated GFM could reduce reliance on ad-hoc graph construction and heuristic pipelines common in prior GraphRAG methods, while the reported efficiency techniques and cross-graph generalization would be practically valuable for deployment at scale.

major comments (2)

[Abstract / QuadGraph definition] Abstract and method description of QuadGraph: The central claim that QuadGraph converts arbitrary knowledge graphs into a common four-layer representation while retaining the relational structure and textual semantics required for downstream reasoning gains is not accompanied by any quantitative preservation metrics (e.g., graph-edit distance, triple-level semantic similarity, or information-theoretic loss between original and abstracted graphs). This assumption is load-bearing for attributing benchmark improvements to the unification step rather than to the GFM or LLM integration alone.
[Abstract / Experimental results] Results section (implied by abstract claims): The statement that G-reasoner 'consistently outperforms state-of-the-art baselines' on six benchmarks is presented without any numerical results, baseline names, statistical significance tests, error bars, or data-exclusion criteria. This absence prevents verification of the magnitude and robustness of the reported gains.
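One of the preservation checks the first comment asks for can be approximated cheaply. The sketch below uses exact-match overlap between triple sets, which is a simpler proxy than the triple-level semantic similarity the referee names; it assumes both the original and the abstracted graph can be exported as (head, relation, tail) triples with comparable identifiers. The function name and toy triples are hypothetical.

```python
def triple_preservation(original, abstracted):
    """Recall, precision and Jaccard overlap between two sets of triples."""
    original, abstracted = set(original), set(abstracted)
    shared = original & abstracted
    union = original | abstracted
    return {
        "recall": len(shared) / len(original) if original else 1.0,
        "precision": len(shared) / len(abstracted) if abstracted else 1.0,
        "jaccard": len(shared) / len(union) if union else 1.0,
    }


original = [
    ("aspirin", "treats", "headache"),
    ("aspirin", "class", "nsaid"),
    ("ibuprofen", "class", "nsaid"),
]
# The abstracted graph loses one triple and introduces one new edge.
abstracted = [
    ("aspirin", "treats", "headache"),
    ("aspirin", "class", "nsaid"),
    ("nsaid", "member", "ibuprofen"),
]
scores = triple_preservation(original, abstracted)
```

Recall below 1.0 means some original triples are not recoverable verbatim after conversion. Whether such losses matter for reasoning is a separate question that only a downstream evaluation can answer.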

minor comments (1)

[Abstract] The abstract would benefit from a concise listing of the six benchmarks and the primary baselines used, even at a high level, to allow readers to immediately contextualize the claimed improvements.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. We address each major comment below and indicate planned revisions to strengthen the presentation of QuadGraph and the experimental claims.

Referee: [Abstract / QuadGraph definition] Abstract and method description of QuadGraph: The central claim that QuadGraph converts arbitrary knowledge graphs into a common four-layer representation while retaining the relational structure and textual semantics required for downstream reasoning gains is not accompanied by any quantitative preservation metrics (e.g., graph-edit distance, triple-level semantic similarity, or information-theoretic loss between original and abstracted graphs). This assumption is load-bearing for attributing benchmark improvements to the unification step rather than to the GFM or LLM integration alone.

Authors: We agree that explicit quantitative metrics would better substantiate the information-preservation properties of QuadGraph and help attribute gains specifically to the unification step. The manuscript currently emphasizes the design rationale and downstream empirical results. In the revised version we will add a dedicated analysis (in Section 3 or an appendix) reporting graph-edit distance, triple-level semantic similarity, and information-theoretic measures between original graphs and their QuadGraph abstractions on the six benchmark datasets. revision: yes
Referee: [Abstract / Experimental results] Results section (implied by abstract claims): The statement that G-reasoner 'consistently outperforms state-of-the-art baselines' on six benchmarks is presented without any numerical results, baseline names, statistical significance tests, error bars, or data-exclusion criteria. This absence prevents verification of the magnitude and robustness of the reported gains.

Authors: The full manuscript contains a detailed Experiments section (Section 4) with tables reporting exact numerical results, baseline names, statistical significance tests, error bars, and evaluation protocols. To address the concern that the abstract claim is not self-contained, we will revise the abstract to include the most salient quantitative improvements (e.g., average gains and key baselines) while preserving brevity, and we will add a brief reference to the full results tables and statistical details. revision: yes
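The error bars discussed in this exchange could be produced in several ways; the page does not say which test the manuscript uses. As one common option, the sketch below computes a paired bootstrap interval for the accuracy difference between a system and a baseline, assuming per-query correctness is recorded as 0 or 1 for both on the same queries. The function name, resample count, and seed are arbitrary choices.

```python
import random


def paired_bootstrap(system_correct, baseline_correct, resamples=2000, seed=0):
    """Approximate 95% interval for the accuracy difference (system minus
    baseline), resampling queries with replacement and keeping pairs intact."""
    if len(system_correct) != len(baseline_correct):
        raise ValueError("outcomes must be paired per query")
    rng = random.Random(seed)
    n = len(system_correct)
    diffs = []
    for _ in range(resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        diffs.append(sum(system_correct[i] - baseline_correct[i] for i in idx) / n)
    diffs.sort()
    lower = diffs[int(0.025 * resamples)]
    upper = diffs[int(0.975 * resamples) - 1]
    return lower, upper


# Toy outcomes: 1 means the query was answered correctly.
system = [1, 1, 1, 0, 1, 1, 0, 1, 1, 1]
baseline = [1, 0, 1, 0, 0, 1, 0, 1, 0, 1]
low, high = paired_bootstrap(system, baseline)
```

An interval that excludes zero suggests the gain is not an artefact of which queries happened to be sampled, although with only a handful of queries, as in this toy data, the interval will be wide.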

Circularity Check

0 steps flagged

No significant circularity; empirical framework with external evaluation

The paper introduces QuadGraph as a four-layer standardization and a 34M-parameter GFM trained to capture topology and semantics, then reports benchmark results on six external datasets. No equations, fitted parameters, or predictions are shown that reduce by construction to the inputs; the unification step is presented as a design choice whose fidelity is assumed rather than derived tautologically. No load-bearing self-citations or uniqueness theorems imported from prior author work appear in the provided text, and the performance claims rest on standard empirical comparison rather than self-referential renaming or ansatz smuggling.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 2 invented entities

The central claim rests on the effectiveness of the newly introduced QuadGraph abstraction and the GFM's ability to jointly model topology and semantics; these are postulated without independent evidence outside the paper's experiments.

invented entities (2)

QuadGraph (no independent evidence)
purpose: Standardized four-layer abstraction to unify heterogeneous knowledge sources into a common graph representation
Introduced as central to the approach for handling diverse graph-structured knowledge.

GFM (no independent evidence)
purpose: 34M-parameter graph foundation model that jointly captures graph topology and textual semantics
New model proposed and integrated with LLMs for enhanced reasoning.

pith-pipeline@v0.9.0 · 5842 in / 1228 out tokens · 36562 ms · 2026-05-18T13:18:46.169401+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear
Relation between the paper passage and the cited Recognition theorem.
Passage: "Central to our approach is QuadGraph, a standardized four-layer abstraction that unifies heterogeneous knowledge sources into a common graph representation... 34M-parameter graph foundation model (GFM) that jointly captures graph topology and textual semantics"

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
Relation between the paper passage and the cited Recognition theorem.
Passage: "G-reasoner adopts the query-dependent GNN... Msg function uses DistMult... Optimization... KL divergence between pseudo-label distribution and prediction"
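The second quoted passage names a DistMult message function and a KL-divergence objective, and the two pieces can be sketched in a few lines. This is a generic illustration under stated assumptions, not the paper's model: the DistMult message is taken to be the element-wise product of a node state and a relation embedding, aggregation is a plain sum, the query-dependent initialization of node states is omitted, and the loss direction is assumed to be KL(pseudo-label || prediction). All function names are hypothetical.

```python
import math


def distmult_message(node_state, relation_embedding):
    """DistMult-style message: element-wise product of state and relation."""
    return [h * r for h, r in zip(node_state, relation_embedding)]


def message_passing_round(states, relation_embeddings, triples):
    """Sum DistMult messages along (head, relation, tail) edges into each tail."""
    dim = len(next(iter(states.values())))
    updated = {node: [0.0] * dim for node in states}
    for head, relation, tail in triples:
        message = distmult_message(states[head], relation_embeddings[relation])
        updated[tail] = [a + m for a, m in zip(updated[tail], message)]
    return updated


def softmax(scores):
    """Turn raw node scores into a probability distribution."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(target, predicted, eps=1e-12):
    """KL(target || predicted) for two discrete distributions over the same nodes."""
    return sum(
        t * math.log(t / max(p, eps)) for t, p in zip(target, predicted) if t > 0
    )


states = {"a": [1.0, 2.0], "b": [0.0, 0.0]}
relations = {"r": [3.0, 0.5]}
updated = message_passing_round(states, relations, [("a", "r", "b")])

pseudo_labels = [0.5, 0.5]
prediction = softmax([2.0, 0.0])
loss = kl_divergence(pseudo_labels, prediction)
```

The loss is zero only when the predicted distribution matches the pseudo-label distribution, so minimizing it pulls the model's node scores toward the pseudo-labels.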

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

56 extracted references · 56 canonical work pages · 11 internal anchors

[1]

write newline

" write newline "" before.all 'output.state := FUNCTION n.dashify 't := "" t empty not t #1 #1 substring "-" = t #1 #2 substring "--" = not "--" * t #2 global.max substring 't := t #1 #1 substring "-" = "-" * t #2 global.max substring 't := while if t #1 #1 substring * t #2 global.max substring 't := if while FUNCTION format.date year duplicate empty "emp...

work page
[2]

GPT-4 Technical Report

Josh Achiam, Steven Adler, Sandhini Agarwal, Lama Ahmad, Ilge Akkaya, Florencia Leoni Aleman, Diogo Almeida, Janko Altenschmidt, Sam Altman, Shyamal Anadkat, et al. Gpt-4 technical report. arXiv preprint arXiv:2303.08774, 2023

work page internal anchor Pith review Pith/arXiv arXiv 2023
[3]

M3-Embedding: Multi-Linguality, Multi-Functionality, Multi-Granularity Text Embeddings Through Self-Knowledge Distillation

Jianlv Chen, Shitao Xiao, Peitian Zhang, Kun Luo, Defu Lian, and Zheng Liu. Bge m3-embedding: Multi-lingual, multi-functionality, multi-granularity text embeddings through self-knowledge distillation. arXiv preprint arXiv:2402.03216, 2024

work page internal anchor Pith review Pith/arXiv arXiv 2024
[4]

Bert: Pre-training of deep bidirectional transformers for language understanding

Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. Bert: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 conference of the North American chapter of the association for computational linguistics: human language technologies, volume 1 (long and short papers), pp.\ 4171--4186, 2019

work page 2019
[5]

Youtu-graphrag: Vertically unified agents for graph retrieval-augmented complex reasoning

Junnan Dong, Siyu An, Yifei Yu, Qian-Wen Zhang, Linhao Luo, Xiao Huang, Yunsheng Wu, Di Yin, and Xing Sun. Youtu-graphrag: Vertically unified agents for graph retrieval-augmented complex reasoning. arXiv preprint arXiv:2508.19855, 2025

work page arXiv 2025
[6]

From Local to Global: A Graph RAG Approach to Query-Focused Summarization

Darren Edge, Ha Trinh, Newman Cheng, Joshua Bradley, Alex Chao, Apurva Mody, Steven Truitt, Dasha Metropolitansky, Robert Osazuwa Ness, and Jonathan Larson. From local to global: A graph rag approach to query-focused summarization. arXiv preprint arXiv:2404.16130, 2024

work page internal anchor Pith review Pith/arXiv arXiv 2024
[7]

Lenssen, and Jure Leskovec

Matthias Fey, Jinu Sunil, Akihiro Nitta, Rishi Puri, Manan Shah, Blaz Stojanovic, Ramona Bendias, Barghi Alexandria, Vid Kocijan, Zecheng Zhang, Xinwei He, Jan E. Lenssen, and Jure Leskovec. Pyg 2.0: Scalable learning on real world graphs. In Temporal Graph Learning Workshop @ KDD, 2025

work page 2025
[8]

Towards foundation models for knowledge graph reasoning

Mikhail Galkin, Xinyu Yuan, Hesham Mostafa, Jian Tang, and Zhaocheng Zhu. Towards foundation models for knowledge graph reasoning. In The Twelfth International Conference on Learning Representations, 2024

work page 2024
[9]

Retrieval-Augmented Generation for Large Language Models: A Survey

Yunfan Gao, Yun Xiong, Xinyu Gao, Kangxiang Jia, Jinliu Pan, Yuxi Bi, Yi Dai, Jiawei Sun, and Haofen Wang. Retrieval-augmented generation for large language models: A survey. arXiv preprint arXiv:2312.10997, 2023

work page internal anchor Pith review Pith/arXiv arXiv 2023
[10]

Gpt4graph: Can large language models understand graph structured data? an empirical evaluation and benchmarking

Jiayan Guo, Lun Du, Hengyu Liu, Mengyu Zhou, Xinyi He, and Shi Han. Gpt4graph: Can large language models understand graph structured data? an empirical evaluation and benchmarking. arXiv preprint arXiv:2305.15066, 2023

work page arXiv 2023
[11]

Lightrag: Simple and fast retrieval-augmented generation

Zirui Guo, Lianghao Xia, Yanhua Yu, Tu Ao, and Chao Huang. Lightrag: Simple and fast retrieval-augmented generation. 2024

work page 2024
[12]

From RAG to Memory: Non-Parametric Continual Learning for Large Language Models

Bernal Jiménez Gutiérrez, Yiheng Shu, Weijian Qi, Sizhe Zhou, and Yu Su. From rag to memory: Non-parametric continual learning for large language models, 2025. URL https://arxiv.org/abs/2502.14802

work page internal anchor Pith review Pith/arXiv arXiv 2025
[13]

Retrieval-Augmented Generation with Graphs (GraphRAG)

Haoyu Han, Yu Wang, Harry Shomer, Kai Guo, Jiayuan Ding, Yongjia Lei, Mahantesh Halappanavar, Ryan A Rossi, Subhabrata Mukherjee, Xianfeng Tang, et al. Retrieval-augmented generation with graphs (graphrag). arXiv preprint arXiv:2501.00309, 2024

work page internal anchor Pith review Pith/arXiv arXiv 2024
[14]

G-retriever: Retrieval-augmented generation for textual graph understanding and question answering

Xiaoxin He, Yijun Tian, Yifei Sun, Nitesh Chawla, Thomas Laurent, Yann LeCun, Xavier Bresson, and Bryan Hooi. G-retriever: Retrieval-augmented generation for textual graph understanding and question answering. Advances in Neural Information Processing Systems, 37: 0 132876--132907, 2024

work page 2024
[15]

Distilling the Knowledge in a Neural Network

Geoffrey Hinton, Oriol Vinyals, and Jeff Dean. Distilling the knowledge in a neural network. arXiv preprint arXiv:1503.02531, 2015

work page internal anchor Pith review Pith/arXiv arXiv 2015
[16]

Constructing a multi-hop qa dataset for comprehensive evaluation of reasoning steps

Xanh Ho, Anh-Khoa Duong Nguyen, Saku Sugawara, and Akiko Aizawa. Constructing a multi-hop qa dataset for comprehensive evaluation of reasoning steps. In Proceedings of the 28th International Conference on Computational Linguistics, pp.\ 6609--6625, 2020

work page 2020
[17]

Knowledge graphs

Aidan Hogan, Eva Blomqvist, Michael Cochez, Claudia d’Amato, Gerard De Melo, Claudio Gutierrez, Sabrina Kirrane, Jos \'e Emilio Labra Gayo, Roberto Navigli, Sebastian Neumaier, et al. Knowledge graphs. ACM Computing Surveys (Csur), 54 0 (4): 0 1--37, 2021

work page 2021
[18]

Retrieval and structuring augmented generation with large language models

Pengcheng Jiang, Siru Ouyang, Yizhu Jiao, Ming Zhong, Runchu Tian, and Jiawei Han. Retrieval and structuring augmented generation with large language models. In Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining V. 2, pp.\ 6032--6042, 2025

work page 2025
[19]

Hipporag: Neurobiologically inspired long-term memory for large language models

Bernal Jimenez Gutierrez, Yiheng Shu, Yu Gu, Michihiro Yasunaga, and Yu Su. Hipporag: Neurobiologically inspired long-term memory for large language models. Advances in Neural Information Processing Systems, 37: 0 59532--59569, 2024

work page 2024
[20]

Large language models on graphs: A comprehensive survey

Bowen Jin, Gang Liu, Chi Han, Meng Jiang, Heng Ji, and Jiawei Han. Large language models on graphs: A comprehensive survey. IEEE Transactions on Knowledge and Data Engineering, 2024

work page 2024
[21]

P ub M ed QA : A dataset for biomedical research question answering

Qiao Jin, Bhuwan Dhingra, Zhengping Liu, William Cohen, and Xinghua Lu. P ub M ed QA : A dataset for biomedical research question answering. In Kentaro Inui, Jing Jiang, Vincent Ng, and Xiaojun Wan (eds.), Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language P...

work page doi:10.18653/v1/d19-1259 2019
[22]

Bridging law and data: Augmenting reasoning via a semi-structured dataset with irac methodology

Xiaoxi Kang, Lizhen Qu, Lay-Ki Soon, Zhuang Li, and Adnan Trakic. Bridging law and data: Augmenting reasoning via a semi-structured dataset with irac methodology. arXiv preprint arXiv:2406.13217, 2024

work page arXiv 2024
[23]

Metis: A software package for partitioning unstructured graphs, partitioning meshes, and computing fill-reducing orderings of sparse matrices

George Karypis and Vipin Kumar. Metis: A software package for partitioning unstructured graphs, partitioning meshes, and computing fill-reducing orderings of sparse matrices. 1997

work page 1997
[24]

Dalk: Dynamic co-augmentation of llms and kg to answer alzheimer’s disease questions with scientific literature

Dawei Li, Shu Yang, Zhen Tan, Jae Baik, Sukwon Yun, Joseph Lee, Aaron Chacko, Bojian Hou, Duy Duong-Tran, Ying Ding, et al. Dalk: Dynamic co-augmentation of llms and kg to answer alzheimer’s disease questions with scientific literature. In Findings of the Association for Computational Linguistics: EMNLP 2024, pp.\ 2187--2205, 2024

work page 2024
[25]

Simple is effective: The roles of graphs and large language models in knowledge-graph-based retrieval-augmented generation

Mufei Li, Siqi Miao, and Pan Li. Simple is effective: The roles of graphs and large language models in knowledge-graph-based retrieval-augmented generation. In The Thirteenth International Conference on Learning Representations, 2025 a

work page 2025
[26]

Structrag: Boosting knowledge intensive reasoning of llms via inference-time hybrid information structurization

Zhuoqun Li, Xuanang Chen, Haiyang Yu, Hongyu Lin, Yaojie Lu, Qiaoyu Tang, Fei Huang, Xianpei Han, Le Sun, and Yongbin Li. Structrag: Boosting knowledge intensive reasoning of llms via inference-time hybrid information structurization. In The Thirteenth International Conference on Learning Representations, 2025 b

work page 2025
[28]

DeepSeek-V3 Technical Report

Aixin Liu, Bei Feng, Bing Xue, Bingxuan Wang, Bochao Wu, Chengda Lu, Chenggang Zhao, Chengqi Deng, Chenyu Zhang, Chong Ruan, et al. Deepseek-v3 technical report. arXiv preprint arXiv:2412.19437, 2024

work page internal anchor Pith review Pith/arXiv arXiv 2024
[29]

Gfm-rag: graph foundation model for retrieval augmented generation

Linhao Luo, Zicheng Zhao, Gholamreza Haffari, Dinh Phung, Chen Gong, and Shirui Pan. Gfm-rag: graph foundation model for retrieval augmented generation. NeurIPS, 2025

work page 2025
[30]

Think-on-graph 2.0: Deep and faithful large language model reasoning with knowledge-guided retrieval augmented generation

Shengjie Ma, Chengjin Xu, Xuhui Jiang, Muzhi Li, Huaren Qu, Cehao Yang, Jiaxin Mao, and Jian Guo. Think-on-graph 2.0: Deep and faithful large language model reasoning with knowledge-guided retrieval augmented generation. In The Thirteenth International Conference on Learning Representations, 2025

work page 2025
[31]

GNN - RAG : Graph neural retrieval for efficient large language model reasoning on knowledge graphs

Costas Mavromatis and George Karypis. GNN - RAG : Graph neural retrieval for efficient large language model reasoning on knowledge graphs. In Wanxiang Che, Joyce Nabende, Ekaterina Shutova, and Mohammad Taher Pilehvar (eds.), Findings of the Association for Computational Linguistics: ACL 2025, pp.\ 16682--16699, Vienna, Austria, July 2025 a . Association ...

work page doi:10.18653/v1/2025.findings-acl.856 2025
[32]

Gnn-rag: Graph neural retrieval for efficient large language model reasoning on knowledge graphs

Costas Mavromatis and George Karypis. Gnn-rag: Graph neural retrieval for efficient large language model reasoning on knowledge graphs. In Findings of the Association for Computational Linguistics: ACL 2025, pp.\ 16682--16699, 2025 b

work page 2025
[33]

Dyknow: dynamically verifying time-sensitive factual knowledge in llms

Seyed Mahed Mousavi, Simone Alghisi, and Giuseppe Riccardi. Dyknow: dynamically verifying time-sensitive factual knowledge in llms. arXiv preprint arXiv:2404.08700, 2024

work page arXiv 2024
[34]

Hello gpt-4o, 2024

OpenAI. Hello gpt-4o, 2024. URL https://openai.com/index/hello-gpt-4o/

work page 2024
[35]

Graph retrieval-augmented generation: A survey

Boci Peng, Yun Zhu, Yongchao Liu, Xiaohe Bo, Haizhou Shi, Chuntao Hong, Yan Zhang, and Siliang Tang. Graph retrieval-augmented generation: A survey. arXiv preprint arXiv:2408.08921, 2024

work page arXiv 2024
[36]

Some simple effective approximations to the 2-poisson model for probabilistic weighted retrieval

Stephen E Robertson and Steve Walker. Some simple effective approximations to the 2-poisson model for probabilistic weighted retrieval. In SIGIR’94: Proceedings of the Seventeenth Annual International ACM-SIGIR Conference on Research and Development in Information Retrieval, organised by Dublin City University, pp.\ 232--241. Springer, 1994

work page 1994
[37]

Relational world knowledge representation in contextual language models: A review

Tara Safavi and Danai Koutra. Relational world knowledge representation in contextual language models: A review. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pp.\ 1053--1067, 2021

work page 2021
[38]

Colbertv2: Effective and efficient retrieval via lightweight late interaction

Keshav Santhanam, Omar Khattab, Jon Saad-Falcon, Christopher Potts, and Matei Zaharia. Colbertv2: Effective and efficient retrieval via lightweight late interaction. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp.\ 3715--3734, 2022

work page 2022
[39]

Raptor: Recursive abstractive processing for tree-organized retrieval

Parth Sarthi, Salman Abdullah, Aditi Tuli, Shubh Khanna, Anna Goldie, and Christopher D Manning. Raptor: Recursive abstractive processing for tree-organized retrieval. In The Twelfth International Conference on Learning Representations, 2024

work page 2024
[40]

Injecting domain-specific knowledge into large language models: a comprehensive survey

Zirui Song, Bin Yan, Yuhan Liu, Miao Fang, Mingzhe Li, Rui Yan, and Xiuying Chen. Injecting domain-specific knowledge into large language models: a comprehensive survey. arXiv preprint arXiv:2502.10708, 2025

work page arXiv 2025
[41]

Think-on-graph: Deep and responsible reasoning of large language model on knowledge graph

Jiashuo Sun, Chengjin Xu, Lumingyuan Tang, Saizhuo Wang, Chen Lin, Yeyun Gong, Lionel Ni, Heung-Yeung Shum, and Jian Guo. Think-on-graph: Deep and responsible reasoning of large language model on knowledge graph. In The Twelfth International Conference on Learning Representations, 2024

work page 2024
[42]

MuSiQue: Multihop questions via single-hop question composition

Harsh Trivedi, Niranjan Balasubramanian, Tushar Khot, and Ashish Sabharwal. MuSiQue: Multihop questions via single-hop question composition. Transactions of the Association for Computational Linguistics, 10:539--554, 2022

work page 2022
[43]

Deep graph library: A graph-centric, highly-performant package for graph neural networks

Minjie Wang, Da Zheng, Zihao Ye, Quan Gan, Mufei Li, Xiang Song, Jinjing Zhou, Chao Ma, Lingfan Yu, Yu Gai, Tianjun Xiao, Tong He, George Karypis, Jinyang Li, and Zheng Zhang. Deep graph library: A graph-centric, highly-performant package for graph neural networks. arXiv preprint arXiv:1909.01315, 2019

work page arXiv 2019
[44]

ArchRAG: Attributed Community-based Hierarchical Retrieval-Augmented Generation

Shu Wang, Yixiang Fang, Yingli Zhou, Xilin Liu, and Yuchi Ma. ArchRAG: Attributed community-based hierarchical retrieval-augmented generation. arXiv preprint arXiv:2502.09891, 2025

work page internal anchor Pith review Pith/arXiv arXiv 2025
[45]

Knowledge graph prompting for multi-document question answering

Yu Wang, Nedim Lipka, Ryan A Rossi, Alexa Siu, Ruiyi Zhang, and Tyler Derr. Knowledge graph prompting for multi-document question answering. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 38, pp. 19206--19214, 2024

work page 2024
[46]

A comprehensive survey on graph neural networks

Zonghan Wu, Shirui Pan, Fengwen Chen, Guodong Long, Chengqi Zhang, and Philip S Yu. A comprehensive survey on graph neural networks. IEEE Transactions on Neural Networks and Learning Systems, 32(1):4--24, 2020

work page 2020
[47]

When to use graphs in RAG: A comprehensive analysis for graph retrieval-augmented generation

Zhishang Xiang, Chuanjie Wu, Qinggang Zhang, Shengyuan Chen, Zijin Hong, Xiao Huang, and Jinsong Su. When to use graphs in RAG: A comprehensive analysis for graph retrieval-augmented generation. arXiv preprint arXiv:2506.05690, 2025

work page arXiv 2025
[48]

GraphRAG-Bench: Challenging domain-specific reasoning for evaluating graph retrieval-augmented generation

Yilin Xiao, Junnan Dong, Chuang Zhou, Su Dong, Qian-wen Zhang, Di Yin, Xing Sun, and Xiao Huang. GraphRAG-Bench: Challenging domain-specific reasoning for evaluating graph retrieval-augmented generation. arXiv preprint arXiv:2506.02404, 2025

work page arXiv 2025
[49]

Qwen3 Technical Report

An Yang, Anfeng Li, Baosong Yang, Beichen Zhang, Binyuan Hui, Bo Zheng, Bowen Yu, Chang Gao, Chengen Huang, Chenxu Lv, et al. Qwen3 technical report. arXiv preprint arXiv:2505.09388, 2025

work page internal anchor Pith review Pith/arXiv arXiv 2025
[50]

Embedding entities and relations for learning and inference in knowledge bases

Bishan Yang, Scott Wen-tau Yih, Xiaodong He, Jianfeng Gao, and Li Deng. Embedding entities and relations for learning and inference in knowledge bases. In Proceedings of the International Conference on Learning Representations (ICLR) 2015, 2015

work page 2015
[51]

HotpotQA: A dataset for diverse, explainable multi-hop question answering

Zhilin Yang, Peng Qi, Saizheng Zhang, Yoshua Bengio, William Cohen, Ruslan Salakhutdinov, and Christopher D Manning. HotpotQA: A dataset for diverse, explainable multi-hop question answering. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pp. 2369--2380, 2018

work page 2018
[52]

SiReRAG: Indexing similar and related information for multihop reasoning

Nan Zhang, Prafulla Kumar Choubey, Alexander Fabbri, Gabriel Bernadett-Shapiro, Rui Zhang, Prasenjit Mitra, Caiming Xiong, and Chien-Sheng Wu. SiReRAG: Indexing similar and related information for multihop reasoning. In The Thirteenth International Conference on Learning Representations, 2025a

work page 2025
[53]

Qwen3 Embedding: Advancing Text Embedding and Reranking Through Foundation Models

Yanzhao Zhang, Mingxin Li, Dingkun Long, Xin Zhang, Huan Lin, Baosong Yang, Pengjun Xie, An Yang, Dayiheng Liu, Junyang Lin, et al. Qwen3 embedding: Advancing text embedding and reranking through foundation models. arXiv preprint arXiv:2506.05176, 2025b

work page internal anchor Pith review Pith/arXiv arXiv 2025
[54]

Neural Bellman-Ford networks: A general graph neural network framework for link prediction

Zhaocheng Zhu, Zuobai Zhang, Louis-Pascal Xhonneux, and Jian Tang. Neural Bellman-Ford networks: A general graph neural network framework for link prediction. Advances in Neural Information Processing Systems, 34:29476--29490, 2021

work page 2021
[57]

GNN-RAG: Graph Neural Retrieval for Efficient Large Language Model Reasoning on Knowledge Graphs

work page doi:10.1145/3701716.3715240 2019