arXiv: 2501.00309 · v2 · pith:W2GAWQYInew · submitted 2024-12-31 · 💻 cs.IR · cs.CL · cs.LG

Retrieval-Augmented Generation with Graphs (GraphRAG)

Haoyu Han, Yu Wang, Harry Shomer, Kai Guo, Jiayuan Ding, Yongjia Lei, Mahantesh Halappanavar, Ryan A. Rossi, Subhabrata Mukherjee, Xianfeng Tang, Qi He, Zhigang Hua, Bo Long, Tong Zhao, Neil Shah, Amin Javari, Yinglong Xia, Jiliang Tang

Pith reviewed 2026-05-18 04:29 UTC · model grok-4.3

classification 💻 cs.IR cs.CL cs.LG

keywords GraphRAG, Retrieval-Augmented Generation, Graph-structured knowledge, Survey, Domain-specific techniques, Relational data retrieval

The pith

GraphRAG needs its own framework because graph data carries relational patterns that standard retrieval methods do not handle uniformly across domains.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The survey organizes recent work on equipping retrieval-augmented generation with graphs. Graphs link entities through edges and thereby supply structured context that improves downstream tasks such as question answering and recommendation. The authors first lay out a shared pipeline with five parts: query processor, retriever, organizer, generator, and data source. They then collect the distinct adaptations required by different application areas whose graphs follow their own relational rules. The result is both a reference map for practitioners and a list of open problems that cut across fields.

Core claim

The paper proposes a holistic GraphRAG framework whose five components are the query processor, the retriever, the organizer, the generator, and the data source; it then reviews the specialized techniques that each domain applies to exploit its particular graph patterns.

What carries the argument

The five-component GraphRAG pipeline that separates query handling, retrieval, organization, generation, and the underlying graph data source.
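
To make the five-part split concrete, here is a minimal Python sketch under stated assumptions. It is not the survey's implementation: the triple-based data source, the substring entity matcher, the hop-limited retriever, and every function name are hypothetical stand-ins chosen for illustration, and the generator only assembles a prompt instead of calling a language model.

```python
from dataclasses import dataclass, field

# Hypothetical toy representation: the graph is a list of (head, relation, tail) triples.
Triple = tuple[str, str, str]


@dataclass
class DataSource:
    """Data source: holds the graph and answers neighborhood lookups."""
    triples: list[Triple] = field(default_factory=list)

    def neighbors(self, node: str) -> list[Triple]:
        """Return every triple that touches the given node."""
        return [t for t in self.triples if node in (t[0], t[2])]


def query_processor(query: str, source: DataSource) -> list[str]:
    """Query processor: pick seed entities by naive substring matching (an assumption)."""
    nodes = {t[0] for t in source.triples} | {t[2] for t in source.triples}
    lowered = query.lower()
    return sorted(n for n in nodes if n.lower() in lowered)


def retriever(seeds: list[str], source: DataSource, hops: int = 1) -> list[Triple]:
    """Retriever: collect triples reachable within `hops` steps of the seeds."""
    visited, frontier, found = set(seeds), set(seeds), []
    for _ in range(hops):
        next_frontier = set()
        for node in sorted(frontier):
            for triple in source.neighbors(node):
                if triple not in found:
                    found.append(triple)
                for end in (triple[0], triple[2]):
                    if end not in visited:
                        visited.add(end)
                        next_frontier.add(end)
        frontier = next_frontier
    return found


def organizer(triples: list[Triple], max_triples: int = 5) -> str:
    """Organizer: truncate the retrieved triples and linearize them as text."""
    return "\n".join(f"{h} --{r}--> {t}" for h, r, t in triples[:max_triples])


def generator(query: str, context: str) -> str:
    """Generator placeholder: builds the prompt a language model would receive."""
    return f"Context:\n{context}\n\nQuestion: {query}"


def graph_rag(query: str, source: DataSource, hops: int = 1) -> str:
    """Chain the five components in order."""
    seeds = query_processor(query, source)
    triples = retriever(seeds, source, hops=hops)
    return generator(query, organizer(triples))


if __name__ == "__main__":
    toy = DataSource([
        ("Alice", "works_at", "Acme"),
        ("Bob", "works_at", "Acme"),
        ("Acme", "based_in", "Berlin"),
    ])
    print(graph_rag("Where does Alice work?", toy))
```

Keeping each stage as a separate function mirrors the separation the framework describes: a domain-specific technique would replace one function while the others stay unchanged.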

If this is right

Practitioners in any single domain can adopt the shared pipeline and then insert only the techniques already catalogued for that domain.
Cross-domain comparisons become possible once every system is described with the same five components.
New GraphRAG work can be positioned by stating which of the five components it modifies and which domain patterns it targets.
Identified challenges in query processing and organization supply concrete targets for the next round of technical papers.
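
One way to picture the positioning and comparison described above is a small record type that states which components a system modifies and which domain it targets. The sketch below is an illustration only: `SystemProfile`, its fields, and the two example systems are hypothetical and do not come from the survey.

```python
from dataclasses import dataclass
from enum import Enum


class Component(Enum):
    """The five components named in the survey's framework."""
    QUERY_PROCESSOR = "query processor"
    RETRIEVER = "retriever"
    ORGANIZER = "organizer"
    GENERATOR = "generator"
    DATA_SOURCE = "data source"


@dataclass(frozen=True)
class SystemProfile:
    """Hypothetical descriptor for one GraphRAG system."""
    name: str
    domain: str
    modifies: frozenset[Component]


def shared_components(a: SystemProfile, b: SystemProfile) -> frozenset[Component]:
    """Components that both systems modify, i.e. where they can be compared directly."""
    return a.modifies & b.modifies


def untouched(profile: SystemProfile) -> list[Component]:
    """Components the system leaves at the shared-pipeline default, in framework order."""
    return [c for c in Component if c not in profile.modifies]


if __name__ == "__main__":
    # Made-up example systems, for illustration only.
    system_a = SystemProfile("SystemA", "knowledge graph",
                             frozenset({Component.RETRIEVER, Component.ORGANIZER}))
    system_b = SystemProfile("SystemB", "molecular graph",
                             frozenset({Component.RETRIEVER, Component.DATA_SOURCE}))
    print(shared_components(system_a, system_b))
    print(untouched(system_a))
```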

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The organizer component may become a natural place to insert graph-summarization or compression steps that current RAG pipelines lack.
Benchmarks that test the same task across multiple graph domains would make the survey's domain distinctions directly measurable.
The framework could be extended to dynamic graphs whose edges arrive over time, an aspect left implicit in the static-data focus of most reviewed work.
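
As a rough illustration of the first extension, a compression step inside the organizer could merge retrieved triples that share a head and relation before they reach the generator. This is a sketch of one simple possibility, not a technique attributed to the paper; the function name and the output format are assumptions.

```python
from collections import defaultdict

# Hypothetical toy representation: (head, relation, tail) triples.
Triple = tuple[str, str, str]


def compress_triples(triples: list[Triple]) -> list[str]:
    """Merge triples with the same head and relation into one line each.

    Duplicate tails are dropped, and first-seen order is preserved.
    """
    grouped: dict[tuple[str, str], list[str]] = defaultdict(list)
    for head, relation, tail in triples:
        if tail not in grouped[(head, relation)]:
            grouped[(head, relation)].append(tail)
    return [f"{head} {relation}: {', '.join(tails)}"
            for (head, relation), tails in grouped.items()]


if __name__ == "__main__":
    retrieved = [
        ("A", "cites", "B"),
        ("A", "cites", "C"),
        ("B", "cites", "C"),
        ("A", "cites", "B"),  # duplicate, dropped by the compressor
    ]
    for line in compress_triples(retrieved):
        print(line)
```

Grouping shortens the context without discarding any distinct fact, which is the kind of trade-off a summarization step in the organizer would have to manage.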

Load-bearing premise

That the relational structure and domain-specific formatting of graphs create challenges distinct enough to require separate designs rather than a single uniform approach.

What would settle it

An experiment in which one unmodified neural-embedding retriever and generator achieves comparable accuracy on graph data drawn from several unrelated domains without any domain-specific organizer or data-source adjustments.
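
The check such an experiment implies can be sketched in a few lines. Two points are assumptions rather than anything stated in the paper: that "comparable" is judged against domain-tailored systems, and that it means staying within a fixed accuracy tolerance in every domain. The numbers in the usage example are made-up placeholders, not results.

```python
def is_comparable(uniform: dict[str, float],
                  tailored: dict[str, float],
                  tolerance: float = 0.02) -> bool:
    """True if the unmodified pipeline trails the tailored one by at most
    `tolerance` accuracy in every domain (the threshold is an assumption)."""
    if uniform.keys() != tailored.keys():
        raise ValueError("both result sets must cover the same domains")
    return all(tailored[d] - uniform[d] <= tolerance for d in uniform)


def largest_gap(uniform: dict[str, float],
                tailored: dict[str, float]) -> tuple[str, float]:
    """Return the domain where the unmodified pipeline trails the most, with the gap."""
    gaps = {d: tailored[d] - uniform[d] for d in uniform}
    worst = max(gaps, key=gaps.get)
    return worst, gaps[worst]


if __name__ == "__main__":
    # Placeholder accuracies for illustration only.
    uniform_scores = {"knowledge graph": 0.79, "molecular graph": 0.60}
    tailored_scores = {"knowledge graph": 0.80, "molecular graph": 0.75}
    print(is_comparable(uniform_scores, tailored_scores))
    print(largest_gap(uniform_scores, tailored_scores))
```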

Original abstract

Retrieval-augmented generation (RAG) is a powerful technique that enhances downstream task execution by retrieving additional information, such as knowledge, skills, and tools from external sources. Graph, by its intrinsic "nodes connected by edges" nature, encodes massive heterogeneous and relational information, making it a golden resource for RAG in tremendous real-world applications. As a result, we have recently witnessed increasing attention on equipping RAG with Graph, i.e., GraphRAG. However, unlike conventional RAG, where the retriever, generator, and external data sources can be uniformly designed in the neural-embedding space, the uniqueness of graph-structured data, such as diverse-formatted and domain-specific relational knowledge, poses unique and significant challenges when designing GraphRAG for different domains. Given the broad applicability, the associated design challenges, and the recent surge in GraphRAG, a systematic and up-to-date survey of its key concepts and techniques is urgently desired. Following this motivation, we present a comprehensive and up-to-date survey on GraphRAG. Our survey first proposes a holistic GraphRAG framework by defining its key components, including query processor, retriever, organizer, generator, and data source. Furthermore, recognizing that graphs in different domains exhibit distinct relational patterns and require dedicated designs, we review GraphRAG techniques uniquely tailored to each domain. Finally, we discuss research challenges and brainstorm directions to inspire cross-disciplinary opportunities. Our survey repository is publicly maintained at https://github.com/Graph-RAG/GraphRAG/.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

This is a useful organizing survey on GraphRAG that introduces a five-part framework and domain reviews, though it offers no new experiments.

This GraphRAG survey is mainly useful as an organizing document for a new subfield. It proposes a framework with five components and reviews techniques by domain, which gives some shape to the recent papers in this area. The framework itself is straightforward: query processor, retriever, organizer, generator, and data source. That breakdown lets the authors group existing work and highlight where graphs add value over plain vector retrieval. The domain sections address real differences, like how knowledge graphs differ from molecular graphs, and the challenges section points to open problems. The public repo is a practical touch for keeping the survey alive.

One limitation is that the paper stays at the level of review. No new methods or large-scale comparisons appear, so its strength rests on how thoroughly it covers the literature and whether the taxonomy holds up. The motivation about unique challenges per domain is stated but not quantified here, which is fine for a survey but means the paper's value is more in curation than in testing ideas.

Researchers in information retrieval or graph-based ML would get the most from it as a starting point. It could also serve as a reference for practitioners building RAG systems with graph data. I would recommend peer review. The topic is timely and the structure is clear enough that referees could improve the coverage and sharpen the framework.

Referee Report

0 major / 3 minor

Summary. The manuscript surveys Retrieval-Augmented Generation with Graphs (GraphRAG). It argues that graphs' node-edge structure makes them particularly suitable for RAG due to their ability to encode heterogeneous and relational information. The authors propose a holistic framework by defining key components including the query processor, retriever, organizer, generator, and data source. They review GraphRAG techniques tailored to different domains and discuss research challenges along with future directions. The survey is supported by a publicly maintained GitHub repository.

Significance. This work is significant for organizing the growing literature on GraphRAG and providing a common framework that can guide research in information retrieval and related areas. The domain-specific review and challenge discussion could promote cross-disciplinary applications. The public repository is a notable strength for ensuring the survey remains current and for enabling reproducibility of the literature collection.

minor comments (3)

[Abstract] The statement that graph-structured data poses 'unique and significant challenges' is central to the motivation; consider adding a short illustrative example of such a challenge to make the distinction from standard RAG more concrete.
[Introduction] The claim of a 'recent surge in GraphRAG' would be strengthened by including a brief citation analysis or count of recent papers.
The repository link is provided, but the manuscript should specify what resources (e.g., paper list, code) are available there to aid readers.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive assessment of our survey on Retrieval-Augmented Generation with Graphs (GraphRAG). We appreciate the recognition of its significance in organizing the growing literature, proposing a holistic framework with components such as the query processor, retriever, organizer, generator, and data source, as well as the domain-tailored reviews, challenges, and future directions. The public GitHub repository is also noted as a strength. We accept the recommendation for minor revision and stand ready to incorporate any specific suggestions.

Circularity Check

0 steps flagged

No significant circularity in this literature survey

This paper is a survey that organizes existing GraphRAG literature into a proposed holistic framework (query processor, retriever, organizer, generator, data source) and domain-specific reviews, without any mathematical derivations, equations, fitted parameters, or predictions that reduce to the paper's own inputs by construction. The motivation regarding unique challenges of graph-structured data is presented as background for the survey's organization rather than a load-bearing derived claim. No self-citation chains, uniqueness theorems, or ansatzes are invoked in a manner that would make the central contribution circular; the work explicitly references an external surge in papers and a public GitHub repository as external context.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

This is a survey paper. No free parameters, mathematical axioms, or invented entities are introduced in the sense of original technical derivations. The framework is a high-level organizational construct drawn from existing components in the RAG and graph literature.

pith-pipeline@v0.9.0 · 5869 in / 1180 out tokens · 33542 ms · 2026-05-18T04:29:19.362133+00:00 · methodology

Forward citations

Cited by 19 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

XGRAG: A Graph-Native Framework for Explaining KG-based Retrieval-Augmented Generation
cs.AI 2026-04 unverdicted novelty 7.0

XGRAG uses graph perturbations to quantify component contributions in GraphRAG and achieves 14.81% better explanation quality than text-based baselines on QA datasets, with correlations to graph centrality.
BizCompass: Benchmarking the Reasoning Capabilities of LLMs in Business Knowledge and Applications
cs.CE 2026-04 unverdicted novelty 7.0

BizCompass is a dual-axis benchmark evaluating LLMs on business knowledge in finance, economics, statistics, and operations management, linked to analyst, trader, and consultant roles, with public datasets released af...
Do We Still Need GraphRAG? Benchmarking RAG and GraphRAG for Agentic Search Systems
cs.IR 2026-04 unverdicted novelty 7.0

Agentic search narrows the gap between dense RAG and GraphRAG but does not remove GraphRAG's advantage on complex multi-hop reasoning.
Hypergraph Enterprise Agentic Reasoner over Heterogeneous Business Systems
cs.AI 2026-05 unverdicted novelty 6.0

HEAR uses a stratified hypergraph ontology to orchestrate evidence-driven multi-hop reasoning over heterogeneous business systems, reaching 94.7% accuracy on supply-chain root-cause tasks with open-weight models.
Why Retrieval-Augmented Generation Fails: A Graph Perspective
cs.CL 2026-05 unverdicted novelty 6.0

Attribution graphs reveal that RAG failures arise from shallow fragmented evidence flow in LLMs, enabling topology-based detection and targeted interventions that reinforce question-guided routing.
LARAG: Link-Aware Retrieval Strategy for RAG Systems in Hyperlinked Technical Documentation
cs.IR 2026-05 unverdicted novelty 6.0

LARAG improves RAG answer quality on hyperlinked technical documentation by using author-defined links for retrieval, achieving higher BERTScore while using fewer chunks and tokens than standard embedding-based RAG.
NeocorRAG: Less Irrelevant Information, More Explicit Evidence, and More Effective Recall via Evidence Chains
cs.IR 2026-04 unverdicted novelty 6.0

NeocorRAG uses Evidence Chains to achieve SOTA retrieval quality in RAG on HotpotQA, 2WikiMultiHopQA, MuSiQue, and NQ for 3B and 70B models while using under 20% of the tokens of comparable methods.
Graph-to-Frame RAG: Visual-Space Knowledge Fusion for Training-Free and Auditable Video Reasoning
cs.CV 2026-04 unverdicted novelty 6.0

G2F-RAG converts retrieved knowledge subgraphs into a single visual reasoning frame appended to videos, enabling training-free and interpretable improvements for LMM-based video reasoning on knowledge-intensive tasks.
Memory in the LLM Era: Modular Architectures and Strategies in a Unified Framework
cs.CL 2026-04 unverdicted novelty 6.0

A unified framework for LLM agent memory is benchmarked, with a new hybrid method outperforming state-of-the-art on standard tasks.
HyMem: Hybrid Memory Architecture with Dynamic Retrieval Scheduling
cs.AI 2026-02 unverdicted novelty 6.0

HyMem introduces dual-granular memory storage with a lightweight summary module for fast responses and selective activation of a deep LLM module for complex queries, outperforming full-context baselines by 92.6% lower...
UnWeaving the knots of GraphRAG -- turns out VectorRAG is almost enough
cs.IR 2026-02 unverdicted novelty 6.0

UnWeaver disentangles documents into entities via LLM to retrieve original chunks, yielding a simpler alternative to GraphRAG that still reduces noise and preserves source fidelity.
Automatic Ontology Construction Using LLMs as an External Layer of Memory, Verification, and Planning for Hybrid Intelligent Systems
cs.AI 2026-04 unverdicted novelty 5.0

A hybrid system augments LLMs with an automated external RDF/OWL ontology layer for long-term memory, SHACL/OWL validation, and improved multi-step reasoning on tasks like Tower of Hanoi.
Injecting Structured Biomedical Knowledge into Language Models: Continual Pretraining vs. GraphRAG
cs.CL 2026-04 unverdicted novelty 5.0

Continual pretraining on UMLS-derived text improves BERT on BLURB biomedical tasks while GraphRAG boosts LLaMA 3-8B accuracy by over 3 points on PubMedQA and 5 on BioASQ without retraining.
Search-R3: Unifying Reasoning and Embedding in Large Language Models
cs.CL 2025-10 unverdicted novelty 5.0

Search-R3 trains LLMs to output search embeddings as a direct product of step-by-step reasoning via supervised pre-training and a specialized RL environment that avoids full corpus re-encoding.
G-reasoner: Foundation Models for Unified Reasoning over Graph-structured Knowledge
cs.AI 2025-09 unverdicted novelty 5.0

G-reasoner uses QuadGraph abstraction and a 34M-parameter graph foundation model integrated with LLMs to enable scalable reasoning over diverse graph-structured knowledge, outperforming baselines on six benchmarks.
AssemPlanner: A Multi-Agent Based Task Planning Framework for Flexible Assembly System
cs.RO 2026-05 unverdicted novelty 4.0

AssemPlanner is a ReAct-based multi-agent system that autonomously generates production plans from natural language inputs by integrating scheduling, knowledge, line balancing, and scene graph feedback.
Beyond the Parameters: A Technical Survey of Contextual Enrichment in Large Language Models: From In-Context Prompting to Causal Retrieval-Augmented Generation
cs.CL 2026-04 unverdicted novelty 4.0

The survey unifies LLM augmentation techniques along the single axis of structured context supplied at inference time and supplies a literature screening protocol plus deployment decision framework.
Sentra-Guard: A Real-Time Multilingual Defense Against Adversarial LLM Prompts
cs.CR 2025-10 unverdicted novelty 4.0

Sentra-Guard reports 99.96% detection of adversarial LLM prompts with AUC 1.00 and ASR of 0.004% using a hybrid SBERT-FAISS and transformer classifier architecture with multilingual translation and human feedback.
Development and Preliminary Evaluation of a Domain-Specific Large Language Model for Tuberculosis Care in South Africa
cs.CL 2026-03 unverdicted novelty 3.0

A domain-specific LLM for TB care in South Africa, created by fine-tuning BioMistral-7B with QLoRA and GraphRAG on local guidelines, shows improved contextual alignment over the base model.

Reference graph

Works this paper leans on

299 extracted references · 299 canonical work pages · cited by 19 Pith papers · 20 internal anchors

[1]

Accurate structure prediction of biomolecular interactions with alphafold 3

Josh Abramson, Jonas Adler, Jack Dunger, Richard Evans, Tim Green, Alexander Pritzel, Olaf Ronneberger, Lindsay Willmore, Andrew J Ballard, Joshua Bambrick, et al. Accurate structure prediction of biomolecular interactions with alphafold 3. Nature, pages 1–3, 2024

work page 2024
[2]

Supporting student decisions on learning recommendations: An llm-based chatbot with knowl- edge graph contextualization for conversational explainability and mentoring

Hasan Abu-Rasheed, Mohamad Hussam Abdulsalam, Christian Weber, and Madjid Fathi. Supporting student decisions on learning recommendations: An llm-based chatbot with knowl- edge graph contextualization for conversational explainability and mentoring. arXiv preprint arXiv:2401.08517, 2024

work page arXiv 2024
[3]

Knowledge graph based synthetic corpus generation for knowledge-enhanced language model pre-training

Oshin Agarwal, Heming Ge, Siamak Shakeri, and Rami Al-Rfou. Knowledge graph based synthetic corpus generation for knowledge-enhanced language model pre-training. arXiv preprint arXiv:2010.12688, 2020

work page arXiv 2010
[4]

On edge classification in networks with structure and content

Charu C Aggarwal, Yao Li, S Yu Philip, and Yuchen Zhao. On edge classification in networks with structure and content. In 2017 IEEE 33rd international conference on data engineering (ICDE), pages 187–190. IEEE, 2017

work page 2017
[5]

Learning Role-based Graph Embeddings

Nesreen K Ahmed, Ryan Rossi, John Boaz Lee, Theodore L Willke, Rong Zhou, Xiang- nan Kong, and Hoda Eldardiry. Learning role-based graph embeddings. arXiv preprint arXiv:1802.02896, 2018. 53

work page internal anchor Pith review Pith/arXiv arXiv 2018
[6]

Named entity extraction for knowledge graphs: A literature overview

Tareq Al-Moslmi, Marc Gallofré Ocaña, Andreas L Opdahl, and Csaba Veres. Named entity extraction for knowledge graphs: A literature overview. IEEE Access, 8:32862–32881, 2020

work page 2020
[7]

Statistical mechanics of complex networks

Réka Albert and Albert-László Barabási. Statistical mechanics of complex networks. Reviews of modern physics, 74(1):47, 2002

work page 2002
[8]

Graph-based text classification: learn from your neighbors

Ralitsa Angelova and Gerhard Weikum. Graph-based text classification: learn from your neighbors. In Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval, pages 485–492, 2006

work page 2006
[9]

Joint embeddings for graph instruction tuning

Vlad Argatu, Aaron Haag, and Oliver Lohse. Joint embeddings for graph instruction tuning. arXiv preprint arXiv:2405.20684, 2024

work page arXiv 2024
[10]

Learning to retrieve reasoning paths over wikipedia graph for question answering

Akari Asai, Kazuma Hashimoto, Hannaneh Hajishirzi, Richard Socher, and Caiming Xiong. Learning to retrieve reasoning paths over wikipedia graph for question answering. arXiv preprint arXiv:1911.10470, 2019

work page arXiv 1911
[11]

Retrieval-based language models and applications

Akari Asai, Sewon Min, Zexuan Zhong, and Danqi Chen. Retrieval-based language models and applications. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 6: Tutorial Abstracts), pages 41–46, 2023

work page 2023
[12]

Query expansion techniques for information retrieval: a survey

Hiteshwar Kumar Azad and Akshay Deepak. Query expansion techniques for information retrieval: a survey. Information Processing & Management, 56(5):1698–1735, 2019

work page 2019
[13]

The pitfalls of next-token prediction

Gregor Bachmann and Vaishnavh Nagarajan. The pitfalls of next-token prediction. arXiv preprint arXiv:2403.06963, 2024

work page arXiv 2024
[14]

Knowledge-augmented language model prompting for zero-shot knowledge graph question answering

Jinheon Baek, Alham Fikri Aji, and Amir Saffari. Knowledge-augmented language model prompting for zero-shot knowledge graph question answering. arXiv preprint arXiv:2306.04136, 2023

work page arXiv 2023
[15]

Atj-net: Auto- table-join network for automatic learning on relational databases

Jinze Bai, Jialin Wang, Zhao Li, Donghui Ding, Ji Zhang, and Jun Gao. Atj-net: Auto- table-join network for automatic learning on relational databases. In Proceedings of the Web Conference 2021, pages 1540–1551, 2021

work page 2021
[16]

Hypergraph convolution and hypergraph attention

Song Bai, Feihu Zhang, and Philip HS Torr. Hypergraph convolution and hypergraph attention. Pattern Recognition, 110:107637, 2021

work page 2021
[17]

Abstract meaning representation for sembanking

Laura Banarescu, Claire Bonial, Shu Cai, Madalina Georgescu, Kira Griffitt, Ulf Hermjakob, Kevin Knight, Philipp Koehn, Martha Palmer, and Nathan Schneider. Abstract meaning representation for sembanking. In Proceedings of the 7th linguistic annotation workshop and interoperability with discourse, pages 178–186, 2013

work page 2013
[18]

Emergence of scaling in random networks

Albert-László Barabási and Réka Albert. Emergence of scaling in random networks. science, 286(5439):509–512, 1999

work page 1999
[19]

Graph convolution for semi- supervised classification: Improved linear separability and out-of-distribution generalization

Aseem Baranwal, Kimon Fountoulakis, and Aukosh Jagannath. Graph convolution for semi- supervised classification: Improved linear separability and out-of-distribution generalization. arXiv preprint arXiv:2102.06966, 2021

work page arXiv 2021
[20]

Large generative ai models for telecom: The next big thing? IEEE Communications Magazine, 2024

Lina Bariah, Qiyang Zhao, Hang Zou, Yu Tian, Faouzi Bader, and Merouane Debbah. Large generative ai models for telecom: The next big thing? IEEE Communications Magazine, 2024

work page 2024
[21]

Seven failure points when engineering a retrieval augmented generation system

Scott Barnett, Stefanus Kurniawan, Srikanth Thudumu, Zach Brannelly, and Mohamed Ab- delrazek. Seven failure points when engineering a retrieval augmented generation system. In Proceedings of the IEEE/ACM 3rd International Conference on AI Engineering-Software Engineering for AI, pages 194–199, 2024

work page 2024
[22]

The technological landscape and applications of single-cell multi-omics

Alev Baysoy, Zhiliang Bai, Rahul Satija, and Rong Fan. The technological landscape and applications of single-cell multi-omics. Nature Reviews Molecular Cell Biology , 24(10): 695–713, 2023

work page 2023
[23]

Tabgraphs: A benchmark and strong baselines for learning on graphs with tabular node features

Gleb Bazhenov, Oleg Platonov, and Liudmila Prokhorenkova. Tabgraphs: A benchmark and strong baselines for learning on graphs with tabular node features. arXiv e-prints, pages arXiv–2409, 2024. 54

work page 2024
[24]

Graph of thoughts: Solving elaborate problems with large language models

Maciej Besta, Nils Blach, Ales Kubicek, Robert Gerstenberger, Michal Podstawski, Lukas Gianinazzi, Joanna Gajda, Tomasz Lehmann, Hubert Niewiadomski, Piotr Nyczyk, et al. Graph of thoughts: Solving elaborate problems with large language models. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 38, pages 17682–17690, 2024

work page 2024
[25]

The unified medical language system (umls): integrating biomedical terminology

Olivier Bodenreider. The unified medical language system (umls): integrating biomedical terminology. Nucleic acids research, 32(suppl_1):D267–D270, 2004

work page 2004
[26]

Weisfeiler and lehman go cellular: Cw networks

Cristian Bodnar, Fabrizio Frasca, Nina Otter, Yuguang Wang, Pietro Lio, Guido F Montufar, and Michael Bronstein. Weisfeiler and lehman go cellular: Cw networks. Advances in neural information processing systems, 34:2625–2640, 2021

work page 2021
[27]

Freebase: a collaboratively created graph database for structuring human knowledge

Kurt Bollacker, Colin Evans, Praveen Paritosh, Tim Sturge, and Jamie Taylor. Freebase: a collaboratively created graph database for structuring human knowledge. In Proceedings of the 2008 ACM SIGMOD international conference on Management of data, pages 1247–1250, 2008

work page 2008
[28]

A review of biomedical datasets relating to drug discovery: a knowledge graph perspective

Stephen Bonner, Ian P Barrett, Cheng Ye, Rowan Swiers, Ola Engkvist, Andreas Bender, Charles Tapley Hoyt, and William L Hamilton. A review of biomedical datasets relating to drug discovery: a knowledge graph perspective. Briefings in Bioinformatics, 23(6):bbac404, 2022

work page 2022
[29]

Transposition of native chromatin for multimodal regulatory analysis and personal epigenomics

Jason D Buenrostro, Paul G Giresi, Lisa C Zaba, Howard Y Chang, and William J Greenleaf. Transposition of native chromatin for multimodal regulatory analysis and personal epigenomics. Nature methods, 10(12):1213, 2013

work page 2013
[30]

Preliminary study on the construction of chinese medical knowledge graph

Odma Byambasuren, Yunfei Yang, Zhifang Sui, Damai Dai, Baobao Chang, Sujian Li, and Hongying Zan. Preliminary study on the construction of chinese medical knowledge graph. Journal of Chinese Information Processing, 33(10):1–9, 2019

work page 2019
[31]

Xai meets llms: A survey of the relation between explainable ai and large language models

Erik Cambria, Lorenzo Malandri, Fabio Mercorio, Navid Nobani, and Andrea Seveso. Xai meets llms: A survey of the relation between explainable ai and large language models. arXiv preprint arXiv:2407.15248, 2024

work page arXiv 2024
[32]

A graph based analysis of leak localization in urban water networks

Antonio Candelieri, Dante Conti, and Francesco Archetti. A graph based analysis of leak localization in urban water networks. Procedia Engineering, 70:228–237, 2014

work page 2014
[33]

Random geometric graphs, 2005

Chris Cannings. Random geometric graphs, 2005

work page 2005
[34]

Relational data imputation with graph neural networks

Riccardo Cappuzzo, Saravanan Thirumuruganathan, and Paolo Papotti. Relational data imputation with graph neural networks. In EDBT/ICDT 2024, 27th International Conference on Extending Database Technology, 2024

work page 2024
[35]

The use of mmr, diversity-based reranking for reordering documents and producing summaries

Jaime Carbonell and Jade Goldstein. The use of mmr, diversity-based reranking for reordering documents and producing summaries. In Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval , pages 335–336, 1998

work page 1998
[36]

Graphllm: Boosting graph reasoning ability of large language model

Ziwei Chai, Tianjie Zhang, Liang Wu, Kaiqiao Han, Xiaohai Hu, Xuanwen Huang, and Yang Yang. Graphllm: Boosting graph reasoning ability of large language model. arXiv preprint arXiv:2310.05845, 2023

work page arXiv 2023
[37]

Building a knowledge graph to enable precision medicine

Payal Chandak, Kexin Huang, and Marinka Zitnik. Building a knowledge graph to enable precision medicine. Scientific Data, 10(1):67, 2023

work page 2023
[38]

Path-based explanation for knowledge graph completion

Heng Chang, Jiangnan Ye, Alejo Lopez-Avila, Jinhua Du, and Jia Li. Path-based explanation for knowledge graph completion. In Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pages 231–242, 2024

work page 2024
[39]

Fairness-aware graph neural networks: A survey

April Chen, Ryan A Rossi, Namyong Park, Puja Trivedi, Yu Wang, Tong Yu, Sungchul Kim, Franck Dernoncourt, and Nesreen K Ahmed. Fairness-aware graph neural networks: A survey. ACM Transactions on Knowledge Discovery from Data, 18(6):1–23, 2024. 55

work page 2024
[40]

Autokg: Efficient automated knowledge graph generation for language models

Bohan Chen and Andrea L Bertozzi. Autokg: Efficient automated knowledge graph generation for language models. In 2023 IEEE International Conference on Big Data (BigData), pages 3117–3126. IEEE, 2023

work page 2023
[41]

Sur- vey and open problems in privacy-preserving knowledge graph: merging, query, representation, completion, and applications

Chaochao Chen, Fei Zheng, Jamie Cui, Yuwei Cao, Guanfeng Liu, Jia Wu, and Jun Zhou. Sur- vey and open problems in privacy-preserving knowledge graph: merging, query, representation, completion, and applications. International Journal of Machine Learning and Cybernetics, pages 1–20, 2024

work page 2024
[42]

Rich knowledge sources bring complex knowledge conflicts: Recalibrating models to reflect conflicting evidence

Hung-Ting Chen, Michael JQ Zhang, and Eunsol Choi. Rich knowledge sources bring complex knowledge conflicts: Recalibrating models to reflect conflicting evidence. arXiv preprint arXiv:2210.13701, 2022

work page arXiv 2022
[43]

Understanding retrieval augmentation for long-form question answering

Hung-Ting Chen, Fangyuan Xu, Shane A Arora, and Eunsol Choi. Understanding retrieval augmentation for long-form question answering. arXiv preprint arXiv:2310.12150, 2023

work page arXiv 2023
[44]

Gedi: A graph-based end-to- end data imputation framework

Katrina Chen, Xiuqin Liang, Zheng Ma, and Zhibin Zhang. Gedi: A graph-based end-to- end data imputation framework. In 2023 IEEE 35th International Conference on Tools with Artificial Intelligence (ICTAI), pages 723–730. IEEE, 2023

work page 2023
[45]

Sgsum: transform- ing multi-document summarization into sub-graph selection

Moye Chen, Wei Li, Jiachen Liu, Xinyan Xiao, Hua Wu, and Haifeng Wang. Sgsum: transform- ing multi-document summarization into sub-graph selection. arXiv preprint arXiv:2110.12645, 2021

work page arXiv 2021
[46]

Hytrel: Hypergraph-enhanced tabular data representation learning

Pei Chen, Soumajyoti Sarkar, Leonard Lausen, Balasubramaniam Srinivasan, Sheng Zha, Ruihong Huang, and George Karypis. Hytrel: Hypergraph-enhanced tabular data representation learning. Advances in Neural Information Processing Systems, 36, 2024

work page 2024
[47]

Knowedu: A system to construct knowledge graph for education

Penghe Chen, Yu Lu, Vincent W Zheng, Xiyang Chen, and Boda Yang. Knowedu: A system to construct knowledge graph for education. Ieee Access, 6:31553–31563, 2018

work page 2018
[48]

Improving commonsense question answering by graph-based iterative retrieval over multiple knowledge sources

Qianglong Chen, Feng Ji, Haiqing Chen, and Yin Zhang. Improving commonsense question answering by graph-based iterative retrieval over multiple knowledge sources. arXiv preprint arXiv:2011.02705, 2020

work page arXiv 2011
[49]

Llaga: Large language and graph assistant

Runjin Chen, Tong Zhao, Ajay Jaiswal, Neil Shah, and Zhangyang Wang. Llaga: Large language and graph assistant. arXiv preprint arXiv:2402.08170, 2024

work page arXiv 2024
[50]

Xgboost: A scalable tree boosting system

Tianqi Chen and Carlos Guestrin. Xgboost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 785–794, 2016

work page 2016
[51]

AgentVerse: Facilitating Multi-Agent Collaboration and Exploring Emergent Behaviors

Weize Chen, Yusheng Su, Jingwei Zuo, Cheng Yang, Chenfei Yuan, Chen Qian, Chi-Min Chan, Yujia Qin, Yaxi Lu, Ruobing Xie, et al. Agentverse: Facilitating multi-agent collaboration and exploring emergent behaviors in agents. arXiv preprint arXiv:2308.10848, 2023

work page internal anchor Pith review Pith/arXiv arXiv 2023
[52]

Graph generative pre-trained transformer

Xiaohui Chen, Yinkai Wang, Jiaxing He, Yuanqi Du, Soha Hassoun, Xiaolin Xu, and Li-Ping Liu. Graph generative pre-trained transformer. arXiv preprint arXiv:2501.01073, 2025

work page arXiv 2025
[53]

A review: Knowledge reasoning over knowledge graph

Xiaojun Chen, Shengbin Jia, and Yang Xiang. A review: Knowledge reasoning over knowledge graph. Expert systems with applications, 141:112948, 2020

work page 2020
[54]

Analyze, generate and refine: Query expansion with llms for zero-shot open-domain qa

Xinran Chen, Xuanang Chen, Ben He, Tengfei Wen, and Le Sun. Analyze, generate and refine: Query expansion with llms for zero-shot open-domain qa. In Findings of the Association for Computational Linguistics ACL 2024, pages 11908–11922, 2024

work page 2024
[55]

Label-free node classification on graphs with large language models (llms)

Zhikai Chen, Haitao Mao, Hongzhi Wen, Haoyu Han, Wei Jin, Haiyang Zhang, Hui Liu, and Jiliang Tang. Label-free node classification on graphs with large language models (llms). arXiv preprint arXiv:2310.04668, 2023

work page arXiv 2023
[56]

Exploring the potential of large language models (llms) in learning on graphs

Zhikai Chen, Haitao Mao, Hang Li, Wei Jin, Hongzhi Wen, Xiaochi Wei, Shuaiqiang Wang, Dawei Yin, Wenqi Fan, Hui Liu, et al. Exploring the potential of large language models (llms) in learning on graphs. ACM SIGKDD Explorations Newsletter, 25(2):42–61, 2024

work page 2024
[57]

Anti-money laundering by group-aware deep graph learning

Dawei Cheng, Yujia Ye, Sheng Xiang, Zhenwei Ma, Ying Zhang, and Changjun Jiang. Anti-money laundering by group-aware deep graph learning. IEEE Transactions on Knowledge and Data Engineering, 35(12):12444–12457, 2023

work page 2023
[58]

Structure guided prompt: Instructing large language model in multi-step reasoning by exploring graph structure of the text

Kewei Cheng, Nesreen K Ahmed, Theodore Willke, and Yizhou Sun. Structure guided prompt: Instructing large language model in multi-step reasoning by exploring graph structure of the text. arXiv preprint arXiv:2402.13415, 2024

work page arXiv 2024
[59]

Multi-hop question answering under temporal knowledge editing

Keyuan Cheng, Gang Lin, Haoyang Fei, Lu Yu, Muhammad Asif Ali, Lijie Hu, Di Wang, et al. Multi-hop question answering under temporal knowledge editing. arXiv preprint arXiv:2404.00492, 2024

work page arXiv 2024
[60]

scgac: a graph attentional architecture for clustering single-cell rna-seq data

Yi Cheng and Xiuli Ma. scgac: a graph attentional architecture for clustering single-cell rna-seq data. Bioinformatics, 38(8):2187–2193, 2022

work page 2022
[61]

Hitab: A hierarchical table dataset for question answering and natural language generation

Zhoujun Cheng, Haoyu Dong, Zhiruo Wang, Ran Jia, Jiaqi Guo, Yan Gao, Shi Han, Jian-Guang Lou, and Dongmei Zhang. Hitab: A hierarchical table dataset for question answering and natural language generation. arXiv preprint arXiv:2108.06712, 2021

work page arXiv 2021
[62]

Complex logical reasoning over knowledge graphs using large language models

Nurendra Choudhary and Chandan K Reddy. Complex logical reasoning over knowledge graphs using large language models. arXiv preprint arXiv:2305.01157, 2023

work page arXiv 2023
[63]

Connecting the dots: Document-level neural relation extraction with edge-oriented graphs

Fenia Christopoulou, Makoto Miwa, and Sophia Ananiadou. Connecting the dots: Document-level neural relation extraction with edge-oriented graphs. arXiv preprint arXiv:1909.00228, 2019

work page arXiv 2019
[64]

Gnn-based embedding for clustering scrna-seq data

Madalina Ciortan and Matthieu Defrance. Gnn-based embedding for clustering scrna-seq data. Bioinformatics, 38(4):1037–1044, 2022

work page 2022
[65]

Training Verifiers to Solve Math Word Problems

Karl Cobbe, Vineet Kosaraju, Mohammad Bavarian, Mark Chen, Heewoo Jun, Lukasz Kaiser, Matthias Plappert, Jerry Tworek, Jacob Hilton, Reiichiro Nakano, et al. Training verifiers to solve math word problems. arXiv preprint arXiv:2110.14168, 2021

work page internal anchor Pith review Pith/arXiv arXiv 2021
[66]

Relational database: A practical foundation for productivity

Edgar F Codd. Relational database: A practical foundation for productivity. In ACM Turing award lectures, page 1981. 2007

work page 1981
[67]

Latent-graph learning for disease prediction

Luca Cosmo, Anees Kazi, Seyed-Ahmad Ahmadi, Nassir Navab, and Michael Bronstein. Latent-graph learning for disease prediction. In Medical Image Computing and Computer Assisted Intervention–MICCAI 2020: 23rd International Conference, Lima, Peru, October 4–8, 2020, Proceedings, Part II 23, pages 643–653. Springer, 2020

work page 2020
[68]

On positional and structural node features for graph neural networks on non-attributed graphs

Hejie Cui, Zijie Lu, Pan Li, and Carl Yang. On positional and structural node features for graph neural networks on non-attributed graphs. In Proceedings of the 31st ACM International Conference on Information & Knowledge Management, pages 3898–3902, 2022

work page 2022
[69]

More: Multi-modal retrieval augmented generative commonsense reasoning

Wanqing Cui, Keping Bi, Jiafeng Guo, and Xueqi Cheng. More: Multi-modal retrieval augmented generative commonsense reasoning. arXiv preprint arXiv:2402.13625, 2024

work page arXiv 2024
[70]

Relational data embeddings for feature enrichment with background information

Alexis Cvetkov-Iliev, Alexandre Allauzen, and Gaël Varoquaux. Relational data embeddings for feature enrichment with background information. Machine Learning, 112(2):687–720, 2023

work page 2023
[71]

Supervised learning on relational databases with graph neural networks

Milan Cvitkovic. Supervised learning on relational databases with graph neural networks. arXiv preprint arXiv:2002.02046, 2020

work page arXiv 2020
[72]

Counter-intuitive: Large language models can better understand knowledge graphs than we thought

Xinbang Dai, Yuncheng Hua, Tongtong Wu, Yang Sheng, and Guilin Qi. Counter-intuitive: Large language models can better understand knowledge graphs than we thought. arXiv preprint arXiv:2402.11541, 2024

work page arXiv 2024
[73]

Revisiting the graph reasoning ability of large language models: Case studies in translation, connectivity and shortest path

Xinnan Dai, Qihao Wen, Yifei Shen, Hongzhi Wen, Dongsheng Li, Jiliang Tang, and Caihua Shan. Revisiting the graph reasoning ability of large language models: Case studies in translation, connectivity and shortest path. arXiv preprint arXiv:2408.09529, 2024

work page arXiv 2024
[74]

Security and privacy challenges of large language models: A survey

Badhan Chandra Das, M Hadi Amini, and Yanzhao Wu. Security and privacy challenges of large language models: A survey. arXiv preprint arXiv:2402.00888, 2024

work page arXiv 2024
[75]

Go for a walk and arrive at the answer: Reasoning over paths in knowledge bases using reinforcement learning

Rajarshi Das, Shehzaad Dhuliawala, Manzil Zaheer, Luke Vilnis, Ishan Durugkar, Akshay Krishnamurthy, Alex Smola, and Andrew McCallum. Go for a walk and arrive at the answer: Reasoning over paths in knowledge bases using reinforcement learning. In International Conference on Learning Representations, 2018

work page 2018
[76]

Question answering by reasoning across documents with graph convolutional networks

Nicola De Cao, Wilker Aziz, and Ivan Titov. Question answering by reasoning across documents with graph convolutional networks. arXiv preprint arXiv:1808.09920, 2018

work page arXiv 2018
[77]

A review of modern recommender systems using generative models (gen-recsys)

Yashar Deldjoo, Zhankui He, Julian McAuley, Anton Korikov, Scott Sanner, Arnau Ramisa, René Vidal, Maheswaran Sathiamoorthy, Atoosa Kasirzadeh, and Silvia Milano. A review of modern recommender systems using generative models (gen-recsys). arXiv preprint arXiv:2404.00579, 2024

work page arXiv 2024
[78]

Graph-based retriever captures the long tail of biomedical knowledge

Julien Delile, Srayanta Mukherjee, Anton Van Pamel, and Leonid Zhukov. Graph-based retriever captures the long tail of biomedical knowledge. arXiv preprint arXiv:2402.12352, 2024

work page arXiv 2024
[79]

Pandora: Jailbreak gpts by retrieval augmented generation poisoning

Gelei Deng, Yi Liu, Kailong Wang, Yuekang Li, Tianwei Zhang, and Yang Liu. Pandora: Jailbreak gpts by retrieval augmented generation poisoning, 2024. URL https://arxiv.org/abs/2402.08416

work page arXiv 2024
[80]

Contextual stochastic block models

Yash Deshpande, Subhabrata Sen, Andrea Montanari, and Elchanan Mossel. Contextual stochastic block models. Advances in Neural Information Processing Systems, 31, 2018

work page 2018

Showing first 80 references.