pith. machine review for the scientific record.

arxiv: 2501.00309 · v2 · pith:W2GAWQYInew · submitted 2024-12-31 · 💻 cs.IR · cs.CL · cs.LG

Retrieval-Augmented Generation with Graphs (GraphRAG)

Pith reviewed 2026-05-18 04:29 UTC · model grok-4.3

classification 💻 cs.IR · cs.CL · cs.LG
keywords GraphRAG · Retrieval-Augmented Generation · Graph-structured knowledge · Survey · Domain-specific techniques · Relational data retrieval

The pith

GraphRAG needs its own framework because graph data carries relational patterns that standard retrieval methods do not handle uniformly across domains.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The survey organizes recent work on equipping retrieval-augmented generation with graphs. Graphs link entities through edges and thereby supply structured context that improves downstream tasks such as question answering and recommendation. The authors first lay out a shared pipeline with five parts: query processor, retriever, organizer, generator, and data source. They then collect the distinct adaptations required by different application areas whose graphs follow their own relational rules. The result is both a reference map for practitioners and a list of open problems that cut across fields.

Core claim

The paper proposes a holistic GraphRAG framework whose five components are the query processor, the retriever, the organizer, the generator, and the data source; it then reviews the specialized techniques that each domain applies to exploit its particular graph patterns.

What carries the argument

The five-component GraphRAG pipeline that separates query handling, retrieval, organization, generation, and the underlying graph data source.
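The separation into five components can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the survey: the class name `GraphRAGPipeline`, the toy graph `TOY_GRAPH`, the token-matching query processor, the hop-limited retriever, and the string-returning generator that stands in for a language model.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Hypothetical toy data source: head node -> list of (relation, tail node).
Graph = Dict[str, List[Tuple[str, str]]]
Triple = Tuple[str, str, str]

TOY_GRAPH: Graph = {
    "aspirin": [("treats", "headache"), ("interacts_with", "warfarin")],
    "warfarin": [("treats", "thrombosis")],
}


@dataclass
class GraphRAGPipeline:
    data_source: Graph  # component 5: the underlying graph

    def process_query(self, query: str) -> List[str]:
        """Query processor: match query tokens against head nodes (a naive assumption)."""
        tokens = {t.strip("?.,!").lower() for t in query.split()}
        return sorted(n for n in self.data_source if n.lower() in tokens)

    def retrieve(self, seeds: List[str], hops: int = 1) -> List[Triple]:
        """Retriever: collect every edge reachable within `hops` steps of the seeds."""
        frontier, seen, triples = list(seeds), set(seeds), []
        for _ in range(hops):
            next_frontier = []
            for node in frontier:
                for rel, nbr in self.data_source.get(node, []):
                    triples.append((node, rel, nbr))
                    if nbr not in seen:
                        seen.add(nbr)
                        next_frontier.append(nbr)
            frontier = next_frontier
        return triples

    def organize(self, triples: List[Triple], budget: int = 5) -> str:
        """Organizer: deduplicate, truncate to a budget, and linearize as text."""
        unique = list(dict.fromkeys(triples))[:budget]
        return "\n".join(f"{h} --{r}--> {t}" for h, r, t in unique)

    def generate(self, query: str, context: str) -> str:
        """Generator: placeholder that returns the prompt an LLM would receive."""
        return f"Context:\n{context}\n\nQuestion: {query}"

    def run(self, query: str, hops: int = 1) -> str:
        seeds = self.process_query(query)
        triples = self.retrieve(seeds, hops)
        return self.generate(query, self.organize(triples))
```

Calling `GraphRAGPipeline(TOY_GRAPH).run("What does aspirin treat?")` walks the query through each stage in order. Swapping one method for a domain-specific version leaves the other four untouched, which is the kind of modularity the framework describes.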

If this is right

  • Practitioners in any single domain can adopt the shared pipeline and then insert only the techniques already catalogued for that domain.
  • Cross-domain comparisons become possible once every system is described with the same five components.
  • New GraphRAG work can be positioned by stating which of the five components it modifies and which domain patterns it targets.
  • Identified challenges in query processing and organization supply concrete targets for the next round of technical papers.
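One way to picture the positioning and comparison described in these points is a small record type. This is a hypothetical sketch: the `SystemCard` type, the `compare` helper, and the example system names are invented for illustration and do not come from the survey.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet

# The five component names follow the survey's framework; the identifiers are ours.
COMPONENTS = ("query_processor", "retriever", "organizer", "generator", "data_source")


@dataclass(frozen=True)
class SystemCard:
    name: str
    domain: str
    modified: FrozenSet[str]  # which of the five components the work changes

    def __post_init__(self) -> None:
        unknown = set(self.modified) - set(COMPONENTS)
        if unknown:
            raise ValueError(f"unknown components: {sorted(unknown)}")


def compare(a: SystemCard, b: SystemCard) -> Dict[str, str]:
    """For each component, report which system (if any) modifies it."""
    result = {}
    for comp in COMPONENTS:
        in_a, in_b = comp in a.modified, comp in b.modified
        if in_a and in_b:
            result[comp] = "both"
        elif in_a:
            result[comp] = a.name
        elif in_b:
            result[comp] = b.name
        else:
            result[comp] = "neither"
    return result
```

For example, comparing a hypothetical `system_a` that changes the retriever and organizer with a hypothetical `system_b` that changes only the organizer shows at a glance that the two overlap on the organizer and differ on the retriever.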

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The organizer component may become a natural place to insert graph-summarization or compression steps that current RAG pipelines lack.
  • Benchmarks that test the same task across multiple graph domains would make the survey's domain distinctions directly measurable.
  • The framework could be extended to dynamic graphs whose edges arrive over time, an aspect left implicit in the static-data focus of most reviewed work.

Load-bearing premise

That the relational structure and domain-specific formatting of graphs create challenges distinct enough to require separate designs rather than a single uniform approach.

What would settle it

An experiment in which one unmodified neural-embedding retriever and generator achieves comparable accuracy on graph data drawn from several unrelated domains without any domain-specific organizer or data-source adjustments.
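The shape of such an experiment can be sketched as a small evaluation harness. The sketch makes several assumptions that the page does not specify: systems are modeled as plain functions from question to answer, accuracy is exact match, and "comparable" is taken to mean an accuracy gap of at most a hypothetical `tolerance`.

```python
from typing import Callable, Dict, List, Tuple

Example = Tuple[str, str]          # (question, gold answer)
System = Callable[[str], str]      # maps a question to a predicted answer


def accuracy(system: System, examples: List[Example]) -> float:
    """Exact-match accuracy, case-insensitive and whitespace-trimmed."""
    if not examples:
        raise ValueError("need at least one example")
    correct = sum(
        system(q).strip().lower() == a.strip().lower() for q, a in examples
    )
    return correct / len(examples)


def uniform_is_comparable(
    uniform: System,
    specialized: Dict[str, System],
    datasets: Dict[str, List[Example]],
    tolerance: float = 0.02,       # assumed threshold, not from the source
) -> Dict[str, bool]:
    """Per domain: does the single uniform system stay within `tolerance`
    of the domain-specific system's accuracy?"""
    verdict = {}
    for domain, examples in datasets.items():
        gap = accuracy(specialized[domain], examples) - accuracy(uniform, examples)
        verdict[domain] = gap <= tolerance
    return verdict
```

If every domain came back `True` on realistic benchmarks, the premise that domains need separate designs would be weakened. A `False` in any domain would point to where specialization still pays off.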

Original abstract

Retrieval-augmented generation (RAG) is a powerful technique that enhances downstream task execution by retrieving additional information, such as knowledge, skills, and tools from external sources. Graph, by its intrinsic "nodes connected by edges" nature, encodes massive heterogeneous and relational information, making it a golden resource for RAG in tremendous real-world applications. As a result, we have recently witnessed increasing attention on equipping RAG with Graph, i.e., GraphRAG. However, unlike conventional RAG, where the retriever, generator, and external data sources can be uniformly designed in the neural-embedding space, the uniqueness of graph-structured data, such as diverse-formatted and domain-specific relational knowledge, poses unique and significant challenges when designing GraphRAG for different domains. Given the broad applicability, the associated design challenges, and the recent surge in GraphRAG, a systematic and up-to-date survey of its key concepts and techniques is urgently desired. Following this motivation, we present a comprehensive and up-to-date survey on GraphRAG. Our survey first proposes a holistic GraphRAG framework by defining its key components, including query processor, retriever, organizer, generator, and data source. Furthermore, recognizing that graphs in different domains exhibit distinct relational patterns and require dedicated designs, we review GraphRAG techniques uniquely tailored to each domain. Finally, we discuss research challenges and brainstorm directions to inspire cross-disciplinary opportunities. Our survey repository is publicly maintained at https://github.com/Graph-RAG/GraphRAG/.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 3 minor

Summary. The manuscript surveys Retrieval-Augmented Generation with Graphs (GraphRAG). It argues that graphs' node-edge structure makes them particularly suitable for RAG due to their ability to encode heterogeneous and relational information. The authors propose a holistic framework by defining key components including the query processor, retriever, organizer, generator, and data source. They review GraphRAG techniques tailored to different domains and discuss research challenges along with future directions. The survey is supported by a publicly maintained GitHub repository.

Significance. This work is significant for organizing the growing literature on GraphRAG and providing a common framework that can guide research in information retrieval and related areas. The domain-specific review and challenge discussion could promote cross-disciplinary applications. The public repository is a notable strength for ensuring the survey remains current and for enabling reproducibility of the literature collection.

minor comments (3)
  1. [Abstract] The statement that graph-structured data poses 'unique and significant challenges' is central to the motivation; consider adding a short illustrative example of such a challenge to make the distinction from standard RAG more concrete.
  2. [Introduction] The claim of a 'recent surge in GraphRAG' would be strengthened by including a brief citation analysis or count of recent papers.
  3. The repository link is provided, but the manuscript should specify what resources (e.g., paper list, code) are available there to aid readers.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive assessment of our survey on Retrieval-Augmented Generation with Graphs (GraphRAG). We appreciate the recognition of its significance in organizing the growing literature, proposing a holistic framework with components such as the query processor, retriever, organizer, generator, and data source, as well as the domain-tailored reviews, challenges, and future directions. The public GitHub repository is also noted as a strength. We accept the recommendation for minor revision and stand ready to incorporate any specific suggestions.

Circularity Check

0 steps flagged

No significant circularity in this literature survey

full rationale

This paper is a survey that organizes existing GraphRAG literature into a proposed holistic framework (query processor, retriever, organizer, generator, data source) and domain-specific reviews, without any mathematical derivations, equations, fitted parameters, or predictions that reduce to the paper's own inputs by construction. The motivation regarding unique challenges of graph-structured data is presented as background for the survey's organization rather than a load-bearing derived claim. No self-citation chains, uniqueness theorems, or ansatzes are invoked in a manner that would make the central contribution circular; the work explicitly references an external surge in papers and a public GitHub repository as external context.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

This is a survey paper. No free parameters, mathematical axioms, or invented entities are introduced in the sense of original technical derivations. The framework is a high-level organizational construct drawn from existing components in the RAG and graph literature.

pith-pipeline@v0.9.0 · 5869 in / 1180 out tokens · 33542 ms · 2026-05-18T04:29:19.362133+00:00 · methodology


Forward citations

Cited by 19 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. XGRAG: A Graph-Native Framework for Explaining KG-based Retrieval-Augmented Generation

    cs.AI 2026-04 unverdicted novelty 7.0

    XGRAG uses graph perturbations to quantify component contributions in GraphRAG and achieves 14.81% better explanation quality than text-based baselines on QA datasets, with correlations to graph centrality.

  2. BizCompass: Benchmarking the Reasoning Capabilities of LLMs in Business Knowledge and Applications

    cs.CE 2026-04 unverdicted novelty 7.0

    BizCompass is a dual-axis benchmark evaluating LLMs on business knowledge in finance, economics, statistics, and operations management, linked to analyst, trader, and consultant roles, with public datasets released af...

  3. Do We Still Need GraphRAG? Benchmarking RAG and GraphRAG for Agentic Search Systems

    cs.IR 2026-04 unverdicted novelty 7.0

    Agentic search narrows the gap between dense RAG and GraphRAG but does not remove GraphRAG's advantage on complex multi-hop reasoning.

  4. Hypergraph Enterprise Agentic Reasoner over Heterogeneous Business Systems

    cs.AI 2026-05 unverdicted novelty 6.0

    HEAR uses a stratified hypergraph ontology to orchestrate evidence-driven multi-hop reasoning over heterogeneous business systems, reaching 94.7% accuracy on supply-chain root-cause tasks with open-weight models.

  5. Why Retrieval-Augmented Generation Fails: A Graph Perspective

    cs.CL 2026-05 unverdicted novelty 6.0

    Attribution graphs reveal that RAG failures arise from shallow fragmented evidence flow in LLMs, enabling topology-based detection and targeted interventions that reinforce question-guided routing.

  6. LARAG: Link-Aware Retrieval Strategy for RAG Systems in Hyperlinked Technical Documentation

    cs.IR 2026-05 unverdicted novelty 6.0

    LARAG improves RAG answer quality on hyperlinked technical documentation by using author-defined links for retrieval, achieving higher BERTScore while using fewer chunks and tokens than standard embedding-based RAG.

  7. NeocorRAG: Less Irrelevant Information, More Explicit Evidence, and More Effective Recall via Evidence Chains

    cs.IR 2026-04 unverdicted novelty 6.0

    NeocorRAG uses Evidence Chains to achieve SOTA retrieval quality in RAG on HotpotQA, 2WikiMultiHopQA, MuSiQue, and NQ for 3B and 70B models while using under 20% of the tokens of comparable methods.

  8. Graph-to-Frame RAG: Visual-Space Knowledge Fusion for Training-Free and Auditable Video Reasoning

    cs.CV 2026-04 unverdicted novelty 6.0

    G2F-RAG converts retrieved knowledge subgraphs into a single visual reasoning frame appended to videos, enabling training-free and interpretable improvements for LMM-based video reasoning on knowledge-intensive tasks.

  9. Memory in the LLM Era: Modular Architectures and Strategies in a Unified Framework

    cs.CL 2026-04 unverdicted novelty 6.0

    A unified framework for LLM agent memory is benchmarked, with a new hybrid method outperforming state-of-the-art on standard tasks.

  10. HyMem: Hybrid Memory Architecture with Dynamic Retrieval Scheduling

    cs.AI 2026-02 unverdicted novelty 6.0

    HyMem introduces dual-granular memory storage with a lightweight summary module for fast responses and selective activation of a deep LLM module for complex queries, outperforming full-context baselines by 92.6% lower...

  11. UnWeaving the knots of GraphRAG -- turns out VectorRAG is almost enough

    cs.IR 2026-02 unverdicted novelty 6.0

    UnWeaver disentangles documents into entities via LLM to retrieve original chunks, yielding a simpler alternative to GraphRAG that still reduces noise and preserves source fidelity.

  12. Automatic Ontology Construction Using LLMs as an External Layer of Memory, Verification, and Planning for Hybrid Intelligent Systems

    cs.AI 2026-04 unverdicted novelty 5.0

    A hybrid system augments LLMs with an automated external RDF/OWL ontology layer for long-term memory, SHACL/OWL validation, and improved multi-step reasoning on tasks like Tower of Hanoi.

  13. Injecting Structured Biomedical Knowledge into Language Models: Continual Pretraining vs. GraphRAG

    cs.CL 2026-04 unverdicted novelty 5.0

    Continual pretraining on UMLS-derived text improves BERT on BLURB biomedical tasks while GraphRAG boosts LLaMA 3-8B accuracy by over 3 points on PubMedQA and 5 on BioASQ without retraining.

  14. Search-R3: Unifying Reasoning and Embedding in Large Language Models

    cs.CL 2025-10 unverdicted novelty 5.0

    Search-R3 trains LLMs to output search embeddings as a direct product of step-by-step reasoning via supervised pre-training and a specialized RL environment that avoids full corpus re-encoding.

  15. G-reasoner: Foundation Models for Unified Reasoning over Graph-structured Knowledge

    cs.AI 2025-09 unverdicted novelty 5.0

    G-reasoner uses QuadGraph abstraction and a 34M-parameter graph foundation model integrated with LLMs to enable scalable reasoning over diverse graph-structured knowledge, outperforming baselines on six benchmarks.

  16. AssemPlanner: A Multi-Agent Based Task Planning Framework for Flexible Assembly System

    cs.RO 2026-05 unverdicted novelty 4.0

    AssemPlanner is a ReAct-based multi-agent system that autonomously generates production plans from natural language inputs by integrating scheduling, knowledge, line balancing, and scene graph feedback.

  17. Beyond the Parameters: A Technical Survey of Contextual Enrichment in Large Language Models: From In-Context Prompting to Causal Retrieval-Augmented Generation

    cs.CL 2026-04 unverdicted novelty 4.0

    The survey unifies LLM augmentation techniques along the single axis of structured context supplied at inference time and supplies a literature screening protocol plus deployment decision framework.

  18. Sentra-Guard: A Real-Time Multilingual Defense Against Adversarial LLM Prompts

    cs.CR 2025-10 unverdicted novelty 4.0

    Sentra-Guard reports 99.96% detection of adversarial LLM prompts with AUC 1.00 and ASR of 0.004% using a hybrid SBERT-FAISS and transformer classifier architecture with multilingual translation and human feedback.

  19. Development and Preliminary Evaluation of a Domain-Specific Large Language Model for Tuberculosis Care in South Africa

    cs.CL 2026-03 unverdicted novelty 3.0

    A domain-specific LLM for TB care in South Africa, created by fine-tuning BioMistral-7B with QLoRA and GraphRAG on local guidelines, shows improved contextual alignment over the base model.

Reference graph

Works this paper leans on

299 extracted references · 299 canonical work pages · cited by 19 Pith papers · 20 internal anchors

  1. [1]

    Accurate structure prediction of biomolecular interactions with alphafold 3

    Josh Abramson, Jonas Adler, Jack Dunger, Richard Evans, Tim Green, Alexander Pritzel, Olaf Ronneberger, Lindsay Willmore, Andrew J Ballard, Joshua Bambrick, et al. Accurate structure prediction of biomolecular interactions with alphafold 3. Nature, pages 1–3, 2024

  2. [2]

    Supporting student decisions on learning recommendations: An llm-based chatbot with knowl- edge graph contextualization for conversational explainability and mentoring

    Hasan Abu-Rasheed, Mohamad Hussam Abdulsalam, Christian Weber, and Madjid Fathi. Supporting student decisions on learning recommendations: An llm-based chatbot with knowl- edge graph contextualization for conversational explainability and mentoring. arXiv preprint arXiv:2401.08517, 2024

  3. [3]

    Knowledge graph based synthetic corpus generation for knowledge-enhanced language model pre-training

    Oshin Agarwal, Heming Ge, Siamak Shakeri, and Rami Al-Rfou. Knowledge graph based synthetic corpus generation for knowledge-enhanced language model pre-training. arXiv preprint arXiv:2010.12688, 2020

  4. [4]

    On edge classification in networks with structure and content

    Charu C Aggarwal, Yao Li, S Yu Philip, and Yuchen Zhao. On edge classification in networks with structure and content. In 2017 IEEE 33rd international conference on data engineering (ICDE), pages 187–190. IEEE, 2017

  5. [5]

    Learning Role-based Graph Embeddings

    Nesreen K Ahmed, Ryan Rossi, John Boaz Lee, Theodore L Willke, Rong Zhou, Xiang- nan Kong, and Hoda Eldardiry. Learning role-based graph embeddings. arXiv preprint arXiv:1802.02896, 2018. 53

  6. [6]

    Named entity extraction for knowledge graphs: A literature overview

    Tareq Al-Moslmi, Marc Gallofré Ocaña, Andreas L Opdahl, and Csaba Veres. Named entity extraction for knowledge graphs: A literature overview. IEEE Access, 8:32862–32881, 2020

  7. [7]

    Statistical mechanics of complex networks

    Réka Albert and Albert-László Barabási. Statistical mechanics of complex networks. Reviews of modern physics, 74(1):47, 2002

  8. [8]

    Graph-based text classification: learn from your neighbors

    Ralitsa Angelova and Gerhard Weikum. Graph-based text classification: learn from your neighbors. In Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval, pages 485–492, 2006

  9. [9]

    Joint embeddings for graph instruction tuning

    Vlad Argatu, Aaron Haag, and Oliver Lohse. Joint embeddings for graph instruction tuning. arXiv preprint arXiv:2405.20684, 2024

  10. [10]

    Learning to retrieve reasoning paths over wikipedia graph for question answering

    Akari Asai, Kazuma Hashimoto, Hannaneh Hajishirzi, Richard Socher, and Caiming Xiong. Learning to retrieve reasoning paths over wikipedia graph for question answering. arXiv preprint arXiv:1911.10470, 2019

  11. [11]

    Retrieval-based language models and applications

    Akari Asai, Sewon Min, Zexuan Zhong, and Danqi Chen. Retrieval-based language models and applications. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 6: Tutorial Abstracts), pages 41–46, 2023

  12. [12]

    Query expansion techniques for information retrieval: a survey

    Hiteshwar Kumar Azad and Akshay Deepak. Query expansion techniques for information retrieval: a survey. Information Processing & Management, 56(5):1698–1735, 2019

  13. [13]

    The pitfalls of next-token prediction

    Gregor Bachmann and Vaishnavh Nagarajan. The pitfalls of next-token prediction. arXiv preprint arXiv:2403.06963, 2024

  14. [14]

    Knowledge-augmented language model prompting for zero-shot knowledge graph question answering

    Jinheon Baek, Alham Fikri Aji, and Amir Saffari. Knowledge-augmented language model prompting for zero-shot knowledge graph question answering. arXiv preprint arXiv:2306.04136, 2023

  15. [15]

    Atj-net: Auto- table-join network for automatic learning on relational databases

    Jinze Bai, Jialin Wang, Zhao Li, Donghui Ding, Ji Zhang, and Jun Gao. Atj-net: Auto- table-join network for automatic learning on relational databases. In Proceedings of the Web Conference 2021, pages 1540–1551, 2021

  16. [16]

    Hypergraph convolution and hypergraph attention

    Song Bai, Feihu Zhang, and Philip HS Torr. Hypergraph convolution and hypergraph attention. Pattern Recognition, 110:107637, 2021

  17. [17]

    Abstract meaning representation for sembanking

    Laura Banarescu, Claire Bonial, Shu Cai, Madalina Georgescu, Kira Griffitt, Ulf Hermjakob, Kevin Knight, Philipp Koehn, Martha Palmer, and Nathan Schneider. Abstract meaning representation for sembanking. In Proceedings of the 7th linguistic annotation workshop and interoperability with discourse, pages 178–186, 2013

  18. [18]

    Emergence of scaling in random networks

    Albert-László Barabási and Réka Albert. Emergence of scaling in random networks. science, 286(5439):509–512, 1999

  19. [19]

    Graph convolution for semi- supervised classification: Improved linear separability and out-of-distribution generalization

    Aseem Baranwal, Kimon Fountoulakis, and Aukosh Jagannath. Graph convolution for semi- supervised classification: Improved linear separability and out-of-distribution generalization. arXiv preprint arXiv:2102.06966, 2021

  20. [20]

    Large generative ai models for telecom: The next big thing? IEEE Communications Magazine, 2024

    Lina Bariah, Qiyang Zhao, Hang Zou, Yu Tian, Faouzi Bader, and Merouane Debbah. Large generative ai models for telecom: The next big thing? IEEE Communications Magazine, 2024

  21. [21]

    Seven failure points when engineering a retrieval augmented generation system

    Scott Barnett, Stefanus Kurniawan, Srikanth Thudumu, Zach Brannelly, and Mohamed Ab- delrazek. Seven failure points when engineering a retrieval augmented generation system. In Proceedings of the IEEE/ACM 3rd International Conference on AI Engineering-Software Engineering for AI, pages 194–199, 2024

  22. [22]

    The technological landscape and applications of single-cell multi-omics

    Alev Baysoy, Zhiliang Bai, Rahul Satija, and Rong Fan. The technological landscape and applications of single-cell multi-omics. Nature Reviews Molecular Cell Biology , 24(10): 695–713, 2023

  23. [23]

    Tabgraphs: A benchmark and strong baselines for learning on graphs with tabular node features

    Gleb Bazhenov, Oleg Platonov, and Liudmila Prokhorenkova. Tabgraphs: A benchmark and strong baselines for learning on graphs with tabular node features. arXiv e-prints, pages arXiv–2409, 2024. 54

  24. [24]

    Graph of thoughts: Solving elaborate problems with large language models

    Maciej Besta, Nils Blach, Ales Kubicek, Robert Gerstenberger, Michal Podstawski, Lukas Gianinazzi, Joanna Gajda, Tomasz Lehmann, Hubert Niewiadomski, Piotr Nyczyk, et al. Graph of thoughts: Solving elaborate problems with large language models. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 38, pages 17682–17690, 2024

  25. [25]

    The unified medical language system (umls): integrating biomedical terminology

    Olivier Bodenreider. The unified medical language system (umls): integrating biomedical terminology. Nucleic acids research, 32(suppl_1):D267–D270, 2004

  26. [26]

    Weisfeiler and lehman go cellular: Cw networks

    Cristian Bodnar, Fabrizio Frasca, Nina Otter, Yuguang Wang, Pietro Lio, Guido F Montufar, and Michael Bronstein. Weisfeiler and lehman go cellular: Cw networks. Advances in neural information processing systems, 34:2625–2640, 2021

  27. [27]

    Freebase: a collaboratively created graph database for structuring human knowledge

    Kurt Bollacker, Colin Evans, Praveen Paritosh, Tim Sturge, and Jamie Taylor. Freebase: a collaboratively created graph database for structuring human knowledge. In Proceedings of the 2008 ACM SIGMOD international conference on Management of data, pages 1247–1250, 2008

  28. [28]

    A review of biomedical datasets relating to drug discovery: a knowledge graph perspective

    Stephen Bonner, Ian P Barrett, Cheng Ye, Rowan Swiers, Ola Engkvist, Andreas Bender, Charles Tapley Hoyt, and William L Hamilton. A review of biomedical datasets relating to drug discovery: a knowledge graph perspective. Briefings in Bioinformatics, 23(6):bbac404, 2022

  29. [29]

    Transposition of native chromatin for multimodal regulatory analysis and personal epigenomics

    Jason D Buenrostro, Paul G Giresi, Lisa C Zaba, Howard Y Chang, and William J Greenleaf. Transposition of native chromatin for multimodal regulatory analysis and personal epigenomics. Nature methods, 10(12):1213, 2013

  30. [30]

    Preliminary study on the construction of chinese medical knowledge graph

    Odma Byambasuren, Yunfei Yang, Zhifang Sui, Damai Dai, Baobao Chang, Sujian Li, and Hongying Zan. Preliminary study on the construction of chinese medical knowledge graph. Journal of Chinese Information Processing, 33(10):1–9, 2019

  31. [31]

    Xai meets llms: A survey of the relation between explainable ai and large language models

    Erik Cambria, Lorenzo Malandri, Fabio Mercorio, Navid Nobani, and Andrea Seveso. Xai meets llms: A survey of the relation between explainable ai and large language models. arXiv preprint arXiv:2407.15248, 2024

  32. [32]

    A graph based analysis of leak localization in urban water networks

    Antonio Candelieri, Dante Conti, and Francesco Archetti. A graph based analysis of leak localization in urban water networks. Procedia Engineering, 70:228–237, 2014

  33. [33]

    Random geometric graphs, 2005

    Chris Cannings. Random geometric graphs, 2005

  34. [34]

    Relational data imputation with graph neural networks

    Riccardo Cappuzzo, Saravanan Thirumuruganathan, and Paolo Papotti. Relational data imputation with graph neural networks. In EDBT/ICDT 2024, 27th International Conference on Extending Database Technology, 2024

  35. [35]

    The use of mmr, diversity-based reranking for reordering documents and producing summaries

    Jaime Carbonell and Jade Goldstein. The use of mmr, diversity-based reranking for reordering documents and producing summaries. In Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval , pages 335–336, 1998

  36. [36]

    Graphllm: Boosting graph reasoning ability of large language model

    Ziwei Chai, Tianjie Zhang, Liang Wu, Kaiqiao Han, Xiaohai Hu, Xuanwen Huang, and Yang Yang. Graphllm: Boosting graph reasoning ability of large language model. arXiv preprint arXiv:2310.05845, 2023

  37. [37]

    Building a knowledge graph to enable precision medicine

    Payal Chandak, Kexin Huang, and Marinka Zitnik. Building a knowledge graph to enable precision medicine. Scientific Data, 10(1):67, 2023

  38. [38]

    Path-based explanation for knowledge graph completion

    Heng Chang, Jiangnan Ye, Alejo Lopez-Avila, Jinhua Du, and Jia Li. Path-based explanation for knowledge graph completion. In Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pages 231–242, 2024

  39. [39]

    Fairness-aware graph neural networks: A survey

    April Chen, Ryan A Rossi, Namyong Park, Puja Trivedi, Yu Wang, Tong Yu, Sungchul Kim, Franck Dernoncourt, and Nesreen K Ahmed. Fairness-aware graph neural networks: A survey. ACM Transactions on Knowledge Discovery from Data, 18(6):1–23, 2024. 55

  40. [40]

    Autokg: Efficient automated knowledge graph generation for language models

    Bohan Chen and Andrea L Bertozzi. Autokg: Efficient automated knowledge graph generation for language models. In 2023 IEEE International Conference on Big Data (BigData), pages 3117–3126. IEEE, 2023

  41. [41]

    Sur- vey and open problems in privacy-preserving knowledge graph: merging, query, representation, completion, and applications

    Chaochao Chen, Fei Zheng, Jamie Cui, Yuwei Cao, Guanfeng Liu, Jia Wu, and Jun Zhou. Sur- vey and open problems in privacy-preserving knowledge graph: merging, query, representation, completion, and applications. International Journal of Machine Learning and Cybernetics, pages 1–20, 2024

  42. [42]

    Rich knowledge sources bring complex knowledge conflicts: Recalibrating models to reflect conflicting evidence

    Hung-Ting Chen, Michael JQ Zhang, and Eunsol Choi. Rich knowledge sources bring complex knowledge conflicts: Recalibrating models to reflect conflicting evidence. arXiv preprint arXiv:2210.13701, 2022

  43. [43]

    Understanding retrieval augmentation for long-form question answering

    Hung-Ting Chen, Fangyuan Xu, Shane A Arora, and Eunsol Choi. Understanding retrieval augmentation for long-form question answering. arXiv preprint arXiv:2310.12150, 2023

  44. [44]

    Gedi: A graph-based end-to- end data imputation framework

    Katrina Chen, Xiuqin Liang, Zheng Ma, and Zhibin Zhang. Gedi: A graph-based end-to- end data imputation framework. In 2023 IEEE 35th International Conference on Tools with Artificial Intelligence (ICTAI), pages 723–730. IEEE, 2023

  45. [45]

    Sgsum: transform- ing multi-document summarization into sub-graph selection

    Moye Chen, Wei Li, Jiachen Liu, Xinyan Xiao, Hua Wu, and Haifeng Wang. Sgsum: transform- ing multi-document summarization into sub-graph selection. arXiv preprint arXiv:2110.12645, 2021

  46. [46]

    Hytrel: Hypergraph-enhanced tabular data representation learning

    Pei Chen, Soumajyoti Sarkar, Leonard Lausen, Balasubramaniam Srinivasan, Sheng Zha, Ruihong Huang, and George Karypis. Hytrel: Hypergraph-enhanced tabular data representation learning. Advances in Neural Information Processing Systems, 36, 2024

  47. [47]

    Knowedu: A system to construct knowledge graph for education

    Penghe Chen, Yu Lu, Vincent W Zheng, Xiyang Chen, and Boda Yang. Knowedu: A system to construct knowledge graph for education. Ieee Access, 6:31553–31563, 2018

  48. [48]

    Improving commonsense question answering by graph-based iterative retrieval over multiple knowledge sources

    Qianglong Chen, Feng Ji, Haiqing Chen, and Yin Zhang. Improving commonsense question answering by graph-based iterative retrieval over multiple knowledge sources. arXiv preprint arXiv:2011.02705, 2020

  49. [49]

    Llaga: Large language and graph assistant

    Runjin Chen, Tong Zhao, Ajay Jaiswal, Neil Shah, and Zhangyang Wang. Llaga: Large language and graph assistant. arXiv preprint arXiv:2402.08170, 2024

  50. [50]

    Xgboost: A scalable tree boosting system

    Tianqi Chen and Carlos Guestrin. Xgboost: A scalable tree boosting system. In Proceedings of the 22nd acm sigkdd international conference on knowledge discovery and data mining , pages 785–794, 2016

  51. [51]

    AgentVerse: Facilitating Multi-Agent Collaboration and Exploring Emergent Behaviors

    Weize Chen, Yusheng Su, Jingwei Zuo, Cheng Yang, Chenfei Yuan, Chen Qian, Chi-Min Chan, Yujia Qin, Yaxi Lu, Ruobing Xie, et al. Agentverse: Facilitating multi-agent collaboration and exploring emergent behaviors in agents. arXiv preprint arXiv:2308.10848, 2(4):6, 2023

  52. [52]

    Graph generative pre-trained transformer

    Xiaohui Chen, Yinkai Wang, Jiaxing He, Yuanqi Du, Soha Hassoun, Xiaolin Xu, and Li-Ping Liu. Graph generative pre-trained transformer. arXiv preprint arXiv:2501.01073, 2025

  53. [53]

    A review: Knowledge reasoning over knowledge graph

    Xiaojun Chen, Shengbin Jia, and Yang Xiang. A review: Knowledge reasoning over knowledge graph. Expert systems with applications, 141:112948, 2020

  54. [54]

    Analyze, generate and refine: Query expansion with llms for zero-shot open-domain qa

    Xinran Chen, Xuanang Chen, Ben He, Tengfei Wen, and Le Sun. Analyze, generate and refine: Query expansion with llms for zero-shot open-domain qa. In Findings of the Association for Computational Linguistics ACL 2024, pages 11908–11922, 2024

  55. [55]

    Label-free node classification on graphs with large language models (llms)

    Zhikai Chen, Haitao Mao, Hongzhi Wen, Haoyu Han, Wei Jin, Haiyang Zhang, Hui Liu, and Jiliang Tang. Label-free node classification on graphs with large language models (llms). arXiv preprint arXiv:2310.04668, 2023

  56. [56]

    Exploring the potential of large language models (llms) in learning on graphs

    Zhikai Chen, Haitao Mao, Hang Li, Wei Jin, Hongzhi Wen, Xiaochi Wei, Shuaiqiang Wang, Dawei Yin, Wenqi Fan, Hui Liu, et al. Exploring the potential of large language models (llms) in learning on graphs. ACM SIGKDD Explorations Newsletter, 25(2):42–61, 2024

  57. [57]

    Anti-money laundering by group-aware deep graph learning

    Dawei Cheng, Yujia Ye, Sheng Xiang, Zhenwei Ma, Ying Zhang, and Changjun Jiang. Anti-money laundering by group-aware deep graph learning. IEEE Transactions on Knowledge and Data Engineering, 35(12):12444–12457, 2023

  58. [58]

    Structure guided prompt: Instructing large language model in multi-step reasoning by exploring graph structure of the text

    Kewei Cheng, Nesreen K Ahmed, Theodore Willke, and Yizhou Sun. Structure guided prompt: Instructing large language model in multi-step reasoning by exploring graph structure of the text. arXiv preprint arXiv:2402.13415, 2024

  59. [59]

    Multi-hop question answering under temporal knowledge editing

    Keyuan Cheng, Gang Lin, Haoyang Fei, Lu Yu, Muhammad Asif Ali, Lijie Hu, Di Wang, et al. Multi-hop question answering under temporal knowledge editing. arXiv preprint arXiv:2404.00492, 2024

  60. [60]

    scgac: a graph attentional architecture for clustering single-cell rna-seq data

    Yi Cheng and Xiuli Ma. scgac: a graph attentional architecture for clustering single-cell rna-seq data. Bioinformatics, 38(8):2187–2193, 2022

  61. [61]

    Hitab: A hierarchical table dataset for question answering and natural language generation

    Zhoujun Cheng, Haoyu Dong, Zhiruo Wang, Ran Jia, Jiaqi Guo, Yan Gao, Shi Han, Jian-Guang Lou, and Dongmei Zhang. Hitab: A hierarchical table dataset for question answering and natural language generation. arXiv preprint arXiv:2108.06712, 2021

  62. [62]

    Complex logical reasoning over knowledge graphs using large language models

    Nurendra Choudhary and Chandan K Reddy. Complex logical reasoning over knowledge graphs using large language models. arXiv preprint arXiv:2305.01157, 2023

  63. [63]

    Connecting the dots: Document-level neural relation extraction with edge-oriented graphs

    Fenia Christopoulou, Makoto Miwa, and Sophia Ananiadou. Connecting the dots: Document-level neural relation extraction with edge-oriented graphs. arXiv preprint arXiv:1909.00228, 2019

  64. [64]

    Gnn-based embedding for clustering scrna-seq data

    Madalina Ciortan and Matthieu Defrance. Gnn-based embedding for clustering scrna-seq data. Bioinformatics, 38(4):1037–1044, 2022

  65. [65]

    Training Verifiers to Solve Math Word Problems

    Karl Cobbe, Vineet Kosaraju, Mohammad Bavarian, Mark Chen, Heewoo Jun, Lukasz Kaiser, Matthias Plappert, Jerry Tworek, Jacob Hilton, Reiichiro Nakano, et al. Training verifiers to solve math word problems. arXiv preprint arXiv:2110.14168, 2021

  66. [66]

    Relational database: A practical foundation for productivity

    Edgar F Codd. Relational database: A practical foundation for productivity. In ACM Turing award lectures, page 1981. 2007

  67. [67]

    Latent-graph learning for disease prediction

    Luca Cosmo, Anees Kazi, Seyed-Ahmad Ahmadi, Nassir Navab, and Michael Bronstein. Latent-graph learning for disease prediction. In Medical Image Computing and Computer Assisted Intervention–MICCAI 2020: 23rd International Conference, Lima, Peru, October 4–8, 2020, Proceedings, Part II 23, pages 643–653. Springer, 2020

  68. [68]

    On positional and structural node features for graph neural networks on non-attributed graphs

    Hejie Cui, Zijie Lu, Pan Li, and Carl Yang. On positional and structural node features for graph neural networks on non-attributed graphs. In Proceedings of the 31st ACM International Conference on Information & Knowledge Management, pages 3898–3902, 2022

  69. [69]

    More: Multi-modal retrieval augmented generative commonsense reasoning

    Wanqing Cui, Keping Bi, Jiafeng Guo, and Xueqi Cheng. More: Multi-modal retrieval augmented generative commonsense reasoning. arXiv preprint arXiv:2402.13625, 2024

  70. [70]

    Relational data embeddings for feature enrichment with background information

    Alexis Cvetkov-Iliev, Alexandre Allauzen, and Gaël Varoquaux. Relational data embeddings for feature enrichment with background information. Machine Learning, 112(2):687–720, 2023

  71. [71]

    Supervised learning on relational databases with graph neural networks

    Milan Cvitkovic. Supervised learning on relational databases with graph neural networks. arXiv preprint arXiv:2002.02046, 2020

  72. [72]

    Counter-intuitive: Large language models can better understand knowledge graphs than we thought

    Xinbang Dai, Yuncheng Hua, Tongtong Wu, Yang Sheng, and Guilin Qi. Counter-intuitive: Large language models can better understand knowledge graphs than we thought. arXiv preprint arXiv:2402.11541, 2024

  73. [73]

    Revisiting the graph reasoning ability of large language models: Case studies in translation, connectivity and shortest path

    Xinnan Dai, Qihao Wen, Yifei Shen, Hongzhi Wen, Dongsheng Li, Jiliang Tang, and Caihua Shan. Revisiting the graph reasoning ability of large language models: Case studies in translation, connectivity and shortest path. arXiv preprint arXiv:2408.09529, 2024

  74. [74]

    Security and privacy challenges of large language models: A survey

    Badhan Chandra Das, M Hadi Amini, and Yanzhao Wu. Security and privacy challenges of large language models: A survey. arXiv preprint arXiv:2402.00888, 2024

  75. [75]

    Go for a walk and arrive at the answer: Reasoning over paths in knowledge bases using reinforcement learning

    Rajarshi Das, Shehzaad Dhuliawala, Manzil Zaheer, Luke Vilnis, Ishan Durugkar, Akshay Krishnamurthy, Alex Smola, and Andrew McCallum. Go for a walk and arrive at the answer: Reasoning over paths in knowledge bases using reinforcement learning. In International Conference on Learning Representations, 2018

  76. [76]

    Question answering by reasoning across documents with graph convolutional networks

    Nicola De Cao, Wilker Aziz, and Ivan Titov. Question answering by reasoning across documents with graph convolutional networks. arXiv preprint arXiv:1808.09920, 2018

  77. [77]

    A review of modern recommender systems using generative models (gen-recsys)

    Yashar Deldjoo, Zhankui He, Julian McAuley, Anton Korikov, Scott Sanner, Arnau Ramisa, René Vidal, Maheswaran Sathiamoorthy, Atoosa Kasirzadeh, and Silvia Milano. A review of modern recommender systems using generative models (gen-recsys). arXiv preprint arXiv:2404.00579, 2024

  78. [78]

    Graph-based retriever captures the long tail of biomedical knowledge

    Julien Delile, Srayanta Mukherjee, Anton Van Pamel, and Leonid Zhukov. Graph-based retriever captures the long tail of biomedical knowledge. arXiv preprint arXiv:2402.12352, 2024

  79. [79]

    Pandora: Jailbreak gpts by retrieval augmented generation poisoning

    Gelei Deng, Yi Liu, Kailong Wang, Yuekang Li, Tianwei Zhang, and Yang Liu. Pandora: Jailbreak gpts by retrieval augmented generation poisoning, 2024. URL https://arxiv.org/abs/2402.08416

  80. [80]

    Contextual stochastic block models

    Yash Deshpande, Subhabrata Sen, Andrea Montanari, and Elchanan Mossel. Contextual stochastic block models. Advances in Neural Information Processing Systems, 31, 2018

Showing first 80 references.