pith. machine review for the scientific record.

arxiv: 2604.02861 · v2 · submitted 2026-04-03 · 💻 cs.DB · cs.AI

Recognition: no theorem link

LLM+Graph@VLDB'2025 Workshop Summary

Authors on Pith: no claims yet

Pith reviewed 2026-05-13 18:42 UTC · model grok-4.3

classification 💻 cs.DB cs.AI
keywords LLM · graph data · workshop summary · VLDB · graph machine learning · data management

The pith

The 2nd LLM+Graph Workshop at VLDB 2025 advanced algorithms and systems that combine large language models with graph-structured data for practical use.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This workshop summary describes the integration of large language models with graph data as a fast-evolving frontier that draws interest from both academia and industry. The report covers the key research directions, open challenges, and concrete solutions presented by speakers at the 2025 VLDB event in London. A sympathetic reader would care because the workshop focuses on bridging LLMs, graph data management, and graph machine learning to enable new applications in data systems. The summary positions these intersections as central to future progress in handling complex, structured information with language models.

Core claim

The integration of large language models with graph-structured data has become a pivotal research frontier, and the workshop advanced algorithms and systems that bridge LLMs, graph data management, and graph machine learning for practical applications.

What carries the argument

The workshop presentations that highlight research directions, challenges, and innovative solutions in combining LLMs with graph data.

Load-bearing premise

The presentations chosen for the workshop and report represent the most important ongoing work without major omissions.

What would settle it

A follow-up survey or later workshop that systematically covers major LLM-graph papers omitted from this summary would show the selection was incomplete.

read the original abstract

The integration of large language models (LLMs) with graph-structured data has become a pivotal and fast evolving research frontier, drawing strong interest from both academia and industry. The 2nd LLM+Graph Workshop, co-located with the 51st International Conference on Very Large Data Bases (VLDB 2025) in London, focused on advancing algorithms and systems that bridge LLMs, graph data management, and graph machine learning for practical applications. This report highlights the key research directions, challenges, and innovative solutions presented by the workshop's speakers.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, and this is the friction.

Referee Report

0 major / 2 minor

Summary. The manuscript is a summary report of the 2nd LLM+Graph Workshop co-located with VLDB 2025. It frames the integration of large language models with graph-structured data as a pivotal research area and highlights the key directions, challenges, and solutions presented by workshop speakers on algorithms, systems, graph data management, and graph machine learning.

Significance. If the summary accurately reflects the workshop content, the report provides a timely community resource documenting emerging trends at the LLM-graph intersection. Such summaries are valuable for disseminating workshop outcomes to a broader audience and can help orient researchers to practical applications and open challenges in this fast-moving area.

minor comments (2)
  1. [Abstract] The abstract and report body refer to 'innovative solutions' and 'key research directions' without naming specific presentations, speakers, or providing even brief citations to the underlying works; adding a short list or table of highlighted contributions would improve traceability and utility for readers.
  2. [Full Text] The report states that the workshop focused on 'practical applications' but does not elaborate on any concrete use cases or evaluation settings discussed by speakers; a single paragraph summarizing one or two representative examples would strengthen the descriptive value.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive assessment of our workshop summary. We are pleased that the report is viewed as a timely community resource documenting emerging trends at the LLM-graph intersection.

Circularity Check

0 steps flagged

No significant circularity: factual workshop summary with no derivations

full rationale

The document is a descriptive workshop summary report. It contains no equations, fitted parameters, predictions, or technical derivations. The opening claim that LLM+Graph integration is a 'pivotal and fast evolving research frontier' is a contextual framing statement, not a result derived from any prior step within the paper. No self-citation chains, ansatzes, or renamings of results are used to support any load-bearing argument. The report simply lists presented directions and solutions; its value is observational rather than deductive. This matches the default expectation of no circularity.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

No free parameters, axioms, or invented entities are present because the document is a descriptive summary of external talks rather than a technical derivation.

pith-pipeline@v0.9.0 · 5387 in / 878 out tokens · 39905 ms · 2026-05-13T18:42:25.016477+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

50 extracted references · 50 canonical work pages · 3 internal anchors

  1. [1]

S. Abedini. SQLyzr: A comprehensive benchmark and framework for evaluating text-to-SQL systems. Master's thesis, University of Waterloo, Apr

  2. [2]

    Advisor: M. Tamer Özsu. Available at https://hdl.handle.net/10012/23045

  3. [3]

    K. Akillioglu, A. Chakraborty, S. Voruganti, and M. T. Özsu. Research challenges in relational database management systems for LLM queries. Proceedings of the VLDB Endowment, ISSN 2150-8097

  4. [4]

    B. Bratić, M. E. Houle, V. Kurbalija, V. Oria, and M. Radovanović. NN-descent on high-dimensional data. In Proceedings of the 8th International Conference on Web Intelligence, Mining and Semantics, pages 1–8, 2018

  5. [5]

    Chat2Graph. Chat2Graph: A Graph Native Agentic System. https://github.com/TuGraph-family/chat2graph, 2025

  6. [6]

    J. Chung, R. Ko, W. Yoo, M. Onizuka, S. Kim, T.-W. Kim, and W.-Y. Shin. GraphCompliance: Aligning policy and context graphs for LLM-based regulatory compliance. arXiv preprint arXiv:2510.26309, 2025

  7. [7]

    A. Clemedtson and B. Shi. GraphRAFT: Retrieval augmented fine-tuning for knowledge graphs on graph databases. arXiv preprint arXiv:2504.05478, 2025

  8. [8]

    A. Colombo and F. Cambria. LLM-assisted construction of the United States legislative graph. VLDB 2025 Workshop: LLM+Graph, 2025

  9. [9]

    W. Dong, C. Moses, and K. Li. Efficient k-nearest neighbor graph construction for generic similarity measures. In Proceedings of the 20th International Conference on World Wide Web, pages 577–586, 2011

  10. [10]

    A. Dorbani, S. Yasser, J. Lin, and A. Mhedhbi. Beyond quacking: Deep integration of language models and RAG into DuckDB. arXiv preprint arXiv:2504.01157, 2025

  11. [11]

    D. Edge, H. Trinh, N. Cheng, J. Bradley, A. Chao, A. Mody, S. Truitt, D. Metropolitansky, R. O. Ness, and J. Larson. From local to global: A graph RAG approach to query-focused summarization. arXiv preprint arXiv:2404.16130, 2024

  12. [12]

    Y. Fang, A. Khan, T. Wu, and D. Yan. LLM+Graph: 2nd international workshop on data management opportunities in bringing LLMs with graph data. Proceedings of the VLDB Endowment, ISSN 2150-8097

  13. [13]

    X. He, Y. Tian, Y. Sun, N. Chawla, T. Laurent, Y. LeCun, X. Bresson, and B. Hooi. G-Retriever: Retrieval-augmented generation for textual graph understanding and question answering. Advances in Neural Information Processing Systems, 37:132876–132907, 2024

  14. [14]

    HF/Neo4j. Text2Cypher-2025v1. https://huggingface.co/datasets/neo4j/text2cypher-2025v1, 2025

  15. [15]

    HF/Neo4j. Text2Cypher Models. https://huggingface.co/neo4j/models, 2025

  16. [16]

    Q. Ji, P. Shen, H. Zhu, G. Qi, Y. Sheng, L. Wu, K. Xu, and Y. Meng. LLM-Hype: A targeted evaluation framework for hypernym-hyponym identification in large language models. VLDB 2025 Workshop: LLM+Graph, 2025

  17. [17]

    X. Jian, Z. Dong, and M. T. Özsu. InteracSPARQL: An interactive system for SPARQL query refinement using natural language explanations. arXiv preprint arXiv:2511.02002, 2025

  18. [18]

    A. Khan, T. Wu, and X. Chen. LLM+KG@VLDB'24 workshop summary. SIGMOD Rec., 54(2):60–65, 2025

  19. [19]

    O. Khattab, A. Singhvi, P. Maheshwari, Z. Zhang, K. Santhanam, S. Haq, A. Sharma, T. T. Joshi, H. Moazam, H. Miller, et al. DSPy: Compiling declarative language model calls into state-of-the-art pipelines. In The Twelfth International Conference on Learning Representations, 2024

  20. [20]

    Q. Kong, Z. Lv, Y. Xiong, D. Wang, J. Sun, T. Su, L. Li, X. Yang, and G. Huo. ProphetAgent: Automatically synthesizing GUI tests from test cases in natural language for mobile apps. In Proceedings of the 33rd ACM International Conference on the Foundations of Software Engineering, pages 174–179, 2025

  21. [21]

    H. T. Le, A. Bonifati, and A. Mauri. Graph consistency rule mining with LLMs: An exploratory study. In Proceedings of the 28th International Conference on Extending Database Technology, EDBT 2025, Barcelona, Spain, March 25–28, 2025, pages 748–754, 2025

  22. [22]

    C. Li, H. Chen, S. Zhang, Y. Hu, C. Chen, Z. Zhang, M. Li, X. Li, D. Han, X. Chen, et al. ByteGraph: A high-performance distributed graph database in ByteDance. Proceedings of the VLDB Endowment, 15(12):3306–3318, 2022

  23. [23]

    Y. Li, J. Hu, B. Hooi, B. He, and C. Chen. DGP: A dual-granularity prompting framework for fraud detection with graph-enhanced LLMs. arXiv preprint arXiv:2507.21653, 2025

  24. [24]

    Y. Liu, P. Gao, X. Wang, J. Liu, Y. Shi, Z. Zhang, and C. Peng. MarsCode Agent: AI-native automated bug fixing. arXiv preprint arXiv:2409.00899, 2024

  25. [25]

    C. Ma, S. Chakrabarti, A. Khan, and B. Molnár. Knowledge graph-based retrieval-augmented generation for schema matching. CoRR, abs/2501.08686, 2025

  26. [26]

    Y. A. Malkov and D. A. Yashunin. Efficient and robust approximate nearest neighbor search using hierarchical navigable small world graphs. IEEE Transactions on Pattern Analysis and Machine Intelligence, 42(4):824–836, 2018

  27. [27]

    N. Mihindukulasooriya, N. S. D'Souza, F. Chowdhury, and H. Samulowitz. Automatic prompt optimization for knowledge graph construction: Insights from an empirical study. VLDB 2025 Workshop: LLM+Graph, 2025

  28. [28]

    Neo4j. GDS Agent. https://github.com/neo4j-contrib/gds-agent, 2025

  29. [29]

    Neo4j, Inc. Neo4j GraphRAG Python: Modular tools for knowledge graph RAG, 2025. GitHub repository

  30. [30]

    M. G. Ozsoy. Enhancing Text2Cypher with schema filtering. arXiv preprint arXiv:2505.05118, 2025

  31. [31]

    M. G. Ozsoy, L. Messallem, J. Besga, and G. Minneci. Text2Cypher: Bridging natural language and graph databases. COLING 2025, page 100, 2025

  32. [32]

    A. Pachera, A. Bonifati, and A. Mauri. User-centric property graph repairs. Proceedings of the ACM on Management of Data, 3(1):1–27, 2025

  33. [33]

    J. Pan, S. Razniewski, J.-C. Kalo, S. Singhania, J. Chen, S. Dietze, H. Jabeen, J. Omeliyanenko, W. Zhang, M. Lissandrini, et al. Large language models and knowledge graphs: Opportunities and challenges. Transactions on Graph Data and Knowledge, 2023

  34. [34]

    G. C. Publio and J. E. L. Gayo. xpSHACL: Explainable SHACL validation using retrieval-augmented generation and large language models. VLDB 2025 Workshop: LLM+Graph, 2025

  35. [35]

    N. R. Schneider, K. O'Sullivan, and H. Samet. Graph-enhanced large language models for spatial search. VLDB 2025 Workshop: LLM+Graph, 2025

  36. [36]

    B. Shi and I. Panagiotas. GDS Agent: A graph algorithmic reasoning agent. arXiv preprint arXiv:2508.20637, 2025

  37. [37]

    L. Sun, J. Hu, S. Zhou, Z. Huang, J. Ye, H. Peng, Z. Yu, and P. Yu. RicciNet: Deep clustering via a Riemannian generative model. In Proceedings of the ACM Web Conference 2024, pages 4071–4082, 2024

  38. [38]

    L. Sun, Z. Huang, S. Zhou, Q. Wan, H. Peng, and P. Yu. RiemannGFM: Learning a graph foundation model from Riemannian geometry. In Proceedings of the ACM on Web Conference 2025, pages 1154–1165, 2025

  39. [39]

    L. Sun, F. Wang, J. Ye, H. Peng, and S. Y. Philip. Congregate: Contrastive graph clustering in curvature spaces. In IJCAI, pages 2296–2305, 2023

  40. [40]

    L. Sun, Z. Zhang, Z. Wang, Y. Wang, Q. Wan, H. Li, H. Peng, and P. S. Yu. Pioneer: Physics-informed Riemannian graph ODE for entropy-increasing dynamics. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 39, pages 12586–12594, 2025

  41. [41]

    L. Sun, S. Zhou, B. Fang, H. Zhang, J. Ye, Y. Ye, and P. S. Yu. Trace: Structural Riemannian bridge matching for transferable source localization in information propagation. IJCAI, 2025

  42. [42]

    H. Terdalkar, A. Bonifati, and A. Mauri. Graph repairs with large language models: An empirical study. In Proceedings of the 8th Joint Workshop on Graph Data Management Experiences & Systems (GRADES) and Network Data Analytics (NDA), pages 1–10, 2025

  43. [43]

    S. Voruganti and M. T. Özsu. Mirage-ANNS: Mixed approach graph-based indexing for approximate nearest neighbor search. Proceedings of the ACM on Management of Data, 3(3):1–27, 2025

  44. [44]

    S. Wang, Y. Fang, Y. Zhou, X. Liu, and Y. Ma. ArchRAG: Attributed community-based hierarchical retrieval-augmented generation. arXiv preprint arXiv:2502.09891, 2025

  45. [45]

    B. Zhang and H. Soh. Extract, define, canonicalize: An LLM-based framework for knowledge graph construction. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 9820–9836, 2024

  46. [46]

    F. Zhang, Z. Huang, Y. Zhou, Q. Guo, W. Luo, and X. Zhou. Scalable graph-based retrieval-augmented generation via locality-sensitive hashing. VLDB 2025 Workshop: LLM+Graph, 2025

  47. [47]

    X. Zhou, X. Zhao, and G. Li. LLM-enhanced data management. arXiv preprint arXiv:2402.02643, 2024

  48. [48]

    Y. Zhou, A. I. Muresanu, Z. Han, K. Paster, S. Pitis, H. Chan, and J. Ba. Large language models are human-level prompt engineers. In The Eleventh International Conference on Learning Representations, 2022

  49. [49]

    Y. Zhou, Y. Su, Y. Sun, T. Wang, R. He, Y. Zhang, S. Liang, X. Liu, Y. Ma, and Y. Fang. In-depth analysis of graph-based RAG in a unified framework. Proceedings of the VLDB Endowment, 18(13):5623–5637, 2025

  50. [50]

    Y. Zhou and S. Wang. Towards the next generation of agent systems: From RAG to agentic AI. VLDB 2025 Workshop: LLM+Graph, 2025