From Local to Global: A Graph RAG Approach to Query-Focused Summarization

Alex Chao, Apurva Mody, Darren Edge, Ha Trinh, Joshua Bradley, Newman Cheng · 2024 · cs.CL · arXiv 2404.16130

Canonical reference. 80% of citing Pith papers cite this work as background.

287 Pith papers citing it

Background 80% of classified citations

abstract

The use of retrieval-augmented generation (RAG) to retrieve relevant information from an external knowledge source enables large language models (LLMs) to answer questions over private and/or previously unseen document collections. However, RAG fails on global questions directed at an entire text corpus, such as "What are the main themes in the dataset?", since this is inherently a query-focused summarization (QFS) task, rather than an explicit retrieval task. Prior QFS methods, meanwhile, do not scale to the quantities of text indexed by typical RAG systems. To combine the strengths of these contrasting methods, we propose GraphRAG, a graph-based approach to question answering over private text corpora that scales with both the generality of user questions and the quantity of source text. Our approach uses an LLM to build a graph index in two stages: first, to derive an entity knowledge graph from the source documents, then to pregenerate community summaries for all groups of closely related entities. Given a question, each community summary is used to generate a partial response, before all partial responses are again summarized in a final response to the user. For a class of global sensemaking questions over datasets in the 1 million token range, we show that GraphRAG leads to substantial improvements over a conventional RAG baseline for both the comprehensiveness and diversity of generated answers.
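The query-time flow described in the abstract (one partial response per community summary, then a final summarization over all partial responses) can be sketched as a small map-reduce. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `answer_global_question`, the prompt wording, and the `llm` callable are all hypothetical, and the index-building stages (entity graph extraction, community detection, summary pregeneration) are assumed to have already produced `community_summaries`.

```python
from typing import Callable, List


def answer_global_question(
    question: str,
    community_summaries: List[str],
    llm: Callable[[str], str],
) -> str:
    """Map-reduce over pregenerated community summaries.

    `llm` is a hypothetical callable (prompt -> completion); any real
    client would need to be wrapped to fit this shape.
    """
    # Map step: each community summary yields one partial response.
    partial_responses = [
        llm(
            f"Question: {question}\n"
            f"Community summary: {summary}\n"
            "Answer using only this summary."
        )
        for summary in community_summaries
    ]

    # Reduce step: summarize all partial responses into one final response.
    joined = "\n".join(f"- {p}" for p in partial_responses)
    return llm(
        f"Question: {question}\n"
        f"Partial responses:\n{joined}\n"
        "Combine these into a single final answer."
    )


if __name__ == "__main__":
    # Stub model for demonstration only: reports the prompt length.
    def stub_llm(prompt: str) -> str:
        return f"<response to {len(prompt)}-char prompt>"

    summaries = ["Summary of community A.", "Summary of community B."]
    print(answer_global_question("What are the main themes?", summaries, stub_llm))
```

With N community summaries this sketch makes N + 1 model calls: N for the partial responses and one for the final summarization.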

citation-role summary

background 40 · baseline 5 · method 3 · dataset 1

citation-polarity summary

background 39 · baseline 5 · use method 3 · support 1 · use dataset 1

representative citing papers

Benchmarking LLM Agents on Meta-Analysis Articles from Nature Portfolio

cs.CL · 2026-06-15 · unverdicted · novelty 8.0 · 3 refs

MetaSyn benchmark shows LLM pipelines recover at most 52.7% of ground-truth included studies due to screening failures on PI/ECO eligibility, despite 90.9% retrieval recall at K=200.

MedMemoryBench: Benchmarking Agent Memory in Personalized Healthcare

cs.AI · 2026-05-12 · conditional · novelty 8.0

MedMemoryBench supplies a 2,000-session synthetic medical trajectory dataset and an evaluate-while-constructing streaming protocol to expose memory saturation and reasoning failures in current agent architectures for personalized healthcare.

ShadowMerge: A Novel Poisoning Attack on Graph-Based Agent Memory via Relation-Channel Conflicts

cs.CR · 2026-05-09 · unverdicted · novelty 8.0 · 3 refs

ShadowMerge exploits relation-channel conflicts to poison graph-based agent memory, achieving 93.8% average attack success rate on Mem0 and real-world datasets while bypassing existing defenses.

ContextNest: Verifiable Context Governance for Autonomous AI Agent

cs.AI · 2026-07-02 · unverdicted · novelty 7.0

ContextNest formalizes context governance for AI agents using hash-chained documents and deterministic selectors, with experiments showing higher answer quality and perfect determinism versus standard retrieval.

Grounding LLM Reasoning under Incomplete Graph Evidence

cs.CL · 2026-06-29 · unverdicted · novelty 7.0 · 2 refs

Develops a theoretical perspective showing no hard rule can perfectly reject false unsupported trajectories while retaining true-but-unobserved ones under incomplete graph evidence, and characterizes soft grounding as KL-regularized deformation of the LLM prior.

Query-Aware Spreading Activation for Multi-Hop Retrieval over Knowledge Graphs

cs.LG · 2026-06-29 · unverdicted · novelty 7.0

A fixed-iteration spreading activation with per-step cosine similarity gating enables query-aware KG retrieval as one database query, matching QAFD-RAG on MuSiQue while cutting latency.

Metadata, Structure, or Strategy? A Decomposition of RAG Context Enrichment

cs.IR · 2026-06-28 · unverdicted · novelty 7.0

Controlled experiments across six benchmarks and four models show RAG context enrichment with metadata, structure, or strategies mostly lowers accuracy, with model-context alignment as the determining factor.

MKG-RAG-Bench: Benchmarking Retrieval in Multimodal Knowledge Graph-Augmented Generation

cs.AI · 2026-06-24 · unverdicted · novelty 7.0

MKG-RAG-Bench is a cross-domain benchmark for retrieval in multimodal knowledge graph-augmented generation, constructed via LLM curation from two MKGs with aligned QA datasets.

Beyond the Reranker: Do RAG Retrieval Enhancements Help Once a Strong Reranker Is Present?

cs.IR · 2026-06-14 · conditional · novelty 7.0

On heterogeneous document collections, only query expansion and a newly introduced per-source calibrated corrector (SSCC) deliver reliable gains beyond a strong cross-encoder reranker; other common retrieval enhancements do not.

Generalized Rank-based Evaluation for Knowledge Graph Completion: Perspectives, Framework, and Analyses

cs.LG · 2026-06-08 · unverdicted · novelty 7.0

PROBE is a generalized rank-based KGC evaluation framework with adjustable sharpness and bias-robustness components that satisfies six claimed key properties where prior metrics fall short.

Self-Augmenting Retrieval for Diffusion Language Models

cs.CL · 2026-06-04 · unverdicted · novelty 7.0

SARDI uses lookahead tokens from low-confidence predictions in discrete diffusion language models to dynamically guide retrieval during denoising, outperforming training-free baselines on five multi-hop QA benchmarks at up to 8x higher throughput.

Agent Memory: Characterization and System Implications of Stateful Long-Horizon Workloads

cs.AI · 2026-06-04 · unverdicted · novelty 7.0

The paper delivers the first systems characterization of agent memory, with a four-axis taxonomy, phase-aware profiler, evaluation of ten systems on two benchmarks, and ten design recommendations.

PersonaTree: Structured Lifecycle Memory for Person Understanding in LLM Agents

cs.CL · 2026-06-03 · unverdicted · novelty 7.0

PersonaTree is a new hierarchical memory framework for persistent LLM agents that structures evidence into persona claims via support paths and outperforms baselines on six person-understanding benchmarks.

LifeSide: Benchmarking Agents as Lifelong Digital Companions

cs.CL · 2026-06-03 · unverdicted · novelty 7.0

LifeSide is a new benchmark that evaluates AI agents on multi-session Memory-Emotion-Environment loops via simulated user profiles and event trajectories, revealing that models saturating existing memory tests fail at long-horizon user understanding.

QO-Bench: Diagnosing Query-Operator-Preserving Retrieval over Typed Event Tuples

cs.CL · 2026-06-03 · unverdicted · novelty 7.0

QO-Bench shows RAG systems retrieve relevant text but often discard typed values required for query operators, with paradigm performance inverting across operators and execution remaining a bottleneck even with gold evidence.

HyperPatch: Sequential Knowledge Editing Under n-ary Structural Drift

cs.CL · 2026-06-02 · unverdicted · novelty 7.0

HyperPatch reformulates sequential n-ary knowledge editing as hypergraph manifold stability, using HGNN initialization, SimHash alignment plus Topological LoRA, and fused reasoning to achieve large H-Acc gains on MQuAKE benchmarks.

SkillDAG: Self-Evolving Typed Skill Graphs for LLM Skill Selection at Scale

cs.AI · 2026-06-02 · unverdicted · novelty 7.0

SkillDAG builds a self-evolving typed skill graph that LLM agents query and update at inference time, raising success on ALFWorld and SkillsBench by 12.8 and 8.6 points over graph baselines.

AUDITFLOW: Executable Symbolic Environments for Structured Financial Reporting Verification

cs.AI · 2026-06-02 · unverdicted · novelty 7.0

AuditFlow combines a graph-grounded symbolic environment with a multi-agent LLM setup to reach 82.09% joint audit accuracy on structured financial reports, 14.93 points above the strongest baseline.

RWGBench: Evaluating Scholarly Positioning in Related Work Generation

cs.DL · 2026-05-30 · unverdicted · novelty 7.0

RWGBench is a citation-centric benchmark for related work generation built from 40k CS papers and a 100-paper test set, with multi-dimensional metrics that better match human expert judgment than standard similarity scores.

VitaBench 2.0: Evaluating Personalized and Proactive Agents in Long-Term User Interactions

cs.AI · 2026-05-26 · unverdicted · novelty 7.0

VitaBench 2.0 introduces a benchmark for long-term personalized and proactive agent behavior, with results indicating substantial gaps in current frontier LLMs.

Retrieval as Reasoning: Self-Evolving Agent-Native Retrieval via LLM-Wiki

cs.CL · 2026-05-25 · unverdicted · novelty 7.0

LLM-Wiki structures external knowledge as compilable wiki pages with links and persistent self-correction, achieving SOTA results on HotpotQA, MuSiQue, and 2WikiMultiHopQA by 2.0-8.1 F1 points over prior RAG systems.

MemGym: a Long-Horizon Memory Environment for LLM Agents

cs.CL · 2026-05-20 · unverdicted · novelty 7.0

MemGym unifies agent gyms into a memory benchmark with isolated scoring across tool-use, research, coding, and computer-use regimes plus a lightweight reward model for tractable coding evaluation.

Graphs of Research: Citation Evolution Graphs as Supervision for Research Idea Generation

cs.CL · 2026-05-14 · unverdicted · novelty 7.0

GoR extracts citation DAGs using position, frequency, predecessor links and time, then fine-tunes Qwen2.5-7B on 498 seed papers to generate ideas, claiming SOTA over gpt-4o baselines via LLM judges.

GroupMemBench: Benchmarking LLM Agent Memory in Multi-Party Conversations

cs.CL · 2026-05-14 · unverdicted · novelty 7.0 · 2 refs

GroupMemBench is a new benchmark exposing that LLM agent memory systems fail on group conversation properties like speaker-grounded tracking and audience-adapted responses, with top systems at 46% accuracy.

citing papers explorer

Showing 50 of 70 citing papers after filters.

MedMemoryBench: Benchmarking Agent Memory in Personalized Healthcare cs.AI · 2026-05-12 · conditional · none · ref 7 · internal anchor
MedMemoryBench supplies a 2,000-session synthetic medical trajectory dataset and an evaluate-while-constructing streaming protocol to expose memory saturation and reasoning failures in current agent architectures for personalized healthcare.
ContextNest: Verifiable Context Governance for Autonomous AI Agent cs.AI · 2026-07-02 · unverdicted · none · ref 16 · internal anchor
ContextNest formalizes context governance for AI agents using hash-chained documents and deterministic selectors, with experiments showing higher answer quality and perfect determinism versus standard retrieval.
MKG-RAG-Bench: Benchmarking Retrieval in Multimodal Knowledge Graph-Augmented Generation cs.AI · 2026-06-24 · unverdicted · none · ref 8 · internal anchor
MKG-RAG-Bench is a cross-domain benchmark for retrieval in multimodal knowledge graph-augmented generation, constructed via LLM curation from two MKGs with aligned QA datasets.
Agent Memory: Characterization and System Implications of Stateful Long-Horizon Workloads cs.AI · 2026-06-04 · unverdicted · none · ref 2 · internal anchor
The paper delivers the first systems characterization of agent memory, with a four-axis taxonomy, phase-aware profiler, evaluation of ten systems on two benchmarks, and ten design recommendations.
SkillDAG: Self-Evolving Typed Skill Graphs for LLM Skill Selection at Scale cs.AI · 2026-06-02 · unverdicted · none · ref 2 · internal anchor
SkillDAG builds a self-evolving typed skill graph that LLM agents query and update at inference time, raising success on ALFWorld and SkillsBench by 12.8 and 8.6 points over graph baselines.
AUDITFLOW: Executable Symbolic Environments for Structured Financial Reporting Verification cs.AI · 2026-06-02 · unverdicted · none · ref 64 · internal anchor
AuditFlow combines a graph-grounded symbolic environment with a multi-agent LLM setup to reach 82.09% joint audit accuracy on structured financial reports, 14.93 points above the strongest baseline.
VitaBench 2.0: Evaluating Personalized and Proactive Agents in Long-Term User Interactions cs.AI · 2026-05-26 · unverdicted · none · ref 105 · internal anchor
VitaBench 2.0 introduces a benchmark for long-term personalized and proactive agent behavior, with results indicating substantial gaps in current frontier LLMs.
Retrieval is Cheap, Show Me the Code: Executable Multi-Hop Reasoning for Retrieval-Augmented Generation cs.AI · 2026-05-13 · unverdicted · none · ref 5 · internal anchor
PyRAG turns multi-hop reasoning into executable Python code over retrieval tools for explicit, verifiable step-by-step RAG.
MAGE: Multi-Agent Self-Evolution with Co-Evolutionary Knowledge Graphs cs.AI · 2026-05-11 · unverdicted · none · ref 8 · internal anchor
MAGE uses a four-subgraph co-evolutionary knowledge graph plus dual bandits to externalize and retrieve experience for stable self-evolution of frozen language-model agents, showing gains on nine diverse benchmarks.
When Stored Evidence Stops Being Usable: Scale-Conditioned Evaluation of Agent Memory cs.AI · 2026-05-08 · unverdicted · none · ref 17 · internal anchor
A new evaluation protocol shows agent memory reliability degrades variably with added irrelevant sessions depending on agent, memory interface, and scale.
The Context Gathering Decision Process: A POMDP Framework for Agentic Search cs.AI · 2026-05-07 · accept · none · ref 5 · internal anchor
Framing LLM agent loops as a Context Gathering Decision Process POMDP yields a predicate-based belief state that boosts multi-hop reasoning up to 11.4% and an exhaustion gate that cuts token use up to 39% with no performance loss.
XGRAG: A Graph-Native Framework for Explaining KG-based Retrieval-Augmented Generation cs.AI · 2026-04-27 · unverdicted · none · ref 7 · internal anchor
XGRAG uses graph perturbations to quantify component contributions in GraphRAG and achieves 14.81% better explanation quality than text-based baselines on QA datasets, with correlations to graph centrality.
A-MAR: Agent-based Multimodal Art Retrieval for Fine-Grained Artwork Understanding cs.AI · 2026-04-21 · unverdicted · none · ref 17 · internal anchor
A-MAR decomposes art queries into reasoning plans to condition retrieval, leading to improved explanation quality and multi-step reasoning on art benchmarks compared to baselines.
STRIDE: Strategic Iterative Decision-Making for Retrieval-Augmented Multi-Hop Question Answering cs.AI · 2026-04-19 · unverdicted · none · ref 7 · internal anchor
STRIDE uses a meta-planner for entity-agnostic reasoning skeletons and a supervisor for dependency-aware execution to improve retrieval-augmented multi-hop QA.
ROZA Graphs: Self-Improving Near-Deterministic RAG through Evidence-Centric Feedback cs.AI · 2026-04-08 · unverdicted · none · ref 6 · internal anchor
ROZA graphs enable self-improving RAG by storing evidence-specific reasoning chains, yielding up to 10.6pp accuracy gains and 46% lower cost through graph traversal feedback.
GraphScout: Empowering Large Language Models with Intrinsic Exploration Ability for Agentic Graph Reasoning cs.AI · 2026-03-02 · unverdicted · none · ref 11 · internal anchor
GraphScout trains LLMs to autonomously synthesize structured training data from knowledge graphs via flexible exploration tools, enabling a 4B model to outperform larger LLMs by 16.7% on average with fewer inference tokens and strong cross-domain transfer.
Autonomous Knowledge Graph Exploration with Adaptive Breadth-Depth Retrieval cs.AI · 2026-01-20 · unverdicted · none · ref 2 · internal anchor
ARK adaptively retrieves from knowledge graphs using global lexical search and one-hop neighborhood exploration, reaching 59.1% Hit@1 on STaRK with up to 31.4% gains over training-free baselines and enabling distillation to 8B models.
Deterministic Legal Agents: A Canonical Primitive API for Auditable Reasoning over Temporal Knowledge Graphs cs.AI · 2025-10-07 · unverdicted · none · ref 5 · internal anchor
The paper specifies the SAT-Graph API, a canonical primitive interface that enables auditable, deterministic reasoning over temporal knowledge graphs by isolating uncertainty to intent translation and narrative synthesis.
MetaPS: Adaptive Programmatic Strategy Selection for Market Agents cs.AI · 2026-06-21 · unverdicted · none · ref 61 · internal anchor
MetaPS trains models via simulation rollouts to select from programmatic strategy libraries for market agents, yielding better performance than fixed or direct LLM baselines across model sizes.
A Unified Framework for Context-Aware and Relation-Aware Graph Retrieval-Augmented Generation cs.AI · 2026-06-16 · unverdicted · none · ref 4 · internal anchor
HyGRAG is a hierarchical graph RAG framework that constructs LLM summaries over hybrid chunk-entity graphs, retrieves via context and relation awareness across levels, and enables dynamic updates, reporting a 9.7% average accuracy gain on multi-hop reasoning tasks.
Agents-K1: Towards Agent-native Knowledge Orchestration cs.AI · 2026-06-11 · unverdicted · none · ref 26 · 2 links · internal anchor
Agents-K1 is an end-to-end pipeline with a multimodal parser, 4B GRPO-trained extractor, and agent CLI that builds scientific knowledge graphs from full papers and was run on 2.46 million documents to produce Scholar-KG.
TokenMizer: Graph-Structured Session Memory for Long-Horizon LLM Context Management cs.AI · 2026-06-04 · unverdicted · none · ref 8 · internal anchor
TokenMizer builds a knowledge graph of LLM sessions and serializes it into 78-token resume blocks that retain more task, decision, and file information than flat-text baselines at roughly half the token cost.
Beyond Similarity: Trustworthy Memory Search for Personal AI Agents cs.AI · 2026-06-04 · unverdicted · none · ref 9 · internal anchor
MemGate is a 9M-parameter neural gate inserted between vector memory and LLM that converts similarity search into task-conditioned admission, reducing memory-induced threats across agent frameworks while preserving utility.
Reasoning4Sciences: Bridging Reasoning Language Models to All Scientific Branches cs.AI · 2026-05-31 · unverdicted · none · ref 68 · 2 links · internal anchor
A survey of RLM use in 28 disciplines reveals uneven adoption and introduces a maturity assessment framework showing larger gaps when limited to public resources.
Query Symbolically or Retrieve Semantically? A Dataset and Method for Semi-Structured Question Answering cs.AI · 2026-05-26 · unverdicted · none · ref 13 · internal anchor
DualGraph combines semantic textual KGs with symbolic KGs for semi-structured QA and introduces the SpecsQA benchmark, outperforming baselines on both open and specification questions.
Format-Constraint Coupling in Knowledge Graph Construction from Statistical Tables cs.AI · 2026-05-21 · unverdicted · none · ref 1 · internal anchor
Empirical 2x2 factorial study on 6 statistical datasets shows format and schema constraints in LLM-based KG construction from CSV tables produce super-additive fidelity loss up to +1.180, with mismatched pairs falling below baseline, plus release of CSVFidelity-Bench.
GraphMind: From Operational Traces to Self-Evolving Workflow Automation cs.AI · 2026-05-17 · unverdicted · none · ref 14 · 2 links · internal anchor
GraphMind builds and evolves action-centric workflow graphs from traces, navigates them via multi-agent LLM reasoning, and adapts via ATR, outperforming baselines on 93 incidents with 8x less context and 26% lower hallucination in production deployment.
IdeaForge: A Knowledge Graph-Grounded Multi-Agent Framework for Cross-Methodology Innovation Analysis and Patent Claim Generation cs.AI · 2026-05-13 · unverdicted · none · ref 12 · internal anchor
IdeaForge combines multiple innovation methodologies through specialist agents on a persistent knowledge graph, using cross-methodology convergent claim linkages to rank and draft patent claims with higher traceability than single-method baselines.
Goal-Oriented Reasoning for RAG-based Memory in Conversational Agentic LLM Systems cs.AI · 2026-05-12 · unverdicted · none · ref 7 · 2 links · internal anchor
Goal-Mem decomposes user goals into subgoals for targeted memory retrieval using Natural Language Logic, improving performance on multi-hop reasoning tasks in conversational agents.
SAGE: A Self-Evolving Agentic Graph-Memory Engine for Structure-Aware Associative Memory cs.AI · 2026-05-12 · unverdicted · none · ref 206 · internal anchor
SAGE is a self-evolving agentic graph-memory engine that dynamically constructs and refines structured memory graphs via writer-reader feedback, yielding performance gains on multi-hop QA, open-domain retrieval, and long-term agent benchmarks.
HAGE: Harnessing Agentic Memory via RL-Driven Weighted Graph Evolution cs.AI · 2026-05-11 · unverdicted · none · ref 13 · internal anchor
HAGE proposes a trainable weighted graph memory framework with LLM intent classification, dynamic edge modulation, and RL optimization that improves long-horizon reasoning accuracy in agentic LLMs over static baselines.
ScrapMem: A Bio-inspired Framework for On-device Personalized Agent Memory via Optical Forgetting cs.AI · 2026-05-05 · unverdicted · none · ref 12 · 2 links · internal anchor
ScrapMem reports SOTA 51.0% Joint@10 on ATM-Bench with up to 93% memory reduction and 70.3% Recall@10 via optical forgetting and EM-Graph.
Retrieval and Multi-Hop Reasoning in 1M-Token Context Windows: Evaluating LLMs on Classical Chinese Text cs.AI · 2026-05-04 · unverdicted · none · ref 2 · internal anchor
Frontier LLMs solve single-needle retrieval at 1M tokens on classical Chinese but show three distinct accuracy-decay patterns in three-hop reasoning between 256K and 1M tokens.
From Unstructured Recall to Schema-Grounded Memory: Reliable AI Memory via Iterative, Schema-Aware Extraction cs.AI · 2026-04-30 · unverdicted · none · ref 20 · internal anchor
Schema-aware iterative extraction turns AI memory into a verified system of record, reaching 90-97% accuracy on extraction and end-to-end memory benchmarks where retrieval baselines score 80-87%.
ObjectGraph: From Document Injection to Knowledge Traversal -- A Native File Format for the Agentic Era cs.AI · 2026-04-30 · unverdicted · none · ref 2 · internal anchor
ObjectGraph is a Markdown superset file format that represents documents as traversable knowledge graphs, achieving up to 95.3% token reduction for agents with no significant accuracy loss.
Towards Lawful Autonomous Driving: Deriving Scenario-Aware Driving Requirements from Traffic Laws and Regulations cs.AI · 2026-04-27 · unverdicted · none · ref 48 · internal anchor
Grounding LLMs via node-wise anchors in a traffic scenario taxonomy improves law-scenario matching by 29.1% and derived requirement accuracy by 36.9-38.2% on Chinese laws and 5,897 scenarios, enabling a compliance layer and real-time monitor for AVs.
DW-Bench: Benchmarking LLMs on Data Warehouse Graph Topology Reasoning cs.AI · 2026-04-21 · unverdicted · none · ref 3 · internal anchor
DW-Bench shows tool-augmented LLMs outperform static ones on data warehouse graph reasoning but plateau on hard compositional question subtypes.
EHRAG: Bridging Semantic Gaps in Lightweight GraphRAG via Hybrid Hypergraph Construction and Retrieval cs.AI · 2026-04-19 · unverdicted · none · ref 92 · internal anchor
EHRAG constructs structural hyperedges from sentence co-occurrence and semantic hyperedges from entity embedding clusters, then applies hybrid diffusion plus topic-aware PPR to retrieve top-k documents, outperforming baselines on four datasets with linear indexing cost and zero token overhead.
GAM: Hierarchical Graph-based Agentic Memory for LLM Agents cs.AI · 2026-04-14 · unverdicted · none · ref 10 · internal anchor
GAM decouples event-level memory encoding from topic-level consolidation in LLM agents using hierarchical graphs to reduce interference and improve long-term coherence and retrieval.
Beyond RAG for Cyber Threat Intelligence: A Systematic Evaluation of Graph-Based and Agentic Retrieval cs.AI · 2026-04-13 · unverdicted · none · ref 8 · internal anchor
A hybrid graph-text retrieval system for cyber threat intelligence improves multi-hop question answering by up to 35% over vector-based RAG on a 3,300-question benchmark.
HyMem: Hybrid Memory Architecture with Dynamic Retrieval Scheduling cs.AI · 2026-02-15 · unverdicted · none · ref 9 · internal anchor
HyMem introduces dual-granular memory storage with a lightweight summary module for fast responses and selective activation of a deep LLM module for complex queries, with 92.6% lower computational cost than full-context baselines on the LOCOMO and LongMemEval benchmarks.
ARIA: A Causal-Aware Framework for Rescuing LLM Reasoning in Trustworthy Materials Discovery cs.AI · 2026-06-21 · unverdicted · none · ref 12 · internal anchor
ARIA is a three-tier causal framework that conditions LLM knowledge use on mechanistic completeness for forward prediction and inverse design of 2D materials, producing auditable traces.
What Spatial Memory Must Store: Occlusion as the Test for Language-Agent Memory cs.AI · 2026-06-09 · unverdicted · none · ref 12 · internal anchor
Geometry-led weighting outperforms blended memory recall for spatial queries, and a DDA-based visibility predicate correctly flags occluded targets while recall remains occlusion-blind.
Beyond Vector Similarity: A Structural Analysis of Graph-Augmented Retrieval for Industrial Knowledge Graphs cs.AI · 2026-06-04 · unverdicted · none · ref 4 · internal anchor
Empirical comparison on small industrial KG finds vector retrieval fails on structural queries while LLM planner with typed graph operators achieves higher F1 and generalizes to unseen queries.
Exploring Cross-Scenario Generality of Agentic Memory Systems: Diagnostics and a Strong Baseline cs.AI · 2026-06-03 · unverdicted · none · ref 4 · internal anchor
An agentic harness letting the LLM self-manage flat text-file storage via tool calls outperforms eight prior memory systems on cross-scenario generality across QA, chat, trajectory, stress-test, and long-horizon tasks.
Citation-Closure Retrieval and Per-Rule Attribution for Real-World Regulatory Compliance Question Answering cs.AI · 2026-05-28 · unverdicted · none · ref 3 · internal anchor
Presents RegOps-Bench benchmark and RefWalk framework for citation-closure retrieval and per-rule attribution in regulatory compliance QA, reporting substantial gains in recall and citation accuracy over baselines.
CogniFold: Always-On Proactive Memory via Cognitive Folding cs.AI · 2026-05-13 · unverdicted · none · ref 11 · 2 links · internal anchor
CogniFold extends Complementary Learning Systems theory to three layers with a prefrontal intent layer and uses graph self-organization to build proactive agent memory from continuous event streams.
SKG-VLA: Scene Knowledge Graph Priors for Structured Scene Semantics and Multimodal Reasoning for Decision Making cs.AI · 2026-05-10 · unverdicted · none · ref 12 · internal anchor
SKG-VLA models each complaint as a structured scene via a Scene Knowledge Graph to improve policy-grounded multimodal reasoning and decision accuracy.
Grokers: Bottom-Up Inductive Comprehension and Write-Time Intelligence over Typed Knowledge Graphs cs.AI · 2026-05-07 · unverdicted · none · ref 3 · internal anchor
Grokers architecture performs bottom-up inductive comprehension over typed KGs at write time via LM agents, with three claimed formal theorems on byte-identity, accumulation monotonicity, and dual-traversal ordering, plus a deterministic synonym-caching search alternative.
AgenticRAG: Agentic Retrieval for Enterprise Knowledge Bases cs.AI · 2026-05-07 · unverdicted · none · ref 35 · internal anchor
AgenticRAG equips an LLM with iterative retrieval and navigation tools, delivering 49.6% recall@1 on BRIGHT, 0.96 factuality on WixQA, and 92% correctness on FinanceBench.