{"total":23,"items":[{"citing_arxiv_id":"2606.00610","ref_index":69,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"MemGraphRAG: Memory-based Multi-Agent System for Graph Retrieval-Augmented Generation","primary_cat":"cs.IR","submitted_at":"2026-05-30T08:18:53+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"MemGraphRAG uses a memory-based multi-agent system for globally consistent graph construction from fragmented corpora plus a memory-aware hierarchical retriever, claiming better benchmark performance than prior GraphRAG methods at similar cost.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.29602","ref_index":118,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"CogniVerse: Revolutionizing Multi-Modal Retrieval-Augmented Generation with Cognitive Reflection and Geometric Reasoning","primary_cat":"cs.CV","submitted_at":"2026-05-28T08:40:34+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":3.0,"formal_verification":"none","one_line_summary":"CogniVerse is a proposed MMRAG framework that combines cognitive reflection for retrieval filtering, Riemannian manifold alignment plus spectral graphs for retrieval, and optimal transport loss for generation, claiming better accuracy, coherence, and lower latency than prior systems.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.19738","ref_index":10,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"TERGAD: Structure-Aware Text-Enhanced Representations for Graph Anomaly Detection","primary_cat":"cs.CL","submitted_at":"2026-05-19T12:09:36+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"TERGAD augments graph anomaly detection by converting node topological properties into LLM-generated semantic embeddings that are fused with original attributes via a gated dual-branch autoencoder for joint reconstruction-based anomaly scoring.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.19366","ref_index":198,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Accurate, Efficient, and Explainable Deep Learning Approaches for Environmental Science Problems","primary_cat":"cs.LG","submitted_at":"2026-05-19T04:58:10+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":4.0,"formal_verification":"none","one_line_summary":"The work introduces WaLeF/FIDLAr for flood forecasting, CoDiCast for probabilistic weather, and Hypercube-RAG for explainable environmental QA, claiming superior accuracy, efficiency, and interpretability over baselines.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.18747","ref_index":182,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Code as Agent Harness","primary_cat":"cs.CL","submitted_at":"2026-05-18T17:59:03+00:00","verdict":"ACCEPT","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"A survey that organizes existing work on LLM-based agents around code as the central harness, structured in three layers of interfaces, mechanisms, and multi-agent scaling, with applications across domains and listed open challenges.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"AHE [281] Telemetry-driven optimization Cost, decisions, latency, failures Context, tools, validators GEPA [18] Reflective prompt evolution Scores, feedback, critiques Prompts and instructions EvoMAC [328] Workflow topology evolution Handoffs, idle roles, loops Agent roles and graph SEW [312] Self-evolving workflow Workflow scores, failures Stage order and roles Live-SWE [182] Online agent evolution Live issue trajectories Policies, tools, memory GroundedTTA [232] Test-time adaptation State-action evidence Adaptation rules RLEF [104] Execution-feedback learning Execution rewards, failures Feedback reward signal DeepEval [300] Evaluation harness Scenario and metric traces Regression suites, gates FeedbackEval [23] Repair evaluation benchmark Feedback-task scores Failure taxonomy and eval set"},{"citing_arxiv_id":"2605.18025","ref_index":64,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"TeleCom-Bench: How Far Are Large Language Models from Industrial Telecommunication Applications?","primary_cat":"cs.AI","submitted_at":"2026-05-18T08:14:49+00:00","verdict":"CONDITIONAL","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"TeleCom-Bench reveals LLMs reach 90% on telecom intent and entity tasks but drop to 30% on solution generation and root cause analysis in live network scenarios.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.13050","ref_index":49,"ref_count":2,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Context Training with Active Information Seeking","primary_cat":"cs.CL","submitted_at":"2026-05-13T06:15:32+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"Active information seeking via search tools, when combined with multi-candidate context pruning during training, produces consistent gains on translation, health, and reasoning tasks over naive tool addition or no-tool baselines.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.12061","ref_index":166,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"SAGE: A Self-Evolving Agentic Graph-Memory Engine for Structure-Aware Associative Memory","primary_cat":"cs.AI","submitted_at":"2026-05-12T12:47:43+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"SAGE is a self-evolving agentic graph-memory engine that dynamically constructs and refines structured memory graphs via writer-reader feedback, yielding performance gains on multi-hop QA, open-domain retrieval, and long-term agent benchmarks.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"If the reader reward bias is at mostϵ ϕ, and the writer update improves the surrogate reward by bUϕ(Gθ′)− bUϕ(Gθ)≥∆,(163) then the true utility satisfies U ⋆(Gθ′)−U ⋆(Gθ)≥∆−2ϵ ϕ.(164) Proof.By the bias assumption, U ⋆(Gθ′)≥ bUϕ(Gθ′)−ϵ ϕ, U ⋆(Gθ)≤ bUϕ(Gθ) +ϵ ϕ.(165) Subtracting the two inequalities gives U ⋆(Gθ′)−U ⋆(Gθ)≥ bUϕ(Gθ′)− bUϕ(Gθ)−2ϵ ϕ ≥∆−2ϵ ϕ.(166) 34 Corollary F.4(Reader calibration reduces writer optimization bias).If the reader is calibrated from ϕ to ϕ′ and reduces the reward bias from ϵϕ to ϵϕ′, where ϵϕ′ < ϵ ϕ, then for the same surrogate reward improvement∆, the lower bound on true utility improvement increases by 2(ϵϕ −ϵ ϕ′).(167) Proof. By Theorem 5.3, the true utility improvement lower bound before calibration is ∆−2ϵ ϕ, and"},{"citing_arxiv_id":"2605.07517","ref_index":43,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"LARAG: Link-Aware Retrieval Strategy for RAG Systems in Hyperlinked Technical Documentation","primary_cat":"cs.IR","submitted_at":"2026-05-08T09:50:53+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"LARAG improves RAG answer quality on hyperlinked technical documentation by using author-defined links for retrieval, achieving higher BERTScore while using fewer chunks and tokens than standard embedding-based RAG.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"fine-tuning framework for attributed graph embedding.Advances in neural information processing systems 36(2023), 13308-13325. [42] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.-A., Lacroix, T., Rozière, B., Goyal, N., Hambro, E., Azhar, F., and et al.Llama: Open and efficient foundation language models.arXiv preprint arXiv:2302.13971(2023). [43] Xiao, T., and Zhu, J.Foundations of large language models.arXiv preprint arXiv:2501.09223 (2025). [44] Xiao, T., and Zhu, J.Natural Language Processing: Neural Networks and Large Language Models. NiuTrans, 2025. [45] Zhang, Q., Chen, S., Bei, Y., Yuan, Z., Zhou, H., Hong, Z., Chen, H., Xiao, Y., Zhou, C., Dong, J., et al.A survey of graph retrieval-augmented generation for customized large language"},{"citing_arxiv_id":"2605.02106","ref_index":32,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"The Dynamic Gist-Based Memory Model (DGMM): A Memory-Centric Architecture for Artificial Intelligence","primary_cat":"cs.AI","submitted_at":"2026-05-04T00:02:51+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":3.0,"formal_verification":"none","one_line_summary":"DGMM is proposed as an explicit graph-structured memory architecture for AI that enables persistent episodic memory, cue-based recall, and context-dependent interpretation without retraining.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.00702","ref_index":63,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Learning How and What to Memorize: Cognition-Inspired Two-Stage Optimization for Evolving Memory","primary_cat":"cs.CL","submitted_at":"2026-05-01T14:45:20+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"MemCoE learns memory organization guidelines via contrastive feedback and then trains a guideline-aligned RL policy for memory updates, yielding consistent gains on personalization benchmarks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.17843","ref_index":123,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Learning from AVA: Early Lessons from a Curated and Trustworthy Generative AI for Policy and Development Research","primary_cat":"cs.HC","submitted_at":"2026-04-20T05:53:52+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"AVA is a specialized GenAI platform for development policy research that provides verifiable syntheses from World Bank reports and is associated with 2.4-3.9 hours of weekly time savings in a large-scale user evaluation.","context_count":1,"top_context_role":"background","top_context_polarity":"support","context_text":"Effect of confidence and explanation on accuracy and trust calibration in AI-assisted decision making. InProceedings of the 2020 conference on fairness, accountability, and transparency. 295-305. [122] Zelun Tony Zhang and Heinrich Hußmann. 2021. How to Manage Output Uncertainty: Targeting the Actual End User Problem in Interactions with AI.. In IUI Workshops. [123] Jiawei Zhou, Yixuan Zhang, Qianni Luo, Andrea G Parker, and Munmun De Choudhury. 2023. Synthetic lies: Understanding ai-generated misinfor- mation and evaluating algorithmic and human solutions. InProceedings of the 2023 CHI conference on human factors in computing systems. 1-20. [124] Kaitlyn Zhou, Jena D. Hwang, Xiang Ren, and Maarten Sap. 2024."},{"citing_arxiv_id":"2604.15676","ref_index":104,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"EvoRAG: Making Knowledge Graph-based RAG Automatically Evolve through Feedback-driven Backpropagation","primary_cat":"cs.DB","submitted_at":"2026-04-17T03:54:32+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"EvoRAG adds a feedback-driven backpropagation step that attributes response quality to individual knowledge-graph triplets and updates the graph to raise reasoning accuracy by 7.34 percent over prior KG-RAG methods.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"Retrieval-Augmented Generation (RAG) [17, 35, 101] empowers Large Language Models (LLMs) to improve response quality by leveraging external knowledge, and has been widely adopted in various domains [80, 83, 94]. Among RAG paradigms, Knowledge Graph-based RAG (KG-RAG) [60, 97, 104] has gained increasing at- tention for its ability to transform the textual corpus into structured knowledge graphs (KGs) [104], capturing rich semantic informa- tion and entity-level relations. Given a query 𝑞, the core idea of KG-RAG is to retrieve a relevant knowledge subgraph (KSG) from the KG and feed it together with 𝑞 to the LLM for response gen- eration. The retrieved KSG is typically organized into a sequence Low Quality! Human Feedback Input Feedback-driven Backpropagation"},{"citing_arxiv_id":"2604.12138","ref_index":7,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Retrieval-Augmented Generation Must Move Beyond Factual Grounding to Represent Diverse Opinions","primary_cat":"cs.AI","submitted_at":"2026-04-13T23:39:39+00:00","verdict":null,"verdict_confidence":null,"novelty_score":null,"formal_verification":null,"one_line_summary":null,"context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"Retrieval augmented generation evaluation in the era of large language models: A comprehensive survey.arXiv preprint arXiv:2504.14891, 2024. [6] Penghao Zhao, Hailin Zhang, Qinhan Yu, Zhengren Wang, Yunteng Geng, Fangcheng Fu, Ling Yang, Wentao Zhang, and Bin Cui. Retrieval-augmented generation for ai-generated content: A survey.arXiv preprint arXiv:2402.19473, 2024. [7] Yucheng Wang, Xiaohan Li, Yongbin Gao, Jiawei Chen, and Zhiyuan Liu. A systematic literature review of rag: Techniques, metrics, and challenges.arXiv preprint arXiv:2501.13958, 2025. [8] Wenqi Fan, Yujuan Ding, Liangbo Ning, Shijie Wang, Hengyun Li, Dawei Yin, Tat-Seng Chua, and Qing Li. A survey on rag meeting llms: Towards retrieval- augmented large language models."},{"citing_arxiv_id":"2605.18765","ref_index":36,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"STAR: Semantic-Tuned and Tail-Adaptive Retriever for Graph-Augmented Generation","primary_cat":"cs.IR","submitted_at":"2026-04-11T10:16:51+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"STAR is a semantic-tuned and tail-adaptive retriever for GraphRAG that uses cross-attention interaction learning and path-weighted contrastive learning to mitigate Semantic Shortcut Bias and Long-Tail Path Bias, reporting 1.8% retrieval and 2.2% QA gains.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.02545","ref_index":37,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Competency Questions as Executable Plans: a Controlled RAG Architecture for Cultural Heritage Storytelling","primary_cat":"cs.AI","submitted_at":"2026-04-02T21:54:33+00:00","verdict":"UNVERDICTED","verdict_confidence":"UNKNOWN","novelty_score":7.0,"formal_verification":"none","one_line_summary":"Repurposing competency questions as runtime executable plans creates a controlled neuro-symbolic RAG architecture that produces evidence-closed stories from knowledge graphs.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"logical leap presents a significant dilemma for applications in cultural heritage. LLMs are prone to factual invention and distortion - a phenomenon widely known as \"hallucination\" - which renders them fundamentally unsuitable for domains where factuality is paramount [20,36]. For museums, archives, and ed- ucational platforms, factual veracity is not merely a desirable feature but an ethical and institutional imperative [34,37]. The core of this issue is a deep epistemological conflict. LLMs are probabilistic systems; their mastery lies in statistical correlation and linguistic fluency, not in the representation of factual, verifiable knowledge [18]. In contrast, cultural heritage is a domain of evidence, where the value of a statement is intrinsically tied to its provenance [11]."},{"citing_arxiv_id":"2603.14828","ref_index":14,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Toward Robust GraphRAG: Mitigating Retrieval Drift and Hallucination from Imperfect Knowledge Graphs","primary_cat":"cs.IR","submitted_at":"2026-03-16T05:08:41+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"CS-RAG is a GraphRAG framework that plans queries as ordered atomic constraints, uses anchor-relation aware retrieval, applies sufficiency checks, and falls back to text recovery to reduce drift and hallucination from imperfect KGs.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.22762","ref_index":21,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Behavioral Intelligence Platforms: From Event Streams to Autonomous Insight via Probabilistic Journey Graphs, Behavioral Knowledge Extraction, and Grounded Language Generation","primary_cat":"cs.IR","submitted_at":"2026-03-12T16:47:43+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":4.0,"formal_verification":"none","one_line_summary":"BIP turns event streams into autonomous insights by modeling journeys as absorbing Markov chains, extracting facts via knowledge graphs, and generating narratives constrained to verified data.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.20859","ref_index":2,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"KGiRAG: An Iterative GraphRAG Approach for Responding Sensemaking Queries","primary_cat":"cs.IR","submitted_at":"2026-03-02T10:38:59+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"An iterative feedback-driven GraphRAG architecture produces higher semantic quality and relevance on HotPotQA queries than single-shot baselines.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.20844","ref_index":40,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"AtomicRAG: Atom-Entity Graphs for Retrieval-Augmented Generation","primary_cat":"cs.IR","submitted_at":"2026-02-10T05:57:47+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"AtomicRAG replaces chunk-based and triple-based GraphRAG with atom-entity graphs that store facts as atomic units and use personalized PageRank plus relevance filtering to achieve higher retrieval accuracy and reasoning robustness on five benchmarks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2512.20136","ref_index":58,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"M$^3$KG-RAG: Multi-hop Multimodal Knowledge Graph-enhanced Retrieval-Augmented Generation","primary_cat":"cs.CL","submitted_at":"2025-12-23T07:54:03+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"M³KG-RAG improves multimodal reasoning in large language models by constructing multi-hop knowledge graphs and selectively pruning retrieved context with GRASP.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2507.03724","ref_index":7,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"MemOS: A Memory OS for AI System","primary_cat":"cs.CL","submitted_at":"2025-07-04T17:21:46+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"MemOS introduces a unified memory management framework for LLMs using MemCubes to handle and evolve different memory types for improved controllability and evolvability.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"[5] Penghao Zhao, Hailin Zhang, Qinhan Yu, Zhengren Wang, Yunteng Geng, Fangcheng Fu, Ling Yang, Wentao Zhang, and Bin Cui. Retrieval-augmented generation for ai-generated content: A survey.CoRR, abs/2402.19473, 2024. 31 [6] Yunfan Gao, Yun Xiong, Xinyu Gao, Kangxiang Jia, Jinliu Pan, Yuxi Bi, Yi Dai, Jiawei Sun, Qianyu Guo, Meng Wang, and Haofen Wang. Retrieval-augmented generation for large language models: A survey.CoRR, abs/2312.10997, 2023. [7] Qinggang Zhang, Shengyuan Chen, Yuanchen Bei, Zheng Yuan, Huachi Zhou, Zijin Hong, Junnan Dong, Hao Chen, Yi Chang, and Xiao Huang. A survey of graph retrieval-augmented generation for customized large language models. CoRR, abs/2501.13958, 2025. [8] Bo Ni, Zheyuan Liu, Leyao Wang, Yongjia Lei, Yuying Zhao, Xueqi Cheng, Qingkai Zeng, Luna Dong, Yinglong"},{"citing_arxiv_id":"2502.12911","ref_index":44,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Knapsack Optimization-based Schema Linking for LLM-based Text-to-SQL Generation","primary_cat":"cs.CL","submitted_at":"2025-02-18T14:53:45+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"KaSLA applies knapsack optimization hierarchically to schema linking for LLM text-to-SQL, claiming better results than large models and improved SQL generation on Spider and BIRD.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null}],"limit":50,"offset":0}