Title resolution pending

2025 · arXiv 2504.13171

13 Pith papers cite this work. Polarity classification is still indexing.

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Do Language Models Need Sleep? Offline Recurrence for Improved Online Inference

cs.CL · 2026-05-25 · unverdicted · novelty 7.0

A sleep mechanism with N offline recurrent passes consolidates context into fast weights, improving performance on reasoning tasks where standard transformers fail.

IdleSpec: Exploiting Idle Time via Speculative Planning for LLM Agents

cs.AI · 2026-05-21 · conditional · novelty 7.0

IdleSpec improves LLM agent accuracy by generating and aggregating speculative plans during idle time between tool calls and observations using complementary drafting strategies.

LongMemEval-V2: Evaluating Long-Term Agent Memory Toward Experienced Colleagues

cs.CL · 2026-05-12 · unverdicted · novelty 7.0

LongMemEval-V2 is a new benchmark where AgentRunbook-C reaches 72.5% accuracy on long-term agent memory tasks, beating RAG baselines at 48.5% and basic coding agents at 69.3%.

Post Reasoning: Improving the Performance of Non-Thinking Models at No Cost

cs.AI · 2026-05-07 · conditional · novelty 7.0

Post-Reasoning boosts LLM accuracy by reversing the usual answer-after-reasoning order, delivering mean relative gains of 17.37% across 117 model-benchmark pairs with zero extra cost.

Playful Agentic Robot Learning

cs.RO · 2026-06-17 · unverdicted · novelty 6.0

RATs agents generate and solve their own exploratory tasks during play, distill successful code into a skill library, and reuse it to improve held-out task performance by 20.6 and 17.0 points on two benchmarks.

Language Models Need Sleep: Learning to Self-Modify and Consolidate Memories

cs.LG · 2026-06-02 · unverdicted · novelty 6.0

Language models can use a two-stage sleep process of upward distillation for memory consolidation and RL-based dreaming for unsupervised self-improvement to enable continual learning.

Anticipate and Learn: Unleashing Idle-Time Compute in Proactive Agents

cs.CL · 2026-05-25 · unverdicted · novelty 6.0

ProAct uses idle compute to anticipate user needs via dialogue history and memory, achieving 14.8% fewer turns, 11.7% less user effort, and 28.1% fewer hallucinations than reactive baselines on the new ProActEval benchmark.

Auto-Dreamer: Learning Offline Memory Consolidation for Language Agents

cs.CL · 2026-05-20 · unverdicted · novelty 6.0

Auto-Dreamer trains an offline memory consolidator via GRPO on agent performance to abstract cross-session patterns, outperforming baselines by 7 points on ScienceWorld with 12x smaller memory and generalizing to ALFWorld and WebArena.

The Impact of Response Latency and Task Type on Human-LLM Interaction and Perception

cs.HC · 2026-02-09 · unverdicted · novelty 6.0

Shorter LLM response latencies reduce perceived output thoughtfulness and usefulness, while task type affects prompting frequency independently of latency.

Evolving Agents in the Dark: Retrospective Harness Optimization via Self-Preference

cs.AI · 2026-06-04 · unverdicted · novelty 5.0

RHO is a self-supervised technique that selects challenging past tasks, re-solves them, and uses self-preference to update an agent's harness, raising SWE-Bench Pro pass rate from 59% to 78% without external labels.

BALAR: A Bayesian Agentic Loop for Active Reasoning

cs.AI · 2026-05-06 · unverdicted · novelty 5.0

BALAR is a task-agnostic Bayesian loop that maintains structured beliefs over latent states, selects questions via expected mutual information, and expands its state space when needed, delivering 14.6-38.5% accuracy gains over baselines on detective, puzzle, and clinical diagnosis benchmarks.

Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models

cs.CL · 2025-03-20 · accept · novelty 5.0

A survey organizing techniques to achieve efficient reasoning in LLMs by shortening chain-of-thought outputs.

Evaluating Memory in LLM Agents via Incremental Multi-Turn Interactions

cs.CL · 2025-07-07

citing papers explorer

Showing 1 of 1 citing paper after filters.

Playful Agentic Robot Learning cs.RO · 2026-06-17 · unverdicted · none · ref 56
