falling piece

· 2024 · arXiv 2406.04520

12 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

dataset 1

citation-polarity summary

use dataset 1

representative citing papers

TransitLM: A Large-Scale Dataset and Benchmark for Map-Free Transit Route Generation

cs.CL · 2026-05-21 · unverdicted · novelty 7.0

TransitLM is a large-scale dataset and benchmark for training LLMs to generate structurally valid map-free transit routes from origin-destination pairs.

IE as Cache: Information Extraction Enhanced Agentic Reasoning

cs.CL · 2026-04-16 · unverdicted · novelty 7.0

IE-as-Cache framework repurposes information extraction as a dynamic cognitive cache to improve agentic reasoning accuracy in LLMs on challenging benchmarks.

Learning to Interrupt in Language-based Multi-agent Communication

cs.CL · 2026-04-07 · unverdicted · novelty 7.0

HANDRAISER learns optimal interruption points in multi-agent LLM communication using estimated future reward and cost, achieving 32.2% lower communication cost with comparable or better task results across games, scheduling, and debate.

A Close Look At World Model Recovery In Supervised Fine-Tuned LLM Planners

cs.LG · 2026-06-02 · unverdicted · novelty 6.0

Supervised fine-tuning lets LLMs linearly encode action validity and state predicates, with broader state-space coverage during training improving world-model recovery.

Inducing Reasoning Primitives from Agent Traces

cs.AI · 2026-06-02 · unverdicted · novelty 6.0

Reasoning Primitive Induction mines ReAct traces to build a library of typed pseudo-tools that, when composed in a standard ReAct loop, outperform the original agent by 22-44 percentage points on five subtasks.

Extracting Search Trees from LLM Reasoning Traces Reveals Myopic Planning

cs.AI · 2026-05-07 · unverdicted · novelty 6.0 · 5 refs

LLM planning in four-in-a-row is myopic: move choices match a shallow model that ignores deep nodes expanded in reasoning traces.

DoubleAgents: Human-Agent Alignment in a Socially Embedded Workflow

cs.HC · 2025-09-16 · unverdicted · novelty 6.0

DoubleAgents shows that a distributed-cognition design with coordination agent, dashboard, and policy module increases user comfort and reliance on AI agents for coordination tasks over time.

Dream 7B: Diffusion Large Language Models

cs.CL · 2025-08-21 · unverdicted · novelty 6.0

Dream 7B is a 7B diffusion LLM that refines sequences in parallel via denoising and outperforms prior diffusion models on general, mathematical, and coding benchmarks with added flexibility in generation order and quality-speed tradeoffs.

ChinaTravel: An Open-Ended Travel Planning Benchmark with Compositional Constraint Validation for Language Agents

cs.AI · 2024-12-18 · unverdicted · novelty 6.0

ChinaTravel is a benchmark with sandbox, compositional DSL, and 1154-human dataset for testing language agents on open-ended travel planning constraint satisfaction.

Training Language Models to Self-Correct via Reinforcement Learning

cs.LG · 2024-09-19 · unverdicted · novelty 6.0

SCoRe uses multi-turn online RL with regularization on self-generated traces to improve LLM self-correction, achieving 15.6% and 9.1% gains on MATH and HumanEval for Gemini models.

End-to-end PDDL Planning with Hardcoded and Dynamic Agents

cs.AI · 2025-12-10 · unverdicted · novelty 5.0

An end-to-end LLM framework refines natural language into valid PDDL domains and problems via hardcoded and dynamic agents, generates plans with standard engines, and returns readable output.

A Survey of Self-Evolving Agents: What, When, How, and Where to Evolve on the Path to Artificial Super Intelligence

cs.AI · 2025-07-28 · accept · novelty 4.0

The paper delivers the first systematic review of self-evolving agents, structured around what components evolve, when adaptation occurs, and how it is implemented.

citing papers explorer

Showing 12 of 12 citing papers.

TransitLM: A Large-Scale Dataset and Benchmark for Map-Free Transit Route Generation
cs.CL · 2026-05-21 · unverdicted · none · ref 43

IE as Cache: Information Extraction Enhanced Agentic Reasoning
cs.CL · 2026-04-16 · unverdicted · none · ref 33

Learning to Interrupt in Language-based Multi-agent Communication
cs.CL · 2026-04-07 · unverdicted · none · ref 46

A Close Look At World Model Recovery In Supervised Fine-Tuned LLM Planners
cs.LG · 2026-06-02 · unverdicted · none · ref 30

Inducing Reasoning Primitives from Agent Traces
cs.AI · 2026-06-02 · unverdicted · none · ref 20

Extracting Search Trees from LLM Reasoning Traces Reveals Myopic Planning
cs.AI · 2026-05-07 · unverdicted · none · ref 36 · 5 links

DoubleAgents: Human-Agent Alignment in a Socially Embedded Workflow
cs.HC · 2025-09-16 · unverdicted · none · ref 55

Dream 7B: Diffusion Large Language Models
cs.CL · 2025-08-21 · unverdicted · none · ref 25

ChinaTravel: An Open-Ended Travel Planning Benchmark with Compositional Constraint Validation for Language Agents
cs.AI · 2024-12-18 · unverdicted · none · ref 33

Training Language Models to Self-Correct via Reinforcement Learning
cs.LG · 2024-09-19 · unverdicted · none · ref 59

End-to-end PDDL Planning with Hardcoded and Dynamic Agents
cs.AI · 2025-12-10 · unverdicted · none · ref 41

A Survey of Self-Evolving Agents: What, When, How, and Where to Evolve on the Path to Artificial Super Intelligence
cs.AI · 2025-07-28 · accept · none · ref 86