
arXiv preprint arXiv:2407.03502

Arindam Mitra, Luciano Del Corro, Guoqing Zheng, Shweti Mahajan, Dany Rouhana, Andres Codas, Yadong Lu, Wei-ge Chen, Olga Vrousgos, Corby Rosset, Fillipe Silva, Hamed Khanpour, Yash Lara, Ahmed Hassan Awadallah · 2024 · arXiv 2407.03502

Mixed citation behavior. Most common role is background (67%).

14 Pith papers citing it

Background 67% of classified citations



citation-role summary

background 4 · method 1 · other 1

citation-polarity summary

background 4 · unclear 1 · use method 1

representative citing papers

ISE: An Execution-Grounded Recipe for Multi-Turn OS-Agent Trajectories

cs.CL · 2026-06-09 · conditional · novelty 7.0

ISE creates 23,132 execution-grounded multi-turn OS agent trajectories via intent simulation and live execution, improving agent performance on ClawEval from 19.3 to 37.7 pass@1 with Qwen3-8B.

WhiFlash: Accelerating Speculative Decoding with Token-Level Cross-Paradigm Routing

cs.LG · 2026-06-05 · unverdicted · novelty 7.0

WhiFlash introduces token-level cross-paradigm routing between autoregressive and diffusion drafting models, with cache optimizations, to raise acceptance lengths and deliver up to 69.6% throughput gains over EAGLE-3.

WRIT: Write-Read Intensive Trajectory Synthesis for Multi-Turn User-Facing Agents

cs.CL · 2026-06-01 · unverdicted · novelty 6.0

WRIT is a synthesis pipeline that generates write-read intensive trajectories along axes of write-decision count and per-decision evidence burden, enabling a 4B model to outperform GPT-5.1 on τ²-bench with reduced inference tokens.

SkillGen: Verified Inference-Time Agent Skill Synthesis

cs.LG · 2026-05-09 · unverdicted · novelty 6.0

SkillGen synthesizes auditable skills from agent trajectories via contrastive induction on successes and failures, then verifies net performance impact by comparing outcomes with and without the skill on identical tasks.

Can RL Teach Long-Horizon Reasoning to LLMs? Expressiveness Is Key

cs.AI · 2026-05-07 · unverdicted · novelty 6.0 · 3 refs

RL training compute for logical reasoning follows a power law with horizon depth whose exponent rises with logical expressiveness, yielding better downstream transfer when models train on richer logics.

TREX: Automating LLM Fine-tuning via Agent-Driven Tree-based Exploration

cs.AI · 2026-04-15 · unverdicted · novelty 6.0

TREX automates the LLM training lifecycle via collaborative agents and tree-based exploration, delivering consistent performance gains across 10 real-world fine-tuning tasks in FT-Bench.

DataClaw0: Agentic Tailoring Multimodal Data from Raw Streams

cs.LG · 2026-06-19 · unverdicted · novelty 5.0

DataClaw0 introduces an agentic data-tailoring paradigm, a 9B model trained on a synthetically generated dataset, and a new benchmark, claiming improved downstream adaptation in video generation, VQA, and GUI navigation under limited data.

Provenance-Grounded Gating and Adaptive Recovery in Synthetic Post-Training Data Curation

cs.CL · 2026-06-09 · unverdicted · novelty 5.0

Controlled experiments on synthetic post-training data show provenance-grounded gating and adaptive recovery improve yield and recall over baselines, with generator scale as the primary driver of downstream fine-tuning quality.

Towards On-Policy Data Evolution for Visual-Native Multimodal Deep Search Agents

cs.CL · 2026-05-11 · unverdicted · novelty 5.0 · 2 refs

Proposes an image-bank harness and ODE closed-loop data generation to boost multimodal deep search agents, reporting average score gains across 8 benchmarks from 24.9% to 39.0% for an 8B model and from 30.6% to 41.5% for a 30B model.

Kimi K2: Open Agentic Intelligence

cs.LG · 2025-07-28 · unverdicted · novelty 5.0

Kimi K2 is a 1-trillion-parameter MoE model that leads open-source non-thinking models on agentic benchmarks including 65.8 on SWE-Bench Verified and 66.1 on Tau2-Bench.

Claw-R1: A Step-Level Data Middleware System for Agentic Reinforcement Learning

cs.LG · 2026-06-08 · unverdicted · novelty 4.0

Claw-R1 provides a Gateway Server and Data Pool to manage step-level agent interaction traces as structured data assets for agentic RL training.

Phi-4-reasoning Technical Report

cs.AI · 2025-04-30 · unverdicted · novelty 4.0

A 14B reasoning model trained via supervised fine-tuning on selected prompts and o3-mini traces, plus outcome RL, outperforms larger open models like DeepSeek-R1-Distill-Llama-70B on math, coding, planning and related benchmarks.

From LLM Reasoning to Autonomous AI Agents: A Comprehensive Review

cs.AI · 2025-04-28 · accept · novelty 4.0

A survey consolidating benchmarks, agent frameworks, real-world applications, and protocols for LLM-based autonomous agents into a proposed taxonomy with recommendations for future research.

Multi-Agent Collaboration Mechanisms: A Survey of LLMs

cs.AI · 2025-01-10 · unverdicted · novelty 4.0

The survey organizes LLM-based multi-agent collaboration mechanisms into a framework with dimensions of actors, types, structures, strategies, and coordination protocols, reviews applications across domains, and identifies challenges for future research.

citing papers explorer


Kimi K2: Open Agentic Intelligence

cs.LG · 2025-07-28 · unverdicted · none · ref 55
