arXiv preprint arXiv:2502.00640(2025)

Collabllm: From passive responders to active collaborators , author= · 2025 · arXiv 2502.00640

9 Pith papers cite this work. Polarity classification is still indexing.

9 Pith papers citing it

read on arXiv browse 9 citing papers

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

CollabSim: A CSCW-Grounded Methodology for Investigating Collaborative Competence of LLM Agents through Controlled Multi-Agent Experiments

cs.CL · 2026-06-04 · unverdicted · novelty 7.0

CollabSim is a new CSCW-grounded simulation framework that enables controlled multi-agent experiments to measure collaborative competence in LLM agents.

PRIME: Training Free Proactive Reasoning via Iterative Memory Evolution for User-Centric Agent

cs.AI · 2026-04-08 · unverdicted · novelty 7.0

PRIME enables agents to proactively reason in user-centric tasks by iteratively evolving structured memories from interaction trajectories without gradient-based training.

Cognitive World Models for Process-Level Social Influence Evaluation

cs.AI · 2026-06-28 · unverdicted · novelty 6.0

CogWM is a new LLM user model for evaluating social influence by predicting and tracking cognitive state evolution in dialogues, trained on 150k samples and shown to differentiate AI agents effectively.

Recon: Reconstruction-Guided Reasoning Synthesis for User Modeling

cs.CL · 2026-05-26 · unverdicted · novelty 6.0

Recon scores reasoning traces via action reconstruction fidelity, achieving 54.7% win rate over post-hoc baselines and up to 70% when used to train synthesis models across four domains.

MLReplicate: Benchmarking Autonomous Research Systems for Machine Learning Reproducibility

cs.LG · 2026-05-15 · conditional · novelty 6.0

MLReplicate benchmark evaluates six autonomous systems on 45 manuscripts from ICML 2025 papers, finding that automated reviews accept flawed outputs with fabricated claims while human review exposes methodological failures, and that the cheapest system outperforms the most expensive by a wide margin

Alignment has a Fantasia Problem

cs.AI · 2026-04-23 · unverdicted · novelty 6.0

AI alignment must move beyond assuming users have fully formed goals and instead provide active cognitive support to help form and refine intent over time.

CoFi-PGMA: Counterfactual Policy Gradients under Filtered Feedback for Multi-Agent LLMs

cs.LG · 2026-04-03 · unverdicted · novelty 6.0

CoFi-PGMA derives a unified counterfactual policy gradient objective based on marginal contribution to correct filtered feedback for both routing and collaborative multi-agent LLM training.

Quantifying the Utility of User Simulators for Building Collaborative LLM Assistants

cs.CL · 2026-05-10 · unverdicted · novelty 5.0

Fine-tuned simulators grounded in real human data produce LLM assistants that win more often against real users than those trained against role-playing simulators.

Learning Decentralized LLM Collaboration with Multi-Agent Actor Critic

cs.AI · 2026-01-29

citing papers explorer

Showing 1 of 1 citing paper after filters.

MLReplicate: Benchmarking Autonomous Research Systems for Machine Learning Reproducibility cs.LG · 2026-05-15 · conditional · none · ref 37
MLReplicate benchmark evaluates six autonomous systems on 45 manuscripts from ICML 2025 papers, finding that automated reviews accept flawed outputs with fabricated claims while human review exposes methodological failures, and that the cheapest system outperforms the most expensive by a wide margin

arXiv preprint arXiv:2502.00640(2025)

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer