Data Interpreter: An LLM Agent for Data Science

Sirui Hong, Yizhang Lin, Bang Liu, Bangbang Liu, Binhao Wu, Ceyao Zhang, Danyang Li, Jiaqi Chen, Jiayi Zhang, Jinlin Wang, Li Zhang, Lingyao Zhang, Min Yang, Mingchen Zhuge, Taicheng Guo, Tuo Zhou, Wei Tao, Robert Tang, Xiangtao Lu, Xiawu Z · 2025 · DOI 10.18653/v1/2025.findings-acl.1016

3 Pith papers cite this work. Polarity classification is still indexing.

3 Pith papers citing it

open at publisher browse 3 citing papers

representative citing papers

One Reflection Is Not Enough: Self-Correcting Autonomous Research via Multi-Hypothesis Failure Attribution

cs.AI · 2026-06-30 · unverdicted · novelty 6.0

SAGE with MHFA improves failure recovery in autonomous research agents, raising metrics-bearing outputs from 42% to 92% on a 12-topic benchmark versus single-reflection baselines.

"When to Hand Off, When to Work Together": Expanding Human-Agent Co-Creative Collaboration through Concurrent Interaction

cs.HC · 2026-03-02 · unverdicted · novelty 6.0

Concurrent human-agent interactions occur in 31.8% of turns and follow five action patterns explained by six triggers and four enabling factors, enabled by a context-aware design probe called CLEO.

Text Analytics Evaluation Framework: A Case Study on LLMs and Social Media

cs.CL · 2026-05-20 · unverdicted · novelty 5.0

Presents a new question-based evaluation framework for LLMs on aggregated social media text and reports that performance declines with input scale, task complexity, and numerical operations beyond 500 instances.

citing papers explorer

Showing 3 of 3 citing papers.

One Reflection Is Not Enough: Self-Correcting Autonomous Research via Multi-Hypothesis Failure Attribution cs.AI · 2026-06-30 · unverdicted · none · ref 87
SAGE with MHFA improves failure recovery in autonomous research agents, raising metrics-bearing outputs from 42% to 92% on a 12-topic benchmark versus single-reflection baselines.
"When to Hand Off, When to Work Together": Expanding Human-Agent Co-Creative Collaboration through Concurrent Interaction cs.HC · 2026-03-02 · unverdicted · none · ref 28
Concurrent human-agent interactions occur in 31.8% of turns and follow five action patterns explained by six triggers and four enabling factors, enabled by a context-aware design probe called CLEO.
Text Analytics Evaluation Framework: A Case Study on LLMs and Social Media cs.CL · 2026-05-20 · unverdicted · none · ref 28
Presents a new question-based evaluation framework for LLMs on aggregated social media text and reports that performance declines with input scale, task complexity, and numerical operations beyond 500 instances.

Data Interpreter: An LLM Agent for Data Science

fields

years

verdicts

representative citing papers

citing papers explorer