AI agents reproduce 72% of the human ideological gap in effect estimates from an immigration dataset. The paper introduces the m-value and the Agentic Bootstrap to quantify where a reported analysis sits in the multiverse of defensible analysis paths.
arXiv preprint arXiv:2510.16872
10 Pith papers cite this work.
years: 2026 (10)
roles: background (2)
polarities: background (2)
representative citing papers
Data2Story is a multi-agent framework that generates evidence-grounded multimodal articles from data, evaluated on 18 articles against human pieces for verifiability, angle coverage, and quality across human, rubric, and automated judges.
The paper introduces a layered vulnerability framework and attack taxonomy for LLM-driven data agents and demonstrates attacks on four open-source and two production systems.
AIDA is the first end-to-end autonomous agent that combines a domain-specific language with Pareto-guided reinforcement learning to discover insights from complex business data.
DataPRM is an environment-aware generative process reward model that improves LLM data analysis agents by 7-11% on benchmarks via active verification and reflection-aware ternary rewards.
LLMs match original qualitative conclusions in 80% of 180 studies and effect sizes in 24%, performing similarly to humans in a tested subset, positioning them as a screening tool rather than a full replacement.
EvoDS adds autonomous skill acquisition via synthesis-validation-reuse and adaptive context compression via learned control within a two-stage multi-agent RL scheme, claiming 28.9% average gains over prior agents on four benchmarks plus elimination of out-of-token failures.
DA-Studio is an agentic system that generates, executes, and exposes multi-step data analysis workflows in a sandboxed environment with visible traces and artifacts.
DataCOPE uses verifier-guided contrastive distillation from agent trajectories to discover skills, yielding average gains of 9.71% on report-style and 32.30% on reasoning-style data analysis tasks across four model settings.
ProfiliTable is a multi-agent system with profiler, generator, and evaluator components that outperforms baselines on 18 tabular task types via dynamic profiling and closed-loop refinement.
citing papers explorer
- Rewarding the Scientific Process: Process-Level Reward Modeling for Agentic Data Analysis (DataPRM)