CDR-Bench shows state-of-the-art LLMs fail at compositional and especially order-sensitive data refinement across atomic, order-agnostic, and order-sensitive settings.
and Reddy, Siva
1 Pith paper cites this work. Polarity classification is still indexing.
fields: cs.AI (1)
years: 2026 (1)
verdicts: UNVERDICTED (1)
representative citing papers
CDR-Bench: Evaluating Faithful Execution of Compositional, Order-Sensitive Data Refinement Recipes