Jellyfish: Instruction-Tuning Local Large Language Models for Data Preprocessing
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers
A unified framework for LLM agent memory is benchmarked, with a new hybrid method outperforming state-of-the-art on standard tasks.
citing papers explorer
- CDR-Bench: Evaluating Faithful Execution of Compositional, Order-Sensitive Data Refinement Recipes
  CDR-Bench shows state-of-the-art LLMs fail at compositional and especially order-sensitive data refinement across atomic, order-agnostic, and order-sensitive settings.
- Memory in the LLM Era: Modular Architectures and Strategies in a Unified Framework
  A unified framework for LLM agent memory is benchmarked, with a new hybrid method outperforming state-of-the-art on standard tasks.