ELFS: Enhancing label-free coreset selection via clustering-based pseudo-labeling
2 Pith papers cite this work. Polarity classification is still indexing.
years: 2026 (2)
verdicts: UNVERDICTED (2)
citing papers explorer
- Rethinking Dataset Distillation for Classification: Do Distilled Sets Outperform Coresets?
  Large-scale standardized benchmarks show that state-of-the-art dataset distillation methods do not outperform coreset selection on ImageNet-scale data and incur substantially higher construction costs.
- SLAP: Stratified Loss-based Pruning for On-Policy Data-Efficient Instruction Tuning
  SLAP is a new batch-aware pruning framework that selects data using distribution-aware stratified sampling and Hessian-approximated gradients. It claims to match or exceed full-dataset performance on LLM instruction tuning tasks with 20-40% less data.