CRAFT filters training data via source clustering and conditional target selection to bound KL divergence to validation distributions, yielding 43.34 BLEU on English-Hindi translation from 33M pairs while running over 40x faster than TSDS.
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.CL (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
CRAFT: Clustered Regression for Adaptive Filtering of Training data