Synthetic data boosts multi-label patent classification mainly through volume in low-data regimes, with fidelity mattering more as real data increases and a 20-30% real data mix optimal under fixed budgets.
Hain, and Roman Jurowetzki
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.AI (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
When Does Synthetic Patent Data Help? Volume-Fidelity Trade-offs in Low-Resource Multi-Label Classification
Synthetic data boosts multi-label patent classification mainly through volume in low-data regimes, with fidelity mattering more as real data increases and a 20-30% real data mix optimal under fixed budgets.