The Thirteenth International Conference on Learning Representations

(Mis)Fitting Scaling Laws: A Survey of Scaling Law Fitting Techniques in Deep Learning

2 Pith papers cite this work. Polarity classification is still indexing.


representative citing papers

Prescriptive Scaling Laws for Data Constrained Training

cs.LG · 2026-05-02 · unverdicted · novelty 6.0

A one-parameter scaling law models excess loss from data repetition as an additive overfitting penalty, recommending model capacity increases over excessive repetition and showing that strong weight decay reduces the penalty coefficient by ~70%.

Compute Optimal Tokenization

cs.CL · 2026-05-02

citing papers explorer

Showing 2 of 2 citing papers.

Prescriptive Scaling Laws for Data Constrained Training

cs.LG · 2026-05-02 · unverdicted · none · ref 12

A one-parameter scaling law models excess loss from data repetition as an additive overfitting penalty, recommending model capacity increases over excessive repetition and showing that strong weight decay reduces the penalty coefficient by ~70%.

Compute Optimal Tokenization

cs.CL · 2026-05-02 · unreviewed · ref 6
