Is there "Secret Sauce" in Large Language Model Development?

· 2026 · cs.AI · arXiv 2602.07238

2 Pith papers cite this work. Polarity classification is still indexing.


abstract

Do leading LLM developers possess a proprietary "secret sauce", or is LLM performance driven by scaling up compute? Using training and benchmark data for 809 models released between 2022 and 2025, we estimate scaling-law regressions with release-date and developer fixed effects. We find clear evidence of developer-specific efficiency advantages, but their importance depends on where models lie in the performance distribution. At the frontier, 80-90% of performance differences are explained by higher training compute, implying that scale, not proprietary technology, drives frontier advances. Away from the frontier, however, proprietary techniques and shared algorithmic progress substantially reduce the compute required to reach fixed capability thresholds. Some companies can systematically produce smaller models more efficiently. Strikingly, we also find substantial variation of model efficiency within companies; a firm can train two models with more than 40x compute efficiency difference. We also discuss the implications for AI leadership and capability diffusion.

representative citing papers

Validity Threats for Foundation Model Research

cs.LG · 2026-06-03 · accept · novelty 6.0

Maps common low-compute research strategies for foundation models onto statistical, internal, external, and construct validity threats via a causal-inference lens.

Two AI Metrics Diverged: Will it Make All the Difference?

cs.AI · 2026-07-01 · unverdicted · novelty 5.0

Bounded performance metrics always favor convergence of AI capabilities to meek models while unbounded metrics allow frontier models to maintain leads indefinitely, with policy implications for capability concentration.

