pith. sign in

Reuse, don't retrain: A recipe for continued pretraining of language models

5 Pith papers cite this work. Polarity classification is still indexing.

5 Pith papers citing it

citation-role summary

background 1 method 1

citation-polarity summary

years

2026 3 2024 2

verdicts

UNVERDICTED 5

representative citing papers

Optimization Hyper-parameter Laws for Large Language Models

cs.LG · 2024-09-07 · unverdicted · novelty 6.0

Opt-Laws predicts LLM final training loss from LR schedules via SDE-derived convergence and escape features, with 94% Top-2 hit rate on held-out schedules and F1=0.92 for divergence detection.

Phoenix-VL 1.5 Medium Technical Report

cs.CL · 2026-05-11 · unverdicted · novelty 3.0

Phoenix-VL 1.5 Medium is a 123B-parameter natively multimodal model that reaches state-of-the-art results on Singapore multimodal, legal, and policy benchmarks after localized training on 1T+ tokens while staying competitive on global benchmarks.

citing papers explorer

Showing 5 of 5 citing papers.