Review history

arxiv: 2604.27263 · 2 revisions

Decoupling the Benefits of Subword Tokenization for Language Model Training via Byte-level Simulation

  1. 2026-05-15 UNVERDICTED LOW v0.9.0 novelty 6.0
    59377 ms 5426 in 992 out 2026-05-15T06:55:23.361819+00:00
  2. 2026-05-07 UNVERDICTED LOW v0.9.0 novelty 5.0
    40714 ms 5426 in 1227 out 2026-05-07T09:59:06.804641+00:00