ByT5: Towards a token-free future with pre-trained byte-to-byte models

Linting Xue, Aditya Barua, Noah Constant, Rami Al · 2021 · arXiv 2105.13626

4 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

MimeLens: Position-Agnostic Content-Type Detection for Binary Fragments

cs.CR · 2026-06-02 · unverdicted · novelty 7.0

MimeLens uses position-agnostic BERT encoders pretrained on random-offset binary windows to output one of 125 libmagic MIME labels, beating Magika on full files and enabling accurate classification on mid-file fragments.

YOMI-Bench: A Benchmark for Evaluating Kanji Reading and Phonological Understanding of LLMs for Japanese

cs.CL · 2026-07-01 · unverdicted · novelty 6.0

YOMI-Bench is a new benchmark of four tasks for kanji reading and phonological understanding in LLMs, showing low performance even for Japanese-specific and commercial models.

The Tokenizer Tax Across 25 European Languages: Domain Invariance, Cross-Lingual Few-Shot Effects, and the Ukrainian Penalty

cs.CL · 2026-05-23 · unverdicted · novelty 6.0

Tokenizer fertility varies 2.5x across 25 European languages with domain-invariant rankings, morphological fragmentation in high-fertility cases, and a Ukrainian penalty from pre-training underrepresentation.

PaLM: Scaling Language Modeling with Pathways

cs.CL · 2022-04-05 · accept · novelty 6.0

PaLM 540B demonstrates continued scaling benefits by setting new few-shot SOTA results on hundreds of benchmarks and outperforming humans on BIG-bench.

citing papers explorer

YOMI-Bench: A Benchmark for Evaluating Kanji Reading and Phonological Understanding of LLMs for Japanese

cs.CL · 2026-07-01 · unverdicted · none · ref 27

YOMI-Bench is a new benchmark of four tasks for kanji reading and phonological understanding in LLMs, showing low performance even for Japanese-specific and commercial models.

The Tokenizer Tax Across 25 European Languages: Domain Invariance, Cross-Lingual Few-Shot Effects, and the Ukrainian Penalty

cs.CL · 2026-05-23 · unverdicted · none · ref 17

Tokenizer fertility varies 2.5x across 25 European languages with domain-invariant rankings, morphological fragmentation in high-fertility cases, and a Ukrainian penalty from pre-training underrepresentation.

PaLM: Scaling Language Modeling with Pathways

cs.CL · 2022-04-05 · accept · none · ref 168

PaLM 540B demonstrates continued scaling benefits by setting new few-shot SOTA results on hundreds of benchmarks and outperforming humans on BIG-bench.
