Scaling embedding layers in language models

Da Yu, Edith Cohen, Badih Ghazi, Yangsibo Huang, Pritish Kamath, Ravi Kumar, Daogao Liu, Chiyuan Zhang · arXiv 2502.01637

3 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

VocabTailor: Dynamic Vocabulary Selection for Downstream Tasks in Small Language Models

cs.CL · 2025-08-21 · unverdicted · novelty 7.0

VocabTailor introduces a decoupled dynamic vocabulary selection framework that reduces vocabulary-related memory in SLMs by up to 99% with minimal task performance loss.

Beyond N-gram: Data-Aware X-GRAM Extraction for Efficient Embedding Parameter Scaling

cs.CL · 2026-04-23 · unverdicted · novelty 6.0

X-GRAM applies data-aware dynamic token injection with hybrid hashing and local feature extraction to achieve up to 4.4 accuracy point gains over vanilla backbones and 3.2 over retrieval baselines at 0.73B-1.15B scales using 50% smaller tables.

NGM: A Plug-and-Play Training-Free Memory Module for LLMs

cs.AI · 2026-05-16 · unverdicted · novelty 5.0

NGM is a plug-and-play n-gram memory module that encodes n-grams from pretrained embeddings and gates their injection to improve LLM performance by 0.5-1.2 points on average across eight benchmarks.

citing papers explorer

VocabTailor: Dynamic Vocabulary Selection for Downstream Tasks in Small Language Models · cs.CL · 2025-08-21 · unverdicted · none · ref 21
Beyond N-gram: Data-Aware X-GRAM Extraction for Efficient Embedding Parameter Scaling · cs.CL · 2026-04-23 · unverdicted · none · ref 3
NGM: A Plug-and-Play Training-Free Memory Module for LLMs · cs.AI · 2026-05-16 · unverdicted · none · ref 42
