pith. sign in

arxiv: 2602.10388 · v4 · pith:MPS2VC2Wnew · submitted 2026-02-11 · 💻 cs.CL · cs.AI

Less is Enough: Synthesizing Diverse Data in LLM Feature Space with Sparse Autoencoders

classification 💻 cs.CL cs.AI
keywords datadiversityfeaturedownstreamfeaturesperformancespaceidentify
0
0 comments X
read the original abstract

The diversity of post-training data is critical for effective downstream performance in large language models (LLMs). Many existing approaches to constructing post-training data quantify diversity using text-based metrics that capture linguistic variation, but such metrics provide only weak signals for the task-relevant features that determine downstream performance. In this work, we introduce Feature Activation Coverage (FAC) which measures data diversity in an interpretable feature space. Building upon this metric, we further propose a diversity-driven data synthesis framework, named FAC Synthesis, that first uses a sparse autoencoder to identify missing features from a seed dataset, and then generates synthetic samples that explicitly reflect these features. Experiments show that our approach consistently improves both data diversity and downstream performance on various tasks, including instruction following, toxicity detection, reward modeling, and behavior steering. Interestingly, we identify a shared, interpretable feature space across model families (i.e., LLaMA, Mistral, and Qwen), enabling cross-model knowledge transfer. Our work provides a solid and practical methodology for exploring data-centric optimization of LLMs.

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. BRIDGE the Gap: Mitigating Bias Amplification in Automated Scoring of English Language Learners via Inter-group Data Augmentation

    cs.CL 2026-02 unverdicted novelty 7.0

    BRIDGE reduces bias against high-scoring ELL students in automated scoring by generating synthetic samples via inter-group content pasting and quality discrimination, achieving fairness gains comparable to additional ...

  2. Qwen-Scope: Turning Sparse Features into Development Tools for Large Language Models

    cs.CL 2026-05 unverdicted novelty 4.0

    Qwen-Scope provides open-source sparse autoencoders for Qwen models that function as practical interfaces for steering, evaluating, data workflows, and optimizing large language models.