BillSum: A corpus for automatic summarization of US legislation
4 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (4)
Verdicts: UNVERDICTED (4)
citing papers explorer
- Analysis and Explainability of LLMs Via Evolutionary Methods
  Evolutionary trees from LLM weights recover ground-truth training topologies and identify key datasets and layers through phenotypic analysis.
- Less is More: Quality-Aware Training Data Selection for Scientific Summarization
  A 1.88-million-article biomedical summarization dataset is released and quality-aware selection of training data based on abstract alignment outperforms random sampling on factuality metrics.
- Efficient DP-SGD for LLMs with Randomized Clipping
  DP-SGD-RC applies Hutchinson and Hutch++ estimators to approximate per-sample gradient norms for clipping in DP-SGD, claiming competitive privacy noise multipliers and utility on Llama 3.2-1B with reduced memory.
- Discovery-Oriented Faceting: From Coverage to Blind-Spot Discovery
  DOF ranks document categories by distinctiveness instead of size to promote blind-spot discovery, surfacing different content than coverage-based methods across four domains.