Evolutionary trees from LLM weights recover ground-truth training topologies and identify key datasets and layers through phenotypic analysis.
BillSum: A corpus for automatic summarization of US legislation
4 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (4)
Verdicts: UNVERDICTED (4)
Representative citing papers
A 1.88-million-article biomedical summarization dataset is released, and quality-aware selection of training data based on abstract alignment outperforms random sampling on factuality metrics.
DP-SGD-RC applies Hutchinson and Hutch++ estimators to approximate per-sample gradient norms for clipping in DP-SGD, claiming competitive privacy noise multipliers and utility on Llama 3.2-1B with reduced memory.
DOF ranks document categories by distinctiveness instead of size to promote blind-spot discovery, surfacing different content than coverage-based methods across four domains.
citing papers explorer
- Efficient DP-SGD for LLMs with Randomized Clipping