Sparse networks from scratch: Faster training without losing performance
5 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
roles
background 2
polarities
background 2
representative citing papers
SMET stabilizes dynamic sparse training of LLMs via optimizer warm-up and density-aware scaling, reducing memory use and enabling practical sparse pre-training as an alternative to dense methods.
A stochastic column-block nonlinear Bregman method is introduced for sparse solutions of nonlinear systems, with a proven convergence rate bound under stated assumptions.
Adaptive λ adjustment for target sparsity in LinBreg and AdaBreg, shown to work on speaker verification models with ECAPA-TDNN and ResNet34.
citing papers explorer
- LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale
LLM.int8() performs 8-bit inference for transformers up to 175B parameters with no accuracy loss by combining vector-wise quantization for most features with 16-bit mixed-precision handling of systematic outlier dimensions.
- Memory-Efficient LLM Training with Dynamic Sparsity: From Stability to Practical Scaling
SMET stabilizes dynamic sparse training of LLMs via optimizer warm-up and density-aware scaling, reducing memory use and enabling practical sparse pre-training as an alternative to dense methods.
- On a stochastic column-block bregman method for nonlinear systems
A stochastic column-block nonlinear Bregman method is introduced for sparse solutions of nonlinear systems, with a proven convergence rate bound under stated assumptions.
- Adaptive Regularization for Sparsity Control in Bregman-Based Optimizers
Adaptive λ adjustment for target sparsity in LinBreg and AdaBreg, shown to work on speaker verification models with ECAPA-TDNN and ResNet34.
- On the Stability of Growth in Structural Plasticity