International Conference on Learning Representations
4 Pith papers cite this work.
citing papers explorer
- Local Linearity of LLMs Enables Activation Steering via Model-Based Linear Optimal Control
  Local linearity of LLM layers enables LQR-based closed-loop activation steering with theoretical tracking guarantees.
- Prefix-Tuning: Optimizing Continuous Prompts for Generation
  Prefix-tuning matches or exceeds fine-tuning on NLG tasks by optimizing a continuous prefix using 0.1% of parameters while keeping the LM frozen.
- Minimizing Collateral Damage in Activation Steering
  Activation steering is cast as constrained optimization that minimizes collateral damage by weighting perturbations according to the empirical second-moment matrix of activations instead of assuming isotropy.
- Lost in the Tower of Babel: The Adverse Effects of Incidental Multilingualism in LLMs
  Incidental multilingualism from uneven web training makes LLMs unequal, brittle, and opaque across languages.