Prefix-Tuning: Optimizing Continuous Prompts for Generation
15 Pith papers cite this work, alongside 1,373 external citations.
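The paper freezes the language model and optimizes a small continuous prefix, prepended as extra key and value vectors in each attention layer. A minimal single-layer sketch of that idea, assuming PyTorch; names and sizes are illustrative, and the MLP reparameterization the paper uses during training is omitted:

```python
import torch
import torch.nn as nn

# Minimal single-head attention layer with a trainable prefix.
# The backbone projections are frozen; only the prefix key/value vectors
# (a small set of "virtual tokens") receive gradients.
class PrefixAttention(nn.Module):
    def __init__(self, d_model=64, prefix_len=8):
        super().__init__()
        self.q = nn.Linear(d_model, d_model)
        self.k = nn.Linear(d_model, d_model)
        self.v = nn.Linear(d_model, d_model)
        self.out = nn.Linear(d_model, d_model)
        for p in self.parameters():
            p.requires_grad_(False)              # freeze the backbone weights
        self.prefix_k = nn.Parameter(torch.randn(prefix_len, d_model) * 0.02)
        self.prefix_v = nn.Parameter(torch.randn(prefix_len, d_model) * 0.02)

    def forward(self, x):                        # x: (batch, seq, d_model)
        b = x.size(0)
        q = self.q(x)
        # Prepend the learned prefix to the keys and values of the real tokens.
        k = torch.cat([self.prefix_k.expand(b, -1, -1), self.k(x)], dim=1)
        v = torch.cat([self.prefix_v.expand(b, -1, -1), self.v(x)], dim=1)
        attn = torch.softmax(q @ k.transpose(1, 2) / k.size(-1) ** 0.5, dim=-1)
        return self.out(attn @ v)

layer = PrefixAttention()
print([n for n, p in layer.named_parameters() if p.requires_grad])
# ['prefix_k', 'prefix_v']  (only the prefix is trained)
```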
citing papers
- Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Chain-of-thought prompting includes intermediate reasoning steps in few-shot exemplars and thereby elicits strong reasoning from large language models on arithmetic, commonsense, and symbolic tasks (a minimal prompt sketch follows this list).
- LOFT: Low-Rank Orthogonal Fine-Tuning via Task-Aware Support Selection
LOFT unifies orthogonal PEFT by treating adaptation as low-rank subspace rotation and adds task-aware support selection that improves efficiency under fixed budgets.
- Parameter-Efficient Neuroevolution for Diverse LLM Generation: Quality-Diversity Optimization via Prompt Embedding Evolution
QD-LLM evolves prompt embeddings via neuroevolution in a quality-diversity framework, delivering 46% higher coverage and 41% higher QD-score than prior methods on coding and writing benchmarks.
- Autoregressive Visual Generation Needs a Prologue
Prologue introduces dedicated prologue tokens to decouple generation and reconstruction in AR visual models, significantly improving generation FID scores on ImageNet while maintaining reconstruction quality.
- Steer Like the LLM: Activation Steering that Mimics Prompting
PSR models that estimate token-specific steering coefficients from activations outperform standard activation steering and compare favorably to prompting on steering benchmarks.
- OPT: Open Pre-trained Transformer Language Models
OPT releases open decoder-only transformers up to 175B parameters that match GPT-3 performance at one-seventh the carbon cost, along with code and training logs.
- The Power of Scale for Parameter-Efficient Prompt Tuning
Prompt tuning matches full-model fine-tuning on large language models while updating only a small fraction of the parameters, and it improves robustness to domain shift (see the soft-prompt sketch after this list).
- Output Composability of QLoRA PEFT Modules for Plug-and-Play Attribute-Controlled Text Generation
Summing the outputs of separately trained QLoRA PEFT modules gives strong attribute-controlled text generation, often matching or exceeding single-task modules even on single-attribute tests (see the composition sketch after this list).
- Learning to Foresee: Unveiling the Unlocking Efficiency of On-Policy Distillation
On-policy distillation gains efficiency from early foresight in module allocation and low-rank update directions, enabling EffOPD to accelerate training by 3x via adaptive extrapolation without extra modules or tuning.
- Memory Inception: Latent-Space KV Cache Manipulation for Steering LLMs
Memory Inception is a training-free method that injects latent KV banks at chosen layers to steer LLMs, achieving superior control-drift balance and up to 118x storage reduction on personality and structured-reasoning tasks.
- Memory as a Markov Matrix: Sample Efficient Knowledge Expansion via Token-to-Dictionary Mapping
Modeling autoregressive generation as a Markov process over tokens lets new knowledge be incorporated by extending the state space with a token-to-dictionary mapping; the sample complexity is linear in the number of mapped existing tokens, and the mapping is realized via embedding tuning with zero forgetting.
- Seeing is Believing: Robust Vision-Guided Cross-Modal Prompt Learning under Label Noise
VisPrompt improves prompt learning robustness under label noise by injecting instance-level visual semantics via attention and adaptive modulation while freezing the VLM backbone.
- AdaLoRA: Adaptive Budget Allocation for Parameter-Efficient Fine-Tuning
AdaLoRA uses SVD-based pruning to allocate the parameter budget for low-rank fine-tuning updates according to per-matrix importance scores, yielding better performance than uniform allocation especially under tight budgets.
- Towards Scalable Lifelong Knowledge Editing with Selective Knowledge Suppression
LightEdit enables scalable lifelong knowledge editing in LLMs via selective knowledge retrieval and probability suppression during decoding, outperforming prior methods on ZSRE, Counterfact, and RIPE while reducing training costs.
- PaLM 2 Technical Report
PaLM 2 reports state-of-the-art results on language, reasoning, and multilingual tasks with improved efficiency over PaLM.
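For the chain-of-thought entry above, a minimal sketch of the kind of few-shot prompt it describes; the exemplars and final question are illustrative, not drawn from any benchmark:

```python
# Each exemplar shows the intermediate reasoning before the answer, so the
# model is nudged to produce its own reasoning chain for the new question.
exemplars = [
    ("Roger has 5 tennis balls and buys 2 cans of 3 balls each. "
     "How many balls does he have now?",
     "Roger starts with 5 balls. 2 cans of 3 balls is 6 balls. "
     "5 + 6 = 11. The answer is 11."),
    ("A farm has 3 pens with 4 sheep each and sells 2 sheep. "
     "How many sheep remain?",
     "3 pens of 4 sheep is 12 sheep. 12 - 2 = 10. The answer is 10."),
]
question = "If a train travels 60 km per hour for 2.5 hours, how far does it go?"

prompt = ""
for q, reasoning in exemplars:
    prompt += f"Q: {q}\nA: {reasoning}\n\n"
prompt += f"Q: {question}\nA:"   # the model completes this with its own steps
print(prompt)
```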
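For "The Power of Scale for Parameter-Efficient Prompt Tuning" above, a minimal sketch of the soft-prompt idea, assuming PyTorch; sizes are illustrative and the frozen backbone is only a stand-in:

```python
import torch
import torch.nn as nn

# A block of trainable "virtual token" embeddings is prepended to the frozen
# model's input embeddings; only these embeddings are updated during tuning.
class SoftPrompt(nn.Module):
    def __init__(self, n_virtual_tokens=20, d_model=512):
        super().__init__()
        self.embeddings = nn.Parameter(torch.randn(n_virtual_tokens, d_model) * 0.02)

    def forward(self, input_embeds):             # input_embeds: (batch, seq, d_model)
        b = input_embeds.size(0)
        prompt = self.embeddings.expand(b, -1, -1)
        return torch.cat([prompt, input_embeds], dim=1)

soft_prompt = SoftPrompt()
token_embeds = torch.randn(4, 16, 512)           # stand-in for the frozen embedding lookup
extended = soft_prompt(token_embeds)             # fed to the frozen transformer stack
print(extended.shape)                            # torch.Size([4, 36, 512])
optimizer = torch.optim.AdamW(soft_prompt.parameters())  # only the prompt is optimized
```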
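For the QLoRA output-composability entry above, a minimal sketch of one way to read "summing outputs": the same input is scored by each separately trained attribute module and the next-token logits are summed before decoding. The model objects and their call signature are stand-ins, not the paper's interface:

```python
import torch

def composed_next_token(models, input_ids, weights=None):
    # Sum (optionally weighted) next-token logits from each attribute module,
    # then pick greedily from the combined scores.
    weights = weights or [1.0] * len(models)
    total = None
    for w, model in zip(weights, models):
        logits = model(input_ids)                # assumed shape: (batch, vocab)
        total = w * logits if total is None else total + w * logits
    return torch.argmax(total, dim=-1)

# Toy usage: two random scorers stand in for a formality module and a
# sentiment module that share one vocabulary.
vocab = 100
fake_formal = lambda ids: torch.randn(ids.size(0), vocab)
fake_positive = lambda ids: torch.randn(ids.size(0), vocab)
ids = torch.zeros(2, 8, dtype=torch.long)
print(composed_next_token([fake_formal, fake_positive], ids))
```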