pith. machine review for the scientific record.

arxiv: 2505.24859 · v3 · submitted 2025-05-30 · 💻 cs.LG · cs.CL

Recognition: unknown

Beyond Multiple Choice: Evaluating Steering Vectors for Summarization

keywords steering · vectors · beyond · choice · control · controlling · free-form · generation

Steering vectors are a lightweight method for controlling text properties by adding a learned bias to language model activations at inference time. While predominantly studied for multiple-choice and toy tasks, their effectiveness in free-form generation remains largely unexplored. Moving "Beyond Multiple Choice," we evaluate steering vectors for controlling topical focus, sentiment, toxicity, and readability in abstractive summaries across the SAMSum, NEWTS, and arXiv datasets. We find that steering effectively controls targeted properties, but high steering strengths consistently induce degenerate repetition and factual hallucinations. Prompting alone preserves summary quality but offers weaker control. Combining both methods yields the strongest control and the most favorable efficacy-quality trade-off at moderate steering strengths. Our work demonstrates that steering vectors face a critical control-quality trade-off in free-form generation, and that hybrid approaches offer the best balance in practice.
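The mechanism named in the abstract, adding a bias to activations at inference time, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `steer` and `steer_sequence`, the `strength` parameter, and the toy activations are hypothetical, and the steering vector is taken as given because the abstract does not say how it is learned.

```python
from typing import List

Vector = List[float]


def steer(hidden: Vector, steering_vector: Vector, strength: float) -> Vector:
    """Add a scaled steering vector to one hidden-state vector.

    Assumption: steering is a plain additive shift, h' = h + strength * v.
    """
    if len(hidden) != len(steering_vector):
        raise ValueError("hidden state and steering vector must have the same size")
    return [h + strength * v for h, v in zip(hidden, steering_vector)]


def steer_sequence(hidden_states: List[Vector], steering_vector: Vector,
                   strength: float) -> List[Vector]:
    """Apply the same shift at every token position (an assumed design choice)."""
    return [steer(h, steering_vector, strength) for h in hidden_states]


# Toy activations for two token positions in a 2-dimensional hidden space.
toy_hidden_states = [[1.0, 2.0], [0.0, -1.0]]
toy_steering_vector = [0.5, -1.0]

steered = steer_sequence(toy_hidden_states, toy_steering_vector, strength=2.0)
```

In this sketch, `strength` plays the role of the steering strength discussed in the abstract: a value of 0 leaves the activations unchanged, and larger magnitudes move them further along the steering direction.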

This paper has not been read by Pith yet.


Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Locate, Steer, and Improve: A Practical Survey of Actionable Mechanistic Interpretability in Large Language Models

    cs.CL · 2026-01 · unverdicted · novelty 5.0

    The survey organizes mechanistic interpretability techniques into a Locate-Steer-Improve framework to enable actionable improvements in LLM alignment, capability, and efficiency.