Local linearity of LLM layers enables LQR-based closed-loop activation steering with theoretical tracking guarantees.
arXiv preprint arXiv:2402.01694
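The TL;DR above can be illustrated with a minimal sketch, under stated assumptions: if a layer's effect on hidden activations is locally linear (x_next = A x + B u, with u an additive steering vector), a closed-loop steering policy can be computed as a finite-horizon discrete-time LQR. The matrices A, B, the cost weights, and the dimensions below are toy illustrations, not the paper's actual setup.

```python
import numpy as np

def lqr_gains(A, B, Q, R, Qf, T):
    """Backward Riccati recursion; gains[t] is the feedback gain at step t."""
    P = Qf
    gains = []
    for _ in range(T):
        # K = (R + B' P B)^{-1} B' P A
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
        gains.append(K)
    return gains[::-1]

rng = np.random.default_rng(0)
d = 8                                  # toy activation dimension
A = 0.95 * np.eye(d) + 0.05 * rng.standard_normal((d, d))  # local linearization of a layer
B = np.eye(d)                          # steering vector enters additively
Q, R, Qf = np.eye(d), np.eye(d), 100 * np.eye(d)
T = 20

Ks = lqr_gains(A, B, Q, R, Qf, T)

x = rng.standard_normal(d)             # deviation of the activation from its target
drift = x.copy()                       # same state, left unsteered, for comparison
for t in range(T):
    x = A @ x + B @ (-Ks[t] @ x)       # closed-loop steering input u_t = -K_t x_t
    drift = A @ drift

# The steered deviation is driven toward zero; the unsteered one is not.
```

The closed-loop structure is the point: the steering input at each step reacts to the current (deviation of the) activation rather than applying a fixed offset, which is what gives tracking-style guarantees in the linear model.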
3 Pith papers cite this work.
3 representative citing papers (2026)
- Local Linearity of LLMs Enables Activation Steering via Model-Based Linear Optimal Control
  Local linearity of LLM layers enables LQR-based closed-loop activation steering with theoretical tracking guarantees.
- Pref-CTRL: Preference Driven LLM Alignment using Representation Editing
  Pref-CTRL trains a multi-objective value function on preferences to guide representation editing for LLM alignment, outperforming RE-Control on benchmarks with better out-of-domain generalization.
- Meet Dynamic Individual Preferences: Resolving Conflicting Human Value with Paired Fine-Tuning
  Preference-Paired Fine-Tuning (PFT) equips LLMs to handle conflicting and dynamic individual preferences better than single-preference methods, reaching 96.6% accuracy on the new VCD dataset and a 44.76% gain in user alignment even with limited preference history.
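The Pref-CTRL card above describes guiding representation editing with a learned value function over preferences. A hedged sketch of that general idea, assuming gradient ascent on a value function as the editing rule: the quadratic value function, the preference direction `w`, and every name below are illustrative stand-ins, not the paper's API.

```python
import numpy as np

def value(h, w, h0):
    """Toy value: linear preference score minus a drift penalty toward h0."""
    return w @ h - 0.5 * np.sum((h - h0) ** 2)

def edit_activation(h0, w, step=0.2, n_steps=100):
    """Edit an activation by ascending the value gradient."""
    h = h0.copy()
    for _ in range(n_steps):
        grad = w - (h - h0)      # d value / d h for the toy value above
        h = h + step * grad
    return h

rng = np.random.default_rng(1)
h0 = rng.standard_normal(16)     # original hidden activation (toy)
w = rng.standard_normal(16)      # preference direction from a learned value head
h_edit = edit_activation(h0, w)
# h_edit scores strictly higher than h0 under the value function, while the
# drift penalty keeps it close to the original activation.
```

With this toy quadratic value, the edit has the closed form h0 + w, so the sketch mainly shows the control flow: score, take the gradient, nudge the representation.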