In Findings of the Association for Computational Linguistics: EMNLP 2021, Marie-Francine Moens, Xuanjing Huang, Lucia Specia, and Scott Wen-tau Yih (Eds.)
7 Pith papers cite this work.
citation-role and citation-polarity summary
years: 2026 (7)
roles: background (1)
polarities: background (1)
representative citing papers
- Predicting Future Behaviors in Reasoning Models Enables Better Steering
  Probes predicting future behaviors from intermediate steps enable Future Probe Controlled Generation for steering large reasoning models with minimal quality degradation.
- Neuro-Symbolic Injection of LTLf Constraints in Autoregressive Reinforcement Learning Policies
  A neuro-symbolic framework compiles LTLf formulas to DFAs, derives differentiable satisfaction signals from DFA progression, and uses them as a logic-based regularization loss to enforce temporal constraints in autoregressive transformer RL policies while preserving competitive returns.
- Bridging Auxiliary Constraints to Resolve Instruction Following in Large Reasoning Models
  CRGC models instructions as constraint graphs, identifies bridge constraints, and cuts violations by 39% on three datasets while preserving reasoning performance.
- Grounded Decoding: Retrieval-Anchored Probability Fusion for Faithful RAG
  Grounded Decoding fuses full-RAG and retrieval-only next-token distributions via normalized geometric mean from a KL-barycenter to improve factual consistency and citation quality in RAG.
- Output Composability of QLoRA PEFT Modules for Plug-and-Play Attribute-Controlled Text Generation
  Summing outputs from separately trained QLoRA PEFT modules provides strong performance for attribute-controlled text generation, often matching or exceeding single-task modules even on single-attribute tests.
- A Comparative Study of Controlled Text Generation Systems Using Level-Playing-Field Evaluation Principles
  Re-evaluating controlled text generation systems under standardized conditions reveals that many published performance claims do not hold, highlighting the need for consistent evaluation practices.
- The Safety-Aware Denoiser for Text Diffusion Models