pith. machine review for the scientific record.

arXiv: 2503.01804 · v4 · submitted 2025-03-03 · 💻 cs.CL · cs.AI · cs.LG

Recognition: unknown

$\texttt{SEM-CTRL}$: Semantically Controlled Decoding

keywords: sem-ctrl, approach, constraints, semantic, allows, grammars, outputs

Ensuring both syntactic and semantic correctness in Large Language Model (LLM) outputs remains a significant challenge, despite being critical for real-world deployment. In this paper, we introduce $\texttt{SEM-CTRL}$, a unified approach that enforces rich context-sensitive constraints, together with task- and instance-specific semantics, directly on the LLM decoder. Our approach integrates token-level Monte Carlo Tree Search (MCTS) guided by these syntactic and semantic constraints. The constraints over desired outputs are expressed using Answer Set Grammars, a logic-based formalism that generalizes context-sensitive grammars while incorporating background knowledge to represent task-specific semantics. We show that our approach helps guarantee valid completions for any off-the-shelf LLM without the need for fine-tuning. We evaluate $\texttt{SEM-CTRL}$ on a range of tasks, including synthetic grammar synthesis, combinatorial reasoning, JSON parsing, and planning. Our experimental results demonstrate that $\texttt{SEM-CTRL}$ allows even small pre-trained LLMs to efficiently outperform larger variants and state-of-the-art reasoning models (e.g., $\textit{o4-mini}$) while simultaneously guaranteeing semantic validity.
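
To make the idea of enforcing a context-sensitive constraint at the decoder concrete, here is a minimal sketch. It is not the paper's method: it swaps token-level MCTS for plain greedy token masking, and it swaps an Answer Set Grammar for a hand-written Python prefix checker for the language a^n b^n c^n. The names `VOCAB`, `allowed_next`, `toy_logits`, `decode`, and `in_language` are all hypothetical, and `toy_logits` is a stand-in for a real LLM's next-token scores.

```python
# Toy vocabulary; "<eos>" ends generation. All names here are illustrative assumptions.
VOCAB = ["a", "b", "c", "<eos>"]


def allowed_next(prefix):
    """Return the tokens that keep `prefix` extendable to a^n b^n c^n (n >= 1)."""
    s = "".join(prefix)
    na, nb, nc = s.count("a"), s.count("b"), s.count("c")
    # The prefix must have the shape a* b* c* with consistent counts.
    if s != "a" * na + "b" * nb + "c" * nc:
        return []
    if nb > na or nc > nb or (nc > 0 and nb != na):
        return []
    allowed = []
    if nb == 0 and nc == 0:
        allowed.append("a")          # still in the a-block
    if na > 0 and nb < na and nc == 0:
        allowed.append("b")          # b-block must catch up with the a-block
    if na > 0 and nb == na and nc < nb:
        allowed.append("c")          # c-block must catch up with the b-block
    if na > 0 and na == nb == nc:
        allowed.append("<eos>")      # only a complete string may stop
    return allowed


def toy_logits(prefix):
    """Hypothetical stand-in for an LLM; it is biased toward invalid strings."""
    if len(prefix) < 3:
        return {"a": 2.0, "b": 0.5, "c": 1.0, "<eos>": -1.0}
    return {"a": 0.0, "b": 1.0, "c": 2.0, "<eos>": 0.5}


def decode(logits_fn, constrained, max_len=12):
    """Greedy decoding; when `constrained`, invalid tokens are masked out."""
    prefix = []
    for _ in range(max_len):
        logits = logits_fn(prefix)
        candidates = allowed_next(prefix) if constrained else VOCAB
        if not candidates:
            break
        token = max(candidates, key=lambda t: logits[t])
        if token == "<eos>":
            break
        prefix.append(token)
    return "".join(prefix)


def in_language(s):
    """Check membership in a^n b^n c^n with n >= 1."""
    n = len(s) // 3
    return n > 0 and s == "a" * n + "b" * n + "c" * n


if __name__ == "__main__":
    print(decode(toy_logits, constrained=False))
    print(decode(toy_logits, constrained=True))
```

With the constraint switched off, the toy model drifts out of the language; with it switched on, every emitted token keeps the prefix completable. Greedy masking with a length cap can still run out of steps before a string is complete, which is one reason a search procedure over tokens is a natural companion to such constraints.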

This paper has not been read by Pith yet.


Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Learning and Enforcing Context-Sensitive Control for LLMs

    cs.CL · 2026-04 · unverdicted · novelty 7.0

    A framework learns context-sensitive constraints automatically from LLM outputs to enforce perfect adherence during generation without manual specification.