Smith, and Yejin Choi
4 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 · 4 verdicts (unverdicted) · 4 representative citing papers
Citing papers explorer
-
The Safety-Aware Denoiser for Text Diffusion Models
SAD modifies the denoising process in text diffusion models to enforce safety constraints at inference time, reducing unsafe generations while preserving quality and diversity.
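A minimal sketch of what inference-time safety-constrained denoising can look like, assuming the constraint is applied by downweighting an unsafe portion of the vocabulary at each denoising step. The names (`safety_score`, `safe_denoise_step`, `unsafe_ids`, `penalty`) are hypothetical illustrations, not the paper's actual mechanism.

```python
import math

def safety_score(token_probs, unsafe_ids):
    """Probability mass assigned to a (hypothetical) unsafe vocabulary subset."""
    return sum(token_probs[i] for i in unsafe_ids)

def safe_denoise_step(token_probs, unsafe_ids, penalty=5.0):
    """Downweight unsafe tokens before sampling, then renormalize.

    Applied at every denoising step, this steers generation away from
    unsafe tokens without retraining the model.
    """
    adjusted = [
        p * math.exp(-penalty) if i in unsafe_ids else p
        for i, p in enumerate(token_probs)
    ]
    total = sum(adjusted)
    return [p / total for p in adjusted]
```

Because the adjustment is multiplicative and renormalized, tokens outside the unsafe set keep their relative proportions, which is one way such a method could preserve quality and diversity.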
-
ExecTune: Effective Steering of Black-Box LLMs with Guide Models
ExecTune trains guide models via acceptance sampling, supervised fine-tuning, and structure-aware RL to boost executability of strategies for black-box LLMs, yielding up to 9.2% higher accuracy and 22.4% lower cost on math and code tasks.
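The acceptance-sampling stage mentioned above can be sketched as rejection sampling over guide-model outputs: generate candidate strategies, keep only those that pass an executability check, and use the survivors as fine-tuning data. The function names and the idea of a boolean `executes` check are assumptions for illustration; ExecTune's actual pipeline is not specified here.

```python
def acceptance_sample(generate, executes, n=8):
    """Draw n candidate strategies from a guide model and keep only the
    executable ones (rejection sampling on an executability check).

    generate: zero-argument callable returning one candidate strategy.
    executes: predicate returning True if the strategy runs successfully.
    """
    candidates = [generate() for _ in range(n)]
    return [c for c in candidates if executes(c)]
```

The accepted strategies would then feed the supervised fine-tuning stage, with structure-aware RL refining the guide model further.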
-
A Comparative Study of Controlled Text Generation Systems Using Level-Playing-Field Evaluation Principles
Re-evaluating controlled text generation systems under standardized conditions reveals that many published performance claims do not hold, highlighting the need for consistent evaluation practices.
-
Bye Bye Perspective API: Lessons for Measurement Infrastructure in NLP, CSS and LLM Evaluation
Closure of the Perspective API exposes structural dependence on a single proprietary toxicity scorer, leaving non-updatable benchmarks and irreproducible results while risking continued reliance on closed LLMs.