Contextual bandits with entropy-based human feedback, 2025

Seraj, R · 2025 · arXiv 2502.08759

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

Autoformalization of Agent Instructions into Policy-as-Code

cs.AI · 2026-06-25 · unverdicted · novelty 5.0

An LLM-based generator-critic loop autoformalizes natural language policies into Cedar policies that cover substantially more of the source specification than hand-coded symbolic enforcement on MedAgentBench.

citing papers explorer

Showing 1 of 1 citing paper.

Autoformalization of Agent Instructions into Policy-as-Code

cs.AI · 2026-06-25 · unverdicted · none · ref 12
