ACPBench: Reasoning about Action, Change, and Planning
2 Pith papers cite this work. Polarity classification is still indexing.
years: 2026 (2)
citing papers explorer
- HalluWorld: A Controlled Benchmark for Hallucination via Reference World Models
HalluWorld is a controlled benchmark using explicit reference world models to automatically label and disentangle hallucinations in LLMs across synthetic environments with varying complexity and observability.
- When AI Says It Feels
LLMs trained via rubric-based self-rewarding RL with GRPO showed improved feeling expression and sycophancy robustness but degraded truthful QA performance.