pith. machine review for the scientific record.

Can Large Language Models Be an Alternative to Human Evaluations?

8 Pith papers cite this work. Polarity classification is still indexing.

Years: 2026 (8)

Verdicts: unverdicted (8)

representative citing papers

NARRA-Gym for Evaluating Interactive Narrative Agents

cs.CL · 2026-05-08 · unverdicted · novelty 7.0

NARRA-Gym is an executable benchmark that generates complete interactive narrative episodes from emotional seeds and logs full model trajectories to expose gaps in coherence, adaptation, and personalization that static story tests miss.

LLM Advertisement based on Neuron Auctions

cs.LG · 2026-05-08 · unverdicted · novelty 7.0

Neuron Auctions auction continuous neuron intervention budgets on brand-specific orthogonal subspaces in LLMs to achieve strategy-proof revenue optimization while penalizing user utility loss.
