TrustJudge: Inconsistencies of LLM-as-a-judge and how to alleviate them. arXiv preprint arXiv:2509.21117,

6 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

years: 2026 (5), 2024 (1)

verdicts: UNVERDICTED (6)

roles: background (1)

polarities: background (1)

representative citing papers

Can LLMs Rank? A Tale of Triads and Triage

cs.CY · 2026-06-29 · unverdicted · novelty 5.0

LLM ranking reliability for prioritization tasks can be assessed via coefficient of consistency ζ (intra-run circular triads) and Kendall's τ (inter-run distance), with three leading models showing distinct consistency profiles on homelessness allocation and ED triage.

A Survey on LLM-as-a-Judge

cs.CL · 2024-11-23 · unverdicted · novelty 4.0

A survey on LLM-as-a-Judge that reviews reliability strategies, proposes evaluation methods, and introduces a novel benchmark for assessing such systems.

citing papers explorer

Showing 1 of 1 citing paper after filters.

  • A Survey on LLM-as-a-Judge cs.CL · 2024-11-23 · unverdicted · none · ref 166
