TrustJudge: Inconsistencies of LLM-as-a-Judge and How to Alleviate Them

5 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary: background 1

citation-polarity summary: background 1

years: 2026 (4), 2024 (1)

verdicts: unverdicted 5

representative citing papers

Can LLMs Rank? A Tale of Triads and Triage

cs.CY · 2026-06-29 · unverdicted · novelty 5.0

LLM ranking reliability for prioritization tasks can be assessed via coefficient of consistency ζ (intra-run circular triads) and Kendall's τ (inter-run distance), with three leading models showing distinct consistency profiles on homelessness allocation and ED triage.
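The two measures named in this summary can be illustrated with a minimal sketch. It assumes the textbook definitions: Kendall's coefficient of consistency computed from the count of circular triads in a complete set of pairwise preferences, and the Kendall tau distance as the number of item pairs that two rankings order differently. The citing paper may define or normalize these differently, and the function names and the `prefers` matrix layout are hypothetical, chosen only for illustration.

```python
from itertools import combinations


def count_circular_triads(prefers):
    """Count triples whose pairwise preferences form a cycle.

    prefers[i][j] is True when item i was preferred to item j
    (hypothetical layout; assumes every pair was compared exactly once).
    """
    n = len(prefers)
    count = 0
    for i, j, k in combinations(range(n), 3):
        # The two possible cycle directions: i > j > k > i, or i > k > j > i.
        if prefers[i][j] and prefers[j][k] and prefers[k][i]:
            count += 1
        elif prefers[i][k] and prefers[k][j] and prefers[j][i]:
            count += 1
    return count


def coefficient_of_consistency(prefers):
    """Kendall's zeta: 1 means no circular triads, 0 means the maximum possible."""
    n = len(prefers)
    if n < 3:
        raise ValueError("need at least three items to form a triad")
    d = count_circular_triads(prefers)
    # Maximum number of circular triads depends on the parity of n.
    max_d = (n**3 - n) / 24 if n % 2 else (n**3 - 4 * n) / 24
    return 1 - d / max_d


def kendall_tau_distance(rank_a, rank_b):
    """Number of item pairs that the two rankings (best first) order differently."""
    pos_a = {item: idx for idx, item in enumerate(rank_a)}
    pos_b = {item: idx for idx, item in enumerate(rank_b)}
    return sum(
        1
        for x, y in combinations(rank_a, 2)
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) < 0
    )
```

Under these assumptions, the first function would be applied to the pairwise judgments from a single run, and the distance to the final rankings from two separate runs over the same items.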

A Survey on LLM-as-a-Judge

cs.CL · 2024-11-23 · unverdicted · novelty 4.0

A survey on LLM-as-a-Judge that reviews reliability strategies, proposes evaluation methods, and introduces a novel benchmark for assessing such systems.
