Title resolution pending

Shivani Upadhyay, Ronak Pradeep, Nandan Thakur, Daniel Campos, Nick Craswell, Ian Soboroff, Jimmy Lin · 2025 · arXiv 1120.374460

3 Pith papers cite this work. Polarity classification is still indexing.

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver will retry the DOI and OpenAlex lookups on its next pass.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Formalized Information Needs Improve Large-Language-Model Relevance Judgments

cs.IR · 2026-04-05 · conditional · novelty 6.0

Synthetically formalizing information needs into topics with descriptions and narratives improves LLM relevance assessor agreement with humans and reduces over-labeling of relevant documents on TREC Deep Learning and Robust04.

When LLM Judges Inflate Scores: Exploring Overrating in Relevance Assessment

cs.IR · 2026-02-19 · unverdicted · novelty 6.0

LLMs consistently overrate relevance of inadequate passages in IR evaluations due to biases toward length and lexical features rather than true content match.

LLMs as Assessors: Right for the Right Reason?

cs.IR · 2026-01-13 · unverdicted · novelty 5.0

LLMs judge document relevance at a level comparable to humans but frequently highlight different passages, indicating they are often not right for the right reasons and cannot fully replace human assessors.

citing papers explorer

Showing 3 of 3 citing papers.

Formalized Information Needs Improve Large-Language-Model Relevance Judgments
cs.IR · 2026-04-05 · conditional · none · ref 1
Synthetically formalizing information needs into topics with descriptions and narratives improves LLM relevance assessor agreement with humans and reduces over-labeling of relevant documents on TREC Deep Learning and Robust04.

When LLM Judges Inflate Scores: Exploring Overrating in Relevance Assessment
cs.IR · 2026-02-19 · unverdicted · none · ref 27
LLMs consistently overrate relevance of inadequate passages in IR evaluations due to biases toward length and lexical features rather than true content match.

LLMs as Assessors: Right for the Right Reason?
cs.IR · 2026-01-13 · unverdicted · none · ref 31
LLMs judge document relevance at a level comparable to humans but frequently highlight different passages, indicating they are often not right for the right reasons and cannot fully replace human assessors.
