pith. machine review for the scientific record.

arxiv: 2510.17853 · v4 · submitted 2025-10-15 · 💻 cs.DL

Recognition: unknown

CiteGuard: Faithful Citation Attribution for LLMs via Retrieval-Augmented Validation

keywords: citation, citeguard, attribution, accuracy, citations, faithful, human, llm-as-a-judge

Large Language Models (LLMs) have emerged as powerful assistants for scientific writing. However, concerns remain about the quality and reliability of the generated text, including citation accuracy and faithfulness. While most recent work relies on methods such as LLM-as-a-Judge, the reliability of LLM-as-a-Judge alone is also in doubt. In this work, we reframe citation evaluation as a problem of citation attribution alignment, which assesses whether LLM-generated citations match those a human author would include for the same text. We propose CiteGuard, a retrieval-aware agent framework designed to provide more faithful grounding for citation validation. CiteGuard improves over the prior baseline by 10 percentage points and achieves up to 68.1% accuracy on the CiteME benchmark, approaching human performance (69.2%). It also identifies alternative valid citations and demonstrates generalization ability for cross-domain citation attribution. Our code is available at https://github.com/KathCYM/CiteGuard.

This paper has not been read by Pith yet.


Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Cited but Not Verified: Parsing and Evaluating Source Attribution in LLM Deep Research Agents

    cs.CL · 2026-05 · unverdicted · novelty 7.0

    A new framework parses and evaluates citations in LLM deep research reports across link validity, relevance, and factuality, finding 94%+ link success but only 39-77% factual accuracy.

  2. BibTeX Citation Hallucinations in Scientific Publishing Agents: Evaluation and Mitigation

    cs.DL · 2026-04 · conditional · novelty 7.0

    Frontier LLMs generate BibTeX entries at 83.6% field accuracy but only 50.9% fully correct; two-stage clibib revision raises accuracy to 91.5% and fully correct entries to 78.3% with 0.8% regression.