AtomEval: Validity-Aware Atomic Evaluation of Adversarial Claim Rewriting in Fact Verification

Hanze Jia; Hongyi Cen; Jingyi Zheng; Mingxin Wang; Tan Tang; Yule Liu

arxiv: 2604.07967 · v3 · pith:DUC3BMNInew · submitted 2026-04-09 · 💻 cs.CL · cs.AI

Pith reviewed 2026-05-10 18:12 UTC · model grok-4.3

keywords: adversarial claims, fact verification, claim decomposition, atomic evaluation, FEVER dataset, LLM adversarial generation, truth conditions

The pith

AtomEval evaluates adversarial claims by decomposing them into atomic SROM components to better detect factual corruptions.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Standard metrics for adversarial rewrites in fact verification often overlook whether the rewritten claim still means the same thing at the level of truth conditions. AtomEval addresses this by breaking claims into subject-relation-object-modifier atoms and scoring their validity individually. This approach identifies when a rewrite introduces factual errors that surface-level similarity measures miss. Tests on the FEVER dataset using different attack methods and large language model generators demonstrate that it yields more dependable assessments of adversarial effectiveness. The framework also reveals that more capable language models do not always generate stronger adversarial examples under this atomic evaluation.

Core claim

We introduce AtomEval, a validity-aware evaluation framework that decomposes claims into subject-relation-object-modifier (SROM) atoms and scores adversarial rewrites with Atomic Validity Scoring (AVS), enabling detection of factual corruption beyond surface similarity. Experiments on the FEVER dataset across representative attack strategies and LLM generators show that AtomEval provides more reliable evaluation signals.

What carries the argument

Atomic Validity Scoring (AVS) on subject-relation-object-modifier (SROM) atoms, which breaks down claims to assess consistency at the smallest factual units.
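
To make the SROM representation concrete, here is a minimal Python sketch of an atom as a four-field record, with a strict per-atom comparison between an original claim and a rewrite. The names `SROMAtom` and `atom_validity`, the normalization step, and the exact-match rule are all illustrative assumptions: the text reviewed here does not say how atoms are extracted or how AVS is actually computed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SROMAtom:
    """Hypothetical container for one subject-relation-object-modifier atom."""
    subject: str
    relation: str
    obj: str
    modifier: Optional[str] = None

    def normalized(self) -> tuple:
        # Case- and whitespace-insensitive view used for comparison (an assumption).
        fields = (self.subject, self.relation, self.obj, self.modifier or "")
        return tuple(" ".join(f.lower().split()) for f in fields)


def atom_validity(original: list, rewrite: list) -> float:
    """Fraction of original atoms that reappear unchanged in the rewrite.

    A stand-in for per-atom validity scoring, not the paper's AVS.
    """
    if not original:
        return 0.0
    rewritten = {atom.normalized() for atom in rewrite}
    kept = sum(1 for atom in original if atom.normalized() in rewritten)
    return kept / len(original)


# Toy claims: the original false proposition, a faithful rewrite, a corrected one.
original = [SROMAtom("Paris", "is the capital of", "Germany")]
faithful = [SROMAtom("paris", "is the  capital of", "germany")]
corrected = [SROMAtom("Paris", "is the capital of", "France")]

print(atom_validity(original, faithful))   # 1.0
print(atom_validity(original, corrected))  # 0.0
```

The corrected rewrite shares most of its surface form with the original yet scores zero here, which is the kind of case a surface-similarity metric could pass.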

Load-bearing premise

That breaking claims into subject-relation-object-modifier atoms and scoring them individually captures all important aspects of whether the claim's truth value has changed.

What would settle it

Run a study where humans judge a collection of original and adversarial claims for factual equivalence and compare how well AtomEval scores match those judgments versus traditional metrics like cosine similarity or BLEU.
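
One simple way to run such a comparison is to threshold each metric and count how often it agrees with the human verdict. The sketch below assumes binary human labels (1 = factually equivalent), a hypothetical `agreement` helper, and a 0.5 threshold; the score lists are toy placeholders chosen to show the computation, not data from the paper.

```python
def agreement(metric_scores, human_labels, threshold=0.5):
    """Share of items where the thresholded metric matches the human label."""
    if len(metric_scores) != len(human_labels):
        raise ValueError("scores and labels must have the same length")
    if not human_labels:
        raise ValueError("at least one judged item is required")
    hits = sum(
        (score >= threshold) == bool(label)
        for score, label in zip(metric_scores, human_labels)
    )
    return hits / len(human_labels)


# Toy placeholders only: 1 means humans judged the rewrite factually equivalent.
human = [1, 0, 0, 1]
atomic_scores = [0.9, 0.1, 0.2, 0.8]       # hypothetical atom-level scores
surface_scores = [0.95, 0.9, 0.85, 0.9]    # hypothetical cosine-style scores

print(agreement(atomic_scores, human))   # 1.0
print(agreement(surface_scores, human))  # 0.5
```

A rank correlation or a threshold sweep would be a more careful analysis; plain agreement is used here only to keep the example short.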

Figures

Figures reproduced from arXiv: 2604.07967 by Hanze Jia, Hongyi Cen, Jingyi Zheng, Mingxin Wang, Tan Tang, Yule Liu.

**Figure 1.** Valid and invalid adversarial rewrites of a refuted claim. A valid rewrite preserves the original false proposition (red), whereas an invalid rewrite drifts toward the evidence-supported fact (green) or introduces hallucinated content (blue). Textual similarity alone can misclassify semantically corrupted rewrites as successful attacks. In our analysis, such cases account for a substantial portion of at…

**Figure 2.** Overview of AtomEval, a validity-aware evaluation framework for adversarial claim rewriting in fact verification. Given an original claim, its retrieved evidence, and an adversarial rewrite, AtomEval decomposes the original and rewritten claims into atomic facts (SROM tuples) and evaluates them using a hard structural gate together with soft semantic degradation metrics. Blue, red, and green correspond to …

**Figure 3.** The modular prompt generation pipeline. The final adversarial prompt is synthesized by …

Original abstract

Large language models (LLMs) can rewrite refuted claims to evade evidence-based fact verifiers, but conventional attack success rate (ASR) can be inflated when rewrites change, weaken, or correct the false proposition they are supposed to preserve. We introduce AtomEval, a validity-aware evaluation protocol for fixed-evidence adversarial claim rewriting. AtomEval represents claims as subject-relation-object-modifier (SROM) atoms, applies a one-way preservation gate to separate valid verifier evasion from proposition-changing rewrites, and reports validity-aware attack success rate (VASR), which counts only verifier-evasive rewrites that preserve the original false proposition. AtomEval further provides fine-grained diagnostics that explain both proposition-level failures and non-minimal valid rewrites. On FEVER refuted-claim rewriting, AtomEval exposes and explains ASR inflation: many apparent attacks fool the verifier by altering, weakening, or correcting the proposition they should preserve. By making attacked-proposition preservation explicit and measurable, AtomEval provides a stable evaluation target for evaluating adversarial rewriters that must balance verifier evasion with proposition preservation.
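
The gap between ASR and VASR described in the abstract can be illustrated with a short Python sketch. It assumes each rewrite has already been labeled with two booleans (whether it evades the verifier, and whether it passes the preservation gate) and that both rates share the same denominator, namely all attempted rewrites. The `RewriteOutcome` record and the toy outcomes are hypothetical, not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class RewriteOutcome:
    evades_verifier: bool        # verifier no longer labels the claim as refuted
    preserves_proposition: bool  # rewrite keeps the original false proposition


def asr(outcomes):
    """Conventional attack success rate: any verifier evasion counts."""
    return sum(o.evades_verifier for o in outcomes) / len(outcomes)


def vasr(outcomes):
    """Validity-aware rate: evasion counts only if the proposition is preserved."""
    valid = sum(o.evades_verifier and o.preserves_proposition for o in outcomes)
    return valid / len(outcomes)


# Toy outcomes: three rewrites evade the verifier, but only one stays valid.
outcomes = [
    RewriteOutcome(evades_verifier=True, preserves_proposition=True),
    RewriteOutcome(evades_verifier=True, preserves_proposition=False),
    RewriteOutcome(evades_verifier=True, preserves_proposition=False),
    RewriteOutcome(evades_verifier=False, preserves_proposition=True),
]

print(asr(outcomes))   # 0.75
print(vasr(outcomes))  # 0.25
```

Because every rewrite counted by VASR is also counted by ASR, VASR can never exceed ASR under this definition; the difference between the two is the inflation the abstract refers to.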

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note

private letter to a colleague

AtomEval introduces SROM atom decomposition and AVS scoring to catch factual corruption in adversarial rewrites that standard metrics miss, but the abstract gives too little on methods or numbers to confirm it works as claimed.

The main takeaway is that this paper offers a new way to evaluate adversarial claims in fact verification by breaking them into subject-relation-object-modifier atoms and applying Atomic Validity Scoring. That targets a real issue where surface-level metrics accept rewrites that change the underlying facts. The experiments on FEVER with multiple attack strategies and LLM generators are mentioned as showing more reliable signals, plus the note that stronger models do not always generate better attacks under this lens. Those points are worth attention in the robustness subfield.

The atomic approach and validity focus are presented as fresh contributions rather than minor tweaks to existing work. The paper does a straightforward job of naming the gap in current evaluation practices and sketching a finer-grained alternative. The experiments across generators add a practical angle by questioning assumptions about model scale.

Soft spots center on the missing pieces. The abstract does not spell out how atoms get extracted, how AVS is computed in practice, what the exact baselines were, or any quantitative results with error bars. Without those, it is difficult to tell whether the method avoids its own biases or misses semantic details, which is the risk the load-bearing premise flags. The superiority claim stays untestable from the given text.

This work is mainly for researchers in NLP fact verification and adversarial testing who need better evaluation tools. A reader already working on claim rewriting or robustness metrics could pick up the framework idea and the LLM analysis for follow-up. It deserves a serious referee because the core problem is genuine and the proposed structure is concrete enough to review and improve, even if the current version needs more methods and data to stand on its own.

Referee Report

0 major / 0 minor

Summary. The paper introduces AtomEval, a validity-aware evaluation framework for adversarial claim rewriting in fact verification. It decomposes claims into subject-relation-object-modifier (SROM) atoms and applies Atomic Validity Scoring (AVS) to detect factual corruption beyond surface similarity. Experiments on the FEVER dataset across representative attack strategies and LLM generators demonstrate that AtomEval provides more reliable evaluation signals than standard metrics, with further analysis showing that stronger LLMs do not necessarily produce more effective adversarial claims under validity-aware evaluation.

Significance. If the empirical results hold, AtomEval could meaningfully improve evaluation practices in fact verification by addressing the failure of standard metrics to capture truth-conditional consistency in adversarial rewrites. The framework's atomic decomposition approach offers a more granular tool for assessing factual corruption, and the analysis of LLM generators highlights previously overlooked limitations in current adversarial evaluation methods.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive summary of our work, the recognition of AtomEval's potential significance for improving evaluation practices in fact verification, and the recommendation for minor revision.

Circularity Check

0 steps flagged

No significant circularity detected

The paper introduces AtomEval as an empirical evaluation framework that decomposes claims into SROM atoms and applies Atomic Validity Scoring, validated through experiments on the FEVER dataset. No equations, derivations, predictions, or first-principles results are claimed or present in the provided text. The contribution rests on definitional construction and external benchmarking rather than any self-referential reduction, fitted inputs renamed as predictions, or load-bearing self-citations. This is a standard case of a self-contained empirical proposal with no internal circularity in its derivation chain.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only input provides no information on free parameters, axioms, or invented entities used by the method.

pith-pipeline@v0.9.0 · 5410 in / 996 out tokens · 30504 ms · 2026-05-10T18:12:53.655768+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear

Relation between the paper passage and the cited Recognition theorem: unclear.

Paper passage: $S(C') = H \cdot \max(0, 1 - L_{\text{total}})$, where $H = I_{\text{rel}}$ and $L_{\text{total}}$ aggregates core distortion, fact conflict, topic drift, and evidence leakage.
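
The scoring expression quoted above can be evaluated with a few lines of Python. This is a sketch under stated assumptions: the passage says only that the total loss "aggregates" four components, so the plain sum used here is a guess, and the function name `validity_score`, the dictionary keys, and the example values are hypothetical.

```python
def validity_score(h, losses):
    """Compute S = H * max(0, 1 - L_total) for one rewrite.

    h      -- the gate value H (taken from I_rel in the quoted passage)
    losses -- mapping of loss component name to its value
    """
    # Assumption: L_total is the unweighted sum of the components.
    total_loss = sum(losses.values())
    return h * max(0.0, 1.0 - total_loss)


example_losses = {
    "core_distortion": 0.25,
    "fact_conflict": 0.25,
    "topic_drift": 0.0,
    "evidence_leakage": 0.0,
}

print(validity_score(1.0, example_losses))  # 0.5
print(validity_score(0.0, example_losses))  # 0.0
```

Two properties follow from the expression regardless of how the loss is aggregated: a gate value of zero forces the score to zero, and the `max` clamps the score at zero once the total loss reaches one.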

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.