pith. machine review for the scientific record.

Can we trust AI benchmarks? An interdisciplinary review of current issues in AI evaluation

6 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary: background 1

citation-polarity summary: support 1

years: 2026 (6)

representative citing papers

Dataset Watermarking for Closed LLMs with Provable Detection

cs.LG · 2026-05-07 · unverdicted · novelty 7.0

A new watermarking method for closed LLMs uses rephrasing to boost the co-occurrence of random word pairs, then detects that signal statistically in model outputs. Detection works reliably even when the watermarked data makes up only 1% of fine-tuning tokens, and utility is preserved.
