
Pith Integrity

The author's signature on a scientific paper used to be a one-time promise checked by a few reviewers. Pith makes that promise a continuous, signed, machine-checked property that follows the paper forever and lets anyone challenge any claim on the record.


§1 The contract

Every scientific paper rests on an implicit contract from its author: each reference exists; each cited work says what the paper says it said; each data point is real; each paragraph is the author's; each theorem the paper claims to prove is proved. The contract was historically checked once, at peer review, by a few humans. It is now systematically violated at scale and there is no infrastructure to verify it.

Pith makes the contract explicit and continuously verifiable. Five properties must hold for a paper to deserve trust from anyone who did not write it:

  1. Explicit (live). Every factual claim has machine-readable provenance: cited works, evidence type, location, proof artifact. Surfaced at /pith/<id>/claims.json.
  2. Machine-checkable (live). Verification does not require a human to read the paper. Detectors run automatically, deterministically, and reproducibly.
  3. Continuous (live). The integrity record changes when the world changes. URL availability, DOI status, and Crossref/OpenAlex retraction flags are re-checked on a schedule; a retracted cited work flips the record to critical.
  4. Signed (live). Every finding and every challenge is signed with the Pith Ed25519 key and emitted as a bundle event. Replayable by anyone with the cited paper.
  5. Challengeable (live). Any Pith user can file a signed challenge against a specific claim, reference, attribution, data point, or figure. The challenge is a first-class bundle event. The author may respond. The disagreement is the receipt.

"Live" means the property holds for any paper in the corpus today. The detector pool and challenge mechanism are append-only.


§2 What is running

13,375 papers checked; 2,133 findings (885 critical, 1,248 advisory); 1,046 papers affected; 8 detectors live.

§3 Position

Pith is a support layer, not a replacement. The existing publication infrastructure stays. Pith sits beneath it.

DOI
Crossref and DataCite remain the global registry. Pith never mints a competing DOI. It verifies that each DOI as printed in a bibliography actually resolves, and re-checks over time.
arXiv
arXiv remains the canonical preprint server. Pith ingests arXiv papers, attaches a Pith Number for citable provenance, and adds a verification record beside the paper.
Journals
Peer review remains where editorial judgment happens. Pith adds the layer underneath: deterministic checks that should never have been a human's job in the first place.

"Verification and trust infrastructure could become complementary to the existing publication system."
Milan Zlatanovic, May 2026

§4 What we check

| Detector | Verdict class | What it does |
|---|---|---|
| doi_compliance | incontrovertible | Resolves every DOI and arXiv ID in a paper's bibliography against Crossref, OpenAlex, internal corpus, and arXiv. Flags only identifiers that cannot resolve anywhere. |
| doi_title_agreement | cross source | Compares the title that a paper claims for each cited reference against the title that the reference's DOI or arXiv ID actually resolves to. |
| ai_meta_artifact | incontrovertible | Scans paper body text for verbatim AI assistant artifacts (refusal templates, placeholder cites, training-cutoff disclaimers). |
| external_links | incontrovertible | Extracts external URLs from paper text and re-verifies them with HTTP HEAD/GET. Flags dead repos and 404 URLs with the status code at check time. |
| citation_quote_validity | threshold with margin | When a citing paper attributes a specific factual claim to a referenced work, verifies the claim against the cited paper's text. Publishes only when the cited text is in the Pith corpus and definitively contradicts the attribution. |
| shingle_duplication | incontrovertible | Hashes 40-token n-grams of paper body text and flags identical n-grams shared with another paper that has no shared authors and no citation relationship in either direction. |
| claim_evidence | incontrovertible | For every recorded claim in a paper, verifies the asserted evidence artifact (Lean module, cited work, formal proof) actually exists. |
| cited_work_retraction | cross source | Continuously monitors every cited reference for retraction or expression-of-concern flags from Crossref and OpenAlex. Flags when at least two sources agree (retracted) or any one source surfaces an editorial concern (advisory). |
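To make the shingle_duplication row concrete, here is a minimal sketch of hashing 40-token n-grams and intersecting them across two papers. The tokenization (lowercased whitespace split), the SHA-256 hash, and the function names are assumptions for illustration; the actual detector contract lives at /pith-integrity-protocol and may differ.

```python
import hashlib

SHINGLE_SIZE = 40  # tokens per n-gram, as stated in the detector description


def shingle_hashes(text, size=SHINGLE_SIZE):
    """Return the set of SHA-256 hashes of every `size`-token window in `text`.

    Assumption: tokens are lowercased whitespace-separated words.
    Texts shorter than `size` tokens yield an empty set.
    """
    tokens = text.lower().split()
    return {
        hashlib.sha256(" ".join(tokens[i:i + size]).encode("utf-8")).hexdigest()
        for i in range(len(tokens) - size + 1)
    }


def shared_shingles(text_a, text_b, size=SHINGLE_SIZE):
    """Hashes of n-grams that appear verbatim in both texts."""
    return shingle_hashes(text_a, size) & shingle_hashes(text_b, size)


def is_flaggable(authors_a, authors_b, cites_either_way, shared):
    """Flag only when text overlaps AND there is no shared author
    AND no citation relationship in either direction."""
    no_shared_authors = not (set(authors_a) & set(authors_b))
    return bool(shared) and no_shared_authors and not cites_either_way
```

Because only identical 40-token windows match, a single changed word inside a window removes that window from the intersection, which is what keeps the check deterministic rather than similarity-scored.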

Each detector commits to a verdict class up front. Findings that don't meet the class bar are dropped at the source. Contracts and evidence schemas: /pith-integrity-protocol.
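As an illustration of "dropped at the source", the cross-source rule described for cited_work_retraction can be sketched as a small decision function. The flag values ("retracted", "concern", "none"), the function name, and the choice to drop a single unconfirmed retraction flag are assumptions of this sketch, not the published contract.

```python
def retraction_verdict(flags):
    """Map per-source flags to a severity, or None if the bar is not met.

    `flags` maps a source name (e.g. "crossref", "openalex") to one of
    "retracted", "concern", or "none" (hypothetical encoding).
    """
    values = list(flags.values())
    retracted_votes = sum(1 for v in values if v == "retracted")
    has_concern = any(v == "concern" for v in values)

    if retracted_votes >= 2:
        # At least two sources agree the cited work is retracted.
        return "critical"
    if has_concern:
        # Any one source surfaces an editorial concern.
        return "advisory"
    # Assumption: a single unconfirmed "retracted" flag does not meet the
    # cross-source bar, so no finding is emitted.
    return None
```

For example, `retraction_verdict({"crossref": "retracted", "openalex": "retracted"})` returns `"critical"`, while a lone retraction flag from one source returns `None` under this sketch's assumption.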


§5 Surfaces

Public feeds and protocol
/findings
Severity-banded, detector-filterable feed of every finding the layer emits.
/challenges
Signed challenges filed by readers against specific claims or references.
/pith-integrity-protocol
Detector contracts, verdict classes, evidence schemas, framing rules, rescission.
/number
Pith Number: a citable, content-addressed identifier complementary to DOI/arXiv.

Per-paper records
/pith/2605.12611/integrity.json
Detector summary, findings, and signed events for arXiv:2605.12611.
/pith/2605.12611/claims.json
Machine-readable claim ledger with evidence anchors.
/pith/KBA77APKBK425RMJCW6FVVBP6Y/bundle.json
Full signed bundle including integrity events and challenges.

Schemas, signing, audit
/schemas/pith-integrity-event/v1.json
JSON Schema for the pith.integrity.v1 events emitted with each finding.
/schemas/pith-open-graph-bundle/v1.json
JSON Schema for the bundle envelope.
/schemas/pith-open-graph-event/v1.json
JSON Schema for events inside a bundle.
/pith-signing-key.json
Ed25519 public key used to sign every integrity event and canonical record.
/pith-mirrors.json
Endpoints that mirror Pith bundles. Integrity survives if Pith goes down.

§6 How a finding is produced

  1. A timer wakes one detector. The detector pulls a batch of papers due for a fresh check.
  2. For each paper, the detector inspects extracted references, body text, claims, or external URLs and emits zero or more candidates. Each carries an evidence_hash over the canonicalized evidence payload.
  3. Findings are upserted into integrity_findings keyed by (detector, evidence_hash). Re-detections are idempotent.
  4. The emitter drains pending findings, signs each one with the Pith Ed25519 key, and writes a pith.integrity.v1 event to integrity_event_log.
  5. The paper's Open Graph Bundle now carries those events alongside any signed challenges. External verifiers can re-run the detector code and reproduce the finding.
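Steps 2 through 4 above can be sketched as follows. This is a minimal in-memory illustration, assuming that canonicalization means sorted-key compact JSON and that the hash is SHA-256; the source does not specify either. The class `FindingStore` and its methods are hypothetical stand-ins for integrity_findings, and Ed25519 signing is left out because it needs a third-party library.

```python
import hashlib
import json


def evidence_hash(payload):
    """Hash a canonicalized evidence payload.

    Assumption: canonical form is JSON with sorted keys and no extra
    whitespace, so key order in the input does not change the hash.
    """
    canonical = json.dumps(
        payload, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class FindingStore:
    """In-memory stand-in for a table keyed by (detector, evidence_hash)."""

    def __init__(self):
        self.rows = {}

    def upsert(self, detector, payload):
        """Insert a finding once; re-detections are no-ops. Returns True if new."""
        key = (detector, evidence_hash(payload))
        is_new = key not in self.rows
        if is_new:
            self.rows[key] = {"payload": payload, "status": "pending"}
        return is_new

    def drain_pending(self):
        """Return keys of pending findings and mark them emitted.

        A real emitter would sign each finding here before logging it.
        """
        drained = []
        for key, row in self.rows.items():
            if row["status"] == "pending":
                row["status"] = "emitted"
                drained.append(key)
        return drained
```

Keying on the hash of the canonical payload is what makes re-detection idempotent: the same dead URL found on two consecutive runs maps to the same row and is emitted only once.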

§7 For journals, repositories, and partners

If you run a journal, a preprint server, a discovery engine, or an institutional repository, Pith is built to be consumed. Fetch /pith/<id>/integrity.json for any paper, embed the summary inline beside a paper, subscribe to the integrity and challenge event streams via the Open Graph Bundle, or mirror the bundles. The protocol is open, the implementation runs every minute, and the findings are reproducible by anyone with the cited paper.

Contact: hello@pith.science. Live feed: /findings. Protocol: /pith-integrity-protocol.