Self-Verifying Measurement Records: Hash-Linked Evidence Graphs for Hardware Benchmarking

Baris Basaran; Faruk Alpay

arxiv: 2606.27934 · v1 · pith:BLBRVZSPnew · submitted 2026-06-26 · 💻 cs.CR · cs.AR

Pith reviewed 2026-06-29 04:10 UTC · model grok-4.3

keywords hardware benchmarking · tamper-evident records · hash-linked evidence graphs · Freivalds verification · measurement transparency · evidence graphs · GPU benchmarks

The pith

Reported hardware measurements can be made into tamper-evident records that anyone can verify offline.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows how to bind every reported hardware measurement to a hash-linked, append-only structure so that each quantity carries its own observation record and verification record. A reader can then audit the entire log offline without trusting the device owner or the hardware. For matrix products, verification uses Freivalds' probabilistic identity with a tolerance calibrated to the device's measured floating-point residual floor; the paper states that k probes reject a wrong product with probability 1-2^(-k). Other quantities receive an algebraic checksum together with a measured reproducibility class. The method also closes an attack vector in which an adversary who knows the verification probes hides a corruption in their null space.

Core claim

We make a reported hardware measurement a tamper-evident, independently checkable record. Every quantity in the text, a table, or a figure is bound, by its content hash, to the observation and the verification behind it; the whole is a hash-linked, append-only structure that a verifier audits offline without trusting its producer. Matrix products are verified by a probabilistic identity at O(k n^2) cost under a tolerance derived from floating-point error analysis and calibrated to the device's measured residual floor.
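The hash-linked, append-only structure can be illustrated with a minimal sketch. It is a plain linear hash chain, which is simpler than the evidence graph the paper describes; the entry fields, the `GENESIS` placeholder, and the function names are all assumptions made for illustration, and the recorded values are invented.

```python
import hashlib
import json

# Placeholder parent hash for the first entry (an assumption of this sketch).
GENESIS = "0" * 64

def entry_hash(prev_hash: str, payload: dict) -> str:
    """SHA-256 over the parent hash and a canonical JSON encoding of the payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()

def append(log: list, payload: dict) -> None:
    """Append an entry whose hash commits to its payload and to the previous entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"prev": prev_hash, "payload": payload,
                "hash": entry_hash(prev_hash, payload)})

def audit(log: list) -> bool:
    """Recompute every link offline; any edited payload or reordered entry fails."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(prev_hash, entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True

# Hypothetical entries: one observation and the verification bound to it.
log = []
append(log, {"kind": "observation", "quantity": "gemm_tflops", "value": 123.4})
append(log, {"kind": "verification", "method": "freivalds", "k": 20, "passed": True})
print(audit(log))
```

A chain like this only detects edits if the verifier obtains the final hash through a channel the producer cannot rewrite afterwards, for example by publishing it alongside the paper.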

What carries the argument

A hash-linked, append-only evidence graph that binds each reported quantity to its observation and verification via content hashes, augmented by Freivalds' identity for matrix products and algebraic checksums for other quantities.
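Freivalds' check for a claimed product C = AB can be sketched as below: instead of recomputing AB, the verifier compares A(Br) with Cr for k random 0/1 vectors r, which costs O(k n^2). The helper names, the fixed seed, and the tolerance value are assumptions of this sketch; the paper's calibrated tolerance is not reproduced here.

```python
import random

def matvec(M, v):
    """Multiply matrix M (a list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def random_probes(n, k, seed):
    """Draw k probe vectors with independent 0/1 entries (hypothetical helper)."""
    rng = random.Random(seed)
    return [[rng.randint(0, 1) for _ in range(n)] for _ in range(k)]

def freivalds_check(A, B, C, probes, tol):
    """Accept C as A @ B if every probe r satisfies max|A(Br) - Cr| <= tol."""
    for r in probes:
        lhs = matvec(A, matvec(B, r))   # two matrix-vector products
        rhs = matvec(C, r)              # one matrix-vector product
        if max(abs(x - y) for x, y in zip(lhs, rhs)) > tol:
            return False                # residual exceeds tolerance: reject
    return True

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
C_good = [[19, 22], [43, 50]]   # the true product A @ B
C_bad = [[19, 22], [43, 51]]    # one entry off by 1

probes = random_probes(n=2, k=20, seed=0)
print(freivalds_check(A, B, C_good, probes, tol=1e-9))
```

A single probe can miss an error: the probe `[1, 0]` never touches the corrupted column of `C_bad`, while `[0, 1]` exposes it. Repeating with independent probes is what drives the miss probability down.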

If this is right

A verifier can audit the entire record offline without trusting its producer.
Wrong matrix products are rejected with probability 1-2^(-k) after tolerance calibration to the device's residual floor.
Quantities without a probabilistic identity carry an algebraic checksum and a measured reproducibility class.
Power or thermal stress applied from unprivileged access neither shifts the calibrated tolerance nor produces accepted silent errors.
The physical-fault threat model is thereby restricted to rare defective parts or privileged attackers.
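The abstract does not say which algebraic checksums the paper uses. As one illustration of the idea, and not the paper's construction, a sum checksum for the elementwise operation y = a*x + b relies on the identity sum(y) = a*sum(x) + n*b; the function names and the example values are hypothetical.

```python
def axpb(a, x, b):
    """Elementwise y_i = a * x_i + b, standing in for a device computation."""
    return [a * xi + b for xi in x]

def checksum_ok(a, x, b, y, tol):
    """Check the linear identity sum(y) == a * sum(x) + n * b within tol."""
    expected = a * sum(x) + len(x) * b
    return abs(sum(y) - expected) <= tol

x = [1, 2, 3, 4]
y = axpb(3, x, 2)            # [5, 8, 11, 14]
print(checksum_ok(3, x, 2, y, tol=0))

y_corrupt = list(y)
y_corrupt[2] += 1            # a single silently wrong element
print(checksum_ok(3, x, 2, y_corrupt, tol=0))
```

A single sum cannot see errors that cancel each other, which is one reason a checksum of this kind is weaker than a randomized identity.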

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same binding technique could be applied to other forms of computational reporting that currently rest on trust.
Complete protection against privileged attackers would require the record to compose with a hardware root of trust.
The reported residual-floor and reproducibility maps supply device-specific baselines that later work could use to refine tolerance settings.

Load-bearing premise

The tolerance derived from floating-point error analysis and calibrated to the device's measured residual floor, together with Freivalds' identity, suffices to reject wrong matrix products with probability 1-2^(-k), while physical faults remain limited to rare defective parts or privileged attackers.

What would settle it

An experiment in which a deliberately incorrect matrix product passes the verification check at the claimed rate, or a non-privileged physical fault on the device produces a silent error that the record accepts.

Figures

Figures reproduced from arXiv: 2606.27934 by Baris Basaran, Faruk Alpay.

**Figure 2.** Throughput against SM clock for a bounded (…

**Figure 3.** The archive as a single sealed record: the document sources compile to the paper, the ancillary …

Original abstract

Performance numbers reported for hardware are accepted on trust: the reader cannot recompute them, the apparatus is gone, and the silicon itself can be silently wrong, with fleet studies reporting on the order of one core in a thousand returning incorrect arithmetic with no error raised. We make a reported hardware measurement a tamper-evident, independently checkable record. Every quantity in the text, a table, or a figure is bound, by its content hash, to the observation and the verification behind it; the whole is a hash-linked, append-only structure (a transparency log for measurement) that a verifier audits offline without trusting its producer. Matrix products are verified by a probabilistic identity (Freivalds) at O(k n^2) cost under a tolerance we derive from floating-point error analysis and calibrate to the device's own measured residual floor, so a wrong product is rejected with probability 1 - 2^(-k); quantities with no such identity carry an algebraic checksum and a measured reproducibility class. We then treat the check itself as a security object: a probe seed committed for offline reproducibility is an attack surface, and a probe-aware adversary can hide a corruption in the probe's null space, fooling even a quorum of bit-identical witnesses, while a Fiat-Shamir challenge derived from the claimed output closes this. Driving the device from an unprivileged tenant's reach, with a di/dt power virus and a thermal soak, neither moves the calibrated tolerance nor produces a silent error, placing the physical-fault threat at the rare defective part or the privileged attacker and marking the boundary at which the record must compose with a hardware root of trust. We demonstrate the construction across Blackwell and Hopper GPUs and report a residual-floor and reproducibility map by precision, size, and device.
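The probe-aware attack in the abstract can be made concrete with a small sketch. If the 0/1 probes are known in advance, an adversary can look for a coordinate that no probe selects, or for two coordinates that every probe treats identically, and place a corruption there so that it cancels in every check. This discrete search is a simplification chosen for readability; over the reals, any nonzero vector orthogonal to all k probes serves the same purpose. The function name and matrices are hypothetical.

```python
def matvec(M, v):
    """Multiply matrix M (a list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def hide_in_null_space(C, probes, delta):
    """Return a corrupted copy of C that agrees with C on every known probe.

    Returns None if this simple search finds no hiding place.
    """
    n = len(C[0])
    seen = {}  # probe signature of a coordinate -> first coordinate with it
    for j in range(n):
        sig = tuple(r[j] for r in probes)
        bad = [row[:] for row in C]
        if not any(sig):
            bad[0][j] += delta      # no probe ever reads coordinate j
            return bad
        if sig in seen:
            i = seen[sig]
            bad[0][i] += delta      # +delta on coordinate i ...
            bad[0][j] -= delta      # ... cancelled by -delta on j in every probe
            return bad
        seen[sig] = j
    return None

C = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
known_probes = [[1, 0, 1], [0, 1, 0]]
C_bad = hide_in_null_space(C, known_probes, delta=5)
print(all(matvec(C_bad, r) == matvec(C, r) for r in known_probes))
```

A probe the adversary did not anticipate, such as `[1, 0, 0]`, separates `C_bad` from `C`, which is why the probes must not be fixed before the output is committed.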

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper builds a hash-chained record system for hardware measurements that ties results to Freivalds checks and device calibration, but the tolerance step in the probabilistic bound looks like the part that needs the most checking.

The core idea is to make reported hardware numbers into an append-only, hash-linked structure that anyone can audit later without trusting the original run. Every table entry or figure gets bound to its verification step, and matrix products use a calibrated Freivalds test while other quantities get algebraic checksums. They close a probe attack surface with a Fiat-Shamir challenge derived from the output itself.
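A Fiat-Shamir style challenge can be sketched as follows: the probe bits are expanded from a hash that commits to the inputs and to the claimed output, so the producer cannot choose the output after seeing the probes. The serialization, the domain-separation tag `b"probe-v1"`, and the function names are assumptions of this sketch, not the paper's encoding.

```python
import hashlib
import struct

def commit(matrix) -> bytes:
    """SHA-256 over the shape and the little-endian float64 entries of a matrix."""
    h = hashlib.sha256()
    h.update(struct.pack("<II", len(matrix), len(matrix[0])))
    for row in matrix:
        for x in row:
            h.update(struct.pack("<d", float(x)))
    return h.digest()

def derive_probes(A, B, C, k):
    """Expand k probe vectors of 0/1 entries from a hash of A, B and the claimed C."""
    n = len(C[0])
    seed = hashlib.sha256(b"probe-v1" + commit(A) + commit(B) + commit(C)).digest()
    probes = []
    for t in range(k):
        bits, counter = [], 0
        while len(bits) < n:
            # Counter-mode expansion: one SHA-256 block yields 256 probe bits.
            block = hashlib.sha256(seed + struct.pack("<II", t, counter)).digest()
            for byte in block:
                for shift in range(8):
                    bits.append((byte >> shift) & 1)
            counter += 1
        probes.append(bits[:n])
    return probes

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
C = [[19, 22], [43, 50]]
probes = derive_probes(A, B, C, k=4)
print(probes == derive_probes(A, B, C, k=4))  # any verifier re-derives the same probes
```

Because the seed depends on C, changing the claimed output changes the commitment and therefore the challenge, while the derivation stays reproducible offline.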

What stands out is the integration: hash-linked evidence graphs plus Freivalds at O(k n^2), checksums for non-matrix data, and the probe-aware security layer. They also run the physical stress tests with di/dt power viruses and thermal soaks on Blackwell and Hopper GPUs, then map residual floors by precision and size. That gives a concrete boundary on when the record can stand alone versus when it needs a hardware root of trust.

The soft spot is exactly the one the stress-test note flags. Freivalds gives its probability over exact arithmetic; once you add a tolerance derived from floating-point error plus measured residual, the false-negative region is no longer automatically bounded by 2^{-k}. The abstract states the bound directly, but the derivation would have to show that any passing wrong product still falls inside the failure set after tolerance is applied. If that step leans on unstated uniformity between rounding errors and the random vector, the reduction weakens. The paper reports the residual map and reproducibility classes, which helps, but the error analysis needs to be explicit.
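One way to make the gap explicit is sketched below. It is an illustration under stated assumptions, not the paper's derivation: the rounding error of the computed residual is bounded in the worst case by a constant, so no independence between rounding and the probe is needed, and the probe is chosen independently of the error matrix. All symbols are introduced here.

```latex
% Notation (assumptions of this sketch, not taken from the paper):
%   D = C - AB          error of the claimed product
%   r \in \{0,1\}^n     uniform probe, chosen independently of D
%   \tau                acceptance tolerance
%   \rho                worst-case rounding error of the computed residual
\begin{align*}
\text{accept} \;&\Longrightarrow\; |(Dr)_i| \le \tau + \rho \quad \text{for all } i. \\
\intertext{Fix an entry with $D_{ij} \neq 0$ and condition on all $r_l$ with $l \neq j$. Then}
(Dr)_i &= s + D_{ij}\, r_j, \qquad r_j \in \{0,1\},
\intertext{so the two possible values differ by $|D_{ij}|$, and both lie in
$[-(\tau+\rho),\, \tau+\rho]$ only if $|D_{ij}| \le 2(\tau+\rho)$. Hence}
\max_{i,j} |D_{ij}| > 2(\tau+\rho)
  \;&\Longrightarrow\; \Pr[\text{accept one probe}] \le \tfrac{1}{2}
  \;\Longrightarrow\; \Pr[\text{accept } k \text{ probes}] \le 2^{-k}, \\
\max_i \sum_j |D_{ij}| \le \tau - \rho
  \;&\Longrightarrow\; \Pr[\text{accept}] = 1 .
\end{align*}
```

Under these assumptions the 2^{-k} bound holds only for errors with at least one entry larger than twice the effective tolerance, errors below the second threshold always pass, and the region in between is where an explicit argument is needed.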

This is for people working on verifiable benchmarks in cloud or security-sensitive hardware settings. A reader who cares about silent arithmetic faults or transparency logs for measurements will get usable ideas even if they end up adjusting the tolerance argument. It is worth sending to a serious referee because the construction is specific, the experiments are on real silicon, and the threat model is stated clearly enough to evaluate.

Referee Report

2 major / 2 minor

Summary. The paper claims to construct tamper-evident, independently verifiable records of hardware benchmark measurements via hash-linked evidence graphs (a transparency log for measurements). Matrix products are checked via Freivalds' identity under a tolerance derived from floating-point error analysis plus the device's measured residual floor, yielding a claimed rejection probability of 1-2^{-k} for incorrect results; other quantities receive algebraic checksums and reproducibility classes. A Fiat-Shamir challenge derived from the claimed output closes the probe null-space attack surface. Physical stress tests (di/dt power virus, thermal soak) on Blackwell and Hopper GPUs are reported not to shift the tolerance or induce silent errors, confining the physical-fault threat to rare defective parts or privileged attackers. The construction is demonstrated with a residual-floor and reproducibility map by precision, size, and device.

Significance. If the tolerance derivation and security reduction are sound, the work would provide a practical mechanism for making reported hardware performance numbers independently auditable without trusting the producer or the apparatus. It combines standard hash chaining and Fiat-Shamir with numerical verification and device-specific calibration, and the empirical reproducibility map is a concrete contribution. The explicit treatment of the probe seed as an attack surface and the boundary condition for composition with a hardware root of trust are useful framing.

major comments (2)

[§4] §4 (Freivalds verification under tolerance): the abstract and threat-model paragraph state that a wrong product is rejected with probability 1-2^{-k} after the tolerance τ is set from ||A||·||B||·ε + measured residual floor. Freivalds' identity is probabilistic only in exact arithmetic; the paper must show that the measure of the false-negative region created by the additive tolerance remains bounded by 2^{-k} (or that any adversarial product passing the check still lies inside the original 2^{-k} failure set). No such derivation or independence assumption between the random probe vector and rounding/residual errors is supplied.
[§5] §5 (physical-fault experiments): the claim that di/dt power-virus and thermal-soak stress neither move the calibrated tolerance nor produce silent errors is load-bearing for confining the threat model to rare defective parts or privileged attackers. The section must report the number of trials, observed error rates, and statistical bounds that support this boundary; without them the reduction to a hardware root of trust cannot be evaluated.
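The tolerance form quoted in the first major comment, ||A||·||B||·ε plus a measured residual floor, can be written out as a short sketch. The report does not say which norm is meant; the Frobenius norm is used here as an assumption, ε is taken to be the unit roundoff of the working precision, and the residual-floor value in the example is invented.

```python
import math

# Unit roundoff u = 2^-p for a significand of p bits (including the implicit bit).
UNIT_ROUNDOFF = {
    "fp64": 2.0 ** -53,
    "fp32": 2.0 ** -24,
    "fp16": 2.0 ** -11,
    "bf16": 2.0 ** -8,
}

def frobenius(M):
    """Frobenius norm of a matrix given as a list of rows."""
    return math.sqrt(sum(x * x for row in M for x in row))

def tolerance(A, B, precision, residual_floor):
    """tau = ||A||_F * ||B||_F * u + residual_floor (norm choice is an assumption)."""
    return frobenius(A) * frobenius(B) * UNIT_ROUNDOFF[precision] + residual_floor

A = [[3.0, 4.0]]
B = [[1.0], [0.0]]
tau_model = tolerance(A, B, "fp32", residual_floor=0.0)
tau_calibrated = tolerance(A, B, "fp32", residual_floor=1e-6)  # hypothetical floor
print(tau_model, tau_calibrated)
```

A larger floor makes the check more forgiving of benign rounding and also widens the band of wrong products that can pass, which is the trade-off the first major comment asks the authors to quantify.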

minor comments (2)

[Abstract] The reproducibility classes mentioned in the abstract are not defined or enumerated; a short table or paragraph listing the classes and the criteria for assignment would improve clarity.
Notation for the hash-linked evidence graph (nodes, edges, commitment) is introduced gradually; an early figure or pseudocode listing the structure would aid the reader.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful and constructive review. We address the two major comments below and will revise the manuscript to incorporate the requested clarifications and data.

Referee: [§4] §4 (Freivalds verification under tolerance): the abstract and threat-model paragraph state that a wrong product is rejected with probability 1-2^{-k} after the tolerance τ is set from ||A||·||B||·ε + measured residual floor. Freivalds' identity is probabilistic only in exact arithmetic; the paper must show that the measure of the false-negative region created by the additive tolerance remains bounded by 2^{-k} (or that any adversarial product passing the check still lies inside the original 2^{-k} failure set). No such derivation or independence assumption between the random probe vector and rounding/residual errors is supplied.

Authors: We agree that an explicit derivation is required. In the revision we will add a subsection proving that, under standard models of floating-point error (bounded by machine epsilon and independent of the random probe in distribution), the measure of the additional false-negative region is at most a small additive term that can be absorbed into the security parameter k without changing the claimed 1-2^{-k} bound. We will also state the precise independence assumption used.
revision: yes

Referee: [§5] §5 (physical-fault experiments): the claim that di/dt power-virus and thermal-soak stress neither move the calibrated tolerance nor produce silent errors is load-bearing for confining the threat model to rare defective parts or privileged attackers. The section must report the number of trials, observed error rates, and statistical bounds that support this boundary; without them the reduction to a hardware root of trust cannot be evaluated.

Authors: We acknowledge that the current text of §5 omits the requested statistical details. The revision will expand the section to report the exact number of stress trials executed on each device, the observed silent-error count (zero), and the derived statistical bounds (e.g., upper confidence limits on the per-trial fault probability).
revision: yes
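For zero observed errors, the upper confidence limit mentioned in the response has a closed form: if n independent trials all succeed, the one-sided upper bound on the per-trial fault probability at confidence 1-α is 1 - α^(1/n). The sketch below computes it; the trial counts are hypothetical, since the paper's counts are not given here.

```python
def zero_failure_upper_bound(trials: int, confidence: float = 0.95) -> float:
    """Exact one-sided upper bound on the failure probability after
    `trials` independent runs with zero failures observed."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")
    alpha = 1.0 - confidence
    # Solve (1 - p)^trials = alpha for p.
    return 1.0 - alpha ** (1.0 / trials)

# Hypothetical trial counts, for illustration only.
for n in (100, 1_000, 10_000):
    print(n, zero_failure_upper_bound(n))
```

At 95% confidence the bound stays just below the familiar "rule of three" approximation 3/n, so ruling out a fault rate near one in a thousand takes on the order of three thousand clean trials per condition.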

Circularity Check

0 steps flagged

No circularity: the claims rest on the standard Freivalds identity and hash properties, without reduction to self-defined inputs.

full rationale

The abstract derives a tolerance from floating-point error analysis plus measured residual floor, then invokes the established Freivalds probabilistic bound (1-2^(-k)) for rejection of incorrect matrix products. Hash-linking and append-only structure rely on standard cryptographic properties. No equation or claim reduces a reported prediction to a fitted parameter by construction, no self-citation supplies a load-bearing uniqueness theorem, and no ansatz is smuggled via prior work. The chain is externally grounded in Freivalds' algorithm and hash collision resistance.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 1 invented entity

Abstract-only view limits visibility; the approach rests on standard probabilistic verification and floating-point analysis plus a new evidence structure whose details are not expanded.

free parameters (1)

tolerance = device-specific residual floor
Calibrated to the device's own measured residual floor after derivation from floating-point error analysis.

axioms (1)

[standard math] Freivalds' probabilistic identity correctly verifies matrix products at the stated cost and error probability
Invoked for O(k n^2) verification of matrix products with probability 1-2^(-k)

invented entities (1)

hash-linked evidence graph (no independent evidence)
purpose: Tamper-evident append-only structure binding measurements to verifications
New structure proposed to make records independently checkable offline

[1] [1]

Hawkeye: Reproducing GPU-level non- determinism.arXiv:2603.20421, 2026

Erez Badash, Dan Boneh, Ilan Komargodski, and Megha Srivastava. Hawkeye: Reproducing GPU-level non- determinism.arXiv:2603.20421, 2026

Pith/arXiv arXiv 2026

[2] [2]

IPFS: Content addressed, versioned, P2P file system

Juan Benet. IPFS: Content addressed, versioned, P2P file system. InarXiv:1407.3561, 2014

Pith/arXiv arXiv 2014

[3] [3]

Validation of GPU computation in decentralized, trustless networks.arXiv:2501.05374, 2025

Eric Boniardi, Stanley Bishop, and Alison Haire. Validation of GPU computation in decentralized, trustless networks.arXiv:2501.05374, 2025

arXiv 2025

[4] [4]

Proebsting

Christian Collberg and Todd A. Proebsting. Repeatability in computer systems research.Communications of the ACM, 59(3):62–69, 2016

2016

[5] [5]

Practical verified computation with streaming interactive proofs

Graham Cormode, Justin Thaler, and Ke Yi. Practical verified computation with streaming interactive proofs. InProc. Innovations in Theoretical Computer Science (ITCS), 2012

2012

[6] [6]

Crosby and Dan S

Scott A. Crosby and Dan S. Wallach. Efficient data structures for tamper-evident logging. InProceedings of the 18th USENIX Security Symposium, 2009

2009

[7] [7]

FT-Transformer: Resilient and reliable transformer with end-to-end fault tolerant attention.arXiv:2504.02211, 2025

Huangliang Dai, Shixun Wu, Hairui Zhao, Jiajun Huang, Zizhe Jian, Yue Zhu, Haiyang Hu, and Zizhong Chen. FT-Transformer: Resilient and reliable transformer with end-to-end fault tolerant attention.arXiv:2504.02211, 2025

arXiv 2025

[8] [8]

Fu, Stefano Ermon, Atri Rudra, and Christopher Ré

Tri Dao, Daniel Y. Fu, Stefano Ermon, Atri Rudra, and Christopher Ré. FlashAttention: Fast and memory- efficient exact attention with IO-awareness. InAdvances in Neural Information Processing Systems (NeurIPS), 2022

2022

[9] [9]

Parallel reproducible summation.IEEE Transactions on Computers, 64(7):2060–2070, 2015

James Demmel and Hong Diep Nguyen. Parallel reproducible summation.IEEE Transactions on Computers, 64(7):2060–2070, 2015

2060

[10] [10]

Silent data corruptions at scale.arXiv:2102.11245, 2021

Harish Dattatraya Dixit, Sneha Pendharkar, Matt Beadon, Chris Mason, Tejasvi Chakravarthy, Bharath Muthiah, and Sriram Sankar. Silent data corruptions at scale.arXiv:2102.11245, 2021

arXiv 2021

[11] [11]

Regulation (EU) 2024/1689 laying down harmonised rules on artificial intelligence (AI act); logging and record-keeping provisions

European Union. Regulation (EU) 2024/1689 laying down harmonised rules on artificial intelligence (AI act); logging and record-keeping provisions. Official Journal of the European Union, 2024

2024

[12] [12]

Regulation (EU) 2024/2847 on horizontal cybersecurity requirements for products with digital elements (Cyber Resilience Act)

European Union. Regulation (EU) 2024/2847 on horizontal cybersecurity requirements for products with digital elements (Cyber Resilience Act). Official Journal of the European Union, 2024

2024

[13] [13]

How to prove yourself: Practical solutions to identification and signature problems

Amos Fiat and Adi Shamir. How to prove yourself: Practical solutions to identification and signature problems. InAdvances in Cryptology: CRYPTO ’86, volume 263 ofLNCS, pages 186–194. Springer, 1987

1987

[14] [14]

Probabilistic machines can use less running time

R¯ usin,š Freivalds. Probabilistic machines can use less running time. InInformation Processing 77 (IFIP Congress), pages 839–842, 1977

1977

[15] [15]

The knowledge complexity of interactive proof systems

Shafi Goldwasser, Silvio Micali, and Charles Rackoff. The knowledge complexity of interactive proof systems. SIAM Journal on Computing, 18(1):186–208, 1989. 15

1989

[16] [16]

Higham.Accuracy and Stability of Numerical Algorithms

Nicholas J. Higham.Accuracy and Stability of Numerical Algorithms. SIAM, 2nd edition, 2002

2002

[17] Nicholas J. Higham and Theo Mary. A new approach to probabilistic rounding error analysis. SIAM Journal on Scientific Computing, 41(5):A2815–A2835, 2019.

[18] Peter H. Hochschild, Paul Turner, Jeffrey C. Mogul, Rama Govindaraju, Parthasarathy Ranganathan, David E. Culler, and Amin Vahdat. Cores that don’t count. In Proceedings of the Workshop on Hot Topics in Operating Systems (HotOS), pages 9–16. ACM, 2021.

[19] Torsten Hoefler and Roberto Belli. Scientific benchmarking of parallel computing systems. In Proc. Int. Conf. for High Performance Computing, Networking, Storage and Analysis (SC). ACM, 2015.

[20] Kuang-Hua Huang and Jacob A. Abraham. Algorithm-based fault tolerance for matrix operations. IEEE Transactions on Computers, C-33(6):518–528, 1984.

[21] Aaron Jarmusch and Sunita Chandrasekaran. Microbenchmarking NVIDIA’s Blackwell architecture: An in-depth architectural analysis. arXiv:2512.02189, 2025.

[22] Aaron Jarmusch, Nathan Graddon, and Sunita Chandrasekaran. Dissecting the NVIDIA Blackwell architecture with microbenchmarks. arXiv:2507.10789, 2025.

[23] Tomer Laor, Naif Mehanna, Antonin Durey, Vitaly Dyadyuk, Pierre Laperdrix, Clémentine Maurice, Yossi Oren, Romain Rouvoy, Walter Rudametkin, and Yuval Yarom. DRAWNAPART: A device identification technique based on remote GPU fingerprinting. In Network and Distributed System Security Symposium (NDSS), 2022.

[24] Ben Laurie. Certificate transparency. Communications of the ACM, 57(10):40–46, 2014.

[25] John D. Lee and Katrina A. See. Trust in automation: Designing for appropriate reliance. Human Factors, 46(1):50–80, 2004.

[26] Chris S. Lin, Onur Mutlu, et al. GPUHammer: Rowhammer attacks on GPU memories are practical. In Proceedings of the 34th USENIX Security Symposium, 2025.

[27] LLM-PRISM Authors. LLM-PRISM: Characterizing silent data corruption from permanent GPU faults in LLM training. arXiv:2604.10390, 2026.

[28] Marcela S. Melara, Aaron Blankstein, Joseph Bonneau, Edward W. Felten, and Michael J. Freedman. CONIKS: Bringing key transparency to end users. In Proceedings of the 24th USENIX Security Symposium, 2015.

[29] Ralph C. Merkle. A digital signature based on a conventional encryption function. In Advances in Cryptology: CRYPTO ’87, volume 293 of LNCS, pages 369–378. Springer, 1988.

[30] Luc Moreau and Paolo Missier. PROV-DM: The PROV data model. W3C Recommendation, World Wide Web Consortium (W3C), 2013.

[31] Kit Murdock, David Oswald, Flavio D. Garcia, Jo Van Bulck, Daniel Gruss, and Frank Piessens. Plundervolt: Software-based fault injection attacks against Intel SGX. In IEEE Symposium on Security and Privacy (S&P), 2020.

[32] National Institute of Standards and Technology. Secure hash standard (SHS). Technical Report FIPS PUB 180-4, NIST, 2015.

[33] NVIDIA Corporation. NVIDIA confidential computing and device attestation for Hopper and Blackwell GPUs. Whitepaper, NVIDIA Corporation, 2024.

[34] NVIDIA Corporation. NVIDIA Blackwell GPU architecture. Whitepaper, NVIDIA Corporation, 2025.

[35] NVIDIA Corporation. GeForce RTX 5090: Specifications. https://www.nvidia.com/en-us/geforce/graphics-cards/50-series/rtx-5090/, 2026. Accessed 2026-06-25.

[36] NVIDIA Corporation. NVIDIA RTX PRO 6000 Blackwell Max-Q Workstation Edition: Specifications. https://www.nvidia.com/en-us/products/workstations/professional-desktop-gpus/rtx-pro-6000-max-q/, 2026. Accessed 2026-06-25.

[37] NVIDIA Corporation. NVIDIA RTX PRO 6000 Blackwell Server Edition: Specifications. https://www.nvidia.com/en-us/data-center/rtx-pro-6000-blackwell-server-edition/, 2026. Accessed 2026-06-25.

[38] Ravikanth Pappu, Ben Recht, Jason Taylor, and Neil Gershenfeld. Physical one-way functions. Science, 297(5589):2026–2030, 2002.

[39] Majid Sabbagh, Yunsi Fei, and David Kaeli. Lightning: Striking the secure isolation on GPU clouds with transient hardware faults. arXiv:2112.03662, 2021.

[40] SDC Anatomy Authors. The anatomy of silent data corruption: GPU error pattern study and modeling guidance. arXiv:2605.04213, 2026.

[41] Sanjif Shanmugavelu, Mathieu Taillefumier, Christopher Culver, Oscar Hernandez, Mark Coletti, and Ada Sedova. Impacts of floating-point non-associativity on reproducibility for HPC and deep learning applications. arXiv:2408.05148, 2024.

[42] Prasoon Sinha, Akhil Guliani, Rutwik Jain, Brandon Tran, Matthew D. Sinclair, and Shivaram Venkataraman. Not all GPUs are created equal: Characterizing variability in large-scale, accelerator-rich systems. In Proc. Int. Conf. for High Performance Computing, Networking, Storage and Analysis (SC). IEEE, 2022.

[43] Ewa Syta, Iulia Tamas, Dylan Visher, David Isaac Wolinsky, Philipp Jovanovic, Linus Gasser, Nicolas Gailly, Ismail Khoffi, and Bryan Ford. Keeping authorities “honest or bust” with decentralized witness cosigning. In IEEE Symposium on Security and Privacy (S&P), 2016.

[44] Justin Thaler. Time-optimal interactive proofs for circuit evaluation. In Advances in Cryptology: CRYPTO 2013, volume 8043 of LNCS, pages 71–89. Springer, 2013.

[45] Vasileios Titopoulos, Kosmas Alexandridis, and Giorgos Dimitrakopoulos. Custom algorithm-based fault tolerance for attention layers in transformers. arXiv:2507.16676, 2025.

[46] United Nations Framework Convention on Climate Change. Paris Agreement, Article 13: Enhanced transparency framework. United Nations, 2015.

[47] Nathan Whitehead and Alex Fit-Florea. Precision & performance: Floating point and IEEE 754 compliance for NVIDIA GPUs. Technical report, NVIDIA Corporation, 2011.

[48] Mark D. Wilkinson et al. The FAIR guiding principles for scientific data management and stewardship. Scientific Data, 3:160018, 2016.

[49] Jianzhu Yao, Hongxu Su, Taobo Liao, Zerui Cheng, Huan Zhang, Xuechao Wang, and Pramod Viswanath. TAO: Tolerance-aware optimistic verification for floating-point neural networks. In Proceedings of the 21st European Conference on Computer Systems (EuroSys), pages 1515–1532, 2026.