Provenance Tracking in AI Compilers through the Lens of Coalgebra

Liying Liu; Zilu Tian

arxiv: 2606.10937 · v1 · pith:OO3JCUKTnew · submitted 2026-06-09 · 💻 cs.DB · cs.AI

Provenance Tracking in AI Compilers through the Lens of Coalgebra

Zilu Tian , Liying Liu This is my paper

Pith reviewed 2026-06-27 10:58 UTC · model grok-4.3

classification 💻 cs.DB cs.AI

keywords provenance trackingAI compilerscoalgebrabisimulationobservational semanticscomputation graphstensor provenancegraph rewrites

0 comments

The pith

A coalgebraic model using bisimulation tracks tensor and operator provenance in AI compilers by observing computational actions alone.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that provenance can be maintained across aggressive graph rewrites in AI compilers without inserting explicit identifiers into the code. It does so by shifting from direct propagation to an observational approach that reasons about what remains visible after transformations. A coalgebraic formalization with bisimulation ensures that origins stay recoverable even when intermediate nodes disappear during normalization or optimization. This matters because reliable provenance supports postprocessing, debugging, and validation steps that depend on knowing which original tensors map to which final ones. The method is demonstrated in a working prototype that adds little implementation cost.

Core claim

We present a lightweight, generative approach to provenance tracking based on observational semantics. Instead of propagating identifiers through compiler passes, we observe graph transformations and reason about provenance in terms of observable computational actions. We formalize this approach using a coalgebraic model and bisimulation, which preserves provenance even when intermediate nodes are eliminated. Furthermore, we implement this approach in a prototype AI compiler COVAN, demonstrating stable provenance across compilation pipelines with minimal engineering overhead.

What carries the argument

The coalgebraic model paired with bisimulation, which defines provenance equivalence through matching observable actions rather than internal node identities.

If this is right

Provenance remains stable even under non-injective rewrites that remove intermediate nodes.
Platform-specific postprocessing can attach reliably to final tensors based on recovered origins.
Compiler debugging and transformation validation become possible without invasive instrumentation.
The approach requires only minimal changes to existing compilation pipelines.
Observational reasoning suffices to equate computations that differ only in eliminated structure.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same observational technique could apply to provenance questions in database query planners that perform similar rewrites.
Extending the bisimulation check to include timing or resource observations might handle additional compiler passes.
If the prototype scales to full production compilers, it would reduce the need for custom provenance hooks in each new optimization.
The method leaves open whether bisimulation can also certify that two different compiler backends produce equivalent provenance mappings.

Load-bearing premise

Provenance information can be recovered reliably from observable computational actions alone, without propagating explicit identifiers through compiler passes.

What would settle it

A concrete graph rewrite that eliminates nodes such that the final observable actions no longer uniquely determine which original tensor or operator produced a given output tensor.

Figures

Figures reproduced from arXiv: 2606.10937 by Liying Liu, Zilu Tian.

**Figure 1.** Figure 1: Diagram of bisimulation: 𝑅 relates states of 𝑆 and 𝑇 , and 𝜌 ensures the coalgebra structures are compatible via projections. Bisimulation gives us a precise way to relate two differently structured graphs: when two states are bisimilar, they are observationally indistinguishable – the foundation of our provenance guarantees [PITH_FULL_IMAGE:figures/full_fig_p004_1.png] view at source ↗

**Figure 2.** Figure 2: Covan Software Architecture and IR Flow Implications. The generative correspondence between 𝐹 and 𝐹 ♯ suggests a lightweight implementation strategy for provenance tracking. Rather than redesigning compiler passes or introducing ad hoc instrumentation, one can exploit the functorial transformation from 𝐹 to 𝐹 ♯ as a semantic code generator, lifting a value-based continuation semantics into an action-refle… view at source ↗

read the original abstract

AI compilers aggressively rewrite computation graphs through normalization, lowering, and optimization, making it difficult to track the provenance of tensors and operators across compilation. Reliable provenance is essential for attaching platform-specific postprocessing, debugging compiler behavior, and validating transformations, yet existing solutions are either invasive or ad hoc under non-injective graph rewrites. We present a lightweight, generative approach to provenance tracking based on observational semantics. Instead of propagating identifiers through compiler passes, we observe graph transformations and reason about provenance in terms of observable computational actions. We formalize this approach using a coalgebraic model and bisimulation, which preserves provenance even when intermediate nodes are eliminated. Furthermore, we implement this approach in a prototype AI compiler COVAN, demonstrating stable provenance across compilation pipelines with minimal engineering overhead.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper sketches a coalgebraic observational model for provenance in AI compilers that avoids explicit identifier propagation, but the abstract supplies no definitions or proofs so the claim cannot be checked yet.

read the letter

The core idea is to recover tensor and operator provenance after aggressive graph rewrites by observing computational actions and using coalgebra plus bisimulation instead of threading tags through every pass. They built a prototype called COVAN that reportedly keeps provenance stable with low overhead.

What stands out is the attempt to treat provenance as an observational property rather than an explicit annotation. That framing could cut down on invasive changes in compiler pipelines, which is a practical pain point for debugging and post-processing in AI compilation.

The main limitation right now is that the abstract states the coalgebraic model and bisimulation preserve provenance under non-injective rewrites but gives no functor, observation map, bisimulation relation, or proof sketch. Without those, it is impossible to see whether the weakest assumption—that observable actions alone are sufficient—actually holds or whether the approach collapses when rewrites discard too much structure. The literature comparison is also missing, so it is unclear how this differs from prior ad-hoc methods beyond the coalgebra vocabulary.

This is for readers already working on AI compiler internals who either know coalgebra or are open to it for tooling problems. A systems person looking for a drop-in provenance solution will not get enough from the abstract alone.

The paper deserves a serious referee so the formal model and prototype results can be examined directly. If the bisimulation is properly defined and the implementation is reproducible, it could be a useful incremental tool; if the formal part is thin, that will become clear quickly in review.

Referee Report

2 major / 1 minor

Summary. The paper claims that provenance tracking in AI compilers can be achieved via a lightweight observational approach rather than invasive identifier propagation. It formalizes the method with a coalgebraic model and bisimulation that is asserted to preserve provenance even under non-injective rewrites (normalization, lowering, optimization). A prototype implementation COVAN is said to demonstrate stable provenance across compilation pipelines with minimal overhead.

Significance. If the coalgebraic construction and bisimulation proof are correct, the work would supply a non-invasive, semantics-based alternative to existing provenance mechanisms in graph-rewriting compilers. This could aid debugging, post-processing attachment, and transformation validation without modifying every compiler pass. The coalgebraic framing is a potentially reusable technique for other compiler provenance problems.

major comments (2)

[Abstract] Abstract: the central claim that 'a coalgebraic model and bisimulation... preserves provenance even when intermediate nodes are eliminated' is stated without any functor definition, observation map, bisimulation relation, or proof sketch. No equations or theorems are supplied, so the load-bearing formal result cannot be assessed for correctness or scope.
[Abstract] Abstract / §4 (prototype): the claim of 'stable provenance across compilation pipelines' is presented without any quantitative metrics, comparison to baseline methods, or description of the observation map used in COVAN. The weakest assumption—that provenance is recoverable from observable actions alone—remains untested in the supplied text.

minor comments (1)

[Abstract] The abstract refers to 'existing solutions' being invasive or ad hoc but supplies no citations or concrete comparison points.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the comments. We respond point by point to the major comments and note the changes planned for the revised manuscript.

read point-by-point responses

Referee: [Abstract] Abstract: the central claim that 'a coalgebraic model and bisimulation... preserves provenance even when intermediate nodes are eliminated' is stated without any functor definition, observation map, bisimulation relation, or proof sketch. No equations or theorems are supplied, so the load-bearing formal result cannot be assessed for correctness or scope.

Authors: The submitted manuscript presents the coalgebraic approach at a summary level in the abstract. We will revise the manuscript to include the explicit functor definition, observation map, bisimulation relation, and a proof sketch establishing that the bisimulation preserves provenance under non-injective rewrites. These additions will be placed in the main body with a brief reference added to the abstract. revision: yes
Referee: [Abstract] Abstract / §4 (prototype): the claim of 'stable provenance across compilation pipelines' is presented without any quantitative metrics, comparison to baseline methods, or description of the observation map used in COVAN. The weakest assumption—that provenance is recoverable from observable actions alone—remains untested in the supplied text.

Authors: The current text describes the COVAN prototype at a high level. We will revise §4 to include quantitative metrics on provenance stability and overhead, direct comparisons against identifier-propagation baselines, an explicit description of the observation map, and experiments that test recoverability of provenance from observable actions alone. revision: yes

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The abstract and available description present a coalgebraic model and bisimulation for provenance tracking based on observational semantics, without any provided equations, self-citations, or derivation steps that reduce claims to inputs by construction. No load-bearing self-definitional, fitted-prediction, or uniqueness-imported steps are identifiable from the given text. The approach is treated as self-contained against standard coalgebra theory and external compiler benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

Review based on abstract only; ledger entries are inferred from the high-level description of the coalgebraic model.

axioms (1)

domain assumption Observational semantics suffice to recover provenance without explicit identifier propagation
Central premise of the generative approach described in the abstract.

pith-pipeline@v0.9.1-grok · 5653 in / 1118 out tokens · 31384 ms · 2026-06-27T10:58:05.352665+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

16 extracted references · 5 canonical work pages

[1]

Murray, Benoit Steiner, Paul Tucker, Vijay Vasudevan, Pete Warden, Martin Wicke, Yuan Yu, and Xiaoqiang Zheng

Martín Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard, Manjunath Kudlur, Josh Levenberg, Rajat Monga, Sherry Moore, Derek G. Murray, Benoit Steiner, Paul Tucker, Vijay Vasudevan, Pete Warden, Martin Wicke, Yuan Yu, and Xiaoqiang Zheng. 2016. TensorFlow: a system for...

2016
[2]

Apache Software Foundation. 2016. Apache Airflow. https://airflow.apache.org/ Version 2.8.2

2016
[3]

Apache Software Foundation. 2016. Apache Beam. https://beam.apache.org/ Version 2.54.0

2016
[4]

Apache Software Foundation. 2026. Apache TVM. https://tvm.apache.org/ Version 0.16.0

2026
[5]

Tianqi Chen, Thierry Moreau, Ziheng Jiang, Lianmin Zheng, Eddie Yan, Meghan Cowan, Haichen Shen, Leyuan Wang, Yuwei Hu, Luis Ceze, Carlos Guestrin, and Arvind Krishnamurthy. 2018. TVM: an automated end-to-end optimizing compiler for deep learning. InProceedings of the 13th USENIX Conference on Operating Systems Design and Implementation(Carlsbad, CA, USA)...

2018
[6]

Li Deng. 2012. The MNIST Database of Handwritten Digit Images for Machine Learning Research [Best of the Web]. IEEE Signal Process. Mag.29, 6 (2012), 141–142. doi:10.1109/MSP.2012.2211477

work page doi:10.1109/msp.2012.2211477 2012
[7]

TFRecord developers. 2026. TFRecordDataset. https://www.tensorflow.org/api_docs/python/tf/data/TFRecordDataset

2026
[8]

Chris Lattner, Mehdi Amini, Uday Bondhugula, Albert Cohen, Andy Davis, Jacques Pienaar, River Riddle, Tatiana Shpeisman, Nicolas Vasilache, and Oleksandr Zinenko. 2021. MLIR: Scaling Compiler Infrastructure for Domain Specific Computation. In2021 IEEE/ACM International Symposium on Code Generation and Optimization (CGO). 2–14. doi:10.1109/CGO51591.2021.9370308

work page doi:10.1109/cgo51591.2021.9370308 2021
[9]

Alexandru A Ormenisan, Mahmoud Ismail, Seif Haridi, and Jim Dowling. 2020. Implicit provenance for machine learning artifacts.Proceedings of MLSys20 (2020)

2020
[10]

Gabriele Padovani, Valentine Anantharaj, and Sandro Fiore. 2025. Provenance Tracking in Large-Scale Machine Learning Systems.CoRRabs/2507.01075 (2025). arXiv:2507.01075 doi:10.48550/ARXIV.2507.01075

work page doi:10.48550/arxiv.2507.01075 2025
[11]

2019.PyTorch: an imperative style, high-performance deep learning library

Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, Alban Desmaison, Andreas Köpf, Edward Yang, Zach DeVito, Martin Raison, Alykhan Tejani, Sasank Chilamkurthy, Benoit Steiner, Lu Fang, Junjie Bai, and Soumith Chintala. 2019.PyTorch: an imperative style, high-per...

2019
[12]

Nadav Rotem, Jordan Fix, Saleem Abdulrasool, Garret Catron, Summer Deng, Roman Dzhabarov, Nick Gibson, James Hegeman, Meghan Lele, Roman Levenstein, Jack Montgomery, Bert Maher, Satish Nadathur, Jakob Olesen, Jongsoo Park, Artem Rakhov, Misha Smelyanskiy, and Man Wang. 2019. Glow: Graph Lowering Compiler Techniques for Neural Networks. arXiv:1805.00907 [c...

Pith/arXiv arXiv 2019
[13]

Amit Sabne. 2020. XLA : Compiling Machine Learning for Peak Performance. Coalgebraic Provenance Tracking 13

2020
[14]

MobileNetV2: In- verted Residuals and Linear Bottlenecks

Mark Sandler, Andrew G. Howard, Menglong Zhu, Andrey Zhmoginov, and Liang-Chieh Chen. 2018. MobileNetV2: Inverted Residuals and Linear Bottlenecks. In2018 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2018, Salt Lake City, UT, USA, June 18-22, 2018. Computer Vision Foundation / IEEE Computer Society, 4510–4520. doi:10.1109/CVPR.2018.00474

work page doi:10.1109/cvpr.2018.00474 2018
[15]

Marius Schlegel and Kai-Uwe Sattler. 2023. MLflow2PROV: Extracting Provenance from Machine Learning Experiments. InProceedings of the Seventh Workshop on Data Management for End-to-End Machine Learning, DEEM 2023, Seattle, W A, USA, 18 June 2023. ACM, 9:1–9:4. doi:10.1145/3595360.3595859

work page doi:10.1145/3595360.3595859 2023
[16]

The Kubeflow Authors. 2018. Kubeflow Pipelines. https://www.kubeflow.org/docs/components/pipelines/ Version 2.0

2018

[1] [1]

Murray, Benoit Steiner, Paul Tucker, Vijay Vasudevan, Pete Warden, Martin Wicke, Yuan Yu, and Xiaoqiang Zheng

Martín Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard, Manjunath Kudlur, Josh Levenberg, Rajat Monga, Sherry Moore, Derek G. Murray, Benoit Steiner, Paul Tucker, Vijay Vasudevan, Pete Warden, Martin Wicke, Yuan Yu, and Xiaoqiang Zheng. 2016. TensorFlow: a system for...

2016

[2] [2]

Apache Software Foundation. 2016. Apache Airflow. https://airflow.apache.org/ Version 2.8.2

2016

[3] [3]

Apache Software Foundation. 2016. Apache Beam. https://beam.apache.org/ Version 2.54.0

2016

[4] [4]

Apache Software Foundation. 2026. Apache TVM. https://tvm.apache.org/ Version 0.16.0

2026

[5] [5]

Tianqi Chen, Thierry Moreau, Ziheng Jiang, Lianmin Zheng, Eddie Yan, Meghan Cowan, Haichen Shen, Leyuan Wang, Yuwei Hu, Luis Ceze, Carlos Guestrin, and Arvind Krishnamurthy. 2018. TVM: an automated end-to-end optimizing compiler for deep learning. InProceedings of the 13th USENIX Conference on Operating Systems Design and Implementation(Carlsbad, CA, USA)...

2018

[6] [6]

Li Deng. 2012. The MNIST Database of Handwritten Digit Images for Machine Learning Research [Best of the Web]. IEEE Signal Process. Mag.29, 6 (2012), 141–142. doi:10.1109/MSP.2012.2211477

work page doi:10.1109/msp.2012.2211477 2012

[7] [7]

TFRecord developers. 2026. TFRecordDataset. https://www.tensorflow.org/api_docs/python/tf/data/TFRecordDataset

2026

[8] [8]

Chris Lattner, Mehdi Amini, Uday Bondhugula, Albert Cohen, Andy Davis, Jacques Pienaar, River Riddle, Tatiana Shpeisman, Nicolas Vasilache, and Oleksandr Zinenko. 2021. MLIR: Scaling Compiler Infrastructure for Domain Specific Computation. In2021 IEEE/ACM International Symposium on Code Generation and Optimization (CGO). 2–14. doi:10.1109/CGO51591.2021.9370308

work page doi:10.1109/cgo51591.2021.9370308 2021

[9] [9]

Alexandru A Ormenisan, Mahmoud Ismail, Seif Haridi, and Jim Dowling. 2020. Implicit provenance for machine learning artifacts.Proceedings of MLSys20 (2020)

2020

[10] [10]

Gabriele Padovani, Valentine Anantharaj, and Sandro Fiore. 2025. Provenance Tracking in Large-Scale Machine Learning Systems.CoRRabs/2507.01075 (2025). arXiv:2507.01075 doi:10.48550/ARXIV.2507.01075

work page doi:10.48550/arxiv.2507.01075 2025

[11] [11]

2019.PyTorch: an imperative style, high-performance deep learning library

Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, Alban Desmaison, Andreas Köpf, Edward Yang, Zach DeVito, Martin Raison, Alykhan Tejani, Sasank Chilamkurthy, Benoit Steiner, Lu Fang, Junjie Bai, and Soumith Chintala. 2019.PyTorch: an imperative style, high-per...

2019

[12] [12]

Nadav Rotem, Jordan Fix, Saleem Abdulrasool, Garret Catron, Summer Deng, Roman Dzhabarov, Nick Gibson, James Hegeman, Meghan Lele, Roman Levenstein, Jack Montgomery, Bert Maher, Satish Nadathur, Jakob Olesen, Jongsoo Park, Artem Rakhov, Misha Smelyanskiy, and Man Wang. 2019. Glow: Graph Lowering Compiler Techniques for Neural Networks. arXiv:1805.00907 [c...

Pith/arXiv arXiv 2019

[13] [13]

Amit Sabne. 2020. XLA : Compiling Machine Learning for Peak Performance. Coalgebraic Provenance Tracking 13

2020

[14] [14]

MobileNetV2: In- verted Residuals and Linear Bottlenecks

Mark Sandler, Andrew G. Howard, Menglong Zhu, Andrey Zhmoginov, and Liang-Chieh Chen. 2018. MobileNetV2: Inverted Residuals and Linear Bottlenecks. In2018 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2018, Salt Lake City, UT, USA, June 18-22, 2018. Computer Vision Foundation / IEEE Computer Society, 4510–4520. doi:10.1109/CVPR.2018.00474

work page doi:10.1109/cvpr.2018.00474 2018

[15] [15]

Marius Schlegel and Kai-Uwe Sattler. 2023. MLflow2PROV: Extracting Provenance from Machine Learning Experiments. InProceedings of the Seventh Workshop on Data Management for End-to-End Machine Learning, DEEM 2023, Seattle, W A, USA, 18 June 2023. ACM, 9:1–9:4. doi:10.1145/3595360.3595859

work page doi:10.1145/3595360.3595859 2023

[16] [16]

The Kubeflow Authors. 2018. Kubeflow Pipelines. https://www.kubeflow.org/docs/components/pipelines/ Version 2.0

2018