pith. sign in

arxiv: 2103.15776 · v6 · submitted 2021-03-29 · 💻 cs.PL · cs.LO

CHAD: Combinatory Homomorphic Automatic Differentiation

Pith reviewed 2026-05-24 13:31 UTC · model grok-4.3

classification 💻 cs.PL cs.LO
keywords automatic differentiationhomomorphic transformationsource code transformationlogical relationshigher-order functional languagesforward modereverse modelinear types
0
0 comments X

The pith

CHAD performs automatic differentiation as a unique homomorphic source-code transformation that extends Elliott's first-order definitions while preserving correctness for higher-order languages.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces Combinatory Homomorphic Automatic Differentiation as a define-then-run technique that turns any program into purely functional derivative code through a structure-preserving transformation. This transformation is presented as the only way to extend well-known first-order AD rules to languages with higher-order functions and arrays without breaking compositionality. Correctness is shown by a logical-relations proof that the semantics of the generated derivative code equals the mathematical derivative of the original program's semantics. The method supports both forward and reverse modes and can be realized in ordinary functional languages by encoding linear types as abstract types via modules. The same homomorphic approach is noted to apply to other analyses that accumulate values in commutative monoids.

Core claim

CHAD defines forward- and reverse-mode automatic differentiation as the unique homomorphic extension of Elliott's first-order functional-language definitions to a higher-order language with statically sized arrays, realized as a compositional type-respecting source transformation that produces purely functional code; a logical-relations argument establishes that the semantics of this syntactic derivative is exactly the usual calculus derivative of the semantics of the source program.

What carries the argument

The homomorphic (structure-preserving) source-code transformation that uniquely extends Elliott's first-order AD definitions, generating derivative code whose semantics matches the mathematical derivative.

If this is right

  • Correct forward- and reverse-mode derivatives are obtained for programs using higher-order functions and arrays.
  • The generated code remains purely functional and type-safe even when implemented in languages without native linear types.
  • The same methodology applies to functional languages with additional expressive features beyond arrays.
  • Other dynamic program analyses that accumulate data in a commutative monoid can be implemented via analogous homomorphic transformations.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The method could reduce the need for language-specific AD implementations by providing a uniform transformation recipe.
  • It suggests a route to proving correctness for AD in languages with more advanced features such as effects or polymorphism.
  • Similar homomorphic transformations might be explored for analyses like taint propagation or resource tracking.
  • An implementation test would involve checking whether abstract-type encodings preserve the expected linearity invariants in practice.

Load-bearing premise

The transformations are the unique homomorphic extension to expressive languages of Elliott's definitions of AD for a first-order functional language.

What would settle it

A concrete higher-order array program for which the semantics of the CHAD-generated derivative code differs from the true mathematical derivative of the original program, for example by finite-difference verification of the reverse-mode output.

Figures

Figures reproduced from arXiv: 2103.15776 by Matthijs V\'ak\'ar, Tom Smeding.

Figure 1
Figure 1. Figure 1: Forward and reverse AD illustrated on simple first-o [PITH_FULL_IMAGE:figures/full_fig_p008_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Forward and reverse AD illustrated on higher-order f [PITH_FULL_IMAGE:figures/full_fig_p013_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: Typing rules for the AD source language. [PITH_FULL_IMAGE:figures/full_fig_p015_3.png] view at source ↗
Figure 5
Figure 5. Figure 5: Typing rules for the idealised AD target language wit [PITH_FULL_IMAGE:figures/full_fig_p017_5.png] view at source ↗
Figure 6
Figure 6. Figure 6: Equational rules for the idealised, linear AD langua [PITH_FULL_IMAGE:figures/full_fig_p017_6.png] view at source ↗
Figure 7
Figure 7. Figure 7: The big-step call-by-value operational semantics [PITH_FULL_IMAGE:figures/full_fig_p022_7.png] view at source ↗
Figure 8
Figure 8. Figure 8: Typing rules for the applied target language, to exte [PITH_FULL_IMAGE:figures/full_fig_p036_8.png] view at source ↗
Figure 9
Figure 9. Figure 9: Output of our Haskell implementation when executed o [PITH_FULL_IMAGE:figures/full_fig_p049_9.png] view at source ↗
Figure 10
Figure 10. Figure 10: Output of our Haskell implementation of the forward [PITH_FULL_IMAGE:figures/full_fig_p050_10.png] view at source ↗
Figure 11
Figure 11. Figure 11: Manually simplified code from Fig [PITH_FULL_IMAGE:figures/full_fig_p050_11.png] view at source ↗
Figure 12
Figure 12. Figure 12: Output of our Haskell implementation of the reverse [PITH_FULL_IMAGE:figures/full_fig_p051_12.png] view at source ↗
read the original abstract

We introduce Combinatory Homomorphic Automatic Differentiation (CHAD), a principled, pure, provably correct define-then-run method for performing forward- and reverse-mode automatic differentiation (AD) on programming languages with expressive features. It implements AD as a compositional, type-respecting source-code transformation that generates purely functional code. This code transformation is principled in the sense that it is the unique homomorphic (structure-preserving) extension to expressive languages of Elliott's well-known and unambiguous definitions of AD for a first-order functional language. Correctness of the method follows by a compositional logical-relations argument that shows that the semantics of the syntactic derivative is the usual calculus derivative of the semantics of the original program. In their most elegant formulation, the transformations generate code with linear types. However, the code transformations can be implemented in a standard functional language lacking linear types: while the correctness proof requires tracking linearity, the actual transformations do not. In fact, even in a standard functional language, we can get all the type safety that linear types give us: we can implement all linear types used to type the transformations as abstract types by using a basic module system. In this paper, we detail the method when applied to a simple higher-order language for manipulating statically sized arrays. However, we explain how the methodology applies, more generally, to functional languages with other expressive features. Finally, we discuss how the scope of CHAD extends beyond applications in AD to other dynamic program analyses that accumulate data in a commutative monoid.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper introduces Combinatory Homomorphic Automatic Differentiation (CHAD), a source-to-source transformation for forward- and reverse-mode AD on higher-order functional languages with arrays. The transformations are presented as the unique homomorphic extension of Elliott's first-order definitions; correctness is claimed to follow from a compositional logical-relations argument equating the semantics of the syntactic derivative to the usual calculus derivative of the program semantics. The paper notes that while the elegant formulation uses linear types, the transformations can be realized in standard functional languages by encoding linear types as abstract types via a module system, and it discusses generalization to other language features and other monoid-accumulating analyses.

Significance. If the logical-relations argument and the transfer to the abstract-type implementation both hold, the work supplies a principled, compositional foundation for AD in expressive languages that extends Elliott's definitions without ad-hoc adjustments. The emphasis on homomorphic structure preservation and the explicit separation of the linearity-tracking proof from the implementable transformation are notable strengths; the discussion of applicability to other dynamic analyses is a further positive.

major comments (2)
  1. [Abstract (linearity paragraph) and the section detailing the module-system implementation] The central claim that the implemented code (using abstract types) satisfies the same correctness statement as the logical-relations proof is load-bearing. The abstract states that 'the correctness proof requires tracking linearity' while 'the actual transformations do not,' yet the denotational semantics of the module interface must still enforce the linear usage discipline (no duplication or discarding) for the relational interpretation to carry over; no section appears to supply a formal argument or model showing that the abstract-type encoding preserves the required relational properties at the semantic level rather than merely providing syntactic type safety.
  2. [Section defining the CHAD transformations for the array language] The claim that the transformations constitute 'the unique homomorphic (structure-preserving) extension' of Elliott's definitions is central to the methodology. The paper should make explicit, in the section defining the combinatory transformations, the precise categorical or algebraic structure with respect to which uniqueness is asserted and verify that the higher-order array language satisfies the necessary universal property.
minor comments (2)
  1. [Implementation section] Notation for the linear-type constructors and the module interface could be clarified with a small example showing the exported operations and their types.
  2. [Preliminaries] The paper would benefit from an explicit statement of the language grammar and typing rules before the transformation definitions, to make the homomorphic extension precise.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the positive assessment of the paper's significance and for the constructive major comments. We address each point below and will revise the manuscript accordingly to strengthen the presentation.

read point-by-point responses
  1. Referee: [Abstract (linearity paragraph) and the section detailing the module-system implementation] The central claim that the implemented code (using abstract types) satisfies the same correctness statement as the logical-relations proof is load-bearing. The abstract states that 'the correctness proof requires tracking linearity' while 'the actual transformations do not,' yet the denotational semantics of the module interface must still enforce the linear usage discipline (no duplication or discarding) for the relational interpretation to carry over; no section appears to supply a formal argument or model showing that the abstract-type encoding preserves the required relational properties at the semantic level rather than merely providing syntactic type safety.

    Authors: We agree this is a substantive point requiring clarification. The current manuscript separates the linear-typed formulation (used for the logical-relations proof) from the module-based implementation (which provides only syntactic safety via abstract types). To bridge this, the revised manuscript will add a new subsection in the implementation section that supplies a denotational model for the module interface. In this model, the abstract linear types are interpreted in a suitable linear category (or via a relational interpretation that forbids duplication/discarding), ensuring the relational properties transfer from the linear case to the abstract-type encoding. This will make explicit that the implemented transformations inherit the correctness statement. revision: yes

  2. Referee: [Section defining the CHAD transformations for the array language] The claim that the transformations constitute 'the unique homomorphic (structure-preserving) extension' of Elliott's definitions is central to the methodology. The paper should make explicit, in the section defining the combinatory transformations, the precise categorical or algebraic structure with respect to which uniqueness is asserted and verify that the higher-order array language satisfies the necessary universal property.

    Authors: We agree that the uniqueness claim benefits from greater precision. The transformations are defined as the unique structure-preserving map (i.e., the unique functor) extending Elliott's first-order combinators to the full language syntax. In the revised manuscript, the section on the CHAD transformations will explicitly identify the relevant structure as the initial algebra for the signature of the language (cartesian closed structure plus array operations) and verify that the higher-order array language is freely generated by this signature, thereby satisfying the universal property that guarantees uniqueness of the homomorphic extension. revision: yes

Circularity Check

0 steps flagged

No significant circularity; correctness via self-contained logical relations extending independent prior work

full rationale

The paper's central derivation is a compositional logical-relations argument establishing that the semantics of the syntactic derivative equals the calculus derivative of the original program's semantics. This rests on the transformations being the homomorphic extension of Elliott's independent first-order definitions (cited as external, well-known work by Conal Elliott). No equations, fitted parameters, or self-citations are shown reducing the result to its inputs by construction. The linearity-to-abstract-type implementation detail is presented as a separate realization step that does not alter the proof's premises. The derivation is therefore self-contained against external benchmarks and does not match any enumerated circularity pattern.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The paper builds on existing definitions of AD and uses standard logical relations for the proof. No new free parameters or invented entities are introduced.

axioms (1)
  • domain assumption Elliott's definitions of AD for a first-order functional language
    The CHAD transformations are defined as the unique homomorphic extension of these definitions.

pith-pipeline@v0.9.0 · 5800 in / 1286 out tokens · 28017 ms · 2026-05-24T13:31:49.648671+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Higher Order Automatic Differentiation of Higher Order Functions

    cs.PL 2021-01 unverdicted novelty 7.0

    The paper characterizes forward-mode AD as a unique structure-preserving macro on a higher-order language with ADTs and proves its semantic correctness using a gluing construction on diffeological spaces, extending to...

Reference graph

Works this paper leans on

45 extracted references · 45 canonical work pages · cited by 1 Pith paper · 5 internal anchors

  1. [1]

    Martín Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, A ndy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Ge offrey Irving, Michael Isard, et al. 2016. Tensorflow: A system for large-scale mach ine learning. In 12th USENIX Symposium on Operating Systems Design and Imple men- tation (OSDI 16) . 265–283

  2. [2]

    Martín Abadi and Gordon D Plotkin. 2020. A Simple Differen tiable Programming Language. In Proc. POPL 2020 . ACM

  3. [3]

    Andrew Barber and Gordon Plotkin. 1996. Dual intuitionistic linear logic . University of Edinburgh, Department of Computer Science, Laboratory for Foundations of Computer Science

  4. [4]

    Gilles Barthe, Raphaëlle Crubillé, Ugo Dal Lago, and Fra ncesco Gavazzo. 2020. On the Versatility of Open Logical Relations - Continuity, Automatic Differentiation, and a Containment Theorem. In Programming Languages and Systems - 29th European Symposium on Programming, ESOP 2020, Held as Part of the European Joint Conferences on Theory and Practic e of So...

  5. [5]

    Atilim Gunes Baydin and Barak A Pearlmutter. 2014. Autom atic differentiation of algorithms for machine learning. arXiv preprint arXiv:1404.7456 (2014)

  6. [6]

    P Nick Benton. 1994. A mixed linear and non-linear logic: Proofs, terms and models. In International Workshop on Computer Science Logic. Springer, 121–135

  7. [7]

    Michael Betancourt, Charles C Margossian, and Vianey Le os-Barajas. 2020. The Discrete Adjoint Method: Efficient Der ivatives for Functions of Discrete Sequences. arXiv preprint arXiv:2002.00326 (2020)

  8. [8]

    Alois Brunel, Damiano Mazza, and Michele Pagani. 2020. B ackpropagation in the Simply Typed Lambda-calculus with Li near Negation. In Proc. POPL 2020

  9. [9]

    Bob Carpenter, Matthew D Hoffman, Marcus Brubaker, Danie l Lee, Peter Li, and Michael Betancourt. 2015. The Stan math l ibrary: Reverse-mode automatic differentiation in C++. arXiv preprint arXiv:1509.07164 (2015)

  10. [10]

    Manuel MT Chakravarty, Gabriele Keller, Sean Lee, Trev or L McDonell, and Vinod Grover. 2011. Accelerating Haskell array codes with multicore GPUs. In Proceedings of the sixth workshop on Declarative aspects of m ulticore programming. 3–14

  11. [11]

    Chakravarty, Gabriele Keller, Sean Lee, Trevor L

    Manuel M.T. Chakravarty, Gabriele Keller, Sean Lee, Trevor L. McDonell, and Vinod Grover. 2011. Accelerating Haskell Array Codes with Multicore GPUs. In Proceedings of the Sixth Workshop on Declarative Aspects of M ulticore Programming (Austin, Texas, USA) (DAMP ’11) . ACM, New York, NY, USA, 3–14. https://doi.org/10.1145/1926354.1926358

  12. [12]

    J. Robin B. Cockett, Geoff S. H. Cruttwell, Jonathan Gall agher, Jean-Simon Pacaud Lemay, Benjamin MacAdam, Gordon D . Plotkin, and Dorette Pronk. 2020. Reverse Derivative Categories. In Proc. CSL 2020

  13. [13]

    P-L Curien. 1986. Categorical combinators. Information and Control 69, 1-3 (1986), 188–254

  14. [14]

    Jeff Egger, Rasmus Ejlers Møgelberg, and Alex Simpson. 2 009. Enriching an effect calculus with linear types. In International Workshop on Computer Science Logic. Springer, 240–254

  15. [15]

    Thomas Ehrhard. 2018. An introduction to differential l inear logic: proof-nets, models and antiderivatives. Mathematical Structures in Computer Science 28, 7 (2018), 995–1060

  16. [16]

    Conal Elliott. 2018. The simple essence of automatic di fferentiation. Proceedings of the ACM on Programming Languages 2, ICFP (2018), 1–29

  17. [17]

    Marcelo P Fiore. 2007. Differential structure in models of multiplicative biadditive intuitionistic linear logic . In International Conference on Typed Lambda Calculi and Applications. Springer, 163–177

  18. [18]

    Ralf Hannemann-Tamas, Diego A Munoz, and Wolfgang Marquardt. 2015. Adjoint sensitivity analysis for nonsmooth differential-algebraic equation systems. SIAM Journal on Scientific Computing 37, 5 (2015), A2380–A2402

  19. [19]

    Troels Henriksen, Niels GW Serup, Martin Elsman, Fritz Henglein, and Cosmin E Oancea. 2017. Futhark: purely functi onal GPU-programming with nested parallelism and in-place array updates. In Proceedings of the 38th ACM SIGPLAN Conference on Programming L anguage Design and Implementation. 556–571

  20. [20]

    Mathieu Huot, Sam Staton, and Matthijs Vákár. 2022. Hig her order automatic differentiation of higher order functions. Logical Methods in Computer Science 18 (2022), 41:1–41:34. Issue 1

  21. [21]

    Mathieu Huot, Sam Staton, and Matthijs Vákár. 2020. Cor rectness of Automatic Differentiation via Diffeologies and C ategorical Gluing. In Proc. FoS- SaCS

  22. [22]

    Michael Innes. 2018. Don’t unroll adjoint: differentia ting SSA-Form programs. arXiv preprint arXiv:1810.07951 (2018)

  23. [23]

    Peter T Johnstone. 2002. Sketches of an elephant: A topos theory compendium . Vol. 2. Oxford University Press

  24. [24]

    Peter T Johnstone, Stephen Lack, and P Sobocinski. 2007 . Quasitoposes, Quasiadhesive Categories and Artin Gluein g. In Proc. CALCO 2007

  25. [25]

    Faustyna Krawiec, Neel Krishnaswami, Simon Peyton Jon es, Tom Ellis, Andrew Fitzgibbon, and R Eisenberg. 2022. Provably correct, asymptotically efficient, higher-order reverse-mode automatic differentia tion. Proceedings of the ACM on Programming Languages 6, POPL (2022), 1–30

  26. [26]

    Andreas Kriegl and Peter W Michor. 1997. The convenient setting of global analysis . Vol. 53. American Mathematical Soc

  27. [27]

    Joachim Lambek and Philip J Scott. 1988. Introduction to higher-order categorical logic . Vol. 7. Cambridge University Press

  28. [28]

    Serge Lang. 2002. Algebra. Springer, New York, NY

  29. [29]

    Paul Blain Levy. 2012. Call-by-push-value: A Functional/imperative Synthesis . Vol. 2. Springer Science & Business Media

  30. [30]

    Fernando Lucatelli Nunes and Matthijs Vákár. 2021. CHA D for Expressive Total languages. arXiv preprint arXiv:2110.00446 (2021). Manuscript submitted to ACM 44 Matthijs Vákár and Tom Smeding

  31. [31]

    Carol Mak and Luke Ong. 2020. A Differential-form Pullba ck Programming Language for Higher-order Reverse-mode Aut omatic Differentiation. (2020). arxiv:2002.08241

  32. [32]

    Charles C Margossian. 2019. A review of automatic differ entiation and its efficient implementation. Wiley interdisciplinary reviews: data mining and knowledge discovery 9, 4 (2019), e1305

  33. [33]

    Damiano Mazza and Michele Pagani. 2021. Automatic diffe rentiation in PCF. Proceedings of the ACM on Programming Languages 5, POPL (2021), 1–27

  34. [34]

    McDonell, Manuel M.T

    Trevor L. McDonell, Manuel M.T. Chakravarty, Gabriele Keller, and Ben Lippmeier. 2013. Optimising Purely Functio nal GPU Programs. In Proceed- ings of the 18th ACM SIGPLAN International Conference on Func tional Programming (Boston, Massachusetts, USA) (ICFP ’13) . ACM, New York, NY, USA, 49–60. https://doi.org/10.1145/2500365.2500595

  35. [35]

    Paul-André Mellies. 2009. Categorical semantics of li near logic. Panoramas et syntheses 27 (2009), 15–215

  36. [36]

    Adam Paszke, Sam Gross, Soumith Chintala, Gregory Chan an, Edward Yang, Zachary DeVito, Zeming Lin, Alban Desmaiso n, Luca Antiga, and Adam Lerer. 2017. Automatic differentiation in pytorch. (20 17)

  37. [37]

    Barak A Pearlmutter and Jeffrey Mark Siskind. 2008. Reve rse-mode AD in a functional framework: Lambda the ultimate b ackpropagator. ACM Transactions on Programming Languages and Systems (TOPLAS) 30, 2 (2008), 7

  38. [38]

    Amir Shaikhha, Andrew Fitzgibbon, Dimitrios Vytiniot is, and Simon Peyton Jones. 2019. Efficient differentiable pro gramming in a functional array-processing language. Proceedings of the ACM on Programming Languages 3, ICFP (2019), 97

  39. [39]

    Lau Skorstengaard. 2019. An Introduction to Logical Re lations. CoRR abs/1907.11133 (2019). arXiv:1907.11133 http://arxiv.org/abs/1907.11133

  40. [40]

    Periklis Tsiros, Frederic Y Bois, Aristides Dokoumetz idis, Georgia Tsiliki, and Haralambos Sarimveis. 2019. Pop ulation pharmacokinetic reanalysis of a Diazepam PBPK model: a comparison of Stan and GNU MCSim. Journal of Pharmacokinetics and Pharmacodynamics 46, 2 (2019), 173–192

  41. [41]

    Matthijs Vákár. 2017. In search of effectful dependent t ypes. arXiv preprint arXiv:1706.07997 (2017)

  42. [42]

    Matthijs Vákár. 2021. Reverse AD at Higher Types: Pure, Principled and Denotationally Correct. Proc. ESOP (2021)

  43. [43]

    Dimitrios Vytiniotis, Dan Belov, Richard Wei, Gordon P lotkin, and Martin Abadi. 2019. The Differentiable Curry. In NeurIPS 2019 Workshop Program Transformations

  44. [44]

    Fei Wang, Xilun Wu, Gregory Essertel, James Decker, and Tiark Rompf. 2019. Demystifying differentiable programming: Shift/reset the penultimate backpropagator. Proceedings of the ACM on Programming Languages 3, ICFP (2019)

  45. [45]

    let substitution

    Paul W. Wilson and Fabio Zanasi. 2020. Reverse Derivati ve Ascent: A Categorical Approach to Learning Boolean Circu its. In Proceedings of the 3rd Annual International Applied Category Theory Conference 2 020, ACT 2020, Cambridge, USA, 6-10th July 2020 (EPTCS, Vol. 3 33), David I. Spivak and Jamie Vicary (Eds.). 247–260. https://doi.org/10.4204/EPTCS.333....