Higher Order Automatic Differentiation of Higher Order Functions
Pith reviewed 2026-05-24 14:12 UTC · model grok-4.3
The pith
Forward-mode automatic differentiation on higher-order languages is the unique structure-preserving macro and is correct by a gluing construction on diffeological spaces.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The forward-mode AD method is the unique structure-preserving macro extending a choice of derivatives for primitives; this uniqueness yields a correctness proof relative to the diffeological-space semantics of the language via an explicit gluing construction that equates the AD-transformed term with the semantic derivative, and the argument extends directly to higher-order derivatives obtained by Taylor approximation.
What carries the argument
The gluing construction on diffeological spaces that realises the logical relation between a term and its AD image.
If this is right
- Derivatives computed by the macro match the semantic derivative for every well-typed program in the language.
- The same uniqueness argument supplies correctness for the Taylor-based computation of second- and higher-order derivatives.
- Any other AD method that also preserves the language structure must coincide with this one.
- The proof technique is independent of the concrete choice of derivatives for the basic operations.
Where Pith is reading between the lines
- The gluing technique may adapt to prove correctness of reverse-mode or mixed-mode AD on the same language.
- The diffeological model could serve as a target for other program analyses that must respect differentiation.
- Similar uniqueness-plus-gluing arguments might apply to languages that add recursion or probabilistic choice.
Load-bearing premise
Diffeological spaces give a faithful semantics for the higher-order language with algebraic data types, and the AD macro is the only structure-preserving extension of the chosen basic derivatives.
What would settle it
A concrete higher-order program (for example a function that takes a function as argument) whose AD image, when interpreted in the diffeological semantics, fails to equal the semantic derivative of the original program.
Figures
read the original abstract
We present semantic correctness proofs of automatic differentiation (AD). We consider a forward-mode AD method on a higher-order language with algebraic data types and we characterise it as the unique structure-preserving macro given a choice of derivatives for basic operations. We describe a rich semantics for differentiable programming based on diffeological spaces. We show that it interprets our language and we phrase what it means for the AD method to be correct with respect to this semantics. We show that our characterisation of AD gives rise to an elegant semantic proof of its correctness based on a gluing construction on diffeological spaces. We explain how this is in essence a logical relations argument. Throughout we show how the analysis extends to AD methods for computing higher-order derivatives using a Taylor approximation.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper presents semantic correctness proofs for a forward-mode automatic differentiation (AD) method on a higher-order functional language with algebraic data types. It characterizes the AD transformation as the unique structure-preserving macro given derivatives for basic operations, develops a semantics based on diffeological spaces that interprets the language, and proves correctness of the AD method via a gluing construction on these spaces (equivalent to a logical relations argument). The analysis is shown to extend to higher-order derivatives through Taylor approximation.
Significance. If the central claims hold, the work provides an elegant and rigorous semantic foundation for the correctness of AD in higher-order settings with ADTs. The characterization as a unique structure-preserving macro and the use of a gluing construction for the proof are notable strengths that could influence semantic models for differentiable programming. The extension to higher-order derivatives adds breadth, though the primary contribution is the semantic proof technique.
minor comments (3)
- The introduction would benefit from a small concrete example of the AD macro applied to a higher-order function involving an algebraic data type to ground the abstract claims before the semantic development.
- Section headings and subsection numbering for the gluing construction and its relation to logical relations could be made more explicit to help readers trace the proof structure.
- The notation used for the Taylor approximation when extending to higher-order derivatives should be cross-referenced more clearly to the basic first-order case.
Simulated Author's Rebuttal
We thank the referee for their positive review, recognition of the semantic contributions, and recommendation to accept the paper.
Circularity Check
No significant circularity; derivation is self-contained via external semantics
full rationale
The paper characterises forward-mode AD as the unique structure-preserving macro given derivatives for basic operations and proves correctness via a gluing construction (logical relations) on diffeological spaces. This relies on external mathematical structures (diffeological spaces semantics and gluing) that interpret the higher-order language with ADTs, rather than any self-referential definitions, fitted parameters renamed as predictions, or load-bearing self-citations. The abstract and reader's assessment indicate the model and construction are exhibited directly, with no reduction of the central claim to its own inputs by construction. The extension to higher-order derivatives via Taylor approximation follows the same external framework.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption Diffeological spaces provide a rich semantics for differentiable programming that interprets the higher-order language with algebraic data types.
Reference graph
Works this paper leans on
-
[1]
A Geometric Theory of Higher-Order Automatic Differentiation
To appear. [Bet18] Michael Betancourt. A geometric theory of higher-order automatic differentiation. arXiv preprint arXiv:1812.11592,
work page internal anchor Pith review Pith/arXiv arXiv
-
[2]
Differentiating a tensor language
[BML+20] Gilbert Bernstein, Michael Mara, Tzu-Mao Li, Dougal Maclaurin, and Jonathan Ragan-Kelley. Differentiating a tensor language. arXiv preprint arXiv:2008.11256 ,
-
[3]
Backpropagation in the simply typed lambda-calculus with linear negation
[BMP20] Alois Brunel, Damiano Mazza, and Michele Pagani. Backpropagation in the simply typed lambda-calculus with linear negation. In Proc. POPL 2020,
work page 2020
-
[4]
Fadbad, a flexible C++ package for automatic differentiation
[BS96] Claus Bendtsen and Ole Stauning. Fadbad, a flexible C++ package for automatic differentiation. Technical report, Technical Report IMM–REP–1996–17, Department of Mathematical Modelling, Technical University of Denmark, Lyngby,
work page 1996
-
[5]
Tadiff, a flexible c++ package for automatic differentiation
[BS97] Claus Bendtsen and Ole Stauning. Tadiff, a flexible c++ package for automatic differentiation. TU of Denmark, Department of Mathematical Modelling, Lungby. Technical report IMM-REP- 1997-07,
work page 1997
-
[6]
[CCG+20] J. Robin B. Cockett, Geoff S. H. Cruttwell, Jonathan Gallagher, Jean-Simon Pacaud Lemay, Benjamin MacAdam, Gordon D. Plotkin, and Dorette Pronk. Reverse derivative categories. In Proc. CSL 2020,
work page 2020
-
[7]
Towards formalizing and extending differential programming using tangent categories
[CGM19] Geoff Cruttwell, Jonathan Gallagher, and Ben MacAdam. Towards formalizing and extending differential programming using tangent categories. In Proc. ACT 2019,
work page 2019
-
[8]
The Stan Math Library: Reverse-Mode Automatic Differentiation in C++
[CHB+15] Bob Carpenter, Matthew D Hoffman, Marcus Brubaker, Daniel Lee, Peter Li, and Michael Betancourt. The Stan math library: Reverse-mode automatic differentiation in C++. arXiv preprint arXiv:1509.07164,
work page internal anchor Pith review Pith/arXiv arXiv
-
[9]
Tangent spaces and tangent bundles for diffeological spaces
Vol. 18:1 HIGHER ORDER AUTOMATIC DIFFERENTIATION OF HIGHER ORDER FUNCTIONS 41:31 [CW14] J Daniel Christensen and Enxin Wu. Tangent spaces and tangent bundles for diffeological spaces. arXiv preprint arXiv:1411.5425 ,
work page internal anchor Pith review Pith/arXiv arXiv
-
[10]
Backprop as functor: A compositional perspective on supervised learning
[FST19] Brendan Fong, David Spivak, and R´ emy Tuy´ eras. Backprop as functor: A compositional perspective on supervised learning. In 2019 34th Annual ACM/IEEE Symposium on Logic in Computer Science (LICS) , pages 1–13. IEEE,
work page 2019
-
[11]
[IZ13] Patrick Iglesias-Zemmour
arxiv:2001.02209. [IZ13] Patrick Iglesias-Zemmour. Diffeology. American Mathematical Soc.,
-
[12]
Quasitoposes, quasiadhesive categories and Artin glueing
[JLS07] Peter T Johnstone, Stephen Lack, and P Sobocinski. Quasitoposes, quasiadhesive categories and Artin glueing. In Proc. CALCO 2007,
work page 2007
-
[13]
Adam: A Method for Stochastic Optimization
[KB14] Diederik P Kingma and Jimmy Ba. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980,
work page internal anchor Pith review Pith/arXiv arXiv
-
[14]
A simply typed λ-calculus of forward automatic differentiation
[Man12] Oleksandr Manzyuk. A simply typed λ-calculus of forward automatic differentiation. In Proc. MFPS 2012,
work page 2012
-
[15]
[Mer04] Joel Merker. Four explicit formulas for the prolongations of an infinitesimal lie symmetry and multivariate Faa di Bruno formulas. arXiv preprint math/0411650 ,
work page internal anchor Pith review Pith/arXiv arXiv
-
[16]
[MO20] Carol Mak and Luke Ong. A differential-form pullback programming language for higher-order reverse-mode automatic differentiation. arxiv:2002.08241,
-
[17]
[MS92] John C Mitchell and Andre Scedrov
doi:10.1145/3434309. [MS92] John C Mitchell and Andre Scedrov. Notes on sconing and relators. In International Workshop on Computer Science Logic , pages 352–378. Springer,
-
[18]
Some principles of differential programming languages
[Plo18] Gordon D Plotkin. Some principles of differential programming languages. Invited talk, POPL 2018,
work page 2018
-
[19]
λS: Computable semantics for differen- tiable programming with higher-order functions and datatypes
[SMC20] Benjamin Sherman, Jesse Michel, and Michael Carbin. λS: Computable semantics for differen- tiable programming with higher-order functions and datatypes. arXiv preprint arXiv:2007.08017,
-
[20]
Denotational correctness of foward-mode automatic differentiation for iteration and recursion
[V´ ak20] Matthijs V´ ak´ ar. Denotational correctness of foward-mode automatic differentiation for iteration and recursion. arXiv preprint arXiv:2007.05282 ,
-
[21]
CHAD: Combinatory Homomorphic Automatic Differentiation
Vol. 18:1 HIGHER ORDER AUTOMATIC DIFFERENTIATION OF HIGHER ORDER FUNCTIONS 41:33 [VS21] Matthijs V´ ak´ ar and Tom Smeding. CHAD: Combinatory homomorphic automatic differentiation. arXiv preprint arXiv:2103.15776 ,
work page internal anchor Pith review Pith/arXiv arXiv
-
[22]
On the principles of differentiable quantum programming languages
[ZHCW20] Shaopeng Zhu, Shih-Han Hung, Shouvanik Chakrabarti, and Xiaodi Wu. On the principles of differentiable quantum programming languages. In Proceedings of the 41st ACM SIGPLAN International Conference on Programming Language Design and Implementation, PLDI 2020, London, UK, June 15-20, 2020 , pages 272–285. ACM,
work page 2020
-
[23]
doi:10.1145/3385412.3386011. 41:34 M. Huot, S. Staton, and M. V ´ak´ar Vol. 18:1 Appendix A. CartSp and Man are not cartesian closed categories Lemma A.1. There is no continuous injection Rd+1→ Rd. Proof. If there were, it would restrict to a continuous injection Sd→ Rd. The Borsuk-Ulam theorem, however, tells us that every continuous f : Sd→ Rd has some ...
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.