NeSyCat Torch: A Differentiable Tensor Implementation of Categorical Semantics for Neurosymbolic Learning
Pith reviewed 2026-06-26 20:34 UTC · model grok-4.3
The pith
NeSyCat Torch embeds neural network interpretations of symbols into a monadic categorical semantics for uniform neurosymbolic learning.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper claims that by interpreting symbols via neural networks and realizing the framework in tensor backends with the distribution monad and lazy log-tensor monad, one obtains a differentiable implementation of the categorical semantics that preserves the inductive truth definition while enabling efficient training and outperforming prior systems on benchmark tasks.
What carries the argument
The lazy log-tensor monad over the log-semiring. Its monadic bind, written in do-notation, performs marginalization lazily and keeps backpropagation numerically stable.
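To make the role of bind concrete, here is a minimal sketch of marginalization in the log-semiring, assuming a discrete distribution stored as a plain dictionary of log-weights. The names `unit`, `bind`, and `logsumexp` are illustrative and are not taken from the paper's code; the real implementation works on tensors, not dictionaries.

```python
import math

NEG_INF = float("-inf")

def logsumexp(values):
    """Stable log(sum(exp(v))) via the usual max-shift trick."""
    m = max(values)
    if m == NEG_INF:          # every term has zero mass
        return NEG_INF
    return m + math.log(sum(math.exp(v - m) for v in values))

def unit(value):
    """Point mass: probability 1, i.e. log-weight 0 (hypothetical helper)."""
    return {value: 0.0}

def bind(log_dist, kernel):
    """Marginalise `kernel` over `log_dist` in the log-semiring.

    Semiring product is +, semiring sum is logsumexp. Branches with
    log-weight -inf are skipped, so `kernel` is never evaluated on them.
    """
    buckets = {}
    for x, lp in log_dist.items():
        if lp == NEG_INF:
            continue
        for y, lq in kernel(x).items():
            buckets.setdefault(y, []).append(lp + lq)
    return {y: logsumexp(terms) for y, terms in buckets.items()}

# Two fair coins combined by exclusive-or.
coin = {True: math.log(0.5), False: math.log(0.5)}
xor = bind(coin, lambda a: bind(coin, lambda b: unit(a != b)))
```

Skipping the `-inf` branches is one simple reading of "lazy pruning"; whether the paper prunes on exactly this condition is an assumption of the sketch.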
If this is right
- Code written once in monad-based do-notation can be reused across different monads for various semantics.
- Neural approximation of symbols integrates directly without breaking the categorical structure.
- Batching via an additional monad layer allows scalable training without changes to the core logic.
- The approach applies to continuous domains once a suitable monad is implemented with neural components.
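The first consequence in the list above, writing a formula once and reusing it across monads, can be sketched as follows. This assumes a monad is represented as a pair of functions; `Monad`, `identity`, `distribution`, and `conj` are hypothetical names for illustration, and Python's nested lambdas stand in for do-notation.

```python
from collections import namedtuple

# A monad as a record of its two operations (illustrative encoding).
Monad = namedtuple("Monad", ["unit", "bind"])

# Identity monad: classical, deterministic semantics.
identity = Monad(unit=lambda x: x, bind=lambda m, k: k(m))

def dist_unit(x):
    """Point distribution on x."""
    return {x: 1.0}

def dist_bind(dist, kernel):
    """Marginalise kernel over dist (finite distribution monad)."""
    out = {}
    for x, p in dist.items():
        for y, q in kernel(x).items():
            out[y] = out.get(y, 0.0) + p * q
    return out

distribution = Monad(unit=dist_unit, bind=dist_bind)

def conj(monad, mp, mq):
    """Conjunction written once against the monad interface."""
    return monad.bind(mp, lambda x:
           monad.bind(mq, lambda y:
           monad.unit(x and y)))

classical = conj(identity, True, False)
fair = {True: 0.5, False: 0.5}
probabilistic = conj(distribution, fair, fair)
```

Here `classical` is a plain truth value and `probabilistic` is a distribution over truth values, yet `conj` itself did not change between the two calls.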
Where Pith is reading between the lines
- This construction could integrate with other tensor-based probabilistic programming libraries for broader neurosymbolic applications.
- Testing on tasks beyond MNIST addition would show whether the performance gains generalize to new logical queries.
- The lazy pruning of branches in the monad may suggest similar optimizations in other logical tensor systems.
Load-bearing premise
The monadic bind realized by tensor operations and the lazy log-tensor monad preserves the categorical semantics enough for the inductive truth definition to hold after neural approximation and batching.
What would settle it
A counterexample where the output probabilities or truth values from the tensor implementation diverge significantly from those computed by the reference distribution monad on the same logical program would show the semantics are not preserved.
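A check of this kind could look like the sketch below, which assumes finite distributions held in dictionaries and compares a probability-space bind against a log-space bind on the same program. The function names and the example formula `P and not Q` are illustrative choices, not the paper's test suite.

```python
import math

NEG_INF = float("-inf")

def log_add(a, b):
    """log(exp(a) + exp(b)) without overflow."""
    if a == NEG_INF:
        return b
    if b == NEG_INF:
        return a
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))

def prob_bind(dist, kernel):
    """Reference semantics: ordinary sum-product marginalisation."""
    out = {}
    for x, p in dist.items():
        for y, q in kernel(x).items():
            out[y] = out.get(y, 0.0) + p * q
    return out

def log_bind(log_dist, kernel):
    """Candidate semantics: the same marginalisation in log space."""
    out = {}
    for x, lp in log_dist.items():
        if lp == NEG_INF:
            continue  # zero-mass branch, never expanded
        for y, lq in kernel(x).items():
            out[y] = log_add(out.get(y, NEG_INF), lp + lq)
    return out

def program(bind, unit, p, q):
    """Truth of `P and not Q`, written once against bind/unit."""
    return bind(p, lambda x: bind(q, lambda y: unit(x and not y)))

def to_log(dist):
    return {x: math.log(w) if w > 0 else NEG_INF for x, w in dist.items()}

def max_divergence(p, q):
    """Largest absolute gap between reference and log-space marginals."""
    reference = program(prob_bind, lambda v: {v: 1.0}, p, q)
    candidate = program(log_bind, lambda v: {v: 0.0}, to_log(p), to_log(q))
    keys = set(reference) | set(candidate)
    return max(abs(reference.get(k, 0.0) - math.exp(candidate.get(k, NEG_INF)))
               for k in keys)

p = {True: 0.3, False: 0.7}
q = {True: 0.8, False: 0.2}
gap = max_divergence(p, q)
```

A `gap` at the level of floating-point rounding on a suite of such programs would support semantic fidelity on the non-neural fragment; a large gap on any program would be the counterexample described above.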
Original abstract
Neurosymbolic semantics is fragmented: classical, fuzzy, probabilistic and neural systems each define truth by their own inductive rules. NeSyCat, extending ULLER, subsumes them under a single inductive definition of truth, parametric in a strong monad and an aggregation structure on truth-values. NeSyCat has so far lacked an account of predicates and functions learned by neural networks. We provide NeSyCat Torch as the missing link and interpret computational symbols via neural networks, implementing the framework in probabilistic programming and tensor-based backends. We use the distribution monad for reference semantics and metric evaluation, and complement it by a monad for numerically stable, differentiable training: the lazy log-tensor monad over the log-semiring. For efficient training in batches, we furthermore employ a batch monad. The axioms are the source code: written once in monad-based do-notation, monadic bind performs marginalisation, lazily pruning unneeded branches. On MNIST addition, our HaskTorch, JAX, and PyTorch implementations outperform LTN and DeepProbLog in speed and accuracy, while achieving nearly the accuracy of DeepStochLog. However, unlike DeepStochLog, we stay in a uniform framework that applies to many first-order NeSy approaches. Namely, the construction is parametric in the monad; instantiating it with, e.g., the Giry monad extends the approach to continuous probability (working out a neural representation here is left for future work).
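For readers unfamiliar with the MNIST addition benchmark, the following sketch shows the marginalization it requires: the log-probability of an observed sum, given per-digit log-probabilities for the two images. In a real system those log-probabilities would come from a neural classifier's log-softmax output; here they are placeholder uniform values, and `sum_log_prob` and `nll` are hypothetical names.

```python
import math

NEG_INF = float("-inf")

def logsumexp(values):
    """Stable log(sum(exp(v))); returns -inf for an empty list."""
    m = max(values, default=NEG_INF)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(v - m) for v in values))

def sum_log_prob(log_p1, log_p2, target):
    """log P(d1 + d2 = target), marginalising over both digits."""
    terms = [log_p1[d1] + log_p2[target - d1]
             for d1 in range(10)
             if 0 <= target - d1 <= 9]
    return logsumexp(terms)

def nll(log_p1, log_p2, target):
    """Negative log-likelihood of the observed sum (a training loss)."""
    return -sum_log_prob(log_p1, log_p2, target)

# Placeholder classifier outputs: both images uniformly uncertain.
uniform = [math.log(0.1)] * 10
loss = nll(uniform, uniform, 9)
```

Only the sum is supervised, so the digit labels are latent and must be summed out; doing that in log space is what keeps the loss finite when individual digit probabilities become very small.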
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper presents NeSyCat Torch, a tensor implementation of the NeSyCat categorical framework for neurosymbolic learning. It extends the framework to neural predicates and functions by realizing symbols via neural networks, using the distribution monad for reference semantics and a lazy log-tensor monad (over the log-semiring) for differentiable training, together with a batch monad for efficiency. The implementation is written once in monad-based do-notation (with bind performing marginalization and lazy pruning), remains parametric in the monad, and is instantiated in HaskTorch, JAX, and PyTorch backends. On the MNIST addition task the approach is reported to outperform LTN and DeepProbLog in speed and accuracy while approaching DeepStochLog performance; the source code is presented as the axioms of the construction.
Significance. If the tensor monads preserve the original inductive definition of truth, the work supplies a uniform, extensible first-order neurosymbolic framework that supports both probabilistic reference semantics and efficient neural training. Notable strengths are the parametric monad design (explicitly allowing future extension to the Giry monad), the use of source code as axioms, and the provision of multiple backend implementations. These features could reduce fragmentation across classical, fuzzy, probabilistic and neural NeSy systems.
major comments (2)
- [Abstract] Abstract: the central claim that the lazy log-tensor monad and batch monad preserve NeSyCat's inductive definition of truth after neural approximation and batching is load-bearing, yet the manuscript supplies no monad-law verification, equivalence proof on a non-neural fragment, or other formal check that tensor-realized bind computes the same marginals and truth values as the abstract strong monad; MNIST accuracy/speed results alone do not establish semantic fidelity.
- [Experimental evaluation] Experimental evaluation (MNIST addition): the reported outperformance over LTN and DeepProbLog is presented without error bars, number of runs, ablation on the choice of lazy pruning or log-semiring arithmetic, or details of how neural predicates are integrated into the monadic construction, leaving open whether the gains are robust or depend on post-hoc implementation choices.
minor comments (2)
- The interaction between the batch monad and the lazy log-tensor monad would benefit from a small worked example showing how a simple first-order formula is evaluated under batching.
- [Abstract] The abstract states that the construction 'stays in a uniform framework' but does not explicitly contrast the monadic parametricity with the non-parametric aspects of DeepStochLog; a short comparative paragraph would improve clarity.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback highlighting the need for stronger evidence of semantic preservation and more rigorous experimental reporting. We address each major comment below, indicating planned revisions where appropriate. The manuscript's core contribution remains the monad-parametric tensor implementation with source code as axioms, but we will strengthen the presentation as detailed.
Point-by-point responses
- Referee: [Abstract] Abstract: the central claim that the lazy log-tensor monad and batch monad preserve NeSyCat's inductive definition of truth after neural approximation and batching is load-bearing, yet the manuscript supplies no monad-law verification, equivalence proof on a non-neural fragment, or other formal check that tensor-realized bind computes the same marginals and truth values as the abstract strong monad; MNIST accuracy/speed results alone do not establish semantic fidelity.
  Authors: The implementation is expressed uniformly in monad-based do-notation with bind explicitly performing marginalization and lazy pruning, which by construction mirrors the abstract strong monad semantics of NeSyCat. The distribution monad provides the reference semantics, while the lazy log-tensor monad is engineered for numerical stability over the log-semiring without altering the inductive truth definition. We acknowledge the absence of an explicit monad-law verification or equivalence proof in the current draft. In revision we will add a dedicated subsection sketching the equivalence on the non-neural fragment (showing that tensor bind yields identical marginals and truth values) and confirming that the custom monads satisfy the required laws up to the lazy pruning approximation. This addresses the load-bearing claim without relying solely on MNIST results.
  Revision: partial
- Referee: [Experimental evaluation] Experimental evaluation (MNIST addition): the reported outperformance over LTN and DeepProbLog is presented without error bars, number of runs, ablation on the choice of lazy pruning or log-semiring arithmetic, or details of how neural predicates are integrated into the monadic construction, leaving open whether the gains are robust or depend on post-hoc implementation choices.
  Authors: We agree that the experimental section would benefit from greater transparency. The neural predicates are integrated by realizing symbols as neural networks inside the monadic do-notation, with the batch monad handling efficient tensor operations; this is described in the implementation sections but can be expanded. In the revised manuscript we will report results with error bars over 5 independent runs with different random seeds, include an ablation study varying the lazy pruning threshold and comparing log-semiring versus standard arithmetic, and add explicit pseudocode or diagrams showing neural predicate embedding within the monadic construction. These additions will demonstrate that the reported speed and accuracy gains are robust rather than dependent on specific post-hoc choices.
  Revision: yes
Circularity Check
No circularity: implementation and empirical MNIST results are independent of fitted inputs or self-referential definitions
Full rationale
The paper describes a monad-parametric implementation of NeSyCat (distribution monad for reference, lazy log-tensor monad for training, plus batch monad) with neural symbol interpretation, then reports measured speed/accuracy on MNIST addition as an external empirical outcome. No equation, claim, or self-citation reduces the performance numbers or the preservation of inductive truth to a quantity defined inside the same paper. The source-code-as-axioms statement is an implementation choice, not a derivation that equates output to input by construction. The work is therefore self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (2)
- standard math: NeSyCat supplies a single inductive definition of truth parametric in a strong monad and an aggregation structure
- domain assumption: The lazy log-tensor monad over the log-semiring yields numerically stable differentiable training