pith. machine review for the scientific record.

arxiv: 2605.10159 · v1 · submitted 2026-05-11 · 💻 cs.LG · cs.NA · math.NA · physics.comp-ph

Recognition: 1 theorem link · Lean Theorem

jNO: A JAX Library for Neural Operator and Foundation Model Training

Authors on Pith: no claims yet

Pith reviewed 2026-05-12 03:03 UTC · model grok-4.3

classification 💻 cs.LG · cs.NA · math.NA · physics.comp-ph
keywords neural operators · physics-informed learning · JAX library · symbolic tracing · PDE foundation models · operator regression · residual evaluation

The pith

jNO introduces a JAX library whose symbolic tracing system expresses domains, model calls, residuals, and losses in one language and compiles them into a single optimization pipeline for mixed data-driven and physics-informed neural operator tasks.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents jNO as a library that lets researchers express operator regression, mesh-aware residuals, and PDE constraints through a unified symbolic interface. This interface traces the components and assembles them into one JAX optimization pipeline, so the same code base can switch between purely supervised training and physics-constrained variants without rewriting the surrounding structure. A sympathetic reader would care because the approach removes the usual need to maintain separate code paths for data-driven and residual-based objectives when building neural operators or foundation models for PDEs.

Core claim

The central claim is that a single tracing system in JAX can represent domains, model evaluations, residuals, supervised losses, and diagnostics symbolically and then compile them into one consistent optimization pipeline, thereby allowing seamless movement between operator regression, mesh-aware residual evaluation, and PDE-constrained training.

What carries the argument

The symbolic tracing system that records domains, model calls, residuals, supervised losses, and diagnostics in one language and compiles them into a unified JAX optimization pipeline.
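
To make the mechanism concrete, here is a minimal sketch in plain JAX of the pattern such a tracing system would automate. This is not jNO's actual API (the excerpt shows no code); the names model, data_loss, and residual_loss, and the toy ODE residual u'(x) = u(x), are illustrative assumptions.

    import jax
    import jax.numpy as jnp

    def model(params, x):
        # Small MLP stand-in for a neural operator; params is a list of (W, b) pairs.
        for W, b in params[:-1]:
            x = jnp.tanh(x @ W + b)
        W, b = params[-1]
        return x @ W + b

    def data_loss(params, x, y):
        # Supervised operator-regression term.
        return jnp.mean((model(params, x) - y) ** 2)

    def residual_loss(params, x):
        # Physics-informed term for the toy ODE u'(x) = u(x):
        # penalize the pointwise residual u'(x) - u(x).
        u = lambda xi: model(params, xi[None, None])[0, 0]
        du = jax.vmap(jax.grad(u))(x[:, 0])
        return jnp.mean((du - jax.vmap(u)(x[:, 0])) ** 2)

    @jax.jit
    def update(params, x, y, lam):
        # One compiled step; lam = 0.0 recovers purely supervised training.
        total = lambda p: data_loss(p, x, y) + lam * residual_loss(p, x)
        loss, grads = jax.value_and_grad(total)(params)
        return loss, jax.tree_util.tree_map(lambda p, g: p - 1e-3 * g, params, grads)

    # Example usage with a tiny 1-16-1 network:
    key1, key2 = jax.random.split(jax.random.PRNGKey(0))
    params = [(jax.random.normal(key1, (1, 16)), jnp.zeros(16)),
              (jax.random.normal(key2, (16, 1)), jnp.zeros(1))]
    x = jnp.linspace(0.0, 1.0, 32)[:, None]
    loss, params = update(params, x, jnp.exp(x), 0.5)

The point is the shape of the pipeline: both loss terms live inside one function that jax.jit compiles as a whole, which is the property the paper claims to lift to the level of symbolic domains, residuals, and diagnostics.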

If this is right

  • The same code can be used for pure data-driven operator learning and for physics-informed variants without structural changes.
  • Multi-model compositions become possible inside the same traced pipeline.
  • Fine-grained control over individual model parameters, optimizers, and learning rates can be expressed at the symbolic level (see the optimizer sketch after this list).
  • Translated PDE foundation-model families can be handled through native JAX workflows.
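
On the third bullet, a sketch of what per-group control can look like with optax, the gradient-processing library that appears in the reference graph below; the two-group parameter tree is hypothetical, and jNO's own symbolic interface for this is not shown in the excerpt.

    import jax.numpy as jnp
    import optax

    # Hypothetical two-group parameter tree; a real model would have more leaves.
    params = {'operator': jnp.zeros((8, 8)), 'head': jnp.zeros((8, 1))}

    # One optimizer per labeled group, each with its own learning rate.
    opt = optax.multi_transform(
        {'slow': optax.adam(1e-4), 'fast': optax.adam(1e-3)},
        param_labels={'operator': 'slow', 'head': 'fast'})
    opt_state = opt.init(params)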

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • If the tracing layer scales, it could reduce the engineering overhead when researchers experiment with hybrid loss terms that blend observations and physical constraints.
  • The design may encourage libraries that treat neural operators and PDE solvers as interchangeable modules within one compilation graph.
  • Adoption would likely depend on whether the tracing overhead remains negligible when the underlying PDE mesh or model size grows.

Load-bearing premise

The symbolic tracing mechanism correctly and efficiently assembles arbitrary mixes of data-driven losses and PDE residuals without hidden errors or performance losses on realistic problems.

What would settle it

A test case that combines a high-dimensional PDE residual with a supervised loss on irregular meshes and measures whether the compiled pipeline produces the expected gradient updates and converges at the speed of hand-written JAX code.
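
A minimal version of that gradient check, assuming a hypothetical compiled_loss (whatever the traced pipeline emits) and handwritten_loss (its plain-JAX reference), both with signature (params, batch):

    import jax
    import jax.numpy as jnp

    def gradients_agree(compiled_loss, handwritten_loss, params, batch, tol=1e-5):
        # Compare gradients from the compiled pipeline against a hand-written
        # reference, leaf by leaf.
        g_pipe = jax.grad(compiled_loss)(params, batch)
        g_ref = jax.grad(handwritten_loss)(params, batch)
        close = jax.tree_util.tree_map(
            lambda a, b: jnp.allclose(a, b, atol=tol), g_pipe, g_ref)
        return all(jax.tree_util.tree_leaves(close))

Convergence speed would be measured separately, for instance by timing jitted update steps of both variants on the same mesh.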

Figures

Figures reproduced from arXiv: 2605.10159 by Christopher Straub, Georg Kruse, Leon Armbruster, Rathan Ramesh.

Figure 1
Figure 1: Overview of jNO's interdependencies. Adjacent text (Section 1.2, JAX tracing paradigm and performance benefits): jNO's traced-symbolic design naturally complements JAX's own lazy tracing and XLA compilation strategy. Rather than introducing a new execution model, jNO extends JAX's functional programming paradigm by deferring both model calls and differential operators until compilation time. This alignment provides two concrete ad… view at source ↗
Figure 2
Figure 2: Main execution pipeline of jNO for model training. view at source ↗
read the original abstract

jNO (jax Neural Operators) is a JAX-native library for neural operators and foundation models with unified support for both data-driven and physics-informed training. Its core design is a tracing system in which domains, model calls, residuals, supervised losses, and diagnostics are written in one symbolic language and compiled into one optimization pipeline. This allows users to move between operator regression, mesh-aware residual evaluation, and PDE-constrained training without restructuring the surrounding code. jNO also supports multi-model compositions, fine-grained control at parameter level (model, optimizer, and learning rate), hyperparameter tuning, and JAX-native workflows for translated PDE foundation-model families. The source repository is available at https://github.com/FhG-IISB/jNO.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The paper introduces jNO, a JAX-native library for neural operators and foundation models. Its core contribution is a symbolic tracing system that unifies the specification of domains, model calls, residuals, supervised losses, and diagnostics into a single language that compiles to one optimization pipeline, enabling seamless transitions between operator regression, mesh-aware residual evaluation, and PDE-constrained training without code restructuring. The library additionally provides support for multi-model compositions, per-parameter control of models/optimizers/learning rates, hyperparameter tuning, and JAX-native workflows for PDE foundation models. The source code is released at https://github.com/FhG-IISB/jNO.

Significance. If the tracing system functions as described, jNO would offer a practical tool for researchers combining data-driven and physics-informed approaches in neural operators, reducing boilerplate when switching loss formulations. The JAX-native design and open-source release are strengths that align with reproducible scientific ML workflows. However, the absence of any code examples, benchmarks, or verification leaves the practical impact and correctness of the unification claim unassessed.

major comments (1)
  1. [Abstract] The central claim that the tracing system 'allows users to move between operator regression, mesh-aware residual evaluation, and PDE-constrained training without restructuring the surrounding code' is presented purely descriptively, with no accompanying usage example, code snippet, or empirical verification that the compiled pipeline actually executes the claimed workflows without errors or performance loss; this directly affects the load-bearing design assertion.
minor comments (2)
  1. The manuscript would benefit from a brief 'Getting Started' code example (even 10-15 lines) illustrating a minimal tracing workflow for a residual loss, to make the symbolic language concrete for readers.
  2. No performance or scaling results are reported (e.g., tracing overhead vs. hand-written JAX, or wall-clock time on a standard PDE benchmark); adding even a small table of such metrics would strengthen the library description.

Simulated Authors' Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive feedback and for recognizing the potential utility of jNO's unified tracing system for combining data-driven and physics-informed neural operator training. We agree that the central claim in the abstract requires concrete support to be fully convincing. We address the major comment below and will revise the manuscript accordingly.

read point-by-point responses
  1. Referee: [Abstract] The central claim that the tracing system 'allows users to move between operator regression, mesh-aware residual evaluation, and PDE-constrained training without restructuring the surrounding code' is presented purely descriptively, with no accompanying usage example, code snippet, or empirical verification that the compiled pipeline actually executes the claimed workflows without errors or performance loss; this directly affects the load-bearing design assertion.

    Authors: We agree that the abstract presents this capability in purely descriptive terms and that the manuscript would benefit from explicit demonstration. In the revised version we will add a short 'Usage Example' subsection (placed after the library overview) that contains two minimal, self-contained code snippets. The first shows a complete operator-regression pipeline using the symbolic tracing API; the second shows the identical domain and model specification being recompiled for PDE-constrained training by simply swapping the loss term. Both snippets will be accompanied by a brief execution trace confirming successful compilation and execution without code restructuring. We will also report wall-clock times for the two pipelines on a small benchmark problem to provide initial empirical evidence that the unification incurs no prohibitive overhead. These additions directly substantiate the load-bearing claim while remaining concise. revision: yes
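
For the flavor of the proposed 'swap the loss term' snippet, a hedged illustration in plain JAX rather than jNO's actual API: the surrounding structure is built once, and only the loss term changes between training modes. The loss names are hypothetical.

    import jax

    def make_update(loss_fn, lr=1e-3):
        # Build a compiled training step for any loss with signature (params, batch).
        @jax.jit
        def update(params, batch):
            grads = jax.grad(loss_fn)(params, batch)
            return jax.tree_util.tree_map(lambda p, g: p - lr * g, params, grads)
        return update

    # supervised_step = make_update(data_loss)      # operator regression
    # physics_step    = make_update(residual_loss)  # PDE-constrained training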

Circularity Check

0 steps flagged

No significant circularity

full rationale

The paper is a software library description whose central claim is an architectural feature (a unified symbolic tracing system for domains, model calls, residuals, losses, and diagnostics). No mathematical derivations, equations, predictions, or fitted parameters are present in the provided text or abstract. Claims reduce to implementation details rather than any self-referential reduction or self-citation chain, satisfying the default expectation of no circularity for non-derivational papers.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

This is a software library announcement with no mathematical derivations, fitted parameters, or new physical entities.

pith-pipeline@v0.9.0 · 5436 in / 1028 out tokens · 48229 ms · 2026-05-12T03:03:09.180219+00:00 · methodology

discussion (0)


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

12 extracted references · 12 canonical work pages · 1 internal anchor

  1. [1] Igor Babuschkin, Kate Baumli, Alison Bell, Surya Bhupatiraju, Jake Bruce, Peter Buchlovsky, David Budden, Trevor Cai, Aidan Clark, Ivo Danihelka, et al. Optax: a gradient processing and optimization library for JAX.

  2. [2] Zhongkai Hao, Chang Su, Songming Liu, Julius Berner, Chengyang Ying, Hang Su, Anima Anandkumar, Jian Song, and Jun Zhu. DPOT: auto-regressive denoising operator transformer for large-scale PDE pre-training. arXiv preprint arXiv:2403.03542.

  3. [3] iree-org and contributors. IREE: end-to-end MLIR compiler and runtime for machine learning. Software library, https://github.com/iree-org/iree. Accessed: 2026-04-29.

  4. [4] Patrick Kidger and Cristian Garcia. Equinox: neural networks in JAX via callable PyTrees and filtered transformations. Differentiable Programming Workshop at Neural Information Processing Systems 2021. https://arxiv.org/abs/2111.00254.

  5. [5] Zongyi Li, Nikola Kovachki, Kamyar Azizzadenesheli, Burigede Liu, Kaushik Bhattacharya, Andrew Stuart, and Anima Anandkumar. Fourier neural operator for parametric partial differential equations. In Proc. 9th International Conference on Learning Representations (ICLR), 2021. https://arxiv.org/abs/2010.08895.

  6. [6] Yuxuan Liu, Jingmin Sun, and Hayden Schaeffer. BCAT: a block causal transformer for PDE foundation models for fluid dynamics. arXiv preprint arXiv:2501.18972.

  7. [7] Lu Lu, Pengzhan Jin, Guofei Pang, Zhongqiang Zhang, and George Em Karniadakis. Learning nonlinear operators via DeepONet based on the universal approximation theorem of operators. Nature Machine Intelligence, 3(3):218–229, 2021. doi: 10.1038/s42256-021-00302-5.

  8. [8] Vlad Medvedev, Leon Armbruster, Christopher Straub, Georg Kruse, and Andreas Rosskopf. Physics-informed fine-tuning of foundation models for partial differential equations. In AI&PDE: ICLR 2026 Workshop on AI and Partial Differential Equations.

  9. [9] J. Rapin and O. Teytaud. Nevergrad: a gradient-free optimization platform. https://GitHub.com/FacebookResearch/Nevergrad.

  10. [10] Mahindra Singh Rautela, Alexander Most, Siddharth Mansingh, Bradley C. Love, Ayan Biswas, Diane Oyen, and Earl Lawrence. Morph: PDE foundation models with arbitrary data modality. arXiv preprint arXiv:2509.21670.

  11. [11] Kirill Zubov, Zoe McCarthy, Yingbo Ma, Francesco Calisto, Valerio Pagliarino, Simone Azeglio, Luca Bottero, Emmanuel Lujan, Valentin Sulzer, Ashutosh Bharambe, et al. NeuralPDE: automatic physics-informed neural networks (PINN) with Julia. arXiv preprint arXiv:2107.09443.