Mosaic: A Benchmark Suite for Differentiable Physics Solvers
Pith reviewed 2026-06-29 02:10 UTC · model grok-4.3
The pith
Evaluation of 14 differentiable PDE solvers shows memory limits, numerical stability, and compatibility are bigger barriers than gradient accuracy.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Mosaic demonstrates that while differentiable PDE solvers vary widely in computational cost and Jacobian conditioning, all that successfully produce gradients converge to comparable optima, making memory limits, numerical stability, and setup compatibility the primary practical barriers rather than gradient accuracy.
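As a rough illustration of what "Jacobian conditioning" measures, the sketch below estimates the Jacobian of a two-input, two-output map by central differences and computes its condition number (largest over smallest singular value). The two toy maps, the helper names, and the step size are assumptions made for this example; they are not solvers or settings from the paper.

```python
import math
from typing import Callable, List

Vector = List[float]


def jacobian_fd(f: Callable[[Vector], Vector], x: Vector, eps: float = 1e-6) -> List[List[float]]:
    """Central-difference Jacobian with entries J[i][j] = d f_i / d x_j."""
    rows = len(f(x))
    jac = [[0.0] * len(x) for _ in range(rows)]
    for j in range(len(x)):
        hi = list(x)
        lo = list(x)
        hi[j] += eps
        lo[j] -= eps
        f_hi, f_lo = f(hi), f(lo)
        for i in range(rows):
            jac[i][j] = (f_hi[i] - f_lo[i]) / (2.0 * eps)
    return jac


def condition_number_2x2(jac: List[List[float]]) -> float:
    """Ratio of largest to smallest singular value of a 2x2 matrix (closed form)."""
    (a, b), (c, d) = jac
    frob_sq = a * a + b * b + c * c + d * d      # sigma_max^2 + sigma_min^2
    det = a * d - b * c                          # |det| = sigma_max * sigma_min
    disc = max(frob_sq * frob_sq - 4.0 * det * det, 0.0)
    sigma_max = math.sqrt((frob_sq + math.sqrt(disc)) / 2.0)
    if sigma_max == 0.0:
        return math.inf
    sigma_min = abs(det) / sigma_max
    return math.inf if sigma_min == 0.0 else sigma_max / sigma_min


def well_conditioned(x: Vector) -> Vector:
    """Hypothetical map whose outputs respond independently to each input."""
    return [2.0 * x[0], 2.0 * x[1]]


def ill_conditioned(x: Vector) -> Vector:
    """Hypothetical map whose two outputs are almost the same function."""
    return [x[0] + x[1], x[0] + 1.001 * x[1]]


point = [0.5, -0.25]
cond_well = condition_number_2x2(jacobian_fd(well_conditioned, point))
cond_ill = condition_number_2x2(jacobian_fd(ill_conditioned, point))
```

A large condition number means small input perturbations along some directions barely change the output, which tends to slow or destabilize gradient-based optimization even when the gradient itself is accurate.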
What carries the argument
Tesseract containerization that wraps each solver to provide a uniform gradient API regardless of its original language or automatic-differentiation approach.
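To make the idea of a uniform gradient API concrete, here is a minimal Python sketch. It is not the Tesseract API: the `DifferentiableSolver` protocol, the `apply` and `vjp` method names, and both toy solvers are hypothetical, chosen only to show how components with different differentiation strategies can sit behind one interface.

```python
from typing import Callable, List, Protocol

Vector = List[float]


class DifferentiableSolver(Protocol):
    """Hypothetical uniform interface; an assumption, not the real Tesseract API."""

    def apply(self, x: Vector) -> Vector: ...

    def vjp(self, x: Vector, cotangent: Vector) -> Vector: ...


class AnalyticSolver:
    """Toy 'solver' y_i = x_i**2 with a hand-written adjoint."""

    def apply(self, x: Vector) -> Vector:
        return [xi * xi for xi in x]

    def vjp(self, x: Vector, cotangent: Vector) -> Vector:
        # The Jacobian is diagonal with entries 2*x_i, so J^T v = 2*x_i*v_i.
        return [2.0 * xi * vi for xi, vi in zip(x, cotangent)]


class FiniteDifferenceSolver:
    """Wraps a black-box forward function; gradients via central differences."""

    def __init__(self, forward: Callable[[Vector], Vector], eps: float = 1e-6):
        self._forward = forward
        self._eps = eps

    def apply(self, x: Vector) -> Vector:
        return self._forward(x)

    def vjp(self, x: Vector, cotangent: Vector) -> Vector:
        result = []
        for i in range(len(x)):
            hi = list(x)
            lo = list(x)
            hi[i] += self._eps
            lo[i] -= self._eps
            # Column i of the Jacobian, contracted with the cotangent.
            column = [(a - b) / (2.0 * self._eps)
                      for a, b in zip(self._forward(hi), self._forward(lo))]
            result.append(sum(c * v for c, v in zip(column, cotangent)))
        return result


def loss_gradient(solver: DifferentiableSolver, x: Vector) -> Vector:
    """Gradient of sum(solver.apply(x)), using only the shared interface."""
    y = solver.apply(x)
    return solver.vjp(x, [1.0] * len(y))


solvers = {
    "analytic": AnalyticSolver(),
    "finite_difference": FiniteDifferenceSolver(lambda x: [xi * xi for xi in x]),
}
gradients = {name: loss_gradient(s, [1.0, -2.0, 3.0]) for name, s in solvers.items()}
```

The calling code in `loss_gradient` never needs to know which strategy produced the gradient, which is the property the summary attributes to the containerized wrapper.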
If this is right
- Order-of-magnitude cost differences make some solvers impractical for given problem sizes.
- Structural incompatibilities can rule out entire solvers for realistic tasks.
- Gradient-producing solvers reach similar optima across the tested domains.
- The dominant obstacles are memory footprint, numerical stability, and integration compatibility.
Where Pith is reading between the lines
- Developers may begin to publish memory and compatibility profiles alongside accuracy claims for new solvers.
- The uniform API could be adopted as a de-facto standard for releasing differentiable physics code.
- Training loops that swap solvers mid-experiment become easier to implement and compare.
- The same container pattern might be applied to non-PDE simulators that currently lack gradient support.
Load-bearing premise
Wrapping solvers in containers and routing their gradients through a common interface leaves their numerical results and gradient correctness unchanged.
What would settle it
An experiment showing that two solvers reach clearly different optima when both supply gradients would contradict the claim that gradient accuracy is not the limiting factor.
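A toy version of such an experiment can be sketched in a few lines. The code below runs the same gradient-descent loop with an exact gradient and with a deliberately less accurate one-sided finite-difference gradient, then compares the resulting optima. The quadratic objective, learning rate, step count, and the `1e-3` similarity threshold are all assumptions for illustration and do not come from the paper.

```python
import math
from typing import Callable, List

Vector = List[float]


def objective(x: Vector) -> float:
    """Toy convex objective with its minimum at (1, -2); not a paper benchmark."""
    return (x[0] - 1.0) ** 2 + 10.0 * (x[1] + 2.0) ** 2


def exact_gradient(x: Vector) -> Vector:
    return [2.0 * (x[0] - 1.0), 20.0 * (x[1] + 2.0)]


def forward_difference_gradient(x: Vector, eps: float = 1e-4) -> Vector:
    """Less accurate gradient: one-sided differences carry an O(eps) bias."""
    base = objective(x)
    grad = []
    for i in range(len(x)):
        shifted = list(x)
        shifted[i] += eps
        grad.append((objective(shifted) - base) / eps)
    return grad


def gradient_descent(grad_fn: Callable[[Vector], Vector], start: Vector,
                     lr: float = 0.05, steps: int = 200) -> Vector:
    """Plain fixed-step gradient descent."""
    x = list(start)
    for _ in range(steps):
        g = grad_fn(x)
        x = [xi - lr * gi for xi, gi in zip(x, g)]
    return x


def distance(a: Vector, b: Vector) -> float:
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


optimum_exact = gradient_descent(exact_gradient, [0.0, 0.0])
optimum_fd = gradient_descent(forward_difference_gradient, [0.0, 0.0])
gap = distance(optimum_exact, optimum_fd)
optima_are_similar = gap < 1e-3  # threshold is an assumption for this sketch
```

On a well-behaved problem like this one, the biased gradient still lands close to the exact optimum. A counterexample to the paper's claim would be a realistic task where `gap` stays large even though both gradient sources run to completion.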
Original abstract
Differentiable partial differential equation (PDE) solvers underpin solver-in-the-loop ML training, gradient-based optimal control, and inverse problems, yet the practical cost of obtaining correct, usable gradients from a given solver on a given problem is largely undocumented. Integration effort, computational cost, gradient accuracy, and numerical conditioning vary widely across solvers and are discoverable only by trial and error. We introduce Mosaic, an extensible benchmarking framework for differentiable PDE solvers that standardizes access to solver gradients. Each solver is packaged as a containerized component (Tesseract) exposing a uniform gradient API regardless of language or automatic differentiation (AD) strategy, enabling researchers to evaluate, compare, and build on non-trivial physical solvers. Our evaluation of 14 solvers across fluid dynamics, structural mechanics, and heat transfer demonstrates that the benchmark surfaces practically relevant differences: order-of-magnitude variation in computational cost and Jacobian conditioning, alongside structural incompatibilities that eliminate solvers from realistic tasks entirely. Despite this variation, all solvers that produce gradients converge to similar optima, indicating that the practical barriers are memory limits, numerical stability, and setup compatibility rather than gradient accuracy alone. Mosaic is open-source and available at https://github.com/pasteurlabs/mosaic.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces Mosaic, an extensible benchmarking framework for differentiable PDE solvers. Solvers are packaged as Tesseract containers exposing a uniform gradient API independent of language or AD strategy. An evaluation of 14 solvers across fluid dynamics, structural mechanics, and heat transfer reports order-of-magnitude differences in computational cost and Jacobian conditioning, plus structural incompatibilities that rule out some solvers for realistic tasks; however, all gradient-producing solvers converge to similar optima, leading to the conclusion that practical barriers are memory limits, numerical stability, and setup compatibility rather than gradient accuracy.
Significance. If the central empirical findings hold, the work supplies a practical, reproducible resource for researchers selecting differentiable solvers for solver-in-the-loop training, optimal control, and inverse problems. The open-source release and containerized uniform API are concrete strengths that lower the barrier to comparing non-trivial physical solvers.
Major comments (1)
- [Abstract] The claim that 'all solvers that produce gradients converge to similar optima, indicating that the practical barriers are ... rather than gradient accuracy alone' is load-bearing for the paper's main conclusion, yet the manuscript provides no explicit validation (e.g., side-by-side comparison of native vs. Tesseract-wrapped Jacobians, convergence rates, or final optima on the same test problems) that the containerization and uniform API preserve each solver's original numerical behavior and gradient correctness.
Simulated Author's Rebuttal
We thank the referee for their careful reading and for highlighting a point that strengthens the manuscript. We address the single major comment below and will revise accordingly.
- Referee: [Abstract] The claim that 'all solvers that produce gradients converge to similar optima, indicating that the practical barriers are ... rather than gradient accuracy alone' is load-bearing for the paper's main conclusion, yet the manuscript provides no explicit validation (e.g., side-by-side comparison of native vs. Tesseract-wrapped Jacobians, convergence rates, or final optima on the same test problems) that the containerization and uniform API preserve each solver's original numerical behavior and gradient correctness.
Authors: We agree that the claim is central and that explicit validation of numerical fidelity under the Tesseract wrapper would make the argument more robust. The current manuscript reports convergence behavior only for the wrapped solvers; it does not include direct native-versus-wrapped comparisons of Jacobians or final optima. In the revised manuscript we will add a dedicated validation subsection that performs exactly these side-by-side checks on a representative subset of solvers and test problems, reporting both Jacobian differences (where accessible) and convergence trajectories to the same optima. This addition will be placed in the evaluation section and referenced from the abstract.
Revision: yes
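A side-by-side check of this kind could look roughly like the sketch below. It compares a gradient computed directly with the same gradient after a simulated wrapper boundary, using two metrics: relative error and one minus cosine similarity. The wrapper here is only a stand-in (a JSON round trip with float32 rounding); that transport, the toy gradient, and all function names are assumptions, not a description of how Tesseract moves data.

```python
import json
import math
import struct
from typing import List

Vector = List[float]


def native_gradient(x: Vector) -> Vector:
    """Gradient of the toy function f(x) = sum(x_i * sin(x_i))."""
    return [math.sin(xi) + xi * math.cos(xi) for xi in x]


def to_float32(value: float) -> float:
    """Round a Python float to the nearest float32 and back."""
    return struct.unpack("f", struct.pack("f", value))[0]


def wrapped_gradient(x: Vector) -> Vector:
    """Hypothetical wrapper: inputs and outputs cross a JSON boundary in float32."""
    payload = json.loads(json.dumps([to_float32(xi) for xi in x]))
    return [to_float32(g) for g in native_gradient(payload)]


def norm(v: Vector) -> float:
    return math.sqrt(sum(vi * vi for vi in v))


def relative_error(candidate: Vector, reference: Vector) -> float:
    diff = [c - r for c, r in zip(candidate, reference)]
    return norm(diff) / norm(reference)


def one_minus_cosine(a: Vector, b: Vector) -> float:
    dot = sum(ai * bi for ai, bi in zip(a, b))
    return 1.0 - dot / (norm(a) * norm(b))


x0 = [0.1, 0.7, 1.3, 2.9]
g_native = native_gradient(x0)
g_wrapped = wrapped_gradient(x0)
rel_err = relative_error(g_wrapped, g_native)
direction_gap = one_minus_cosine(g_wrapped, g_native)
```

Relative error captures magnitude drift, while one minus cosine similarity isolates changes in gradient direction, which is usually what matters for first-order optimizers.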
Circularity Check
No circularity: empirical benchmark with no derivation chain
The paper presents an empirical benchmark suite for differentiable PDE solvers, standardizing access via containerized components and evaluating 14 solvers on metrics like cost, conditioning, and compatibility. No equations, fitted parameters, predictions, or uniqueness theorems are claimed; the central observations (order-of-magnitude variation and convergence to similar optima) are direct empirical results from the benchmark runs, not reductions to self-defined inputs or self-citations. The work is self-contained as a comparison framework without any load-bearing derivation that could be circular.
Axiom & Free-Parameter Ledger
Excerpt (Mosaic: A Benchmark Suite for Differentiable Physics Solvers, Rehmann et al., 2026): L-BFGS+proj and Adam+proj variants apply a solenoidal projection to the gradient ...
[Figure: relative FD error and 1 - cos. sim. versus ε; panels: 2D NS, 3D NS, Structural, Thermal]
and PICT produced no converged result. The failure is structural: the inflow-to-drag map is strongly non-convex (advection-driven, with separation and re-attachment regimes), so secant pairs s_k = x_{k+1} - x_k taken across iterations sample regions where the local Hessian differs in sign, violating the positive-curvature condition and causing the limited-memo...
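The positive-curvature condition mentioned in this excerpt can be illustrated on a toy problem. For a secant pair with step s_k = x_{k+1} - x_k and gradient change y_k = g_{k+1} - g_k, quasi-Newton updates require s_k · y_k > 0. The sketch below checks that condition on a one-dimensional double-well function, which is an assumed stand-in and not the inflow-to-drag map from the paper; the tolerance-based test is one common safeguard, not necessarily the one used by the benchmarked optimizers.

```python
from typing import List

Vector = List[float]


def gradient(x: Vector) -> Vector:
    """Gradient of the toy double-well f(x) = x^4 - 2*x^2 (non-convex near 0)."""
    return [4.0 * xi ** 3 - 4.0 * xi for xi in x]


def dot(a: Vector, b: Vector) -> float:
    return sum(ai * bi for ai, bi in zip(a, b))


def secant_pair(x_old: Vector, x_new: Vector):
    """Return (s, y): the step and the change in gradient across it."""
    s = [n - o for n, o in zip(x_new, x_old)]
    y = [gn - go for gn, go in zip(gradient(x_new), gradient(x_old))]
    return s, y


def satisfies_curvature(s: Vector, y: Vector, tol: float = 1e-10) -> bool:
    """Accept the pair only if s.y is positive relative to |s|^2."""
    return dot(s, y) > tol * dot(s, s)


# Both iterates lie where the second derivative 12*x^2 - 4 is positive.
convex_pair = secant_pair([1.2], [1.5])
# Both iterates lie in the region of negative curvature around x = 0.
nonconvex_pair = secant_pair([-0.2], [0.3])

convex_ok = satisfies_curvature(*convex_pair)
nonconvex_ok = satisfies_curvature(*nonconvex_pair)
```

When the check fails, an implementation typically has to skip or damp the update, so a landscape that produces many failing pairs leaves the limited-memory Hessian approximation with little usable curvature information.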