
arxiv: 2605.22724 · v1 · pith:DV5JSGR2new · submitted 2026-05-21 · 💻 cs.LG · cs.NA · math.NA · stat.ML

Multiple Neural Operators Achieve Near-Optimal Rates for Multi-Task Learning

Pith reviewed 2026-05-22 07:28 UTC · model grok-4.3

classification 💻 cs.LG · cs.NA · math.NA · stat.ML
keywords neural operators · multi-task learning · operator learning · approximation theory · statistical learning · deep learning · function approximation · minimax rates

The pith

Collections of Lipschitz operator maps can be learned jointly with multiple neural operators at near-optimal rates that match single-task learning.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that multiple neural operators achieve near-optimal approximation and statistical generalization bounds when learning collections of operators that belong to broad classes of Lipschitz multiple operator maps. Shared representations across tasks do not raise the overall cost, so multi-task operator learning obeys the same scaling laws as single-operator learning. Lower bounds establish corresponding minimax rates and a curse of parametric complexity. The same asymptotic rates hold when the architecture is compared to a concatenated-input multi-task extension of DeepONet.

Core claim

For broad classes of Lipschitz multiple operator maps, the Multiple Neural Operators architecture attains near-optimal upper bounds on approximation error and statistical generalization. Matching lower bounds establish minimax rates and exhibit a curse of parametric complexity. Together, these results show that jointly learning multiple operators incurs no extra cost beyond single-operator learning and therefore follows the same scaling laws.

What carries the argument

The Multiple Neural Operators (MNO) architecture, which learns collections of operators through shared representations while respecting the Lipschitz condition on the joint map.

If this is right

  • Multi-task operator learning achieves the same near-optimal approximation rates as single-task operator learning.
  • Statistical generalization bounds remain near-optimal and identical in scaling to the single-task case.
  • Shared representations across tasks produce no increase in overall parametric complexity.
  • The MNO architecture and the multi-task DeepONet extension satisfy essentially the same asymptotic rates from a worst-case perspective.
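The concatenated-input multi-task DeepONet named in the last bullet can be pictured roughly as follows. This is a minimal NumPy sketch under stated assumptions, not the paper's construction: the sensor encodings, layer widths, tanh activations, and every name (`init_mlp`, `mlp`, `multitask_deeponet`, `alpha_sensors`, `u_sensors`) are hypothetical and chosen only for illustration.

```python
import numpy as np

def init_mlp(rng, sizes):
    """Random weights for a small fully connected network (illustrative only)."""
    return [(rng.standard_normal((m, n)) / np.sqrt(m), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def mlp(params, x):
    """tanh MLP with a linear last layer."""
    for W, b in params[:-1]:
        x = np.tanh(x @ W + b)
    W, b = params[-1]
    return x @ W + b

def multitask_deeponet(branch, trunk, alpha_sensors, u_sensors, x):
    """Evaluate sum_k b_k([alpha, u]) * tau_k(x).

    alpha_sensors: (batch, d_alpha) discretized task descriptor (assumed encoding)
    u_sensors:     (batch, d_u)     discretized input function (assumed encoding)
    x:             (n_points, d_x)  query locations in the output domain
    """
    # Task input and function input are concatenated and fed to one branch net.
    z = np.concatenate([alpha_sensors, u_sensors], axis=-1)
    b = mlp(branch, z)    # (batch, K) branch coefficients
    tau = mlp(trunk, x)   # (n_points, K) trunk basis values
    return b @ tau.T      # (batch, n_points)

rng = np.random.default_rng(0)
d_alpha, d_u, d_x, K = 8, 16, 1, 32          # hypothetical sizes
branch = init_mlp(rng, [d_alpha + d_u, 64, K])
trunk = init_mlp(rng, [d_x, 64, K])

alpha = rng.standard_normal((4, d_alpha))
u = rng.standard_normal((4, d_u))
x = np.linspace(0.0, 1.0, 50).reshape(-1, 1)
out = multitask_deeponet(branch, trunk, alpha, u, x)
print(out.shape)
```

The only multi-task ingredient in this sketch is the concatenation step: the task descriptor enters the branch network alongside the input function, so a single set of weights serves every task.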

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The result suggests that joint training on several related simulation tasks could be performed with sample complexity no higher than for one task alone.
  • If real-world operator collections satisfy the Lipschitz condition, the bounds would justify multi-task training pipelines in scientific machine-learning applications.
  • The equivalence of rates invites direct empirical checks on concrete operator-learning benchmarks such as fluid flow or elasticity maps.

Load-bearing premise

The collections of target operators must belong to the broad classes of Lipschitz multiple operator maps.
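
For concreteness, one common way to write a Lipschitz condition on a joint map is sketched below. This is an assumption for illustration only: the spaces $W$ (task parameters), $U$ (input functions), $V$ (outputs), the map $G$, and the constant $L$ are illustrative symbols, and the paper's precise class may differ, for example in the norms used or in restricting to bounded or compact subsets.

```latex
\[
  \bigl\| G[\alpha][u] - G[\alpha'][u'] \bigr\|_{V}
  \;\le\; L \left( \| \alpha - \alpha' \|_{W} + \| u - u' \|_{U} \right)
  \qquad \text{for all } (\alpha, u),\ (\alpha', u') \in W \times U .
\]
```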

What would settle it

A measured scaling of approximation or generalization error that grows strictly faster with the number of tasks than the single-task rate would contradict the claimed equivalence.
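
A check of this kind could be set up roughly as below. This is a hedged sketch, not an experiment from the paper: the error values are synthetic placeholders, the power-law form is assumed only to make the fit simple (the paper's actual rates are not stated on this page and may not be algebraic), and the names `loglog_slope` and `errors_by_tasks` are hypothetical.

```python
import numpy as np

def loglog_slope(n_samples, errors):
    """Least-squares slope of log(error) against log(n)."""
    slope, _intercept = np.polyfit(np.log(n_samples), np.log(errors), deg=1)
    return slope

# Sample sizes at which the test error would be measured.
n = np.array([100, 200, 400, 800, 1600], dtype=float)

# Synthetic placeholder errors for T = 1, 4, 16 tasks. In a real study these
# would be measured test errors; here they are constructed with the same
# exponent and different constants, purely to exercise the fitting code.
errors_by_tasks = {
    1: 2.0 * n ** -0.5,
    4: 3.0 * n ** -0.5,
    16: 5.0 * n ** -0.5,
}

slopes = {T: loglog_slope(n, e) for T, e in errors_by_tasks.items()}
for T, s in slopes.items():
    print(f"T={T:>2}: fitted exponent {s:.3f}")
```

If the fitted exponents on real measurements degraded systematically as T grows, that would be evidence against the claimed equivalence; constants that grow with T while the exponent stays fixed would not.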

Original abstract

We study the approximation and statistical complexity of learning collections of operators in a shared multi-task setting, with a focus on the Multiple Neural Operators (MNO) architecture. For broad classes of Lipschitz multiple operator maps, we derive near-optimal upper bounds for approximation and statistical generalization. On the lower-bound side, we establish a curse of parametric complexity and prove corresponding minimax rates. Together, these results show that shared representations across tasks do not increase the overall cost: multi-task operator learning follows the same scaling laws as single operator learning. We also compare MNO with a multi-task extension of DeepONet based on concatenated task inputs and show that, from a worst-case approximation-complexity perspective, both architectures satisfy essentially the same asymptotic rates.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 3 minor

Summary. The paper studies the approximation and statistical complexity of multi-task operator learning with a focus on the Multiple Neural Operators (MNO) architecture. For broad classes of Lipschitz multiple operator maps, it derives near-optimal upper bounds on approximation and generalization error. It establishes matching minimax lower bounds that exhibit a curse of parametric complexity, showing that shared representations across tasks incur no extra cost and that multi-task operator learning obeys the same scaling laws as the single-task case. The paper also compares MNO to a concatenated-input multi-task extension of DeepONet and concludes that both architectures achieve essentially the same asymptotic rates from a worst-case perspective.

Significance. If the upper and lower bounds are correctly derived, the result would be significant for the theory of neural operators and multi-task learning. It supplies a rigorous justification that joint learning of operator collections does not inflate sample or parameter complexity beyond the single-operator baseline, which could inform architecture design in scientific machine learning applications involving families of related PDEs or dynamical systems. The explicit comparison to a multi-task DeepONet variant adds practical value by identifying two architectures with comparable worst-case guarantees.

minor comments (3)
  1. The abstract and introduction use the phrase 'near-optimal' without immediately stating the precise rate (e.g., the dependence on the number of tasks, the Lipschitz constant, or the dimension of the input function space). Adding a one-sentence summary of the achieved rate would improve readability.
  2. Section 2 (or the related-work subsection) would benefit from an explicit citation to the single-task operator-learning rates that the multi-task bounds are claimed to match, so that the 'same scaling laws' statement can be checked at a glance.
  3. In the statement of the main upper-bound theorem, the dependence of the constant on the number of tasks T should be written out explicitly rather than absorbed into big-O notation, to make the 'no extra cost' claim immediately verifiable.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive and accurate summary of our work, as well as the recommendation for minor revision. We appreciate the recognition of the significance of our results on near-optimal rates for multi-task operator learning and the comparison to the multi-task DeepONet extension. We will incorporate any minor suggestions to improve clarity and presentation in the revised manuscript.

Circularity Check

0 steps flagged

No significant circularity detected


The paper derives near-optimal approximation and statistical upper bounds for collections of Lipschitz multiple operator maps, along with matching minimax lower bounds that establish shared representations incur no extra cost relative to single-task operator learning. These results rely on standard approximation theory and statistical learning arguments conditioned explicitly on the Lipschitz multiple-operator class, without reducing any claimed prediction or rate to a fitted parameter, self-defined quantity, or load-bearing self-citation. The comparison to concatenated-input multi-task DeepONet is likewise framed in terms of asymptotic rates under the same class, with no step that equates an output to its own input by construction. The derivation chain is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claims rest on the domain assumption that the operator collections are Lipschitz. No free parameters or invented entities are described in the abstract.

axioms (1)
  • domain assumption: Collections of operators belong to broad classes of Lipschitz multiple operator maps.
    Explicitly invoked in the abstract as the setting for which near-optimal bounds are derived.

pith-pipeline@v0.9.0 · 5657 in / 1284 out tokens · 57348 ms · 2026-05-22T07:28:13.867856+00:00 · methodology


