Multiple Neural Operators Achieve Near-Optimal Rates for Multi-Task Learning

Adrien Weihs; Hayden Schaeffer

arxiv: 2605.22724 · v1 · pith:DV5JSGR2new · submitted 2026-05-21 · 💻 cs.LG · cs.NA · math.NA · stat.ML

Pith reviewed 2026-05-22 07:28 UTC · model grok-4.3

classification 💻 cs.LG · cs.NA · math.NA · stat.ML

keywords neural operators · multi-task learning · operator learning · approximation theory · statistical learning · deep learning · function approximation · minimax rates

The pith

Collections of Lipschitz operator maps can be learned jointly with multiple neural operators at near-optimal rates that match single-task learning.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that Multiple Neural Operators (MNO) achieve near-optimal approximation and statistical generalization bounds when learning collections of operators drawn from broad classes of Lipschitz multiple operator maps. Shared representations across tasks do not raise the overall cost, so multi-task operator learning obeys the same scaling laws as single-operator learning. Lower bounds establish corresponding minimax rates and a curse of parametric complexity. From a worst-case approximation-complexity perspective, a concatenated-input multi-task extension of DeepONet satisfies essentially the same asymptotic rates.

Core claim

For broad classes of Lipschitz multiple operator maps the Multiple Neural Operators architecture delivers near-optimal upper bounds on approximation error and statistical generalization; matching lower bounds prove minimax rates that exhibit a curse of parametric complexity; together these results establish that joint learning of multiple operators incurs no extra cost beyond single-operator learning and therefore follows identical scaling laws.

What carries the argument

The Multiple Neural Operators (MNO) architecture, which learns collections of operators through shared representations while respecting the Lipschitz condition on the joint map.

If this is right

Multi-task operator learning achieves the same near-optimal approximation rates as single-task operator learning.
Statistical generalization bounds remain near-optimal and identical in scaling to the single-task case.
Shared representations across tasks produce no increase in overall parametric complexity.
The MNO architecture and the multi-task DeepONet extension satisfy essentially the same asymptotic rates from a worst-case perspective.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The result suggests that joint training on several related simulation tasks could be performed with sample complexity no higher than for one task alone.
If real-world operator collections satisfy the Lipschitz condition, the bounds would justify multi-task training pipelines in scientific machine-learning applications.
The equivalence of rates invites direct empirical checks on concrete operator-learning benchmarks such as fluid flow or elasticity maps.

Load-bearing premise

The collections of target operators must belong to the broad classes of Lipschitz multiple operator maps.
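
For orientation, one common way to write a Lipschitz condition on a multiple operator map is sketched below. The notation $G[\alpha][u]$ and the spaces $W$, $U$, $V$ follow the proof fragments quoted further down this page. The constant $L$ and the additive form of the product-space distance are assumptions made for illustration, and the paper's exact class may use different norms or restrict the domain.

```latex
\[
  \bigl\| G[\alpha][u] - G[\alpha'][u'] \bigr\|_{V}
  \;\le\;
  L \,\bigl( \| \alpha - \alpha' \|_{W} + \| u - u' \|_{U} \bigr)
  \qquad \text{for all } (\alpha, u),\ (\alpha', u') \in W \times U .
\]
```

Here $\alpha$ plays the role of the task descriptor and $u$ the input function, so the condition controls variation across tasks and across inputs jointly.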

What would settle it

A measured scaling of approximation or generalization error that grows strictly faster with the number of tasks than the single-task rate would contradict the claimed equivalence.
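
A minimal sketch of how such a check might be set up, assuming error measurements are available at several sample sizes for each task count. The helper `fit_loglog_slope`, the sample sizes, the exponent, and the error curves are all hypothetical and synthetic. None of these numbers come from the paper.

```python
import numpy as np

def fit_loglog_slope(n_samples, errors):
    """Least-squares slope of log(error) against log(n_samples)."""
    x = np.log(np.asarray(n_samples, dtype=float))
    y = np.log(np.asarray(errors, dtype=float))
    slope, _intercept = np.polyfit(x, y, 1)
    return slope

# Synthetic curves for illustration only: error = C * n^(-rate) for every task count T.
n = np.array([100.0, 200.0, 400.0, 800.0, 1600.0])
rate = 0.25  # hypothetical decay exponent, not a rate from the paper
errors_by_tasks = {T: 2.0 * n ** (-rate) for T in (1, 4, 16)}

# One fitted exponent per task count; comparing them is the empirical check.
slopes = {T: fit_loglog_slope(n, err) for T, err in errors_by_tasks.items()}
```

On real measurements, fitted slopes that flatten as T grows would be evidence against the claimed equivalence, while slopes that agree across task counts would be consistent with it.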

Original abstract

We study the approximation and statistical complexity of learning collections of operators in a shared multi-task setting, with a focus on the Multiple Neural Operators (MNO) architecture. For broad classes of Lipschitz multiple operator maps, we derive near-optimal upper bounds for approximation and statistical generalization. On the lower-bound side, we establish a curse of parametric complexity and prove corresponding minimax rates. Together, these results show that shared representations across tasks do not increase the overall cost: multi-task operator learning follows the same scaling laws as single operator learning. We also compare MNO with a multi-task extension of DeepONet based on concatenated task inputs and show that, from a worst-case approximation-complexity perspective, both architectures satisfy essentially the same asymptotic rates.
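
The concatenated-input DeepONet extension mentioned in the abstract can be sketched as a forward pass. The double-sum form below follows the proof fragment of Lemma 3.21 quoted later on this page (coefficients θ, branch outputs b_k, trunk outputs τ_ℓ). Everything else is an assumption for illustration: the layer sizes, the random tanh MLPs, the one-dimensional spatial variable, and the treatment of the encoders M_W and M_U as already-applied finite-dimensional vectors. This is not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp(sizes):
    """Random-weight tanh MLP mapping (..., sizes[0]) to (..., sizes[-1])."""
    params = [(rng.normal(size=(m, n)) / np.sqrt(m), np.zeros(n))
              for m, n in zip(sizes[:-1], sizes[1:])]

    def apply(z):
        for i, (weight, bias) in enumerate(params):
            z = z @ weight + bias
            if i < len(params) - 1:  # no activation on the output layer
                z = np.tanh(z)
        return z

    return apply

# Hypothetical sizes: encoded task dimension, encoded input dimension, H branch and N trunk outputs.
d_W, d_U, H, N = 8, 16, 5, 7
branch = mlp([d_W + d_U, 32, H])  # b = (b_1, ..., b_H), fed the concatenated encodings
trunk = mlp([1, 32, N])           # tau = (tau_1, ..., tau_N), evaluated at query points x
theta = rng.normal(size=(H, N))   # mixing coefficients theta[k, l]

def deeponet_concat(alpha_enc, u_enc, x):
    """Evaluate sum_k sum_l theta[k, l] * b_k(concat(alpha_enc, u_enc)) * tau_l(x)."""
    b = branch(np.concatenate([alpha_enc, u_enc]))  # shape (H,)
    tau = trunk(x[:, None])                         # shape (len(x), N)
    return tau @ theta.T @ b                        # shape (len(x),)
```

The task encoding and the input-function encoding enter only through a single concatenated vector, which is what distinguishes this variant from architectures that process the task descriptor separately.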

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note

private letter to a colleague

The paper shows multi-task neural operators match single-task rates under Lipschitz assumptions, with new upper and lower bounds that confirm no extra cost from shared representations.

The main thing to know is that for collections of operators in broad Lipschitz classes, the MNO setup delivers near-optimal approximation and generalization rates that follow the same scaling laws as the single-task case, and a concatenated-input DeepONet variant does the same asymptotically. The matching minimax lower bounds are the part that makes the no-extra-cost claim concrete rather than just an upper-bound observation.

This is a straightforward extension of existing neural operator theory, but the multi-task tailoring and the architecture comparison are done cleanly enough to be useful. The work earns credit for stating the assumptions up front and for including the lower-bound side instead of stopping at upper bounds.

The soft spots are limited and mostly about scope. The equivalence holds only inside those Lipschitz multiple-operator maps; if the actual tasks fall outside, the scaling claim does not apply. The abstract only summarizes the results, so a referee would still need to check the lemmas for any looseness in how the statistical rates treat task correlations, and for hidden regularity conditions that make the constants work. No circularity or self-referential fitting shows up in the argument as presented.

This is for people already working on neural operators or scientific machine learning who want theoretical backing for multi-task pipelines on related PDEs. A reader who cares about whether sharing representations changes the sample complexity will find the bounds worth reading. It deserves peer review because the new multi-task-specific rates and the lower bounds give a referee something concrete to verify, even if the overall reach stays inside the subfield.

Referee Report

0 major / 3 minor

Summary. The paper studies the approximation and statistical complexity of multi-task operator learning with a focus on the Multiple Neural Operators (MNO) architecture. For broad classes of Lipschitz multiple operator maps, it derives near-optimal upper bounds on approximation and generalization error. It establishes matching minimax lower bounds that exhibit a curse of parametric complexity, showing that shared representations across tasks incur no extra cost and that multi-task operator learning obeys the same scaling laws as the single-task case. The paper also compares MNO to a concatenated-input multi-task extension of DeepONet and concludes that both architectures achieve essentially the same asymptotic rates from a worst-case perspective.

Significance. If the upper and lower bounds are correctly derived, the result would be significant for the theory of neural operators and multi-task learning. It supplies a rigorous justification that joint learning of operator collections does not inflate sample or parameter complexity beyond the single-operator baseline, which could inform architecture design in scientific machine learning applications involving families of related PDEs or dynamical systems. The explicit comparison to a multi-task DeepONet variant adds practical value by identifying two architectures with comparable worst-case guarantees.

minor comments (3)

The abstract and introduction use the phrase 'near-optimal' without immediately stating the precise rate (e.g., the dependence on the number of tasks, the Lipschitz constant, or the dimension of the input function space). Adding a one-sentence summary of the achieved rate would improve readability.
Section 2 (or the related-work subsection) would benefit from an explicit citation to the single-task operator-learning rates that the multi-task bounds are claimed to match, so that the 'same scaling laws' statement can be checked at a glance.
In the statement of the main upper-bound theorem, the dependence of the constant on the number of tasks T should be written out explicitly rather than absorbed into big-O notation, to make the 'no extra cost' claim immediately verifiable.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive and accurate summary of our work, as well as the recommendation for minor revision. We appreciate the recognition of the significance of our results on near-optimal rates for multi-task operator learning and the comparison to the multi-task DeepONet extension. We will incorporate any minor suggestions to improve clarity and presentation in the revised manuscript.

Circularity Check

0 steps flagged

No significant circularity detected

The paper derives near-optimal approximation and statistical upper bounds for collections of Lipschitz multiple operator maps, along with matching minimax lower bounds that establish shared representations incur no extra cost relative to single-task operator learning. These results rely on standard approximation theory and statistical learning arguments conditioned explicitly on the Lipschitz multiple-operator class, without reducing any claimed prediction or rate to a fitted parameter, self-defined quantity, or load-bearing self-citation. The comparison to concatenated-input multi-task DeepONet is likewise framed in terms of asymptotic rates under the same class, with no step that equates an output to its own input by construction. The derivation chain is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claims rest on the domain assumption that the operator collections are Lipschitz. No free parameters or invented entities are described in the abstract.

axioms (1)

domain assumption: Collections of operators belong to broad classes of Lipschitz multiple operator maps.
Explicitly invoked in the abstract as the setting for which near-optimal bounds are derived.

pith-pipeline@v0.9.0 · 5657 in / 1284 out tokens · 57348 ms · 2026-05-22T07:28:13.867856+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: "For broad classes of Lipschitz multiple operator maps, we derive near-optimal upper bounds for approximation and statistical generalization... multi-task operator learning follows the same scaling laws as single operator learning."

IndisputableMonolith/Foundation/AlexanderDuality.lean · alexander_duality_circle_linking
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: "We extend the lower-complexity framework of [26] to the multiple operator setting... curse of parametric complexity"

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

60 extracted references · 60 canonical work pages

[1] Ben Adcock, Michael Griebel, and Gregor Maier. The sample complexity of learning Lipschitz operators with respect to Gaussian measures, 2025.

[2] Aras Bacho, Aleksei G. Sorokin, Xianjin Yang, Théo Bourdais, Edoardo Calvello, Matthieu Darcy, Alexander Hsu, Bamdad Hosseini, and Houman Owhadi. Operator learning at machine precision, 2025.

[3] Kaushik Bhattacharya, Bamdad Hosseini, Nikola B. Kovachki, and Andrew M. Stuart. Model Reduction And Neural Networks For Parametric PDEs. The SMAI Journal of computational mathematics, 7:121–157, 2021.

[4] Yadi Cao, Yuxuan Liu, Liu Yang, Rose Yu, Hayden Schaeffer, and Stanley Osher. Vicon: Vision in-context operator networks for multi-physics fluid dynamics prediction. arXiv preprint arXiv:2411.16063, 2024.

[5] Javier Castro. The Kolmogorov infinite dimensional equation in a Hilbert space via deep learning methods. Journal of Mathematical Analysis and Applications, 527(2):127413, 2023.

[6] Javier Castro, Claudio Muñoz, and Nicolás Valenzuela. The Calderón’s problem via DeepONets. Vietnam Journal of Mathematics, 52(3):775–806, 2024.

[7] T. Chen and H. Chen. Approximations of continuous functionals by neural networks with application to dynamic systems. IEEE Transactions on Neural Networks, 4(6):910–918, 1993.

[8] Tianping Chen and Hong Chen. Universal approximation to nonlinear operators by neural networks with arbitrary activation functions and its application to dynamical systems. IEEE Transactions on Neural Networks, 6(4):911–917, 1995.

[9] Stephan Dahlke, Filippo De Mari, Philipp Grohs, and Demetrio Labate, editors. Harmonic and Applied Analysis. Applied and Numerical Harmonic Analysis. Birkhäuser, Cham, 2015.

[10] Maarten V. de Hoop, Daniel Zhengyu Huang, Elizabeth Qian, and Andrew M. Stuart. The cost-accuracy trade-off in operator learning with neural networks, 2022.
[11] D. L. Donoho. Sparse components of images and optimal atomic decompositions. Constructive Approximation, 17(3):353–382, 2001.

[12] Takashi Furuya, Michael Anthony Puthawala, Matti Lassas, and Maarten V. de Hoop. Globally injective and bijective neural operators. In Thirty-seventh Conference on Neural Information Processing Systems, 2023.

[13] Philipp Grohs, Samuel Lanthaler, and Margaret Trautner. Theory-to-practice gap for neural networks and neural operators, 2025.

[14] Maximilian Herde, Bogdan Raonic, Tobias Rohner, Roger Käppeli, Roberto Molinaro, Emmanuel de Bezenac, and Siddhartha Mishra. Poseidon: Efficient foundation models for PDEs. In The Thirty-eighth Annual Conference on Neural Information Processing Systems, 2024.

[15] Lukas Herrmann, Christoph Schwab, and Jakob Zech. Neural and spectral operator surrogates: unified construction and expression rate bounds. Advances in Computational Mathematics, 50(4):72, 2024.

[16] Daniel Zhengyu Huang, Nicholas H. Nelsen, and Margaret Trautner. An operator learning perspective on parameter-to-observable maps. Foundations of Data Science, 7(1):163–225, 2025.

[17] Pengzhan Jin, Shuai Meng, and Lu Lu. Mionet: Learning multiple-input operators via tensor product. SIAM Journal on Scientific Computing, 44(6):A3490–A3514, 2022.

[18] Derek Jollie, Jingmin Sun, Zecheng Zhang, and Hayden Schaeffer. Time-series forecasting and refinement within a multimodal PDE foundation model. Journal of Machine Learning for Modeling and Computing, 6(2):77–89, 2025.

[19] Jared Kaplan, Sam McCandlish, Tom Henighan, Tom B. Brown, Benjamin Chess, Rewon Child, Scott Gray, Alec Radford, Jeffrey Wu, and Dario Amodei. Scaling laws for neural language models, 2020.

[20] Nikola Kovachki, Samuel Lanthaler, and Siddhartha Mishra. On universal approximation and error bounds for Fourier neural operators. J. Mach. Learn. Res., 22(1), January 2021.
[21] Nikola Kovachki, Zongyi Li, Burigede Liu, Kamyar Azizzadenesheli, Kaushik Bhattacharya, Andrew Stuart, and Anima Anandkumar. Neural operator: learning maps between function spaces with applications to PDEs. J. Mach. Learn. Res., 24(1), January 2023.

[22] Nikola B. Kovachki, Samuel Lanthaler, and Hrushikesh Mhaskar. Data complexity estimates for operator learning, 2024.

[23] Nikola B. Kovachki, Samuel Lanthaler, and Andrew M. Stuart. Chapter 9 - operator learning: Algorithms and analysis. In Siddhartha Mishra and Alex Townsend, editors, Numerical Analysis Meets Machine Learning, volume 25 of Handbook of Numerical Analysis, pages 419–467. Elsevier, 2024.

[24] Samuel Lanthaler. Operator learning with PCA-Net: upper and lower complexity bounds. J. Mach. Learn. Res., 24(1), January 2023.

[25] Samuel Lanthaler, Siddhartha Mishra, and George E Karniadakis. Error estimates for DeepONets: a deep learning framework in infinite dimensions. Transactions of Mathematics and Its Applications, 6(1):tnac001, 03 2022.

[26] Samuel Lanthaler and Andrew M Stuart. The parametric complexity of operator learning. IMA Journal of Numerical Analysis, page draf028, 08 2025.

[27] Jose Antonio Lara Benitez, Takashi Furuya, Florian Faucher, Anastasis Kratsios, Xavier Tricoche, and Maarten V. de Hoop. Out-of-distributional risk bounds for neural operators with applications to the Helmholtz equation. J. Comput. Phys., 513(C), September 2024.

[28] Hao Liu, Jiahui Cheng, and Wenjing Liao. Deep neural networks are adaptive to function regularity and data distribution in approximation and estimation. Journal of Machine Learning Research, 26(213):1–56, 2025.

[29] Hao Liu, Biraj Dahal, Rongjie Lai, and Wenjing Liao. Generalization error guaranteed auto-encoder-based nonlinear model reduction for operator learning. Applied and Computational Harmonic Analysis, 74:101717, 2025.

[30] Hao Liu, Haizhao Yang, Minshuo Chen, Tuo Zhao, and Wenjing Liao. Deep nonparametric estimation of operators between infinite dimensional spaces. J. Mach. Learn. Res., 25(1), January 2024.
[31] Hao Liu, Zecheng Zhang, Wenjing Liao, and Hayden Schaeffer. Neural scaling laws of deep ReLU and deep operator network: A theoretical study, 2024.

[32] Yuxuan Liu, Jingmin Sun, Xinjie He, Griffin Pinney, Zecheng Zhang, and Hayden Schaeffer. PROSE-FD: A multimodal PDE foundation model for learning multiple operators for forecasting fluid dynamics. In Neurips 2024 Workshop Foundation Models for Science: Progress, Opportunities, and Challenges, 2024.

[33] Yuxuan Liu, Jingmin Sun, and Hayden Schaeffer. Bcat: A block causal transformer for PDE foundation models for fluid dynamics. arXiv preprint arXiv:2501.18972, 2025.

[34] Yuxuan Liu, Zecheng Zhang, and Hayden Schaeffer. Prose: Predicting multiple operators and symbolic expressions using multimodal transformers. Neural Networks, 180:106707, 2024.

[35] Lu Lu, Pengzhan Jin, Guofei Pang, Zhongqiang Zhang, and George Em Karniadakis. Learning nonlinear operators via DeepONet based on the universal approximation theorem of operators. Nature Machine Intelligence, 3(3):218–229, 2021.

[36] Lu Lu, Xuhui Meng, Shengze Cai, Zhiping Mao, Somdatta Goswami, Zhongqiang Zhang, and George Em Karniadakis. A comprehensive and fair comparison of two neural operators (with practical extensions) based on fair data. Computer Methods in Applied Mechanics and Engineering, 393:114778, 2022.

[37] Carlo Marcati and Christoph Schwab. Exponential convergence of deep operator networks for elliptic partial differential equations. SIAM Journal on Numerical Analysis, 61(3):1513–1545, 2023.

[38] Carlo Marcati and Christoph Schwab. Expression rates of neural operators for linear elliptic PDEs in polytopes. CoRR, abs/2409.17552, 2024.

[39] Michael McCabe, Bruno Régaldo-Saint Blancard, Liam Holden Parker, Ruben Ohana, Miles Cranmer, Alberto Bietti, Michael Eickenberg, Siavash Golkar, Géraud Krawezik, Francois Lanusse, et al. Multiple physics pretraining for physical surrogate models. arXiv preprint arXiv:2310.02994, 2023.

[40] Elisa Negrini, Yuxuan Liu, Liu Yang, Stanley J Osher, and Hayden Schaeffer. A multimodal PDE foundation model for prediction and scientific text descriptions. arXiv preprint arXiv:2502.06026, 2025.
[41] Philipp Petersen and Felix Voigtlaender. Optimal approximation of piecewise smooth functions using deep ReLU neural networks. Neural Networks, 108:296–330, 2018.

[42] Christoph Schwab, Andreas Stein, and Jakob Zech. Deep operator network approximation rates for Lipschitz operators. Analysis and Applications, 24(01):199–239, 2026.

[43] Jingmin Sun, Yuxuan Liu, Zecheng Zhang, and Hayden Schaeffer. Towards a foundation model for partial differential equations: Multioperator learning and extrapolation. Physical Review E, 111(3):035304, 2025.

[44] Jingmin Sun, Zecheng Zhang, and Hayden Schaeffer. Lemon: Learning to learn multi-operator networks, 2025.

[45] Zhuoyuan Wang, Hanjiang Hu, Xiyu Deng, Saviz Mowlavi, and Yorie Nakahira. Opinf-llm: Parametric PDE solving with LLMs via operator inference, 2026.

[46] Adrien Weihs and Hayden Schaeffer. Generalization bounds and statistical guarantees for multi-task and multiple operator learning with MNO networks, 2026.

[47] Adrien Weihs, Jingmin Sun, Zecheng Zhang, and Hayden Schaeffer. A deep learning framework for multi-operator learning: Architectures and approximation theory, 2025.

[48] Liu Yang, Siting Liu, Tingwei Meng, and Stanley J Osher. In-context operator learning with data prompts for differential equation problems. Proceedings of the National Academy of Sciences, 120(39):e2310142120, 2023.

[49] Liu Yang, Tingwei Meng, Siting Liu, and Stanley J Osher. Prompting in-context operator learning with sensor data, equations, and natural language. arXiv preprint arXiv:2308.05061, 2023.

[50] Zhanhong Ye, Zining Liu, Bingyang Wu, Hongjie Jiang, Leheng Chen, Minyan Zhang, Xiang Huang, Qinghe Meng Zou, Hongsheng Liu, and Bin Dong. Pdeformer-2: A versatile foundation model for two-dimensional partial differential equations. arXiv preprint arXiv:2507.15409, 2025.
[51] Benjamin J Zhang, Siting Liu, Stanley J Osher, and Markos A Katsoulakis. Probabilistic operator learning: generative modeling and uncertainty quantification for foundation models of differential equations. arXiv preprint arXiv:2509.05186, 2025.

[52] Zecheng Zhang. Modno: Multi-operator learning with distributed neural operators. Computer Methods in Applied Mechanics and Engineering, 431:117229, 2024.

[53] Zecheng Zhang, Wing Tat Leung, and Hayden Schaeffer. A discretization-invariant extension and analysis of some deep operator networks. Journal of Computational and Applied Mathematics, 456:116226, 2025.

[54] Zecheng Zhang, Hao Liu, Wenjing Liao, and Guang Lin. Coefficient-to-basis network: a fine-tunable operator learning framework for inverse problems with adaptive discretizations and theoretical guarantees. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, 383(2305):20240054, 09 2025.

[55] Zecheng Zhang, Christian Moya, Lu Lu, Guang Lin, and Hayden Schaeffer. D2no: Efficient handling of heterogeneous input function spaces with distributed deep neural operators. Computer Methods in Applied Mechanics and Engineering, 428:117084, 2024.

[56] Zecheng Zhang, Leung Wing Tat, and Hayden Schaeffer. Belnet: basis enhanced learning, a mesh-free neural operator. Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences, 479(2276):20230043, 2023.

Appendix. In this section, we present detailed proofs of all our results. A Near-Optimal Approximation Rates. Proof of Theorem 3.1. For...
[57] From part 1, we may take r = 1 to obtain G : L rG(ΩW) × L rG(ΩU) → V which is Fréchet differentiable on L rG(ΩW) × L rG(ΩU). Specifically, from the proof of Corollary 3.14, we know that G[α][u](x) = F(α)ϕ(x), where F : L rG(ΩW) → R is the Fréchet differentiable functional provided by [26, Theorem 2.11], and ϕ ∈ V is a fixed nontrivial function. Next, the proof of [26, Lemma...
[58] The lower bound in (16) is given by combining parts 1 and 2 of the theorem. The upper bound is a direct consequence of Theorem 3.1.

C An Extension of DeepONet to Multi-Task Learning

Proof of Lemma 3.21. Let $(\alpha, u) \in W \times U$ and $x \in \Omega_V$. Then $\mathrm{ev}_x \circ NN[\alpha][u] = \sum_{k=1}^{H} \sum_{\ell=1}^{N} \theta_{k\ell}\, b_k(M_W(\alpha), M_U(u))\, \tau_\ell(x) = \sum_{k=1}^{H} \Big( \sum_{\ell=1}^{N} \theta_{k\ell}\, \tau_\ell(x) \Big)\, b_k(M_W(\alpha), M_U(u)) =: \tilde{\tau}(x)^{\top} b(M_W(\alpha), M_U(u$...
[59] From part 1, we may take r = 1 to obtain G : L rG(ΩW) × L rG(ΩU) → V which is Fréchet differentiable on L rG(ΩW) × L rG(ΩU). Specifically, from the proof of [26, Corollary 2.12], we know that G[α][u](x) = F(α, u)ϕ(x), where F : L rG(ΩW) × L rG(ΩU) → R is the Fréchet differentiable functional provided by [26, Theorem 2.11], and ϕ ∈ V is a fixed nontrivial function. Next, the pr...
[60] The lower bound in (16) is given by combining parts 1 and 2 of the theorem. The upper bound is a direct consequence of Proposition 3.25 and Remark 3.26.

[1] Ben Adcock, Michael Griebel, and Gregor Maier. The sample complexity of learning lipschitz operators with respect to gaussian measures, 2025.

[2] Aras Bacho, Aleksei G. Sorokin, Xianjin Yang, Théo Bourdais, Edoardo Calvello, Matthieu Darcy, Alexander Hsu, Bamdad Hosseini, and Houman Owhadi. Operator learning at machine precision, 2025.

[3] Kaushik Bhattacharya, Bamdad Hosseini, Nikola B. Kovachki, and Andrew M. Stuart. Model Reduction And Neural Networks For Parametric PDEs. The SMAI Journal of computational mathematics, 7:121–157, 2021.

[4] Yadi Cao, Yuxuan Liu, Liu Yang, Rose Yu, Hayden Schaeffer, and Stanley Osher. Vicon: Vision in-context operator networks for multi-physics fluid dynamics prediction. arXiv preprint arXiv:2411.16063, 2024.

[5] Javier Castro. The kolmogorov infinite dimensional equation in a hilbert space via deep learning methods. Journal of Mathematical Analysis and Applications, 527(2):127413, 2023.

[6] Javier Castro, Claudio Muñoz, and Nicolás Valenzuela. The calderón’s problem via deeponets. Vietnam Journal of Mathematics, 52(3):775–806, 2024.

[7] T. Chen and H. Chen. Approximations of continuous functionals by neural networks with application to dynamic systems. IEEE Transactions on Neural Networks, 4(6):910–918, 1993.

[8] Tianping Chen and Hong Chen. Universal approximation to nonlinear operators by neural networks with arbitrary activation functions and its application to dynamical systems. IEEE Transactions on Neural Networks, 6(4):911–917, 1995.

[9] Stephan Dahlke, Filippo De Mari, Philipp Grohs, and Demetrio Labate, editors. Harmonic and Applied Analysis. Applied and Numerical Harmonic Analysis. Birkhäuser, Cham, 2015.

[10] Maarten V. de Hoop, Daniel Zhengyu Huang, Elizabeth Qian, and Andrew M. Stuart. The cost-accuracy trade-off in operator learning with neural networks, 2022.

[11] D. L. Donoho. Sparse components of images and optimal atomic decompositions. Constructive Approximation, 17(3):353–382, 2001.

[12] Takashi Furuya, Michael Anthony Puthawala, Matti Lassas, and Maarten V. de Hoop. Globally injective and bijective neural operators. In Thirty-seventh Conference on Neural Information Processing Systems, 2023.

[13] Philipp Grohs, Samuel Lanthaler, and Margaret Trautner. Theory-to-practice gap for neural networks and neural operators, 2025.

[14] Maximilian Herde, Bogdan Raonic, Tobias Rohner, Roger Käppeli, Roberto Molinaro, Emmanuel de Bezenac, and Siddhartha Mishra. Poseidon: Efficient foundation models for PDEs. In The Thirty-eighth Annual Conference on Neural Information Processing Systems, 2024.

[15] Lukas Herrmann, Christoph Schwab, and Jakob Zech. Neural and spectral operator surrogates: unified construction and expression rate bounds. Advances in Computational Mathematics, 50(4):72, 2024.

[16] Daniel Zhengyu Huang, Nicholas H. Nelsen, and Margaret Trautner. An operator learning perspective on parameter-to-observable maps. Foundations of Data Science, 7(1):163–225, 2025.

[17] Pengzhan Jin, Shuai Meng, and Lu Lu. Mionet: Learning multiple-input operators via tensor product. SIAM Journal on Scientific Computing, 44(6):A3490–A3514, 2022.

[18] Derek Jollie, Jingmin Sun, Zecheng Zhang, and Hayden Schaeffer. Time-series forecasting and refinement within a multimodal pde foundation model. Journal of Machine Learning for Modeling and Computing, 6(2):77–89, 2025.

[19] Jared Kaplan, Sam McCandlish, Tom Henighan, Tom B. Brown, Benjamin Chess, Rewon Child, Scott Gray, Alec Radford, Jeffrey Wu, and Dario Amodei. Scaling laws for neural language models, 2020.

[20] Nikola Kovachki, Samuel Lanthaler, and Siddhartha Mishra. On universal approximation and error bounds for fourier neural operators. J. Mach. Learn. Res., 22(1), January 2021.

[21] Nikola Kovachki, Zongyi Li, Burigede Liu, Kamyar Azizzadenesheli, Kaushik Bhattacharya, Andrew Stuart, and Anima Anandkumar. Neural operator: learning maps between function spaces with applications to pdes. J. Mach. Learn. Res., 24(1), January 2023.

[22] Nikola B. Kovachki, Samuel Lanthaler, and Hrushikesh Mhaskar. Data complexity estimates for operator learning, 2024.

[23] Nikola B. Kovachki, Samuel Lanthaler, and Andrew M. Stuart. Chapter 9 - operator learning: Algorithms and analysis. In Siddhartha Mishra and Alex Townsend, editors, Numerical Analysis Meets Machine Learning, volume 25 of Handbook of Numerical Analysis, pages 419–467. Elsevier, 2024.

[24] Samuel Lanthaler. Operator learning with pca-net: upper and lower complexity bounds. J. Mach. Learn. Res., 24(1), January 2023.

[25] Samuel Lanthaler, Siddhartha Mishra, and George E Karniadakis. Error estimates for deeponets: a deep learning framework in infinite dimensions. Transactions of Mathematics and Its Applications, 6(1):tnac001, 03 2022.

[26] Samuel Lanthaler and Andrew M Stuart. The parametric complexity of operator learning. IMA Journal of Numerical Analysis, page draf028, 08 2025.

[27] Jose Antonio Lara Benitez, Takashi Furuya, Florian Faucher, Anastasis Kratsios, Xavier Tricoche, and Maarten V. de Hoop. Out-of-distributional risk bounds for neural operators with applications to the helmholtz equation. J. Comput. Phys., 513(C), September 2024.

[28] Hao Liu, Jiahui Cheng, and Wenjing Liao. Deep neural networks are adaptive to function regularity and data distribution in approximation and estimation. Journal of Machine Learning Research, 26(213):1–56, 2025.

[29] Hao Liu, Biraj Dahal, Rongjie Lai, and Wenjing Liao. Generalization error guaranteed auto-encoder-based nonlinear model reduction for operator learning. Applied and Computational Harmonic Analysis, 74:101717, 2025.

[30] Hao Liu, Haizhao Yang, Minshuo Chen, Tuo Zhao, and Wenjing Liao. Deep nonparametric estimation of operators between infinite dimensional spaces. J. Mach. Learn. Res., 25(1), January 2024.

[31] Hao Liu, Zecheng Zhang, Wenjing Liao, and Hayden Schaeffer. Neural scaling laws of deep relu and deep operator network: A theoretical study, 2024.

[32] Yuxuan Liu, Jingmin Sun, Xinjie He, Griffin Pinney, Zecheng Zhang, and Hayden Schaeffer. PROSE-FD: A multimodal PDE foundation model for learning multiple operators for forecasting fluid dynamics. In Neurips 2024 Workshop Foundation Models for Science: Progress, Opportunities, and Challenges, 2024.

[33] Yuxuan Liu, Jingmin Sun, and Hayden Schaeffer. Bcat: A block causal transformer for pde foundation models for fluid dynamics. arXiv preprint arXiv:2501.18972, 2025.

[34] Yuxuan Liu, Zecheng Zhang, and Hayden Schaeffer. Prose: Predicting multiple operators and symbolic expressions using multimodal transformers. Neural Networks, 180:106707, 2024.

[35] Lu Lu, Pengzhan Jin, Guofei Pang, Zhongqiang Zhang, and George Em Karniadakis. Learning nonlinear operators via deeponet based on the universal approximation theorem of operators. Nature Machine Intelligence, 3(3):218–229, 2021.

[36] Lu Lu, Xuhui Meng, Shengze Cai, Zhiping Mao, Somdatta Goswami, Zhongqiang Zhang, and George Em Karniadakis. A comprehensive and fair comparison of two neural operators (with practical extensions) based on fair data. Computer Methods in Applied Mechanics and Engineering, 393:114778, 2022.

[37] Carlo Marcati and Christoph Schwab. Exponential convergence of deep operator networks for elliptic partial differential equations. SIAM Journal on Numerical Analysis, 61(3):1513–1545, 2023.

[38] Carlo Marcati and Christoph Schwab. Expression rates of neural operators for linear elliptic pdes in polytopes. CoRR, abs/2409.17552, 2024.

[39] Michael McCabe, Bruno Régaldo-Saint Blancard, Liam Holden Parker, Ruben Ohana, Miles Cranmer, Alberto Bietti, Michael Eickenberg, Siavash Golkar, Géraud Krawezik, Francois Lanusse, et al. Multiple physics pretraining for physical surrogate models. arXiv preprint arXiv:2310.02994, 2023.

[40] Elisa Negrini, Yuxuan Liu, Liu Yang, Stanley J Osher, and Hayden Schaeffer. A multimodal pde foundation model for prediction and scientific text descriptions. arXiv preprint arXiv:2502.06026, 2025.

[41] Philipp Petersen and Felix Voigtlaender. Optimal approximation of piecewise smooth functions using deep relu neural networks. Neural Networks, 108:296–330, 2018.

[42] Christoph Schwab, Andreas Stein, and Jakob Zech. Deep operator network approximation rates for lipschitz operators. Analysis and Applications, 24(01):199–239, 2026.

[43] Jingmin Sun, Yuxuan Liu, Zecheng Zhang, and Hayden Schaeffer. Towards a foundation model for partial differential equations: Multioperator learning and extrapolation. Physical Review E, 111(3):035304, 2025.

[44] Jingmin Sun, Zecheng Zhang, and Hayden Schaeffer. Lemon: Learning to learn multi-operator networks, 2025.

[45] Zhuoyuan Wang, Hanjiang Hu, Xiyu Deng, Saviz Mowlavi, and Yorie Nakahira. Opinf-llm: Parametric pde solving with llms via operator inference, 2026.

[46] Adrien Weihs and Hayden Schaeffer. Generalization bounds and statistical guarantees for multi-task and multiple operator learning with mno networks, 2026.

[47] Adrien Weihs, Jingmin Sun, Zecheng Zhang, and Hayden Schaeffer. A deep learning framework for multi-operator learning: Architectures and approximation theory, 2025.

[48] Liu Yang, Siting Liu, Tingwei Meng, and Stanley J Osher. In-context operator learning with data prompts for differential equation problems. Proceedings of the National Academy of Sciences, 120(39):e2310142120, 2023.

[49] Liu Yang, Tingwei Meng, Siting Liu, and Stanley J Osher. Prompting in-context operator learning with sensor data, equations, and natural language. arXiv preprint arXiv:2308.05061, 2023.

[50] Zhanhong Ye, Zining Liu, Bingyang Wu, Hongjie Jiang, Leheng Chen, Minyan Zhang, Xiang Huang, Qinghe Meng Zou, Hongsheng Liu, and Bin Dong. Pdeformer-2: A versatile foundation model for two-dimensional partial differential equations. arXiv preprint arXiv:2507.15409, 2025.

[51] Benjamin J Zhang, Siting Liu, Stanley J Osher, and Markos A Katsoulakis. Probabilistic operator learning: generative modeling and uncertainty quantification for foundation models of differential equations. arXiv preprint arXiv:2509.05186, 2025.

[52] Zecheng Zhang. Modno: Multi-operator learning with distributed neural operators. Computer Methods in Applied Mechanics and Engineering, 431:117229, 2024.

[53] Zecheng Zhang, Wing Tat Leung, and Hayden Schaeffer. A discretization-invariant extension and analysis of some deep operator networks. Journal of Computational and Applied Mathematics, 456:116226, 2025.

[54] Zecheng Zhang, Hao Liu, Wenjing Liao, and Guang Lin. Coefficient-to-basis network: a fine-tunable operator learning framework for inverse problems with adaptive discretizations and theoretical guarantees. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, 383(2305):20240054, 09 2025.

[55] Zecheng Zhang, Christian Moya, Lu Lu, Guang Lin, and Hayden Schaeffer. D2no: Efficient handling of heterogeneous input function spaces with distributed deep neural operators. Computer Methods in Applied Mechanics and Engineering, 428:117084, 2024.

[56] Zecheng Zhang, Leung Wing Tat, and Hayden Schaeffer. Belnet: basis enhanced learning, a mesh-free neural operator. Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences, 479(2276):20230043, 2023.
