pith. machine review for the scientific record.

arxiv: 2605.04832 · v1 · submitted 2026-05-06 · 💻 cs.LG

Recognition: unknown

Replay-Based Continual Learning for Physics-Informed Neural Operators

Authors on Pith: no claims yet

Pith reviewed 2026-05-08 17:19 UTC · model grok-4.3

classification 💻 cs.LG
keywords continual learning · physics-informed neural operators · catastrophic forgetting · replay-based learning · LoRA adaptation · out-of-distribution adaptation · physics-informed training

The pith

A replay-based strategy lets physics-informed neural operators adapt to new physical problems without forgetting prior ones and without needing labels.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper seeks to overcome catastrophic forgetting in neural operators when they encounter new out-of-distribution physical scenarios by introducing a replay mechanism that stores only a small set of past input fields. It combines this with a distillation constraint that enforces physical laws on the replayed data and a LoRA adapter for quick parameter updates on the new inputs. A sympathetic reader would care because real applications in fluid mechanics, biomechanics, and solid mechanics often require models to evolve with incoming data without repeated full retraining or access to original labeled datasets. Validation on Darcy flow, hyperelastic tumor modeling, and triply periodic minimal surface (TPMS) problems shows retained accuracy on old tasks alongside faster convergence on new ones compared with joint training.
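As an editorial sketch of how these three pieces fit together, the following update step assumes a generic operator model, a frozen copy of it acting as teacher, and a pde_residual function supplied by the problem; the names, the loss weight lam, and the exact form of the distillation term (teacher matching plus physics residual on replayed fields) are illustrative assumptions, not the authors' code.

    import torch

    def continual_step(model, teacher, new_inputs, replay_inputs,
                       pde_residual, optimizer, lam=1.0):
        # One update: physics-informed loss on new OOD inputs, plus a
        # label-free distillation constraint on replayed past inputs.
        optimizer.zero_grad()

        # Physics loss on the new inputs (no labels anywhere).
        u_new = model(new_inputs)
        loss_phys = pde_residual(u_new, new_inputs).pow(2).mean()

        # Replay: keep the student close to the frozen pre-update teacher
        # and keep the PDE satisfied on the replayed input fields.
        u_rep = model(replay_inputs)
        with torch.no_grad():
            u_old = teacher(replay_inputs)
        loss_distill = (u_rep - u_old).pow(2).mean() \
                       + pde_residual(u_rep, replay_inputs).pow(2).mean()

        loss = loss_phys + lam * loss_distill
        loss.backward()
        optimizer.step()
        return float(loss)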

Core claim

By replaying a limited number of previous input fields and applying a distillation loss based solely on physical constraints, together with a low-rank adaptation module, the method preserves performance on earlier distributions while rapidly incorporating new out-of-distribution data, all in a fully unsupervised, physics-informed setting.

What carries the argument

Replay buffer of past input fields plus distillation on physical laws, paired with LoRA for parameter-efficient adaptation to new data.
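The LoRA component is standard (reference [25]); a minimal sketch of a low-rank adapter wrapped around a frozen linear layer, with the rank r and scaling alpha as illustrative hyperparameters rather than values taken from the paper:

    import torch
    import torch.nn as nn

    class LoRALinear(nn.Module):
        # y = base(x) + (alpha / r) * x A^T B^T, with only A and B trainable.
        def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
            super().__init__()
            self.base = base
            for p in self.base.parameters():
                p.requires_grad = False  # pretrained operator weights stay frozen
            self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
            self.B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init: no change at step 0
            self.scale = alpha / r

        def forward(self, x):
            return self.base(x) + self.scale * (x @ self.A.t() @ self.B.t())

During adaptation to a new distribution only the A, B pairs train, which is what keeps per-task updates cheap.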

If this is right

  • Memory and compute costs drop relative to retraining on all accumulated data at each step.
  • The approach scales to multiple physical domains without requiring task-specific labels.
  • Training time for each new problem shortens while old-task performance remains stable.
  • The framework applies directly to operator architectures built on Transolver and similar backbones.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • If the replay size can be kept very small across many sequential tasks, the method could support lifelong learning pipelines for evolving simulation environments.
  • The reliance on input fields alone suggests possible extension to settings where only boundary conditions or geometry change over time.
  • Success here raises the question of whether similar replay-plus-distillation ideas transfer to other operator-learning architectures beyond the tested Transolver base.

Load-bearing premise

Replaying a small number of past input fields together with a distillation constraint on physical laws is sufficient to preserve prior knowledge without any labeled data or access to the original training distribution.
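To make "distillation constraint on physical laws" concrete for the Darcy case, a sketch of an interior PDE residual for -div(a grad u) = f on a uniform grid; the central-difference stencil and tensor shapes are assumptions, since the review text does not specify the paper's discretization:

    import torch

    def dx(v, h):  # central difference along the first spatial axis
        return (v[:, 2:, :] - v[:, :-2, :]) / (2 * h)

    def dy(v, h):  # central difference along the second spatial axis
        return (v[:, :, 2:] - v[:, :, :-2]) / (2 * h)

    def darcy_residual(u, a, f, h):
        # u, a, f: (batch, n, n) solution, permeability, and source fields.
        # Returns the interior residual of -div(a grad u) - f.
        px = a[:, 1:-1, :] * dx(u, h)              # flux component a * du/dx
        py = a[:, :, 1:-1] * dy(u, h)              # flux component a * du/dy
        div = dx(px, h)[:, :, 2:-2] + dy(py, h)[:, 2:-2, :]
        return -div - f[:, 2:-2, 2:-2]

A residual like this, evaluated on replayed permeability fields, is the only supervisory signal the premise allows.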

What would settle it

An experiment in which accuracy on the original physical problems falls sharply after sequential training on new out-of-distribution cases even when the replay buffer and distillation terms are active.
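Measuring that amounts to filling in a task-by-task error matrix after each stage of sequential training; a minimal harness, with train_on and eval_error as stand-ins for the paper's replay-based training loop and relative-L2 evaluation:

    def forgetting_matrix(model, tasks, train_on, eval_error):
        # err[m][j]: error on task j after training sequentially through task m.
        # Early-task errors blowing up down a column, with replay and
        # distillation active, would be the falsifying signature.
        err = []
        for task in tasks:
            train_on(model, task)
            err.append([eval_error(model, t) for t in tasks])
        return err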

Figures

Figures reproduced from arXiv: 2605.04832 by Mohammad Sadegh Eshaghi, Timon Rabczuk, Xiaoying Zhuang, Yinghua Liu, Yizheng Wang.

Figure 1: Introduction to neural operators and several training paradigms. The proposed replay-based continual learning framework is systematically validated on three representative physical problems, including the Darcy flow problem in fluid mechanics, a two-dimensional hyperelastic brain tumor problem in biomechanics, and a three-dimensional linear elastic Triply Periodic Minimal Surfaces (TPMS) problem in solid me… view at source ↗
Figure 2: Comparison between continual learning and joint training: f(x; θ_cl) and f(x; θ_joint) are the outputs of the continual learning and joint training models, respectively. Continual learning learns a sequence of contents one by one and aims to approximate the effect of joint training on all tasks simultaneously. Joint learning refers to training on all data at once. Continual learning includes five common cate… view at source ↗
Figure 3: Illustration of the four types of problems that continual learning needs to handle in operator learning, categorized by differences in physical equations and input distributions. The greater the difference, the more necessary continual learning becomes. • Different physics and different distribution: the physical equations differ, and the distribution of input X also differs, i.e., p(X_past) ≠ p(X_new) and … view at source ↗
Figure 4: Illustration of replay-based continual learning in neural operators: first, mix past and new data to form mixed data. Use the old neural operator model to predict on the mixed data and rank by error to obtain a score. Select data based on the score as the training set, which is often much smaller than the full dataset. Then retrain the neural operator using this training set. view at source ↗
Figure 5: Illustration of Supervised Fine-Tuning in neural operators: we copy the parameters of the PDE-driven pretrained model to the "teacher model" and the "SFT neural operator." The unlabeled data D_left = {X_left} are fed into both the "teacher model" and the "SFT neural operator" for distillation, constructing the "Distill loss." The labeled high-fidelity data D_sft = {X_sft, Y_sft} are fed into the "SFT neural ope… view at source ↗
Figure 6: Joint learning performance on the Darcy problem: (a) ten groups of permeability field datasets generated from ten different random distributions; (b) relative errors of data-driven trained FNO and Transolver on each dataset; (c) relative errors of PDE-driven trained FNO and Transolver on each dataset. view at source ↗
Figure 7: Comparison between PDE-driven and data-driven Transolver models for the Darcy problem: (a) relative L2 error; (b) relative H1 error. Next we demonstrate the performance of FNO and Transolver on OOD problems. We record the errors of FNO and Transolver after joint learning on {D_1, D_2, ..., D_m} and testing on dataset D_j. If j ≤ m it indicates the performance of the neural operator on ID problems, and if … view at source ↗
Figure 8: Performance on in-distribution (ID) and out-of-distribution (OOD) data for the Darcy problem. The numbers indicate relative errors. Bold numbers represent relative errors on OOD data; otherwise, they are on ID data: (a) FNO trained via data-driven; (b) FNO trained via PDE-driven; (c) Transolver trained via data-driven; (d) Transolver trained via PDE-driven. view at source ↗
Figure 9: Performance of fine-tuning on new data for the Darcy problem. Numbers indicate relative errors; bold numbers represent relative errors on new data: (a) FNO trained via data-driven; (b) FNO trained via PDE-driven; (c) Transolver trained via data-driven; (d) Transolver trained via PDE-driven. view at source ↗
Figure 10: Performance of replay-based continual learning on Darcy flow. Bold diagonal numbers correspond to relative errors on new data, white numbers correspond to relative errors on training data, and black numbers correspond to data not in the training set: (a) error map of replay-based continual learning, where the model parameters are initialized by joint learning on past data; (b) error map of replay-based co… view at source ↗
Figure 11: Distribution of the physics-based loss and relative error of the Transolver model on the Darcy flow problem during replay-based continual learning. Points with different colors represent results from different groups. The horizontal axis denotes the relative error, while the vertical axis denotes the physics-based loss. with E and ν being the Young's modulus and Poisson's ratio, respectively. Furthermore,… view at source ↗
Figure 12: Shapes of meningiomas and gliomas: (a) Magnetic Resonance Imaging (MRI) of a meningioma, data from https://www.synapse.org/Synapse:syn51514106 [35]; (b) Magnetic Resonance Imaging of a glioma, glioma data from https://www.cancerimagingarchive.net/collection/ucsf-pdgm/ [36]; (c) distribution of Young's modulus for the meningioma; (d) distribution of Young's modulus for the glioma. view at source ↗
Figure 13: Evolution trends of relative errors on the test set for the meningioma problem when training the Transolver neural operator with data-driven and PDE-driven approaches. Left: relative error of displacement; Right: relative error of Von Mises stress. it very suitable for problems with complex geometries [18]. Next, we test the performance of Transolver trained in data-driven and PDE-driven fashions, respect… view at source ↗
Figure 14: Prediction results of the PDE-driven Transolver neural operator on the meningioma problem. Rows 1 to 5: distribution of Young's modulus, absolute displacement field from FEM, predicted absolute displacement field from Transolver, Von Mises stress from FEM, and predicted Von Mises stress from Transolver. problem, we focus only on the PDE-driven approach in the following and no longer consider the data-driv… view at source ↗
Figure 15: Performance of replay-based continual learning on past data (meningioma) and new data (glioma). We can see that after replay-based continual learning, Transolver matches the FEM reference solution very well, with relative errors of only about 5%. Importantly, the prediction efficiency during inference is improved by thousands of times, as indicated by the "Time" column in… view at source ↗
Figure 16: Evolution trends of relative errors of joint learning and replay-based continual learning on past data (meningioma) and new data (glioma). Left: relative error of displacement; Right: relative error of Von Mises stress. view at source ↗
Figure 17: Effect of Supervised Fine-Tuning (SFT) on the PDE-trained Transolver neural operator. Left: relative error of displacement; Right: relative error of Von Mises stress. We select 10% of the worst-performing samples for SFT fine-tuning, and the remaining 90% of the data are used for model distillation. problem. We apply SFT on top of the model obtained after replay-based continual learning. We evaluate the … view at source ↗
Figure 18: Six different types of TPMS structures: from left to right, Schoen Gyroid, Fischer Koch S, and Schwarz Diamond. From top to bottom, "Solid-networks" and "Sheet-networks". view at source ↗
Figure 19: Prediction of replay-based continual learning with PDE-trained Transolver on "Solid-networks" TPMS. From top to bottom: Schoen Gyroid, Fischer Koch S, and Schwarz Diamond. From left to right: FEM reference solution of absolute displacement field, Transolver prediction of absolute displacement field, corresponding absolute displacement error contour, and cross-sectional error contour. uses "Sheet-networks"… view at source ↗
Figure 20: Prediction of replay-based continual learning with PDE-trained Transolver on "Sheet-networks" TPMS. From top to bottom: Schoen Gyroid, Fischer Koch S, and Schwarz Diamond. From left to right: FEM reference solution of absolute displacement field, Transolver prediction of absolute displacement field, corresponding absolute displacement error contour, and cross-sectional error contour. view at source ↗
Figure 21: Variation of relative error with iterations for joint learning and replay-based continual learning on 3D TPMS: (a) past data are "Solid-networks", new data are "Sheet-networks", relative error on "Solid-networks"; (b) past data are "Sheet-networks", new data are "Solid-networks", relative error on "Sheet-networks"; (c) past data are "Solid-networks", new data are "Sheet-networks", relative error on "Sheet… view at source ↗
Figure 22: Comparison of the data-driven and PDE-driven workflows: solid lines indicate the PDE-driven workflow; solid plus dashed lines indicate the data-driven workflow; the dashed line represents the extra steps in the data-driven workflow compared to the PDE-driven one. unnecessary. Since the source of data-driven methods is the PDEs, the PDEs constitute a complete source of all data. Therefore, it is entirely f… view at source ↗
Figure 23: Schematic diagram of the closed-loop computational framework: geometries, materials, and boundary conditions are processed into point cloud data and fed into the Transolver neural operator, which is then pre-trained using the PDEs. Subsequently, the neural operator undergoes supervised fine-tuning training based on the available labeled data. successfully update the model. When the new data are small, the… view at source ↗
Original abstract

Neural operators generally demonstrate strong predictive performance on in-distribution (ID) problems. However, a critical limitation of existing methods is their significant performance degradation when encountering out-of-distribution (OOD) data. To address this issue, this work introduces continual learning into physics-informed neural operators, with particular emphasis on neural operators built upon the Transolver architecture, and proposes a simple yet effective replay-based continual learning strategy. The proposed method is fully physics-informed and does not require labeled data, relying solely on input fields together with physical constraints for training. When new OOD data become available, a small number of past data are incorporated through a distillation-based constraint to preserve previously acquired knowledge and alleviate catastrophic forgetting. Meanwhile, a transfer learning LoRA is employed to enable rapid adaptation to the new data. The proposed framework is systematically validated on three representative physical problems, including the Darcy flow problem in fluid mechanics, a two-dimensional hyperelastic brain tumor problem in biomechanics, and a three-dimensional linear elastic Triply Periodic Minimal Surfaces problem in solid mechanics. The results demonstrate that the proposed method effectively mitigates catastrophic forgetting on previously learned data while maintaining fast adaptability to new data. Compared with conventional joint training strategies, the proposed method significantly improves training efficiency while reducing additional memory usage and computational cost.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper proposes a replay-based continual learning strategy for physics-informed neural operators (PINOs) built on the Transolver architecture. When new out-of-distribution data arrives, a small replay buffer of past input fields is combined with a physics-law distillation loss (no labels required) to preserve prior knowledge and reduce catastrophic forgetting; LoRA is used for fast adaptation to the new task. The framework is evaluated on the Darcy flow problem, a 2D hyperelastic brain-tumor problem, and a 3D linear-elastic TPMS problem, with claims of effective forgetting mitigation, rapid adaptability, and lower memory/compute cost than joint retraining.

Significance. If the empirical claims hold under rigorous coverage analysis, the work would be a useful contribution to continual learning for scientific machine learning. It demonstrates a label-free, physics-constrained replay mechanism that could reduce the cost of adapting neural operators to new physical regimes, a practical bottleneck in deploying PINOs for multi-task or streaming scientific data.

major comments (2)
  1. [§3 and §4] The central claim that replaying a small number of past input fields plus PDE-residual distillation suffices to prevent operator drift (abstract and §3) rests on an unverified coverage assumption. In high-dimensional function spaces typical of Darcy, hyperelasticity, and elasticity operators, a fixed-size replay buffer chosen without diversity or importance sampling can leave large regions of the prior input manifold unconstrained; the distillation term only enforces the PDE on those few points, allowing silent deviation elsewhere. No analysis of replay selection strategy, buffer-size sensitivity, or out-of-buffer generalization is provided in the experiments (§4).
  2. [§4] The reported performance gains over joint training (abstract) are not accompanied by quantitative tables, ablation studies on replay size, or statistical significance tests in the provided description. Without these, it is impossible to assess whether the efficiency and forgetting-mitigation claims are load-bearing or merely qualitative.
minor comments (2)
  1. [§3] Notation for the distillation loss and LoRA adaptation should be introduced with explicit equations rather than high-level description.
  2. [§3] Clarify whether the replay samples are drawn from the original training distribution or generated on-the-fly; this affects reproducibility.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed comments on our manuscript. We have carefully reviewed each point and provide point-by-point responses below. We agree that additional analyses will strengthen the paper and commit to incorporating them in the revision.

Point-by-point responses
  1. Referee: [§3 and §4] The central claim that replaying a small number of past input fields plus PDE-residual distillation suffices to prevent operator drift (abstract and §3) rests on an unverified coverage assumption. In high-dimensional function spaces typical of Darcy, hyperelasticity, and elasticity operators, a fixed-size replay buffer chosen without diversity or importance sampling can leave large regions of the prior input manifold unconstrained; the distillation term only enforces the PDE on those few points, allowing silent deviation elsewhere. No analysis of replay selection strategy, buffer-size sensitivity, or out-of-buffer generalization is provided in the experiments (§4).

    Authors: We appreciate the referee highlighting the coverage assumption underlying our replay-based approach. Our framework uses a small replay buffer of past input fields together with PDE-residual distillation (no labels) to constrain operator drift, and LoRA for adaptation. In §4 we evaluate this on three problems spanning 2D and 3D domains (Darcy flow, hyperelastic brain tumor, 3D linear-elastic TPMS) and report that small buffers suffice to maintain prior-task accuracy. We acknowledge, however, that the current experiments do not include explicit sensitivity studies on buffer size, alternative selection strategies (e.g., diversity sampling), or explicit checks of generalization to inputs outside the replay set. We will add these analyses—including buffer-size ablation curves, discussion of random versus importance-based selection, and out-of-buffer evaluation—to the revised §4 to directly address the coverage concern. revision: yes

  2. Referee: [§4] The reported performance gains over joint training (abstract) are not accompanied by quantitative tables, ablation studies on replay size, or statistical significance tests in the provided description. Without these, it is impossible to assess whether the efficiency and forgetting-mitigation claims are load-bearing or merely qualitative.

    Authors: We thank the referee for this observation. While the abstract summarizes the efficiency and forgetting-mitigation results, the full §4 already contains quantitative comparisons of our method against joint training and fine-tuning baselines, reporting relative L2 errors and wall-clock training times on the three benchmark problems. To make these claims fully verifiable, we will expand §4 with (i) complete numerical tables, (ii) systematic ablations varying replay buffer size, and (iii) statistical significance testing (means and standard deviations over multiple random seeds together with appropriate hypothesis tests). These additions will be included in the revised manuscript. revision: yes
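As an editorial aid to the promised selection-strategy ablation, a sketch of the error-ranked replay selection that Figure 4 describes, next to the random baseline it would be compared against; score_fn (for instance a physics-residual norm, since no labels are available), the buffer size k, and all names are assumptions, not the authors' code:

    import random
    import torch

    def select_replay(old_model, past_inputs, new_inputs, score_fn, k,
                      strategy="error"):
        # Mix past and new input fields, score them with the old operator,
        # and keep the k highest-scoring samples as the replay training set.
        mixed = list(past_inputs) + list(new_inputs)
        if strategy == "random":
            return random.sample(mixed, k)        # baseline for the ablation
        with torch.no_grad():
            scores = [float(score_fn(old_model, x)) for x in mixed]
        order = sorted(range(len(mixed)), key=lambda i: scores[i], reverse=True)
        return [mixed[i] for i in order[:k]]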

Circularity Check

0 steps flagged

No circularity in the proposed replay-based continual learning framework

Full rationale

The paper proposes an empirical replay-based continual learning strategy for physics-informed neural operators that combines a small replay buffer with physics-law distillation and LoRA adaptation. Claims of mitigated forgetting and improved efficiency are supported by direct validation on three PDE problems (Darcy flow, hyperelasticity, elasticity) rather than any derivation that reduces by construction to its own inputs. No self-definitional loops, fitted parameters renamed as predictions, or load-bearing self-citations appear in the method description or abstract; the central premise remains an externally testable engineering approach.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The abstract relies on standard assumptions of physics-informed training and continual learning without introducing new free parameters or invented entities.

axioms (1)
  • domain assumption: Physical constraints alone, without labels, suffice to train and preserve knowledge in the continual learning setting.
    The method is described as fully physics-informed, relying solely on input fields together with physical constraints.

pith-pipeline@v0.9.0 · 5535 in / 1237 out tokens · 59347 ms · 2026-05-08T17:19:40.415023+00:00 · methodology


Reference graph

Works this paper leans on

44 extracted references · 12 canonical work pages · 3 internal anchors

  1. [1]

    S. Wang, H. Wang, P. Perdikaris, Learning the solution operator of parametric partial differential equations with physics-informed DeepONets, Science Advances 7 (40) (2021) eabi8605

  2. [2]

    H. Wang, T. Fu, Y. Du, W. Gao, K. Huang, Z. Liu, P. Chandak, S. Liu, P. Van Katwyk, A. Deac, et al., Scientific discovery in the age of artificial intelligence, Nature 620 (7972) (2023) 47–60

  3. [3]

    N. B. Kovachki, Z. Li, B. Liu, K. Azizzadenesheli, K. Bhattacharya, A. M. Stuart, A. Anandkumar, Neural operator: Learning maps between function spaces with applications to PDEs, Journal of Machine Learning Research 24 (89) (2023) 1–97

  4. [4]

    L. Lu, P. Jin, G. Pang, Z. Zhang, G. E. Karniadakis, Learning nonlinear operators via deeponet based on the universal approximation theorem of operators, Nature Machine Intelligence 3 (3) (2021) 218–229. doi:10.1038/s42256-021-00302-5

  5. [5]

    Z. Li, N. Kovachki, K. Azizzadenesheli, B. Liu, K. Bhattacharya, A. Stuart, A. Anandkumar, Fourier neural operator for parametric partial differential equations, arXiv preprint arXiv:2010.08895 (2020)

  6. [6]

    Z. Hao, Z. Wang, H. Su, C. Ying, Y. Dong, S. Liu, Z. Cheng, J. Song, J. Zhu, GNOT: A general neural operator transformer for operator learning, in: International Conference on Machine Learning, PMLR, 2023, pp. 12556–12569

  7. [7]

    H. Wu, H. Luo, H. Wang, J. Wang, M. Long, Transolver: A fast transformer solver for PDEs on general geometries, arXiv preprint arXiv:2402.02366 (2024)

  8. [8]

    L. Wang, X. Zhang, H. Su, J. Zhu, A comprehensive survey of continual learning: Theory, method and application, IEEE Transactions on Pattern Analysis and Machine Intelligence 46 (8) (2024) 5362–5383

  9. [9]

    J. Kirkpatrick, R. Pascanu, N. Rabinowitz, J. Veness, G. Desjardins, A. A. Rusu, K. Milan, J. Quan, T. Ramalho, A. Grabska-Barwinska, et al., Overcoming catastrophic forgetting in neural networks, Proceedings of the National Academy of Sciences 114 (13) (2017) 3521–3526

  10. [10]

    T. Tripura, S. Chakraborty, Neural combinatorial wavelet neural operator for catastrophic forgetting free in-context operator learning of multiple partial differential equations, Computer Physics Communications (2025) 109882

  11. [11]

    M. Elhadidy, R. M. D'Souza, A. Arzani, SLE-FNO: Single-layer extensions for task-agnostic continual learning in Fourier neural operators, arXiv preprint arXiv:2603.20410 (2026)

  12. [12]

    S. S. Menon, T. Mondal, S. Brahmachary, A. Panda, S. M. Joshi, K. Kalyanaraman, A. D. Jagtap, On scientific foundation models: Rigorous definitions, key applications, and a comprehensive survey, Neural Networks (2026) 108567

  13. [13]

    Y. Choi, S. W. Cheung, Y. Kim, P.-H. Tsai, A. N. Diaz, I. Zanardi, S. W. Chung, D. M. Copeland, C. Kendrick, W. Anderson, et al., Defining foundation models for computational science: A call for clarity and rigor, arXiv preprint arXiv:2505.22904 (2025)

  14. [14]

    Z. Hao, C. Su, S. Liu, J. Berner, C. Ying, H. Su, A. Anandkumar, J. Song, J. Zhu, DPOT: Auto-regressive denoising operator transformer for large-scale PDE pre-training, arXiv preprint arXiv:2403.03542 (2024)

  15. [15]

    H. Zhou, Y. Ma, H. Wu, H. Wang, M. Long, Unisolver: PDE-conditional transformers towards universal neural PDE solvers, arXiv preprint arXiv:2405.17527 (2024)

  16. [16]

    M. McCabe, P. Mukhopadhyay, T. Marwah, B. R.-S. Blancard, F. Rozet, C. Diaconu, L. Meyer, K. W. Wong, H. Sotoudeh, A. Bietti, et al., Walrus: A cross-domain foundation model for continuum dynamics, arXiv preprint arXiv:2511.15684 (2025)

  17. [17]

    K. Bi, L. Xie, H. Zhang, X. Chen, X. Gu, Q. Tian, Accurate medium-range global weather forecasting with 3D neural networks, Nature (2023) 1–6

  18. [18]

    Y. Wang, Z. Hao, M. S. Eshaghi, C. Anitescu, X. Zhuang, T. Rabczuk, Y. Liu, Pretrain finite element method: A pretraining and warm-start framework for pdes via physics-informed neural operators, arXiv preprint arXiv:2601.03086 (2026)

  19. [19]

    M. S. Eshaghi, C. Anitescu, M. Thombre, Y. Wang, X. Zhuang, T. Rabczuk, Variational physics-informed neural operator (VINO) for solving partial differential equations, Computer Methods in Applied Mechanics and Engineering 437 (2025) 117785

  20. [20]

    O. Al-Ketan, R. K. Abu Al-Rub, Multifunctional mechanical metamaterials based on triply periodic minimal surface lattices, Advanced Engineering Materials 21 (10) (2019) 1900524

  21. [21]

    Z. Li, H. Zheng, N. Kovachki, D. Jin, H. Chen, B. Liu, K. Azizzadenesheli, A. Anandkumar, Physics-informed neural operator for learning partial differential equations, ACM/IMS Journal of Data Science 1 (3) (2024) 1–27

  22. [22]

    Y. Wang, J. Sun, J. Bai, C. Anitescu, M. S. Eshaghi, X. Zhuang, T. Rabczuk, Y. Liu, Kolmogorov–Arnold informed neural network: A physics-informed deep learning framework for solving forward and inverse problems based on Kolmogorov–Arnold networks, Computer Methods in Applied Mechanics and Engineering 433 (2025) 117518

  23. [23]

    I. Abbes, G. Subbaraj, M. Riemer, N. Islah, B. Therien, T. Tabaru, H. Kingetsu, S. Chandar, I. Rish, Revisiting replay and gradient alignment for continual pre-training of large language models, arXiv preprint arXiv:2508.01908 (2025)

  24. [24]

    L. Yang, S. Liu, T. Meng, S. J. Osher, In-context operator learning with data prompts for differential equation problems, Proceedings of the National Academy of Sciences 120 (39) (2023) e2310142120

  25. [25]

    E. J. Hu, Y. Shen, P. Wallis, Z. Allen-Zhu, Y. Li, S. Wang, L. Wang, W. Chen, LoRA: Low-rank adaptation of large language models, arXiv preprint arXiv:2106.09685 (2021)

  26. [26]

    Y. Wang, J. Bai, M. S. Eshaghi, C. Anitescu, X. Zhuang, T. Rabczuk, Y. Liu, Transfer learning in physics-informed neural networks: full fine-tuning, lightweight fine-tuning, and low-rank adaptation, International Journal of Mechanical System Dynamics 5 (2) (2025) 212–235

  27. [27]

    G. Dong, H. Yuan, K. Lu, C. Li, M. Xue, D. Liu, W. Wang, Z. Yuan, C. Zhou, J. Zhou, How abilities in large language models are affected by supervised fine-tuning data composition, in: Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024, pp. 177–198

  28. [28]

    L. M. DeAngelis, Brain tumors, New England Journal of Medicine 344 (2) (2001) 114–123

  29. [29]

    G. Ciasca, T. E. Sassun, E. Minelli, M. Antonelli, M. Papi, A. Santoro, F. Giangaspero, R. Delfini, M. De Spirito, Nano-mechanical signature of brain tumours, Nanoscale 8 (47) (2016) 19629–19643

  30. [30]

    D. Chauvet, M. Imbault, L. Capelle, C. Demene, M. Mossad, C. Karachi, A.-L. Boch, J.-L. Gennisson, M. Tanter, In vivo measurement of brain tumor elasticity using intraoperative shear wave elastography, Ultraschall in der Medizin-European Journal of Ultrasound 37 (06) (2016) 584–590

  31. [31]

    K. Miller, K. Chinzei, G. Orssengo, P. Bednarz, Mechanical properties of brain tissue in-vivo: experiment and computer simulation, Journal of Biomechanics 33 (11) (2000) 1369–1376

  32. [32]

    A. Bunevicius, K. Schregel, R. Sinkus, A. Golby, S. Patz, MR elastography of brain tumors, NeuroImage: Clinical 25 (2020) 102109

  33. [33]

    A. Wittek, T. Hawkins, K. Miller, On the unimportance of constitutive models in computing brain deformation for image-guided surgery, Biomechanics and Modeling in Mechanobiology 8 (1) (2009) 77–84

  34. [34]

    Z. Lin, J. Bai, S. Li, X. Chen, B. Li, X.-Q. Feng, A physics-informed neural network framework for simulating creep buckling in growing viscoelastic biological tissues, Computer Methods in Applied Mechanics and Engineering 452 (2026) 118715

  35. [35]

    E. Calabrese, D. LaBella, BraTS-Men (2023). doi:10.7303/SYN51514106. URL https://repo-prod.prod.sagebase.org/repo/v1/doi/locate?id=syn51514106&type=ENTITY

  36. [36]

    E. Calabrese, J. E. Villanueva-Meyer, J. D. Rudie, A. M. Rauschecker, U. Baid, S. Bakas, S. Cha, J. T. Mongan, C. P. Hess, The University of California San Francisco preoperative diffuse glioma MRI dataset, Radiology: Artificial Intelligence 4 (6) (2022) e220058

  37. [37]

    Z. Hashin, et al., Analysis of composite materials, Journal of Applied Mechanics 50 (2) (1983) 481–505

  38. [38]

    J. Guedes, N. Kikuchi, Preprocessing and postprocessing for materials based on the homogenization method with adaptive finite element methods, Computer Methods in Applied Mechanics and Engineering 83 (2) (1990) 143–198

  39. [39]

    B. Hassani, E. Hinton, A review of homogenization and topology optimization I: homogenization theory for media with periodic structure, Computers and Structures 69 (6) (1998) 707–717

  40. [40]

    A. Harandi, H. Danesh, K. Linka, S. Reese, S. Rezaei, SPiFOL: A spectral-based physics-informed finite operator learning for prediction of mechanical behavior of microstructures, Journal of the Mechanics and Physics of Solids (2025) 106219

  41. [41]

    Y. Wang, X. Li, Z. Yan, S. Ma, J. Bai, B. Liu, X. Zhuang, T. Rabczuk, Y. Liu, A pretraining-finetuning computational framework for material homogenization, International Journal of Mechanical Sciences (2026) 111388

  42. [42]

    E. Andreassen, C. S. Andreasen, How to determine composite material properties using numerical homogenization, Computational Materials Science 83 (2014) 488–495

  43. [43]

    J. T. Oden, S. Prudhomme, Goal-oriented error estimation and adaptivity for the finite element method, Computers & Mathematics with Applications 41 (5-6) (2001) 735–756

  44. [44]

    Z. Li, D. Z. Huang, B. Liu, A. Anandkumar, Fourier neural operator with learned deformations for pdes on general geometries, Journal of Machine Learning Research 24 (388) (2023) 1–26