arxiv: 2604.23098 · v1 · submitted 2026-04-25 · 💻 cs.CE

Recognition: unknown

In-context modeling as a retrain-free paradigm for foundation models in computational science

Lingfeng Li, Zhuoyuan Li, Shun Li, Kaixin Zhan, Huajian Gao, Changqing Chen, Liu Yang


Pith reviewed 2026-05-08 07:03 UTC · model grok-4.3

classification 💻 cs.CE

keywords: in-context modeling, foundation models, computational mechanics, hyperelasticity, physics-informed learning, retrain-free, finite element method, scaling behavior


The pith

A single model can generalize across unseen materials, geometries, and loading conditions by treating measurements as physical context for single-pass inference.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper introduces In-Context Modeling (ICM) as a retrain-free approach for foundation models in computational science. Rather than learning fixed parameters for each system, the model is trained in a physics-informed, label-free manner on governing equations so it can take observational fields as context and infer relationships directly. Demonstrated on hyperelasticity, the model integrates with finite-element simulations and matches experimental full-field measurements. Performance improves with greater data diversity and compute budget, showing scaling behavior similar to large language models. If the approach holds, physical modeling shifts from repeated system-specific training to reusable in-context inference across many problems.

Core claim

In-Context Modeling (ICM) recasts physical modeling as in-context inference: a model trained label-free using governing equations assimilates measurements as physical context and performs accurate inference for arbitrary unseen systems through a single forward pass. This produces a single model that generalizes across materials, geometries, and loading conditions, as shown for hyperelasticity where it integrates with finite-element codes and agrees with experimental data. The method exhibits scaling behavior where performance rises with more diverse training data and larger compute budgets.
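As a rough illustration of what "inference through context rather than fixed parameters" can mean, the toy sketch below uses a softmax-attention (kernel regression) readout over context samples. It is not the paper's architecture or training procedure, which the abstract does not describe. The function `in_context_predict`, the `bandwidth` parameter, and the use of paired stretch and stress samples as context are all assumptions made for illustration; the paper's context is full-field measurements and its training is label-free. The stress curve is the standard incompressible neo-Hookean uniaxial relation, used here only to generate two synthetic materials.

```python
import numpy as np

def neo_hookean_uniaxial(stretch, mu):
    """Nominal stress of an incompressible neo-Hookean bar in uniaxial tension."""
    return mu * (stretch - stretch ** -2)

def in_context_predict(context_x, context_y, query_x, bandwidth=0.05):
    """Attention-style readout: the output depends on the context, nothing is retrained."""
    sq_dist = (query_x[:, None] - context_x[None, :]) ** 2
    logits = -sq_dist / (2.0 * bandwidth ** 2)
    logits -= logits.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ context_y

stretches = np.linspace(1.0, 2.0, 50)  # synthetic context locations (hypothetical)
query = np.array([1.5])

# Two hypothetical materials; only the context changes between the two calls.
soft = in_context_predict(stretches, neo_hookean_uniaxial(stretches, mu=1.0), query)
stiff = in_context_predict(stretches, neo_hookean_uniaxial(stretches, mu=5.0), query)
```

The same function serves both materials because the material-specific information enters only through the context arrays, which is the property the core claim attributes to ICM at a much larger scale.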

What carries the argument

In-Context Modeling (ICM), a mechanism that assimilates observational data fields as dynamic context for inference rather than encoding system-specific behavior in fixed parameters.

If this is right

A single pre-trained model suffices for many different physical systems without any retraining.
ICM integrates directly with existing finite-element simulation workflows.
Experimental validation on hyperelastic materials confirms that predictions match real-world full-field measurements.
Accuracy increases when training data covers more diverse conditions and when more computation is available.
This paradigm supports reusable foundation models that can be shared across computational science tasks.
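One common way to check a scaling claim like the one above is to fit a power law to error versus budget in log-log space. The sketch below assumes a relation of the form error ≈ prefactor · budget^(−exponent); the helper `fit_power_law` and all numbers are hypothetical and synthetic, not results from the paper, which reports no figures at the abstract level.

```python
import numpy as np

def fit_power_law(budget, error):
    """Least-squares fit of error ~ prefactor * budget**(-exponent) in log-log space."""
    slope, intercept = np.polyfit(np.log(budget), np.log(error), 1)
    return float(np.exp(intercept)), float(-slope)

# Synthetic, noise-free numbers for illustration only; NOT results from the paper.
budget = np.array([1e1, 1e2, 1e3, 1e4])
error = 2.0 * budget ** -0.3

prefactor, exponent = fit_power_law(budget, error)
```

A positive fitted exponent that stays stable as more budget levels are added would support the scaling claim; a curve that flattens in log-log space would not.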

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same training recipe could extend to other physics domains such as fluid flow or heat transfer by changing the governing equations supplied during training.
Engineers could employ one ICM model as a fast surrogate across many designs, avoiding repeated fitting for each new case.
A direct test would be to extend ICM to time-dependent or multi-physics problems.
The observed scaling suggests that larger and more varied datasets would produce models capable of handling increasingly complex real-world scenarios.

Load-bearing premise

That training solely on governing equations in a label-free physics-informed manner equips the model to correctly interpret new observational fields from arbitrary unseen systems and produce accurate inferences in a single pass.
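To make the premise concrete, a generic equilibrium-based, label-free objective for quasi-static hyperelasticity without body forces might look like the following. This is a sketch in the spirit of unsupervised equilibrium-based methods, not the paper's loss, which is not given in the abstract. The symbols are assumptions: $\mathbf{u}$ is the measured displacement field, $W_\theta$ a learned strain-energy density conditioned on context $\mathcal{C}$, $\mathbf{P}_\theta$ the resulting first Piola-Kirchhoff stress, $\Gamma_k$ a loaded boundary with reference normal $\mathbf{N}$, $\mathbf{R}_k$ the measured reaction force on it, and $\beta$ a weighting factor.

```latex
\begin{align}
\mathbf{F} &= \mathbf{I} + \nabla_{\mathbf{X}}\mathbf{u}, \\
\mathbf{P}_\theta &= \frac{\partial W_\theta(\mathbf{F};\,\mathcal{C})}{\partial \mathbf{F}}, \\
\mathcal{L}(\theta) &= \int_{\Omega} \left\| \nabla_{\mathbf{X}} \cdot \mathbf{P}_\theta \right\|^2 \mathrm{d}V
  + \beta \sum_{k} \left\| \int_{\Gamma_k} \mathbf{P}_\theta \mathbf{N}\, \mathrm{d}A - \mathbf{R}_k \right\|^2 .
\end{align}
```

No stress labels appear in this objective: the first term penalizes violation of equilibrium in the interior, and the second ties the stress scale to measured boundary forces. Whether such an objective pins down a unique response for systems outside the training distribution is exactly what the premise assumes.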

What would settle it

Apply the trained model to full-field measurements from a hyperelastic specimen whose material response or geometry lies outside the training distribution and verify whether its single-pass predictions match independent finite-element solutions or experimental data within acceptable error.
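The phrase "within acceptable error" needs a metric. A minimal sketch, assuming predicted and reference fields are sampled at the same points and compared with a relative L2 norm; the helper name `relative_l2_error` is hypothetical and the paper may report a different measure.

```python
import numpy as np

def relative_l2_error(predicted, reference):
    """Return ||predicted - reference|| / ||reference|| over all field entries."""
    predicted = np.asarray(predicted, dtype=float)
    reference = np.asarray(reference, dtype=float)
    if predicted.shape != reference.shape:
        raise ValueError("fields must be sampled on the same grid")
    denom = np.linalg.norm(reference)
    if denom == 0.0:
        raise ValueError("reference field is identically zero")
    return float(np.linalg.norm(predicted - reference) / denom)
```

Reporting this number separately for in-distribution and out-of-distribution specimens would make the test decisive in either direction.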

Original abstract

Building models that generalize across physical systems without retraining remains a central challenge in computational science. Here we introduce In-Context Modeling (ICM), a retrain-free paradigm that infers physical relationships directly from observational fields. Rather than encoding system-specific behavior in fixed parameters, ICM assimilates measurements as physical context and performs inference through a single forward pass. Trained in a physics-informed, label-free manner using governing equations, a single model generalizes across unseen materials, geometries, and loading conditions. Demonstrated on hyperelasticity, ICM integrates with finite-element simulations and is validated using experimental full-field measurements. Moreover, performance improves with increasing data diversity and computational budget, exhibiting favorable scaling behavior analogous to foundation models. By recasting physical modeling as in-context inference, this work establishes a transferable paradigm for retrain-free scientific learning and a foundation for scalable modeling across computational science.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The paper frames physical modeling as in-context inference for retrain-free generalization, but the abstract-level evidence leaves the core claims hard to verify.


The main point is that they train a single model on governing equations without labels, then treat new measurements as context for one-pass inference on unseen materials, geometries, and loads. This is shown on hyperelasticity, with the model plugged into finite-element simulations and checked against experimental full-field data. Performance is said to improve as data diversity and compute increase, in the style of foundation models.

That experimental validation and the scaling observation are the parts that land as concrete steps forward rather than pure speculation. The integration with existing FEM workflows also makes the idea more immediately usable than some purely data-driven alternatives.

The soft spots are the lack of any numbers on error magnitudes, no architecture or training details, and no clear breakdown of how the model actually extracts constitutive behavior from arbitrary context. The stress-test concern holds: the setup does not yet rule out that the network is performing a form of interpolation over the training distribution of strains and boundary conditions instead of recovering truly novel responses. Without those specifics it is difficult to judge whether the physics-informed, label-free training delivers unique and reliable inference or just a sophisticated lookup.

This is for readers in scientific machine learning and computational mechanics who want to explore foundation-model style approaches in physics. It deserves a serious referee because the framing is distinct enough to be worth testing and refining, even though the current version is preliminary and will need substantial technical additions to be convincing.
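The interpolation concern in the letter can be probed, crudely, by asking whether a test case even leaves the range of conditions seen in training. The sketch below flags a test sample as inside or outside the axis-aligned bounding box of the training samples. The helper `inside_training_box` and the example parameter columns are hypothetical; being inside the box does not prove the model interpolates, and being outside it is only a necessary condition for a genuine extrapolation test.

```python
import numpy as np

def inside_training_box(train_samples, test_sample):
    """True if every coordinate of test_sample lies within the per-column training range."""
    train = np.asarray(train_samples, dtype=float)
    test = np.asarray(test_sample, dtype=float)
    lower, upper = train.min(axis=0), train.max(axis=0)
    return bool(np.all((test >= lower) & (test <= upper)))

# Hypothetical training set: columns could be, e.g., a modulus and a peak stretch.
train = [[1.0, 1.2], [50.0, 1.6], [100.0, 2.0]]
in_range = inside_training_box(train, [20.0, 1.5])       # covered by training ranges
out_of_range = inside_training_box(train, [150.0, 1.5])  # first coordinate exceeds them
```

A report that pairs such a flag with per-case errors would let a reader see whether accuracy degrades once test cases leave the training ranges.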

Referee Report

2 major / 0 minor

Summary. The paper introduces In-Context Modeling (ICM), a retrain-free paradigm for foundation models in computational science. A single model is trained physics-informed and label-free on governing equations, then assimilates observational fields (displacements, strains) as context to infer physical relationships and perform predictions via one forward pass. It claims generalization across unseen materials, geometries, and loading conditions, demonstrated on hyperelasticity, with integration into finite-element simulations and validation against experimental full-field measurements. Performance improves with data diversity and computational budget, exhibiting favorable scaling.

Significance. If substantiated with concrete methods and results, the work could be significant by recasting physical modeling as in-context inference, offering a transferable, retrain-free approach that scales like foundation models and potentially unifies modeling across computational science domains.

major comments (2)

[Abstract] The central claim that 'a single model generalizes across unseen materials, geometries, and loading conditions' and is 'validated using experimental full-field measurements' is presented without quantitative results, error metrics, specific test cases, or comparisons, which is load-bearing for distinguishing true out-of-distribution generalization from interpolation within the training distribution of contexts.
[Abstract, training and inference description] The assertion that physics-informed, label-free training on governing equations enables the model to 'assimilate measurements as physical context' and output correct predictions for arbitrary unseen hyperelastic systems lacks any equations, loss formulation, architecture details, or implementation, leaving open the risk that inference remains tied to the distribution of training contexts rather than becoming a general engine.

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. We address the major comments point by point below, focusing on revisions to the abstract to provide stronger support for the central claims while preserving its concise nature.


Referee: [Abstract] The central claim that 'a single model generalizes across unseen materials, geometries, and loading conditions' and is 'validated using experimental full-field measurements' is presented without quantitative results, error metrics, specific test cases, or comparisons, which is load-bearing for distinguishing true out-of-distribution generalization from interpolation within the training distribution of contexts.

Authors: We agree that the abstract would be strengthened by including quantitative evidence to support the generalization claims. The main text already contains detailed results with error metrics, specific test cases across unseen hyperelastic materials and geometries, and comparisons to finite-element baselines and experimental data (see Results section and associated figures/tables). In the revised manuscript, we will update the abstract to incorporate representative quantitative highlights, such as observed error reductions on out-of-distribution cases and scaling trends with data diversity, to better substantiate the distinction from interpolation within training contexts.

Revision: yes

Referee: [Abstract, training and inference description] The assertion that physics-informed, label-free training on governing equations enables the model to 'assimilate measurements as physical context' and output correct predictions for arbitrary unseen hyperelastic systems lacks any equations, loss formulation, architecture details, or implementation, leaving open the risk that inference remains tied to the distribution of training contexts rather than becoming a general engine.

Authors: The abstract is designed as a high-level summary, with full technical details on the physics-informed loss, governing equation integration, architecture, and implementation provided in the Methods and Model sections of the manuscript. To address the concern that this leaves open questions about generalization versus context-specific interpolation, we will revise the abstract to briefly reference the physics-informed training objective and the single-forward-pass in-context inference process. This addition will help clarify the intended generality without exceeding abstract length constraints.

Revision: yes

Circularity Check

0 steps flagged

No significant circularity detected; derivation remains self-contained


The abstract presents ICM as trained in a physics-informed, label-free manner on governing equations, with generalization to unseen materials, geometries, and loadings demonstrated via integration with FEM and experimental validation. No equations, self-citations, or implementation details are provided that would reduce any claimed prediction to a fitted input by construction, a self-definitional mapping, or an ansatz smuggled through prior author work. The central claim of retrain-free inference is framed as an empirical outcome of the training paradigm rather than a tautology. This aligns with the default expectation that most papers exhibit no circularity when their load-bearing steps are independent of the target result.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The abstract does not identify or quantify any free parameters, axioms, or invented entities; the approach is described at a high level only.

pith-pipeline@v0.9.0 · 5464 in / 1094 out tokens · 52306 ms · 2026-05-08T07:03:20.215512+00:00 · methodology

