In-context modeling as a retrain-free paradigm for foundation models in computational science
Pith reviewed 2026-05-08 07:03 UTC · model grok-4.3
The pith
A single model can generalize across unseen materials, geometries, and loading conditions by treating measurements as physical context for single-pass inference.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
In-Context Modeling (ICM) recasts physical modeling as in-context inference: a model trained label-free using governing equations assimilates measurements as physical context and performs accurate inference for arbitrary unseen systems through a single forward pass. This produces a single model that generalizes across materials, geometries, and loading conditions, as shown for hyperelasticity where it integrates with finite-element codes and agrees with experimental data. The method exhibits scaling behavior where performance rises with more diverse training data and larger compute budgets.
What carries the argument
In-Context Modeling (ICM), a mechanism that assimilates observational data fields as dynamic context for inference rather than encoding system-specific behavior in fixed parameters.
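As a toy analogy for the contrast between fixed parameters and context-conditioned inference, the sketch below uses the textbook incompressible neo-Hookean uniaxial relation P = µ(λ − λ⁻²). It is not the paper's architecture, which this summary does not describe. The function `predict_from_context` is hypothetical: it stores no material parameter and instead infers µ from the supplied measurements in one closed-form pass. The toy also uses stretch-stress pairs as context, whereas ICM is described as consuming observational fields.

```python
from typing import Sequence, Tuple


def neo_hookean_shape(stretch: float) -> float:
    """Stretch-dependent factor of the incompressible neo-Hookean uniaxial law."""
    return stretch - stretch ** -2


def predict_from_context(
    context: Sequence[Tuple[float, float]], query_stretch: float
) -> float:
    """Predict nominal stress at query_stretch from (stretch, stress) context pairs.

    No material parameter is stored in the function: the shear modulus is
    inferred from the context by closed-form least squares, then used once.
    """
    num = sum(neo_hookean_shape(lam) * stress for lam, stress in context)
    den = sum(neo_hookean_shape(lam) ** 2 for lam, _ in context)
    if den == 0.0:
        raise ValueError("context must contain at least one stretch != 1")
    mu_inferred = num / den
    return mu_inferred * neo_hookean_shape(query_stretch)


# Two hypothetical materials, one function, no refitting step in between.
soft = [(lam, 1.0 * neo_hookean_shape(lam)) for lam in (1.1, 1.3, 1.5)]
stiff = [(lam, 8.0 * neo_hookean_shape(lam)) for lam in (1.1, 1.3, 1.5)]
soft_prediction = predict_from_context(soft, 2.0)
stiff_prediction = predict_from_context(stiff, 2.0)
```

The same callable serves both materials because everything system-specific arrives through `context`. That is the property the ICM claim rests on, here reduced to a single scalar.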
If this is right
- A single pre-trained model suffices for many different physical systems without any retraining.
- ICM integrates directly with existing finite-element simulation workflows.
- Experimental validation on hyperelastic materials confirms that predictions match real-world full-field measurements.
- Accuracy increases when training data covers more diverse conditions and when more computation is available.
- This paradigm supports reusable foundation models that can be shared across computational science tasks.
Where Pith is reading between the lines
- The same approach could handle other physics domains such as fluid flow or heat transfer by changing the governing equations supplied during training.
- Engineers could employ one ICM model as a fast surrogate across many designs, avoiding repeated fitting for each new case.
- A direct test would be to extend ICM to time-dependent or multi-physics problems.
- The observed scaling suggests that larger and more varied datasets would produce models capable of ever more complex real-world scenarios.
Load-bearing premise
That training solely on governing equations in a label-free physics-informed manner equips the model to correctly interpret new observational fields from arbitrary unseen systems and produce accurate inferences in a single pass.
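To make "label-free" concrete, here is a minimal sketch of an equilibrium-residual loss for a 1D bar made of segments in series. It assumes a neo-Hookean law and uses hypothetical names throughout; the paper's actual loss and architecture are not given in this summary. The loss uses only measured stretches, known cross-sections, and the applied force, so no stress labels appear anywhere.

```python
from typing import Sequence


def nominal_stress(mu: float, stretch: float) -> float:
    """Incompressible neo-Hookean uniaxial nominal stress (assumed law)."""
    return mu * (stretch - stretch ** -2)


def equilibrium_residual_loss(
    mu: float,
    stretches: Sequence[float],
    areas: Sequence[float],
    applied_force: float,
) -> float:
    """Mean squared force imbalance over the segments of a bar in series.

    Static equilibrium requires every segment to carry the applied force,
    so area * stress - applied_force should vanish for the right material.
    """
    residuals = [
        area * nominal_stress(mu, lam) - applied_force
        for lam, area in zip(stretches, areas)
    ]
    return sum(r * r for r in residuals) / len(residuals)


# Synthetic check: build a bar that is in equilibrium for mu_true = 3.0.
mu_true, force = 3.0, 6.0
stretches = [1.2, 1.4, 1.6]
areas = [force / nominal_stress(mu_true, lam) for lam in stretches]

loss_at_truth = equilibrium_residual_loss(mu_true, stretches, areas, force)
loss_off_truth = equilibrium_residual_loss(2.0, stretches, areas, force)
```

The residual vanishes at the true modulus and grows away from it, which is what lets a governing equation stand in for labels. Whether a network trained this way interprets genuinely unseen systems is exactly what the premise leaves open.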
What would settle it
Apply the trained model to full-field measurements from a hyperelastic specimen whose material response or geometry lies outside the training distribution and verify whether its single-pass predictions match independent finite-element solutions or experimental data within acceptable error.
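One way to phrase "within acceptable error" is a relative L2 norm between the predicted and reference fields. The sketch below is an assumption about how such a check could be scored, not the paper's metric. The field values and the `TOLERANCE` threshold are placeholders chosen only to exercise the function.

```python
import math
from typing import Sequence


def relative_l2_error(predicted: Sequence[float], reference: Sequence[float]) -> float:
    """Return ||predicted - reference||_2 / ||reference||_2 for flattened fields."""
    if len(predicted) != len(reference):
        raise ValueError("fields must have the same number of entries")
    diff_sq = sum((p - r) ** 2 for p, r in zip(predicted, reference))
    ref_sq = sum(r ** 2 for r in reference)
    if ref_sq == 0.0:
        raise ValueError("reference field is identically zero")
    return math.sqrt(diff_sq / ref_sq)


# Placeholder values only; real fields would come from ICM and a finite-element solve.
reference_field = [1.0, 2.0]
predicted_field = [1.1, 2.2]
error = relative_l2_error(predicted_field, reference_field)
TOLERANCE = 0.05  # hypothetical acceptance threshold, not from the paper
passes = error <= TOLERANCE
```

Reporting this number separately for in-distribution and out-of-distribution specimens would show whether accuracy degrades once the test case leaves the training range.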
Original abstract
Building models that generalize across physical systems without retraining remains a central challenge in computational science. Here we introduce In-Context Modeling (ICM), a retrain-free paradigm that infers physical relationships directly from observational fields. Rather than encoding system-specific behavior in fixed parameters, ICM assimilates measurements as physical context and performs inference through a single forward pass. Trained in a physics-informed, label-free manner using governing equations, a single model generalizes across unseen materials, geometries, and loading conditions. Demonstrated on hyperelasticity, ICM integrates with finite-element simulations and is validated using experimental full-field measurements. Moreover, performance improves with increasing data diversity and computational budget, exhibiting favorable scaling behavior analogous to foundation models. By recasting physical modeling as in-context inference, this work establishes a transferable paradigm for retrain-free scientific learning and a foundation for scalable modeling across computational science.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces In-Context Modeling (ICM), a retrain-free paradigm for foundation models in computational science. A single model is trained physics-informed and label-free on governing equations, then assimilates observational fields (displacements, strains) as context to infer physical relationships and perform predictions via one forward pass. It claims generalization across unseen materials, geometries, and loading conditions, demonstrated on hyperelasticity, with integration into finite-element simulations and validation against experimental full-field measurements. Performance improves with data diversity and computational budget, exhibiting favorable scaling.
Significance. If substantiated with concrete methods and results, the work could be significant by recasting physical modeling as in-context inference, offering a transferable, retrain-free approach that scales like foundation models and potentially unifies modeling across computational science domains.
Major comments (2)
- [Abstract] The central claim that 'a single model generalizes across unseen materials, geometries, and loading conditions' and is 'validated using experimental full-field measurements' is presented without quantitative results, error metrics, specific test cases, or comparisons. That evidence is needed to distinguish true out-of-distribution generalization from interpolation within the training distribution of contexts.
- [Abstract, training and inference description] The assertion that physics-informed, label-free training on governing equations enables the model to 'assimilate measurements as physical context' and output correct predictions for arbitrary unseen hyperelastic systems is made without any equations, loss formulation, architecture details, or implementation. This leaves open the risk that inference remains tied to the distribution of training contexts rather than becoming a general engine.
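One coarse way to act on the first comment is to flag which material parameters of a test case fall outside the intervals sampled during training. The sketch below assumes the uniform ranges from the Table 1 fragment quoted on this page (Pucci–Saccomandi model). The function name and the two test cases are hypothetical. An axis-aligned range check is only a proxy: it covers one constitutive family and says nothing about unseen geometries or loadings.

```python
from typing import Dict, List, Tuple

# Ranges taken from the Table 1 fragment quoted on this page
# (Pucci-Saccomandi model); Jm follows from sqrt(Jm) ~ U(4, 6).
TRAINING_RANGES: Dict[str, Tuple[float, float]] = {
    "mu": (1.0, 101.0),
    "Jm": (16.0, 36.0),
    "C2": (0.0, 100.0),
    "D": (1.0, 501.0),
}


def parameters_outside_training(
    params: Dict[str, float],
    ranges: Dict[str, Tuple[float, float]] = TRAINING_RANGES,
) -> List[str]:
    """Return the names of parameters that fall outside their training interval."""
    return [
        name
        for name, value in params.items()
        if name in ranges and not (ranges[name][0] <= value <= ranges[name][1])
    ]


inside_case = {"mu": 50.0, "Jm": 25.0, "C2": 10.0, "D": 200.0}
outside_case = {"mu": 150.0, "Jm": 25.0, "C2": 10.0, "D": 600.0}
inside_flags = parameters_outside_training(inside_case)
outside_flags = parameters_outside_training(outside_case)
```

An empty list marks a case as interpolation in parameter space. A non-empty list names the axes along which the case extrapolates, so error metrics can be reported per group.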
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on our manuscript. We address the major comments point by point below, focusing on revisions to the abstract to provide stronger support for the central claims while preserving its concise nature.
Point-by-point responses
- Referee: [Abstract] The central claim that 'a single model generalizes across unseen materials, geometries, and loading conditions' and is 'validated using experimental full-field measurements' is presented without quantitative results, error metrics, specific test cases, or comparisons. That evidence is needed to distinguish true out-of-distribution generalization from interpolation within the training distribution of contexts.
  Authors: We agree that the abstract would be strengthened by including quantitative evidence to support the generalization claims. The main text already contains detailed results with error metrics, specific test cases across unseen hyperelastic materials and geometries, and comparisons to finite-element baselines and experimental data (see Results section and associated figures/tables). In the revised manuscript, we will update the abstract to incorporate representative quantitative highlights, such as observed error reductions on out-of-distribution cases and scaling trends with data diversity, to better substantiate the distinction from interpolation within training contexts.
  Revision: yes
- Referee: [Abstract, training and inference description] The assertion that physics-informed, label-free training on governing equations enables the model to 'assimilate measurements as physical context' and output correct predictions for arbitrary unseen hyperelastic systems is made without any equations, loss formulation, architecture details, or implementation. This leaves open the risk that inference remains tied to the distribution of training contexts rather than becoming a general engine.
  Authors: The abstract is designed as a high-level summary, with full technical details on the physics-informed loss, governing equation integration, architecture, and implementation provided in the Methods and Model sections of the manuscript. To address the concern that this leaves open questions about generalization versus context-specific interpolation, we will revise the abstract to briefly reference the physics-informed training objective and the single-forward-pass in-context inference process. This addition will help clarify the intended generality without exceeding abstract length constraints.
  Revision: yes
Circularity Check
No significant circularity detected; derivation remains self-contained
Full rationale
The abstract presents ICM as trained in a physics-informed, label-free manner on governing equations, with generalization to unseen materials, geometries, and loadings demonstrated via integration with FEM and experimental validation. No equations, self-citations, or implementation details are provided that would reduce any claimed prediction to a fitted input by construction, a self-definitional mapping, or an ansatz smuggled through prior author work. The central claim of retrain-free inference is framed as an empirical outcome of the training paradigm rather than a tautology. This aligns with the default expectation that most papers exhibit no circularity when their load-bearing steps are independent of the target result.
Axiom & Free-Parameter Ledger
... within the interval [1.2, 20]. All volumetric coefficients D were sampled from ...

Table 1. Sampling distributions of the Pucci–Saccomandi model.

Parameter   Description                      Distribution
µ (MPa)     Shear modulus                    µ ∼ U(1, 101)
Jm          Chain stretch limit              √Jm ∼ U(4, 6)
C2          Second invariant coefficient     C2 ∼ U(0, 100)
D (MPa)     Volumetric penalty coefficient   D ∼ U(1, 501)

Table 2. Sampling dis...
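The Table 1 sampling distributions for the Pucci–Saccomandi model quoted above can be sketched as a small sampler. This is a minimal illustration assuming independent uniform draws. The function name, the dictionary keys, and the fixed seed are hypothetical choices, and nothing here reproduces the paper's data pipeline.

```python
import random
from typing import Dict


def sample_pucci_saccomandi(rng: random.Random) -> Dict[str, float]:
    """Draw one parameter set using the uniform ranges quoted in Table 1."""
    sqrt_jm = rng.uniform(4.0, 6.0)  # the table samples sqrt(Jm), not Jm itself
    return {
        "mu": rng.uniform(1.0, 101.0),   # shear modulus, MPa
        "Jm": sqrt_jm ** 2,              # chain stretch limit
        "C2": rng.uniform(0.0, 100.0),   # second invariant coefficient
        "D": rng.uniform(1.0, 501.0),    # volumetric penalty coefficient, MPa
    }


rng = random.Random(0)  # fixed seed is an arbitrary choice for reproducibility
samples = [sample_pucci_saccomandi(rng) for _ in range(1000)]
```

Because the table draws √Jm uniformly, Jm itself is not uniform on [16, 36]: its density decreases with Jm, so lower chain stretch limits are sampled more often.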