Watching Physics: the Generative Science of Matter and Motion
Pith reviewed 2026-05-10 07:28 UTC · model grok-4.3
The pith
Generative models recover measurable physical quantities like surface strain from video when the underlying mechanics appear directly in the visible motion.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Using deformation mechanics as a testbed, we study rubber compression, can crushing, and cardiac motion to identify regimes in which visual learning succeeds, fails, and requires mechanistic supervision. When physics manifests in visible kinematics, generative models recover measurable quantities such as surface strain; when internal state variables dominate, visual plausibility no longer ensures physical admissibility. This convergence defines the Generative Sciences of Matter and Motion, which unifies Simulogenics, Physiogenics, and Materiogenics as physics-grounded foundation models for inference, prediction, and design.
What carries the argument
The regime distinction between visible kinematics (where strain and motion directly encode physics) and hidden internal state variables (where visual output can be plausible yet inadmissible), enforced by coupling video data to experiments and simulations.
If this is right
- Generative models become usable for quantitative inference of strains and dynamics directly from video when the physics is kinematically visible.
- In systems dominated by internal variables, additional mechanistic supervision is required to restore physical admissibility.
- The three proposed subfields (Simulogenics, Physiogenics, Materiogenics) provide a common framework for turning visual generation into a scientific instrument.
- Visual generation can support design loops once it is constrained to produce only admissible physical states.
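As a rough illustration of what recovering surface strain from visible kinematics involves, the sketch below computes in-plane Green-Lagrange strain from a tracked displacement field. It is a minimal example, not the paper's pipeline: the function name `green_lagrange_strain`, the regular pixel grid, and the finite-difference gradients are all assumptions made here for illustration.

```python
import numpy as np


def green_lagrange_strain(ux, uy, spacing=1.0):
    """Return (Exx, Eyy, Exy) for displacements sampled on a regular grid.

    ux, uy: 2D arrays indexed [row (y), column (x)].
    spacing: grid spacing, assumed equal in x and y (illustrative assumption).
    """
    # np.gradient returns the derivative along axis 0 (y) first, then axis 1 (x).
    dux_dy, dux_dx = np.gradient(ux, spacing)
    duy_dy, duy_dx = np.gradient(uy, spacing)

    # Deformation gradient F = I + grad(u).
    f11, f12 = 1.0 + dux_dx, dux_dy
    f21, f22 = duy_dx, 1.0 + duy_dy

    # Green-Lagrange strain E = 0.5 * (F^T F - I).
    exx = 0.5 * (f11 * f11 + f21 * f21 - 1.0)
    eyy = 0.5 * (f12 * f12 + f22 * f22 - 1.0)
    exy = 0.5 * (f11 * f12 + f21 * f22)
    return exx, eyy, exy


# Hypothetical input: a uniform 10% stretch in x and 5% compression in y.
y, x = np.mgrid[0:5, 0:6].astype(float)
exx, eyy, exy = green_lagrange_strain(0.1 * x, -0.05 * y)
print(exx.mean(), eyy.mean(), exy.mean())
```

For the uniform stretch used here, Exx equals 0.5 * (1.1**2 - 1) = 0.105 everywhere, which gives a quick sanity check before applying the same function to displacements tracked from real or generated video.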
Where Pith is reading between the lines
- The same visibility criterion may apply to other time-evolving physical systems where video is abundant but internal fields are not directly observed.
- A practical test would be to replace the current deformation examples with a new class of motion, such as fluid free-surface flow, and measure whether the success threshold still tracks kinematic visibility.
- If the regime distinction holds, it supplies a diagnostic for deciding when purely data-driven video models can be trusted versus when hybrid physics-informed architectures are mandatory.
Load-bearing premise
That coupling visual data with experiments and high-fidelity simulations will make generative outputs physically admissible enough to support scientific inference, prediction, and design.
What would settle it
Training a generative model only on video of cardiac motion and then checking whether its predicted surface strains match independent experimental measurements or high-fidelity finite-element results within measurement error.
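One way to make this test concrete is a pointwise comparison of predicted and reference strain fields against a stated measurement error. The sketch below is hypothetical: the function name `strain_agreement`, the tolerance, and the sample numbers are illustrative assumptions, not values from the paper.

```python
import numpy as np


def strain_agreement(predicted, reference, measurement_error):
    """Summarize how closely predicted strains match a reference field.

    predicted, reference: arrays of the same shape (strain values).
    measurement_error: absolute tolerance, in the same units as the strains.
    """
    predicted = np.asarray(predicted, dtype=float)
    reference = np.asarray(reference, dtype=float)
    abs_err = np.abs(predicted - reference)
    return {
        "mean_abs_error": float(abs_err.mean()),
        "max_abs_error": float(abs_err.max()),
        # Share of sample points whose error is within the tolerance.
        "fraction_within_error": float((abs_err <= measurement_error).mean()),
    }


# Made-up numbers purely to show the call; two of three points pass.
report = strain_agreement([0.10, 0.20, 0.30], [0.11, 0.20, 0.35], 0.02)
print(report)
```

Reporting the fraction of points within tolerance alongside the mean error helps distinguish a model that is slightly off everywhere from one that is accurate in most regions but fails locally.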
Original abstract
Can we learn the physics of matter in motion directly from images and video--and trust it? Answering this question requires integrating experiments, physics-based simulation, and data across traditionally separate disciplines. Much of this knowledge is visual and temporal rather than textual: images and videos encode structure, dynamics, and causality that equations alone cannot fully capture. Recent generative models produce compelling visual content, yet they rely on observational data and often lack physical validity. Here we show that generative video models gain scientific value when they couple visual data with experiments and high-fidelity simulations. Using deformation mechanics as a testbed, we study three systems of increasing complexity--rubber compression, can crushing, and cardiac motion--and identify regimes in which visual learning succeeds, fails, and requires mechanistic supervision. When physics manifests in visible kinematics, generative models recover measurable quantities such as surface strain; when internal state variables dominate, visual plausibility no longer ensures physical admissibility. We propose that this convergence defines a new frontier, the Generative Sciences of Matter and Motion, which unifies Simulogenics, Physiogenics, and Materiogenics. These physics-grounded foundation models can turn visual generation into a scientific instrument for inference, prediction, and design of matter in motion.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that generative video models acquire scientific utility for the physics of matter in motion when visual data are coupled to experiments and high-fidelity simulations. Using three deformation systems of increasing complexity (rubber compression, can crushing, cardiac motion) as testbeds, it identifies regimes in which visible kinematics permit recovery of measurable quantities such as surface strain, while internal-state-dominated regimes render visual plausibility insufficient to guarantee physical admissibility. The manuscript proposes that this convergence defines a new interdisciplinary frontier—the Generative Sciences of Matter and Motion—unifying Simulogenics, Physiogenics, and Materiogenics as physics-grounded foundation models for inference, prediction, and design.
Significance. If the case studies demonstrate reliable recovery of physical quantities and a reproducible distinction between admissible and merely plausible outputs, the work could establish a practical framework for trustworthy generative models in mechanics and biomechanics. The explicit grounding in both experimental data and high-fidelity simulations is a clear strength, as is the regime-based analysis that supplies falsifiable criteria for when visual learning suffices for scientific use. These elements could influence the development of foundation models that support design tasks in materials science and cardiac mechanics.
major comments (3)
- [Abstract] The central claim that 'generative models recover measurable quantities such as surface strain' when physics manifests in visible kinematics is load-bearing for the regime distinction, yet the abstract supplies no quantitative metrics, error analysis, or comparison to ground-truth strain fields from the cited experiments or simulations.
- [Abstract / proposal section] The unification claim rests on the newly introduced terms Simulogenics, Physiogenics, and Materiogenics, but these are presented without explicit definitions, scope boundaries, or differentiation from existing physics-informed generative modeling approaches, rendering the 'new frontier' assertion difficult to evaluate.
- [Cardiac motion case study] The assertion that internal state variables cause visual plausibility to fail to ensure admissibility is central to the failure-regime identification, yet no concrete failure examples, comparison against high-fidelity simulation outputs, or quantitative admissibility metric is referenced.
minor comments (2)
- [Abstract] The long sentence beginning 'Here we show that generative video models gain scientific value...' could be split for readability.
- [Throughout] Consider adding a short table or diagram that maps the three systems to the success/failure regimes and the required level of mechanistic supervision.
Simulated Author's Rebuttal
We thank the referee for their constructive review and for identifying specific areas where the manuscript can be clarified and strengthened. We address each major comment below and commit to revisions that directly respond to the concerns raised.
Point-by-point responses
- Referee: [Abstract] The central claim that 'generative models recover measurable quantities such as surface strain' when physics manifests in visible kinematics is load-bearing for the regime distinction, yet the abstract supplies no quantitative metrics, error analysis, or comparison to ground-truth strain fields from the cited experiments or simulations.
  Authors: We agree that the abstract should include quantitative support for this claim to make the regime distinction immediately evaluable. In the revised version, we will incorporate specific metrics from the rubber compression and can crushing studies, including mean absolute error in surface strain recovery relative to digital image correlation measurements and finite-element ground truth, along with brief error analysis and direct comparisons.
  Revision: yes
- Referee: [Abstract / proposal section] The unification claim rests on the newly introduced terms Simulogenics, Physiogenics, and Materiogenics, but these are presented without explicit definitions, scope boundaries, or differentiation from existing physics-informed generative modeling approaches, rendering the 'new frontier' assertion difficult to evaluate.
  Authors: We accept that the new terminology requires explicit grounding. We will add concise definitions and scope statements for each term in the proposal section and will differentiate them from prior physics-informed generative methods (e.g., those enforcing PDE residuals or conservation laws) to clarify the distinct contribution of coupling visual generation with experimental and simulation data.
  Revision: yes
- Referee: [Cardiac motion case study] The assertion that internal state variables cause visual plausibility to fail to ensure admissibility is central to the failure-regime identification, yet no concrete failure examples, comparison against high-fidelity simulation outputs, or quantitative admissibility metric is referenced.
  Authors: This observation is correct and points to a needed strengthening of the cardiac case. We will revise the section to present concrete examples of visually plausible yet inadmissible outputs (e.g., violations of myocardial incompressibility), direct side-by-side comparisons with high-fidelity electromechanical simulation results, and quantitative admissibility metrics such as divergence error norms and strain-energy deviations to substantiate the failure regime.
  Revision: yes
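As an illustration of what an admissibility metric for incompressibility could look like, the sketch below measures how far the volume ratio J = det F departs from 1 over a set of deformation gradients. This is an assumed formulation chosen for simplicity, not the divergence or strain-energy metrics the authors commit to; the function name `volume_ratio_error` and the sample tensors are hypothetical.

```python
import numpy as np


def volume_ratio_error(deformation_gradients):
    """Return the mean |det F - 1| over one or more 3x3 deformation gradients.

    deformation_gradients: array of shape (3, 3) or (..., 3, 3).
    A value near zero indicates volume-preserving (incompressible) motion.
    """
    f = np.asarray(deformation_gradients, dtype=float)
    j = np.linalg.det(f)  # volume ratio J at each sample point
    return float(np.mean(np.abs(j - 1.0)))


# Hypothetical samples: an isochoric stretch (J = 1) and a 10% dilation (J = 1.1).
s = 1.0 / np.sqrt(2.0)
isochoric = np.diag([2.0, s, s])
dilated = np.diag([1.1, 1.0, 1.0])
print(volume_ratio_error(isochoric))
print(volume_ratio_error(dilated))
```

A metric of this kind only becomes meaningful when the full three-dimensional deformation is available, which is the point of the failure regime: surface video alone does not determine F in the tissue interior.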
Circularity Check
No significant circularity identified
Full rationale
The paper's derivation rests on empirical case studies of three deformation systems (rubber compression, can crushing, cardiac motion) that distinguish visible-kinematics regimes from internal-state-dominated regimes. These distinctions are drawn directly from coupling visual data to experiments and high-fidelity simulations as external ground truth, without any reduction of outputs to inputs by construction, fitted parameters renamed as predictions, or load-bearing self-citations. The proposal of unifying terms (Simulogenics, Physiogenics, Materiogenics) is a naming convention for the suggested frontier rather than a self-definitional loop in which a claimed result is presupposed by its own definition. No equations or uniqueness theorems are invoked that collapse the central claim into the inputs.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: Visual data from images and video encode sufficient kinematic information to recover physical quantities when internal state variables do not dominate.
invented entities (1)
- Generative Sciences of Matter and Motion (no independent evidence)