pith. machine review for the scientific record.

arxiv: 2604.16843 · v1 · submitted 2026-04-18 · 💻 cs.CE

Recognition: unknown

Watching Physics: the Generative Science of Matter and Motion

Ellen Kuhl, Hagen Holthusen, Kevin Linka

Pith reviewed 2026-05-10 07:28 UTC · model grok-4.3

classification 💻 cs.CE
keywords generative video models · deformation mechanics · physics simulation · kinematics · scientific inference · strain recovery · matter in motion

The pith

Generative models recover measurable physical quantities like surface strain from video when the underlying mechanics appear directly in the visible motion.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper investigates whether generative video models can extract reliable physics of matter in motion from images alone. It combines visual data with targeted experiments and high-fidelity simulations across three deformation systems that grow in complexity. Success occurs only when the relevant physics is encoded in observable kinematics; otherwise visual realism decouples from physical correctness. The work therefore frames a unified approach that turns image generation into a tool for inference and design rather than mere appearance matching.

Core claim

Using deformation mechanics as a testbed, we study rubber compression, can crushing, and cardiac motion to identify regimes in which visual learning succeeds, fails, and requires mechanistic supervision. When physics manifests in visible kinematics, generative models recover measurable quantities such as surface strain; when internal state variables dominate, visual plausibility no longer ensures physical admissibility. This convergence defines the Generative Sciences of Matter and Motion, which unifies Simulogenics, Physiogenics, and Materiogenics as physics-grounded foundation models for inference, prediction, and design.

What carries the argument

The regime distinction between visible kinematics (where strain and motion directly encode physics) and hidden internal state variables (where visual output can be plausible yet inadmissible), enforced by coupling video data to experiments and simulations.
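
For orientation, the kinematic quantities behind this distinction can be written in standard continuum-mechanics notation. The symbols are an assumption of this sketch and are not taken from the paper: X is a material point in the reference frame, u is its displacement as tracked from video, F is the deformation gradient, E is the Green-Lagrange strain, and J is the local volume ratio.

```latex
\begin{align}
\mathbf{F} &= \mathbf{I} + \frac{\partial \mathbf{u}}{\partial \mathbf{X}}, \\
\mathbf{E} &= \tfrac{1}{2}\left(\mathbf{F}^{\mathsf{T}}\mathbf{F} - \mathbf{I}\right), \\
J &= \det \mathbf{F}.
\end{align}
```

Each of these depends only on the displacement field, which is why strain can in principle be read off visible motion. Stress does not follow from these relations without a constitutive model, and that model may involve internal variables that a camera does not observe.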

If this is right

  • Generative models become usable for quantitative inference of strains and dynamics directly from video when the physics is kinematically visible.
  • In systems dominated by internal variables, additional mechanistic supervision is required to restore physical admissibility.
  • The three proposed subfields (Simulogenics, Physiogenics, Materiogenics) provide a common framework for turning visual generation into a scientific instrument.
  • Visual generation can support design loops once it is constrained to produce only admissible physical states.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same visibility criterion may apply to other time-evolving physical systems where video is abundant but internal fields are not directly observed.
  • A practical test would be to replace the current deformation examples with a new class of motion, such as fluid free-surface flow, and measure whether the success threshold still tracks kinematic visibility.
  • If the regime distinction holds, it supplies a diagnostic for deciding when purely data-driven video models can be trusted versus when hybrid physics-informed architectures are mandatory.

Load-bearing premise

That coupling visual data with experiments and high-fidelity simulations will make generative outputs physically admissible enough to support scientific inference, prediction, and design.

What would settle it

Training a generative model only on video of cardiac motion and then checking whether its predicted surface strains match independent experimental measurements or high-fidelity finite-element results within measurement error.
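
The comparison described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's pipeline: it assumes three surface markers have been tracked in a reference and a deformed frame, computes the in-plane Green-Lagrange strain of the triangle they span, and checks the mean absolute error against a reference strain and a tolerance. The function names `deformation_gradient`, `green_lagrange`, `strain_mae`, and `within_tolerance`, and the tolerance value, are hypothetical.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]
Mat2 = Tuple[Tuple[float, float], Tuple[float, float]]


def deformation_gradient(ref: Sequence[Point], cur: Sequence[Point]) -> Mat2:
    """In-plane deformation gradient F of a triangle from three tracked vertices.

    Solves F @ D_ref = D_cur, where the columns of D_ref and D_cur are the two
    edge vectors leaving vertex 0 in the reference and current frames.
    """
    (x0, y0), (x1, y1), (x2, y2) = ref
    (u0, v0), (u1, v1), (u2, v2) = cur
    a, b, c, d = x1 - x0, x2 - x0, y1 - y0, y2 - y0  # D_ref = [[a, b], [c, d]]
    p, q, r, s = u1 - u0, u2 - u0, v1 - v0, v2 - v0  # D_cur = [[p, q], [r, s]]
    det = a * d - b * c
    if det == 0.0:
        raise ValueError("reference triangle is degenerate")
    # F = D_cur @ inv(D_ref), with inv(D_ref) = [[d, -b], [-c, a]] / det
    return (
        ((p * d - q * c) / det, (q * a - p * b) / det),
        ((r * d - s * c) / det, (s * a - r * b) / det),
    )


def green_lagrange(F: Mat2) -> Mat2:
    """Green-Lagrange strain E = 0.5 * (F^T F - I)."""
    (f00, f01), (f10, f11) = F
    c00 = f00 * f00 + f10 * f10
    c01 = f00 * f01 + f10 * f11
    c11 = f01 * f01 + f11 * f11
    return ((0.5 * (c00 - 1.0), 0.5 * c01), (0.5 * c01, 0.5 * (c11 - 1.0)))


def strain_mae(E_pred: Mat2, E_ref: Mat2) -> float:
    """Mean absolute error over the four strain components."""
    diffs = [abs(E_pred[i][j] - E_ref[i][j]) for i in range(2) for j in range(2)]
    return sum(diffs) / len(diffs)


def within_tolerance(E_pred: Mat2, E_ref: Mat2, tol: float) -> bool:
    """True if the predicted strain matches the reference within tol."""
    return strain_mae(E_pred, E_ref) <= tol


if __name__ == "__main__":
    # Hypothetical marker positions: 10% stretch in x, 5% shortening in y.
    reference = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
    deformed = [(0.0, 0.0), (1.1, 0.0), (0.0, 0.95)]
    E = green_lagrange(deformation_gradient(reference, deformed))
    print(E)
```

In practice the reference strain would come from an independent measurement or a finite-element result, and the tolerance from the measurement error of that source; both are left as inputs here because the page does not state them.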

Original abstract

Can we learn the physics of matter in motion directly from images and video--and trust it? Answering this question requires integrating experiments, physics-based simulation, and data across traditionally separate disciplines. Much of this knowledge is visual and temporal rather than textual: images and videos encode structure, dynamics, and causality that equations alone cannot fully capture. Recent generative models produce compelling visual content, yet they rely on observational data and often lack physical validity. Here we show that generative video models gain scientific value when they couple visual data with experiments and high-fidelity simulations. Using deformation mechanics as a testbed, we study three systems of increasing complexity--rubber compression, can crushing, and cardiac motion--and identify regimes in which visual learning succeeds, fails, and requires mechanistic supervision. When physics manifests in visible kinematics, generative models recover measurable quantities such as surface strain; when internal state variables dominate, visual plausibility no longer ensures physical admissibility. We propose that this convergence defines a new frontier, the Generative Sciences of Matter and Motion, which unifies Simulogenics, Physiogenics, and Materiogenics. These physics-grounded foundation models can turn visual generation into a scientific instrument for inference, prediction, and design of matter in motion.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper claims that generative video models acquire scientific utility for the physics of matter in motion when visual data are coupled to experiments and high-fidelity simulations. Using three deformation systems of increasing complexity (rubber compression, can crushing, cardiac motion) as testbeds, it identifies regimes in which visible kinematics permit recovery of measurable quantities such as surface strain, while internal-state-dominated regimes render visual plausibility insufficient to guarantee physical admissibility. The manuscript proposes that this convergence defines a new interdisciplinary frontier—the Generative Sciences of Matter and Motion—unifying Simulogenics, Physiogenics, and Materiogenics as physics-grounded foundation models for inference, prediction, and design.

Significance. If the case studies demonstrate reliable recovery of physical quantities and a reproducible distinction between admissible and merely plausible outputs, the work could establish a practical framework for trustworthy generative models in mechanics and biomechanics. The explicit grounding in both experimental data and high-fidelity simulations is a clear strength, as is the regime-based analysis that supplies falsifiable criteria for when visual learning suffices for scientific use. These elements could influence the development of foundation models that support design tasks in material science and cardiac mechanics.

major comments (3)
  1. [Abstract] The central claim that 'generative models recover measurable quantities such as surface strain' when physics manifests in visible kinematics is load-bearing for the regime distinction, yet the abstract supplies no quantitative metrics, error analysis, or comparison to ground-truth strain fields from the cited experiments or simulations.
  2. [Abstract / proposal section] The unification claim rests on the newly introduced terms Simulogenics, Physiogenics, and Materiogenics, but these are presented without explicit definitions, scope boundaries, or differentiation from existing physics-informed generative modeling approaches, rendering the 'new frontier' assertion difficult to evaluate.
  3. [Cardiac motion case study] The assertion that internal state variables cause visual plausibility to fail to ensure admissibility is central to the failure-regime identification, yet no concrete failure examples, comparison against high-fidelity simulation outputs, or quantitative admissibility metric is referenced.
minor comments (2)
  1. [Abstract] The long sentence beginning 'Here we show that generative video models gain scientific value...' could be split for readability.
  2. [Throughout] Consider adding a short table or diagram that maps the three systems to the success/failure regimes and the required level of mechanistic supervision.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their constructive review and for identifying specific areas where the manuscript can be clarified and strengthened. We address each major comment below and commit to revisions that directly respond to the concerns raised.

Point-by-point responses
  1. Referee: [Abstract] The central claim that 'generative models recover measurable quantities such as surface strain' when physics manifests in visible kinematics is load-bearing for the regime distinction, yet the abstract supplies no quantitative metrics, error analysis, or comparison to ground-truth strain fields from the cited experiments or simulations.

    Authors: We agree that the abstract should include quantitative support for this claim to make the regime distinction immediately evaluable. In the revised version we will incorporate specific metrics from the rubber compression and can crushing studies, including mean absolute error in surface strain recovery relative to digital image correlation measurements and finite-element ground truth, along with a brief error analysis and direct comparisons.
    Revision: yes

  2. Referee: [Abstract / proposal section] The unification claim rests on the newly introduced terms Simulogenics, Physiogenics, and Materiogenics, but these are presented without explicit definitions, scope boundaries, or differentiation from existing physics-informed generative modeling approaches, rendering the 'new frontier' assertion difficult to evaluate.

    Authors: We accept that the new terminology requires explicit grounding. We will add concise definitions and scope statements for each term in the proposal section and will differentiate them from prior physics-informed generative methods (e.g., those enforcing PDE residuals or conservation laws) to clarify the distinct contribution of coupling visual generation with experimental and simulation data.
    Revision: yes

  3. Referee: [Cardiac motion case study] The assertion that internal state variables cause visual plausibility to fail to ensure admissibility is central to the failure-regime identification, yet no concrete failure examples, comparison against high-fidelity simulation outputs, or quantitative admissibility metric is referenced.

    Authors: This observation is correct and points to a needed strengthening of the cardiac case. We will revise the section to present concrete examples of visually plausible yet inadmissible outputs (e.g., violations of myocardial incompressibility), direct side-by-side comparisons with high-fidelity electromechanical simulation results, and quantitative admissibility metrics such as divergence error norms and strain-energy deviations to substantiate the failure regime.
    Revision: yes
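
One way to make the admissibility metric in the third response concrete is a volume-preservation check. The rebuttal names violations of myocardial incompressibility and divergence error norms but does not specify a formula, so the following is only an assumed stand-in: it takes deformation gradients sampled from a generated motion and reports the root-mean-square deviation of the volume ratio det F from 1. The names `det3` and `volume_ratio_rms_error` are hypothetical.

```python
import math
from typing import Iterable, Sequence

Mat3 = Sequence[Sequence[float]]


def det3(F: Mat3) -> float:
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = F
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def volume_ratio_rms_error(gradients: Iterable[Mat3]) -> float:
    """RMS deviation of J = det F from 1 over sampled deformation gradients.

    A value of zero means every sample preserves volume exactly; larger values
    indicate stronger violation of incompressibility.
    """
    squared = [(det3(F) - 1.0) ** 2 for F in gradients]
    if not squared:
        raise ValueError("need at least one deformation gradient")
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    # Hypothetical samples: a volume-preserving stretch and a uniform shrink.
    isochoric = [[1.25, 0.0, 0.0], [0.0, 0.8, 0.0], [0.0, 0.0, 1.0]]
    shrinking = [[0.9, 0.0, 0.0], [0.0, 0.9, 0.0], [0.0, 0.0, 0.9]]
    print(volume_ratio_rms_error([isochoric, shrinking]))
```

A check like this needs the full three-dimensional deformation gradient, which a single surface video does not provide. That gap is the reason the cardiac case depends on simulation or volumetric imaging for its ground truth.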

Circularity Check

0 steps flagged

No significant circularity identified

Full rationale

The paper's derivation rests on empirical case studies of three deformation systems (rubber compression, can crushing, cardiac motion) that distinguish visible-kinematics regimes from internal-state-dominated regimes. These distinctions are drawn directly from coupling visual data to experiments and high-fidelity simulations as external ground truth, without any reduction of outputs to inputs by construction, fitted parameters renamed as predictions, or load-bearing self-citations. The proposal of unifying terms (Simulogenics, Physiogenics, Materiogenics) is a naming convention for the suggested frontier rather than a self-definitional loop in which a claimed result is presupposed by its own definition. No equations or uniqueness theorems are invoked that collapse the central claim into the inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The paper introduces new unifying terminology and assumes that visible kinematics plus simulation coupling suffice for physical validity without providing the supporting derivation or data.

axioms (1)
  • domain assumption: Visual data from images and video encodes sufficient kinematic information to recover physical quantities when internal state variables do not dominate.
    Invoked to distinguish success and failure regimes in the three test systems.
invented entities (1)
  • Generative Sciences of Matter and Motion (no independent evidence)
    purpose: Unifying framework encompassing Simulogenics, Physiogenics, and Materiogenics
    New umbrella term proposed to describe the convergence of generative models with physics.

pith-pipeline@v0.9.0 · 5521 in / 1282 out tokens · 39022 ms · 2026-05-10T07:28:04.197248+00:00 · methodology

