Sparse probes and murky physics: a case study of interpretability challenges in a foundation model for continuum dynamics

Katherine Rosenfeld; Maike Sonnewald

arxiv: 2606.11657 · v1 · pith:LMID66D6new · submitted 2026-06-10 · 💻 cs.LG · cs.AI

Pith reviewed 2026-06-27 10:53 UTC · model grok-4.3

keywords sparse autoencoder · mechanistic interpretability · foundation model · continuum dynamics · shear flow · enstrophy · fluid dynamics emulation

The pith

A foundation model for continuum dynamics recruits SAE features in piecewise consistent but physically unaligned patterns across shear flow setups.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper applies a sparse autoencoder to a layer inside the Walrus foundation model for continuum dynamics and uses enstrophy to triage more than 20,000 features. In shear-flow test cases it compares feature activation across different numerical parameter values and finds that some features recur in similar roles. This recurrence is only partial and does not line up with conventional physical quantities such as energy or vorticity fields. Output mismatches between the emulator and direct simulation, including overly diffuse or localized structures, are traced to shifts in particular SAE features. The work points to open practical problems in ranking mechanistically relevant features and separating stable internal structure from single-layer or SAE artifacts.

Core claim

Across multiple shear-flow setups the model shows evidence of piecewise consistency in which subsets of SAE features recur in similar roles, but this structure is intermittent and does not map cleanly onto standard physical decompositions; parts of the observed discrepancies between numerical simulation and emulator outputs can be connected to changes in specific SAE feature usage.

What carries the argument

Sparse autoencoder features triaged by enstrophy in one selected layer of the Walrus foundation model for continuum dynamics.
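As a rough illustration of this triage step, the sketch below computes a mean enstrophy per snapshot and ranks features by Spearman's rank correlation between that series and each feature's total activation, in the spirit of the Figure 2 comparison. It is not the authors' code: the periodic central-difference vorticity, the 0.5⟨ω²⟩ normalization, the (ny, nx) array layout, and every function name are assumptions made for the example.

```python
import numpy as np

def enstrophy(u, v, dx, dy):
    """Mean enstrophy 0.5 * <omega^2> of a 2D velocity field.

    Assumes a doubly periodic, uniform grid with arrays shaped (ny, nx),
    axis 0 = y and axis 1 = x. These are illustrative assumptions.
    """
    dvdx = (np.roll(v, -1, axis=1) - np.roll(v, 1, axis=1)) / (2.0 * dx)
    dudy = (np.roll(u, -1, axis=0) - np.roll(u, 1, axis=0)) / (2.0 * dy)
    omega = dvdx - dudy  # vorticity, omega = dv/dx - du/dy
    return 0.5 * np.mean(omega ** 2)

def average_ranks(x):
    """Ranks starting at 1, with tied values sharing their average rank."""
    x = np.asarray(x, dtype=float)
    order = np.argsort(x, kind="stable")
    ordinal = np.empty(x.size)
    ordinal[order] = np.arange(1, x.size + 1)
    _, inverse, counts = np.unique(x, return_inverse=True, return_counts=True)
    rank_sums = np.bincount(inverse, weights=ordinal)
    return rank_sums[inverse] / counts[inverse]

def spearman(x, y):
    """Spearman's rho as the Pearson correlation of average ranks."""
    rx = average_ranks(x)
    ry = average_ranks(y)
    rx = rx - rx.mean()
    ry = ry - ry.mean()
    denom = np.sqrt(np.sum(rx ** 2) * np.sum(ry ** 2))
    # A constant series (e.g. a dead feature) has no defined correlation.
    return float(np.sum(rx * ry) / denom) if denom > 0 else float("nan")

def rank_features_by_enstrophy(enstrophy_series, activations):
    """Sort features by rho between enstrophy and per-step total activation.

    enstrophy_series: shape (T,); activations: shape (T, F).
    Returns (feature indices sorted by descending rho, rho per feature).
    """
    rho = np.array([spearman(enstrophy_series, activations[:, j])
                    for j in range(activations.shape[1])])
    sortable = np.where(np.isnan(rho), -np.inf, rho)  # undefined rho goes last
    return np.argsort(-sortable, kind="stable"), rho
```

With activations of shape (T, F), `order[:k]` from `rank_features_by_enstrophy` gives the k features most monotonically related to enstrophy. Features that never vary get an undefined correlation and are sorted last, which is one concrete way a single-metric filter can hide features from view.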

If this is right

Subsets of SAE features recur in similar roles across different shear-flow parameter values.
The observed consistency remains only piecewise and does not align with standard physical decompositions.
Some systematic output discrepancies between simulation and emulation are traceable to changes in particular SAE feature usage.
Single-layer SAE analysis leaves open how to separate stable internal structure from analysis artifacts.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Using additional or combined physical metrics for triage might expose whether low-enstrophy features carry overlooked mechanistic information.
The intermittent consistency could indicate that the model develops effective yet non-physical internal representations for continuum tasks.
Extending the same probing approach to other layers or to different foundation models would test whether these interpretability issues are widespread.

Load-bearing premise

Enstrophy supplies a sufficient and unbiased filter for selecting important SAE features from over 20,000 without missing low-enstrophy but mechanistically relevant ones or creating selection artifacts that alter the reported consistency and discrepancy patterns.

What would settle it

Repeating the triage and comparison with a different physical quantity such as integrated kinetic energy instead of enstrophy, and obtaining feature sets that map cleanly onto physical decompositions with stable roles across all setups, would falsify the claim of intermittent and non-mapping structure.
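One way to make this comparison concrete is to score every feature against each physical quantity and measure how much the top-k selections agree as k varies. The sketch below does that with a Jaccard index. The score arrays are placeholders (for example, a per-feature rank correlation with enstrophy and with kinetic energy), and the function names and the choice of Jaccard overlap are assumptions, not anything specified in the paper.

```python
import numpy as np

def top_k(scores, k):
    """Indices of the k highest-scoring features, returned as a set."""
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(-scores, kind="stable")
    return set(order[:k].tolist())

def jaccard(a, b):
    """Jaccard index |a & b| / |a | b|; two empty sets count as identical."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def overlap_curve(scores_a, scores_b, ks):
    """Top-k agreement between two triage metrics for each cutoff k."""
    return {k: jaccard(top_k(scores_a, k), top_k(scores_b, k)) for k in ks}

# Made-up scores for four features under two hypothetical metrics.
by_enstrophy = [0.9, 0.8, 0.1, 0.2]
by_kinetic_energy = [0.9, 0.1, 0.8, 0.2]
curve = overlap_curve(by_enstrophy, by_kinetic_energy, ks=[1, 2, 4])
```

A curve that stays near 1 across cutoffs would suggest the selection is insensitive to the metric; a curve that drops quickly would indicate that the retained feature set depends strongly on the choice of enstrophy.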

Figures

Figures reproduced from arXiv: 2606.11657 by Katherine Rosenfeld, Maike Sonnewald.

**Figure 2.** Comparing enstrophy, E, of Sim50 versus total feature activation for the feature with the greatest Spearman’s rank correlation coefficient (ρ = 0.85). We also show the tracer field (middle row) and enstrophy (blue) overlaid with feature-activation heatmaps (red) (bottom row).

**Figure 3.** As in Fig. 2 for feature 8245 is Sim…

**Figure 4.** As in Fig. 2 for feature 8245 is Sim…

**Figure 5.** Energy spectra for simulations Sim50 (left) and Sim3 (right) at two timesteps, near the beginning (solid line) and middle (dashed line) of the simulation. We show results from the numerical simulation (dark blue) as well as the Walrus single-step prediction (light blue).

**Figure 6.** MSE loss (left), Aux loss (center), and alive fraction (right) from our training runs for…

**Figure 7.** Enstrophy distributions per time-step and trajectory. We order the simulations by the mean…

**Figure 8.** Distribution of Spearman’s rank coefficient,…

Original abstract

Generative AI emulators are increasingly used in scientific domains where we already have strong theory, benchmarks, and physical intuition. This raises a central evaluation and interpretability question: when a foundation-style model can reproduce known continuum dynamics, what internal mechanism supports that behavior, is the internal behaviour consistent with known physics, and how does it relate to where the emulator succeeds or fails? We investigate a cross-domain foundation model for continuum dynamics, Walrus by Polymathic, using mechanistic interpretability guided by physical principles. We apply a sparse autoencoder (SAE) to probe a selected layer, and address the practical challenge of triaging a large feature set (over 20,000) using enstrophy as a physically grounded metric. As a deliberately simple testbed, we focus on shear flow and compare feature recruitment across multiple shear-flow setups, i.e. parameter values in the numerical simulation. Across setups we find evidence of piecewise consistency, with subsets of features recurring in similar roles, but this structure is intermittent and does not map cleanly onto standard physical decompositions. In parallel, direct comparisons between numerical simulation and the emulator reveal systematic output-level discrepancies, including regimes where energy/structures become too diffuse or too localized. We connect parts of these discrepancies to changes in specific SAE feature usage. Our work highlights open questions for scientific foundation models: how to robustly prioritize mechanistically meaningful features, how to separate stable structure from analysis artifacts (including single-layer and SAE limitations), and how to use established benchmarks to decide when "different" internal representations are genuinely informative rather than merely effective.
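Discrepancies of the "too diffuse or too localized" kind are usually easiest to see in an energy spectrum, as in the Figure 5 comparison of simulation and single-step prediction. A minimal sketch of a shell-summed kinetic energy spectrum for a doubly periodic 2D velocity field follows. The normalization, the integer-wavenumber binning, and the function name are assumptions for illustration and not the paper's procedure.

```python
import numpy as np

def isotropic_energy_spectrum(u, v):
    """Kinetic energy summed over integer wavenumber shells.

    u, v: arrays of shape (ny, nx) on a doubly periodic grid (assumed).
    Returns (k, E_k), where E_k sums to 0.5 * mean(u**2 + v**2).
    """
    ny, nx = u.shape
    u_hat = np.fft.fft2(u) / (nx * ny)
    v_hat = np.fft.fft2(v) / (nx * ny)
    energy_2d = 0.5 * (np.abs(u_hat) ** 2 + np.abs(v_hat) ** 2)

    # Integer wavenumbers along each axis: 0, 1, ..., -2, -1.
    kx = np.fft.fftfreq(nx, d=1.0 / nx)
    ky = np.fft.fftfreq(ny, d=1.0 / ny)
    KY, KX = np.meshgrid(ky, kx, indexing="ij")

    # Assign each mode to the nearest integer shell |k| and sum its energy.
    shell = np.rint(np.sqrt(KX ** 2 + KY ** 2)).astype(int)
    spectrum = np.bincount(shell.ravel(), weights=energy_2d.ravel())
    return np.arange(spectrum.size), spectrum
```

Computing this for a reference snapshot and for the emulator's prediction at the same step, then comparing the two curves, shows whether the emulator loses energy at high wavenumber (overly smooth output) or accumulates it there (overly sharp output).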

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

This is a narrow case study on SAE probing of one continuum model that uses enstrophy to triage features and reports intermittent consistency plus links to output errors, but the triage step looks like the weakest link.

The paper applies sparse autoencoders to a layer in the Walrus foundation model for continuum dynamics, focusing on shear flow as a testbed. The authors use enstrophy to sort through more than 20,000 features across different parameter setups and report that some features recur in similar roles, but the pattern is intermittent and does not align cleanly with standard physical decompositions. They also note output discrepancies, such as overly diffuse or overly localized structures, and tie some of those to shifts in feature usage.

What stands out is the attempt to bring a physical quantity into the feature triage instead of relying on purely statistical sorting. That choice makes sense for a scientific emulator and gives the work a clearer anchor than generic SAE applications.

The soft spot is the enstrophy filter itself. If features that matter for the actual mechanics sit at low enstrophy, the selection could miss them and make the consistency look more broken up than it really is, or connect the wrong features to the output errors. The abstract gives no numbers on how many features survive the threshold, no ablation on the cutoff, and no check that the retained set covers the relevant mechanistic space. Without those, the claimed links stay hard to evaluate.

This is for people already working on mechanistic interpretability for physics or simulation models. It surfaces practical questions about feature selection and artifact separation that others will run into. The work is preliminary but the direction is worth testing, so it deserves a serious referee rather than a desk reject.

Referee Report

2 major / 1 minor

Summary. The paper applies sparse autoencoders to a selected layer of the Walrus foundation model for continuum dynamics, using enstrophy to triage >20,000 features in shear-flow simulations across parameter values. It reports piecewise consistency in recurring feature roles that is intermittent and does not map cleanly to standard physical decompositions, while linking some output discrepancies (diffuse vs. localized structures) between simulation and emulator to changes in specific SAE feature usage.

Significance. If the central empirical observations hold after addressing triage validation, the work usefully surfaces open methodological questions for interpretability in scientific foundation models: robust prioritization of mechanistically meaningful features, separation of stable structure from single-layer/SAE artifacts, and criteria for when internal differences are informative. The deliberate choice of a simple shear-flow testbed and external physical metric (enstrophy) is a strength for grounding the analysis.

major comments (2)

[Methods] The enstrophy triage procedure (described in the methods) is load-bearing for the claims of piecewise consistency and discrepancy-feature linkages, yet the manuscript supplies no quantitative check (e.g., overlap with a non-enstrophy metric, recall of known vorticity features, or sensitivity analysis) that the threshold preserves the relevant mechanistic subspace rather than systematically excluding low-enstrophy but causally important features such as subtle boundary or gradient encodings.
[Results] The abstract and results sections state findings of 'piecewise consistency' and causal connections to output discrepancies without accompanying quantitative support (overlap fractions, statistical tests, ablation on feature subsets, or error bars), which prevents assessment of effect sizes and reproducibility of the intermittency observation.

minor comments (1)

Notation for SAE feature indices and the precise layer chosen should be defined explicitly on first use to aid reproducibility.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive comments, which identify important opportunities to strengthen the methodological grounding and quantitative presentation of our case study. We respond to each major comment below.

Referee: [Methods] The enstrophy triage procedure (described in the methods) is load-bearing for the claims of piecewise consistency and discrepancy-feature linkages, yet the manuscript supplies no quantitative check (e.g., overlap with a non-enstrophy metric, recall of known vorticity features, or sensitivity analysis) that the threshold preserves the relevant mechanistic subspace rather than systematically excluding low-enstrophy but causally important features such as subtle boundary or gradient encodings.

Authors: We agree that the triage procedure would benefit from additional validation. Enstrophy was selected because it is a physically natural metric for the vorticity-dominated shear-flow testbed. In the revision we will add a sensitivity analysis across threshold values and report feature overlap with a secondary metric (kinetic energy) to check whether low-enstrophy but potentially relevant encodings are excluded. This will make explicit the extent to which the selected subspace is preserved.
Revision: yes

Referee: [Results] The abstract and results sections state findings of 'piecewise consistency' and causal connections to output discrepancies without accompanying quantitative support (overlap fractions, statistical tests, ablation on feature subsets, or error bars), which prevents assessment of effect sizes and reproducibility of the intermittency observation.

Authors: The reported patterns are qualitative observations drawn from the deliberately limited shear-flow testbed. We will add overlap fractions for recurring features across parameter values and include a limited ablation on the most frequently recruited feature subsets to quantify their contribution to the observed output discrepancies. Because the intermittency itself is the central empirical finding, formal statistical tests are not straightforward, but we will clarify the exploratory character of the results and note reproducibility across the tested configurations.
Revision: yes
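The overlap fractions discussed in this exchange could be reported in a form like the following sketch, which counts, for each feature, the fraction of setups in which it was selected. The setup labels reuse names that appear in the figure captions, but the feature sets are made-up placeholders, and the function names and the 2/3 cutoff are assumptions.

```python
from collections import Counter

def recurrence_fractions(selected_by_setup):
    """Fraction of setups in which each feature was selected.

    selected_by_setup maps a setup label to an iterable of feature indices.
    """
    n_setups = len(selected_by_setup)
    counts = Counter(
        feature
        for features in selected_by_setup.values()
        for feature in set(features)  # count each feature once per setup
    )
    return {feature: count / n_setups for feature, count in counts.items()}

def recurring_features(selected_by_setup, min_fraction):
    """Features selected in at least min_fraction of the setups."""
    fractions = recurrence_fractions(selected_by_setup)
    return sorted(f for f, frac in fractions.items() if frac >= min_fraction)

# Placeholder selections; these are not results from the paper.
selected = {
    "Sim3": {1, 7, 8245},
    "Sim50": {7, 8245},
    "SimOther": {2, 8245},  # hypothetical third setup
}
fractions = recurrence_fractions(selected)
stable = recurring_features(selected, min_fraction=2 / 3)
```

Reporting the full distribution of these fractions, and not only the features that pass a cutoff, would let a reader judge how piecewise the consistency is.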

Circularity Check

0 steps flagged

No circularity: empirical case study with external physical triage metric

The paper is an observational interpretability case study applying SAE probes to a pre-trained foundation model and triaging >20k features via the external physical quantity enstrophy. No derivation chain, fitted-parameter predictions, self-definitional steps, or load-bearing self-citations exist. Claims of piecewise consistency and discrepancy linkage rest on direct comparisons to numerical simulations and feature activation patterns, not on any reduction to the paper's own inputs or prior author work by construction. The enstrophy triage is an analysis choice whose adequacy is debatable on methodological grounds but does not create circularity.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract supplies no explicit free parameters, axioms, or invented entities; the work rests on standard SAE machinery and the domain assumption that enstrophy is a suitable triage signal.

pith-pipeline@v0.9.1-grok · 5823 in / 1265 out tokens · 42119 ms · 2026-06-27T10:53:00.316956+00:00 · methodology

