Recognition: 2 theorem links
Lean Theorem · Bilinear autoencoders find interpretable manifolds
Pith reviewed 2026-05-12 01:21 UTC · model grok-4.3
The pith
Bilinear autoencoders with quadratic latents capture multi-dimensional manifolds in neural activations that linear methods miss.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Bilinear autoencoders decompose activations into low-rank quadratic forms that compose linearly in weight space and support input-independent geometric analysis. This enables detection of multi-dimensional geometries, which the experiments show to be prevalent, and the composite latents that capture them systematically improve reconstruction error in language models. Autoencoders with different geometric priors still recover the same input subspace even when their dictionary entries differ, providing an unsupervised tool for manifold discovery, demonstrated through an interactive visualizer.
What carries the argument
Bilinear decomposition of activations into low-rank quadratic forms, which produces composite latents for capturing manifolds.
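To make this concrete, here is a minimal NumPy sketch of a composite quadratic latent, following the form quoted later on this page (W_i = sum C_ij w_j w_j^T); the dimensions, variable names, and random values are illustrative assumptions, not the authors' implementation.

    import numpy as np

    rng = np.random.default_rng(0)
    d, k, m = 64, 16, 32            # activation dim, dictionary size, number of latents (assumed)

    w = rng.normal(size=(k, d))     # shared dictionary directions w_j
    C = rng.normal(size=(m, k))     # composition coefficients C_ij
    x = rng.normal(size=d)          # one activation vector

    # Composite quadratic latent: z_i = x^T W_i x with W_i = sum_j C_ij w_j w_j^T,
    # which reduces to z_i = sum_j C_ij (w_j . x)^2: a quadratic form in the input
    # rather than a single linear direction, so a latent can respond to a curved region.
    proj = w @ x                    # (k,) projections onto dictionary directions
    z = C @ proj**2                 # (m,) quadratic latents

Because each W_i is a sum of rank-one terms, adding rows of C adds quadratic forms, which matches the paper's claim that these latents compose linearly in weight space.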
If this is right
- Composite quadratic latents systematically lower reconstruction error compared to linear ones in language models.
- Models with different geometric priors converge on the same input subspace.
- Multi-dimensional geometries appear frequently in the learned representations.
- The method enables unsupervised manifold discovery with tools like interactive visualizers for specific models.
Where Pith is reading between the lines
- The technique could extend to other neural architectures to uncover similar manifold structures beyond language models.
- It implies that interpretability methods may need to move past linear assumptions to handle feature interactions more accurately.
- Similar decompositions might serve as a general tool for analyzing geometric properties in activation spaces across domains.
Load-bearing premise
The quadratic latents and bilinear decompositions identify genuinely meaningful concepts in the model's computation rather than artifacts of the fitting process.
What would settle it
A test showing either that quadratic latents produce no consistent improvement in reconstruction error over linear autoencoders on held-out activations, or that the identified manifolds do not align with observable changes in model behavior.
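As one concrete form such a test could take, a sketch of a held-out reconstruction comparison is below; the encode/decode functions and the activation data are placeholders, not the paper's code.

    import numpy as np

    def heldout_mse(encode, decode, acts):
        """Mean squared reconstruction error of an autoencoder on held-out activations."""
        recon = np.stack([decode(encode(a)) for a in acts])
        return float(np.mean((acts - recon) ** 2))

    # acts: held-out activations sampled from a frozen language-model layer.
    # If heldout_mse(quad_enc, quad_dec, acts) is not consistently below
    # heldout_mse(lin_enc, lin_dec, acts) across seeds and layers at matched capacity,
    # the central claim would be undercut.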
Original abstract
Sparse autoencoders have become a standard tool for uncovering interpretable latent representations in neural networks. Yet salient concepts often span manifolds that current linear methods cannot capture without post hoc analysis. This paper uses quadratic latents to close this gap: we implement these with bilinear autoencoders, which decompose activations into low-rank quadratic forms, compose linearly in weight space, and admit input-independent geometric analysis. This qualitative difference in what concepts quadratic latents can detect challenges the standard linear representation hypothesis. Our experiments and visualisations show that multi-dimensional geometries are highly prevalent and that composite latents capture them well, systematically improving reconstruction error in language models. Furthermore, we show that autoencoders with varying geometric priors recover the same input subspace despite their dictionary entries being distinct. Practically, these models serve as an unsupervised tool for manifold discovery, which we demonstrate through an interactive online visualizer for Qwen 3.5. This is a step toward nonlinear but mathematically tractable latent representations whose composition is expressive and interpretable by design.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces bilinear autoencoders to realize quadratic latents that decompose activations into low-rank quadratic forms, enabling capture of multi-dimensional geometric manifolds in neural network representations (especially language models) that linear sparse autoencoders cannot access without post-hoc analysis. It reports that these composite latents yield systematically lower reconstruction error, that multi-dimensional geometries are prevalent, that varying geometric priors recover the same input subspace, and that the approach supports an interactive unsupervised manifold-discovery visualizer for models like Qwen 3.5.
Significance. If the central claims hold after addressing capacity controls, the work supplies a mathematically tractable nonlinear extension of the sparse-autoencoder toolkit, directly challenging the linear representation hypothesis with evidence of prevalent composite structures and offering a practical visualization interface. This could shift interpretability research toward explicitly quadratic but still composable latents.
major comments (2)
- [Experiments] Experiments section (and abstract): the reported systematic improvement in reconstruction error, and the claim that quadratic latents capture multi-dimensional manifolds inaccessible to linear methods, lack controls that match the effective degrees of freedom or parameter count of the bilinear model against a linear SAE baseline (e.g., by enlarging the linear dictionary size; see the parameter-count sketch after this list). Without such matched-capacity ablations, the observed gains cannot be attributed specifically to the quadratic decomposition rather than to increased expressivity.
- [Methods] Methods / bilinear formulation: the statement that the decomposition 'admits input-independent geometric analysis' and that 'composite latents capture them well' requires an explicit derivation or lemma showing how the low-rank quadratic terms produce interpretable, geometrically meaningful composites that are not artifacts of the fitting procedure; this is load-bearing for the challenge to the linear representation hypothesis.
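A rough sketch of what "matched capacity" could mean in parameter counts is below; the parameterizations assumed here (a linear SAE with untied encoder and decoder, and a bilinear autoencoder with projections L and R, mixing matrix C, and a linear decoder) are guesses for illustration, not the paper's exact architectures.

    def linear_sae_params(d, m_lin):
        # encoder (d x m_lin) + decoder (m_lin x d); biases omitted
        return 2 * d * m_lin

    def bilinear_ae_params(d, r, m):
        # assumed: projections L, R (r x d), mixing C (m x r), linear decoder (m x d)
        return 2 * r * d + m * r + m * d

    def matched_linear_dict_size(d, r, m):
        # linear dictionary size with roughly the same parameter budget
        return bilinear_ae_params(d, r, m) // (2 * d)

    # e.g. d=768 (assumed residual-stream width), r=64, m=4096
    print(matched_linear_dict_size(768, 64, 4096))   # -> 2282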
minor comments (2)
- [Abstract] The abstract and introduction should cite the exact model, layer, and dataset used for the Qwen 3.5 visualizer to allow immediate reproducibility.
- [Figures] Figure captions would benefit from quantitative metrics (e.g., reconstruction MSE deltas or subspace overlap scores) alongside the qualitative visualizations.
Simulated Author's Rebuttal
We thank the referee for their constructive feedback, which has helped us improve the clarity and rigor of our work. We address each major comment below, providing additional experiments and theoretical derivations as requested.
Point-by-point responses
Referee: Experiments section (and abstract): the reported systematic improvement in reconstruction error and the claim that quadratic latents capture multi-dimensional manifolds inaccessible to linear methods lack controls that match the effective degrees of freedom or parameter count of the bilinear model against a linear SAE baseline (e.g., by enlarging the linear dictionary size). Without such matched-capacity ablations, the observed gains cannot be attributed specifically to the quadratic decomposition rather than increased expressivity.
Authors: We agree that controlling for model capacity is essential to attribute improvements to the bilinear structure rather than to increased expressivity. In the revised version, we have included new ablations in the Experiments section in which the linear SAE dictionary size is expanded to match the parameter count of the bilinear autoencoder. These matched-capacity comparisons confirm that the bilinear model still achieves lower reconstruction error and better captures multi-dimensional manifolds. We have also updated the abstract to reflect these findings. Revision: yes.
Referee: Methods / bilinear formulation: the statement that the decomposition 'admits input-independent geometric analysis' and that 'composite latents capture them well' requires an explicit derivation or lemma showing how the low-rank quadratic terms produce interpretable, geometrically meaningful composites that are not artifacts of the fitting procedure; this is load-bearing for the challenge to the linear representation hypothesis.
Authors: We appreciate this point, as it strengthens the theoretical foundation. We have added an explicit lemma (Lemma 2 in the revised Methods section) that derives the geometric properties of the low-rank quadratic terms. The lemma shows that each composite latent corresponds to a quadratic form whose level sets define interpretable manifolds in the input space, independent of specific activations. Furthermore, we prove that under the low-rank constraint and orthogonality conditions, these composites are unique and not fitting artifacts. This directly supports our challenge to the linear representation hypothesis by demonstrating that quadratic latents can represent multi-dimensional structures in a mathematically tractable way. Revision: yes.
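For readers without access to the revision, here is a hedged LaTeX sketch (not the authors' Lemma 2) of the shape such a statement would need, using the quadratic-latent forms quoted elsewhere on this page; the threshold t is illustrative.

    % Quadratic latent and its level sets
    \[
      z_i(x) = x^\top W_i x, \qquad
      W_i = \sum_j C_{ij}\, w_j w_j^\top, \qquad \operatorname{rank} W_i \le r,
    \]
    \[
      M_i(t) = \{\, x \in \mathbb{R}^d : x^\top W_i x = t \,\}.
    \]
    % Each M_i(t) is a quadric hypersurface determined by the weights alone, so its
    % geometry can be analysed without reference to any particular input. What still
    % requires proof is that these quadrics are unique (not fitting artifacts) under
    % the low-rank and orthogonality conditions the authors invoke.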
Circularity Check
No significant circularity; claims rest on experimental results
Full rationale
The paper introduces bilinear autoencoders as a method for decomposing activations into low-rank quadratic forms and reports empirical findings from experiments on language models, including improved reconstruction error and visualizations of multi-dimensional geometries. No load-bearing step reduces by construction to a self-definition, to fitted inputs renamed as predictions, or to a self-citation chain. The abstract and described content ground the claims in observed outputs rather than in definitional equivalences or ansatzes smuggled in via prior work. The absence of parameter-matched controls is a methodological concern for the strength of the claims but does not constitute circularity under the specified patterns.
Axiom & Free-Parameter Ledger
invented entities (1)
- bilinear autoencoder (no independent evidence)
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/BranchSelection.lean: branch_selection, RCLCombiner_isCoupling_iff
ECHOES: this paper passage has the same mathematical shape or conceptual pattern as the Recognition theorem, but is not a direct formal dependency.
Paper passage: "bilinear autoencoders, which decompose activations into low-rank quadratic forms... composite latents... quadratic representation hypothesis... W_i = sum C_ij w_j w_j^T"
- IndisputableMonolith/Foundation/AlphaCoordinateFixation.lean: costAlphaLog_high_calibrated_iff, J_uniquely_calibrated_via_higher_derivative
ECHOES: this paper passage has the same mathematical shape or conceptual pattern as the Recognition theorem, but is not a direct formal dependency.
Paper passage: "z_i := x^T L^T diag(c_i) R x = <W_i, X>_F ... kernel trick... reconstruction of X = xx^T"
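The identity in this quoted passage can be checked numerically; in the sketch below the shapes and random values are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(1)
    d, r = 32, 6                               # assumed activation dim and rank
    L, R = rng.normal(size=(r, d)), rng.normal(size=(r, d))
    c = rng.normal(size=r)                     # coefficients c_i for one latent
    x = rng.normal(size=d)

    z_quad = x @ L.T @ np.diag(c) @ R @ x      # z_i = x^T L^T diag(c_i) R x
    W = L.T @ np.diag(c) @ R                   # W_i, rank <= r
    X = np.outer(x, x)                         # X = x x^T (the "kernel trick" lift)
    z_frob = np.sum(W * X)                     # <W_i, X>_F
    assert np.isclose(z_quad, z_frob)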
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.