pith. machine review for the scientific record.

Feature visualization

4 Pith papers cite this work. Polarity classification is still indexing.

fields

cs.LG 4

representative citing papers

Toy Models of Superposition

cs.LG · 2022-09-21 · accept · novelty 8.0

Toy models demonstrate that polysemanticity arises when neural networks store more sparse features than neurons via superposition, producing a phase transition tied to polytope geometry and increased adversarial vulnerability.

From Mechanistic to Compositional Interpretability

cs.LG · 2026-05-09 · unverdicted · novelty 7.0

Compositional interpretability defines explanations as commuting syntactic-semantic mapping pairs grounded in compositionality and minimum description length, with compressive refinement and a parsimony theorem guaranteeing concise human-aligned decompositions.

citing papers explorer

  • Toy Models of Superposition · cs.LG · 2022-09-21 · accept · none · ref 10

    Toy models demonstrate that polysemanticity arises when neural networks store more sparse features than neurons via superposition, producing a phase transition tied to polytope geometry and increased adversarial vulnerability.

  • From Mechanistic to Compositional Interpretability · cs.LG · 2026-05-09 · unverdicted · none · ref 203

    Compositional interpretability defines explanations as commuting syntactic-semantic mapping pairs grounded in compositionality and minimum description length, with compressive refinement and a parsimony theorem guaranteeing concise human-aligned decompositions.

  • NeuroViz: Real-time Interactive Visualization of Forward and Backward Passes in Neural Network Training · cs.LG · 2026-05-03 · unverdicted · none · ref 34

    NeuroViz offers interactive real-time visualization of neural network forward and backward passes, achieving top usability scores in a study with 31 participants compared to existing tools.

  • Open Problems in Mechanistic Interpretability · cs.LG · 2025-01-27 · unverdicted · none · ref 9

    A review paper that organizes conceptual, practical, and socio-technical open problems in mechanistic interpretability.