pith. machine review for the scientific record.


Finding neurons in a haystack: Case studies with sparse probing

10 Pith papers cite this work. Polarity classification is still indexing.



years

2026 (9) · 2024 (1)

representative citing papers

WriteSAE: Sparse Autoencoders for Recurrent State

cs.LG · 2026-05-12 · unverdicted · novelty 8.0

WriteSAE is the first sparse autoencoder that factors decoder atoms into the native d_k × d_v cache-write shape of recurrent models and supplies a closed-form per-token logit shift for atom substitution.
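As the summary reads, the key design move is that each decoder atom lives natively in the (d_k, d_v) cache-write shape, so an atom is directly a candidate state update. A minimal sketch of that factoring; the class name, shapes, and ReLU encoder are illustrative guesses, not WriteSAE's actual interface, and the closed-form logit shift is omitted since its derivation is paper-specific:

import torch
import torch.nn as nn

class FactoredSAE(nn.Module):
    # Sketch: sparse autoencoder whose decoder atoms are (d_k, d_v) matrices,
    # matching the cache-write shape of a recurrent model. Assumed design.
    def __init__(self, d_k: int, d_v: int, n_atoms: int):
        super().__init__()
        self.encoder = nn.Linear(d_k * d_v, n_atoms)  # encode the flattened write
        self.decoder = nn.Parameter(torch.randn(n_atoms, d_k, d_v) * 0.01)

    def forward(self, cache_write: torch.Tensor):
        # cache_write: (batch, d_k, d_v) state update captured from the model
        codes = torch.relu(self.encoder(cache_write.flatten(1)))  # (batch, n_atoms)
        recon = torch.einsum("ba,akv->bkv", codes, self.decoder)  # rebuild the write
        return codes, recon

Because each atom already has the write shape, substituting one atom for another is a well-typed edit to the cache, which is what makes a per-token logit shift for substitution tractable.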

Do Audio-Visual Large Language Models Really See and Hear?

cs.AI · 2026-04-03 · unverdicted · novelty 8.0

AVLLMs encode audio semantics in middle layers but suppress them in final text outputs when audio conflicts with vision, due to training that largely inherits from vision-language base models.

Scaling and evaluating sparse autoencoders

cs.LG · 2024-06-06 · unverdicted · novelty 7.0

K-sparse autoencoders with fixes for dead latents produce clean scaling laws and feature-quality metrics that improve with size, demonstrated by training a 16-million-latent model on GPT-4 activations.
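The k-sparse mechanism behind this result is compact: keep only the k largest encoder pre-activations per input and zero the rest, so sparsity is enforced by construction rather than by an L1 penalty. A minimal sketch, omitting the paper's dead-latent fixes (e.g. auxiliary losses) and using placeholder sizes:

import torch
import torch.nn as nn

class TopKSAE(nn.Module):
    # Minimal k-sparse autoencoder: a hard top-k replaces an L1 sparsity penalty.
    def __init__(self, d_model: int, n_latents: int, k: int):
        super().__init__()
        self.k = k
        self.encoder = nn.Linear(d_model, n_latents)
        self.decoder = nn.Linear(n_latents, d_model, bias=False)

    def forward(self, x: torch.Tensor):
        pre = self.encoder(x)                  # (batch, n_latents)
        vals, idx = pre.topk(self.k, dim=-1)   # keep the k largest pre-activations
        latents = torch.zeros_like(pre).scatter_(-1, idx, torch.relu(vals))
        return latents, self.decoder(latents)  # sparse code, reconstruction

Fixing k pins the L0 of every code directly, which is part of what makes the sparsity-size trade-off clean enough to fit scaling laws.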

Darkness Visible: Reading the Exception Handler of a Language Model

cs.LG · 2026-04-06 · conditional · novelty 6.0

GPT-2 Small's terminal MLP implements a legible three-tier exception handler with 27 named neurons that routes predictions, while previously identified knowledge neurons function as amplifiers of residual-stream signals rather than fact storage.
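Reading any single neuron of this kind in GPT-2 Small's terminal MLP is easy to reproduce with standard tooling. A sketch using TransformerLens; the neuron index is a placeholder, not one of the paper's 27 named neurons, and direct-logit attribution here is a generic technique, not necessarily the paper's method:

import torch
from transformer_lens import HookedTransformer

model = HookedTransformer.from_pretrained("gpt2")  # GPT-2 Small, layers 0-11
LAYER, NEURON = 11, 123                            # terminal MLP; index is a placeholder

_, cache = model.run_with_cache("The capital of France is")
act = cache[f"blocks.{LAYER}.mlp.hook_post"][0, -1, NEURON]  # activation at last token

# Direct-logit attribution: project this neuron's output write onto the vocabulary.
direction = model.W_out[LAYER, NEURON]     # (d_model,)
logit_shift = act * direction @ model.W_U  # (d_vocab,)
top = torch.topk(logit_shift, 5)
for tok, val in zip(top.indices, top.values):
    print(repr(model.tokenizer.decode([tok.item()])), round(val.item(), 3))

Under the paper's framing, an amplifier neuron would produce a logit shift that scales with signals already present in the residual stream rather than encoding a fixed fact.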

There Will Be a Scientific Theory of Deep Learning

stat.ML · 2026-04-23 · unverdicted · novelty 2.0

A mechanics of the learning process is emerging in deep learning theory, characterized by dynamics, coarse statistics, and falsifiable predictions that span idealized settings, limits, laws, hyperparameters, and universal behaviors.

citing papers explorer

Showing 10 of 10 citing papers.