pith. sign in

Title resolution pending

3 Pith papers cite this work. Polarity classification is still indexing.

3 Pith papers citing it

fields

cs.AI 2 cs.LG 1

years

2026 2 2025 1

representative citing papers

Interpretability Can Be Actionable

cs.LG · 2026-05-11 · conditional · novelty 6.0

Interpretability research should be judged by actionability—the degree to which its insights support concrete decisions and interventions—rather than explanatory power alone.

Polysemantic Experts, Monosemantic Paths: Routing as Control in MoEs

cs.AI · 2026-04-20 · unverdicted · novelty 6.0

A parameter-free decomposition in MoE models separates routing control from content, showing that expert trajectories cluster tokens by semantic function across languages and forms, making paths rather than experts the natural unit of interpretability.

citing papers explorer

Showing 3 of 3 citing papers.

  • Interpretability Can Be Actionable cs.LG · 2026-05-11 · conditional · none · ref 105

    Interpretability research should be judged by actionability—the degree to which its insights support concrete decisions and interventions—rather than explanatory power alone.

  • Polysemantic Experts, Monosemantic Paths: Routing as Control in MoEs cs.AI · 2026-04-20 · unverdicted · none · ref 21

    A parameter-free decomposition in MoE models separates routing control from content, showing that expert trajectories cluster tokens by semantic function across languages and forms, making paths rather than experts the natural unit of interpretability.

  • Chain of Thought Monitorability: A New and Fragile Opportunity for AI Safety cs.AI · 2025-07-15 · unverdicted · none · ref 105

    Chain-of-thought monitorability provides a promising but fragile method for AI safety oversight that developers should actively preserve.