Causal Learning and Reasoning , pages=

Finding alignments between interpretable causal variables, distributed neural representations , author= · 2024

2 Pith papers cite this work. Polarity classification is still indexing.

2 Pith papers citing it

browse 2 citing papers

representative citing papers

Refusal in Language Models Is Mediated by a Single Direction

cs.LG · 2024-06-17 · accept · novelty 7.0

Refusal in language models is mediated by a single direction in residual stream activations that can be erased to disable safety or added to elicit refusal.

Absorber LLM: Harnessing Causal Synchronization for Test-Time Training

cs.LG · 2026-04-22 · unverdicted · novelty 5.0

Absorber LLM introduces causal synchronization to absorb context into parameters for memory-efficient long-context LLM inference while preserving causal effects.

citing papers explorer

Showing 2 of 2 citing papers.

Refusal in Language Models Is Mediated by a Single Direction cs.LG · 2024-06-17 · accept · none · ref 27
Refusal in language models is mediated by a single direction in residual stream activations that can be erased to disable safety or added to elicit refusal.
Absorber LLM: Harnessing Causal Synchronization for Test-Time Training cs.LG · 2026-04-22 · unverdicted · none · ref 42
Absorber LLM introduces causal synchronization to absorb context into parameters for memory-efficient long-context LLM inference while preserving causal effects.

Causal Learning and Reasoning , pages=

fields

years

verdicts

representative citing papers

citing papers explorer