MechaRule localizes sparse agonist neurons via contrastive hierarchical ablation and adaptive group testing to ground rule extraction, recalling 97% of high-effect activations at 2.14% cost while enabling near-total elimination of target behaviors.
Implicit reasoning in transformers is reasoning through shortcuts
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
citation-role summary
background 1
citation-polarity summary
verdicts
UNVERDICTED 2roles
background 1polarities
background 1representative citing papers
This survey frames foundation agents using brain-inspired modular architectures and reviews challenges in evolution, collaboration, and safety.
citing papers explorer
-
Neuron-Anchored Rule Extraction for Large Language Models via Contrastive Hierarchical Ablation
MechaRule localizes sparse agonist neurons via contrastive hierarchical ablation and adaptive group testing to ground rule extraction, recalling 97% of high-effect activations at 2.14% cost while enabling near-total elimination of target behaviors.
-
Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems
This survey frames foundation agents using brain-inspired modular architectures and reviews challenges in evolution, collaboration, and safety.