pith. sign in

InProceedings of the 7th Black- boxNLP Workshop: Analyzing and Interpreting Neu- ral Networks for NLP, pages 88–104, Miami, Florida, US

6 Pith papers cite this work. Polarity classification is still indexing.

6 Pith papers citing it

citation-role summary

background 2

citation-polarity summary

years

2026 6

verdicts

UNVERDICTED 6

roles

background 2

polarities

background 2

representative citing papers

Dictionary-Aligned Concept Control for Safeguarding Multimodal LLMs

cs.LG · 2026-04-10 · unverdicted · novelty 6.0

DACO curates a 15,000-concept dictionary from 400K image-caption pairs and uses it to initialize an SAE that enables granular, concept-specific steering of MLLM activations, raising safety scores on MM-SafetyBench and JailBreakV while preserving general capabilities.

There Will Be a Scientific Theory of Deep Learning

stat.ML · 2026-04-23 · unverdicted · novelty 2.0

A mechanics of the learning process is emerging in deep learning theory, characterized by dynamics, coarse statistics, and falsifiable predictions across idealized settings, limits, laws, hyperparameters, and universal behaviors.

citing papers explorer

Showing 6 of 6 citing papers.