A cross-attention SAE with sparsemax attention achieves lower reconstruction loss and higher-quality concepts than fixed-sparsity baselines by making activation counts data-dependent.
3 Pith papers cite this work. Polarity classification is still indexing.
years
2026: 3
verdicts
UNVERDICTED: 3
representative citing papers
AdvFLYP finetunes CLIP on web image-text pairs using adversarial contrastive learning and regularization to boost zero-shot adversarial robustness across domains better than prior proxy-dataset methods.
ACE-Merging estimates task input covariances from parameter differences to enable closed-form data-free merging that reduces interference and outperforms prior baselines on vision and language tasks.
citing papers explorer
- Improving Sparse Autoencoder with Dynamic Attention
  A cross-attention SAE with sparsemax attention achieves lower reconstruction loss and higher-quality concepts than fixed-sparsity baselines by making activation counts data-dependent.
- Finetune Like You Pretrain: Boosting Zero-shot Adversarial Robustness in Vision-language Models
  AdvFLYP finetunes CLIP on web image-text pairs using adversarial contrastive learning and regularization to boost zero-shot adversarial robustness across domains better than prior proxy-dataset methods.
- ACE-Merging: Data-Free Model Merging with Adaptive Covariance Estimation
  ACE-Merging estimates task input covariances from parameter differences to enable closed-form data-free merging that reduces interference and outperforms prior baselines on vision and language tasks.