SS-CAM: Smoothed Score-CAM for Sharper Visual Feature Localization
2 Pith papers cite this work. Polarity classification is still indexing.
citing papers explorer
- Grad-ECLIP: Gradient-based Visual and Textual Explanations for CLIP
Grad-ECLIP produces gradient-based visual and textual explanation heatmaps for CLIP by applying channel and spatial weights to token features instead of relying on sparse self-attention maps.
- How Can One Choose the Best CAM-Based Explainability Method for a CNN Model?
Manhattan and Correlation distance metrics best align CAM saliency maps with human perception on ImageNet chihuahuas, ranking LayerCAM, Score-CAM, and IS-CAM highest when compared to crowdsourced choices via RBO.