Functional map of the world
2 Pith papers cite this work. Polarity classification is still indexing.
fields
cs.CV 2
verdicts
UNVERDICTED 2
representative citing papers
ChatENV fine-tunes Qwen-2.5-VL on a 177k-image dataset of temporal satellite pairs with sensor metadata to support interactive temporal and what-if reasoning for environmental monitoring.
citing papers explorer
- Robust Adaptation of Foundation Models with Black-Box Visual Prompting
  BlackVIP adapts foundation models via a Coordinator for input-dependent visual prompts and SPSA-GC for gradient estimation, enabling robust transfer on 19 datasets with low memory use and a link to randomized smoothing robustness.
- ChatENV: An Interactive Vision-Language Model for Sensor-Guided Environmental Monitoring and Scenario Simulation
  ChatENV fine-tunes Qwen-2.5-VL on a 177k-image dataset of temporal satellite pairs with sensor metadata to support interactive temporal and what-if reasoning for environmental monitoring.