Gate-and-Merge enables zero-shot compositional personalization of VLMs by independently learning concept-specific LoRA adapters and merging them in weight space with cue-based gating to suppress interference.
VisionLLM: Large language model is also an open-ended decoder for vision-centric tasks. Advances in Neural Information Processing Systems, 36:61501–61513
1 Pith paper cites this work. Polarity classification is still indexing.
citation-role summary
background 1
citation-polarity summary
fields: cs.CV 1
years: 2026 1
verdicts: UNVERDICTED 1
roles: background 1
polarities: background 1
representative citing papers
Gate-and-Merge: Zero-shot Compositional Personalization of Vision Language Models