Rennie, and Rogerio Feris
3 Pith papers cite this work.
Fields: cs.CV (3); years: 2026 (3); verdicts: unverdicted (3)
citing papers explorer
- Resolving Ambiguity in Composed Image Retrieval via Calibrated Interaction
  Composed image retrieval is reframed as calibrated intent resolution under uncertainty via conformal prediction sets and expected-information-gain clarification, with new AmbiCIR benchmark showing matched single-turn SOTA and faster multi-turn resolution with valid coverage.
- Mixed-Modality Dual Face-Hair Retrieval
  Introduces DFHR task, DFHR-Bench with over 180K triplets, and MFHC framework for mixed-modality dual face-hair retrieval.
- Exploring Multi-Modal Large Language Models and Two-Stage Fine-Tuning for Fashion Image Retrieval
  Framework uses LLaVA for triplet generation and two-stage fine-tuning to enhance composed fashion image retrieval.