Modality-agnostic attention fusion for visual search with text feedback

2020 · arXiv 2007.00145

3 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Learning to Compose: Revisiting Proxy Task Design for Zero-Shot Composed Image Retrieval

cs.CV · 2026-07-01 · unverdicted · novelty 7.0

FoCo learns composition for zero-shot CIR via text-anchored visual aggregation and context-conditioned semantic completion trained jointly with cross-instance contrastive loss, reporting SOTA on four benchmarks.

Never Seen Before: Benchmarking Genuine Zero-Shot Composed Image Retrieval with Consistent Video-Sourced Datasets

cs.CV · 2026-06-05 · unverdicted · novelty 7.0

ZeroSight supplies a video-derived dataset and evaluation protocol for genuine zero-shot composed image retrieval plus the SC4CIR consistency method, demonstrating that prior benchmarks inflate reported performance across 27 tested approaches.

Resolving Ambiguity in Composed Image Retrieval via Calibrated Interaction

cs.CV · 2026-05-23 · unverdicted · novelty 7.0

Composed image retrieval is reframed as calibrated intent resolution under uncertainty via conformal prediction sets and expected-information-gain clarification, with new AmbiCIR benchmark showing matched single-turn SOTA and faster multi-turn resolution with valid coverage.

citing papers explorer

Showing 3 of 3 citing papers.

Learning to Compose: Revisiting Proxy Task Design for Zero-Shot Composed Image Retrieval
cs.CV · 2026-07-01 · unverdicted · none · ref 5
Never Seen Before: Benchmarking Genuine Zero-Shot Composed Image Retrieval with Consistent Video-Sourced Datasets
cs.CV · 2026-06-05 · unverdicted · none · ref 75
Resolving Ambiguity in Composed Image Retrieval via Calibrated Interaction
cs.CV · 2026-05-23 · unverdicted · none · ref 4
