3 Pith papers cite this work.
Citing papers by year: 2026 (3)
citing papers explorer
- FashionMV: Product-Level Composed Image Retrieval with Multi-View Fashion Data
  FashionMV introduces product-level multi-view CIR, a 127K-product dataset built via an automated LMM pipeline, and a 0.8B ProCIR model that beats larger baselines on three fashion benchmarks.
- A Sketch+Text Composed Image Retrieval Dataset for Thangka
  CIRThan is a new sketch+text composed image retrieval dataset for Thangka imagery with 2,287 images, sketches, and multi-level hierarchical texts.
- WRF4CIR: Weight-Regularized Fine-Tuning Network for Composed Image Retrieval
  WRF4CIR uses weight-regularized fine-tuning with adversarial perturbations to mitigate overfitting in composed image retrieval and narrows the generalization gap on benchmarks.