Text embedding is not all you need: Attention control for text-to-image semantic alignment with text self-attention maps. arXiv preprint arXiv:2411.15236, 2024.
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.CV (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
DetailAnywhere: Fashion Detail Generation via Cross-Modal Feature Alignment Distillation
Formalizes the Fashion Detail Generation task, releases the FDBench benchmark with 40K+ pairs, and proposes the CFAD distillation method plus an RL consistency reward that outperforms open-source baselines.