Make it count: Text-to-image generation with an accurate number of objects. arXiv preprint arXiv:2406.10210
2 Pith papers cite this work. Polarity classification is still indexing.
fields
cs.CV 2
verdicts
UNVERDICTED 2
representative citing papers
CARINOX unifies noise optimization and exploration with human-correlated reward selection to boost compositional alignment in diffusion models, reporting +16% on T2I-CompBench++ and +11% on HRS while keeping quality and diversity.
citing papers explorer
- AI-T2I: Aggregating-and-Isolating Cross-Attention to Diffusion Models for Text-to-Image Synthesis
AI-T2I improves text-to-image alignment in diffusion models by using aggregation and isolation losses on cross-attention maps to fix scattering and overlap issues.
- CARINOX: Inference-time Scaling with Category-Aware Reward-based Initial Noise Optimization and Exploration
CARINOX unifies noise optimization and exploration with human-correlated reward selection to boost compositional alignment in diffusion models, reporting +16% on T2I-CompBench++ and +11% on HRS while keeping quality and diversity.