pith. sign in

Photorealistic text-to-image diffusion models with deep language understanding

2 Pith papers cite this work. Polarity classification is still indexing.

2 Pith papers citing it

citation-role summary

background 1 dataset 1

citation-polarity summary

fields

cs.CV 2

years

2026 1 2025 1

verdicts

UNVERDICTED 2

representative citing papers

Asymmetric Flow Models

cs.CV · 2026-05-13 · unverdicted · novelty 7.0

AsymFlow uses rank-asymmetric velocity prediction to reach 1.57 FID on ImageNet 256x256 and enables finetuning of latent flow models into superior pixel-space text-to-image generators.

Emerging Properties in Unified Multimodal Pretraining

cs.CV · 2025-05-20 · unverdicted · novelty 5.0

BAGEL is a unified decoder-only model that develops emerging complex multimodal reasoning abilities after pretraining on large-scale interleaved data and outperforms prior open-source unified models.

citing papers explorer

Showing 2 of 2 citing papers.

  • Asymmetric Flow Models cs.CV · 2026-05-13 · unverdicted · none · ref 56

    AsymFlow uses rank-asymmetric velocity prediction to reach 1.57 FID on ImageNet 256x256 and enables finetuning of latent flow models into superior pixel-space text-to-image generators.

  • Emerging Properties in Unified Multimodal Pretraining cs.CV · 2025-05-20 · unverdicted · none · ref 62

    BAGEL is a unified decoder-only model that develops emerging complex multimodal reasoning abilities after pretraining on large-scale interleaved data and outperforms prior open-source unified models.