arxiv: 2404.07949 · v1 · pith:CB527KS2new · submitted 2024-04-11 · 💻 cs.CV

Taming Stable Diffusion for Text to 360° Panorama Image Generation

classification 💻 cs.CV
keywords panorama, diffusion, generation, image, text, images, panfusion, stable

Generative models, e.g., Stable Diffusion, have enabled the creation of photorealistic images from text prompts. Yet, the generation of 360-degree panorama images from text remains a challenge, particularly due to the dearth of paired text-panorama data and the domain gap between panorama and perspective images. In this paper, we introduce a novel dual-branch diffusion model named PanFusion to generate a 360-degree image from a text prompt. We leverage the stable diffusion model as one branch to provide prior knowledge in natural image generation and register it to another panorama branch for holistic image generation. We propose a unique cross-attention mechanism with projection awareness to minimize distortion during the collaborative denoising process. Our experiments validate that PanFusion surpasses existing methods and, thanks to its dual-branch structure, can integrate additional constraints like room layout for customized panorama outputs. Code is available at https://chengzhag.github.io/publication/panfusion.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Pano2World: End-to-End 3D Generation via Unified Multi-View Sequences

    cs.CV 2026-07 unverdicted novelty 6.0

    Pano2World generates an explorable 3D Gaussian scene directly from a single indoor panorama via coarse proxy rendering, view-aware joint denoising, and a latent feature adapter.