Pith · machine review for the scientific record

arXiv:2605.11311 · v1 · submitted 2026-05-11 · 💻 cs.LG · cs.CV · stat.CO · stat.ML

Recognition: no theorem link

Couple to Control: Joint Initial Noise Design in Diffusion Models

Authors on Pith · no claims yet

Pith reviewed 2026-05-13 02:10 UTC · model grok-4.3

classification 💻 cs.LG · cs.CV · stat.CO · stat.ML
keywords diffusion models · initial noise · noise coupling · batch diversity · image generation · Stable Diffusion · repulsive coupling · generative models

The pith

Designing dependence across initial noises lets diffusion models generate more diverse batches without added cost or changed inputs.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Diffusion models normally draw each image in a batch from its own independent standard Gaussian noise. The paper shows this independence is only one possible choice and that any joint distribution can be used as long as every individual noise stays marginally standard Gaussian. By choosing the dependence structure across the batch, one can directly influence properties such as output diversity while the pretrained model still sees valid single-sample inputs. A repulsive Gaussian coupling is presented as one concrete construction that raises diversity metrics on SD1.5, SDXL, and SD3 while keeping prompt alignment and image quality close to the independent baseline. The same coupled noises can also initialize optimization routines when extra computation is available.
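
To make the marginal-preservation constraint concrete, here is a minimal sketch of one such coupling, assuming an equicorrelated jointly Gaussian form: every pair of noises shares per-coordinate correlation ρ, and ρ < 0 gives repulsion. The function name and the `rho` parameterization are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np

def repulsive_gaussian_noise(n_samples, dim, rho=-0.2, seed=None):
    """Sample n_samples noise vectors, each exactly N(0, I_dim) marginally,
    with cross-sample correlation rho per coordinate (rho < 0 repels).

    The cross-sample covariance C (ones on the diagonal, rho elsewhere)
    is positive semi-definite iff rho >= -1/(n_samples - 1), which bounds
    how strong the repulsion of a jointly Gaussian coupling can be.
    """
    assert rho >= -1.0 / (n_samples - 1), "rho too negative: C is not PSD"
    rng = np.random.default_rng(seed)
    C = np.full((n_samples, n_samples), rho)
    np.fill_diagonal(C, 1.0)
    # Tiny jitter keeps the Cholesky factorization stable at the PSD boundary.
    L = np.linalg.cholesky(C + 1e-9 * np.eye(n_samples))
    Z = rng.standard_normal((n_samples, dim))  # independent standard normals
    # Mixing across the sample axis: row i of L @ Z has per-coordinate
    # variance (L @ L.T)[i, i] = 1, so each marginal stays standard Gaussian.
    return L @ Z
```

Because each row stays standard Gaussian, a pretrained sampler can consume these noises unchanged; only the joint law across the batch differs from independent generation.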

Core claim

The authors claim that initial-noise design can be reframed as the choice of a coupling over multiple noises, each remaining marginally standard Gaussian, so that dependence across samples is set by design rather than left to chance. This framework encompasses existing methods as special cases and yields new constructions such as repulsive Gaussian coupling, which empirically increases gallery diversity on standard diffusion models at the same sampling cost as independent generation while largely preserving alignment and quality. Subspace couplings within the framework further support controlled background variation for fixed foreground objects.

What carries the argument

A coupling of initial noises that preserves the marginal standard-Gaussian distribution for each sample while allowing explicit control over their joint dependence structure.

If this is right

  • Repulsive Gaussian coupling raises several diversity metrics on SD1.5, SDXL, and SD3 at identical sampling cost to independent noise.
  • Prompt alignment and perceptual image quality remain largely unchanged under the coupling.
  • Coupled noise supplies a structured initialization that can be fed into test-time optimization pipelines for further refinement.
  • Subspace couplings generate diverse natural backgrounds for a fixed foreground object, with a tunable trade-off against foreground fidelity, as sketched below.
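
The last bullet's subspace coupling can be sketched under one illustrative assumption: share a common Gaussian component inside a foreground mask with weight √share (so the cross-sample correlation there equals share) and draw independent noise elsewhere. The function name, the mask-and-mixture form, and the `share` parameter are hypothetical, not necessarily the paper's construction.

```python
import torch

def subspace_coupled_noise(n, shape, mask, share=1.0, generator=None):
    """n noises, each marginally N(0, I). Inside `mask`, every sample mixes
    a common component z_c with weight sqrt(share), so the cross-sample
    correlation there is `share`; outside, samples are independent.
    sqrt(share) * z_c + sqrt(1 - share) * w_i is standard normal whenever
    z_c and w_i are, so marginals are preserved exactly.
    """
    z_common = torch.randn(shape, generator=generator)     # shared component
    w = torch.randn((n, *shape), generator=generator)      # private components
    coupled = share ** 0.5 * z_common + (1.0 - share) ** 0.5 * w
    return torch.where(mask.bool(), coupled, w)
```

With share = 1 the foreground noise is pinned exactly across the batch; lowering it trades foreground fidelity for extra variation, matching the tunable trade-off above.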

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • The coupling perspective could extend to other generative models that begin from random inputs, such as flows or autoregressive transformers.
  • Instead of hand-designed repulsion, one might learn couplings from data to target specific diversity-quality operating points.
  • The approach suggests treating the space of possible batch noises as a geometric object whose structure can be optimized for downstream tasks.
  • In user-facing tools, defaulting to coupled rather than independent noise could reduce the number of independent runs needed to obtain varied options from one prompt.

Load-bearing premise

The chosen dependence across noises does not produce artifacts that the pretrained diffusion model cannot handle under its original training assumptions.

What would settle it

Applying repulsive Gaussian coupling to a broad set of prompts across multiple diffusion models and observing that diversity metrics fail to increase or that quality and alignment metrics drop below those of independent noise would falsify the claimed practical benefit.

Figures

Figures reproduced from arXiv:2605.11311 by Guanyang Wang, Jing Jia, Liyue Shen.

Figure 1. Five couplings of Z1, Z2 ∼ N(0, 1), illustrated via 20,000 joint samples. Each panel shows the joint scatter of (Z1, Z2) with the marginal density of Z1 (green, top) and Z2 (amber, right); the dashed line marks Y = X. (a) Antithetic (ρ = −1): Z2 = −Z1, achieving maximal negative correlation. (b) ρ = −0.5: an intermediate negatively correlated Gaussian coupling. (c) Independent (ρ = 0): Z1 and Z2 are drawn …

Figure 2. Independent versus coupled noises with the same prompt “A classic motorcycle in a parking …”

Figure 3. Independent versus coupled noises with the same prompt “Assortment of doughnuts and …”

Figure 4. Examples of brightness-clustered generation with Stable Diffusion 1.5. The learned …

Figure 5. Fixed-object background generation on an LSUN bedroom image. The first row shows the …
Original abstract

Diffusion models typically generate image batches from independent Gaussian initial noises. We argue that this independence assumption is only one choice within a broader class of valid joint noise designs. Instead, one can specify a coupling of the initial noises: each noise remains marginally standard Gaussian, so the pretrained diffusion model receives the same single-sample input distribution, while the dependence across samples is chosen by design. This reframes initial-noise control from selecting or optimizing individual seeds to designing the dependence structure of a multi-sample gallery. This view gives a general framework for initial-noise design, covering several existing methods as special cases and leading naturally to new coupled-noise constructions. Coupled noise can improve generation on its own without adding sampling cost, and it is flexible enough to serve as a structured initialization for optimization-based pipelines when additional computation is available. Empirically, repulsive Gaussian coupling improves gallery diversity on SD1.5, SDXL, and SD3 while largely preserving prompt alignment and image quality. It matches or outperforms recent test-time noise-optimization baselines on several diversity metrics at the same sampling cost as independent generation. Subspace couplings also support fixed-object background generation, producing diverse, natural backgrounds compared with specialized inpainting baselines, with a tunable trade-off in foreground fidelity.

Editorial analysis

A structured set of objections, weighed in public.

A referee report, simulated authors' rebuttal, circularity audit, and axiom ledger. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 3 minor

Summary. The manuscript proposes reframing initial noise selection in diffusion models as the design of a joint distribution (coupling) over multiple noise vectors, each with exact standard-Gaussian marginals. This preserves the per-sample input distribution seen during training while allowing control over inter-sample dependence. The authors present repulsive Gaussian coupling and subspace couplings as concrete constructions, show that several prior methods are special cases, and report that repulsive coupling increases gallery diversity on SD1.5, SDXL, and SD3 while preserving prompt alignment and perceptual quality at the same sampling cost as independent noise; subspace couplings are further shown to enable diverse background generation around fixed foreground objects.

Significance. If the empirical results hold, the work supplies a zero-extra-cost, training-free technique for improving batch diversity that is compatible with any pretrained diffusion model whose architecture processes samples independently (e.g., group-norm UNets). The framing unifies existing noise-control heuristics under a single probabilistic view and supplies new, easily implemented constructions. The approach is particularly attractive for creative applications that require varied outputs from a single prompt without incurring the cost of test-time optimization.

major comments (2)
  1. [§4, Tables 1–3] The reported diversity gains for repulsive coupling are given as point estimates, without error bars, the number of random seeds, or statistical significance tests. Because the central claim is that the method “matches or outperforms” recent baselines on several metrics, the absence of variability measures makes it impossible to judge whether the observed differences are robust or could be explained by seed choice.
  2. [§3.2] The repulsive Gaussian coupling is presented as parameter-free once the repulsion strength is fixed, yet the text does not specify how this strength hyper-parameter is chosen across the three model families or whether the same value is used for all prompts. If the value is tuned per model or per prompt, the “no extra cost” claim relative to independent sampling requires clarification.
minor comments (3)
  1. [Abstract and §4] The abstract states that the method “matches or outperforms recent test-time noise-optimization baselines at the same sampling cost,” but the main text does not provide a side-by-side wall-clock or FLOPs comparison that would allow a reader to verify cost equivalence.
  2. [Figure 4] Figure 4 (subspace-coupling backgrounds) would benefit from an explicit statement of the foreground mask generation procedure and the precise trade-off parameter used to produce the displayed examples.
  3. [§4] A short paragraph summarizing the exact diversity and alignment metrics (e.g., CLIP similarity, LPIPS, or perceptual hash distance) and the number of images per gallery would improve reproducibility.

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the positive summary, significance assessment, and recommendation for minor revision. We address each major comment below and will incorporate the requested clarifications and additional results.

point-by-point responses
  1. Referee: [§4, Tables 1–3] The reported diversity gains for repulsive coupling are given as point estimates, without error bars, the number of random seeds, or statistical significance tests. Because the central claim is that the method “matches or outperforms” recent baselines on several metrics, the absence of variability measures makes it impossible to judge whether the observed differences are robust or could be explained by seed choice.

    Authors: We agree that reporting variability measures would allow readers to better judge robustness. In the revised manuscript we will rerun the main experiments on SD1.5, SDXL, and SD3 using multiple random seeds, report means and standard deviations for all diversity and quality metrics in Tables 1–3, and include paired statistical significance tests against the independent-noise baseline. revision: yes

  2. Referee: [§3.2] The repulsive Gaussian coupling is presented as parameter-free once the repulsion strength is fixed, yet the text does not specify how this strength hyper-parameter is chosen across the three model families or whether the same value is used for all prompts. If the value is tuned per model or per prompt, the “no extra cost” claim relative to independent sampling requires clarification.

    Authors: The repulsion strength is a single fixed hyper-parameter whose value was selected once via a small preliminary grid on SD1.5 and then held constant for all prompts and all three models (SD1.5, SDXL, SD3). No per-prompt or per-model retuning occurs at inference time, preserving the zero-extra-cost property relative to independent sampling. We will add an explicit statement of this fixed choice to the revised §3.2. revision: yes
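
As a hedged illustration of the paired comparison promised in response 1, one can compute a single diversity score per prompt under each noise scheme on the same prompt set and test the matched differences. The arrays below are placeholders, and the metric (mean pairwise LPIPS per gallery) is an assumption for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Placeholder per-prompt diversity scores, matched between coupled and
# independent noise on the same prompts; real values would come from,
# e.g., mean pairwise LPIPS within each generated gallery.
independent = rng.normal(0.55, 0.05, size=200)
coupled = independent + rng.normal(0.03, 0.02, size=200)

t_stat, p_t = stats.ttest_rel(coupled, independent)   # paired t-test
w_stat, p_w = stats.wilcoxon(coupled - independent)   # distribution-free check
print(f"paired t: p = {p_t:.3g}; Wilcoxon: p = {p_w:.3g}")
```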

Circularity Check

0 steps flagged

No significant circularity; the derivation is a self-contained design choice

full rationale

The paper reframes initial-noise selection as choosing a joint distribution whose marginals are standard Gaussian (by explicit construction) while the dependence structure is a free design parameter. This is not derived from data or prior results within the paper; it follows directly from the fact that pretrained diffusion models (UNet with group norm) process batch elements independently, an external architectural property. No claim reduces to a fitted parameter renamed as prediction, no self-citation chain bears the central argument, and the reported diversity gains are empirical outcomes rather than tautological consequences of the coupling definition. The framework is therefore independent of its own inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the assumption that any coupling whose marginals are standard Gaussian is a valid input to a pretrained diffusion model. No free parameters are explicitly fitted in the abstract; the coupling constructions are presented as design choices.

axioms (1)
  • domain assumption A pretrained diffusion model accepts any initial noise whose marginal distribution is standard Gaussian.
    Stated in the opening paragraph of the abstract as the justification for keeping marginals unchanged.
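
The testable half of this axiom, that a designed coupling leaves each marginal exactly standard Gaussian, admits a quick empirical check; the snippet below assumes the illustrative repulsive_gaussian_noise sketch from earlier on this page.

```python
import numpy as np

# Empirical marginal check for the equicorrelated coupling sketched above
# (assumes repulsive_gaussian_noise from the earlier illustrative block).
n, dim = 4, 3
rho = -1.0 / (n - 1) + 1e-6   # near-maximal repulsion that keeps C PSD
draws = np.stack([repulsive_gaussian_noise(n, dim, rho=rho, seed=s)
                  for s in range(50_000)])     # shape (trials, n, dim)
marginal = draws[:, 0, :].ravel()              # every draw of sample 0
print(marginal.mean(), marginal.var())         # should be ≈ 0 and ≈ 1
```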

pith-pipeline@v0.9.0 · 5523 in / 1298 out tokens · 35581 ms · 2026-05-13T02:10:00.450143+00:00 · methodology

