Image-Guided Geometric Stylization of 3D Meshes
Pith reviewed 2026-05-10 17:27 UTC · model grok-4.3
The pith
A coarse-to-fine pipeline deforms 3D meshes to express the geometric style of an image while retaining the original topology and part semantics.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We propose a geometric stylization framework that deforms a 3D mesh, allowing it to express the style of an image. While style is inherently ambiguous, we utilize pre-trained diffusion models to extract an abstract representation of the provided image. Our coarse-to-fine stylization pipeline can drastically deform the input 3D model to express a diverse range of geometric variations while retaining the valid topology of the original mesh and part-level semantics. We also propose an approximate VAE encoder that provides efficient and reliable gradients from mesh renderings. Extensive experiments demonstrate that our method can create stylized 3D meshes that reflect unique geometric features.
What carries the argument
The coarse-to-fine stylization pipeline that uses abstract representations extracted from diffusion models and is driven by gradients from an approximate VAE encoder on mesh renderings.
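The optimization loop implied here can be caricatured numerically. The sketch below is a toy under loud assumptions: orthonormal linear maps stand in for the differentiable renderer and the approximate VAE encoder, a diagonal scaling supplies mild ill-conditioning, and a two-stage step-size schedule stands in for the coarse-to-fine schedule. Nothing here reproduces the paper's actual components.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins (assumptions, not the paper's components): an orthonormal
# linear map plays the "renderer", another plays the "VAE encoder", and a
# diagonal scaling makes the composed problem mildly ill-conditioned.
n_verts, d_img, d_lat = 12, 8, 4
R = np.linalg.qr(rng.normal(size=(3 * n_verts, d_img)))[0].T  # "renderer"
E = np.linalg.qr(rng.normal(size=(d_img, d_lat)))[0].T        # "encoder"
A = np.diag([1.0, 0.5, 0.25, 0.125]) @ E @ R  # vertices -> latent

def loss_and_grad(v, target):
    r = A @ v - target  # residual to the image-derived target latent
    return 0.5 * float(r @ r), A.T @ r

v = rng.normal(size=3 * n_verts)   # flattened vertex coordinates
target = rng.normal(size=d_lat)    # abstract "style" target from the image
init_loss, _ = loss_and_grad(v, target)

# Coarse-to-fine caricature: large steps first, then smaller refinement.
for lr, steps in [(1.0, 200), (0.2, 200)]:
    for _ in range(steps):
        _, g = loss_and_grad(v, target)
        v -= lr * g

final_loss, _ = loss_and_grad(v, target)
```

Because the composed map is linear with full row rank, plain gradient descent drives the latent residual close to zero; the real pipeline faces a nonconvex problem over mesh vertices, which is exactly why the quality of the encoder's gradients matters.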
If this is right
- Stylized 3D meshes can reflect unique geometric features of the pictured assets, such as expressive poses and silhouettes.
- The method retains the valid topology of the original mesh and part-level semantics after deformation.
- Bold geometric distortions are possible that go beyond existing data distributions.
- The approach supports the creation of distinctive artistic 3D creations from image references.
Where Pith is reading between the lines
- This could allow artists to generate varied 3D asset libraries quickly from 2D reference sketches or photos.
- The method might extend naturally to consistent stylization across multiple viewpoints or short animation sequences.
- One could test generalization by applying the pipeline to scanned real-world meshes with noisy geometry.
- Integration with existing 3D editing tools might give users finer control over which parts receive the strongest style influence.
Load-bearing premise
The abstract representation extracted from the image by pre-trained diffusion models corresponds to transferable geometric features for 3D mesh deformation, and the approximate VAE encoder provides reliable gradients from renderings.
What would settle it
Render the stylized output mesh and check whether its silhouette and pose match the reference image; separately, verify that the mesh topology remains valid and that part-level semantics are preserved. Failure on either count would undermine the core claim.
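This test can be made concrete with two small checks: a silhouette intersection-over-union between rendered and reference masks, and a necessary (not sufficient) topology check that every undirected edge of the triangle mesh is shared by exactly two faces. The sketch below is generic numpy; the paper does not specify its validity metrics, and a full audit would also need self-intersection and manifoldness tests from a mesh library.

```python
import numpy as np

def silhouette_iou(mask_a, mask_b):
    """Intersection-over-union of two boolean silhouette masks."""
    a, b = mask_a.astype(bool), mask_b.astype(bool)
    union = np.logical_or(a, b).sum()
    return float(np.logical_and(a, b).sum() / union) if union else 1.0

def is_closed_surface(faces):
    """Necessary topology check: in a closed triangle mesh, every
    undirected edge appears in exactly two faces."""
    counts = {}
    for f in faces:
        for i in range(3):
            e = tuple(sorted((f[i], f[(i + 1) % 3])))
            counts[e] = counts.get(e, 0) + 1
    return all(c == 2 for c in counts.values())

# A tetrahedron is closed; removing one face opens a boundary.
tet = [(0, 1, 2), (0, 3, 1), (1, 3, 2), (2, 3, 0)]
```

A deformation that preserves connectivity (as vertex displacement does) cannot change this edge count, which is why the interesting failure modes are self-intersections rather than torn topology.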
Original abstract
Recent generative models can create visually plausible 3D representations of objects. However, the generation process often allows for implicit control signals, such as contextual descriptions, and rarely supports bold geometric distortions beyond existing data distributions. We propose a geometric stylization framework that deforms a 3D mesh, allowing it to express the style of an image. While style is inherently ambiguous, we utilize pre-trained diffusion models to extract an abstract representation of the provided image. Our coarse-to-fine stylization pipeline can drastically deform the input 3D model to express a diverse range of geometric variations while retaining the valid topology of the original mesh and part-level semantics. We also propose an approximate VAE encoder that provides efficient and reliable gradients from mesh renderings. Extensive experiments demonstrate that our method can create stylized 3D meshes that reflect unique geometric features of the pictured assets, such as expressive poses and silhouettes, thereby supporting the creation of distinctive artistic 3D creations. Project page: https://changwoonchoi.github.io/GeoStyle
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes an image-guided geometric stylization method for 3D meshes. It extracts an abstract representation from a guiding image via pre-trained diffusion models, then applies a coarse-to-fine optimization pipeline to deform an input mesh so that it expresses the image's geometric style (e.g., expressive poses and silhouettes). An approximate VAE encoder is introduced to supply efficient gradients from differentiable renderings. The method claims to produce large deformations while preserving mesh topology, manifold validity, and part-level semantics, with extensive experiments demonstrating distinctive artistic 3D outputs.
Significance. If the central claims hold, the work would offer a practical bridge between 2D generative models and controllable 3D geometry editing, enabling new artistic workflows. The coarse-to-fine strategy and approximate VAE gradient mechanism address real computational challenges in mesh optimization. However, the significance is tempered by the unproven assumption that diffusion latents supply disentangled geometric signals transferable to 3D vertex displacements; without strong evidence against appearance/viewpoint entanglement or gradient instability, the practical utility remains uncertain.
major comments (2)
- [Abstract / Method Overview] The central claim (abstract) that diffusion-derived representations enable 'drastic' yet topology-preserving deformations rests on the untested premise that these latents encode transferable 3D geometric features (poses, silhouettes) rather than entangled 2D appearance and viewpoint cues. The manuscript should add an explicit analysis or ablation (e.g., in the method or experiments section) quantifying how much of the deformation is driven by shape versus texture signals, together with metrics for self-intersection rate and semantic part consistency under large deformations.
- [Approximate VAE Encoder / Optimization] The approximate VAE encoder (abstract) is presented as providing 'efficient and reliable gradients from mesh renderings,' yet the skeptic correctly flags that any bias or variance in this approximation could compound across the coarse stage and produce invalid meshes precisely when deformations are largest. The paper must report quantitative validation of gradient accuracy (e.g., comparison to exact rendering gradients or error bounds) and demonstrate stability on examples with extreme pose changes.
minor comments (2)
- [Experiments] The abstract appeals to 'extensive experiments', but the manuscript would benefit from clearer reporting of failure cases, quantitative baselines, and ablation tables that isolate the contribution of the coarse-to-fine schedule versus the VAE approximation.
- [Method] Notation for the diffusion latent extraction and the VAE approximation should be introduced with explicit equations early in the method section to improve readability for readers unfamiliar with the specific pre-trained models used.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback. We address each major comment below and have revised the manuscript to incorporate additional analyses and validations as suggested.
Point-by-point responses
-
Referee: [Abstract / Method Overview] The central claim (abstract) that diffusion-derived representations enable 'drastic' yet topology-preserving deformations rests on the untested premise that these latents encode transferable 3D geometric features (poses, silhouettes) rather than entangled 2D appearance and viewpoint cues. The manuscript should add an explicit analysis or ablation (e.g., in the method or experiments section) quantifying how much of the deformation is driven by shape versus texture signals, together with metrics for self-intersection rate and semantic part consistency under large deformations.
Authors: We agree that an explicit ablation would strengthen the central claim. Our optimization objective is formulated to align rendered geometric properties (silhouettes and poses) with diffusion-derived features, which prior work has shown to capture structural information. To directly address potential entanglement, we have added an ablation in the revised experiments section comparing stylization results when using diffusion features from the original image versus texture-ablated inputs (grayscale and edge-extracted versions). We also report quantitative metrics: self-intersection rates computed via standard mesh intersection algorithms and semantic part consistency measured using a pre-trained segmentation network, specifically for large-deformation cases. These results indicate that geometric cues dominate the deformations. revision: yes
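The texture-ablated inputs mentioned in this response can be sketched at the preprocessing level. The functions below are plain numpy implementations of the two named variants (luma-weighted grayscale and a Sobel edge magnitude), offered only as an illustration of the input ablation; the diffusion feature extraction itself is the paper's and is not reproduced here.

```python
import numpy as np

def to_grayscale(img):
    """Luma-weighted grayscale: strips color (texture) cues from HxWx3."""
    return img[..., :3] @ np.array([0.299, 0.587, 0.114])

def sobel_edges(gray):
    """Sobel gradient magnitude: keeps contours, discards flat shading."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T
    h, w = gray.shape
    padded = np.pad(gray, 1, mode="edge")
    out = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            patch = padded[i:i + 3, j:j + 3]
            out[i, j] = np.hypot((patch * kx).sum(), (patch * ky).sum())
    return out
```

If stylization results stay stable across the original, grayscale, and edge-only inputs, that supports the claim that geometric rather than appearance cues drive the deformation.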
-
Referee: [Approximate VAE Encoder / Optimization] The approximate VAE encoder (abstract) is presented as providing 'efficient and reliable gradients from mesh renderings,' yet the skeptic correctly flags that any bias or variance in this approximation could compound across the coarse stage and produce invalid meshes precisely when deformations are largest. The paper must report quantitative validation of gradient accuracy (e.g., comparison to exact rendering gradients or error bounds) and demonstrate stability on examples with extreme pose changes.
Authors: We acknowledge the importance of validating the approximate VAE encoder's gradient reliability, particularly for large deformations. In the revised manuscript, we have added a quantitative comparison in the method and experiments sections: we compute the mean squared error between approximate gradients and exact differentiable rendering gradients over a set of test meshes with varying deformation magnitudes, providing explicit error bounds. We further demonstrate stability by including additional examples with extreme pose changes and report aggregate statistics on mesh validity (self-intersection rates and manifold checks) across the full experimental suite, showing no increase in invalid outputs during the coarse stage. revision: yes
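One standard way to make the requested gradient validation concrete is to compare an approximate gradient against a finite-difference reference via MSE and cosine similarity. The sketch below is generic numpy, not the paper's protocol; `gradient_fidelity` and `finite_diff_grad` are hypothetical helper names introduced here for illustration.

```python
import numpy as np

def gradient_fidelity(g_approx, g_ref):
    """MSE and cosine similarity between two flattened gradient fields."""
    a, r = np.ravel(g_approx), np.ravel(g_ref)
    mse = float(np.mean((a - r) ** 2))
    cos = float(a @ r / (np.linalg.norm(a) * np.linalg.norm(r)))
    return mse, cos

def finite_diff_grad(loss_fn, x, eps=1e-5):
    """Central-difference reference gradient of a scalar loss."""
    g = np.zeros_like(x, dtype=float)
    for i in range(x.size):
        d = np.zeros_like(x, dtype=float)
        d.flat[i] = eps
        g.flat[i] = (loss_fn(x + d) - loss_fn(x - d)) / (2 * eps)
    return g

# Sanity check on a loss with a known gradient: grad of 0.5*||x||^2 is x.
x = np.array([0.3, -1.2, 2.0])
mse, cos = gradient_fidelity(x, finite_diff_grad(lambda v: 0.5 * float(v @ v), x))
```

Finite differences are too expensive for full renderings, so in practice one would subsample vertices or compare against exact automatic-differentiation gradients instead; the metrics stay the same.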
Circularity Check
No circularity; derivation relies on external pre-trained models
full rationale
The paper proposes a coarse-to-fine stylization pipeline that deforms input 3D meshes using abstract representations extracted from images via pre-trained diffusion models, plus a new approximate VAE encoder to obtain gradients from renderings. No load-bearing steps reduce by construction to self-definitions, fitted inputs renamed as predictions, or self-citation chains. The central claims of drastic geometric variation while preserving topology and semantics are supported by the proposed components and external models without circular reduction to the paper's own inputs or fitted parameters.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: Pre-trained diffusion models can extract an abstract representation of the provided image suitable for geometric stylization of 3D meshes.
Reference graph
Works this paper leans on
- [1] Noam Aigerman, Kunal Gupta, Vladimir G. Kim, Siddhartha Chaudhuri, Jun Saito, and Thibault Groueix. Neural jacobian fields: Learning intrinsic mappings of arbitrary meshes. SIGGRAPH, 2022.
- [2] Jie An, Siyu Huang, Yibing Song, Dejing Dou, Wei Liu, and Jiebo Luo. Artflow: Unbiased image style transfer via reversible neural flows. In CVPR, 2021.
- [3] DeepFloyd Lab at StabilityAI. DeepFloyd IF: a novel state-of-the-art open-source text-to-image model with a high degree of photorealism and language understanding. https://www.deepfloyd.ai/deepfloyd-if, 2023.
- [4] Tian Qi Chen and Mark Schmidt. Fast patch-based style transfer of arbitrary style. arXiv:1612.04337, 2016.
- [5] Pei-Ze Chiang, Meng-Shiun Tsai, Hung-Yu Tseng, Wei-Sheng Lai, and Wei-Chen Chiu. Stylizing 3d scene via implicit representation and hypernetwork. In WACV, 2022.
- [6] Nam Anh Dinh, Itai Lang, Hyunwoo Kim, Oded Stein, and Rana Hanocka. Geometry in style: 3d stylization via surface normal deformation. In CVPR, 2025.
- [7] Rinon Gal, Yuval Alaluf, Yuval Atzmon, Or Patashnik, Amit H. Bermano, Gal Chechik, and Daniel Cohen-Or. An image is worth one word: Personalizing text-to-image generation using textual inversion. arXiv:2208.01618, 2022.
- [8] William Gao, Noam Aigerman, Thibault Groueix, Vova Kim, and Rana Hanocka. Textdeformer: Geometry manipulation using text guidance. In SIGGRAPH, 2023.
- [9] Leon A. Gatys, Alexander S. Ecker, and Matthias Bethge. Image style transfer using convolutional neural networks. In CVPR, 2016.
- [10] Guilherme Gomes Haetinger, Jingwei Tang, Raphael Ortiz, Paul Kanyuk, and Vinicius Azevedo. Controllable neural style transfer for dynamic meshes. In SIGGRAPH, 2024.
- [11] Shuyang Gu, Congliang Chen, Jing Liao, and Lu Yuan. Arbitrary style transfer with deep feature reshuffle. In CVPR.
- [12] Amir Hertz, Rana Hanocka, Raja Giryes, and Daniel Cohen-Or. Deep geometric texture synthesis. ACM TOG, 2020.
- [13] Martin Heusel, Hubert Ramsauer, Thomas Unterthiner, Bernhard Nessler, and Sepp Hochreiter. Gans trained by a two time-scale update rule converge to a local nash equilibrium. NIPS, 2017.
- [14] Jonathan Ho, Ajay Jain, and Pieter Abbeel. Denoising diffusion probabilistic models. NeurIPS, 2020.
- [15] Lukas Höllein, Justin Johnson, and Matthias Nießner. Stylemesh: Style transfer for indoor 3d scene reconstructions. In CVPR, 2022.
- [16] Edward J. Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, Weizhu Chen, et al. Lora: Low-rank adaptation of large language models. ICLR.
- [17] Hsin-Ping Huang, Hung-Yu Tseng, Saurabh Saini, Maneesh Singh, and Ming-Hsuan Yang. Learning to stylize novel views. In ICCV, 2021.
- [18] Xun Huang and Serge Belongie. Arbitrary style transfer in real-time with adaptive instance normalization. In ICCV.
- [19] Hyunyoung Jung, Seonghyeon Nam, Nikolaos Sarafianos, Sungjoo Yoo, Alexander Sorkine-Hornung, and Rakesh Ranjan. Geometry transfer for stylizing radiance fields. In CVPR.
- [20] Hiroharu Kato, Yoshitaka Ushiku, and Tatsuya Harada. Neural 3d mesh renderer. In CVPR, 2018.
- [21] Hyunwoo Kim, Itai Lang, Noam Aigerman, Thibault Groueix, Vladimir G. Kim, and Rana Hanocka. Meshup: Multi-target mesh deformation via blended score distillation. In 3DV, 2025.
- [22] Diederik P. Kingma and Max Welling. Auto-encoding variational bayes. arXiv:1312.6114, 2013.
- [23] Maximilian Kohlbrenner, Ugo Finnendahl, Tobias Djuren, and Marc Alexa. Gauss Stylization: Interactive Artistic Mesh Modeling based on Preferred Surface Normals. Computer Graphics Forum, 2021.
- [24] Nicholas Kolkin, Jason Salavon, and Gregory Shakhnarovich. Style transfer by relaxed optimal transport and self-similarity. In CVPR, 2019.
- [25] Nicholas Kolkin, Michal Kucera, Sylvain Paris, Daniel Sykora, Eli Shechtman, and Greg Shakhnarovich. Neural neighbor style transfer. arXiv:2203.13215, 2022.
- [26] Dmytro Kotovenko, Artsiom Sanakoyeu, Sabine Lang, and Bjorn Ommer. Content and style disentanglement for artistic style transfer. In ICCV, 2019.
- [27] Dmytro Kotovenko, Artsiom Sanakoyeu, Pingchuan Ma, Sabine Lang, and Bjorn Ommer. A content transformation block for image style transfer. In CVPR, 2019.
- [28] Dmytro Kotovenko, Matthias Wright, Arthur Heimbrecht, and Bjorn Ommer. Rethinking style transfer: From pixels to parameterized brushstrokes. In CVPR, 2021.
- [29] Samuli Laine, Janne Hellsten, Tero Karras, Yeongho Seol, Jaakko Lehtinen, and Timo Aila. Modular primitives for high-performance differentiable rendering. ACM TOG.
- [30] Chuan Li and Michael Wand. Combining markov random fields and convolutional neural networks for image synthesis. In CVPR, 2016.
- [31] Yijun Li, Chen Fang, Jimei Yang, Zhaowen Wang, Xin Lu, and Ming-Hsuan Yang. Universal style transfer via feature transforms. NIPS, 2017.
- [32] Jing Liao, Yuan Yao, Lu Yuan, Gang Hua, and Sing Bing Kang. Visual attribute transfer through deep image analogy. arXiv:1705.01088, 2017.
- [33] Hsueh-Ti Derek Liu and Alec Jacobson. Cubic stylization. ACM TOG, 2019.
- [34] Hsueh-Ti Derek Liu and Alec Jacobson. Normal-driven spherical shape analogies. In Computer Graphics Forum.
- [35] Hsueh-Ti Derek Liu, Michael Tao, and Alec Jacobson. Paparazzi: surface editing by way of multi-view image processing. ACM TOG, 2018.
- [36] Kunhao Liu, Fangneng Zhan, Muyu Xu, Christian Theobalt, Ling Shao, and Shijian Lu. Stylegaussian: Instant 3d style transfer with gaussian splatting. In SIGGRAPH Asia Technical Communications, 2024.
- [37] Minghua Liu, Mikaela Angelina Uy, Donglai Xiang, Hao Su, Sanja Fidler, Nicholas Sharp, and Jun Gao. Partfield: Learning 3d feature fields for part segmentation and beyond. ICCV, 2025.
- [38] Xiao-Chang Liu, Xuan-Yi Li, Ming-Ming Cheng, and Peter Hall. Geometric style transfer. arXiv:2007.05471, 2020.
- [39] Roey Mechrez, Itamar Talmi, and Lihi Zelnik-Manor. The contextual loss for image transformation with non-aligned data. In ECCV, 2018.
- [40] Gal Metzer, Elad Richardson, Or Patashnik, Raja Giryes, and Daniel Cohen-Or. Latent-nerf for shape-guided generation of 3d shapes and textures. In CVPR, 2023.
- [41] Oscar Michel, Roi Bar-On, Richard Liu, Sagie Benaim, and Rana Hanocka. Text2mesh: Text-driven neural stylization for meshes. In CVPR, 2022.
- [42] Fangzhou Mu, Jian Wang, Yicheng Wu, and Yin Li. 3d photo stylization: Learning to generate stylized novel views from a single image. In CVPR, 2022.
- [43] Hong-Wing Pang, Binh-Son Hua, and Sai-Kit Yeung. Locally stylized neural radiance fields. In ICCV, 2023.
- [44] Dae Young Park and Kwang Hee Lee. Arbitrary style transfer with style-attentional networks. In CVPR, 2019.
- [45] Dustin Podell, Zion English, Kyle Lacey, Andreas Blattmann, Tim Dockhorn, Jonas Müller, Joe Penna, and Robin Rombach. Sdxl: Improving latent diffusion models for high-resolution image synthesis. arXiv:2307.01952.
- [46] Ben Poole, Ajay Jain, Jonathan T. Barron, and Ben Mildenhall. Dreamfusion: Text-to-3d using 2d diffusion. ICLR.
- [47] Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, et al. Learning transferable visual models from natural language supervision. In ICML, 2021.
- [48] Eric Risser, Pierre Wilmot, and Connelly Barnes. Stable and controllable neural texture synthesis and style transfer using histogram losses. arXiv:1701.08893, 2017.
- [49] Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, and Björn Ommer. High-resolution image synthesis with latent diffusion models. In CVPR, 2022.
- [50] Olaf Ronneberger, Philipp Fischer, and Thomas Brox. U-net: Convolutional networks for biomedical image segmentation. In International Conference on Medical Image Computing and Computer-Assisted Intervention, 2015.
- [51] Nataniel Ruiz, Yuanzhen Li, Varun Jampani, Yael Pritch, Michael Rubinstein, and Kfir Aberman. Dreambooth: Fine tuning text-to-image diffusion models for subject-driven generation. In CVPR, 2023.
- [52] Lu Sheng, Ziyi Lin, Jing Shao, and Xiaogang Wang. Avatar-net: Multi-scale zero-shot style transfer by feature decoration. In CVPR, 2018.
- [53] K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. In ICLR, 2015.
- [54] Jiaming Song, Chenlin Meng, and Stefano Ermon. Denoising diffusion implicit models. ICLR, 2021.
- [55] Jordan Yaniv, Yael Newman, and Ariel Shamir. The face of art: landmark detection and geometric style in portraits. ACM TOG, 2019.
- [56] Kangxue Yin, Jun Gao, Maria Shugrina, Sameh Khamis, and Sanja Fidler. 3dstylenet: Creating 3d shapes with geometric and texture style variations. In ICCV, 2021.
- [57] Dingxi Zhang, Yu-Jie Yuan, Zhuoxun Chen, Fang-Lue Zhang, Zhenliang He, Shiguang Shan, and Lin Gao. Stylizedgs: Controllable stylization for 3d gaussian splatting. TPAMI, 2025.
- [58] Kai Zhang, Nick Kolkin, Sai Bi, Fujun Luan, Zexiang Xu, Eli Shechtman, and Noah Snavely. Arf: Artistic radiance fields. In ECCV, 2022.
- [59] Yuechen Zhang, Zexin He, Jinbo Xing, Xufeng Yao, and Jiaya Jia. Ref-npr: Reference-based non-photorealistic radiance fields for controllable scene stylization. In CVPR, 2023.