arxiv: 2604.03448 · v1 · submitted 2026-04-03 · 💻 cs.CV · cs.AI· cs.HC· cs.LG

Recognition: 2 theorem links

· Lean Theorem

ExpressEdit: Fast Editing of Stylized Facial Expressions with Diffusion Models in Photoshop

Kenan Tang , Jiasheng Guo , Jeffrey Lin , Yao Qin

Authors on Pith no claims yet

Pith reviewed 2026-05-13 19:48 UTC · model grok-4.3

classification 💻 cs.CV cs.AIcs.HCcs.LG

keywords facial expression editingdiffusion modelsPhotoshop pluginstylized image editingartifact-free editsexpression databaseAI art tools

0 comments

The pith

ExpressEdit is a Photoshop plugin that edits stylized facial expressions in three seconds without artifacts using diffusion models.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents ExpressEdit, an open-source plugin for Photoshop that allows artists to quickly change facial expressions in stylized images. Current AI editing tools often add noise and shift pixels, making them unusable in professional software. ExpressEdit avoids these issues and works seamlessly with tools like Liquify. It runs in under three seconds on regular GPUs and includes a large database of expressions for varied storytelling needs. By open-sourcing the code and data, the authors aim to support more artistic applications.

Core claim

ExpressEdit, a fully open-source Photoshop plugin, edits stylized facial expressions without common artifacts like global noise and pixel drift. It performs edits in 3 seconds on consumer hardware and works with tools such as Liquify. A database of 135 expression tags with example stories supports diverse generation needs.

What carries the argument

A fine-tuned diffusion model packaged as a Photoshop plugin that performs localized expression edits without introducing global noise or pixel drift.

Load-bearing premise

The diffusion model can be adapted and fine-tuned to produce edits that introduce neither global noise nor pixel drift when operating inside the Photoshop environment and when combined with native tools.

What would settle it

A side-by-side comparison of original and edited images showing pixel-level drift or added noise after using the plugin followed by Liquify or other native Photoshop operations.

Figures

Figures reproduced from arXiv: 2604.03448 by Jeffrey Lin, Jiasheng Guo, Kenan Tang, Yao Qin.

**Figure 1.** Figure 1: ExpressEdit can generate diverse, stylized expressions on an original image. The first row shows the original image, and the second and third rows show the edited images, with the user-specified expression above each image. ExpressEdit can handle both detailed multi-word descriptions (such as “clenched teeth”) and emoticons (such as “@ @”), generating stylized depictions of expressions with ease. Abstract … view at source ↗

**Figure 2.** Figure 2: ExpressEdit consists of two consecutive pipelines for a user-friendly yet professional editing experience. The prompt generation pipeline takes in a story paragraph and uses a VLM to retrieve relevant expression tags from a multi-modal expression database we curate. The relevant expression tags are inserted into a prompt, which is used in the image editing pipeline. The image editing pipeline starts by the… view at source ↗

**Figure 3.** Figure 3: Creative short stories are included in ExpressEdit to assist retrieval-augmented generation (RAG). The prompt template and an example story are shown here. We have a rich database of expression tags, far exceeding the number of common categorization systems [24, 25]. We also include emoticons [21, 42, 78], which vividly correspond to stylized expressions. Well-grounded in existing Danbooru and Pixiv tag… view at source ↗

**Figure 4.** Figure 4: Baseline methods introduce destructive noise into the original image after each editing step, whereas the pixel changes from ExpressEdit are non-destructive, and the minor artifacts are easily repaired. Please see Section 3.2 for details. which further degrades image quality beyond the user’s control. Even in a stress-testing case where selections strictly overlap over 100 steps, ExpressEdit only accumula… view at source ↗

**Figure 5.** Figure 5: The selection version of Nano Banana Pro in Photoshop and the naive inpainting both create visible artifacts around the selected region. Nano Banana Pro leaves artifacts around earlobes and the chin, making it impossible to contain the destructive noise via selecting. Naive inpainting also leaves artifacts on the right side of the neck and on the braid, necessitating the use of the SPICE backend. Origina… view at source ↗

**Figure 6.** Figure 6: ExpressEdit can generate detailed expressions even with challenging, high-resolution inputs, whereas Nano Banana 2 Pro generates lower quality faces, despite its support for 2K image generation. Nano Banana 2 Pro generates a face with blurry contours, creates visible artifacts around the red string, lowers the saturation, and arbitrarily changes the shape of the face. 3.3. Responsive and Precise Edits In … view at source ↗

**Figure 8.** Figure 8: ExpressEdit conveniently fixes artifacts from manual editing. The first row shows the results from the Liquify tool in Photoshop, and the second row shows the fixed results. Liquify supports high editing flexibility at the notorious cost of long manual editing time. A casual use of the tool introduces heavy deformations on the white bow or on the iris, but the deformation can be quickly fixed. Original S… view at source ↗

**Figure 9.** Figure 9: Besides expressions, ExpressEdit can fix other details on a character. The correct design of the character includes 4 bows with interleaving blue and yellow colors [72]. Hinted by simple sketches, ExpressEdit fixes the design in a few seconds. ple, complex designs of characters are usually not generated correctly, with artifacts such as incorrect number of accessories or scrambled colors ( [PITH_FULL_IMA… view at source ↗

read the original abstract

Facial expressions of characters are a vital component of visual storytelling. While current AI image editing models hold promise for assisting artists in the task of stylized expression editing, these models introduce global noise and pixel drift into the edited image, preventing the integration of these models into professional image editing software and workflows. To bridge this gap, we introduce ExpressEdit, a fully open-source Photoshop plugin that is free from common artifacts of proprietary image editing models and robustly synergizes with native Photoshop operations such as Liquify. ExpressEdit seamlessly edits an expression within 3 seconds on a single consumer-grade GPU, significantly faster than popular proprietary models. Moreover, to support the generation of diverse expressions according to different narrative needs, we compile a comprehensive expression database of 135 expression tags enriched with example stories and images designed for retrieval-augmented generation. We open source the code and dataset to facilitate future research and artistic exploration.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

ExpressEdit ships a usable open-source Photoshop plugin and expression database for stylized edits, but the no-artifact claims lack the metrics needed to confirm they hold up in real workflows.

read the letter

The main point is a practical Photoshop plugin that runs diffusion-based expression edits in about three seconds on a consumer GPU and claims to avoid the noise and pixel drift that usually blocks integration with native tools like Liquify. They also release a 135-tag expression database with story examples to support retrieval-augmented generation. Both the plugin and the dataset are new artifacts not described in the cited prior work, and opening the code is a clear plus for anyone who wants to try the system themselves.

Referee Report

3 major / 2 minor

Summary. The manuscript introduces ExpressEdit, an open-source Photoshop plugin that adapts diffusion models for editing stylized facial expressions. It claims to perform edits in approximately 3 seconds on consumer GPUs without introducing global noise or pixel drift, to integrate seamlessly with native Photoshop tools such as Liquify, and to release a supporting database of 135 expression tags with narrative examples for retrieval-augmented generation.

Significance. If the central claims are substantiated, the work would provide a practical bridge between diffusion-based image editing and professional artist workflows, with the open release of code and dataset constituting a clear strength for reproducibility and future research.

major comments (3)

[Abstract and §4] Abstract and §4 (Results): the assertions that edits are 'free from common artifacts' and 'robustly synergize' with Liquify are presented without any quantitative support (PSNR, SSIM, LPIPS, masked pixel-difference statistics, or ablation isolating drift under combined native-tool use).
[§3] §3 (Method): the adaptation and fine-tuning procedure that is asserted to eliminate global noise and pixel drift is described at a high level only; no training details, loss terms, or regularization mechanisms are supplied that would allow verification of the claimed output distribution.
[§4 and §5] §4 and §5: no user study, baseline comparison against proprietary models, or error analysis is reported to substantiate the 3-second timing claim or the absence of artifacts relative to existing tools.

minor comments (2)

[Abstract] The abstract states performance numbers but the manuscript should include a dedicated evaluation subsection with tables of metrics.
[§5] Figure captions and the expression database description would benefit from explicit cross-references to the released dataset files.

Simulated Author's Rebuttal

3 responses · 0 unresolved

Thank you for the constructive review and the recommendation for major revision. We value the feedback on strengthening the empirical support and methodological transparency. We address each major comment below and commit to revisions that will incorporate quantitative metrics, expanded training details, and additional evaluations without misrepresenting the current manuscript.

read point-by-point responses

Referee: [Abstract and §4] Abstract and §4 (Results): the assertions that edits are 'free from common artifacts' and 'robustly synergize' with Liquify are presented without any quantitative support (PSNR, SSIM, LPIPS, masked pixel-difference statistics, or ablation isolating drift under combined native-tool use).

Authors: We agree that the claims would be strengthened by quantitative evidence. In the revised manuscript, we will add PSNR, SSIM, LPIPS, and masked pixel-difference statistics computed on a test set of stylized images. We will also include an ablation isolating the effect of native-tool combinations (e.g., Liquify) on drift. These results will be reported in §4 with corresponding figures. revision: yes
Referee: [§3] §3 (Method): the adaptation and fine-tuning procedure that is asserted to eliminate global noise and pixel drift is described at a high level only; no training details, loss terms, or regularization mechanisms are supplied that would allow verification of the claimed output distribution.

Authors: The current §3 provides a high-level overview of the adaptation. We will expand it with concrete training hyperparameters, the full set of loss terms (including any regularization for pixel-level fidelity), and the mechanisms used to constrain the output distribution. This will enable independent verification of how global noise and drift are addressed. revision: yes
Referee: [§4 and §5] §4 and §5: no user study, baseline comparison against proprietary models, or error analysis is reported to substantiate the 3-second timing claim or the absence of artifacts relative to existing tools.

Authors: The 3-second timing is based on wall-clock measurements on consumer GPUs; we will report these with exact hardware specifications and variance in the revision. We will add baseline comparisons against open-source alternatives and an error analysis in §4. A targeted user study with artists will be included in §5 to assess perceived artifacts and workflow integration. Direct comparisons to proprietary models are constrained by API access, but we will leverage publicly reported metrics where available. revision: partial

Circularity Check

0 steps flagged

No significant circularity; engineering artifact with no load-bearing derivations

full rationale

The paper describes a Photoshop plugin and dataset for expression editing. No equations, fitted parameters, or mathematical derivations appear in the provided text. Claims of artifact-free output and synergy with native tools are presented as empirical properties of the implemented system rather than predictions derived from self-referential definitions or self-citations. No self-citation load-bearing steps, uniqueness theorems, or ansatz smuggling are present. The contribution reduces to a software artifact whose performance assertions stand or fall on external testing, not internal tautology.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Because only the abstract is available, the ledger is necessarily incomplete. No explicit free parameters, axioms, or invented entities are stated in the provided text; the central claim rests on the unstated assumption that a diffusion model can be constrained to local edits inside a commercial raster editor.

pith-pipeline@v0.9.0 · 5465 in / 1083 out tokens · 40720 ms · 2026-05-13T19:48:02.786539+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/RealityFromDistinction.lean reality_from_one_distinction unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

ExpressEdit ... diffusion-model-based backend ... SPICE ... Canny edge ControlNet ... Liquify transformation ... 135 expression tags ... retrieval-augmented generation
IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

clean edits without degradation ... L1 distance visualization ... no noise outside the edited region

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

109 extracted references · 109 canonical work pages · 3 internal anchors

[1]

In- teractive exploration and refinement of facial expression us- ing manifold learning

Rinat Abdrashitov, Fanny Chevalier, and Karan Singh. In- teractive exploration and refinement of facial expression us- ing manifold learning. InProceedings of the 33rd Annual ACM Symposium on User Interface Software and Technol- ogy, pages 778–790, 2020. 2

work page 2020
[2]

Choose colors in Photoshop Elements - Adobe Help Center.https : / / helpx

Adobe. Choose colors in Photoshop Elements - Adobe Help Center.https : / / helpx . adobe . com/photoshop- elements/using/choosing- colors.html, 2022. Accessed: 2026-03-14. 8

work page 2022
[3]

Create layers in Photoshop Elements - Adobe Help Center.https : / / helpx

Adobe. Create layers in Photoshop Elements - Adobe Help Center.https : / / helpx . adobe . com/photoshop- elements/using/creating- layers.html, 2022. Accessed: 2026-03-14. 7

work page 2022
[4]

Set up brushes in Photoshop Elements - Adobe Support.https : / / helpx

Adobe. Set up brushes in Photoshop Elements - Adobe Support.https : / / helpx . adobe . com / photoshop - elements / using / setting - brushes.html, 2022. Accessed: 2026-03-14. 8

work page 2022
[5]

Adobe UXP Developer Tool.https : / / developer.adobe.com/photoshop/uxp/2022/ guides/devtool/, 2022

Adobe. Adobe UXP Developer Tool.https : / / developer.adobe.com/photoshop/uxp/2022/ guides/devtool/, 2022. Accessed: 2026-03-14. 4

work page 2022
[6]

Get started with selections - Adobe Sup- port.https://helpx.adobe.com/photoshop/ using / making - selections

Adobe. Get started with selections - Adobe Sup- port.https://helpx.adobe.com/photoshop/ using / making - selections . html, 2023. Ac- cessed: 2026-03-14. 4

work page 2023
[7]

Change color saturation, hue, and vibrance in Photoshop Elements.https://helpx.adobe

Adobe. Change color saturation, hue, and vibrance in Photoshop Elements.https://helpx.adobe. com/photoshop-elements/using/adjusting- color-saturation-hue-vibrance.html, 2024. Accessed: 2026-03-14. 8

work page 2024
[8]

Select with lasso tools in Photoshop - Adobe.https://helpx.adobe.com/photoshop/ using / selecting - lasso - tools

Adobe. Select with lasso tools in Photoshop - Adobe.https://helpx.adobe.com/photoshop/ using / selecting - lasso - tools . html, 2024. Accessed: 2026-03-14. 6

work page 2024
[9]

Adjust scale, rotation, and perspective - Adobe Support.https : / / helpx

Adobe. Adjust scale, rotation, and perspective - Adobe Support.https : / / helpx . adobe . com / photoshop / desktop / crop - resize - transform / transform - manipulate - reshape / adjust - scale - rotation - and - perspective.html, 2026. Accessed: 2026-03-14. 7

work page 2026
[10]

Expand or contract a selection - Adobe

Adobe. Expand or contract a selection - Adobe. https : / / helpx . adobe . com / photoshop / desktop / make - selections / refine - modify - selections / expand - or - contract - selection.html, 2026. Accessed: 2026-03-14. 5

work page 2026
[11]

Adobe. Fringe pixels around a selection - Adobe Sup- port.https://helpx.adobe.com/photoshop/ desktop/make- selections/refine- modify- selections / fringe - pixels - around - a - selection.html, 2026. Accessed: 2026-03-14. 5

work page 2026
[12]

Photoshop Generative Fill: Use AI to Fill in Images |Adobe.https://www.adobe.com/products/ photoshop/generative- fill.html, 2026

Adobe. Photoshop Generative Fill: Use AI to Fill in Images |Adobe.https://www.adobe.com/products/ photoshop/generative- fill.html, 2026. Ac- cessed: 2026-03-14. 2, 6

work page 2026
[13]

Overview of Liquify filter - Adobe Help Center

Adobe. Overview of Liquify filter - Adobe Help Center. https : / / helpx . adobe . com / photoshop / desktop / effects - filters / artistic - stylize - filters / overview - of - liquify - filter.html, 2026. Accessed: 2026-03-14. 4

work page 2026
[14]

How to merge layers in Photoshop - 5 Methods - Adobe.https://www.adobe.com/products/ photoshop/merge-layers.html, 2026

Adobe. How to merge layers in Photoshop - 5 Methods - Adobe.https://www.adobe.com/products/ photoshop/merge-layers.html, 2026. Accessed: 2026-03-14. 4

work page 2026
[15]

Paint a selection with Quick Selection tool - Adobe Support.https : / / helpx

Adobe. Paint a selection with Quick Selection tool - Adobe Support.https : / / helpx . adobe . com/photoshop/desktop/make- selections/ automatic-color-based-selections/paint- a- selection- with- quick- selection- tool. html, 2026. Accessed: 2026-03-14. 5

work page 2026
[16]

Adobe Photoshop on desktop release notes

Adobe. Adobe Photoshop on desktop release notes. https : / / helpx . adobe . com / photoshop / desktop/whats-new/photoshop-on-desktop- release-notes.html, 2026. Accessed: 2026-03-14. 5

work page 2026
[17]

Refine and soften selection edges - Adobe

Adobe. Refine and soften selection edges - Adobe. https : / / helpx . adobe . com / photoshop / desktop/make- selections/refine- modify- selections/refine-and-soften-selection- edges.html, 2026. Accessed: 2026-03-14. 5

work page 2026
[18]

Sample from all visible layers - Adobe Help Center.https : / / helpx

Adobe. Sample from all visible layers - Adobe Help Center.https : / / helpx . adobe . com / photoshop/desktop/create-manage-layers/ get - started - layers / sample - from - all - visible- layers.html, 2026. Accessed: 2026-03-

work page 2026
[19]

Interactive digital photomon- tage

Aseem Agarwala, Mira Dontcheva, Maneesh Agrawala, Steven Drucker, Alex Colburn, Brian Curless, David Salesin, and Michael Cohen. Interactive digital photomon- tage. InACM SIGGRAPH 2004 Papers, pages 294–302

work page 2004
[20]

Plot’n polish: Zero-shot story visualiza- tion and disentangled editing with text-to-image diffusion models.arXiv preprint arXiv:2509.04446, 2025

Kiymet Akdemir, Jing Shi, Kushal Kafle, Brian Price, and Pinar Yanardag. Plot’n polish: Zero-shot story visualiza- tion and disentangled editing with text-to-image diffusion models.arXiv preprint arXiv:2509.04446, 2025. 2

work page arXiv 2025
[21]

An inte- grated review of emoticons in computer-mediated commu- nication.Frontiers in psychology, 7:2061, 2017

Nerea Aldunate and Roberto Gonz ´alez-Ib´a˜nez. An inte- grated review of emoticons in computer-mediated commu- nication.Frontiers in psychology, 7:2061, 2017. 4

work page 2061
[22]

Nasal analysis of classic animated movie villains versus hero counterparts.Facial Plastic Surgery, 37(03):348–353, 2021

Meredith A Allen, Jordyn P Lucas, Michael Chung, Hani M Rayess, and Giancarlo Zuliani. Nasal analysis of classic animated movie villains versus hero counterparts.Facial Plastic Surgery, 37(03):348–353, 2021. 2

work page 2021
[23]

Hapfacs: An open source api/software to generate facs-based expressions for ecas an- imation and for corpus generation

Reza Amini and Christine Lisetti. Hapfacs: An open source api/software to generate facs-based expressions for ecas an- imation and for corpus generation. In2013 Humaine Asso- ciation Conference on Affective Computing and Intelligent Interaction, pages 270–275. IEEE, 2013. 2

work page 2013
[24]

Modeling stylized character expres- sions via deep learning

Deepali Aneja, Alex Colburn, Gary Faigin, Linda Shapiro, and Barbara Mones. Modeling stylized character expres- sions via deep learning. InAsian conference on computer vision, pages 136–153. Springer, 2016. 2, 4

work page 2016
[25]

Learning to gener- ate 3d stylized character expressions from humans

Deepali Aneja, Bindita Chaudhuri, Alex Colburn, Gary Fai- gin, Linda Shapiro, and Barbara Mones. Learning to gener- ate 3d stylized character expressions from humans. In2018 IEEE Winter Conference on Applications of Computer Vi- sion (WACV), pages 160–169. IEEE, 2018. 2, 4 9

work page 2018
[26]

Image Editing AI Leaderboard - Best Mod- els Compared.https://arena.ai/leaderboard/ image-edit, 2026

Arena AI. Image Editing AI Leaderboard - Best Mod- els Compared.https://arena.ai/leaderboard/ image-edit, 2026. Accessed: 2026-03-14. 2, 5

work page 2026
[27]

Identity-motion trade-offs in text-to-video generation

Yuval Atzmon, Rinon Gal, Yoad Tewel, Yoni Kasten, and Gal Chechik. Identity-motion trade-offs in text-to-video generation. In36th British Machine Vision Conference 2025, BMVC 2025, Sheffield, UK, November 24-27, 2025. BMV A, 2025. 2

work page 2025
[28]

GitHub - AUTOMATIC1111/stable- diffusion-webui: Stable Diffusion web UI.https : / / github

AUTOMATIC1111. GitHub - AUTOMATIC1111/stable- diffusion-webui: Stable Diffusion web UI.https : / / github . com / AUTOMATIC1111 / stable - diffusion-webui, 2026. Accessed: 2026-03-14. 4

work page 2026
[29]

Re:verse - can your vlm read a manga? InProceedings of the IEEE/CVF Interna- tional Conference on Computer Vision (ICCV) Workshops, pages 3761–3771, 2025

Aaditya Baranwal, Madhav Kataria, Naitik Agrawal, Yo- gesh S Rawat, and Shruti Vyas. Re:verse - can your vlm read a manga? InProceedings of the IEEE/CVF Interna- tional Conference on Computer Vision (ICCV) Workshops, pages 3761–3771, 2025. 2

work page 2025
[30]

FLUX.2: Frontier Visual Intelligence

Black Forest Labs. FLUX.2: Frontier Visual Intelligence. https://bfl.ai/blog/flux-2, 2025. Accessed: 2026-03-14. 5

work page 2025
[31]

black-forest-labs/FLUX.2-dev - Hug- ging Face.https://huggingface.co/black- forest-labs/FLUX.2-dev, 2026

Black Forest Labs. black-forest-labs/FLUX.2-dev - Hug- ging Face.https://huggingface.co/black- forest-labs/FLUX.2-dev, 2026. Accessed: 2026- 03-14. 5

work page 2026
[32]

Re: draw-context aware translation as a controllable method for artistic production

Jo ˜ao Lib ´orio Cardoso, Francesco Banterle, Paolo Cignoni, and Michael Wimmer. Re: draw-context aware translation as a controllable method for artistic production. InProceed- ings of the Thirty-Third International Joint Conference on Artificial Intelligence, pages 7609–7617, 2024. 2

work page 2024
[33]

Designing ani- mated characters for children of different ages

Elizabeth J Carter, Moshe Mahler, Maryyann Landlord, Kyna McIntosh, and Jessica K Hodgins. Designing ani- mated characters for children of different ages. InProceed- ings of the The 15th International Conference on Interac- tion Design and Children, pages 421–427, 2016. 2

work page 2016
[34]

Deep generation of face images from sketches

Shu-Yu Chen, Wanchao Su, Lin Gao, Shihong Xia, and Hongbo Fu. Deep generation of face images from sketches. arXiv preprint arXiv:2006.01047, 2020. 2

work page arXiv 2006
[35]

Exploring the cor- relation between gaze patterns and facial geometric param- eters: A cross-cultural comparison between real and ani- mated faces.Symmetry, 17(4):528, 2025

Zhi-Lin Chen and Kang-Ming Chang. Exploring the cor- relation between gaze patterns and facial geometric param- eters: A cross-cultural comparison between real and ani- mated faces.Symmetry, 17(4):528, 2025. 2

work page 2025
[36]

Evaluating the effect of outfit on personality perception in virtual characters

Yanbo Cheng and Yingying Wang. Evaluating the effect of outfit on personality perception in virtual characters. In Virtual Worlds, pages 21–39. MDPI, 2024. 8

work page 2024
[37]

Prompting for products: investigating design space exploration strategies for text-to- image generative models.Design Science, 11:e2, 2025

Leah Chong, I-Ping Lo, Jude Rayan, Steven Dow, Faez Ahmed, and Ioanna Lykourentzou. Prompting for products: investigating design space exploration strategies for text-to- image generative models.Design Science, 11:e2, 2025. 2

work page 2025
[38]

SDXL Lightning LoRAs - 8 Steps|Stable Diffu- sion XL LoRA.https://civitai.com/models/ 350450 ? modelVersionId = 391999, 2024

Civitai. SDXL Lightning LoRAs - 8 Steps|Stable Diffu- sion XL LoRA.https://civitai.com/models/ 350450 ? modelVersionId = 391999, 2024. Ac- cessed: 2026-03-14. 8

work page 2024
[39]

Mask Editor - Create and Edit Masks in Com- fyUI - ComfyUI.https : / / docs

ComfyUI. Mask Editor - Create and Edit Masks in Com- fyUI - ComfyUI.https : / / docs . comfy . org / interface/maskeditor, 2026. Accessed: 2026-03-

work page 2026
[40]

Tag Group: Eyes Tags|Danbooru.https: / / danbooru

Danbooru. Tag Group: Eyes Tags|Danbooru.https: / / danbooru . donmai . us / wiki _ pages / tag _ group:eyes_tags, 2026. Accessed: 2026-03-14. 3

work page 2026
[41]

Tag Group: Face Tags|Danbooru.https: / / danbooru

Danbooru. Tag Group: Face Tags|Danbooru.https: / / danbooru . donmai . us / wiki _ pages / tag _ group:face_tags, 2026. Accessed: 2026-03-14. 3

work page 2026
[42]

Emoticons and online message interpretation.Social Sci- ence Computer Review, 26(3):379–388, 2008

Daantje Derks, Arjan ER Bos, and Jasper V on Grumbkow. Emoticons and online message interpretation.Social Sci- ence Computer Review, 26(3):379–388, 2008. 4

work page 2008
[43]

Aether weaver: Multimodal affective nar- rative co-generation with dynamic scene graphs.arXiv preprint arXiv:2507.21893, 2025

Saeed Ghorbani. Aether weaver: Multimodal affective nar- rative co-generation with dynamic scene graphs.arXiv preprint arXiv:2507.21893, 2025. 2

work page arXiv 2025
[44]

Baby schema in infant faces induces cuteness perception and motivation for caretaking in adults.Ethology, 115(3): 257–263, 2009

Melanie L Glocker, Daniel D Langleben, Kosha Ruparel, James W Loughead, Ruben C Gur, and Norbert Sachser. Baby schema in infant faces induces cuteness perception and motivation for caretaking in adults.Ethology, 115(3): 257–263, 2009. 2

work page 2009
[45]

Gemini 3 Flash: frontier intelligence built for speed.https://blog.google/products- and-platforms/products/gemini/gemini-3- flash/, 2025

Google. Gemini 3 Flash: frontier intelligence built for speed.https://blog.google/products- and-platforms/products/gemini/gemini-3- flash/, 2025. Accessed: 2026-03-14. 4

work page 2025
[46]

Nano Banana 2: Combining Pro capabilities with lightning-fast speed.https://blog.google/ innovation - and - ai / technology / ai / nano - banana-2/, 2026

Google. Nano Banana 2: Combining Pro capabilities with lightning-fast speed.https://blog.google/ innovation - and - ai / technology / ai / nano - banana-2/, 2026. Accessed: 2026-03-14. 2, 5

work page 2026
[47]

SynthID - Google DeepMind.https: / / deepmind

Google DeepMind. SynthID - Google DeepMind.https: / / deepmind . google / models / synthid/, 2026. Accessed: 2026-03-14. 2, 5

work page 2026
[48]

Synthid-image: Image watermarking at internet scale.arXiv preprint arXiv:2510.09263, 2025

Sven Gowal, Rudy Bunel, Florian Stimberg, David Stutz, Guillermo Ortiz-Jimenez, Christina Kouridi, Mel Vecerik, Jamie Hayes, Sylvestre-Alvise Rebuffi, Paul Bernard, et al. Synthid-image: Image watermarking at internet scale. arXiv preprint arXiv:2510.09263, 2025. 2, 5

work page arXiv 2025
[49]

ImageEditor - Gradio Docs.https://www

Gradio. ImageEditor - Gradio Docs.https://www. gradio.app/docs/gradio/imageeditor, 2026. Accessed: 2026-03-14. 4

work page 2026
[50]

Dress is a fundamental com- ponent of person perception.Personality and Social Psy- chology Review, 27(4):414–433, 2023

Neil Hester and Eric Hehman. Dress is a fundamental com- ponent of person perception.Personality and Social Psy- chology Review, 27(4):414–433, 2023. 8

work page 2023
[51]

Flaw or artifact? rethinking prompt sensitivity in evaluating LLMs

Andong Hua, Kenan Tang, Chenhe Gu, Jindong Gu, Eric Wong, and Yao Qin. Flaw or artifact? rethinking prompt sensitivity in evaluating LLMs. InProceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 19889–19899, Suzhou, China, 2025. As- sociation for Computational Linguistics. 4

work page 2025
[52]

Promptnavi: Text-to-image generation through interactive prompt visual exploration

Bofei Huang and Haoran Xie. Promptnavi: Text-to-image generation through interactive prompt visual exploration. Computers & Graphics, page 104417, 2025. 2

work page 2025
[53]

T2i-compbench: A comprehensive bench- mark for open-world compositional text-to-image genera- tion.Advances in Neural Information Processing Systems, 36:78723–78747, 2023

Kaiyi Huang, Kaiyue Sun, Enze Xie, Zhenguo Li, and Xihui Liu. T2i-compbench: A comprehensive bench- mark for open-world compositional text-to-image genera- tion.Advances in Neural Information Processing Systems, 36:78723–78747, 2023. 7

work page 2023
[54]

Adaptivesliders: User-aligned semantic slider-based editing of text-to-image model output

Rahul Jain, Amit Goel, Koichiro Niinuma, and Aakar Gupta. Adaptivesliders: User-aligned semantic slider-based editing of text-to-image model output. InProceedings of the 2025 CHI Conference on Human Factors in Computing Systems, pages 1–27, 2025. 7 10

work page 2025
[55]

Emojidiff: Advanced facial expression control with high identity preservation in portrait genera- tion

Liangwei Jiang, Ruida Li, Zhifeng Zhang, Shuo Fang, and Chenguang Ma. Emojidiff: Advanced facial expression control with high identity preservation in portrait genera- tion. InProceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pages 328–338, 2026. 2

work page 2026
[56]

Compre- hensive database for facial expression analysis

Takeo Kanade, Jeffrey F Cohn, and Yingli Tian. Compre- hensive database for facial expression analysis. InProceed- ings fourth IEEE international conference on automatic face and gesture recognition (cat. No. PR00580), pages 46–

work page
[57]

Is clip ideal? no

Raphi Kang, Yue Song, Georgia Gkioxari, and Pietro Per- ona. Is clip ideal? no. can we fix it? yes! InProceedings of the IEEE/CVF International Conference on Computer Vi- sion, pages 22436–22446, 2025. 7

work page 2025
[58]

Python Scripting - Krita Manual 5.3.0 documen- tation.https : / / docs

Krita. Python Scripting - Krita Manual 5.3.0 documen- tation.https : / / docs . krita . org / en / user _ manual / python _ scripting . html, 2026. Ac- cessed: 2026-03-14. 5

work page 2026
[59]

Principles of traditional animation applied to 3d computer animation

John Lasseter. Principles of traditional animation applied to 3d computer animation. InSeminal graphics: pioneering efforts that shaped the field, pages 263–272. 1998. 2

work page 1998
[60]

Face poser: Interactive modeling of 3d fa- cial expressions using facial priors.ACM Transactions on Graphics (TOG), 29(1):1–17, 2009

Manfred Lau, Jinxiang Chai, Ying-Qing Xu, and Heung- Yeung Shum. Face poser: Interactive modeling of 3d fa- cial expressions using facial priors.ACM Transactions on Graphics (TOG), 29(1):1–17, 2009. 2

work page 2009
[61]

Okbench: De- mocratizing llm evaluation with fully automated, on- demand, open knowledge benchmarking.arXiv preprint arXiv:2511.08598, 2025

Yanhong Li, Tianyang Xu, Kenan Tang, Karen Livescu, David McAllester, and Jiawei Zhou. Okbench: De- mocratizing llm evaluation with fully automated, on- demand, open knowledge benchmarking.arXiv preprint arXiv:2511.08598, 2025. 2

work page arXiv 2025
[62]

Sdxl- lightning: Progressive adversarial diffusion distillation

Shanchuan Lin, Anran Wang, and Xiao Yang. Sdxl- lightning: Progressive adversarial diffusion distillation. arXiv preprint arXiv:2402.13929, 2024. 8

work page arXiv 2024
[63]

Freedrag: Feature dragging for reliable point-based image editing

Pengyang Ling, Lin Chen, Pan Zhang, Huaian Chen, Yi Jin, and Jinjin Zheng. Freedrag: Feature dragging for reliable point-based image editing. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 6860–6870, 2024. 7

work page 2024
[64]

LittleJelly. Yanami Anna [4 outfits]|Illustrious|Make Heroine Ga Oosugiru! - Illu v1.0|Illustrious LoRA|Civ- itai.https://civitai.com/models/1166558/ yanami - anna - 4 - outfits - or - illustrious - or-make-heroine-ga-oosugiru, 2025. Accessed: 2026-03-14. 8

work page arXiv 2025
[65]

Drag your noise: Interactive point-based editing via diffusion semantic propagation

Haofeng Liu, Chenshu Xu, Yifei Yang, Lihua Zeng, and Shengfeng He. Drag your noise: Interactive point-based editing via diffusion semantic propagation. InProceedings of the IEEE/CVF conference on computer vision and pat- tern recognition, pages 6743–6752, 2024. 7

work page 2024
[66]

Design guidelines for prompt engineering text-to-image generative models

Vivian Liu and Lydia B Chilton. Design guidelines for prompt engineering text-to-image generative models. In Proceedings of the 2022 CHI conference on human factors in computing systems, pages 1–23, 2022. 2

work page 2022
[67]

Magicquill: An intelligent interactive image editing system

Zichen Liu, Yue Yu, Hao Ouyang, Qiuyu Wang, Ka Leong Cheng, Wen Wang, Zhiheng Liu, Qifeng Chen, and Yujun Shen. Magicquill: An intelligent interactive image editing system. InProceedings of the Computer Vision and Pattern Recognition Conference, pages 13072–13082, 2025. 4

work page 2025
[68]

Magicquillv2: Precise and interac- tive image editing with layered visual cues.arXiv preprint arXiv:2512.03046, 2025

Zichen Liu, Yue Yu, Hao Ouyang, Qiuyu Wang, Shuailei Ma, Ka Leong Cheng, Wen Wang, Qingyan Bai, Yuxuan Zhang, Yanhong Zeng, et al. Magicquillv2: Precise and interactive image editing with layered visual cues.arXiv preprint arXiv:2512.03046, 2025. 4

work page arXiv 2025
[69]

lllyasviel. diffusers xl canny mid.safetensors - lllyasviel/sd control collection at main.https : //huggingface.co/lllyasviel/sd_control_ collection / blob / main / diffusers _ xl _ canny_mid.safetensors, 2023. Accessed: 2026- 03-14. 5

work page 2023
[70]

GitHub - lllyasviel/stable-diffusion-webui-forge

lllyasviel. GitHub - lllyasviel/stable-diffusion-webui-forge. https : / / github . com / lllyasviel / stable - diffusion- webui- forge, 2026. Accessed: 2026- 03-14. 4

work page 2026
[71]

Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference

Simian Luo, Yiqin Tan, Longbo Huang, Jian Li, and Hang Zhao. Latent consistency models: Synthesizing high- resolution images with few-step inference.arXiv preprint arXiv:2310.04378, 2023. 8

work page internal anchor Pith review Pith/arXiv arXiv 2023
[72]

Make Heroine ga Oosugiru!

Makeine Support Committee. CHARACTER|TV Anime “Make Heroine ga Oosugiru!” Official Website.https: //makeine- anime.com/character/, 2026. Ac- cessed: 2026-03-14. 8

work page 2026
[73]

A Survey of Context Engineering for Large Language Models

Lingrui Mei, Jiayu Yao, Yuyao Ge, Yiwei Wang, Bao- long Bi, Yujun Cai, Jiazhi Liu, Mingyu Li, Zhong-Zhi Li, Duzhen Zhang, et al. A survey of context engineering for large language models.arXiv preprint arXiv:2507.13334,

work page internal anchor Pith review arXiv
[74]

GitHub - mirabarukaso/character select stand alone app: Char- acter Select Stand Alone App with AI prompt and ComfyUI/WebUI API support for wai-il model

mirabarukaso. GitHub - mirabarukaso/character select stand alone app: Char- acter Select Stand Alone App with AI prompt and ComfyUI/WebUI API support for wai-il model. https : / / github . com / mirabarukaso / character _ select _ stand _ alone _ app, 2026. Accessed: 2026-03-14. 2

work page 2026
[75]

Dynamic prompt optimizing for text- to-image generation

Wenyi Mo, Tianyu Zhang, Yalong Bai, Bing Su, Ji-Rong Wen, and Qing Yang. Dynamic prompt optimizing for text- to-image generation. InProceedings of the IEEE/CVF Con- ference on Computer Vision and Pattern Recognition, pages 26627–26636, 2024. 4

work page 2024
[76]

Dragondiffusion: Enabling drag-style manip- ulation on diffusion models

Chong Mou, Xintao Wang, Jiechong Song, Ying Shan, and Jian Zhang. Dragondiffusion: Enabling drag-style manip- ulation on diffusion models. InThe Twelfth International Conference on Learning Representations, 2024. 7

work page 2024
[77]

The new ChatGPT Images is here.https:// openai.com/index/new-chatgpt-images-is- here/, 2025

OpenAI. The new ChatGPT Images is here.https:// openai.com/index/new-chatgpt-images-is- here/, 2025. Accessed: 2026-03-14. 5

work page 2025
[78]

Emoticon style: Interpreting differences in emoticons across cultures

Jaram Park, Vladimir Barash, Clay Fink, and Meeyoung Cha. Emoticon style: Interpreting differences in emoticons across cultures. InProceedings of the international AAAI conference on web and social media, pages 466–475, 2013. 4

work page 2013
[79]

Illustrious: an open advanced illustration model.arXiv preprint arXiv:2409.19946, 2024

Sang Hyun Park, Jun Young Koh, Junha Lee, Joy Song, Dongha Kim, Hoyeon Moon, Hyunju Lee, and Min Song. Illustrious: an open advanced illustration model.arXiv preprint arXiv:2409.19946, 2024. 3 11

work page arXiv 2024
[80]

pixiv Encyclopedia.https://dic.pixiv

pixiv. pixiv Encyclopedia.https://dic.pixiv. net/en/, 2026. Accessed: 2026-03-14. 3, 4

work page 2026

Showing first 80 references.