pith. machine review for the scientific record. sign in

arxiv: 2604.03448 · v1 · submitted 2026-04-03 · 💻 cs.CV · cs.AI· cs.HC· cs.LG

Recognition: 2 theorem links

· Lean Theorem

ExpressEdit: Fast Editing of Stylized Facial Expressions with Diffusion Models in Photoshop

Authors on Pith no claims yet

Pith reviewed 2026-05-13 19:48 UTC · model grok-4.3

classification 💻 cs.CV cs.AIcs.HCcs.LG
keywords facial expression editingdiffusion modelsPhotoshop pluginstylized image editingartifact-free editsexpression databaseAI art tools
0
0 comments X

The pith

ExpressEdit is a Photoshop plugin that edits stylized facial expressions in three seconds without artifacts using diffusion models.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents ExpressEdit, an open-source plugin for Photoshop that allows artists to quickly change facial expressions in stylized images. Current AI editing tools often add noise and shift pixels, making them unusable in professional software. ExpressEdit avoids these issues and works seamlessly with tools like Liquify. It runs in under three seconds on regular GPUs and includes a large database of expressions for varied storytelling needs. By open-sourcing the code and data, the authors aim to support more artistic applications.

Core claim

ExpressEdit, a fully open-source Photoshop plugin, edits stylized facial expressions without common artifacts like global noise and pixel drift. It performs edits in 3 seconds on consumer hardware and works with tools such as Liquify. A database of 135 expression tags with example stories supports diverse generation needs.

What carries the argument

A fine-tuned diffusion model packaged as a Photoshop plugin that performs localized expression edits without introducing global noise or pixel drift.

Load-bearing premise

The diffusion model can be adapted and fine-tuned to produce edits that introduce neither global noise nor pixel drift when operating inside the Photoshop environment and when combined with native tools.

What would settle it

A side-by-side comparison of original and edited images showing pixel-level drift or added noise after using the plugin followed by Liquify or other native Photoshop operations.

Figures

Figures reproduced from arXiv: 2604.03448 by Jeffrey Lin, Jiasheng Guo, Kenan Tang, Yao Qin.

Figure 1
Figure 1. Figure 1: ExpressEdit can generate diverse, stylized expressions on an original image. The first row shows the original image, and the second and third rows show the edited images, with the user-specified expression above each image. ExpressEdit can handle both detailed multi-word descriptions (such as “clenched teeth”) and emoticons (such as “@ @”), generating stylized depictions of expressions with ease. Abstract … view at source ↗
Figure 2
Figure 2. Figure 2: ExpressEdit consists of two consecutive pipelines for a user-friendly yet professional editing experience. The prompt generation pipeline takes in a story paragraph and uses a VLM to retrieve relevant expression tags from a multi-modal expression database we curate. The relevant expression tags are inserted into a prompt, which is used in the image editing pipeline. The image editing pipeline starts by the… view at source ↗
Figure 3
Figure 3. Figure 3: Creative short stories are included in ExpressEdit to assist retrieval-augmented generation (RAG). The prompt tem￾plate and an example story are shown here. We have a rich database of expression tags, far exceed￾ing the number of common categorization systems [24, 25]. We also include emoticons [21, 42, 78], which vividly cor￾respond to stylized expressions. Well-grounded in existing Danbooru and Pixiv tag… view at source ↗
Figure 4
Figure 4. Figure 4: Baseline methods introduce destructive noise into the original image after each editing step, whereas the pixel changes from ExpressEdit are non-destructive, and the minor artifacts are easily repaired. Please see Section 3.2 for details. which further degrades image quality beyond the user’s con￾trol. Even in a stress-testing case where selections strictly overlap over 100 steps, ExpressEdit only accumula… view at source ↗
Figure 5
Figure 5. Figure 5: The selection version of Nano Banana Pro in Pho￾toshop and the naive inpainting both create visible artifacts around the selected region. Nano Banana Pro leaves artifacts around earlobes and the chin, making it impossible to contain the destructive noise via selecting. Naive inpainting also leaves arti￾facts on the right side of the neck and on the braid, necessitating the use of the SPICE backend. Origina… view at source ↗
Figure 6
Figure 6. Figure 6: ExpressEdit can generate detailed expressions even with challenging, high-resolution inputs, whereas Nano Ba￾nana 2 Pro generates lower quality faces, despite its support for 2K image generation. Nano Banana 2 Pro generates a face with blurry contours, creates visible artifacts around the red string, lowers the saturation, and arbitrarily changes the shape of the face. 3.3. Responsive and Precise Edits In … view at source ↗
Figure 8
Figure 8. Figure 8: ExpressEdit conveniently fixes artifacts from manual editing. The first row shows the results from the Liquify tool in Photoshop, and the second row shows the fixed results. Liquify supports high editing flexibility at the notorious cost of long man￾ual editing time. A casual use of the tool introduces heavy defor￾mations on the white bow or on the iris, but the deformation can be quickly fixed. Original S… view at source ↗
Figure 9
Figure 9. Figure 9: Besides expressions, ExpressEdit can fix other details on a character. The correct design of the character includes 4 bows with interleaving blue and yellow colors [72]. Hinted by simple sketches, ExpressEdit fixes the design in a few seconds. ple, complex designs of characters are usually not gener￾ated correctly, with artifacts such as incorrect number of accessories or scrambled colors ( [PITH_FULL_IMA… view at source ↗
read the original abstract

Facial expressions of characters are a vital component of visual storytelling. While current AI image editing models hold promise for assisting artists in the task of stylized expression editing, these models introduce global noise and pixel drift into the edited image, preventing the integration of these models into professional image editing software and workflows. To bridge this gap, we introduce ExpressEdit, a fully open-source Photoshop plugin that is free from common artifacts of proprietary image editing models and robustly synergizes with native Photoshop operations such as Liquify. ExpressEdit seamlessly edits an expression within 3 seconds on a single consumer-grade GPU, significantly faster than popular proprietary models. Moreover, to support the generation of diverse expressions according to different narrative needs, we compile a comprehensive expression database of 135 expression tags enriched with example stories and images designed for retrieval-augmented generation. We open source the code and dataset to facilitate future research and artistic exploration.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The manuscript introduces ExpressEdit, an open-source Photoshop plugin that adapts diffusion models for editing stylized facial expressions. It claims to perform edits in approximately 3 seconds on consumer GPUs without introducing global noise or pixel drift, to integrate seamlessly with native Photoshop tools such as Liquify, and to release a supporting database of 135 expression tags with narrative examples for retrieval-augmented generation.

Significance. If the central claims are substantiated, the work would provide a practical bridge between diffusion-based image editing and professional artist workflows, with the open release of code and dataset constituting a clear strength for reproducibility and future research.

major comments (3)
  1. [Abstract and §4] Abstract and §4 (Results): the assertions that edits are 'free from common artifacts' and 'robustly synergize' with Liquify are presented without any quantitative support (PSNR, SSIM, LPIPS, masked pixel-difference statistics, or ablation isolating drift under combined native-tool use).
  2. [§3] §3 (Method): the adaptation and fine-tuning procedure that is asserted to eliminate global noise and pixel drift is described at a high level only; no training details, loss terms, or regularization mechanisms are supplied that would allow verification of the claimed output distribution.
  3. [§4 and §5] §4 and §5: no user study, baseline comparison against proprietary models, or error analysis is reported to substantiate the 3-second timing claim or the absence of artifacts relative to existing tools.
minor comments (2)
  1. [Abstract] The abstract states performance numbers but the manuscript should include a dedicated evaluation subsection with tables of metrics.
  2. [§5] Figure captions and the expression database description would benefit from explicit cross-references to the released dataset files.

Simulated Author's Rebuttal

3 responses · 0 unresolved

Thank you for the constructive review and the recommendation for major revision. We value the feedback on strengthening the empirical support and methodological transparency. We address each major comment below and commit to revisions that will incorporate quantitative metrics, expanded training details, and additional evaluations without misrepresenting the current manuscript.

read point-by-point responses
  1. Referee: [Abstract and §4] Abstract and §4 (Results): the assertions that edits are 'free from common artifacts' and 'robustly synergize' with Liquify are presented without any quantitative support (PSNR, SSIM, LPIPS, masked pixel-difference statistics, or ablation isolating drift under combined native-tool use).

    Authors: We agree that the claims would be strengthened by quantitative evidence. In the revised manuscript, we will add PSNR, SSIM, LPIPS, and masked pixel-difference statistics computed on a test set of stylized images. We will also include an ablation isolating the effect of native-tool combinations (e.g., Liquify) on drift. These results will be reported in §4 with corresponding figures. revision: yes

  2. Referee: [§3] §3 (Method): the adaptation and fine-tuning procedure that is asserted to eliminate global noise and pixel drift is described at a high level only; no training details, loss terms, or regularization mechanisms are supplied that would allow verification of the claimed output distribution.

    Authors: The current §3 provides a high-level overview of the adaptation. We will expand it with concrete training hyperparameters, the full set of loss terms (including any regularization for pixel-level fidelity), and the mechanisms used to constrain the output distribution. This will enable independent verification of how global noise and drift are addressed. revision: yes

  3. Referee: [§4 and §5] §4 and §5: no user study, baseline comparison against proprietary models, or error analysis is reported to substantiate the 3-second timing claim or the absence of artifacts relative to existing tools.

    Authors: The 3-second timing is based on wall-clock measurements on consumer GPUs; we will report these with exact hardware specifications and variance in the revision. We will add baseline comparisons against open-source alternatives and an error analysis in §4. A targeted user study with artists will be included in §5 to assess perceived artifacts and workflow integration. Direct comparisons to proprietary models are constrained by API access, but we will leverage publicly reported metrics where available. revision: partial

Circularity Check

0 steps flagged

No significant circularity; engineering artifact with no load-bearing derivations

full rationale

The paper describes a Photoshop plugin and dataset for expression editing. No equations, fitted parameters, or mathematical derivations appear in the provided text. Claims of artifact-free output and synergy with native tools are presented as empirical properties of the implemented system rather than predictions derived from self-referential definitions or self-citations. No self-citation load-bearing steps, uniqueness theorems, or ansatz smuggling are present. The contribution reduces to a software artifact whose performance assertions stand or fall on external testing, not internal tautology.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Because only the abstract is available, the ledger is necessarily incomplete. No explicit free parameters, axioms, or invented entities are stated in the provided text; the central claim rests on the unstated assumption that a diffusion model can be constrained to local edits inside a commercial raster editor.

pith-pipeline@v0.9.0 · 5465 in / 1083 out tokens · 40720 ms · 2026-05-13T19:48:02.786539+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

109 extracted references · 109 canonical work pages · 3 internal anchors

  1. [1]

    In- teractive exploration and refinement of facial expression us- ing manifold learning

    Rinat Abdrashitov, Fanny Chevalier, and Karan Singh. In- teractive exploration and refinement of facial expression us- ing manifold learning. InProceedings of the 33rd Annual ACM Symposium on User Interface Software and Technol- ogy, pages 778–790, 2020. 2

  2. [2]

    Choose colors in Photoshop Elements - Adobe Help Center.https : / / helpx

    Adobe. Choose colors in Photoshop Elements - Adobe Help Center.https : / / helpx . adobe . com/photoshop- elements/using/choosing- colors.html, 2022. Accessed: 2026-03-14. 8

  3. [3]

    Create layers in Photoshop Elements - Adobe Help Center.https : / / helpx

    Adobe. Create layers in Photoshop Elements - Adobe Help Center.https : / / helpx . adobe . com/photoshop- elements/using/creating- layers.html, 2022. Accessed: 2026-03-14. 7

  4. [4]

    Set up brushes in Photoshop Elements - Adobe Support.https : / / helpx

    Adobe. Set up brushes in Photoshop Elements - Adobe Support.https : / / helpx . adobe . com / photoshop - elements / using / setting - brushes.html, 2022. Accessed: 2026-03-14. 8

  5. [5]

    Adobe UXP Developer Tool.https : / / developer.adobe.com/photoshop/uxp/2022/ guides/devtool/, 2022

    Adobe. Adobe UXP Developer Tool.https : / / developer.adobe.com/photoshop/uxp/2022/ guides/devtool/, 2022. Accessed: 2026-03-14. 4

  6. [6]

    Get started with selections - Adobe Sup- port.https://helpx.adobe.com/photoshop/ using / making - selections

    Adobe. Get started with selections - Adobe Sup- port.https://helpx.adobe.com/photoshop/ using / making - selections . html, 2023. Ac- cessed: 2026-03-14. 4

  7. [7]

    Change color saturation, hue, and vibrance in Photoshop Elements.https://helpx.adobe

    Adobe. Change color saturation, hue, and vibrance in Photoshop Elements.https://helpx.adobe. com/photoshop-elements/using/adjusting- color-saturation-hue-vibrance.html, 2024. Accessed: 2026-03-14. 8

  8. [8]

    Select with lasso tools in Photoshop - Adobe.https://helpx.adobe.com/photoshop/ using / selecting - lasso - tools

    Adobe. Select with lasso tools in Photoshop - Adobe.https://helpx.adobe.com/photoshop/ using / selecting - lasso - tools . html, 2024. Accessed: 2026-03-14. 6

  9. [9]

    Adjust scale, rotation, and perspective - Adobe Support.https : / / helpx

    Adobe. Adjust scale, rotation, and perspective - Adobe Support.https : / / helpx . adobe . com / photoshop / desktop / crop - resize - transform / transform - manipulate - reshape / adjust - scale - rotation - and - perspective.html, 2026. Accessed: 2026-03-14. 7

  10. [10]

    Expand or contract a selection - Adobe

    Adobe. Expand or contract a selection - Adobe. https : / / helpx . adobe . com / photoshop / desktop / make - selections / refine - modify - selections / expand - or - contract - selection.html, 2026. Accessed: 2026-03-14. 5

  11. [11]

    Adobe. Fringe pixels around a selection - Adobe Sup- port.https://helpx.adobe.com/photoshop/ desktop/make- selections/refine- modify- selections / fringe - pixels - around - a - selection.html, 2026. Accessed: 2026-03-14. 5

  12. [12]

    Photoshop Generative Fill: Use AI to Fill in Images |Adobe.https://www.adobe.com/products/ photoshop/generative- fill.html, 2026

    Adobe. Photoshop Generative Fill: Use AI to Fill in Images |Adobe.https://www.adobe.com/products/ photoshop/generative- fill.html, 2026. Ac- cessed: 2026-03-14. 2, 6

  13. [13]

    Overview of Liquify filter - Adobe Help Center

    Adobe. Overview of Liquify filter - Adobe Help Center. https : / / helpx . adobe . com / photoshop / desktop / effects - filters / artistic - stylize - filters / overview - of - liquify - filter.html, 2026. Accessed: 2026-03-14. 4

  14. [14]

    How to merge layers in Photoshop - 5 Methods - Adobe.https://www.adobe.com/products/ photoshop/merge-layers.html, 2026

    Adobe. How to merge layers in Photoshop - 5 Methods - Adobe.https://www.adobe.com/products/ photoshop/merge-layers.html, 2026. Accessed: 2026-03-14. 4

  15. [15]

    Paint a selection with Quick Selection tool - Adobe Support.https : / / helpx

    Adobe. Paint a selection with Quick Selection tool - Adobe Support.https : / / helpx . adobe . com/photoshop/desktop/make- selections/ automatic-color-based-selections/paint- a- selection- with- quick- selection- tool. html, 2026. Accessed: 2026-03-14. 5

  16. [16]

    Adobe Photoshop on desktop release notes

    Adobe. Adobe Photoshop on desktop release notes. https : / / helpx . adobe . com / photoshop / desktop/whats-new/photoshop-on-desktop- release-notes.html, 2026. Accessed: 2026-03-14. 5

  17. [17]

    Refine and soften selection edges - Adobe

    Adobe. Refine and soften selection edges - Adobe. https : / / helpx . adobe . com / photoshop / desktop/make- selections/refine- modify- selections/refine-and-soften-selection- edges.html, 2026. Accessed: 2026-03-14. 5

  18. [18]

    Sample from all visible layers - Adobe Help Center.https : / / helpx

    Adobe. Sample from all visible layers - Adobe Help Center.https : / / helpx . adobe . com / photoshop/desktop/create-manage-layers/ get - started - layers / sample - from - all - visible- layers.html, 2026. Accessed: 2026-03-

  19. [19]

    Interactive digital photomon- tage

    Aseem Agarwala, Mira Dontcheva, Maneesh Agrawala, Steven Drucker, Alex Colburn, Brian Curless, David Salesin, and Michael Cohen. Interactive digital photomon- tage. InACM SIGGRAPH 2004 Papers, pages 294–302

  20. [20]

    Plot’n polish: Zero-shot story visualiza- tion and disentangled editing with text-to-image diffusion models.arXiv preprint arXiv:2509.04446, 2025

    Kiymet Akdemir, Jing Shi, Kushal Kafle, Brian Price, and Pinar Yanardag. Plot’n polish: Zero-shot story visualiza- tion and disentangled editing with text-to-image diffusion models.arXiv preprint arXiv:2509.04446, 2025. 2

  21. [21]

    An inte- grated review of emoticons in computer-mediated commu- nication.Frontiers in psychology, 7:2061, 2017

    Nerea Aldunate and Roberto Gonz ´alez-Ib´a˜nez. An inte- grated review of emoticons in computer-mediated commu- nication.Frontiers in psychology, 7:2061, 2017. 4

  22. [22]

    Nasal analysis of classic animated movie villains versus hero counterparts.Facial Plastic Surgery, 37(03):348–353, 2021

    Meredith A Allen, Jordyn P Lucas, Michael Chung, Hani M Rayess, and Giancarlo Zuliani. Nasal analysis of classic animated movie villains versus hero counterparts.Facial Plastic Surgery, 37(03):348–353, 2021. 2

  23. [23]

    Hapfacs: An open source api/software to generate facs-based expressions for ecas an- imation and for corpus generation

    Reza Amini and Christine Lisetti. Hapfacs: An open source api/software to generate facs-based expressions for ecas an- imation and for corpus generation. In2013 Humaine Asso- ciation Conference on Affective Computing and Intelligent Interaction, pages 270–275. IEEE, 2013. 2

  24. [24]

    Modeling stylized character expres- sions via deep learning

    Deepali Aneja, Alex Colburn, Gary Faigin, Linda Shapiro, and Barbara Mones. Modeling stylized character expres- sions via deep learning. InAsian conference on computer vision, pages 136–153. Springer, 2016. 2, 4

  25. [25]

    Learning to gener- ate 3d stylized character expressions from humans

    Deepali Aneja, Bindita Chaudhuri, Alex Colburn, Gary Fai- gin, Linda Shapiro, and Barbara Mones. Learning to gener- ate 3d stylized character expressions from humans. In2018 IEEE Winter Conference on Applications of Computer Vi- sion (WACV), pages 160–169. IEEE, 2018. 2, 4 9

  26. [26]

    Image Editing AI Leaderboard - Best Mod- els Compared.https://arena.ai/leaderboard/ image-edit, 2026

    Arena AI. Image Editing AI Leaderboard - Best Mod- els Compared.https://arena.ai/leaderboard/ image-edit, 2026. Accessed: 2026-03-14. 2, 5

  27. [27]

    Identity-motion trade-offs in text-to-video generation

    Yuval Atzmon, Rinon Gal, Yoad Tewel, Yoni Kasten, and Gal Chechik. Identity-motion trade-offs in text-to-video generation. In36th British Machine Vision Conference 2025, BMVC 2025, Sheffield, UK, November 24-27, 2025. BMV A, 2025. 2

  28. [28]

    GitHub - AUTOMATIC1111/stable- diffusion-webui: Stable Diffusion web UI.https : / / github

    AUTOMATIC1111. GitHub - AUTOMATIC1111/stable- diffusion-webui: Stable Diffusion web UI.https : / / github . com / AUTOMATIC1111 / stable - diffusion-webui, 2026. Accessed: 2026-03-14. 4

  29. [29]

    Re:verse - can your vlm read a manga? InProceedings of the IEEE/CVF Interna- tional Conference on Computer Vision (ICCV) Workshops, pages 3761–3771, 2025

    Aaditya Baranwal, Madhav Kataria, Naitik Agrawal, Yo- gesh S Rawat, and Shruti Vyas. Re:verse - can your vlm read a manga? InProceedings of the IEEE/CVF Interna- tional Conference on Computer Vision (ICCV) Workshops, pages 3761–3771, 2025. 2

  30. [30]

    FLUX.2: Frontier Visual Intelligence

    Black Forest Labs. FLUX.2: Frontier Visual Intelligence. https://bfl.ai/blog/flux-2, 2025. Accessed: 2026-03-14. 5

  31. [31]

    black-forest-labs/FLUX.2-dev - Hug- ging Face.https://huggingface.co/black- forest-labs/FLUX.2-dev, 2026

    Black Forest Labs. black-forest-labs/FLUX.2-dev - Hug- ging Face.https://huggingface.co/black- forest-labs/FLUX.2-dev, 2026. Accessed: 2026- 03-14. 5

  32. [32]

    Re: draw-context aware translation as a controllable method for artistic production

    Jo ˜ao Lib ´orio Cardoso, Francesco Banterle, Paolo Cignoni, and Michael Wimmer. Re: draw-context aware translation as a controllable method for artistic production. InProceed- ings of the Thirty-Third International Joint Conference on Artificial Intelligence, pages 7609–7617, 2024. 2

  33. [33]

    Designing ani- mated characters for children of different ages

    Elizabeth J Carter, Moshe Mahler, Maryyann Landlord, Kyna McIntosh, and Jessica K Hodgins. Designing ani- mated characters for children of different ages. InProceed- ings of the The 15th International Conference on Interac- tion Design and Children, pages 421–427, 2016. 2

  34. [34]

    Deep generation of face images from sketches

    Shu-Yu Chen, Wanchao Su, Lin Gao, Shihong Xia, and Hongbo Fu. Deep generation of face images from sketches. arXiv preprint arXiv:2006.01047, 2020. 2

  35. [35]

    Exploring the cor- relation between gaze patterns and facial geometric param- eters: A cross-cultural comparison between real and ani- mated faces.Symmetry, 17(4):528, 2025

    Zhi-Lin Chen and Kang-Ming Chang. Exploring the cor- relation between gaze patterns and facial geometric param- eters: A cross-cultural comparison between real and ani- mated faces.Symmetry, 17(4):528, 2025. 2

  36. [36]

    Evaluating the effect of outfit on personality perception in virtual characters

    Yanbo Cheng and Yingying Wang. Evaluating the effect of outfit on personality perception in virtual characters. In Virtual Worlds, pages 21–39. MDPI, 2024. 8

  37. [37]

    Prompting for products: investigating design space exploration strategies for text-to- image generative models.Design Science, 11:e2, 2025

    Leah Chong, I-Ping Lo, Jude Rayan, Steven Dow, Faez Ahmed, and Ioanna Lykourentzou. Prompting for products: investigating design space exploration strategies for text-to- image generative models.Design Science, 11:e2, 2025. 2

  38. [38]

    SDXL Lightning LoRAs - 8 Steps|Stable Diffu- sion XL LoRA.https://civitai.com/models/ 350450 ? modelVersionId = 391999, 2024

    Civitai. SDXL Lightning LoRAs - 8 Steps|Stable Diffu- sion XL LoRA.https://civitai.com/models/ 350450 ? modelVersionId = 391999, 2024. Ac- cessed: 2026-03-14. 8

  39. [39]

    Mask Editor - Create and Edit Masks in Com- fyUI - ComfyUI.https : / / docs

    ComfyUI. Mask Editor - Create and Edit Masks in Com- fyUI - ComfyUI.https : / / docs . comfy . org / interface/maskeditor, 2026. Accessed: 2026-03-

  40. [40]

    Tag Group: Eyes Tags|Danbooru.https: / / danbooru

    Danbooru. Tag Group: Eyes Tags|Danbooru.https: / / danbooru . donmai . us / wiki _ pages / tag _ group:eyes_tags, 2026. Accessed: 2026-03-14. 3

  41. [41]

    Tag Group: Face Tags|Danbooru.https: / / danbooru

    Danbooru. Tag Group: Face Tags|Danbooru.https: / / danbooru . donmai . us / wiki _ pages / tag _ group:face_tags, 2026. Accessed: 2026-03-14. 3

  42. [42]

    Emoticons and online message interpretation.Social Sci- ence Computer Review, 26(3):379–388, 2008

    Daantje Derks, Arjan ER Bos, and Jasper V on Grumbkow. Emoticons and online message interpretation.Social Sci- ence Computer Review, 26(3):379–388, 2008. 4

  43. [43]

    Aether weaver: Multimodal affective nar- rative co-generation with dynamic scene graphs.arXiv preprint arXiv:2507.21893, 2025

    Saeed Ghorbani. Aether weaver: Multimodal affective nar- rative co-generation with dynamic scene graphs.arXiv preprint arXiv:2507.21893, 2025. 2

  44. [44]

    Baby schema in infant faces induces cuteness perception and motivation for caretaking in adults.Ethology, 115(3): 257–263, 2009

    Melanie L Glocker, Daniel D Langleben, Kosha Ruparel, James W Loughead, Ruben C Gur, and Norbert Sachser. Baby schema in infant faces induces cuteness perception and motivation for caretaking in adults.Ethology, 115(3): 257–263, 2009. 2

  45. [45]

    Gemini 3 Flash: frontier intelligence built for speed.https://blog.google/products- and-platforms/products/gemini/gemini-3- flash/, 2025

    Google. Gemini 3 Flash: frontier intelligence built for speed.https://blog.google/products- and-platforms/products/gemini/gemini-3- flash/, 2025. Accessed: 2026-03-14. 4

  46. [46]

    Nano Banana 2: Combining Pro capabilities with lightning-fast speed.https://blog.google/ innovation - and - ai / technology / ai / nano - banana-2/, 2026

    Google. Nano Banana 2: Combining Pro capabilities with lightning-fast speed.https://blog.google/ innovation - and - ai / technology / ai / nano - banana-2/, 2026. Accessed: 2026-03-14. 2, 5

  47. [47]

    SynthID - Google DeepMind.https: / / deepmind

    Google DeepMind. SynthID - Google DeepMind.https: / / deepmind . google / models / synthid/, 2026. Accessed: 2026-03-14. 2, 5

  48. [48]

    Synthid-image: Image watermarking at internet scale.arXiv preprint arXiv:2510.09263, 2025

    Sven Gowal, Rudy Bunel, Florian Stimberg, David Stutz, Guillermo Ortiz-Jimenez, Christina Kouridi, Mel Vecerik, Jamie Hayes, Sylvestre-Alvise Rebuffi, Paul Bernard, et al. Synthid-image: Image watermarking at internet scale. arXiv preprint arXiv:2510.09263, 2025. 2, 5

  49. [49]

    ImageEditor - Gradio Docs.https://www

    Gradio. ImageEditor - Gradio Docs.https://www. gradio.app/docs/gradio/imageeditor, 2026. Accessed: 2026-03-14. 4

  50. [50]

    Dress is a fundamental com- ponent of person perception.Personality and Social Psy- chology Review, 27(4):414–433, 2023

    Neil Hester and Eric Hehman. Dress is a fundamental com- ponent of person perception.Personality and Social Psy- chology Review, 27(4):414–433, 2023. 8

  51. [51]

    Flaw or artifact? rethinking prompt sensitivity in evaluating LLMs

    Andong Hua, Kenan Tang, Chenhe Gu, Jindong Gu, Eric Wong, and Yao Qin. Flaw or artifact? rethinking prompt sensitivity in evaluating LLMs. InProceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 19889–19899, Suzhou, China, 2025. As- sociation for Computational Linguistics. 4

  52. [52]

    Promptnavi: Text-to-image generation through interactive prompt visual exploration

    Bofei Huang and Haoran Xie. Promptnavi: Text-to-image generation through interactive prompt visual exploration. Computers & Graphics, page 104417, 2025. 2

  53. [53]

    T2i-compbench: A comprehensive bench- mark for open-world compositional text-to-image genera- tion.Advances in Neural Information Processing Systems, 36:78723–78747, 2023

    Kaiyi Huang, Kaiyue Sun, Enze Xie, Zhenguo Li, and Xihui Liu. T2i-compbench: A comprehensive bench- mark for open-world compositional text-to-image genera- tion.Advances in Neural Information Processing Systems, 36:78723–78747, 2023. 7

  54. [54]

    Adaptivesliders: User-aligned semantic slider-based editing of text-to-image model output

    Rahul Jain, Amit Goel, Koichiro Niinuma, and Aakar Gupta. Adaptivesliders: User-aligned semantic slider-based editing of text-to-image model output. InProceedings of the 2025 CHI Conference on Human Factors in Computing Systems, pages 1–27, 2025. 7 10

  55. [55]

    Emojidiff: Advanced facial expression control with high identity preservation in portrait genera- tion

    Liangwei Jiang, Ruida Li, Zhifeng Zhang, Shuo Fang, and Chenguang Ma. Emojidiff: Advanced facial expression control with high identity preservation in portrait genera- tion. InProceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pages 328–338, 2026. 2

  56. [56]

    Compre- hensive database for facial expression analysis

    Takeo Kanade, Jeffrey F Cohn, and Yingli Tian. Compre- hensive database for facial expression analysis. InProceed- ings fourth IEEE international conference on automatic face and gesture recognition (cat. No. PR00580), pages 46–

  57. [57]

    Is clip ideal? no

    Raphi Kang, Yue Song, Georgia Gkioxari, and Pietro Per- ona. Is clip ideal? no. can we fix it? yes! InProceedings of the IEEE/CVF International Conference on Computer Vi- sion, pages 22436–22446, 2025. 7

  58. [58]

    Python Scripting - Krita Manual 5.3.0 documen- tation.https : / / docs

    Krita. Python Scripting - Krita Manual 5.3.0 documen- tation.https : / / docs . krita . org / en / user _ manual / python _ scripting . html, 2026. Ac- cessed: 2026-03-14. 5

  59. [59]

    Principles of traditional animation applied to 3d computer animation

    John Lasseter. Principles of traditional animation applied to 3d computer animation. InSeminal graphics: pioneering efforts that shaped the field, pages 263–272. 1998. 2

  60. [60]

    Face poser: Interactive modeling of 3d fa- cial expressions using facial priors.ACM Transactions on Graphics (TOG), 29(1):1–17, 2009

    Manfred Lau, Jinxiang Chai, Ying-Qing Xu, and Heung- Yeung Shum. Face poser: Interactive modeling of 3d fa- cial expressions using facial priors.ACM Transactions on Graphics (TOG), 29(1):1–17, 2009. 2

  61. [61]

    Okbench: De- mocratizing llm evaluation with fully automated, on- demand, open knowledge benchmarking.arXiv preprint arXiv:2511.08598, 2025

    Yanhong Li, Tianyang Xu, Kenan Tang, Karen Livescu, David McAllester, and Jiawei Zhou. Okbench: De- mocratizing llm evaluation with fully automated, on- demand, open knowledge benchmarking.arXiv preprint arXiv:2511.08598, 2025. 2

  62. [62]

    Sdxl- lightning: Progressive adversarial diffusion distillation

    Shanchuan Lin, Anran Wang, and Xiao Yang. Sdxl- lightning: Progressive adversarial diffusion distillation. arXiv preprint arXiv:2402.13929, 2024. 8

  63. [63]

    Freedrag: Feature dragging for reliable point-based image editing

    Pengyang Ling, Lin Chen, Pan Zhang, Huaian Chen, Yi Jin, and Jinjin Zheng. Freedrag: Feature dragging for reliable point-based image editing. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 6860–6870, 2024. 7

  64. [64]

    LittleJelly. Yanami Anna [4 outfits]|Illustrious|Make Heroine Ga Oosugiru! - Illu v1.0|Illustrious LoRA|Civ- itai.https://civitai.com/models/1166558/ yanami - anna - 4 - outfits - or - illustrious - or-make-heroine-ga-oosugiru, 2025. Accessed: 2026-03-14. 8

  65. [65]

    Drag your noise: Interactive point-based editing via diffusion semantic propagation

    Haofeng Liu, Chenshu Xu, Yifei Yang, Lihua Zeng, and Shengfeng He. Drag your noise: Interactive point-based editing via diffusion semantic propagation. InProceedings of the IEEE/CVF conference on computer vision and pat- tern recognition, pages 6743–6752, 2024. 7

  66. [66]

    Design guidelines for prompt engineering text-to-image generative models

    Vivian Liu and Lydia B Chilton. Design guidelines for prompt engineering text-to-image generative models. In Proceedings of the 2022 CHI conference on human factors in computing systems, pages 1–23, 2022. 2

  67. [67]

    Magicquill: An intelligent interactive image editing system

    Zichen Liu, Yue Yu, Hao Ouyang, Qiuyu Wang, Ka Leong Cheng, Wen Wang, Zhiheng Liu, Qifeng Chen, and Yujun Shen. Magicquill: An intelligent interactive image editing system. InProceedings of the Computer Vision and Pattern Recognition Conference, pages 13072–13082, 2025. 4

  68. [68]

    Magicquillv2: Precise and interac- tive image editing with layered visual cues.arXiv preprint arXiv:2512.03046, 2025

    Zichen Liu, Yue Yu, Hao Ouyang, Qiuyu Wang, Shuailei Ma, Ka Leong Cheng, Wen Wang, Qingyan Bai, Yuxuan Zhang, Yanhong Zeng, et al. Magicquillv2: Precise and interactive image editing with layered visual cues.arXiv preprint arXiv:2512.03046, 2025. 4

  69. [69]

    lllyasviel. diffusers xl canny mid.safetensors - lllyasviel/sd control collection at main.https : //huggingface.co/lllyasviel/sd_control_ collection / blob / main / diffusers _ xl _ canny_mid.safetensors, 2023. Accessed: 2026- 03-14. 5

  70. [70]

    GitHub - lllyasviel/stable-diffusion-webui-forge

    lllyasviel. GitHub - lllyasviel/stable-diffusion-webui-forge. https : / / github . com / lllyasviel / stable - diffusion- webui- forge, 2026. Accessed: 2026- 03-14. 4

  71. [71]

    Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference

    Simian Luo, Yiqin Tan, Longbo Huang, Jian Li, and Hang Zhao. Latent consistency models: Synthesizing high- resolution images with few-step inference.arXiv preprint arXiv:2310.04378, 2023. 8

  72. [72]

    Make Heroine ga Oosugiru!

    Makeine Support Committee. CHARACTER|TV Anime “Make Heroine ga Oosugiru!” Official Website.https: //makeine- anime.com/character/, 2026. Ac- cessed: 2026-03-14. 8

  73. [73]

    A Survey of Context Engineering for Large Language Models

    Lingrui Mei, Jiayu Yao, Yuyao Ge, Yiwei Wang, Bao- long Bi, Yujun Cai, Jiazhi Liu, Mingyu Li, Zhong-Zhi Li, Duzhen Zhang, et al. A survey of context engineering for large language models.arXiv preprint arXiv:2507.13334,

  74. [74]

    GitHub - mirabarukaso/character select stand alone app: Char- acter Select Stand Alone App with AI prompt and ComfyUI/WebUI API support for wai-il model

    mirabarukaso. GitHub - mirabarukaso/character select stand alone app: Char- acter Select Stand Alone App with AI prompt and ComfyUI/WebUI API support for wai-il model. https : / / github . com / mirabarukaso / character _ select _ stand _ alone _ app, 2026. Accessed: 2026-03-14. 2

  75. [75]

    Dynamic prompt optimizing for text- to-image generation

    Wenyi Mo, Tianyu Zhang, Yalong Bai, Bing Su, Ji-Rong Wen, and Qing Yang. Dynamic prompt optimizing for text- to-image generation. InProceedings of the IEEE/CVF Con- ference on Computer Vision and Pattern Recognition, pages 26627–26636, 2024. 4

  76. [76]

    Dragondiffusion: Enabling drag-style manip- ulation on diffusion models

    Chong Mou, Xintao Wang, Jiechong Song, Ying Shan, and Jian Zhang. Dragondiffusion: Enabling drag-style manip- ulation on diffusion models. InThe Twelfth International Conference on Learning Representations, 2024. 7

  77. [77]

    The new ChatGPT Images is here.https:// openai.com/index/new-chatgpt-images-is- here/, 2025

    OpenAI. The new ChatGPT Images is here.https:// openai.com/index/new-chatgpt-images-is- here/, 2025. Accessed: 2026-03-14. 5

  78. [78]

    Emoticon style: Interpreting differences in emoticons across cultures

    Jaram Park, Vladimir Barash, Clay Fink, and Meeyoung Cha. Emoticon style: Interpreting differences in emoticons across cultures. InProceedings of the international AAAI conference on web and social media, pages 466–475, 2013. 4

  79. [79]

    Illustrious: an open advanced illustration model.arXiv preprint arXiv:2409.19946, 2024

    Sang Hyun Park, Jun Young Koh, Junha Lee, Joy Song, Dongha Kim, Hoyeon Moon, Hyunju Lee, and Min Song. Illustrious: an open advanced illustration model.arXiv preprint arXiv:2409.19946, 2024. 3 11

  80. [80]

    pixiv Encyclopedia.https://dic.pixiv

    pixiv. pixiv Encyclopedia.https://dic.pixiv. net/en/, 2026. Accessed: 2026-03-14. 3, 4

Showing first 80 references.