Who Defines Fairness? Target-Based Prompting for Demographic Representation in Generative Models
Pith reviewed 2026-05-09 23:43 UTC · model grok-4.3
The pith
Prompt engineering lets users set their own demographic targets for fair representation in AI image generation.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Instead of assuming one universal definition of fairness, the method lets users choose among multiple specifications, from uniform distributions to LLM-generated ones that cite sources and give confidence levels. These specifications determine the proportions in which demographic-specific prompt variants are generated. The resulting skin-tone distributions in the model outputs shift consistently with the declared target and show lower deviation when the target is stated directly in skin-tone space.
What carries the argument
Target-based prompting: the construction of demographic-specific prompt variants in user-specified proportions that guide the generative model's output distribution toward the chosen fairness target.
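The prompting machinery described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the skin-tone descriptor names are placeholders, and the headshot template follows the pattern the paper's appendix describes ('A full-color headshot of a [OCCUPATION]'). A largest-remainder allocation keeps the variant counts in the declared proportions:

```python
import math

def allocate_counts(target_dist, n_images):
    """Largest-remainder allocation: split n_images across descriptors
    so counts match the target proportions as closely as possible."""
    raw = {d: p * n_images for d, p in target_dist.items()}
    counts = {d: math.floor(x) for d, x in raw.items()}
    leftover = n_images - sum(counts.values())
    # Hand any remaining slots to the largest fractional parts.
    for d in sorted(raw, key=lambda d: raw[d] - counts[d], reverse=True)[:leftover]:
        counts[d] += 1
    return counts

def build_prompt_batch(occupation, target_dist, n_images):
    """Expand one base prompt into demographic-specific variants in the
    user-declared proportions."""
    counts = allocate_counts(target_dist, n_images)
    return [
        f"A full-color headshot of a {descriptor} {occupation}"
        for descriptor, k in counts.items()
        for _ in range(k)
    ]

# Uniform target over three illustrative skin-tone descriptors.
uniform = {"light-skinned": 1/3, "medium-skinned": 1/3, "dark-skinned": 1/3}
batch = build_prompt_batch("doctor", uniform, n_images=6)  # two variants each
```

Each prompt in the batch would then be sent to the T2I model unchanged; the fairness target lives entirely in the batch composition, which is what makes the intervention inference-time only.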
If this is right
- Fairness specifications become explicit and choosable by the user instead of being fixed by the model developer.
- Bias mitigation works at inference time with no changes to model weights or training data.
- Evaluation measures how well outputs match the specific target chosen rather than assuming any single distribution is fair.
- The same prompting structure applies across both occupational prompts and non-occupational contexts.
Where Pith is reading between the lines
- Similar prompt-proportion techniques could be tested on other visual attributes such as age or gender presentation.
- The method might be combined with retrieval of real-world demographic statistics to make targets more evidence-based.
- Different base models could be compared to see how strongly their internal biases resist or amplify the prompt interventions.
- User interfaces could expose the LLM source and confidence estimates so people can judge the reliability of a chosen target.
Load-bearing premise
The generative model will respond to the demographic proportions in the prompt variants by producing matching skin-tone distributions in its outputs rather than overriding them with its own biases.
What would settle it
Generating thousands of images with a prompt set that specifies 100 percent of one skin tone and finding that the actual skin-tone distribution remains close to the model's default bias instead of shifting toward the target.
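The proposed test reduces to comparing an audited skin-tone histogram against the declared target. A minimal sketch, using total variation distance as the deviation measure (the review does not specify the paper's exact metric) and hypothetical audit labels:

```python
from collections import Counter

def tv_distance(observed_labels, target_dist):
    """Total variation distance between the empirical skin-tone
    distribution of audited images and the declared target."""
    n = len(observed_labels)
    counts = Counter(observed_labels)
    support = set(counts) | set(target_dist)
    return 0.5 * sum(
        abs(counts.get(d, 0) / n - target_dist.get(d, 0.0)) for d in support
    )

# A run where the model ignores a 100%-dark-skinned target and reverts
# to a lighter-skewed default would show a large deviation.
labels = ["dark"] * 20 + ["light"] * 80   # hypothetical audit result
target = {"dark": 1.0}
deviation = tv_distance(labels, target)   # ~0.8, i.e. far from the target
```

A deviation near zero under such a degenerate target would support the load-bearing premise; a deviation close to the model's default bias would refute it.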
Original abstract
Text-to-image (T2I) models like Stable Diffusion and DALL-E have made generative AI widely accessible, yet recent studies reveal that these systems often replicate societal biases, particularly in how they depict demographic groups across professions. Prompts such as 'doctor' or 'CEO' frequently yield lighter-skinned outputs, while lower-status roles like 'janitor' show more diversity, reinforcing stereotypes. Existing mitigation methods typically require retraining or curated datasets, making them inaccessible to most users. We propose a lightweight, inference-time framework that mitigates representational bias through prompt-level intervention without modifying the underlying model. Instead of assuming a single definition of fairness, our approach allows users to select among multiple fairness specifications, ranging from simple choices such as a uniform distribution to more complex definitions informed by a large language model (LLM) that cites sources and provides confidence estimates. These distributions guide the construction of demographic-specific prompt variants in the corresponding proportions, and we evaluate alignment by auditing adherence to the declared target and measuring the resulting skin-tone distribution rather than assuming uniformity as 'fairness'. Across 36 prompts spanning 30 occupations and 6 non-occupational contexts, our method shifts observed skin-tone outcomes in directions consistent with the declared target, and reduces deviation from targets when the target is defined directly in skin-tone space (fallback). This work demonstrates how fairness interventions can be made transparent, controllable, and usable at inference time, directly empowering users of generative AI.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes an inference-time, prompt-based framework for controlling demographic (skin-tone) representation in text-to-image models. Users select among multiple fairness targets (uniform distribution or LLM-generated distributions citing sources), which dictate the proportions of demographic-specific prompt variants (e.g., 'a [skin-tone descriptor] [occupation]'). These variants are sampled to generate outputs, which are then audited for alignment with the declared target. Across 36 prompts (30 occupations + 6 non-occupational contexts), the authors claim that observed skin-tone distributions shift directionally consistent with the target and show reduced deviation from the target when the target is specified directly in skin-tone space.
Significance. If the empirical results are substantiated, the work offers a practical, transparent alternative to retraining-based debiasing methods by making fairness definitions user-controllable and auditable rather than assuming uniformity. It highlights prompt engineering as a lightweight intervention and provides external grounding via target alignment rather than self-referential metrics. The multi-specification approach (including LLM-informed targets with citations) is a notable strength for usability.
major comments (2)
- [Abstract] The central empirical claim of directional shifts and reduced deviation from targets is presented without any details on skin-tone measurement/auditing methods, per-prompt sample sizes, variance across runs, statistical tests against targets, or baselines, making it impossible to assess whether the data support the alignment results.
- [Evaluation] The evaluation section (implied by the 36-prompt results) rests on the load-bearing assumption that constructing prompt variants in exact target proportions will produce matching output distributions, yet the manuscript provides no controls for prompt sensitivity, no documented cases of model override, and no quantitative treatment of the concern that diffusion models may respond non-linearly to demographic descriptors.
minor comments (1)
- [Abstract] The parenthetical '(fallback)' in the abstract is undefined and should be clarified or removed.
Simulated Author's Rebuttal
We appreciate the referee's feedback highlighting areas where the manuscript can be strengthened for clarity and rigor. We have made revisions to address both major comments as detailed below.
Point-by-point responses
Referee: [Abstract] The central empirical claim of directional shifts and reduced deviation from targets is presented without any details on skin-tone measurement/auditing methods, per-prompt sample sizes, variance across runs, statistical tests against targets, or baselines, making it impossible to assess whether the data support the alignment results.
Authors: We agree with the referee that the abstract lacks sufficient detail on the empirical methods to allow full assessment of the claims. We have revised the abstract to include information on the skin-tone measurement and auditing approach (using automated classification of generated images), per-prompt sample sizes, variance across runs, statistical tests employed, and comparison to baselines. These details are now summarized in the abstract while referring readers to the full Evaluation section for complete methodology. revision: yes
Referee: [Evaluation] The evaluation section (implied by the 36-prompt results) rests on the load-bearing assumption that constructing prompt variants in exact target proportions will produce matching output distributions, yet the manuscript provides no controls for prompt sensitivity, no documented cases of model override, and no quantitative treatment of the concern that diffusion models may respond non-linearly to demographic descriptors.
Authors: We agree that the evaluation relies on the assumption that prompt proportions translate to output distributions, and that additional controls would strengthen the work. The current manuscript does not include explicit sensitivity analyses, documented override cases, or a direct treatment of non-linear response effects. We have therefore revised the Evaluation section to add: controls for prompt sensitivity through testing of alternative demographic descriptors; documented cases where the model partially overrides the target proportions; and a quantitative comparison of observed deviations against linear and non-linear response assumptions, using the data from the 36 prompts. This addresses the skeptic's concern while maintaining the paper's focus on practical usability. revision: yes
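The sensitivity control the authors promise could look like the following sketch: audit the same declared target under alternative descriptor phrasings and compare each run's deviation from that target. The phrasings and label counts here are hypothetical, not taken from the paper's data.

```python
from collections import Counter

def deviation(labels, target):
    """Total variation distance between audited labels and the target."""
    n = len(labels)
    counts = Counter(labels)
    keys = set(counts) | set(target)
    return 0.5 * sum(abs(counts.get(k, 0) / n - target.get(k, 0.0)) for k in keys)

def sensitivity_report(audited_runs, target):
    """Per-phrasing deviation; a wide spread across phrasings would
    flag prompt sensitivity in the descriptor wording."""
    return {phrasing: deviation(labels, target)
            for phrasing, labels in audited_runs.items()}

# Hypothetical audited labels for two phrasings of the same uniform target.
target = {"light": 0.5, "dark": 0.5}
runs = {
    "dark-skinned":        ["dark"] * 45 + ["light"] * 55,  # close to target
    "with deep skin tone": ["dark"] * 30 + ["light"] * 70,  # drifts lighter
}
report = sensitivity_report(runs, target)
```

If the two phrasings land at very different deviations, the mapping from prompt proportions to output proportions depends on wording, which is exactly the non-linearity concern the referee raises.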
Circularity Check
No significant circularity; evaluation is externally grounded
Full rationale
The paper proposes constructing prompt variants from user- or LLM-specified target distributions and then audits generated outputs by directly measuring skin-tone distributions against those declared targets. This evaluation step is independent of the input targets and does not reduce to a self-referential fit, redefinition, or self-citation chain. No equations, parameter fitting, or load-bearing self-citations appear in the abstract or described method. The central result (directional shifts and reduced deviation) is presented as an empirical observation rather than a derivation that collapses to its own premises by construction. The evaluation is grounded in external image auditing rather than in the method's own target specifications.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: prompt modifications specifying demographics can control skin-tone distributions in generated images.