Pith · machine review for the scientific record

arxiv: 2601.09896 · v4 · submitted 2026-01-14 · 💻 cs.HC · cs.AI · cs.CV

Recognition: 2 theorem links


The Algorithmic Gaze of Image Quality Assessment: An Audit and Trace Ethnography of the LAION-Aesthetics Predictor

Authors on Pith: no claims yet

Pith reviewed 2026-05-16 13:59 UTC · model grok-4.3

classification 💻 cs.HC · cs.AI · cs.CV
keywords LAION-Aesthetics Predictor · aesthetic evaluation · dataset curation · bias in AI · image generation · representational harm · western art gaze · generative models

The pith

The LAION-Aesthetics Predictor skews training data toward images whose captions mention women and toward western realistic art, while filtering out images mentioning men or LGBTQ+ people and non-western styles.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper audits the LAION-Aesthetics Predictor, a model that scores images for aesthetic quality and is used to curate massive datasets for training visual AI generators such as Stable Diffusion. The audit shows the predictor keeps a disproportionate share of images whose captions mention women and removes those mentioning men or LGBTQ+ identities. It also assigns the highest scores to realistic landscapes, cityscapes, and portraits by western and Japanese artists. These patterns trace directly to the model's training on aesthetic ratings supplied mostly by English-speaking photographers and western AI enthusiasts. The result is that the curation process for current image-generation systems embeds a narrow set of cultural preferences that echo historical art biases.

Core claim

Audits across the LAION-Aesthetics Dataset and two art collections reveal that the LAION-Aesthetics Predictor disproportionately retains images captioned with references to women while filtering out those mentioning men or LGBTQ+ people, and assigns peak scores to realistic western and Japanese landscapes, cityscapes, and portraits. Digital ethnography of the model's public development materials shows that the aesthetic scores used to train it came primarily from English-speaking photographers and western AI enthusiasts, producing scoring behavior that reinforces the imperial and male gazes documented in western art history.
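The retention-rate comparison at the heart of this claim can be sketched in a few lines. Everything below is illustrative: the toy captions, the demographic regexes, and the 5.0 cutoff are stand-ins, not the paper's actual data or patterns.

```python
import re

# Toy (caption, aesthetic score) pairs standing in for LAION-5B entries.
CORPUS = [
    ("a woman hiking at sunset", 6.2),
    ("portrait of a man in a cafe", 4.1),
    ("pride parade with rainbow flags", 4.5),
    ("a woman painting a landscape", 5.8),
    ("two men repairing a bicycle", 4.9),
]

# Illustrative demographic patterns; the paper's regexes are not reproduced here.
PATTERNS = {
    "women": re.compile(r"\bwom[ae]n\b"),
    "men": re.compile(r"\bm[ae]n\b"),
    "lgbtq+": re.compile(r"\b(pride|rainbow|lgbtq)\b"),
}

THRESHOLD = 5.0  # aesthetic score needed to be "filtered in"


def retention_rate(group: str) -> float:
    """Fraction of captions mentioning `group` that survive filtering."""
    pattern = PATTERNS[group]
    matched = [score for caption, score in CORPUS if pattern.search(caption)]
    kept = [score for score in matched if score >= THRESHOLD]
    return len(kept) / len(matched) if matched else float("nan")


for group in PATTERNS:
    print(f"{group}: {retention_rate(group):.0%} retained")
```

A gap in retention rates across groups, measured against the unfiltered base distribution, is the kind of evidence the audit reports.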

What carries the argument

The LAION-Aesthetics Predictor (LAP), the aesthetic scoring model whose filtering thresholds and rating outputs are measured against large image collections and traced to its training sources.
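The score-then-threshold pipeline this describes can be sketched as follows. The real LAP runs a small learned head over CLIP image embeddings; here the embeddings, the linear head, and the 768-dimensional size are synthetic stand-ins, not the released weights.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for CLIP image embeddings; dimensions and values are synthetic.
embeddings = rng.normal(size=(1_000, 768))

# Hypothetical linear aesthetic head (the real LAP head is learned).
w = rng.normal(size=768) / np.sqrt(768)
b = 5.0

# LAP-style scores clipped to a 0-10 scale.
scores = np.clip(embeddings @ w + b, 0.0, 10.0)

# Dataset curation keeps only images above a fixed aesthetic threshold.
THRESHOLD = 5.0
kept = scores >= THRESHOLD
```

Any systematic pattern in what the head rewards propagates directly into which images `kept` retains, which is why auditing the scorer audits the dataset.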

If this is right

  • Datasets curated with LAP will over-represent images aligned with western realistic aesthetics and female subjects.
  • AI image generators trained on these datasets will inherit systematic under-representation of men, LGBTQ+ identities, and non-western art styles.
  • Quality evaluation of generated images using LAP will penalize outputs that deviate from realistic western conventions.
  • Continued reliance on single prescriptive aesthetic measures will embed representational harms into future visual AI systems.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Other aesthetic predictors built on similar English-dominant rating pools are likely to exhibit comparable demographic and stylistic skews.
  • Replacing the training ratings with scores from broader cultural groups offers a direct test of whether filtering patterns can be equalized.
  • Public release of the raw rating sources behind LAP would allow independent verification of how demographic composition shapes final model behavior.

Load-bearing premise

The observed patterns of disproportionate filtering and scoring directly originate from biases in LAP's training data and development process rather than from confounding factors in the underlying image collections or captioning practices.

What would settle it

Retraining LAP on aesthetic scores collected from a culturally and linguistically diverse global group of raters and then re-running the same dataset audits would settle it: balanced retention rates across gender mentions and art styles would confirm that the observed biases trace to the training sources, while persistent skews would falsify that link.

Figures

Figures reproduced from arXiv: 2601.09896 by Haiyi Zhu, Jordan Taylor, Maarten Sap, Sarah E. Fox, William Agnew.

Figure 1. Top 25 domains of images in the LAION-Aesthetics
Figure 2. Pointwise Mutual Information (PMI) between regex
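The PMI in Figure 2 follows the standard Church and Hanks (1990) definition. A minimal computation, with toy counts rather than the paper's numbers:

```python
import math


def pmi(joint: int, count_x: int, count_y: int, total: int) -> float:
    """PMI(x, y) = log2( p(x, y) / (p(x) * p(y)) )."""
    p_xy = joint / total
    p_x = count_x / total
    p_y = count_y / total
    return math.log2(p_xy / (p_x * p_y))


# Toy counts: of 1000 captions, 200 mention a term, 500 survive
# filtering, and 150 do both -> positive association with retention.
score = pmi(joint=150, count_x=200, count_y=500, total=1000)
```

Positive PMI between a caption term and the "kept after filtering" event indicates over-retention of that term relative to chance; negative PMI indicates under-retention.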
Original abstract

Visual generative AI models are trained using a one-size-fits-all measure of aesthetic appeal. However, what is deemed "aesthetic" is inextricably linked to personal taste and cultural values, raising the question of whose taste is represented in visual generative AI models. In this work, we study an aesthetic evaluation model--LAION-Aesthetics Predictor (LAP)--that is widely used to curate datasets to train visual generative image models, like Stable Diffusion, and evaluate the quality of AI-generated images. To understand what LAP measures, we audited the model across three datasets. First, we examined the impact of aesthetic filtering on the LAION-Aesthetics Dataset (approximately 1.2B images), which was curated from LAION-5B using LAP. We find that the LAP disproportionally filters in images with captions mentioning women, while filtering out images with captions mentioning men or LGBTQ+ people. Then, we used LAP to score approximately 330k images across two art datasets, finding the model rates realistic images of landscapes, cityscapes, and portraits from western and Japanese artists most highly. In doing so, the algorithmic gaze of this aesthetic evaluation model reinforces the imperial and male gazes found within western art history. In order to understand where these biases may have originated, we performed a digital ethnography of public materials related to the creation of LAP. We find that the development of LAP reflects the biases we found in our audits, such as the aesthetic scores used to train LAP primarily coming from English-speaking photographers and western AI-enthusiasts. In response, we discuss how aesthetic evaluation can perpetuate representational harms and call on AI developers to shift away from prescriptive measures of "aesthetics" toward more pluralistic evaluation.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 1 minor

Summary. The paper audits the LAION-Aesthetics Predictor (LAP), a model used to filter and score images for training generative AI systems such as Stable Diffusion. Across the LAION-Aesthetics Dataset (~1.2B images derived from LAION-5B), it reports that LAP disproportionately retains images with captions mentioning women while filtering out those mentioning men or LGBTQ+ people. Scoring ~330k images from two art datasets shows highest ratings for realistic western and Japanese landscapes, cityscapes, and portraits. A digital ethnography of LAP development materials attributes these patterns to training data from English-speaking photographers and western AI enthusiasts. The paper concludes that LAP's algorithmic gaze reinforces imperial and male gazes from western art history and advocates shifting to pluralistic evaluation methods.

Significance. If the core empirical patterns are confirmed with appropriate controls, the work provides a timely audit of a widely deployed aesthetic scoring model that shapes large-scale vision datasets. It combines quantitative filtering analysis with qualitative trace ethnography, offering concrete examples of how prescriptive aesthetics can embed historical representational biases. The call for pluralistic alternatives is directly relevant to ongoing dataset curation practices in the field. The absence of base-distribution controls and protocol details currently limits the strength of claims linking observed patterns specifically to LAP rather than upstream collection artifacts.

major comments (3)
  1. [Audit of LAION-Aesthetics Dataset] The reported disproportionate filtering of men/LGBTQ+ captions versus retention of women captions lacks a control comparison to the unfiltered LAION-5B caption distributions or regression adjustment for confounders such as caption length, source, or image availability. This comparison is required to isolate LAP's contribution from pre-existing imbalances in the source collection.
  2. [Scoring of art datasets] The analysis of ~330k images reports high scores for western landscapes/portraits but provides no sample sizes per category, exact scoring thresholds, statistical tests, or variance measures. These details are necessary to evaluate whether the preference for western/Japanese realistic images is robust or sensitive to sampling.
  3. [Digital ethnography of LAP creation] The digital ethnography links LAP biases to English-speaking photographers and western enthusiasts but does not specify the protocol for material selection, sampling frame, or coding procedure. Without this, the strength of the causal attribution from development process to observed filtering patterns cannot be assessed.
minor comments (1)
  1. [Scoring of art datasets] Clarify the exact definition and source of the two art datasets used for scoring; the abstract mentions them but does not name or describe their composition.
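The base-rate control the first major comment asks for reduces to comparing each group's caption share before and after filtering. The proportions below are hypothetical; the real check would estimate them from a LAION-5B sample and the filtered LAION-Aesthetics Dataset.

```python
# Hypothetical caption shares per demographic group.
before = {"women": 0.20, "men": 0.25, "lgbtq+": 0.02}  # source (LAION-5B) shares
after = {"women": 0.26, "men": 0.18, "lgbtq+": 0.01}   # post-filter shares

# A ratio above 1 means the group is over-retained relative to its base
# rate; below 1 means it is disproportionately filtered out.
ratios = {group: after[group] / before[group] for group in before}

for group, ratio in ratios.items():
    print(f"{group}: retention ratio {ratio:.2f}")
```

Without this comparison, an apparent skew in the filtered set could simply mirror a pre-existing imbalance in the source collection.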

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their constructive and detailed review. The comments highlight important areas for strengthening the empirical rigor of our audit. We address each major comment below and will revise the manuscript to incorporate additional controls, statistical details, and methodological clarifications.

Point-by-point responses
  1. Referee: [Audit of LAION-Aesthetics Dataset] The reported disproportionate filtering of men/LGBTQ+ captions versus retention of women captions lacks a control comparison to the unfiltered LAION-5B caption distributions or regression adjustment for confounders such as caption length, source, or image availability. This comparison is required to isolate LAP's contribution from pre-existing imbalances in the source collection.

    Authors: We agree that explicit base-rate comparisons and confounder adjustments would strengthen the isolation of LAP's effects. In the revised manuscript, we will add a new analysis comparing caption distributions (proportions mentioning women, men, and LGBTQ+ terms) in a representative sample drawn from LAION-5B against the filtered LAION-Aesthetics Dataset. We will also include regression models adjusting for caption length and source metadata where available. Complete adjustment for image availability is constrained by the scale and limited metadata of LAION-5B, so this will be presented as a partial revision with explicit discussion of remaining limitations. revision: partial

  2. Referee: [Scoring of art datasets] The analysis of ~330k images reports high scores for western landscapes/portraits but provides no sample sizes per category, exact scoring thresholds, statistical tests, or variance measures. These details are necessary to evaluate whether the preference for western/Japanese realistic images is robust or sensitive to sampling.

    Authors: We will add all requested details to the revised results section and supplementary materials. This includes exact sample sizes per category (e.g., number of western landscape images, Japanese portraits, etc.), the precise scoring thresholds used to define 'high scores' (LAP produces continuous scores on a 0-10 scale; we will report means and top-decile cutoffs), results of appropriate statistical tests (e.g., Kruskal-Wallis with post-hoc comparisons across style/region groups), and variance measures (standard deviations, interquartile ranges, and bootstrap confidence intervals). These additions will demonstrate that the observed preferences are robust rather than sampling artifacts. revision: yes

  3. Referee: [Digital ethnography of LAP creation] The digital ethnography links LAP biases to English-speaking photographers and western enthusiasts but does not specify the protocol for material selection, sampling frame, or coding procedure. Without this, the strength of the causal attribution from development process to observed filtering patterns cannot be assessed.

    Authors: We will expand the methods subsection on the digital ethnography to fully specify the protocol. Materials were drawn from an exhaustive review of all publicly documented sources associated with LAP (LAION GitHub repositories, official blog posts, release notes, and developer communications dated through 2023). The sampling frame was defined as every available reference to training data sources, aesthetic scoring criteria, and contributor backgrounds. Coding followed a structured thematic analysis: open coding for recurring aesthetic descriptors and data origins, followed by axial coding to connect themes to representational biases, with documentation of the codebook and examples. This expanded description will clarify the evidentiary basis for linking development practices to the observed patterns. revision: yes
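The variance measures promised in response 2 can be illustrated with a plain percentile bootstrap confidence interval for one group's mean score. The synthetic scores and the percentile method below are illustrative, not the paper's analysis.

```python
import random
import statistics

random.seed(42)

# Synthetic aesthetic scores for one style/region group on LAP's 0-10 scale.
scores = [min(10.0, max(0.0, random.gauss(6.0, 1.0))) for _ in range(300)]


def bootstrap_ci(data, n_boot=2000, alpha=0.05):
    """Percentile bootstrap confidence interval for the group mean."""
    means = sorted(
        statistics.fmean(random.choices(data, k=len(data)))
        for _ in range(n_boot)
    )
    return means[int(n_boot * alpha / 2)], means[int(n_boot * (1 - alpha / 2)) - 1]


low, high = bootstrap_ci(scores)
```

Reporting such intervals per category would show whether the western/Japanese preference survives resampling or is a sampling artifact.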

Circularity Check

0 steps flagged

No significant circularity; empirical audit and ethnography are self-contained

Full rationale

The paper presents no mathematical derivations, fitted equations, or self-referential definitions. Central claims rest on raw empirical counts from external datasets (LAION-5B, art collections) and qualitative analysis of public LAP development materials. These observations do not reduce to the paper's own inputs by construction, and no load-bearing self-citations, uniqueness theorems, or ansatzes are invoked. The analysis is proportionate to the provided text and yields a circularity score of 1.0.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claims rest on interpretive assumptions about how caption mentions map to image content and how public development materials reveal the origins of aesthetic preferences; no free parameters or new entities are introduced.

axioms (1)
  • domain assumption: Mentions of demographic terms in image captions reliably indicate the presence of corresponding subjects for the purpose of measuring filtering bias.
    This assumption underpins the claim of disproportionate filtering in the LAION-Aesthetics Dataset audit.

pith-pipeline@v0.9.0 · 5633 in / 1299 out tokens · 45213 ms · 2026-05-16T13:59:39.223342+00:00 · methodology

discussion (0)


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. "I Just Don't Want My Work Being Fed Into The AI Blender": Queer Artists on Refusing and Resisting Generative AI

    cs.HC · 2026-04 · unverdicted · novelty 6.0

    Queer artists largely refuse and resist generative AI, seeing it as anti-relational and disruptive to the community-oriented, identity-forming nature of their art practices, with only limited acceptance for surreal im...

Reference graph

Works this paper leans on

117 extracted references · 117 canonical work pages · cited by 1 Pith paper · 6 internal anchors
