arxiv: 2505.22073 · v2 · submitted 2025-05-28 · 💻 cs.CY

A Closer Look at the Existing Risks of Generative AI: Mapping the Who, What, and How of Real-World Incidents

Pith reviewed 2026-05-19 14:07 UTC · model grok-4.3

classification 💻 cs.CY
keywords generative AI · AI risks · harms · real-world incidents · failure modes · stakeholder impacts · use-related issues · risk taxonomy

The pith

Analysis of 499 incidents shows generative AI harms mostly stem from use issues yet affect people beyond the direct users and differ from traditional AI risks.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper examines hundreds of documented cases of harm from generative AI systems to build a dedicated taxonomy of failures and map them to the damages they produce. It finds that problems frequently trace to how people apply or interpret the outputs rather than core design flaws, but the resulting harms commonly reach organizations, communities, or the broader public instead of stopping at the individual user. This matters because it points to the need for responses that go beyond fixing the models themselves, such as better education for users and rules that require transparency about downstream effects. The work distinguishes generative AI from earlier AI systems by showing different patterns in what fails and who pays the price.

Core claim

Through a systematic review of 499 publicly reported incidents, the authors show that most generative AI harms arise from use-related issues such as inappropriate applications or misinterpretations of outputs, yet these incidents disproportionately impact stakeholders other than the end users of the system. The study further establishes that the overall landscape of harms and failure modes for generative AI is distinct from that of traditional AI, requiring separate consideration in risk assessment and mitigation efforts.

What carries the argument

A taxonomy of generative AI failures and harms constructed by coding real-world incidents, which classifies failure modes and links them to affected stakeholders and harm types.

If this is right

  • Risk mitigation should emphasize non-technical measures such as public disclosures and user education.
  • Developers and deployers need to account for harms that extend beyond immediate users when designing safeguards.
  • Policymakers should adopt regulatory approaches tailored to the distinct failure patterns of generative AI.
  • Users and organizations require guidance on responsible appropriation of generative outputs to limit wider harms.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Ongoing tracking of incidents could reveal whether the current patterns shift as more users gain experience with the technology.
  • The emphasis on broader stakeholder impacts suggests liability frameworks may need to extend responsibility beyond the direct operator.
  • Interface designs that guide safer use could reduce the frequency of use-related failures without changing the underlying models.

Load-bearing premise

The collection of 499 publicly reported incidents is representative enough to reveal the true prevalence and patterns of generative AI harms and failure modes.

What would settle it

Discovery of a large set of unreported or under-reported incidents showing that design and development failures outnumber use-related ones and produce the same stakeholder impacts.

Figures

Figures reproduced from arXiv: 2505.22073 by Hoda Heidari, Hong Shen, Jason Hong, Lorrie Cranor, Megan Li, Ningjing Tang, Wendy Bickersteth.

Figure 1: Overview of Taxonomy of Generative AI Harms. Darker colors indicate higher prevalence in our dataset. Harms that were ...
Figure 2: Distribution of harmed stakeholders, with the most prevalent category of harm for each stakeholder represented by the left ...
Figure 3: Overview of Taxonomy of Sociotechnical Failure Modes. Darker colors indicate higher overall prevalence in our dataset. ...
Figure 4: Bars are divided by the type of failure mode that precipitated harm to the respective stakeholder. Use-related failure modes ...
Figure 5: We indicate the two most prevalent sociotechnical failure modes precipitating each category of harm with the colored sections ...
Original abstract

Due to its general-purpose nature, Generative AI is applied in an ever-growing set of domains and tasks, leading to an expanding set of risks of harm impacting people, communities, society, and the environment. These risks may arise due to failures during the design and development of the technology, as well as during its release, deployment, or downstream usages and appropriations of its outputs. In this paper, building on prior taxonomies of AI risks, harms, and failures, we construct a taxonomy specifically for Generative AI failures and map them to the harms they precipitate. Through a systematic analysis of 499 publicly reported incidents, we describe what harms are reported, how they arose, and who they impact. We report the prevalence of each type of harm, underlying failure mode, and harmed stakeholder, as well as their common co-occurrences. We find that most reported incidents are caused by use-related issues but bring harm to parties beyond the end user(s) of the Generative AI system at fault, and that the landscape of Generative AI harms is distinct from that of traditional AI. Our work offers actionable insights to policymakers, developers, and Generative AI users. In particular, we call for the prioritization of non-technical risk and harm mitigation strategies, including public disclosures and education and careful regulatory stances.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 3 minor

Summary. The manuscript constructs a taxonomy of Generative AI failures and harms by extending prior AI risk frameworks, then maps these to a dataset of 499 publicly reported real-world incidents. It reports prevalence and co-occurrence statistics for harms, failure modes (emphasizing use-related issues), and affected stakeholders, concluding that most incidents arise from use-related problems yet primarily harm non-end-users and that the GenAI harm landscape differs from traditional AI. The work offers recommendations prioritizing non-technical mitigation strategies such as disclosures, education, and regulatory approaches.

Significance. If the incident sample supports the reported distributions, the study supplies empirical grounding for GenAI-specific risk patterns, particularly the downstream use phase and third-party harms, which could usefully inform policy and developer priorities. The systematic coding of 499 cases and explicit co-occurrence analysis are strengths that extend existing taxonomies with concrete mappings. Significance is reduced by the observational nature of public reports, but the work still provides a useful descriptive baseline for the reported incident landscape.

major comments (3)
  1. [§3 (Data Collection and Coding)] The account of how the 499 incidents were assembled provides insufficient detail on search strategy, queried sources or databases, exact inclusion/exclusion criteria, time window, and inter-coder reliability metrics. Because prevalence, co-occurrence, and the claimed distinction from traditional AI all rest on the properties of this sample, the absence of these elements leaves the central descriptive claims vulnerable to selection bias.
  2. [§5 (Comparative Analysis)] The assertion that the GenAI harm landscape is distinct from traditional AI is presented without a matched quantitative comparison (e.g., a parallel coding of traditional AI incidents or reference to an established benchmark dataset). If the distinction rests only on qualitative differences observed in the current public-report sample, it cannot yet support the stronger claim of a fundamentally different landscape.
  3. [Discussion section] Recommendations for policymakers and the prioritization of non-technical strategies extrapolate from the observed distributions without explicitly bounding the claims to publicly reported incidents. Public reporting filters (newsworthiness, legal exposure) systematically under-sample low-severity, internal, or slowly manifesting harms; this limitation is load-bearing for the policy implications.
minor comments (3)
  1. [Abstract] Add one sentence summarizing the search and coding process at a high level so readers can immediately gauge the scope of the 499-incident sample.
  2. [Tables] In the tables reporting prevalence and co-occurrences, include per-category sample sizes or percentages of the total 499 to allow readers to assess the stability of the reported proportions.
  3. [Taxonomy figure or table] Ensure that the mapping between failure modes and harmed stakeholders is presented with explicit definitions or examples for each category to improve reproducibility.
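
To make the request for per-category counts and shares concrete, the following is a minimal sketch of how prevalence and co-occurrence figures could be tabulated from coded incidents. The record layout, the field names (`failure_modes`, `stakeholders`), and the four example incidents are all hypothetical and chosen for illustration; they are not taken from the paper's dataset or codebook.

```python
from collections import Counter
from itertools import product

# Hypothetical coded incidents. Labels and values are illustrative only
# and do not come from the paper's data.
incidents = [
    {"failure_modes": {"use"}, "stakeholders": {"third party"}},
    {"failure_modes": {"use"}, "stakeholders": {"end user"}},
    {"failure_modes": {"design"}, "stakeholders": {"third party"}},
    {"failure_modes": {"use", "deployment"}, "stakeholders": {"third party", "society"}},
]


def prevalence(records, field):
    """Return {label: (count, share of all records)} for one coded field."""
    counts = Counter(label for r in records for label in r[field])
    total = len(records)
    return {label: (n, n / total) for label, n in counts.items()}


def co_occurrence(records, field_a, field_b):
    """Count incidents where a label in field_a appears with a label in field_b."""
    pairs = Counter()
    for r in records:
        # One incident can carry several labels per field, so count every pairing once.
        pairs.update(product(sorted(r[field_a]), sorted(r[field_b])))
    return pairs


failure_prev = prevalence(incidents, "failure_modes")
pairs = co_occurrence(incidents, "failure_modes", "stakeholders")

for label, (n, share) in sorted(failure_prev.items()):
    print(f"{label}: n={n} ({share:.0%} of {len(incidents)})")
```

Because an incident may carry more than one label, shares computed this way can sum to more than 100%; reporting the raw count next to each share, as the comment suggests, makes that visible to readers.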

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their constructive and detailed comments, which help clarify the scope and limitations of our analysis. We address each major point below, indicating revisions where appropriate to improve transparency and precision without overstating our findings.

Point-by-point responses
  1. Referee: [§3 (Data Collection and Coding)]: The account of how the 499 incidents were assembled provides insufficient detail on search strategy, queried sources or databases, exact inclusion/exclusion criteria, time window, and inter-coder reliability metrics. Because prevalence, co-occurrence, and the claimed distinction from traditional AI all rest on the properties of this sample, the absence of these elements leaves the central descriptive claims vulnerable to selection bias.

    Authors: We agree that greater methodological transparency is needed. In the revised manuscript, we will expand §3 to provide a full description of the search strategy (including keywords and Boolean operators used), the specific sources and databases queried (e.g., news archives, incident repositories, and public reports), precise inclusion/exclusion criteria, the exact time window of incidents collected, and inter-coder reliability statistics (such as percentage agreement and Cohen’s kappa). These additions will directly address concerns about selection bias and allow readers to better evaluate the reported distributions and co-occurrences. revision: yes

  2. Referee: [§5 (Comparative Analysis)]: The assertion that the GenAI harm landscape is distinct from traditional AI is presented without a matched quantitative comparison (e.g., a parallel coding of traditional AI incidents or reference to an established benchmark dataset). If the distinction rests only on qualitative differences observed in the current public-report sample, it cannot yet support the stronger claim of a fundamentally different landscape.

    Authors: The distinction is drawn from observed differences in failure modes (particularly the predominance of use-related issues) and stakeholder impacts relative to patterns documented in prior traditional AI risk literature. However, we acknowledge the value of a more direct quantitative anchor. In revision, we will either incorporate a concise comparison against an existing benchmark dataset of traditional AI incidents where feasible, or revise the language in §5 and the abstract to frame the finding as a notable difference within the public-report sample rather than asserting a fundamentally different landscape. This will preserve the empirical observations while avoiding overstatement. revision: partial

  3. Referee: [Discussion section]: Recommendations for policymakers and the prioritization of non-technical strategies extrapolate from the observed distributions without explicitly bounding the claims to publicly reported incidents. Public reporting filters (newsworthiness, legal exposure) systematically under-sample low-severity, internal, or slowly manifesting harms; this limitation is load-bearing for the policy implications.

    Authors: We accept this critique. The revised Discussion will explicitly bound all prevalence claims, co-occurrence patterns, and resulting recommendations to the set of publicly reported incidents. We will add language acknowledging that public reporting filters may under-represent low-severity, internal, or slowly emerging harms, and we will qualify the policy implications accordingly—stressing that they pertain to the observed incident landscape while noting the need for complementary research on unreported cases. This will ensure the recommendations are appropriately scoped. revision: yes
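
The first response promises inter-coder reliability statistics such as percentage agreement and Cohen's kappa. As a rough illustration of what those two numbers measure, here is a minimal sketch for two coders assigning one label per incident. The label names and the six example codings are hypothetical, and the single-label setup is an assumption; the paper's actual coding scheme may allow multiple labels per incident, which would need a different treatment.

```python
from collections import Counter


def percent_agreement(a, b):
    """Share of items on which both coders chose the same label."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohens_kappa(a, b):
    """Cohen's kappa for two coders, one label per item."""
    n = len(a)
    p_o = percent_agreement(a, b)  # observed agreement
    counts_a, counts_b = Counter(a), Counter(b)
    # Agreement expected by chance, from each coder's marginal label frequencies.
    p_e = sum(counts_a[l] * counts_b[l] for l in set(a) | set(b)) / (n * n)
    if p_e == 1:
        return 1.0  # both coders used a single identical label throughout
    return (p_o - p_e) / (1 - p_e)


# Hypothetical codings of six incidents by two coders (illustrative only).
coder_1 = ["use", "use", "design", "use", "deployment", "design"]
coder_2 = ["use", "design", "design", "use", "deployment", "use"]

agreement = percent_agreement(coder_1, coder_2)
kappa = cohens_kappa(coder_1, coder_2)
print(f"agreement={agreement:.3f}, kappa={kappa:.3f}")
```

In this toy example the coders agree on 4 of 6 items, yet kappa is noticeably lower than the raw agreement because part of that agreement is expected by chance. That gap is why reporting both figures is more informative than percentage agreement alone.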

Circularity Check

0 steps flagged

No significant circularity in empirical observational analysis

Full rationale

The paper performs a systematic empirical mapping of 499 publicly reported incidents against a taxonomy built from prior external work on AI risks. All reported findings consist of direct counts, prevalences, co-occurrence statistics, and descriptive comparisons drawn from the collected incident data. No equations, model derivations, parameter fits, or first-principles predictions appear; therefore no step reduces by construction to the paper's own inputs or self-citations. The central claims rest on the external incident corpus rather than any internal tautology, satisfying the criteria for a self-contained, non-circular analysis.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Empirical study whose central claims rest on assumptions about data representativeness rather than new mathematical constructs or entities.

axioms (1)
  • domain assumption: Publicly reported incidents form a representative sample of generative AI harms and failure modes
    Prevalence, co-occurrence, and distinction claims are derived directly from analysis of these 499 incidents.




Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. What People See (and Miss) About Generative AI Risks: Perceptions of Failures, Risks, and Who Should Address Them

    cs.HC · 2026-04 · unverdicted · novelty 4.0

    A validated survey instrument grounded in real GenAI incidents reveals public perceptions of failure modes, risks, and stakeholder responsibilities, showing potential for guiding AI literacy efforts.
