pith. machine review for the scientific record.

arxiv: 2604.07838 · v1 · submitted 2026-04-09 · 💻 cs.CY

Recognition: no theorem link

We Need Strong Preconditions For Using Simulations In Policy

Authors on Pith: no claims yet

Pith reviewed 2026-05-10 18:13 UTC · model grok-4.3

classification 💻 cs.CY
keywords LLM simulations · policy interventions · ethical AI · societal modeling · dual-use risks · accountability · participatory consent · marginalized groups

The pith

Societal-scale LLM agent simulations for policy must follow three preconditions to be used ethically.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper contends that LLM agent simulations offer policymakers a way to test interventions, but carry two understudied risks: dual-use potential and weak output validation. It proposes three preconditions to set boundaries: do not treat simulations of marginalized populations as neutral, ensure those populations participate in the simulations, and establish accountability. Combined with required development and deployment reports, these guardrails are meant to promote responsible use and build trust among decision-makers and the public.

Core claim

The authors argue that to responsibly develop and use societal-scale LLM agent simulations, developers and policymakers must adhere to three preconditions: do not treat simulations of marginalized populations as neutral technical outputs, do not simulate populations without their participation, and do not simulate without accountability. They posit that these guardrails, along with simulation development and deployment reports, will address challenges of dual-use potential and output validation, thereby building trust and ensuring public benefit.

What carries the argument

The three preconditions themselves carry the argument: they serve as guardrails for simulation developers and decision-makers, setting ethical boundaries on the use of LLM agent simulations for policy.

If this is right

  • Simulations of marginalized groups must be approached with awareness that they are not neutral, requiring explicit consideration of biases and impacts.
  • Populations should be involved in the simulation process to provide consent and input.
  • Clear accountability structures must be in place for both developers and users of the simulations.
  • Development and deployment reports will provide transparency to support trust building.
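The paper calls for simulation development and deployment reports but, per this review, does not prescribe a format. As a purely illustrative sketch, by analogy with model cards and dataset datasheets, such a report could be a structured record that flags which of the three preconditions remain undocumented. All field names and example values below are hypothetical, not drawn from the paper.

```python
from dataclasses import dataclass, field

@dataclass
class SimulationDeploymentReport:
    """Hypothetical report structure; the paper calls for such
    reports but does not fix their fields."""
    simulated_population: str
    policy_question: str
    participation_process: str  # how the population was involved (precondition 2)
    known_biases: list[str] = field(default_factory=list)        # non-neutrality (precondition 1)
    accountable_parties: list[str] = field(default_factory=list) # precondition 3
    validation_evidence: list[str] = field(default_factory=list)

    def missing_preconditions(self) -> list[str]:
        """List which of the three preconditions lack documentation."""
        missing = []
        if not self.known_biases:
            missing.append("non-neutrality analysis")
        if not self.participation_process:
            missing.append("population participation")
        if not self.accountable_parties:
            missing.append("accountability")
        return missing

# Hypothetical example: a report filed with two preconditions undocumented.
report = SimulationDeploymentReport(
    simulated_population="renters in a mid-sized city",
    policy_question="effect of an eviction-moratorium extension",
    participation_process="",  # not yet documented
    accountable_parties=["city housing office"],
)
print(report.missing_preconditions())
# → ['non-neutrality analysis', 'population participation']
```

Machine-checkable fields like these would let a registry reject deployments that skip a precondition outright, rather than relying on voluntary compliance.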

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • Adopting these preconditions could necessitate new participatory frameworks for AI in governance that extend to other modeling techniques.
  • Effective implementation might require regulatory backing to ensure the preconditions are not just voluntary guidelines.
  • These rules could influence how simulations are designed, potentially improving their accuracy through real participation but also complicating rapid deployment.

Load-bearing premise

The assumption that combining the three preconditions with development reports will sufficiently mitigate dual-use and validation problems and build trust, despite lacking evidence of their effectiveness or enforceability.

What would settle it

A real-world example where a simulation followed all three preconditions yet produced unvalidated outputs leading to harmful policy outcomes or eroded public trust would indicate the preconditions are insufficient.

Original abstract

Simulations, and more recently LLM agent simulations, have been adopted as useful tools for policymakers to explore interventions, rehearse potential scenarios, and forecast outcomes. While LLM simulations have enormous potential, two critical challenges remain understudied: the dual-use potential of accurate models of individual or population-level human behavior and the difficulty of validating simulation outputs. In light of these limitations, we must define boundaries for both simulation developers and decision-makers to ensure responsible development and ethical use. We propose and discuss three preconditions for societal-scale LLM agent simulations: 1) do not treat simulations of marginalized populations as neutral technical outputs, 2) do not simulate populations without their participation, and 3) do not simulate without accountability. We believe that these guardrails, combined with our call for simulation development and deployment reports, will help build trust among policymakers while promoting responsible development and use of societal-scale LLM agent simulations for the public benefit.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The manuscript identifies dual-use risks and validation challenges in societal-scale LLM agent simulations for policy and proposes three preconditions for responsible use: (1) do not treat simulations of marginalized populations as neutral technical outputs, (2) do not simulate populations without their participation, and (3) do not simulate without accountability. It further advocates for mandatory simulation development and deployment reports to build trust among policymakers and promote ethical practices.

Significance. The paper usefully surfaces ethical tensions in an emerging application area and offers concrete guardrails that could, if implemented, reduce harms and increase legitimacy of policy simulations. Its value lies in framing the problem and naming specific boundaries rather than in any new empirical findings or validated mechanisms.

major comments (3)
  1. [Abstract] Abstract and the section introducing the three preconditions: the central claim that these preconditions 'combined with our call for simulation development and deployment reports, will help build trust' and address dual-use/validation issues is asserted without any operationalization, logical derivation, or reference to prior evidence showing similar guardrails have reduced misuse in AI or simulation contexts.
  2. [Preconditions section] Discussion of precondition 2: the requirement of 'participation' is load-bearing for the proposal yet provides no analysis of how consent or involvement would scale to simulations of millions of agents or how it would constrain model accuracy or behavioral fidelity, leaving the feasibility of the guardrail unaddressed.
  3. [Conclusion] The sufficiency argument for the overall framework: no mechanism is supplied showing why accountability or non-neutrality framing would limit the dual-use potential of accurate individual- or population-level behavioral models, making the prescriptive recommendation rest on an unsupported assumption.
minor comments (2)
  1. [Title] The title is strongly normative; a more descriptive phrasing would better signal the manuscript's focus on preconditions and reports.
  2. [Throughout] Terms such as 'societal-scale' and 'marginalized populations' are used repeatedly but never defined operationally, which reduces precision in the policy recommendations.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their constructive comments, which help clarify the scope and limitations of our position paper. We respond to each major comment below.

Point-by-point responses
  1. Referee: [Abstract] Abstract and the section introducing the three preconditions: the central claim that these preconditions 'combined with our call for simulation development and deployment reports, will help build trust' and address dual-use/validation issues is asserted without any operationalization, logical derivation, or reference to prior evidence showing similar guardrails have reduced misuse in AI or simulation contexts.

    Authors: The manuscript is a position paper that proposes these preconditions based on ethical reasoning and the identification of risks, rather than presenting them as empirically tested solutions. We do not claim that they have been shown to reduce misuse in prior contexts, as this is an emerging application. In the revised version, we will update the abstract and relevant sections to emphasize that these are proposed boundaries derived from first principles and analogies to established practices in AI ethics and policy simulation, and we will include a new subsection discussing the need for future validation studies. This is a partial revision since the core argument remains unchanged. revision: partial

  2. Referee: [Preconditions section] Discussion of precondition 2: the requirement of 'participation' is load-bearing for the proposal yet provides no analysis of how consent or involvement would scale to simulations of millions of agents or how it would constrain model accuracy or behavioral fidelity, leaving the feasibility of the guardrail unaddressed.

    Authors: We agree that the feasibility and scalability of population participation require further elaboration. The original manuscript prioritizes articulating the ethical requirement over detailed implementation strategies. For the revision, we will add analysis to the preconditions section, including discussion of scalable approaches such as community representatives, differential privacy techniques for anonymized participation, and acknowledgment of potential impacts on simulation fidelity. We will also note that full participation may not always be feasible and suggest it as an ideal to strive toward. revision: yes

  3. Referee: [Conclusion] The sufficiency argument for the overall framework: no mechanism is supplied showing why accountability or non-neutrality framing would limit the dual-use potential of accurate individual- or population-level behavioral models, making the prescriptive recommendation rest on an unsupported assumption.

    Authors: This comment correctly identifies that the paper does not detail a specific mechanism by which these preconditions would constrain dual-use. As the work is conceptual and aims to set normative boundaries, it assumes that accountability structures and non-neutral framing can reduce risks through increased scrutiny and ethical awareness. We will revise the conclusion to explicitly frame this as a reasoned proposal rather than a proven sufficiency, and add a call for research into enforcement and effectiveness. We cannot provide an empirical mechanism at this stage, but we will clarify the argumentative basis. revision: partial

Circularity Check

0 steps flagged

No circularity: normative policy recommendations with no derivations or self-referential reductions

Full rationale

The paper advances three ethical preconditions for LLM agent simulations as forward-looking policy proposals grounded in identified challenges (dual-use potential and validation difficulties). It contains no equations, no fitted parameters, no predictions derived from data subsets, and no self-citations that serve as load-bearing premises. The central claim—that the preconditions plus development reports will build trust and promote responsible use—is presented as a belief rather than a derived result, with no reduction to prior inputs by construction. This is a standard non-circular position paper whose argument rests on normative reasoning rather than any technical or empirical chain that could collapse into self-definition or fitted inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The paper relies on the domain assumption that dual-use risks and validation difficulties are the primary barriers requiring these specific guardrails, without independent evidence or falsifiable tests of the preconditions.

axioms (1)
  • domain assumption LLM agent simulations have dual-use potential and are difficult to validate, necessitating ethical boundaries for developers and decision-makers.
    Directly stated in the abstract as the two critical challenges that motivate the preconditions.

pith-pipeline@v0.9.0 · 5454 in / 1205 out tokens · 73149 ms · 2026-05-10T18:13:28.105986+00:00 · methodology


Reference graph

Works this paper leans on

60 extracted references · 23 canonical work pages · 1 internal anchor
