AI as a Tool for Simulation-Based Experiments in Literary Studies

Matthew Wilkens

arXiv: 2606.02293 · v1 · pith:5ITS7GJFnew · submitted 2026-06-01 · cs.CL

Pith reviewed 2026-06-28 14:52 UTC · model grok-4.3

classification: cs.CL

keywords: generative AI · literary studies · simulation · text generation · cultural production · in-distribution outputs · multiagent systems

The pith

Generative AI enables controlled simulations of literary production by generating texts that reflect specified cultural constraints.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper argues that generative AI opens new experimental possibilities in literary studies by allowing large-scale, low-cost simulations of how cultural systems produce literature. It notes that while current models do not yet reliably create book-length texts matching arbitrary stylistic or cultural specifications, separate bodies of work already exist on AI systems as stand-ins for human populations, the narrative qualities of AI text, the coherence of multiagent AI interactions, and techniques for steering AI knowledge and output. The paper reviews these components and reports experiments that generate literary texts and compare them to human novels, which it presents as the first demonstration of limited in-distribution outputs in this setting. If the integration succeeds, scholars could run counterfactual tests on how different constraints shape literary history.

Core claim

Existing research on AI proxies for humans, narrative properties of generated text, stability of multiagent simulations, and methods to alter AI knowledge can be combined to support AI-based modeling of cultural systems of literary production. Experiments on literary text generation provide the first demonstration of limited in-distribution outputs by AI models when compared to high-status human-authored novels.

What carries the argument

Use of generative AI systems as proxies for human populations within multiagent, multiturn simulations to model cultural constraints on literary production.

If this is right

Literary scholars gain the ability to run controlled, grounded, large-scale experiments on questions of cultural production at low cost.
Comparisons between AI-generated and human-authored novels become feasible for testing narrative and stylistic properties.
Full counterfactual literary-historical simulations become possible once the component techniques are combined.
Technical methods for predictably altering AI behavior can be applied to study how stylistic features arise under different constraints.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Such simulations could be extended to test specific historical hypotheses about how external events influence literary output.
Validation against existing digital humanities datasets of real texts might strengthen claims about simulation fidelity.
If stability improves, the approach could scale to longer narratives and more complex multiagent cultural dynamics.

Load-bearing premise

Separate lines of research on AI as human proxies, properties of generated text, stability of multiagent simulations, and techniques for altering AI knowledge can be integrated to yield reliable simulations that reflect arbitrarily specified cultural constraints or stylistic features.

What would settle it

An experiment in which multiagent AI simulations lose coherence before producing texts or in which generated outputs cannot be steered to match specified stylistic or cultural features even after applying known knowledge-alteration methods.

Figures

Figures reproduced from arXiv: 2606.02293 by Matthew Wilkens.

**Figure 2.** Comparison of human-authored texts in selected genres to AI generations under complex and basic

**Figure 3.** Fightin’ Words analysis of the most distinctive words in human- and AI-authored PW fiction.
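
For readers unfamiliar with the method named in the Figure 3 caption, the following is a minimal sketch of a Fightin' Words comparison (Monroe, Colaresi, and Quinn, 2008): a log-odds ratio between two corpora, smoothed by a Dirichlet prior and scaled by its estimated variance. The paper's actual prior, tokenization, and corpus handling are not given on this page, so the uniform prior of 0.01 and the function name `fightin_words_z` are assumptions made for illustration only.

```python
import math
from collections import Counter


def fightin_words_z(tokens_a, tokens_b, prior=0.01):
    """Return {word: z-score} for two token lists.

    Positive scores mark words more distinctive of corpus A,
    negative scores mark words more distinctive of corpus B.
    Assumes the combined vocabulary has at least two distinct words.
    """
    counts_a, counts_b = Counter(tokens_a), Counter(tokens_b)
    vocab = set(counts_a) | set(counts_b)
    n_a, n_b = sum(counts_a.values()), sum(counts_b.values())
    # Total prior mass; a uniform prior per word is an assumption here.
    alpha_0 = prior * len(vocab)

    scores = {}
    for word in vocab:
        # Smoothed counts: observed count plus prior pseudo-count.
        y_a = counts_a[word] + prior
        y_b = counts_b[word] + prior
        # Log-odds of the word within each corpus.
        log_odds_a = math.log(y_a / (n_a + alpha_0 - y_a))
        log_odds_b = math.log(y_b / (n_b + alpha_0 - y_b))
        # Approximate variance of the log-odds difference.
        variance = 1.0 / y_a + 1.0 / y_b
        scores[word] = (log_odds_a - log_odds_b) / math.sqrt(variance)
    return scores


if __name__ == "__main__":
    human = "the moon rose over the quiet sea".split()
    generated = "the data rose over the quiet model".split()
    ranked = sorted(fightin_words_z(human, generated).items(), key=lambda kv: kv[1])
    for word, z in ranked:
        print(f"{word:>8s} {z:+.3f}")
```

Sorting by score puts the words most distinctive of the second corpus at the top and those most distinctive of the first at the bottom, which is the kind of ranking a figure of this type usually plots.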

Original abstract

Generative artificial intelligence (AI) systems open new possibilities for experimentation in literary studies via controlled, grounded, large-scale, low-cost simulations of cultural production. Current systems have not yet been shown to produce high-quality, book-length narrative texts that reliably reflect arbitrarily specified cultural constraints or stylistic features. But there exists substantial relevant research on each of the components required for literary-historical simulation. These include the use and validation of AI systems as proxies for differentiable human populations; the narrative and stylistic properties of AI-generated texts; the stability and coherence of multiagent, multiturn AI simulations of human actors; and technical methods through which to alter in predictable ways the knowledge and behavior of generative systems. Together, these areas could provide a starting point for more ambitious AI-based modeling of cultural systems of literary production. We describe the possibilities and challenges of simulation-based experiments in literary studies, summarize the current state of the art in relevant fields, and explain key technical aspects of the work. To provide an example directly relevant to literary scholars, we present the results of experiments on literary text generation, including comparisons to high-status, human-authored novels. Our results include the first demonstration of (limited) in-distribution outputs by AI models in this domain. We conclude with a description of future work on full counterfactual literary-historical simulations using AI.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

This is a proposal for AI simulations in literary studies that asserts an experimental result on text generation without any methods, metrics, or data to support the claim.

The main thing here is that the paper wants to use generative AI for controlled simulations of literary production at scale. It pulls together existing work on AI as proxies for human readers and writers, the narrative qualities of model outputs, stability in multi-agent setups, and ways to steer model knowledge. The goal is to run counterfactual experiments on how culture shapes texts.

It does a clear job summarizing those component areas and explaining why they matter for literary scholars who want something more experimental than traditional close reading. The framing around low-cost, large-scale tests is straightforward and points to real technical pieces that already exist.

The soft spot is the experimental section. The abstract says the authors ran text generation tests, compared outputs to high-status human novels, and achieved the first limited in-distribution results. But it supplies no definition of in-distribution, no models or prompts used, no datasets, no quantitative measures, and no baselines. Without those, the claim cannot be checked for novelty or correctness. The actual full counterfactual simulations are left for future work.

This is aimed at digital humanities researchers who are open to computational methods and want a roadmap for combining AI tools with literary questions. Someone already deep in generative modeling will not find new technical ground. A reader looking for concrete evidence on the simulation side will come away wanting more.

It deserves a serious referee to see whether the full manuscript supplies the missing experimental details or whether it functions mainly as a position paper. I would send it to review on the chance the experiments are properly documented there.

Referee Report

1 major / 1 minor

Summary. The paper claims that generative AI systems enable new simulation-based experiments in literary studies via controlled, grounded, large-scale, low-cost modeling of cultural production. It summarizes relevant prior work on AI as proxies for human populations, narrative/stylistic properties of generated text, stability of multiagent simulations, and techniques for predictably altering AI knowledge/behavior. The authors state that they present experimental results on literary text generation (with comparisons to high-status human-authored novels) constituting the first demonstration of limited in-distribution outputs by AI models in this domain, and they outline future work on full counterfactual literary-historical simulations.

Significance. If the claimed experimental results hold under proper documentation and validation, the work would be significant as a methodological bridge between AI and literary studies, offering a framework for testing hypotheses about cultural constraints and stylistic features at scales infeasible with traditional methods. The synthesis of component research areas provides a useful starting point for interdisciplinary modeling even if full simulations remain prospective.

major comments (1)

[Abstract] The central empirical claim of 'the first demonstration of (limited) in-distribution outputs by AI models in this domain' (with comparisons to human-authored novels) supplies no operational definition of 'in-distribution,' experimental protocol, datasets, metrics, baselines, or quantitative outcomes. This renders the novelty assertion unevaluable and is load-bearing for the paper's stated contribution.

minor comments (1)

[Abstract] The phrasing that the summarized research areas 'could provide a starting point' for ambitious modeling is appropriately hedged but leaves unclear which specific integrations have been implemented versus proposed for future work.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their thoughtful review and positive assessment of the paper's potential significance. We address the single major comment below and agree that revisions to the abstract are warranted to make the central empirical claim fully evaluable.

Referee: [Abstract] The central empirical claim of 'the first demonstration of (limited) in-distribution outputs by AI models in this domain' (with comparisons to human-authored novels) supplies no operational definition of 'in-distribution,' experimental protocol, datasets, metrics, baselines, or quantitative outcomes. This renders the novelty assertion unevaluable and is load-bearing for the paper's stated contribution.

Authors: We agree that the abstract does not supply the requested operational details and that this weakens the evaluability of the novelty claim. In the revised manuscript we will expand the abstract to include: (1) an operational definition of 'in-distribution' as outputs whose feature distributions (stylistic n-gram frequencies, sentence-length statistics, and embedding-space proximity) fall inside the 95% confidence intervals derived from the human novel corpus; (2) a concise statement of the experimental protocol (prompt construction, temperature settings, and post-generation filtering); (3) the specific datasets (public-domain 19th-century novels used for both training constraints and comparison); (4) the metrics and baselines employed (perplexity against a fine-tuned GPT-2 reference, cosine similarity on sentence-BERT embeddings, and human-novel inter-text distances); and (5) the key quantitative outcomes (e.g., mean similarity scores and statistical tests). These elements already appear in the methods and results sections; the revision will ensure the abstract summarizes them at the required level of specificity without altering the paper's core contribution.

Revision: yes
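
The operational definition offered in the response above (generated outputs whose features fall inside a 95% interval derived from the human corpus) can be sketched for a single feature. This is a minimal illustration, not the paper's protocol: it swaps in an empirical central 95% percentile interval rather than a confidence interval, uses mean sentence length as the only feature, and the names `mean_sentence_length`, `central_interval`, and `in_distribution_rate` are hypothetical.

```python
import math
import re
import statistics


def mean_sentence_length(text: str) -> float:
    """Average number of whitespace-separated tokens per sentence."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not sentences:
        return 0.0
    return statistics.fmean(len(s.split()) for s in sentences)


def central_interval(values, coverage=0.95):
    """Empirical interval spanning the central `coverage` share of `values`.

    Index choices round outward, so the interval is slightly conservative.
    """
    ordered = sorted(values)
    tail = (1.0 - coverage) / 2.0
    last = len(ordered) - 1
    low_idx = math.floor(tail * last)
    high_idx = math.ceil((1.0 - tail) * last)
    return ordered[low_idx], ordered[high_idx]


def in_distribution_rate(human_texts, generated_texts, feature=mean_sentence_length):
    """Share of generated texts whose feature value lies in the human interval."""
    low, high = central_interval([feature(t) for t in human_texts])
    inside = [low <= feature(t) <= high for t in generated_texts]
    return sum(inside) / len(inside)


if __name__ == "__main__":
    # Toy stand-ins for corpora; real use would load full novels.
    human = ["Short one. Then a longer sentence follows here.",
             "A single sentence of moderate length appears.",
             "Two words. Three more words."]
    generated = ["This generated sentence has a middling length.",
                 "Word."]
    print(in_distribution_rate(human, generated))
```

A fuller check along the lines the response describes would repeat this over several features (n-gram frequencies, embedding distances) and report each rate separately rather than collapsing them into one number.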

Circularity Check

0 steps flagged

No circularity; proposal paper with no derivations or fitted predictions

The paper is a high-level proposal summarizing existing research components for AI-based literary simulations and deferring full experiments to future work. It contains no equations, parameters, derivations, or quantitative models. The claim of 'first demonstration of (limited) in-distribution outputs' is presented as an empirical result from experiments but is not supported by any self-referential definitions, fitted inputs renamed as predictions, or load-bearing self-citations that reduce the argument to its own inputs. No steps match the enumerated circularity patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The proposal rests on the domain assumption that AI systems can function as proxies for human populations and that separate research strands can be combined for reliable cultural simulations; no free parameters or invented entities are introduced in the abstract.

axioms (1)

domain assumption: AI systems can serve as proxies for differentiable human populations
Invoked in the abstract as one of the key research components required for literary-historical simulation.

Reference graph

Works this paper leans on

111 extracted references · 74 canonical work pages · 9 internal anchors

[1]

and Suchow, Jordan W

Almaatouq, Abdullah and Griffiths, Thomas L. and Suchow, Jordan W. and Whiting, Mark E. and Evans, James and Watts, Duncan J. , date =. Beyond Playing 20 Questions with Nature:. 2024 , journal =. doi:10.1017/S0140525X22002874 , url =

work page doi:10.1017/s0140525x22002874 2024
[2]

and Kozlowski, Austin C

Anthis, Jacy Reese and Liu, Ryan and Richardson, Sean M. and Kozlowski, Austin C. and Koch, Bernard and Evans, James and Brynjolfsson, Erik and Bernstein, Michael , date =. 2025 , eprint =. doi:10.48550/arXiv.2504.02234 , url =

work page doi:10.48550/arxiv.2504.02234 2025
[3]

and Busby, Ethan C

Argyle, Lisa P. and Busby, Ethan C. and Fulda, Nancy and Gubler, Joshua R. and Rytting, Christopher and Wingate, David , date =. Out of. 2023 , journal =. doi:10.1017/pan.2023.2 , url =

work page doi:10.1017/pan.2023.2 2023
[4]

2025 , journal =

Contextualizing Ancient Texts with Generative Neural Networks , author =. 2025 , journal =. doi:10.1038/s41586-025-09292-5 , url =

work page doi:10.1038/s41586-025-09292-5 2025
[5]

Historical

Atari, Mohammad and Henrich, Joseph , date =. Historical. 2023 , journal =. doi:10.1177/09637214221149737 , url =

work page doi:10.1177/09637214221149737 2023
[6]

Proceedings of the 2025

Baek, Jinheon and Jauhar, Sujay Kumar and Cucerzan, Silviu and Hwang, Sung Ju , editor =. Proceedings of the 2025. 2025 , pages =. doi:10.18653/v1/2025.naacl-long.342 , url =

work page doi:10.18653/v1/2025.naacl-long.342 2025
[7]

and Jia, Hengrui and Travers, Adelin and Zhang, Baiwu and Lie, David and Papernot, Nicolas , date =

Bourtoule, Lucas and Chandrasekaran, Varun and Choquette-Choo, Christopher A. and Jia, Hengrui and Travers, Adelin and Zhang, Baiwu and Lie, David and Papernot, Nicolas , date =. Machine. 2020 , eprint =. doi:10.48550/arXiv.1912.03817 , url =

work page doi:10.48550/arxiv.1912.03817 2020
[8]

Chakrabarty, Tuhin and Laban, Philippe and Agarwal, Divyansh and Muresan, Smaranda and Wu, Chien-Sheng , date =. Art or. Proceedings of the 2024. 2024 , series =. doi:10.1145/3613904.3642731 , url =

work page doi:10.1145/3613904.3642731 2024
[9]

2024 , eprint =

Chen, Ruizhe and Zhang, Xiaotian and Luo, Meng and Chai, Wenhao and Liu, Zuozhu , date =. 2024 , eprint =. doi:10.48550/arXiv.2410.04070 , url =

work page doi:10.48550/arxiv.2410.04070 2024
[10]

Surveying the

Chen, Yuqi and Li, Sixuan and Li, Ying and Atari, Mohammad , date =. Surveying the. Proceedings of the 2024. 2024 , eprint =. doi:10.18653/v1/2024.emnlp-main.151 , url =

work page doi:10.18653/v1/2024.emnlp-main.151 2024
[11]

Chen, Jiaao and Yang, Diyi , date =. Unlearn. 2023 , eprint =. doi:10.48550/arXiv.2310.20150 , url =

work page doi:10.48550/arxiv.2310.20150 2023
[12]

2024 , eprint =

Chiu, Yu Ying and Jiang, Liwei and Lin, Bill Yuchen and Park, Chan Young and Li, Shuyue Stella and Ravi, Sahithya and Bhatia, Mehar and Antoniak, Maria and Tsvetkov, Yulia and Shwartz, Vered and Choi, Yejin , date =. 2024 , eprint =

2024
[13]

and Su, Zhe and Kao, Hsien-Te and Nguyen, Daniel and Lynch, Spencer and Sap, Maarten and Volkova, Svitlana , date =

Cohen, Myke C. and Su, Zhe and Kao, Hsien-Te and Nguyen, Daniel and Lynch, Spencer and Sap, Maarten and Volkova, Svitlana , date =. Exploring. 2025 , eprint =. doi:10.48550/arXiv.2506.15928 , url =

work page doi:10.48550/arxiv.2506.15928 2025
[14]

Dillion, Danica and Tandon, Niket and Gu, Yuling and Gray, Kurt , date =. Can. 2023 , journal =. doi:10.1016/j.tics.2023.04.008 , url =. 37173156 , eprinttype =

work page doi:10.1016/j.tics.2023.04.008 2023
[15]

TinyStories: How Small Can Language Models Be and Still Speak Coherent English?

Eldan, Ronen and Li, Yuanzhi , date =. 2023 , eprint =. doi:10.48550/arXiv.2305.07759 , url =

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2305.07759 2023
[16]

Eldan, Ronen and Russinovich, Mark , date =. Who's. 2023 , eprint =. doi:10.48550/arXiv.2310.02238 , url =

work page doi:10.48550/arxiv.2310.02238 2023
[17]

2024 , eprint =

Applying Sparse Autoencoders to Unlearn Knowledge in Language Models , author =. 2024 , eprint =. doi:10.48550/arXiv.2410.19278 , url =

work page doi:10.48550/arxiv.2410.19278 2024
[18]

and Manzoor, Emaad and Pryzant, Reid and Sridhar, Dhanya and Wood-Doughty, Zach and Eisenstein, Jacob and Grimmer, Justin and Reichart, Roi and Roberts, Margaret E

Feder, Amir and Keith, Katherine A. and Manzoor, Emaad and Pryzant, Reid and Sridhar, Dhanya and Wood-Doughty, Zach and Eisenstein, Jacob and Grimmer, Justin and Reichart, Roi and Roberts, Margaret E. and Stewart, Brandon M. and Veitch, Victor and Yang, Diyi , date =. Causal. 2022 , journal =. doi:10.1162/tacl_a_00511 , url =

work page doi:10.1162/tacl_a_00511 2022
[19]

Pretraining

Fittschen, Elisabeth and Li, Sabrina and Lippincott, Tom and Choshen, Leshem and Messner, Craig , date =. Pretraining. 2025 , eprint =. doi:10.48550/arXiv.2504.05523 , url =

work page doi:10.48550/arxiv.2504.05523 2025
[20]

Fleisig, Eve and Blodgett, Su Lin and Klein, Dan and Talat, Zeerak , editor =. The. Proceedings of the 2024. 2024 , pages =. doi:10.18653/v1/2024.naacl-long.126 , url =

work page doi:10.18653/v1/2024.naacl-long.126 2024
[21]

Massively

Fung, Yi and Zhao, Ruining and Doo, Jae and Sun, Chenkai and Ji, Heng , date =. Massively. 2024 , eprint =

2024
[22]

Gandikota, Rohit and Materzyńska, Joanna and Fiotto-Kaufman, Jaden and Bau, David , date =. Erasing. 2023. 2023 , pages =. doi:10.1109/ICCV51070.2023.00230 , url =

work page doi:10.1109/iccv51070.2023.00230 2023
[23]

Gao, Chen and Lan, Xiaochong and Lu, Zhihong and Mao, Jinzhu and Piao, Jinghua and Wang, Huandong and Jin, Depeng and Li, Yong , date =. S3:. 2023 , eprint =. doi:10.48550/arXiv.2307.14984 , url =

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2307.14984 2023
[24]

and Maini, Pratyush and Raghunathan, Aditi , date =

Ghosal, Gaurav R. and Maini, Pratyush and Raghunathan, Aditi , date =. Memorization. 2025 , eprint =. doi:10.48550/arXiv.2507.09937 , url =

work page doi:10.48550/arxiv.2507.09937 2025
[25]

and Christakis, Nicholas A

Grossmann, Igor and Feinberg, Matthew and Parker, Dawn C. and Christakis, Nicholas A. and Tetlock, Philip E. and Cunningham, William A. , date =. 2023 , journal =. doi:10.1126/science.adi1778 , langid =. 37319216 , eprinttype =

work page doi:10.1126/science.adi1778 2023
[26]

Learning to

Gurung, Alexander and Lapata, Mirella , date =. Learning to. 2025 , eprint =. doi:10.48550/arXiv.2503.22828 , url =

work page doi:10.48550/arxiv.2503.22828 2025
[27]

, date =

Hewitt, John and Chen, Sarah and Xie, Lanruo Lora and Adams, Edward and Liang, Percy and Manning, Christopher D. , date =. Model. 2024 , eprint =. doi:10.48550/arXiv.2402.06155 , url =

work page doi:10.48550/arxiv.2402.06155 2024
[28]

Large Language Models as Simulated Economic Agents: What Can We Learn from Homo Silicus?

Horton, John J and Filippas, Apostolos and Manning, Benjamin S. Large Language Models as Simulated Economic Agents: What Can We Learn from Homo Silicus?. 2023

2023
[29]

2025 , eprint =

Hou, Zhaoyi Joey and Zhang, Bowei Alvin and Lu, Yining and Baghel, Bhiman Kumar and Brei, Anneliese and Lu, Ximing and Jiang, Meng and Brahman, Faeze and Chaturvedi, Snigdha and Chang, Haw-Shiuan and Khashabi, Daniel and Li, Xiang Lorraine , date =. 2025 , eprint =. doi:10.48550/arXiv.2510.20091 , url =

work page doi:10.48550/arxiv.2510.20091 2025
[30]

Ilharco, Gabriel and Ribeiro, Marco Tulio and Wortsman, Mitchell and Gururangan, Suchin and Schmidt, Ludwig and Hajishirzi, Hannaneh and Farhadi, Ali , date =. Editing. 2023 , eprint =. doi:10.48550/arXiv.2212.04089 , url =

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2212.04089 2023
[31]

Ji, Ke and Xu, Jiahao and Liang, Tian and Liu, Qiuzhi and He, Zhiwei and Chen, Xingyu and Liu, Xiaoyuan and Wang, Zhijie and Chen, Junying and Wang, Benyou and Tu, Zhaopeng and Mi, Haitao and Yu, Dong , date =. The. 2025 , eprint =. doi:10.13140/RG.2.2.33772.07043 , url =

work page doi:10.13140/rg.2.2.33772.07043 2025
[32]

2023 , eprint =

Jinxin, Shi and Jiabao, Zhao and Yilei, Wang and Xingjiao, Wu and Jiawen, Li and Liang, He , date =. 2023 , eprint =. doi:10.48550/arXiv.2308.12503 , url =

work page doi:10.48550/arxiv.2308.12503 2023
[33]

2025 , eprint =

Joshi, Brihi and Venkatapathy, Sriram and Bansal, Mohit and Peng, Nanyun and Chang, Haw-Shiuan , date =. 2025 , eprint =. doi:10.48550/arXiv.2503.17136 , url =

work page doi:10.48550/arxiv.2503.17136 2025
[34]

Generating the

Karell, Daniel and Shu, Matthew and Davidson, Thomas and Okura, Keitaro , date =. Generating the. 2025 , journal =

2025
[35]

Text and

Keith, Katherine and Jensen, David and O’Connor, Brendan , date =. Text and. Proceedings of the 58th. 2020 , pages =. doi:10.18653/v1/2020.acl-main.474 , url =

work page doi:10.18653/v1/2020.acl-main.474 2020
[36]

2024 , eprint=

In Silico Sociology: Forecasting COVID-19 Polarization with Large Language Models , author=. 2024 , eprint=

2024
[37]

2025 , url =

Societal and Technological Progress as Sewing an Ever-Growing, Ever-Changing, Patchy, and Polychrome Quilt , author =. 2025 , url =

2025
[38]

2023 , eprint =

Lin, Jiaju and Zhao, Haoran and Zhang, Aochi and Wu, Yiting and Ping, Huqiuyue and Chen, Qin , date =. 2023 , eprint =. doi:10.48550/arXiv.2308.04026 , url =

work page doi:10.48550/arxiv.2308.04026 2023
[39]

Measuring the

Lin, Jinkun and Zhang, Anqi and Lecuyer, Mathias and Li, Jinyang and Panda, Aurojit and Sen, Siddhartha , date =. Measuring the. 2022 , eprint =

2022
[40]

Quantifying the

Li, Chao and Su, Xing and Han, Haoying and Xue, Cong and Zheng, Chunmo and Fan, Chao , date =. Quantifying the. 2023 , eprint =. doi:10.48550/arXiv.2308.03313 , url =

work page doi:10.48550/arxiv.2308.03313 2023
[41]

Chain of

Liu, Hao and Sferrazza, Carmelo and Abbeel, Pieter , date =. Chain of. 2023 , eprint =. doi:10.48550/arXiv.2302.02676 , url =

work page doi:10.48550/arxiv.2302.02676 2023
[42]

and Shwartz, Vered , date =

Liu, Zhuozhuo Joy and Samir, Farhan and Bhatia, Mehar and Nelson, Laura K. and Shwartz, Vered , date =. Is. 2025 , eprint =. doi:10.48550/arXiv.2505.18322 , url =

work page doi:10.48550/arxiv.2505.18322 2025
[43]

Liu, Ken Ziyu , date =. Machine. 2024 , url =

2024
[44]

Liu, Alisa and Han, Xiaochuang and Wang, Yizhong and Tsvetkov, Yulia and Choi, Yejin and Smith, Noah A , date =. Tuning. 2024 , location =

2024
[45]

TOFU: A Task of Fictitious Unlearning for LLMs

Maini, Pratyush and Feng, Zhili and Schwarzschild, Avi and Lipton, Zachary C. and Kolter, J. Zico , date =. 2024 , eprint =. doi:10.48550/arXiv.2401.06121 , url =

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2401.06121 2024
[46]

Locating and Editing Factual Associations in

Meng, Kevin and Bau, David and Andonian, Alex and Belinkov, Yonatan , booktitle =. Locating and Editing Factual Associations in
[47]

Meng, Kevin and Sharma, Arnab Sen and Andonian, Alex and Belinkov, Yonatan and Bau, David , date =. Mass-. 2023 , eprint =. doi:10.48550/arXiv.2210.07229 , url =

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2210.07229 2023
[48]

and Rush, Alexander M

Morris, John X. and Rush, Alexander M. , date =. Contextual. 2024 , year = 2024, eprint =. doi:10.48550/arXiv.2410.02525 , url =

work page doi:10.48550/arxiv.2410.02525 2024
[49]

and Hajishirzi, Hannaneh and Koh, Pang Wei and Dodge, Jesse and Dasigi, Pradeep , date =

Morrison, Jacob and Smith, Noah A. and Hajishirzi, Hannaneh and Koh, Pang Wei and Dodge, Jesse and Dasigi, Pradeep , date =. Merge to. 2024 , eprint =

2024
[50]

Muscato, Benedetta and Mala, Chandana Sree and Marchiori Manerba, Marta and Gezici, Gizem and Giannotti, Fosca , editor =. An. Proceedings of the 3rd. 2024 , pages =

2024
[51]

Designing

Navarro, Alejandro Leonardo García and Koneva, Nataliia and Sánchez-Macián, Alfonso and Hernández, José Alberto and Goyanes, Manuel , date =. Designing. 2024 , eprint =. doi:10.48550/arXiv.2411.07038 , url =

work page doi:10.48550/arxiv.2411.07038 2024
[52]

Descent-to-

Neel, Seth and Roth, Aaron and Sharifi-Malvajerdi, Saeed , date =. Descent-to-. Proceedings of the 32nd. 2021 , pages =

2021
[53]

2025 , eprint =

Empirically Evaluating Commonsense Intelligence in Large Language Models with Large-Scale Human Judgments , author =. 2025 , eprint =. doi:10.48550/arXiv.2505.10309 , url =

work page doi:10.48550/arxiv.2505.10309 2025
[54]

Learning

Papadimitriou, Isabel and Jurafsky, Dan , date =. Learning. 2020 , eprint =. doi:10.48550/arXiv.2004.14601 , url =

work page doi:10.48550/arxiv.2004.14601 2020
[55]

O’Brien, Carrie J

Park, Joon Sung and O'Brien, Joseph and Cai, Carrie Jun and Morris, Meredith Ringel and Liang, Percy and Bernstein, Michael S. , date =. Generative. Proceedings of the 36th. 2023 , pages =. doi:10.1145/3586183.3606763 , url =

work page doi:10.1145/3586183.3606763 2023
[56]

Survey of

Pawar, Siddhesh and Park, Junyeong and Jin, Jiho and Arora, Arnav and Myung, Junho and Yadav, Srishti and Haznitrama, Faiz Ghifari and Song, Inhwa and Oh, Alice and Augenstein, Isabelle , date =. Survey of. 2024 , langid =

2024
[57]

Pawelczyk, Martin and Neel, Seth and Lakkaraju, Himabindu , date =. In-. 2024 , eprint =. doi:10.48550/arXiv.2310.07579 , url =

work page doi:10.48550/arxiv.2310.07579 2024
[58]

and Agarwal, Divyansh and Huang, Kung-Hsiang and Tan, Sarah and Peng, Nanyun and Wu, Chien-Sheng , date =

Qiu, Haoyi and Fabbri, Alexander R. and Agarwal, Divyansh and Huang, Kung-Hsiang and Tan, Sarah and Peng, Nanyun and Wu, Chien-Sheng , date =. Evaluating. 2024 , eprint =

2024
[59]

The Social

Roland, Edwin and So, Richard and Long, Hoyt , date =. The Social. 2025 , journal =. doi:10.1007/s00146-025-02790-0 , url =

work page doi:10.1007/s00146-025-02790-0 2025
[60]

Sorensen, Taylor and Moore, Jared and Fisher, Jillian and Gordon, Mitchell and Mireshghallah, Niloofar and Rytting, Christopher Michael and Ye, Andre and Jiang, Liwei and Lu, Ximing and Dziri, Nouha and Althoff, Tim and Choi, Yejin , date =. A. 2024 , eprint =

2024
[61]

Templeton, Adley , date =. Scaling. 2024 , institution =

2024
[62]

Guardrail

Thaker, Pratiksha and Maurya, Yash and Hu, Shengyuan and Wu, Zhiwei Steven and Smith, Virginia , date =. Guardrail. 2024 , eprint =. doi:10.48550/arXiv.2403.03329 , url =

work page doi:10.48550/arxiv.2403.03329 2024
[63]

and Wilkens, Matthew , date =

Underwood, Ted and Nelson, Laura K. and Wilkens, Matthew , date =. Can. 2025 , eprint =. doi:10.48550/arXiv.2505.00030 , url =

work page doi:10.48550/arxiv.2505.00030 2025
[64]

and Narayanan, Arvind , date =

Veselovsky, Veniamin and Argin, Berke and Stroebl, Benedikt and Wendler, Chris and West, Robert and Evans, James and Griffiths, Thomas L. and Narayanan, Arvind , date =. Localized. 2025 , eprint =. doi:10.48550/arXiv.2504.10191 , url =

work page doi:10.48550/arxiv.2504.10191 2025
[65]

Feder and Wang, Angelina and Atalla, Chad and Barocas, Solon and Blodgett, Su Lin and Chouldechova, Alexandra and Corvi, Emily and Dow, P

Wallach, Hanna and Desai, Meera and Cooper, A. Feder and Wang, Angelina and Atalla, Chad and Barocas, Solon and Blodgett, Su Lin and Chouldechova, Alexandra and Corvi, Emily and Dow, P. Alex and Garcia-Gathright, Jean and Olteanu, Alexandra and Pangakis, Nicholas and Reed, Stefanie and Sheng, Emily and Vann, Dan and Vaughan, Jennifer Wortman and Vogel, Ma...

work page doi:10.48550/arxiv.2502.00561 2025
[66]

Walsh, Melanie and Preus, Anna and Gronski, Elizabeth , date =. Does. 2024 , eprint =. doi:10.48550/arXiv.2410.15299 , url =

work page doi:10.48550/arxiv.2410.15299 2024
[67]

2024 , journal =

A Survey on Large Language Model Based Autonomous Agents , author =. 2024 , journal =. doi:10.1007/s11704-024-40231-1 , url =

work page doi:10.1007/s11704-024-40231-1 2024
[68]

and Zhang, Chiyuan and Zettlemoyer, Luke and Li, Kai and Henderson, Peter , date =

Wei, Boyi and Shi, Weijia and Huang, Yangsibo and Smith, Noah A. and Zhang, Chiyuan and Zettlemoyer, Luke and Li, Kai and Henderson, Peter , date =. Evaluating. 2024 , eprint =. doi:10.48550/arXiv.2406.18664 , url =

work page doi:10.48550/arxiv.2406.18664 2024
[69]

Xi, Zhiheng and Chen, Wenxiang and Guo, Xin and He, Wei and Ding, Yiwen and Hong, Boyang and Zhang, Ming and Wang, Junzhe and Jin, Senjie and Zhou, Enyu and Zheng, Rui and Fan, Xiaoran and Wang, Xiao and Xiong, Limao and Zhou, Yuhao and Wang, Weiran and Jiang, Changhao and Zou, Yicheng and Liu, Xiangyang and Yin, Zhangyue and Dou, Shihan and Weng, Rongxia...

2023
[70]

2024 , origdate =

Xi, Zhiheng , date =. 2024 , origdate =

2024
[71]

Echoes in

Xu, Weijia and Jojic, Nebojsa and Rao, Sudha and Brockett, Chris and Dolan, Bill , date =. Echoes in. 2025 , journal =. doi:10.1073/pnas.2504966122 , url =

work page doi:10.1073/pnas.2504966122 2025
[72]

Yang, Jiashu and Wang, Ningning and Zhao, Yian and Feng, Chaoran and Du, Junjia and Pang, Hao and Fang, Zhirui and Cheng, Xuxin , date =. Kongzi:. 2025 , eprint =. doi:10.48550/arXiv.2504.09488 , url =

work page doi:10.48550/arxiv.2504.09488 2025
[73]

Composing

Zhang, Jinghan and Chen, Shiqi and Liu, Junteng and He, Junxian , date =. Composing. 2023 , eprint =. doi:10.48550/arXiv.2306.14870 , url =

work page doi:10.48550/arxiv.2306.14870 2023
[74]

2025 , eprint =

Zhang, Xuanming and Chen, Yuxuan and Yeh, Min-Hsuan and Li, Yixuan , date =. 2025 , eprint =. doi:10.48550/arXiv.2505.18943 , url =

work page doi:10.48550/arxiv.2505.18943 2025
[75]

Negative Preference Optimization: From Catastrophic Collapse to Effective Unlearning

Zhang, Ruiqi and Lin, Licong and Bai, Yu and Mei, Song , date =. Negative. 2024 , eprint =. doi:10.48550/arXiv.2404.05868 , url =

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2404.05868 2024
[76]

2025 , eprint =

Zhang, Yiming and Diddee, Harshita and Holm, Susan and Liu, Hanchen and Liu, Xinyue and Samuel, Vinay and Wang, Barry and Ippolito, Daphne , date =. 2025 , eprint =. doi:10.48550/arXiv.2504.05228 , url =

work page doi:10.48550/arxiv.2504.05228 2025
[77]

Zhao, Yang and Du, Li and Ding, Xiao and Xiong, Kai and Sun, Zhouhao and Jun, Shi and Liu, Ting and Qin, Bing. Deciphering the. Findings of the. 2024. doi:10.18653/v1/2024.findings-acl.559
[78]

Zhong, Zexuan and Wu, Zhengxuan and Manning, Christopher D. and Potts, Christopher and Chen, Danqi. 2024. doi:10.48550/arXiv.2305.14795
[79]

Zhou, Jiaxu and Huang, Jen-tse and Zhou, Xuhui and Lam, Man Ho and Wang, Xintao and Zhu, Hao and Wang, Wenxuan and Sap, Maarten. The. 2025. doi:10.48550/arXiv.2509.18052
[80]

Zhou, Xuhui and Liu, Jiarui and Yerukola, Akhila and Kim, Hyunwoo and Sap, Maarten. 2026.
