arXiv preprint arXiv:2505.21371

Wang, Z · 2025 · arXiv 2505.21371

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Understanding the Mechanism of Altruism in Large Language Models

econ.GN · 2026-04-21 · unverdicted · novelty 6.0

A small set of sparse autoencoder features in LLMs drives shifts between generous and selfish allocations in dictator games, with causal patching and steering confirming their role and generalization to other social games.

Large language models converge on competitive rationality but diverge on cooperation across providers and generations

physics.soc-ph · 2026-04-01 · unverdicted · novelty 6.0

LLMs converge on competitive rationality and coordination but diverge 48-fold on cooperation, with provider identity and generational shifts as dominant factors across 38 games.

citing papers explorer

Understanding the Mechanism of Altruism in Large Language Models

econ.GN · 2026-04-21 · unverdicted · none · ref 245

A small set of sparse autoencoder features in LLMs drives shifts between generous and selfish allocations in dictator games, with causal patching and steering confirming their role and generalization to other social games.

Large language models converge on competitive rationality but diverge on cooperation across providers and generations

physics.soc-ph · 2026-04-01 · unverdicted · none · ref 54

LLMs converge on competitive rationality and coordination but diverge 48-fold on cooperation, with provider identity and generational shifts as dominant factors across 38 games.
