I Want to Break Free! Persuasion and Anti-Social Behavior of LLMs in Multi-Agent Settings with Social Hierarchy
3 Pith papers cite this work (2026). Representative citing papers (verdicts pending):
- Predicting Decisions of AI Agents from Limited Interaction through Text-Tabular Modeling
  A tabular foundation model with LLM-as-Observer features predicts AI agent decisions in controlled games, outperforming baselines by 4 AUC points and reducing error by 14% at K=16 interactions.
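The "4 AUC points" above refers to percentage points of area under the ROC curve. A minimal sketch of how such a comparison is computed, using the rank-based (Mann-Whitney) form of AUC; the labels and scores are toy values, not the paper's models or data.

```python
# Compare two classifiers by ROC AUC; a gain of 0.04 in AUC = "4 AUC points".
# All data below is hypothetical, for illustration only.

def roc_auc(labels, scores):
    """Probability that a random positive outranks a random negative
    (ties count half) -- equivalent to area under the ROC curve."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

y     = [1, 0, 1, 1, 0, 0, 1, 0]                    # toy ground-truth decisions
model = [0.9, 0.2, 0.8, 0.6, 0.4, 0.1, 0.7, 0.5]    # hypothetical model scores
base  = [0.6, 0.4, 0.5, 0.7, 0.6, 0.2, 0.5, 0.5]    # hypothetical baseline scores

gain_points = 100 * (roc_auc(y, model) - roc_auc(y, base))
```

On this toy data `gain_points` is the model's advantage over the baseline expressed in AUC points, the same unit used in the summary above.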
- Superminds Test: Actively Evaluating Collective Intelligence of Agent Society via Probing Agents
  Large-scale experiments on two million agents reveal that collective intelligence does not emerge from scale alone, due to sparse and shallow interactions.
- Human-LLM Dialogue Improves Diagnostic Accuracy in Emergency Care
  Interactive LLM dialogue raised residents' hard-case diagnostic correctness from 0.589 to 0.734 and produced medium effect sizes in a blinded study of seven physicians on 52 emergency cases.