Debate Only When Necessary: Adaptive Multiagent Collaboration for Efficient LLM Reasoning
4 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (4)
Verdicts: UNVERDICTED (4)
citing papers explorer
- Social Dynamics as Critical Vulnerabilities that Undermine Objective Decision-Making in LLM Collectives
  Social dynamics in LLM collectives cause representative agents to make less accurate decisions as peer pressure increases through larger adversarial groups, more capable peers, longer arguments, and persuasive styles.
- Heterogeneous Consensus-Progressive Reasoning for Efficient Multi-Agent Debate
  HCP-MAD reduces token costs in multi-agent debates by using heterogeneous consensus verification, adaptive pair-agent stopping, and escalated collective voting based on task complexity signals.
- Statistical Scouting Finds Debate-Safe but Not Debate-Useful Cases: A Matched-Ceiling Study of Open-Weight LLM Reasoning Protocols
  Oracle per-example routing among decoding, voting, and debate yields +13-14 pp gains over the best fixed protocol, but vote-entropy thresholds and learned routers recover only 1-2 pp with non-significant results.
- EMS: Multi-Agent Voting via Efficient Majority-then-Stopping
  EMS reduces the average number of agents invoked for majority voting by 32% via reliability-aware prioritization and early stopping on six benchmarks.