pith. machine review for the scientific record.

arxiv: 2604.25088 · v1 · submitted 2026-04-28 · 💻 cs.AI · cs.CL

Recognition: unknown

Cooperate to Compete: Strategic Coordination in Multi-Agent Conquest

Abigail O'Neill, Alan Zhu, Joseph E. Gonzalez, Mihran Miroyan, Narges Norouzi


Pith reviewed 2026-05-07 16:34 UTC · model grok-4.3

classification 💻 cs.AI cs.CL
keywords multi-agent negotiation · language model agents · mixed-motive games · strategic coordination · prompt engineering · human-AI comparison · cooperation and competition

The pith

Steering language-model agents toward human negotiation patterns raises their win rate against human opponents from 22.2 percent to 32.7 percent in a new competitive game.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces Cooperate to Compete, a multi-agent game in which players pursue secret asymmetric goals through private, non-binding negotiations while racing to win. It compares how humans and language-model agents behave in these mixed-motive settings and finds that humans favor simpler deals, prove less reliable as partners, and negotiate more aggressively. Prompt changes that steer the agents toward the observed human style produce a clear lift in performance against human opponents. A reader would care because many real deployments of AI agents will require exactly this blend of short-term cooperation and long-term competition.

Core claim

In the Cooperate to Compete environment, language-model agents produce more complex and more reliable negotiation behavior than humans, who favor lower-complexity agreements and accept deals without a counter-offer only 56.3 percent of the time (versus 67.6 percent for agents). Targeted prompting that aligns agent behavior with these human patterns improves the agents' win rate against human opponents from 22.2 percent to 32.7 percent across more than 1,100 games.

What carries the argument

The Cooperate to Compete (C2C) environment, in which players hold asymmetric secret objectives and conduct private non-binding negotiations that can form and dissolve as short-term interests align or diverge.

If this is right

  • Language-model agents can be steered toward more effective mixed-motive behavior without retraining, simply by changing their negotiation prompts.
  • C2C provides a scalable testbed for studying coordination failures and successes that occur when short-term alliances must serve long-term competitive goals.
  • The collected dataset of 16,000 private conversations supplies a concrete resource for measuring and improving agent reliability in negotiation.
  • Human-AI performance gaps in this setting are large enough that modest behavioral alignment yields double-digit win-rate gains.
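The first bullet — steering behavior without retraining — amounts to composing extra negotiation instructions into the agent's system prompt. A minimal hypothetical sketch of such an intervention harness (the prompt wording and intervention names below are illustrative, not the authors' actual prompts):

```python
# Hypothetical sketch of prompt-level behavioral steering; the paper's
# actual prompts are not reproduced here.
BASE_PROMPT = "You are a negotiator in a territory-conquest game."

INTERVENTIONS = {
    # Inspired by the reported human patterns (lower-complexity deals,
    # more counter-offers); wording is illustrative only.
    "simple_deals": "Prefer deals with at most one condition per side.",
    "aggressive": "Counter-offer rather than accept a first proposal.",
}

def build_prompt(base: str, active: list[str]) -> str:
    """Compose a system prompt from a base plus named interventions."""
    lines = [base] + [INTERVENTIONS[name] for name in active]
    return "\n".join(lines)

prompt = build_prompt(BASE_PROMPT, ["simple_deals", "aggressive"])
```

The point of a harness like this is that each intervention stays a separable, auditable line of instruction, so win rates can be compared per intervention, as in Figures 5 and 7.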

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same prompting approach could be tested in other non-binding negotiation domains such as automated bargaining or multi-party policy simulation.
  • If human unreliability is the dominant factor, future agents might benefit from explicit modeling of partner reliability rather than pure imitation of human style.
  • The 10-point win-rate improvement suggests that real-world deployments involving repeated human-AI interactions could see compounding advantages from similar alignment techniques.

Load-bearing premise

The differences between human and language-model negotiation styles observed in the user study will remain stable enough to guide prompt changes that continue to work against new opponents and in new game variants.

What would settle it

Re-running the same prompting modifications against a fresh pool of human players or with a different base language model and measuring whether the win-rate gain disappears.
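One concrete way to score such a replication is to bootstrap a confidence interval for the win-rate difference and check whether it still excludes zero. A stdlib sketch on synthetic outcomes (the per-condition sample sizes are an assumption, since the paper reports only the aggregate game count):

```python
import random

def winrate_diff_ci(base, treated, reps=5000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the win-rate difference between two
    independent samples of 0/1 game outcomes. Illustrative sketch only."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(reps):
        b = rng.choices(base, k=len(base))
        t = rng.choices(treated, k=len(treated))
        diffs.append(sum(t) / len(t) - sum(b) / len(b))
    diffs.sort()
    lo = diffs[int(alpha / 2 * reps)]
    hi = diffs[int((1 - alpha / 2) * reps) - 1]
    return lo, hi

# Synthetic outcomes at the reported rates; n=550 per condition is assumed.
base = [1] * 122 + [0] * 428      # ~22.2% wins
treated = [1] * 180 + [0] * 370   # ~32.7% wins
lo, hi = winrate_diff_ci(base, treated)
```

If a fresh human pool or a different base model yields an interval that covers zero, the prompting gain did not transfer.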

Figures

Figures reproduced from arXiv: 2604.25088 by Abigail O'Neill, Alan Zhu, Joseph E. Gonzalez, Mihran Miroyan, Narges Norouzi.

Figure 1
Figure 1: Overview of C2C. (a) A sample game state from Red’s perspective. The board shows each territory’s owner (indicated by color) and troop count (indicated by number); territories marked with “?” are obscured and not visible to Red. (b) Available actions and a negotiation channel between Red and Blue. (c) Player strength comparison of various LM-based agents and humans based on the Plackett-Luce model. Human p…
Figure 2
Figure 2: A qualitative example of an evolving relationship. Early in the game…
Figure 3
Figure 3: Win Rate, Negotiation, and Deal-making metrics. 95% confidence intervals shown.
Figure 4
Figure 4: Reliability and Relationships metrics. Note that human deception is omitted as it…
Figure 5
Figure 5: Win rates across all interventions. Restricting negotiations and partnerships harms performance, while principled strategies improve performance. 95% confidence intervals shown.
Figure 7
Figure 7: Relevant metrics for Aggressive Negotiation, Support Seeking, and Deceiving interventions. 95% confidence intervals shown.
Figure 8
Figure 8: Overview of the human-AI game interface.
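Figure 1(c) ranks player strength with a Plackett-Luce model fit to game outcomes. As a rough illustration of how such strengths are estimated from full rankings, here is a minimal stdlib sketch using Hunter's MM updates on synthetic data (player names and rankings are hypothetical; this is not the paper's code):

```python
from collections import defaultdict

def fit_plackett_luce(rankings, iters=200):
    """Fit Plackett-Luce strengths from full rankings (best -> worst)
    via Hunter's MM updates. Toy sketch on synthetic data."""
    players = {p for r in rankings for p in r}
    w = {p: 1.0 for p in players}
    # wins[i]: number of sequential "choice" events player i wins,
    # i.e., times i is ranked ahead of the remaining candidates.
    wins = defaultdict(int)
    for r in rankings:
        for t in range(len(r) - 1):
            wins[r[t]] += 1
    for _ in range(iters):
        denom = defaultdict(float)
        for r in rankings:
            for t in range(len(r) - 1):
                s = sum(w[k] for k in r[t:])
                for k in r[t:]:
                    denom[k] += 1.0 / s
        w = {p: (wins[p] / denom[p] if denom[p] else w[p]) for p in players}
        z = sum(w.values())                 # normalize for identifiability
        w = {p: v / z for p, v in w.items()}
    return w

# Hypothetical rankings: "A" usually finishes first.
rankings = [["A", "B", "C"], ["A", "C", "B"], ["B", "A", "C"], ["A", "B", "C"]]
strengths = fit_plackett_luce(rankings)
```

The fitted strengths then order players much as panel (c) orders agents and humans.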
read the original abstract

Language Model (LM)-based agents remain largely untested in mixed-motive settings where agents must leverage short-term cooperation for long-term competitive goals (e.g., multi-party politics). We introduce Cooperate to Compete (C2C), a multi-agent environment where players can engage in private negotiations while competing to be the first to achieve their secret objective. Players have asymmetric objectives and negotiations are non-binding, allowing alliances to form and break as players' short-term interests align and diverge. We run AI only games and conduct a user study pitting human players against AI opponents. We identify significant differences between human and AI negotiation behaviors, finding that humans favor lower-complexity deals and are significantly less reliable partners compared to LM-based agents. We also find that humans are more aggressive negotiators, accepting deals without a counteroffer only 56.3% of the time compared to 67.6% for LM-based agents. Through targeted prompting inspired by these findings, we modify agents' negotiation behavior and improve win rates from 22.2% to 32.7%. We run over 1,100 games with over 16,000 private conversations totaling 15.2 million tokens and over 150,000 player actions. Our results establish C2C as a testbed for studying and building LM-based agents that can navigate the sophisticated coordination required for real-world deployments. The game, code, and dataset may be found at https://negotiationgame.io/c2c.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, and this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper introduces the Cooperate to Compete (C2C) multi-agent environment in which LM-based agents engage in non-binding private negotiations while pursuing asymmetric secret objectives. Through AI-only games and a human-AI user study, it identifies behavioral differences (humans favor lower-complexity deals, are less reliable partners, and accept deals without counteroffer only 56.3% of the time versus 67.6% for LM agents). Targeted prompting inspired by these differences is then shown to raise LM agent win rates from 22.2% to 32.7% across >1,100 games, >16,000 conversations, and >150,000 actions. The work positions C2C as a testbed and releases the game, code, and dataset.

Significance. If the central performance claim is statistically supported, the paper supplies a scalable, mixed-motive testbed and concrete evidence that human-AI behavioral differences can be leveraged to improve LM negotiation strategies. The large experimental scale and public artifacts are clear strengths that would enable follow-on work on coordination in politics-like settings.

major comments (2)
  1. [Abstract and Results] The headline claim that targeted prompting improves win rates from 22.2% to 32.7% is reported without per-condition sample sizes, standard errors, bootstrap intervals, or any hypothesis test. With only the aggregate >1,100 games stated, it is impossible to determine whether the 10.5-point lift exceeds sampling variability or is attributable to the prompting rather than to other factors.
  2. [User-study and prompting sections] The manuscript states that the prompting is 'inspired by' the observed human-AI differences (e.g., acceptance rates, reliability) but provides no explicit mapping from those statistics to the prompt modifications, nor any ablation showing that the specific changes (rather than generic prompting) drive the gain.
minor comments (2)
  1. [Abstract] The abstract should include brief baseline details, a description of the opponent pool, and mention of statistical testing so readers can evaluate the 10.5-point claim at a glance.
  2. [Methods] Notation for negotiation complexity and reliability metrics should be defined consistently when first introduced and carried through tables or figures.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments. We agree that the statistical details and the explicit link between user-study observations and prompting require strengthening. We address each point below and will revise the manuscript accordingly.

read point-by-point responses
  1. Referee: [Abstract and Results] The headline claim that targeted prompting improves win rates from 22.2% to 32.7% is reported without per-condition sample sizes, standard errors, bootstrap intervals, or any hypothesis test. With only the aggregate >1,100 games stated, it is impossible to determine whether the 10.5-point lift exceeds sampling variability or is attributable to the prompting rather than to other factors.

    Authors: We agree that the current reporting is insufficient to evaluate the reliability of the improvement. In the revision we will report the exact number of games per condition, standard errors, bootstrap confidence intervals, and the result of an appropriate hypothesis test (two-proportion z-test) for the difference between 22.2% and 32.7%. The total exceeds 1,100 games with balanced allocation across conditions, which supplies adequate power for these analyses. revision: yes
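The two-proportion z-test the authors propose is straightforward to sketch. The per-condition game counts below are an assumed balanced split of the >1,100 reported games, not figures from the paper:

```python
import math

def two_prop_z(p1, n1, p2, n2):
    """Two-proportion z-test with pooled standard error.
    Returns the z statistic and a two-sided p-value."""
    x1, x2 = round(p1 * n1), round(p2 * n2)   # implied success counts
    p = (x1 + x2) / (n1 + n2)                 # pooled proportion
    se = math.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    z = (p2 - p1) / se
    # Two-sided p via the normal CDF expressed with math.erf.
    p_two = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_two

# Hypothetical balanced split: 550 games per condition.
z, p = two_prop_z(0.222, 550, 0.327, 550)
```

Under that assumed split, a 10.5-point lift would be highly significant; the real test of course depends on the actual per-condition sample sizes the revision promises to report.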

  2. Referee: [User-study and prompting sections] The manuscript states that the prompting is 'inspired by' the observed human-AI differences (e.g., acceptance rates, reliability) but provides no explicit mapping from those statistics to the prompt modifications, nor any ablation showing that the specific changes (rather than generic prompting) drive the gain.

    Authors: We acknowledge that the manuscript lacks an explicit mapping and an ablation. In the revision we will add a subsection that directly links each reported human-AI difference (e.g., 56.3% vs. 67.6% acceptance without counter-offer, lower human reliability) to the corresponding prompt modifications. We will also include an ablation comparing the targeted prompts against a generic negotiation-prompt baseline to demonstrate that the specific changes, rather than prompting in general, account for the observed gain. revision: yes

Circularity Check

0 steps flagged

No significant circularity; empirical results are independent of inputs

full rationale

The paper reports results from separate runs of AI-only games (>1,100 total) and a distinct human-vs-AI user study. The win-rate improvement (22.2% to 32.7%) is obtained by applying prompting changes in new game instances, not by any re-use of the same data, fitted parameters, or self-referential definitions. No equations, uniqueness theorems, or ansatzes appear in the provided text, and no self-citations are invoked as load-bearing justification for the behavioral differences or performance gains. The derivation chain consists of observation followed by independent validation runs.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The work rests on standard LLM prompting assumptions and the premise that observed behavioral differences can be translated into effective prompt modifications; no free parameters or invented entities are introduced.

axioms (1)
  • domain assumption Targeted prompting can reliably alter LM negotiation behavior in the direction suggested by human-AI comparisons
    Invoked when the authors apply findings from the user study to modify agents and report the win-rate increase.

pith-pipeline@v0.9.0 · 5578 in / 1187 out tokens · 62542 ms · 2026-05-07T16:34:10.453099+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

45 extracted references · 12 canonical work pages


  2. [2]

    Cooperation, competition, and maliciousness: LLM-stakeholders interactive negotiation

    Sahar Abdelnabi, Amr Gomaa, Sarath Sivaprasad, Lea Schönherr, and Mario Fritz. Cooperation, competition, and maliciousness: LLM-stakeholders interactive negotiation. Advances in Neural Information Processing Systems, 37:83548–83599, 2024

  3. [3]

    Playing repeated games with large language models

    Elif Akata, Lion Schulz, Julian Coda-Forno, Seong Joon Oh, Matthias Bethge, and Eric Schulz. Playing repeated games with large language models. Nature Human Behaviour, pp. 1–11, 2025

  4. [4]

    Human-level play in the game of Diplomacy by combining language models with strategic reasoning

    Anton Bakhtin, Noam Brown, Emily Dinan, Gabriele Farina, Colin Flaherty, Daniel Fried, Andrew Goff, Jonathan Gray, Hengyuan Hu, et al. Human-level play in the game of Diplomacy by combining language models with strategic reasoning. Science, 378(6624):1067–1074, 2022

  5. [5]

    Using cognitive psychology to understand GPT-3

    Marcel Binz and Eric Schulz. Using cognitive psychology to understand GPT-3. Proceedings of the National Academy of Sciences, 120(6):e2218523120, 2023

  6. [6]

    Rank analysis of incomplete block designs: I. The method of paired comparisons

    Ralph Allan Bradley and Milton E Terry. Rank analysis of incomplete block designs: I. The method of paired comparisons. Biometrika, 39(3/4):324–345, 1952

  7. [7]

    Overcommunication in strategic information transmission games

    Hongbin Cai and Joseph Tao-Yi Wang. Overcommunication in strategic information transmission games. Games and Economic Behavior, 56(1):7–36, 2006

  8. [8]

    Diplomacy, 1957

    Allan Calhamer. Diplomacy, 1957

  9. [9]

    Bounded rationality for LLMs: Satisficing alignment at inference-time

    Mohamad Chehade, Soumya Suvra Ghosal, Souradip Chakraborty, Avinash Reddy, Dinesh Manocha, Hao Zhu, and Amrit Singh Bedi. Bounded rationality for LLMs: Satisficing alignment at inference-time. arXiv preprint arXiv:2505.23729, 2025

  10. [10]

    The collaboration gap

    Tim R Davidson, Adam Fourney, Saleema Amershi, Robert West, Eric Horvitz, and Ece Kamar. The collaboration gap. arXiv preprint arXiv:2511.02687, 2025

  11. [11]

    Cheap talk

    Joseph Farrell and Matthew Rabin. Cheap talk. Journal of Economic Perspectives, 10(3):103–118, 1996

  12. [12]

    Universal mechanisms

    Françoise Forges. Universal mechanisms. Econometrica: Journal of the Econometric Society, pp. 1341–1364, 1990

  13. [13]

    Gemini 3.1 flash lite model card

    Gemini Team. Gemini 3.1 flash lite model card. https://storage.googleapis.com/deepmind-media/Model-Cards/Gemini-3-1-Flash-Lite-Model-Card.pdf, March 2026a

  14. [14]

    Gemini 3.1 pro model card

    Gemini Team. Gemini 3.1 pro model card. https://storage.googleapis.com/deepmind-media/Model-Cards/Gemini-3-1-Pro-Model-Card.pdf, February 2026b

  15. [15]

    Explicit cooperation shapes human-like multi-agent LLM negotiation

    Yanru Jiang and Gülsah Akçakır. Explicit cooperation shapes human-like multi-agent LLM negotiation. In Proceedings of the Annual Meeting of the Cognitive Science Society, volume 47, 2025

  16. [16]

    Risk: The game of global domination, 1957

    Albert Lamorisse. Risk: The game of global domination, 1957

  17. [17]

    FightLadder: A benchmark for competitive multi-agent reinforcement learning

    Wenzhe Li, Zihan Ding, Seth Karten, and Chi Jin. FightLadder: A benchmark for competitive multi-agent reinforcement learning. arXiv preprint arXiv:2406.02081, 2024

  18. [18]

    AvalonBench: Evaluating LLMs playing the game of Avalon

    Jonathan Light, Min Cai, Sheng Shen, and Ziniu Hu. AvalonBench: Evaluating LLMs playing the game of Avalon. arXiv preprint arXiv:2310.05036, 2023

  19. [19]

    On the limited memory BFGS method for large scale optimization

    Dong C Liu and Jorge Nocedal. On the limited memory BFGS method for large scale optimization. Mathematical Programming, 45(1):503–528, 1989

  20. [20]

    Communication enhances LLMs' stability in strategic thinking

    Nunzio Lore and Babak Heydari. Communication enhances LLMs' stability in strategic thinking. arXiv preprint arXiv:2602.06081, 2026

  21. [21]

    Individual choice behavior, volume 4

    R Duncan Luce. Individual choice behavior, volume 4. Wiley, New York, 1959

  22. [22]

    (Ir)rationality and cognitive biases in large language models

    Olivia Macmillan-Scott and Mirco Musolesi. (Ir)rationality and cognitive biases in large language models. Royal Society Open Science, 11(6), 2024

  23. [23]

    Communication enables cooperation in LLM agents: A comparison with curriculum-based approaches

    Hachem Madmoun and Salem Lahlou. Communication enables cooperation in LLM agents: A comparison with curriculum-based approaches. arXiv preprint arXiv:2510.05748, 2025

  24. [24]

    Note on the sampling error of the difference between correlated proportions or percentages

    Quinn McNemar. Note on the sampling error of the difference between correlated proportions or percentages. Psychometrika, 12(2):153–157, 1947

  25. [25]

    LieCraft: A multi-agent framework for evaluating deceptive capabilities in language models

    Matthew Lyle Olson, Neale Ratzlaff, Musashi Hinck, Tri Nguyen, Vasudev Lal, Joseph Campbell, Simon Stepputtis, and Shao-Yen Tseng. LieCraft: A multi-agent framework for evaluating deceptive capabilities in language models. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 40, pp. 37802–37809, 2026

  26. [26]

    Introducing GPT-4.1 in the API

    OpenAI. Introducing GPT-4.1 in the API. https://openai.com/index/gpt-4-1/, April 2025a

  27. [27]

    Introducing GPT-5.2

    OpenAI. Introducing GPT-5.2. https://openai.com/index/introducing-gpt-5-2/, December 2025b

  28. [28]

    COMMA: A communicative multimodal multi-agent benchmark

    Timothy Ossowski, Jixuan Chen, Danyal Maqbool, Zefan Cai, Tyler Bradshaw, and Junjie Hu. COMMA: A communicative multimodal multi-agent benchmark. arXiv preprint arXiv:2410.07553, 2024

  29. [29]

    The analysis of permutations

    Robin L Plackett. The analysis of permutations. Journal of the Royal Statistical Society Series C: Applied Statistics, 24(2):193–202, 1975

  30. [30]

    Evaluating large language models through communication games: An agent-based framework using Werewolf in Unity

    Christian Poglitsch, Fabian Szakács, and Johanna Pirker. Evaluating large language models through communication games: An agent-based framework using Werewolf in Unity. In Proceedings of the 20th International Conference on the Foundations of Digital Games, pp. 1–10, 2025

  31. [31]

    Micromotives and macrobehavior

    Thomas C Schelling. Micromotives and macrobehavior. WW Norton & Company, 2006

  32. [32]

    BLGAN: Bayesian learning and genetic algorithm for supporting negotiation with incomplete information

    Kwang Mong Sim, Yuanyuan Guo, and Benyun Shi. BLGAN: Bayesian learning and genetic algorithm for supporting negotiation with incomplete information. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), 39(1):198–211, 2008

  33. [33]

    Evaluating generalization capabilities of LLM-based agents in mixed-motive scenarios using Concordia

    Chandler Smith, Marwa Abdulhai, Manfred Diaz, Marko Tesic, Rakshit S Trivedi, Alexander Sasha Vezhnevets, Lewis Hammond, Jesse Clifton, Minsuk Chang, Edgar A Duéñez-Guzmán, et al. Evaluating generalization capabilities of LLM-based agents in mixed-motive scenarios using Concordia. arXiv preprint arXiv:2512.03318, 2025

  34. [34]

    Beyond survival: Evaluating LLMs in social deduction games with human-aligned strategies

    Zirui Song, Yuan Huang, Junchang Liu, Haozhe Luo, Chenxi Wang, Lang Gao, Zixiang Xu, Mingfei Han, Xiaojun Chang, and Xiuying Chen. Beyond survival: Evaluating LLMs in social deduction games with human-aligned strategies. arXiv preprint arXiv:2510.11389, 2025

  35. [35]

    BattleAgentBench: A benchmark for evaluating cooperation and competition capabilities of language models in multi-agent systems

    Wei Wang, Dan Zhang, Tao Feng, Boyan Wang, and Jie Tang. BattleAgentBench: A benchmark for evaluating cooperation and competition capabilities of language models in multi-agent systems. arXiv preprint arXiv:2408.15971, 2024

  36. [36]

    Individual comparisons by ranking methods

    Frank Wilcoxon. Individual comparisons by ranking methods. Biometrics Bulletin, 1(6):80–83, 1945

  37. [37]

    More victories, less cooperation: Assessing Cicero's diplomacy play

    Wichayaporn Wongkamjan, Feng Gu, Yanze Wang, Ulf Hermjakob, Jonathan May, Brandon M Stewart, Jonathan Kummerfeld, Denis Peskoff, and Jordan Boyd-Graber. More victories, less cooperation: Assessing Cicero's diplomacy play. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 12423–12441, 2024

  38. [38]

    Grok 4.1 model card

    xAI. Grok 4.1 model card. https://data.x.ai/2025-11-17-grok-4-1-model-card.pdf, November 2025

  39. [39]

    Can large language model agents simulate human trust behavior?

    Chengxing Xie, Canyu Chen, Feiran Jia, Ziyu Ye, Shiyang Lai, Kai Shu, Jindong Gu, Adel Bibi, Ziniu Hu, David Jurgens, et al. Can large language model agents simulate human trust behavior? Advances in Neural Information Processing Systems, 37:15674–15729, 2024

  40. [40]

    Magic: Investigation of large language model powered multi-agent in cognition, adaptability, rationality and collaboration

    Lin Xu, Zhiyuan Hu, Daquan Zhou, Hongyu Ren, Zhen Dong, Kurt Keutzer, See Kiong Ng, and Jiashi Feng. Magic: Investigation of large language model powered multi-agent in cognition, adaptability, rationality and collaboration. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pp. 7315–7332, 2024

  41. [41]

    SOTOPIA: Interactive evaluation for social intelligence in language agents

    Xuhui Zhou, Hao Zhu, Leena Mathur, Ruohong Zhang, Haofei Yu, Zhengyang Qi, Louis-Philippe Morency, Yonatan Bisk, Daniel Fried, Graham Neubig, et al. SOTOPIA: Interactive evaluation for social intelligence in language agents. arXiv preprint arXiv:2310.11667, 2023

  42. [42]

    MultiAgentBench: Evaluating the collaboration and competition of LLM agents

    Kunlun Zhu, Hongyi Du, Zhaochen Hong, Xiaocheng Yang, Shuyi Guo, Daisy Zhe Wang, Zhenhailong Wang, Cheng Qian, Robert Tang, Heng Ji, et al. MultiAgentBench: Evaluating the collaboration and competition of LLM agents. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 8580–8622, 2025
