arxiv: 2605.07996 · v1 · submitted 2026-05-08 · cs.GT · cs.MA · econ.GN · q-fin.EC

Nash without Numbers: A Social Choice Approach to Mixed Equilibria in Context-Ordinal Games

Ian Gemp, Crystal Qian, Marc Lanctot, Kate Larson

Pith reviewed 2026-05-11 02:34 UTC · model grok-4.3

keywords Nash equilibrium · ordinal preferences · social choice theory · mixed strategy equilibria · context-ordinal games · preference aggregation · learning in games · game theory

The pith

Nash equilibrium can be defined and computed using only ordinal action rankings aggregated via social choice rules.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper generalizes Nash equilibrium to games where players supply only ordinal rankings of their actions conditional on the joint actions of others, instead of numerical utilities. Social choice aggregation functions turn these rankings into a notion of best response, yielding a context-ordinal Nash equilibrium. Existence follows from mild conditions on the aggregation methods. The work adds regularization, approximation, and regret concepts, analyzes complexity in simple cases, and supplies learning rules that compute the equilibria directly from ordinal data.

Core claim

A mixed strategy profile is a context-ordinal Nash equilibrium when each player's mixed strategy is a best response under a social choice aggregation of their ordinal preference ranking over pure actions, given the opponents' mixed strategies. Such equilibria exist whenever the aggregation satisfies standard mild conditions drawn from social choice theory. The authors also define regularized and approximate versions together with regret measures and give learning dynamics that converge to these equilibria.

What carries the argument

Context-ordinal best response formed by applying a social choice aggregation function to a player's ordinal ranking of actions given opponents' joint mixed strategy.
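
To make the aggregation step concrete, here is a minimal Python sketch. It assumes, purely for illustration, that the aggregation rule is the Borda count and that each opponent pure action acts as a "voter" weighted by its probability under the opponent's mixed strategy. The paper's own aggregation rules and its treatment of mixed contexts may differ. The names `borda_scores`, `best_response`, and `row_rankings`, as well as the Chicken-style rankings, are hypothetical.

```python
from fractions import Fraction


def borda_scores(rankings, opponent_mix):
    """Weighted Borda scores for one player's actions.

    rankings: dict mapping each opponent pure action (a context) to a list of
        the player's own actions, ordered from most to least preferred.
    opponent_mix: dict mapping each opponent pure action to its probability.
    """
    actions = next(iter(rankings.values()))
    scores = {a: Fraction(0) for a in actions}
    for context, ranking in rankings.items():
        weight = Fraction(opponent_mix.get(context, 0))
        m = len(ranking)
        for position, action in enumerate(ranking):
            # Top-ranked action earns m - 1 points, bottom-ranked earns 0.
            scores[action] += weight * (m - 1 - position)
    return scores


def best_response(rankings, opponent_mix):
    """Set of actions with maximal weighted Borda score (assumed rule)."""
    scores = borda_scores(rankings, opponent_mix)
    top = max(scores.values())
    return {a for a, s in scores.items() if s == top}


# Hypothetical Chicken-style rankings for the row player.
row_rankings = {
    "Swerve": ["Straight", "Swerve"],    # opponent swerves: prefer Straight
    "Straight": ["Swerve", "Straight"],  # opponent goes straight: prefer Swerve
}
```

Under these assumptions, the input is nothing more than one ranking per context plus the opponent's mixing probabilities; no payoff numbers are needed. Exact ties between top scores yield a set-valued best response, which is what permits mixing at equilibrium.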

If this is right

Equilibria can be identified from elicited rankings without any numerical utility elicitation.
Regularized and approximate equilibria remain well-defined and computable when rankings contain noise or omissions.
Learning rules based on the aggregated best-response condition can be run directly on ordinal feedback.
Complexity results for simple games bound the cost of computing exact or approximate equilibria under common aggregation methods.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The framework reduces the cost of running equilibrium-based experiments with human subjects by replacing numerical payoff entry with simple rankings.
It supplies a direct route for applying equilibrium analysis to multi-agent systems where only qualitative preference data are observed.
Extensions to incomplete or cyclic rankings would immediately enlarge the set of games to which the method applies.

Load-bearing premise

Each player can supply a consistent ordinal ranking of their own actions for any fixed joint action of the others, and the chosen aggregation rule satisfies conditions that guarantee an equilibrium exists.

What would settle it

A concrete finite game in which every player supplies transitive ordinal rankings over actions in every context yet no mixed strategy profile satisfies the aggregated best-response condition for the chosen aggregation rule.
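
One way to probe such a test on small games is a brute-force scan over mixed profiles. The sketch below assumes a two-player, two-action game, a probability-weighted Borda count as the aggregation rule, and the condition "every action played with positive probability has a maximal aggregated score" as the equilibrium test. All of these are illustrative assumptions rather than the paper's definitions, and the function names and the `chicken` rankings are hypothetical.

```python
from fractions import Fraction


def borda_scores(rankings, opponent_mix):
    """Borda scores, weighting each context by its probability (assumed rule)."""
    actions = next(iter(rankings.values()))
    scores = {a: Fraction(0) for a in actions}
    for context, ranking in rankings.items():
        for position, action in enumerate(ranking):
            scores[action] += opponent_mix[context] * (len(ranking) - 1 - position)
    return scores


def is_co_ne(row_rankings, col_rankings, row_mix, col_mix):
    """True if each player's support lies inside their best-response set."""
    for rankings, own_mix, other_mix in (
        (row_rankings, row_mix, col_mix),
        (col_rankings, col_mix, row_mix),
    ):
        scores = borda_scores(rankings, other_mix)
        top = max(scores.values())
        if any(p > 0 and scores[a] < top for a, p in own_mix.items()):
            return False
    return True


def grid_search(row_rankings, col_rankings, steps=4):
    """Scan a grid of mixed profiles in a two-action game; list those that pass."""
    row_actions = next(iter(row_rankings.values()))
    col_actions = next(iter(col_rankings.values()))
    found = []
    for i in range(steps + 1):
        for j in range(steps + 1):
            p, q = Fraction(i, steps), Fraction(j, steps)
            row_mix = {row_actions[0]: p, row_actions[1]: 1 - p}
            col_mix = {col_actions[0]: q, col_actions[1]: 1 - q}
            if is_co_ne(row_rankings, col_rankings, row_mix, col_mix):
                found.append((p, q))
    return found


# Hypothetical symmetric Chicken rankings, keyed by the opponent's action.
chicken = {
    "Swerve": ["Straight", "Swerve"],
    "Straight": ["Swerve", "Straight"],
}
```

Each pair returned by `grid_search(chicken, chicken)` gives the probability that the row and column player choose `Straight`. An empty result on a coarse grid would only be suggestive: equilibria can sit between grid points, so a genuine counterexample would need an argument covering the whole simplex.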

Figures

Figures reproduced from arXiv: 2605.07996 by Crystal Qian, Ian Gemp, Kate Larson, Marc Lanctot.

**Figure 2.** Algorithm 1 applied to the example from Fig. 1.

**Figure 3.** We evaluate our (SGF-based) FTRL approach (p

**Figure 4.** Atari: We present two different CO-NEs computed using the FTRL-inspired approach:

**Figure 5.** Exploitability (ϵ) landscapes for a 2-action (Swerve/Straight) Chicken game using various regularized social choice BR(p=0,q=0.1,µ=[1/2,1/2])s. Panel (a) displays traditional exploitability based on cardinal payoffs whereas (b), (c), and (d) measure earth mover’s distance (EMD). Maximal lotteries (ML) and Borda coincide in two candidate (action) settings. However, we discussed a generalized notion of exter…

**Figure 6.** Best Response Regularization in Chicken.

**Figure 8.** We evaluate our (SGF-based) FTRL approach (p

**Figure 9.** Election (Table 1) LLE Equilibrium

**Figure 10.** Election (Table 3) LLE Equilibrium

Original abstract

Nash equilibrium serves as a fundamental mathematical tool in economics and game theory. However, it classically assumes knowledge of player utilities, whereas economics generally regards preferences as more fundamental. To leverage equilibrium analysis in strategic scenarios, one must first elicit numerical utilities consistent with player preferences, a delicate and time-consuming process. In this work, we forgo precise utilities and generalize the Nash equilibrium to a setting where we only assume a player is capable of providing an ordinal ranking of their actions within the context of other players' joint actions. The key technical challenge is to rethink the definition of a best-response. While the classical definition identifies actions maximizing expected payoff, we naturally look towards social choice theory for how to aggregate preferences to identify the most preferred actions. We define this generalized notion of a context-ordinal Nash equilibrium, establish its existence under mild conditions on aggregation methods, introduce notions of regularization, approximation, and regret, explore complexity for simple settings, and develop learning rules for computing such equilibria. In doing so, we provide a generalization of Nash equilibrium and demonstrate its direct applicability to elicited preferences in human experiments.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper defines a context-ordinal Nash equilibrium by aggregating ordinal rankings with social choice rules, but the mixed-strategy case looks under-specified.

The main thing to know is that this paper replaces the usual expected-utility best response with a social-choice aggregation over ordinal rankings that players give for their actions in each pure context. They claim this yields a well-defined equilibrium concept with existence under mild conditions on the aggregator, plus some regularization, regret, and learning rules on top. That specific move is new relative to the ordinal game theory they cite. The motivation for skipping cardinal utilities is also handled cleanly, which matters for experiments where numbers are expensive to elicit. The added complexity results for simple cases and the learning dynamics give the work some applied flavor.

The soft spot is the mixed-strategy extension. Rankings are supplied only for pure joint actions, yet a mixed Nash requires best responses when opponents randomize. The abstract does not describe an explicit rule for turning a lottery over contexts into a ranking or an aggregated choice, so it is not obvious that the best-response correspondence is non-empty or upper hemicontinuous for every mixed profile. If that step is missing or implicit, the fixed-point argument for existence does not automatically carry over. This is worth a close check in the full text rather than a fatal flaw, but it is the part that needs the most scrutiny.

The paper is aimed at game theorists and multi-agent researchers who want equilibrium tools that start from elicited rankings instead of utilities. A reader interested in foundations or in preference-based AI will get concrete new definitions and some computational angles to think about. It deserves a serious referee because the core construction is coherent on its own terms and the claims are falsifiable once the mixed case is written out. I would send it to review with a request to clarify how aggregation works on mixed contexts.

Referee Report

1 major / 1 minor

Summary. The paper introduces context-ordinal Nash equilibria as a generalization of classical Nash equilibrium for games in which players supply only ordinal rankings of their pure actions conditional on other players' pure joint-action profiles. Best responses are redefined via social-choice aggregation of these rankings rather than expected-utility maximization; the manuscript claims existence of mixed-strategy equilibria under mild conditions on the aggregator, and develops accompanying notions of regularization, approximation, regret, complexity results for simple cases, and learning dynamics for computing the equilibria. The stated goal is to enable equilibrium analysis directly from elicited ordinal preferences without cardinal utility elicitation.

Significance. If the existence argument and mixed-strategy extension are made rigorous, the work would provide a parameter-free bridge between game theory and social choice that permits equilibrium analysis in domains where only ordinal data are available, such as human-subject experiments. The additional development of approximation, regret, and learning rules would further increase applicability.

major comments (1)

[Definition of context-ordinal Nash equilibrium and existence theorem] The central existence claim for mixed equilibria rests on a best-response correspondence defined via aggregation of ordinal rankings supplied for pure contexts. No extension rule is supplied for how these rankings induce preferences (or aggregated choices) when the context is itself a mixed strategy profile, i.e., a lottery over pure joint actions. Without such a rule (e.g., via stochastic dominance, expected rank, or an explicit lottery extension), the best-response set is either empty or undefined for non-degenerate mixed profiles, rendering the fixed-point argument inapplicable.

minor comments (1)

[Existence theorem] The abstract asserts existence 'under mild conditions on aggregation methods' but does not list the precise axioms; the main text should state them explicitly (e.g., neutrality, monotonicity, or continuity properties) at the point where the existence theorem is proved.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the careful review and constructive feedback. The positive assessment of the work's potential significance is appreciated, and we agree that the technical point on extending the best-response definition to mixed profiles requires clarification to make the existence argument fully rigorous. We address the major comment below and will revise the manuscript accordingly.

Referee: [Definition of context-ordinal Nash equilibrium and existence theorem] The central existence claim for mixed equilibria rests on a best-response correspondence defined via aggregation of ordinal rankings supplied for pure contexts. No extension rule is supplied for how these rankings induce preferences (or aggregated choices) when the context is itself a mixed strategy profile, i.e., a lottery over pure joint actions. Without such a rule (e.g., via stochastic dominance, expected rank, or an explicit lottery extension), the best-response set is either empty or undefined for non-degenerate mixed profiles, rendering the fixed-point argument inapplicable.

Authors: We agree that the manuscript as submitted does not explicitly supply a lottery extension rule for ordinal rankings when the context is a mixed strategy profile, which leaves the best-response correspondence formally undefined for non-pure profiles and weakens the fixed-point argument. In the revision we will add a new subsection that defines such an extension. Specifically, we will extend each player's ordinal ranking over pure actions (conditional on a pure joint-action context) to a mixed context by applying the aggregator to the vector of expected ranks induced by the lottery; this preserves the aggregator's mild conditions (e.g., continuity and monotonicity) and ensures the resulting best-response correspondence is non-empty, convex-valued, and upper hemicontinuous on the mixed-strategy simplex. We will then verify that Kakutani's fixed-point theorem applies directly, thereby establishing existence of context-ordinal Nash equilibria. This addition will be parameter-free and consistent with the social-choice aggregation approach already used for pure contexts.

revision: yes
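
The expected-rank extension described in the response can be written out as follows. This is a sketch under assumptions: the symbols $r_i$, $\bar r_i$, $F_i$, and $A_i$ are introduced here for illustration and do not come from the source, and the second display specializes the aggregator to "select the actions of minimal expected rank", which is only one admissible choice.

```latex
% r_i(a | a_{-i}) in {1, ..., |A_i|}: player i's rank of own action a in pure context a_{-i}
% x_{-i}(a_{-i}) = \prod_{j \neq i} x_j(a_j): probability of that context under the opponents' mix
\[
  \bar r_i(a \mid x_{-i}) \;=\; \sum_{a_{-i}} x_{-i}(a_{-i})\, r_i(a \mid a_{-i}),
  \qquad
  \mathrm{BR}_i(x_{-i}) \;=\; F_i\bigl(\bar r_i(\cdot \mid x_{-i})\bigr).
\]
% Illustrative special case: all mixtures over the actions of minimal expected rank
\[
  \mathrm{BR}_i(x_{-i}) \;=\; \Delta\Bigl(\operatorname*{arg\,min}_{a \in A_i} \bar r_i(a \mid x_{-i})\Bigr).
\]
```

In the special case, $\bar r_i$ is multilinear in the opponents' strategies and therefore continuous, the minimum over a finite action set always exists, and the set of mixtures over the minimizers is convex. Upper hemicontinuity then follows from the standard maximum-theorem argument, which gives the properties Kakutani's theorem requires. For a general aggregator $F_i$, each of these properties would have to be checked separately.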

Circularity Check

0 steps flagged

No circularity: context-ordinal Nash equilibrium defined from external social choice axioms

The paper introduces a new equilibrium concept by replacing expected-utility best responses with social-choice aggregation of ordinal rankings supplied over pure action profiles. Existence is asserted under 'mild conditions on aggregation methods,' which are treated as independent inputs drawn from social choice theory rather than derived or fitted within the paper. No equations reduce a claimed prediction or existence result to a parameter that was itself calibrated on the target quantity; no self-citation chain is invoked to justify the core fixed-point argument; and the construction does not rename or smuggle in a known result under new coordinates. The derivation therefore remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on the existence of aggregation methods satisfying unspecified mild conditions and on the assumption that players can supply context-dependent ordinal rankings; no free parameters or invented entities are mentioned.

axioms (2)

domain assumption: Aggregation methods satisfy mild conditions sufficient for existence of context-ordinal Nash equilibrium.
Invoked in the abstract to establish existence of the generalized equilibrium.
domain assumption: Players can provide ordinal rankings of actions given others' joint actions.
Stated as the fundamental modeling assumption replacing numerical utilities.
