pith. machine review for the scientific record.

cs.GT

Computer Science and Game Theory

Covers all theoretical and applied aspects at the intersection of computer science and game theory, including work in mechanism design, learning in games (which may overlap with Learning), foundations of agent modeling in games (which may overlap with Multiagent systems), coordination, specification and formal methods for non-cooperative computational environments. The area also deals with applications of game theory to areas such as electronic commerce.

cs.GT 2026-05-13 Recognition

New check verifies clustering fairness in near-linear time

Check, Please: Verifiably Fair Clustering

DC-mPJR+ restricts fairness tests to coalitions near unselected centers and approximates full proportional representation within an additive factor of 2.

Abstract:
Popular centroid-based clustering methods are typically optimized for global objectives and may fail to adequately represent large groups of datapoints. To address this concern, recent work puts forward clustering analogs of social choice proportionality concepts, such as Proportionally Representative Fairness (also known as mPJR). For proportionality guarantees to be useful in practice, they must be (a) achievable and (b) efficiently auditable, so that one can check whether standard approaches, such as $k$-means, which are not guaranteed to provide proportional representation in general, nevertheless output proportional solutions on specific inputs. In this work, we study the computational complexity of verifying proportional representation in clustering. We first show that verifying mPJR is coNP-hard. Inspired by PJR+ -- a strengthening of PJR that is polynomial-time verifiable in the committee voting setting -- we introduce mPJR+ as its metric analog. However, verifying mPJR+ relies on repeated submodular minimization, rendering it impractical at scale. Hence, we introduce Default Coalitions mPJR+ (DC-mPJR+): a new proportionality concept that offers representation guarantees to a restricted set of coalitions around unselected centers, and as a result, admits an $O(mn \log n + mnk)$ verification algorithm. DC-mPJR+ is satisfied by SEAR and remains a meaningful proxy for global fairness: any solution satisfying $\gamma$-DC-mPJR+ also satisfies $(\gamma + 2)$-mPJR+. Together, our results identify a practical and theoretically grounded path for auditing proportional representation in clustering.
cs.GT 2026-05-13 2 theorems

Optimal welfare strategies under different discounts use finite counting memory

Social Welfare under Heterogeneous Time Preferences

In multi-principal MDPs, utilitarian-optimal strategies need not be positional, yet they admit polynomial-time finite-memory synthesis.

Abstract:
In several socioeconomically critical decision-making settings, such as fair resource allocation, climate policy, or AI alignment, multiple principals interact within a common arena. While it is well established that these principals may have differing preferences, decision-making under heterogeneous time preferences remains relatively unexplored. In particular, principals may weigh future outcomes differently and may derive distinct utilities from the same decisions. Motivated by such scenarios, we introduce the notion of heterogeneous time preferences in MDPs, where multiple principals possess distinct reward functions and apply different discount factors to future rewards. To compute meaningful decisions in such settings, an AI agent must rely on a notion of optimality that accounts for the preferences of all principals. We adopt a utilitarian notion of social welfare, defined as the sum of utilities accrued to all principals, and study the synthesis of agent strategies that maximise this welfare. Under heterogeneous time preferences, we show that optimal strategies are no longer positional, even when all principals receive identical rewards. Nevertheless, optimal strategies remain structurally simple: they can be realized as pure finite-memory counting strategies, require only polynomial memory in the system size, and can be synthesized in polynomial time. On the other hand, we show that deciding threshold questions for optimal positional strategies is NP-hard, exposing a poor trade-off: insisting on positional simplicity neither makes synthesis tractable nor preserves social welfare.
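
A minimal numeric sketch of the non-positionality phenomenon described above, with made-up rewards and discounts (not from the paper): with one state and two actions, the welfare-optimal strategy counts steps and switches actions, beating both stationary strategies.

```python
# Toy model: one state, two actions, two principals with different discounts.
GAMMA = (0.5, 0.95)         # discount factors of principals 1 and 2
REWARD = {"a": (2.0, 0.0),  # action a pays principal 1 only
          "b": (0.0, 1.0)}  # action b pays principal 2 only
HORIZON = 2000              # long enough to approximate the infinite sums

def welfare(policy):
    """Utilitarian welfare: sum of discounted utilities of both principals."""
    return sum(GAMMA[p] ** t * REWARD[policy(t)][p]
               for t in range(HORIZON) for p in (0, 1))

print("always a       :", welfare(lambda t: "a"))                    # ~ 4.00
print("always b       :", welfare(lambda t: "b"))                    # ~20.00
print("a twice, then b:", welfare(lambda t: "a" if t < 2 else "b"))  # ~21.05
```

Both positional strategies are dominated by the two-step counting strategy, which is exactly the finite counting memory the abstract refers to.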
cs.GT 2026-05-13 1 theorem

Sure-almost-sure window mean-payoff in MDPs is in P for fixed windows

Sure-almost-sure and Sure-limit-sure Window Mean Payoff in Markov Decision Processes

The combined sure and almost-sure conditions match the complexity of each taken separately, with explicit memory bounds for the resulting strategies.

Abstract:
Given rationals $\alpha$ and $\beta$, the sure-almost-sure problem for a quantitative objective $\varphi$ in a Markov decision process (MDP) asks if one can simultaneously ensure that all outcomes of the MDP have $\varphi$-value at least $\alpha$ (i.e. sure $\alpha$ satisfaction) and with probability $1$ the outcome has $\varphi$-value at least $\beta$ (i.e. almost-sure $\beta$ satisfaction). The sure-limit-sure problem asks if for all $\varepsilon > 0$ one can simultaneously ensure that all outcomes have $\varphi$-value at least $\alpha$ and with probability at least $1 - \varepsilon$ the outcome has $\varphi$-value at least $\beta$. Moreover, if simultaneous satisfaction of objectives is possible, then one would also like to construct a strategy (for sure-almost-sure) or a family of strategies (for sure-limit-sure) that achieves this. In this paper, we solve the sure-almost-sure and sure-limit-sure problems for window mean-payoff objectives. The window mean-payoff objective strengthens the standard mean-payoff objective by requiring that the average payoff of a finite window that slides over an infinite run be greater than a given threshold. We study two variants of window mean payoff: in the fixed variant, the window length $\ell$ is given, while in the bounded variant, the length is not given but is required to be bounded throughout the run. We show that the sure-almost-sure problem and the sure-limit-sure problem are both in P for the fixed variant (if $\ell$ is given in unary) and are both in NP $\cap$ coNP for the bounded variant, matching the computational complexity of sure satisfaction and almost-sure satisfaction when considered separately for these objectives. We also give bounds for the memory requirement of winning strategies for all considered problems.
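
As a concrete reading of the objective, here is a hypothetical checker (not the paper's algorithm) for the fixed-window condition on a finite run prefix: every position must open a window of length at most ell whose average weight meets the threshold.

```python
def good_window(weights, i, ell, threshold):
    """Does some window of length <= ell starting at i have mean >= threshold?"""
    total = 0.0
    for j in range(i, min(i + ell, len(weights))):
        total += weights[j]
        if total >= threshold * (j - i + 1):   # window i..j qualifies
            return True
    return False

def fixed_window_holds(weights, ell, threshold):
    # On an infinite run the check applies at every position; on a finite
    # prefix, windows near the end are truncated.
    return all(good_window(weights, i, ell, threshold)
               for i in range(len(weights)))

run = [1, -1, 2, 0, -1, 3]
print(fixed_window_holds(run, ell=2, threshold=0.0))   # True
```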
cs.GT 2026-05-13 Recognition

CVaR breaks standard persuasion reduction yet yields polynomial LP

Bayesian Persuasion with a Risk-Conscious Receiver

Finite-state models let each CVaR value be written as max-affine in the posterior, so active-facet refinement produces an exact polynomial-size linear program.

Abstract:
We study Bayesian persuasion when the receiver evaluates actions by reward-side Conditional Value-at-Risk (CVaR) rather than expected utility. CVaR preferences break the standard action-based direct-recommendation reduction: merging signals that recommend the same action can change the receiver's tail-risk ranking and destroy incentive compatibility. We show that this failure does not imply intractability in the explicit finite-state model. Each CVaR action value is max-affine in the posterior, and refining recommendations by the active affine piece yields an active-facet revelation principle and an exact polynomial-size linear program. We further identify a representation boundary: listed polyhedral risks remain tractable by the same LP, whereas succinctly represented facet families make exact persuasion NP-hard. Finally, we give a finite-precision approximation scheme for risk preferences determined by finitely many stable posterior statistics.
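
A small illustration of the max-affine structure the abstract leans on (conventions assumed, not the paper's code): reward-side CVaR at level alpha is the mean of the worst alpha-fraction of rewards, and the Rockafellar-Uryasev form CVaR_a(X) = max_t [ t - E[(t - X)^+] / a ] expresses it as a maximum of functions affine in the distribution, hence in the posterior.

```python
import numpy as np

def cvar_tail(rewards, probs, alpha):
    """Average of the worst alpha probability mass of rewards."""
    mass, total = 0.0, 0.0
    for i in np.argsort(rewards):          # worst rewards first
        take = min(probs[i], alpha - mass)
        total += take * rewards[i]
        mass += take
        if mass >= alpha - 1e-12:
            break
    return total / alpha

def cvar_ru(rewards, probs, alpha):
    # the inner maximum is attained at t = VaR_alpha, a support point
    return max(t - np.sum(probs * np.maximum(t - rewards, 0.0)) / alpha
               for t in rewards)

r, p = np.array([0.0, 4.0, 10.0]), np.array([0.2, 0.3, 0.5])
for a in (0.1, 0.3, 0.6):
    print(a, cvar_tail(r, p, a), cvar_ru(r, p, a))   # the two forms agree
```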
cs.GT 2026-05-13 Recognition

Mechanism gives first truthful approx for size-limited ad auctions

Position Auctions with a Capacity Constraint

Density ordering plus targeted swaps respects global space while keeping the auction incentive-compatible for single-parameter bidders.

Abstract:
Sponsored search auctions are commonly modeled as an assignment of a fixed set of slots (positions) to a set of advertisers, with welfare maximization being reducible to a standard matching problem. Motivated by modern ad formats, we study a richer variant of the classical position auctions model, in which ads have heterogeneous sizes and the platform must jointly select and assign a subset of ads to positions subject to a global space constraint. We formulate this as a matching problem with a capacity constraint, and propose an algorithmic technique that goes beyond simple greedy methods while achieving constant factor approximation guarantees. Our allocation rule augments density-based ordering with capacity-aware local improvements, which allow for re-allocations that improve welfare, while respecting the capacity constraint. Applied in the context of position auctions, we analyze this mechanism under the assumption of single-parameter agents and position-dependent click-through-rates (CTRs). We show that a minor modification to our approach yields a universally truthful randomized mechanism with a constant factor approximation guarantee. To the best of our knowledge, this is the first truthful constant-approximation mechanism for this variant of capacity-constrained matching.
cs.GT 2026-05-13 Recognition

Nash product rule limits manipulation gain to factor of 2

Approximate Strategyproofness in Approval-based Budget Division

The bound is tight under fairness and efficiency axioms and extends to concave utilities in approval-based budget division.

Abstract:
In approval-based budget division, the task is to allocate a divisible resource to the candidates based on the voters' approval preferences over the candidates. For this setting, Brandl et al. [2021] have shown that no distribution rule can be strategyproof, efficient, and fair at the same time. In this paper, we aim to circumvent this impossibility theorem by focusing on approximate strategyproofness. To this end, we analyze the incentive ratio of distribution rules, which quantifies the maximum multiplicative utility gain of a voter by manipulating. While it turns out that several classical rules have a large incentive ratio, we prove that the Nash product rule ($\mathsf{NASH}$) has an incentive ratio of $2$, thereby demonstrating that we can bypass the impossibility of Brandl et al. by relaxing strategyproofness. Moreover, we show that an incentive ratio of $2$ is optimal subject to some of the fairness and efficiency properties of $\mathsf{NASH}$, and that the positive result for the Nash product rule even holds when voters may report arbitrary concave utility functions. Finally, we complement our results with an experimental analysis.
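
For concreteness, a sketch of the Nash product rule in its standard formulation (details may differ from the paper's setup): maximize the sum of voters' log-utilities over budget distributions, here via a generic convex solver.

```python
import numpy as np
from scipy.optimize import minimize

A = np.array([[1, 1, 0],      # approval matrix: one row per voter
              [0, 1, 1],
              [0, 0, 1]], dtype=float)

def neg_log_nash(x):
    return -np.sum(np.log(A @ x + 1e-12))   # epsilon only for numerical safety

m = A.shape[1]
res = minimize(neg_log_nash, x0=np.full(m, 1.0 / m),
               bounds=[(0.0, 1.0)] * m,
               constraints=[{"type": "eq", "fun": lambda x: x.sum() - 1.0}],
               method="SLSQP")
print("NASH distribution:", np.round(res.x, 3))   # ~[0, 0.5, 0.5] here
```

On this instance candidate 0 gets nothing: shifting its share to candidate 1 keeps voter 1's utility fixed while raising voter 2's, which the product objective always prefers.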
cs.GT 2026-05-13 2 theorems

More AI can lower productivity via skill and reliability feedback

Human-AI Productivity Paradoxes: Modeling the Interplay of Skill, Effort, and AI Assistance

When skill evolves with use or AI reliability depends on assistance, ramping up AI leads to output drops and skill divides.

Abstract:
Generative Artificial Intelligence (AI) tools are rapidly adopted in the workplace and in education, yet the empirical evidence on AI's impact remains mixed. We propose a model of human-AI interaction to better understand and analyze several mechanisms by which AI affects productivity. In our setup, human agents with varying skill levels exert utility-maximizing effort to produce certain task outcomes with AI assistance. We find that incorporating either endogeneity in skill development or in AI unreliability can induce a productivity paradox: increased levels of AI assistance may degrade productivity, leading to potentially significant shortfalls. Moreover, we examine the long-term distributional effect of AI on skill, and demonstrate that skill polarization can emerge in steady state when accounting for heterogeneity in AI literacy -- the agent's capability to identify and adapt to inaccurate AI outputs. Our results elucidate several mechanisms that may explain the emergence of human-AI productivity paradoxes and skill polarization, and identify simple measures that characterize when they arise.
cs.GT 2026-05-12 Recognition

Mean-field approximation turns network interventions into finite linear programs

Optimal Interventions on the Linear Threshold Model in Large-Scale Networks

The reformulation lets planners find low-cost threshold changes that hit a target adoption fraction using only statistical network knowledge.

Abstract:
We study an optimal intervention problem on the linear threshold model (LTM) in which a social planner aims to design minimal-cost interventions that modify the agents' thresholds, under the constraint that at least a predefined fraction of agents reaches a given state after a finite number of iterations. While this problem is known to be NP-hard and its exact solution requires full knowledge of the network structure, we focus on approximate solutions for large-scale networks and assume that the planner has only statistical knowledge of the network. In particular, we build on a local mean-field approximation of the LTM that is known to hold true on large-scale random networks, and reformulate the optimal intervention problem as a linear program with an infinite set of constraints. We then show how to approximate the solutions of the latter problem by standard linear programs with finitely many constraints. Finally, our approach is validated through numerical experiments on real-world networks and compared both with optimal seeding and state-of-the-art algorithms for the least-cost influence.
cs.GT 2026-05-12 Recognition

Strategic questions cut majority bias in AI outputs

When to Ask a Question: Understanding Communication Strategies in Generative AI Tools

A model determines how much extra information to request from users to represent diverse preferences without wasting their time.

Abstract:
Generative AI models differ from traditional machine learning tools in that they allow users to provide as much or as little information as they choose in their inputs. This flexibility often leads users to omit certain details, relying on the models to infer and fill in under-specified information based on distributional knowledge of user preferences. Such inferences may privilege majority viewpoints and disadvantage users with atypical preferences, raising concerns about fairness. Unlike more traditional recommender systems, LLMs can explicitly solicit more information from users through natural language. However, while directly eliciting user preferences could increase personalization and mitigate inequality, excessive querying places a burden on users who value efficiency. We develop a stylized model of user-LLM interaction and an objective that captures the tradeoff between user burden and preference representation. Building on the observation that individual preferences are often correlated, we analyze how AI systems should balance inference and elicitation, characterizing the optimal amount of information to solicit before content generation. Ultimately, we show that information elicitation can mitigate the systematic biases of preference inference, enabling the design of generative tools that better incorporate diverse user perspectives while maintaining efficiency. We complement this theoretical analysis with an empirical evaluation illustrating the model's predictions and exploring their practical implications.
cs.GT 2026-05-12 2 theorems

Repeated voting incurs sublinear welfare cost for fairness

The Price of Proportional Representation in Temporal Voting

The loss grows with voters and rounds but vanishes for basic justified representation over long horizons.

Abstract:
We study proportional representation in the temporal voting model, where collective decisions are made repeatedly over a fixed time horizon. Prior work has extensively investigated how proportional representation axioms from multiwinner voting (e.g., justified representation (JR) and its variants) can be adapted, satisfied, and verified in this setting. However, much less is understood about their interaction with social welfare. In this work, we quantify the efficiency cost of enforcing proportionality. We formalize the welfare-proportionality tension via the worst-case ratio between the maximum achievable utilitarian welfare and the maximum welfare attainable subject to a proportionality axiom. We show that imposing proportional representation in the temporal setting can incur a growing, yet sublinear, welfare loss as the number of voters or rounds increases. We further identify a clean separation among axioms: for JR, the welfare loss diminishes as the time horizon grows and vanishes asymptotically, whereas for stronger axioms this conflict persists even with many rounds. Moreover, we prove that welfare maximization under each axiom is NP-complete and APX-hard, even under static preferences and bounded-degree approvals, and provide fixed-parameter algorithms under several natural structural parameters.
cs.GT 2026-05-12 Recognition

10 CFR steps create abstractions beating equity and rank methods

Effective, Efficient, and General Information Abstraction for Imperfect-Information Extensive-Form Games

WEVA extracts expected values from brief full-game runs then clusters them, cutting exploitability over 80% with no pre-training needed.

Abstract:
Information abstraction reduces the computational cost of solving imperfect-information games by clustering information sets into a smaller number of \emph{buckets}. Existing methods either rely on domain-specific features such as rank or equity, which are inapplicable to games with non-standard payoff structures, or require expensive offline neural-network training on billions of samples. We propose \textbf{Warm-up Expected Value-based Abstraction (WEVA)}, a simple yet effective alternative: run a small number of Counterfactual Regret Minimization (CFR) iterations on the full game as a \emph{warm-up} phase, extract per-hand expected value features at every decision node, form a depth-weighted multi-node feature vector, and apply $k$-means++ clustering to obtain the abstraction mapping. WEVA requires no domain knowledge, no pre-training, and incurs only a small overhead on top of the abstract-game solve. Experiments on three structurally diverse games, with different bucket numbers and CFR variants, show that WEVA consistently outperforms equity-based and rank-based abstractions, reducing exploitability by up to over $80\%$. Surprisingly, as few as $W{=}10$ warm-up iterations already produce abstractions that outperform existing information abstraction methods in most settings. These results establish WEVA as an \emph{effective, efficient, and general} approach to information abstraction in imperfect-information extensive-form games.
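
A sketch of WEVA's clustering stage under stated assumptions: the per-hand expected values would come from the W warm-up CFR iterations; here they are random placeholders, and the depth weighting is illustrative rather than the paper's exact scheme.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
n_hands, n_nodes, n_buckets = 200, 4, 8

ev = rng.normal(size=(n_hands, n_nodes))   # placeholder for warm-up CFR EVs
depth = np.array([1, 2, 2, 3])             # assumed depth of each decision node
features = ev / depth                      # depth-weighted multi-node features

km = KMeans(n_clusters=n_buckets, n_init=10, random_state=0)  # k-means++ init
mapping = km.fit_predict(features)         # hand -> bucket abstraction map
print("hand -> bucket:", mapping[:10])
```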
cs.GT 2026-05-12 Recognition

Approximating Fisher equilibria better than a 1/11 factor is PPAD-hard

Constant Inapproximability for Fisher Markets

PPAD-completeness rules out a polynomial-time approximation scheme for factors better than 1/11 in SPLC markets, unless PPAD = P.

Abstract:
We study the problem of computing approximate market equilibria in Fisher markets with separable piecewise-linear concave (SPLC) utility functions. In this setting, the problem was only known to be PPAD-complete for inverse-polynomial approximations. We strengthen this result by showing PPAD-hardness for constant approximations. This means that the problem does not admit a polynomial time approximation scheme (PTAS) unless PPAD$=$P. In fact, we prove that computing any approximation better than $1/11$ is PPAD-complete. As a direct byproduct of our main result, we get the same inapproximability bound for Arrow-Debreu exchange markets with SPLC utility functions.
cs.GT 2026-05-12 2 theorems

Algorithm gets sqrt(T) regret for online allocation under mixed constraints

Online Resource Allocation With General Constraints

It competes with changing benchmarks in both predictable and worst-case arrivals while obeying budgets exactly.

Abstract:
Online resource allocation (ORA) is a fundamental framework for sequential decision-making problems under budget constraints, with applications ranging from online advertising to revenue management. In this work, we study a broader setting that includes both budget constraints and general constraints, extending the classical budget-only model. This extension is essential for modeling critical economic requirements, such as Return-on-Investment (ROI) constraints. We develop an algorithm that achieves best-of-both-worlds guarantees within this generalized framework. In particular, against a dynamic benchmark, our algorithm achieves $\widetilde{\mathcal O}(\sqrt{T})$ regret in the \emph{stochastic} regime and $\alpha$-regret of order $\widetilde{\mathcal O}(\sqrt{T})$ in the \emph{adversarial} regime, where $\alpha$ depends on the feasibility margin of the corresponding offline problem. At the same time, our algorithm guarantees strict satisfaction of the budget constraints and $\widetilde{\mathcal O}(\sqrt{T})$ cumulative violation for the general ones. From a technical perspective, introducing general constraints alongside budgets precludes the use of standard budget-focused methods. While budget methods rely on a zero-consumption ``safe'' action to ensure feasibility, general constraints are much less ``aligned'' towards feasibility. We overcome these difficulties with a new analysis that exploits \emph{weak adaptivity} to get boundedness of the Lagrange multipliers and best-of-both-worlds guarantees.
cs.GT 2026-05-12 Recognition

Algorithm regret scales with corruption in bilateral trade

Regret Minimization in Bilateral Trade With Perturbed Markets

In perturbed markets the bound is T to the 3/4 plus C log T against the best distribution while matching the adversarial worst-case bound.

Abstract:
We address the problem of maximizing Gain from Trade (GFT) in repeated buyer-seller exchanges subject to global budget balance constraints. While this problem is well-understood in purely adversarial and stochastic settings, these environments exhibit a sharp dichotomy: adversarial environments allow for no-regret learning against the best fixed-price mechanism, whereas stochastic environments allow for no-regret learning against the best distribution over prices that is budget balanced in expectation. This gap is significant, as policies balanced in expectation can increase the GFT by a multiplicative factor of two. In this work, we bridge these extremes by studying perturbed markets, where an underlying stochastic distribution is subject to an adversarial corruption $C$. We design an algorithm that adaptively scales with the level of corruption, achieving an $\tilde{\mathcal{O}}(T^{3/4}) + \mathcal{O}(C\log(T))$ regret bound against the best budget-balanced distribution over prices. Simultaneously, our algorithm maintains the worst-case $\tilde{\mathcal{O}}(T^{3/4})$ regret bound relative to a per-round budget-balanced baseline, ensuring optimality even in fully adversarial environments.
cs.GT 2026-05-12 2 theorems

(k+1)/(k+2)-EFkX allocations exist for any number of agents when k>2

Approximate Envy-Free Allocations up to any k Goods

A generalized algorithm computes them in polynomial time and gives 3/4-EF2X for any number of agents.

Abstract:
We study the problem of finding approximate envy-free allocations up to any $k$ goods ($\alpha$-EFkX), when agents have additive values over goods in a bundle. As our main result, we show that for any $k>2$, $\frac{k+1}{k+2}$-EFkX allocations exist for any number of agents, and can be computed in polynomial time, via an appropriate generalization of the 3PA algorithm of [Amanatidis et al., 2024]. An immediate corollary of this result is that $3/4$-EF2X allocations exist for any number of agents; in contrast, $2/3$-EFX allocations are only known to exist for up to 7 agents. We improve this latter result by devising an algorithm that achieves $2/3$-EFX for 8 agents. We also consider EFkX graph orientations; we prove that such orientations do not always exist, and that deciding their existence is NP-complete, thereby generalizing the corresponding result of [Christodoulou et al., 2023] for $k=1$.
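
Under additive valuations the alpha-EFkX condition is easy to audit, since "up to any k goods" binds at the k goods the envious agent values least; a checker sketch (one common convention, the paper may restrict attention to positively valued goods):

```python
def is_alpha_efkx(bundles, values, alpha, k):
    """bundles: list of lists of good ids; values[i][g]: agent i's value of g."""
    for i, Ai in enumerate(bundles):
        own = sum(values[i][g] for g in Ai)
        for j, Aj in enumerate(bundles):
            if i == j or not Aj:
                continue
            vals = sorted(values[i][g] for g in Aj)
            kept = sum(vals) - sum(vals[:min(k, len(vals))])  # drop cheapest k
            if own < alpha * kept - 1e-12:
                return False
    return True

values = [[6, 1, 1, 1], [1, 6, 1, 1]]
bundles = [[0], [1, 2, 3]]
print(is_alpha_efkx(bundles, values, alpha=0.75, k=2))   # True
```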
cs.GT 2026-05-12 Recognition

Vote-Left triples Faithful win rate in The Traitors

The Vote-Left Equilibrium: A Deterministic Coordination Strategy for the Faithful in The Traitors

Cyclic voting with punishment creates equilibrium in all televised states against Traitor collusion.

Abstract:
The Traitors is a social deduction game in which an informed minority of Traitors face an uninformed majority of Faithful, and the recurring question facing the Faithful is how to vote. Random voting is known to be optimal for the uninformed majority under simultaneous-signal protocols [Braverman, Etesami and Mossel, 2008], but when votes are cast individually, random votes are indistinguishable from strategic ones and the Faithful remain exposed to coordinated Traitor collusion. We introduce the Vote-Left protocol, a deterministic rule under which every player votes for the next surviving player in a fixed cyclic ordering. Under full compliance every surviving player receives exactly one vote, so the banishment distribution coincides with random voting; since prescribed votes are deterministic functions of public information, any deviation is immediately identifiable. Combined with a simple punishment rule, Vote-Left constitutes a Perfect Bayesian Equilibrium for every state with $n_t > 2m_t + 2$, a region that contains every televised configuration. We characterise the Traitors' best response in the late-game phase ($n_t \leq 2m_t + 2$): deviate via collusion once the Faithful no longer have enough votes to guarantee punishment. Across the configurations played on television, Vote-Left raises the Faithful's winning probability by a factor of approximately three over random voting under collusion.
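
The protocol itself is tiny; a simulation sketch of one compliant round (player count and banished set made up) shows why it works: each survivor receives exactly one vote, so the banishment lottery matches random voting, while any off-script vote is publicly attributable.

```python
def prescribed_vote(voter, alive, n):
    """Next surviving player clockwise from `voter` in the fixed cycle 0..n-1."""
    j = (voter + 1) % n
    while j not in alive:
        j = (j + 1) % n
    return j

n = 8
alive = set(range(n)) - {2, 5}       # players 2 and 5 already banished
votes = {v: prescribed_vote(v, alive, n) for v in alive}

tally = {q: 0 for q in alive}
for target in votes.values():
    tally[target] += 1
print(tally)                         # every survivor gets exactly one vote
```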
cs.GT 2026-05-12 2 theorems

Model-free RL learns near-SNE policies in Karma economies

Towards Model-Free Learning in Dynamic Population Games: An Application to Karma Economies

DQN error bounds improve with replay buffer size and population size, enabling decentralized fair allocation.

Abstract:
Dynamic Population Games (DPGs) provide a tractable framework for modeling strategic interactions in large populations of self-interested agents, and have been successfully applied to the design of Karma economies, a class of fair non-monetary resource allocation mechanisms. Despite their appealing theoretical properties, existing computational tools for DPGs assume full knowledge of the game model and operate in a centralized fashion, limiting their applicability in realistic settings where agents have access only to their own private experience. This paper takes a step towards addressing this gap by studying model-free equilibrium learning in Karma DPGs. First, we analyze the setting in which a novel agent joins a Karma DPG already at its Stationary Nash Equilibrium (SNE) and learns a policy via Deep Q-Networks (DQN) without knowledge of the game model. Leveraging recent convergence results for DQN, we establish a suboptimality bound consisting of a DQN approximation error of order $O(1/\sqrt{N_s})$ and a mean field perturbation error of order $O(1/N)$, where $N_s$ is the replay buffer size and $N$ is the population size. Second, we consider the challenging problem of learning the SNE from scratch. We show empirically that combining deep RL with fictitious play and smoothed policy iteration allows agents to converge, in a model-free fashion, to a configuration close to the centrally computed SNE. Together, these contributions support the vision of Karma economies as practical tools for fair resource allocation.
cs.GT 2026-05-12 Recognition

Resource allocation equilibria are identity or alternating identity-and-flat functions

A Resource Allocation Game and its Equilibrium Strategies

Two-player cases reduce to three explicit profile types; large games are solved by a finite construction algorithm that may enter a chattering regime.

Abstract:
In this paper we propose a Bayesian game to allocate resources. In this game, there are $c$ units of resources to be allocated to $n$ players. Agent $i$ has a demand of $V_i$ units of resources and takes action $X_i$ according to a strategy function $s_i$, i.e., $X_i=s_i(V_i)$. Payoffs are set up so that player $i$ is contented with no more than $V_i$ units of resources. We assume that resources are granted to the players on a smallest-request-first and all-or-nothing basis. For this game with two players, we analyze the equilibrium strategy functions mathematically within the family of alternating identity-and-flat (AIF) functions. We show that Nash equilibrium profiles consist of two identity functions, two AIF functions with a common switch point, or two AIF functions with one and three switch points, respectively. For an $n$-player game with a large $n$ and a large $c_n$ of order $O(n)$, we present a mean-field first-order approximation and a second-order Gaussian approximation for its equilibrium strategy function. The first-order analysis obtains an equilibrium AIF function with one switch point. In the Gaussian analysis of large games, we propose a construction algorithm that begins by searching within the family of AIF functions. If a gradient conflict condition occurs, the game enters a chattering regime, in which players play a continuous, strictly increasing strategy function that is neither an identity nor a flat function. Conceptually, one can view the chattering regime as if players alternate between a slope-one strategy and a flat strategy infinitely fast in order to sustain a high payoff. We prove that the construction algorithm always obtains a Nash equilibrium and terminates in a finite number of steps. We present several numerical examples for the two player game as well as the Gaussian model.
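
One reading of the AIF family (a sketch under our own interpretation, with made-up switch points): segments alternate between the identity s(v) = v and flat pieces that hold the value reached at the last switch point.

```python
def aif(v, switch_points):
    """Evaluate an alternating identity-and-flat function starting on identity."""
    on_identity, level = True, 0.0
    for a in sorted(switch_points):
        if v <= a:
            return v if on_identity else level
        if on_identity:
            level = a           # a flat piece holds the last identity value
        on_identity = not on_identity
    return v if on_identity else level

# one switch point: request truthfully up to 3, then cap the request at 3
print([aif(v, [3.0]) for v in (1.0, 2.5, 4.0, 6.0)])   # [1.0, 2.5, 3.0, 3.0]
```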
cs.GT 2026-05-12 2 theorems

Risk-sensitive preference games preserve monotonicity via translation-invariant risk measures

Structure from Strategic Interaction & Uncertainty: Risk Sensitive Games for Robust Preference Learning

Translation-invariant risk measures keep zero-sum structure intact so self-play converges and handles tails in preference data.

Abstract:
A growing line of work reframes preference-based fine-tuning of large language models game-theoretically: Nash Learning from Human Feedback (NLHF) recasts the problem as a zero-sum game over policies. However, optimization is over expected pairwise payoffs, thereby conflating policies with similar win rates but different tail behavior. As such, these methods are agnostic to where in the data distribution they succeed or fail: strong average performance can mask systematic failure across prompts, annotators, or safety-critical strata. We introduce risk-sensitive preference games, in which players optimize convex risk measures of their preference loss, exploiting structure in preference uncertainty. While risk-sensitivity generally breaks the zero-sum structure, we show that translation invariance of many risk metrics ensures that we retain monotonicity, yielding fast convergence of sample-efficient self-play methods. Furthermore, we establish algorithmic stability and offline sample complexity bounds that scale with risk, requiring simultaneous control of structural bias from nonlinear risk transformations, statistical bias in risk estimation, and concentration tailored to the risk-sensitive setting. To address statistical bias, we introduce a hierarchical game formulation and a two-timescale extragradient algorithm with bias correction that converges to the Stackelberg equilibrium and is especially effective in low-sample regimes. Empirically, risk-adjusted policies are robust across data strata, stable across risk choices, and match or exceed risk-neutral performance thereby achieving robustness without a performance tax.
cs.GT 2026-05-12 2 theorems

Maximal EF1 always exists for two monotone agents on any graph

Fair Allocation under Conflict Constraints

Existence fails for three agents even with identical monotone valuations and is NP-hard to decide; EF[1,1] holds on paths.

Abstract:
We study the fair allocation of indivisible items subject to conflict constraints. In this framework, the items are represented as the vertices of a graph, with edges corresponding to conflicts between pairs of items. Each agent is assigned an independent set of items from the graph. Our goal is to achieve a fair and efficient allocation of these items. Fairness pertains to satisfying envy-freeness up to one item (EF1), while efficiency is defined by maximality, meaning that no unallocated item can be feasibly assigned to any agent. First, we explore the case of two agents. For monotone valuations, we show that a maximal EF1 allocation always exists on any graph. Our existence proof relies on a color-switching technique, which locally modifies a maximal allocation while preserving feasibility and restoring EF1. We further show that such allocations can be computed in pseudopolynomial time in general, and in polynomial time for additive valuations on arbitrary graphs, as well as for monotone valuations on interval and bipartite graphs. By contrast, once monotonicity is dropped, maximal EF1 allocations need not exist even for identical additive valuations, and deciding existence becomes NP-hard. Next, we consider the case with a general number of agents. Again, we arrive at a negative result: An EF1 and maximal allocation fails to exist even for three agents under identical monotone valuations, and determining the existence of such an allocation is NP-hard. On the positive side, we show that under identical non-monotone additive valuations on a path graph, an EF[1,1] and maximal allocation always exists. This result involves a novel application of the "cycle plus triangles" theorem.
cs.GT 2026-05-11 2 theorems

Prediction markets adapt liquidity using online learning

Adaptive Liquidity in Prediction Markets via Online Learning

Mixing cost-function markets with learnable weights preserves no-arbitrage and bounded loss while responding to trading conditions.

Abstract:
Prediction markets rely on liquidity to convert trades into informative prices, yet existing mechanisms fix liquidity ex ante. This restriction enforces a static trade-off between price responsiveness and worst-case loss despite inherently nonstationary trading conditions. We propose a fundamentally different approach that treats liquidity selection itself as an online learning problem. Our mechanism mixes a family of cost-function markets via learnable weights, yielding a single adaptive market that preserves no-arbitrage, bounded worst-case loss, expressiveness, and positive upside. We introduce a hybrid structural risk signal, a per-round objective that quantifies the trade-off between price impact and inventory risk, and show that standard online learning algorithms achieve switching-regret guarantees relative to the best sequence of liquidity regimes in hindsight. Simulations demonstrate that the mechanism adaptively shifts liquidity across regimes in response to both order flow and inventory dynamics. Our results establish a principled framework for adaptive liquidity, connecting prediction market design with online learning.
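
A sketch of the mechanism's skeleton under stated assumptions: a convex combination of LMSR cost functions C_b(q) = b * logsumexp(q / b) remains a valid cost function, and the weights over liquidity regimes are adapted by exponential updates; the paper's hybrid structural risk signal is replaced by a placeholder loss.

```python
import numpy as np

def lmsr_prices(q, b):
    z = np.exp((q - q.max()) / b)          # stabilized softmax price vector
    return z / z.sum()

class AdaptiveLiquidityMarket:
    def __init__(self, b_grid, n_outcomes, eta=0.5):
        self.b = np.asarray(b_grid, dtype=float)
        self.w = np.ones(len(b_grid)) / len(b_grid)  # weights over regimes
        self.q = np.zeros(n_outcomes)                # outstanding shares
        self.eta = eta

    def prices(self):
        # gradient of the mixture cost = weighted average of regime prices
        return sum(wi * lmsr_prices(self.q, bi) for wi, bi in zip(self.w, self.b))

    def update(self, delta, losses):
        """Record a trade, then take an exponential-weights step on regime losses."""
        self.q += delta
        self.w *= np.exp(-self.eta * np.asarray(losses))
        self.w /= self.w.sum()

m = AdaptiveLiquidityMarket(b_grid=[1.0, 10.0], n_outcomes=2)
m.update(np.array([3.0, 0.0]), losses=[0.8, 0.1])   # placeholder risk losses
print(m.prices())
```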
cs.GT 2026-05-11 Recognition

Greedy cuts queries for ensemble selection with 1-1/e guarantee

Efficient Ensemble Selection from Binary and Pairwise Feedback

Binary feedback yields instance-dependent savings; pairwise uses monotone relaxation and auditing to recover theta-winning guarantees.

Abstract:
Organizations increasingly deploy multiple AI systems across task domains, but selecting a small, high-performing ensemble can require costly model calls, benchmark runs, and human evaluation. We study this selection problem as a distributional variant of multiwinner voting: tasks are drawn from an unknown domain distribution, each task induces feedback over candidate experts, and a committee's value on a task is determined by its best-performing member. We analyze both binary feedback, for tasks with correct/incorrect outcomes, and pairwise feedback, for tasks where candidate outputs are compared by preference. In the binary setting, the induced objective is coverage. We give exhaustive-elicitation baselines and matching worst-case query lower bounds, and we design a failure-conditioned greedy algorithm that preserves the standard $(1-1/e)$ guarantee while obtaining instance-dependent query savings. In the pairwise setting, we study $\theta$-winning committees. We show that full-information optimization admits a PTAS but no EPTAS under Gap-ETH, and that the objective is monotone but not submodular. This motivates a weighted ordinal coverage relaxation, which is submodular and supports a failure-conditioned greedy oracle under pairwise feedback. We then convert this oracle back into $\theta$-type guarantees through finite-family auditing or a minimax wrapper. We also provide small-scale LLM experiments illustrating the predicted query savings and the role of complementarity in committee selection.
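
The binary-feedback objective is plain max-coverage, so the (1 - 1/e) baseline is the textbook greedy below; the paper's failure-conditioned variant additionally economizes queries by re-evaluating only still-uncovered tasks (a sketch, not the authors' code).

```python
import numpy as np

def greedy_committee(correct, size):
    """correct: experts x tasks boolean matrix; greedily maximize coverage."""
    covered = np.zeros(correct.shape[1], dtype=bool)
    chosen = []
    for _ in range(size):
        gains = (correct & ~covered).sum(axis=1)     # marginal coverage
        best = int(np.argmax(gains))
        if gains[best] == 0:
            break
        chosen.append(best)
        covered |= correct[best]
    return chosen, covered.mean()

rng = np.random.default_rng(1)
C = rng.random((6, 40)) < 0.35        # which candidate expert solves which task
committee, coverage = greedy_committee(C, size=3)
print(committee, f"coverage={coverage:.2f}")   # within (1 - 1/e) of optimal
```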
cs.GT 2026-05-11 2 theorems

Polynomial-time method finds exact SPPEs for a constant number of goods

Pacing Equilibria in Second-Price Auctions with Few Goods

Pacing multipliers are mapped to bid orderings that stay fixed inside polynomially many geometric cells, each solved by a linear program.

Abstract:
In this paper, we investigate the computation of second-price pacing equilibria (SPPEs), a foundational model in online advertising auctions. We present a polynomial-time algorithm for computing exact SPPEs in instances with a constant number of goods. Our core technique maps buyers' pacing multipliers to the highest bids on each good, effectively partitioning the parameter space into a set of distinct geometric cells. By enumerating these cells, we fix the relative ordering of the bids and reduce the problem of equilibrium computation to a linear feasibility program. Finally, we demonstrate that this tractability extends to large-scale markets with an arbitrary number of goods, provided the goods can be aggregated into a constant number of valuation types.
cs.GT 2026-05-11 2 theorems

Allocations beat 1/4 MMS bound for large agent counts

On MMS, APS and XOS

A modified algorithm shows more than one quarter of the maximin share is achievable once the number of agents exceeds some threshold.

Abstract:
We consider allocations of a set of $m$ indivisible goods to $n$ agents of equal entitlements that have valuations from the class XOS. A previous sequence of works showed allocations that obtain an $\alpha$-approximation for the maximin share (MMS), for values of $\alpha$ that gradually approach $\frac{1}{4}$ from below (the currently known ratio is $\frac{4}{17}$). In this work we attempt to obtain ratios better than $\frac{1}{4}$, and manage to do so for sufficiently large $n$. Our methodology is to first investigate the gap between the anyprice share (APS) and the MMS when all agents have the same XOS valuations, for which we design an allocation algorithm and prove that each agent receives at least $\alpha > \frac{11}{40}$ times the APS. Then, we derive inspiration from this algorithm, and modify it so that it applies also when agents have different XOS valuations. Using this modified version, we show that for some sufficiently large $n_0$, there is an $\alpha$-MMS allocation (in fact, an $\alpha$-APS allocation) for every $n \geq n_0$.
cs.GT 2026-05-11 Recognition

Prosocial agents recover welfare losses no mechanism can fix

Mechanism Design Is Not Enough: Prosocial Agents for Cooperative AI

When contracts miss future contingencies, a gap remains unless agents value others' welfare too.

Abstract:
Ensuring that AI agents behave safely and beneficially when interacting with other parties has emerged as one of the central challenges of modern AI safety. While mechanism design, as the theory of designing rules to align individual and collective objectives, can incentivize cooperative behavior, it is still an open question whether it alone is sufficient to maximize LLM agents' social welfare. This work proves that the answer is negative: drawing from incomplete contract theory, we formally show that when contracts cannot distinguish all relevant future contingencies, there is a strictly positive welfare loss that no realistic mechanism can eliminate. We show that prosocial agents, who weigh others' welfare alongside their own, can close this gap and achieve outcomes that are socially superior and individually beneficial. Experimentally, we show that in multi-agent resource-allocation environments and canonical social dilemmas where agents are powered by large language models, prosociality is beneficial. The implication for AI safety is clear: to enable cooperative interactions at scale, designing adequate mechanisms is not sufficient; agents must be built to be intrinsically prosocial.
cs.GT 2026-05-11 2 theorems

Nash equilibria defined from ordinal rankings alone

Nash without Numbers: A Social Choice Approach to Mixed Equilibria in Context-Ordinal Games

Social choice aggregation turns conditional action rankings into mixed-strategy best responses that exist under mild conditions.

Abstract:
Nash equilibrium serves as a fundamental mathematical tool in economics and game theory. However, it classically assumes knowledge of player utilities, whereas economics generally regards preferences as more fundamental. To leverage equilibrium analysis in strategic scenarios, one must first elicit numerical utilities consistent with player preferences, a delicate and time-consuming process. In this work, we forgo precise utilities and generalize the Nash equilibrium to a setting where we only assume a player is capable of providing an ordinal ranking of their actions within the context of other players' joint actions. The key technical challenge is to rethink the definition of a best-response. While the classical definition identifies actions maximizing expected payoff, we naturally look towards social choice theory for how to aggregate preferences to identify the most preferred actions. We define this generalized notion of a context-ordinal Nash equilibrium, establish its existence under mild conditions on aggregation methods, introduce notions of regularization, approximation, and regret, explore complexity for simple settings, and develop learning rules for computing such equilibria. In doing so, we provide a generalization of Nash equilibrium and demonstrate its direct applicability to elicited preferences in human experiments.
cs.GT 2026-05-11 Recognition

Zero-determinant strategies match SSE defense performance

Zero-determinant Strategy for Moving Target Defense: Existence, Performance, and Computation

Repeated-game tactics achieve the strong Stackelberg upper bound for moving target defense while cutting computational cost in problems with multiple targets.

Abstract:
Moving Target Defense (MTD) is commonly formulated as a repeated security game to mitigate persistent threats. Although the strong Stackelberg equilibrium (SSE) characterizes the defender's optimal strategy in the leader-follower framework, computing the SSE often incurs high computational complexity, which significantly limits its practical deployment in MTD problems with multiple targets. This paper proposes adopting a zero-determinant (ZD) strategy for constructing an MTD strategy that achieves both high defensive performance and substantially low computational complexity. We first derive a necessary and sufficient condition for the existence of ZD strategies and investigate the performance of ZD strategies, which shows their upper-bound performance matches that of the SSE strategy. We then formulate two programs to find the optimal ZD strategy parameters under different conditions. Moreover, we design an algorithm to compute the proposed ZD strategies, along with the computational complexity analysis in comparison with the traditional SSE computation. Finally, we conduct experiments on two practical applications to verify our results.
cs.GT 2026-05-11 2 theorems

Strategic responses make uniform privacy budgets suboptimal

Differentially Private Auditing Under Strategic Response

Naive uniform or proportional allocations leave more welfare-weighted undetected harm when developers reallocate mitigation across harm dimensions.

Abstract:
Regulatory audits of AI systems increasingly rely on differential privacy (DP) to protect training data and model internals. We study audit design when the audited developer can strategically respond to the privacy-constrained audit interface. We formalize privacy-constrained auditing as a bilevel Stackelberg game, in which an auditor commits to a query policy and DP budget allocation across harm dimensions, and a strategic developer reallocates mitigation efforts in response. We introduce the welfare-weighted under-detection gap $B_w$, the welfare-weighted true residual harm the audit fails to detect at the developer's strategic best response, and prove that naive DP auditing (uniform or harm-proportional allocation) induces a strictly larger $B_w$ than any non-strategic mitigation baseline whenever effective detectability is heterogeneous, the welfare weights are not comonotone with detectability, and the developer's optimum is interior. We characterize the optimal auditor allocation as a four-factor balance of welfare weight, audit miss-probability, detectability elasticity, and mitigation-cost curvature, and provide a single-level reformulation of the bilevel problem via the developer's KKT system. We propose Strategic Private Audit Design (SPAD), a projected-gradient algorithm with hypergradients computed through the developer's best response.
cs.GT 2026-05-11 2 theorems

Non-affine approvals force miscalibration in all proper scoring rules

The Endogeneity of Miscalibration: Impossibility and Escape in Scored Reporting

A step-function threshold restores truthful reporting and first-best screening in oversight problems.

Abstract:
Eliciting truthful reports from autonomous agents is a core problem in scalable AI oversight: a principal scores the agent's report using a strictly proper scoring rule, but the agent also benefits from the report through a non-accuracy channel (approval for autonomous action, allocation share, downstream control). The same structure appears in classical mechanism-design settings such as marketplace operation. Our main result is an endogeneity: the principal's optimal oversight necessarily uses a non-affine approval function to screen types, yet any non-affine approval makes truthful reporting suboptimal under the combined objective whenever deviation is undetectable. The principal cannot avoid the perturbation that undermines calibration. This impossibility holds for all strictly proper scoring rules, with a closed-form perturbation formula. A constructive escape exists: a step-function approval threshold achieves first-best screening for every strictly proper scoring rule, because the agent's binary inflate-or-not choice creates a type-space threshold regardless of the generator's curvature. Under the Brier score specifically, the type-independent inflation cost yields a welfare equivalence between second-best and first-best; we prove this equivalence is unique to Brier (the welfare gap under smooth $C^1$ oversight is bounded below by $\Omega(\text{Var}(1/G'') (\gamma/\beta)^2)$ for every non-Brier rule). Two instances develop the framework: AI agent oversight (the lead motivating setting) and marketplace operation (a parallel mechanism-design domain). The message for AI alignment is direct: smooth scoring-based oversight cannot elicit truthful reports from a strategic agent; sharp thresholds are the calibration-preserving design.
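
The core tension admits a short numeric demonstration (toy numbers, not the paper's model): take the expected Brier score plus a non-affine approval term gamma * r^2; the combined objective peaks away from the true belief, at r* = p / (1 - gamma) by the first-order condition.

```python
import numpy as np

p, gamma = 0.7, 0.2
r = np.linspace(0.0, 1.0, 100_001)

exp_brier = -(p * (1 - r) ** 2 + (1 - p) * r ** 2)  # alone, maximized at r = p
combined = exp_brier + gamma * r ** 2               # add the approval channel

print("truthful report:", p)
print("optimal report :", r[np.argmax(combined)])   # ~0.875 = p / (1 - gamma)
```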
cs.GT 2026-05-11 Recognition

Quotient semivalues block false-name attacks on data attribution

Quotient Semivalues for False-Name-Resistant Data Attribution

Identity-level Shapley fairness and manipulation resistance cannot coexist, but clustering into evidence groups cuts duplicate-attack gains.

Abstract:
Data valuation methods allocate payments and audit training data's contribution to machine-learning pipelines; however, they often assume passive contributors. In reality, contributors can split datasets across pseudonymous identities, duplicate high-value examples, create near-duplicates, or launder synthetic variants to inflate their share. We formalize this as false-name manipulation in ML data attribution. Our main construction is the quotient semivalue mechanism: compute Shapley-, Banzhaf-, or Beta-style values over evidence-backed attribution clusters instead of raw identities, using a canonical-representative operator to absorb within-cluster duplication. We prove an impossibility: on a fixed monotone data-value game, exact Shapley-fair attribution over reported identities is incompatible with unrestricted false-name-proofness, even on binary-valued instances, and characterize the split-gain of a general semivalue on a unanimity counter-example. The mechanism is exactly false-name-proof under two structural conditions: false-name-neutral within-cluster allocation and quotient-stable manipulations. Under imperfect provenance, when these conditions hold approximately, manipulation gain and fairness loss are bounded by three measurable quantities: escaped-cluster mass, value-estimation error, and clustering distance. We instantiate the mechanisms in DataMarket-Gym, a benchmark for attribution under strategic provider attacks. On synthetic classification tasks, quotient semivalues with example-level evidence reduce manipulation gain on duplicate and near-duplicate Sybil attacks from $1.74$ under baseline Shapley to $0.96$, near the honest level. The cosine-threshold and (false-merge, false-split) rate sweeps trace the corresponding fairness--Sybil frontier.
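
The construction is easy to state in code; a sketch with an exact (enumerative) Shapley computation over clusters and an equal within-cluster split, which is just one simple choice of internal allocation:

```python
from itertools import permutations

def shapley(players, value):
    """Exact Shapley values by enumerating orderings (fine for few clusters)."""
    phi = {q: 0.0 for q in players}
    perms = list(permutations(players))
    for order in perms:
        seen = set()
        for q in order:
            phi[q] += value(seen | {q}) - value(seen)
            seen.add(q)
    return {q: s / len(perms) for q, s in phi.items()}

clusters = {"c1": ["alice", "alice_dup"], "c2": ["bob"]}  # evidence-backed groups
def v(S):                                 # toy data-value game over clusters
    return 1.0 if "c1" in S else 0.0

phi = shapley(list(clusters), v)
payout = {m: phi[c] / len(ms) for c, ms in clusters.items() for m in ms}
print(phi, payout)   # duplicating identities inside c1 leaves its total fixed
```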
cs.GT 2026-05-11 Recognition

Cost reports plus assignments stop wasted LLM data subsidies

Incentivizing User Data Contributions for LLM Improvement under Withdrawal Rights

Platforms pay for user contributions only when the total data will actually improve the model.

Abstract:
The continued improvement of large language models (LLMs) increasingly depends on eliciting high-quality, user-generated data, yet such data are costly to provide and often withheld due to privacy and effort concerns. This creates a fundamental design challenge: how to incentivize data contribution when model improvements require coordinated, threshold-level inputs, while contributions remain privately costly and partially reversible. We develop and theoretically analyze incentive mechanisms for user data contribution that explicitly account for threshold effects and reversibility, focusing on how subsidies and withdrawal rights can be jointly designed to overcome coordination failure. As a natural benchmark, we first consider subsidy-based incentives, under which users respond to posted payments with privately optimal floor contributions. These decentralized responses may fall below the improvement threshold, resulting in subsidy expenditure without model improvements. We then analyze mechanisms with withdrawal rights, in which users report costs, the provider centrally assigns contribution burdens, and users may withdraw before training. We prove that combining cost reporting with personalized assignment can eliminate inefficient provision by ensuring that data are collected only when improvement is sustainable, converting infeasible instances into a null outcome rather than subsidy leakage. Finally, we compare two withdrawal protocols. The simultaneous protocol can achieve lower total cost, while the small-first sequential protocol better incentivizes participation, encouraging greater data provision and thereby increasing the probability of crossing the improvement threshold.
cs.GT 2026-05-11 2 theorems

Fee sharing equilibria beat random allocation on DAG throughput

Game-Theoretic Analysis of Transaction Selection in DAG-Based Distributed Ledgers

Simulations find collaborative sharing produces higher transaction rates and rewards than random fees, especially when fees vary widely.

Abstract:
Transaction selection in parallel or DAG-based distributed ledger technologies (DLTs) is a crucial challenge that directly impacts throughput, fairness, and validator incentives. In these systems, validators independently choose transactions to include in their blocks, often relying on naive heuristics like uniform or proportional selection. This can lead to inefficient outcomes when validators prioritize their own rewards without considering collective impacts. We analyze two fee allocation mechanisms used in practice: Random Fee Allocation (RFA), where transaction fees are randomly assigned to one validator, and Collaborative Fee Sharing (CFS), where fees are distributed equally among all validators. Using a single-shot game-theoretic framework, we derive symmetric Nash equilibria (NE) for selecting transactions for both mechanisms and propose an optimization-based method to compute these equilibria. Numerical simulations demonstrate that the NE of CFS consistently achieves higher throughput and rewards compared to the NE of RFA, particularly under skewed fee distributions. Additionally, we compare these equilibrium strategies to naive benchmarks (uniform and proportional selection), showing that the proportional strategy outperforms the NE of RFA in many situations. These findings may provide actionable insights into the design of transaction selection and incentive mechanisms, enabling more robust and high-performance DAG-based DLTs.
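
A stylized Monte Carlo in the spirit of the comparison (our own toy setup, not the paper's game): when validators chase fees proportionally they crowd onto the same high-fee transactions, which lowers the number of distinct transactions included.

```python
import numpy as np

rng = np.random.default_rng(0)
fees = rng.pareto(1.5, size=100) + 1.0      # skewed fee distribution
n_validators, cap, trials = 5, 20, 2000

def mean_throughput(strategy):
    out = []
    for _ in range(trials):
        p = fees / fees.sum() if strategy == "proportional" else None
        picks = [rng.choice(len(fees), size=cap, replace=False, p=p)
                 for _ in range(n_validators)]
        out.append(len(set(np.concatenate(picks))))   # distinct txs included
    return np.mean(out)

for s in ("uniform", "proportional"):
    print(s, mean_throughput(s))
```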
cs.GT 2026-05-08 Recognition

Valuation design cuts Tullock optimization to two variables

Incentive Design in Competitive Resource Allocation: Exploiting Valuation Asymmetry in Tullock Contests

A coordinator sets reported values for any number of subordinates and keeps the design problem fixed in size while competing against a single opponent.

Abstract:
In competitive resource allocation, a central coordinator may seek to gain an advantage not by directly controlling subordinate agents, but by strategically manipulating the information they receive. We study this problem within the framework of multi-player Tullock contests, where the coordinator influences subordinate players by designing their reported valuations of the contested prize, a mechanism that preserves the Tullock structure of the subordinates' objectives and thereby enables tractable equilibrium analysis. We first characterize the Nash equilibrium of the general multi-player Tullock contest, establishing how valuations and per-unit costs jointly determine equilibrium bids and payoffs. We then derive the optimal reported valuations for a coordinator managing two subordinates against a single opponent, and show that the structure of the optimal solution extends to contests with an arbitrary number of subordinates, reducing the coordinator's optimization to a two-variable problem regardless of system size.
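
The equilibrium characterization has a convenient numerical companion: under the standard lottery contest success function, player i's best response to opponents' total effort s is x_i = max(0, sqrt(v_i * s / c_i) - s), and iterating best responses converges on typical instances (a sketch with made-up values).

```python
import numpy as np

v = np.array([10.0, 6.0, 4.0])   # (reported) valuations of the prize
c = np.array([1.0, 1.0, 1.0])    # per-unit effort costs

x = np.full(len(v), 0.1)
for _ in range(500):
    for i in range(len(v)):
        s = x.sum() - x[i]                             # opponents' total effort
        x[i] = max(0.0, np.sqrt(v[i] * s / c[i]) - s)  # Tullock best response
print("equilibrium bids :", np.round(x, 4))
print("win probabilities:", np.round(x / x.sum(), 4))
```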
0
0
cs.GT 2026-05-08 2 theorems

Least core cuts LLM calls for creator credit assignment

In-Context Credit Assignment via the Core

Stable value distribution among content creators requires far fewer model queries when using constraint seeding and separation.

Figure from the paper
abstract
We propose incentive-aligned mechanisms for in-context credit assignment: the task of assigning credit for AI-generated content (e.g. code, news articles, short-form videos) among creators whose intellectual property appears in the context window. Our approach is based on the least core solution concept from cooperative game theory, which distributes value in a way that is as stable as possible by ensuring that no subset of creators is significantly under-compensated relative to the value they could generate on their own. We develop algorithms for approximating the least core, which leverage novel routines for constraint seeding and constraint separation. On a web retrieval credit assignment task, we find that our approaches are capable of approximating the least core using orders of magnitude fewer LLM calls compared to alternative methods.
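For intuition, the least core is the solution of a linear program with one constraint per coalition. A minimal sketch using scipy that enumerates all coalitions, feasible only for small player counts; the paper's constraint seeding and separation routines exist precisely to avoid this enumeration when each coalition value is an expensive LLM query:

from itertools import combinations
import numpy as np
from scipy.optimize import linprog

def least_core(n, v):
    # Exact least core: minimize eps s.t. x(S) >= v(S) - eps for every
    # nonempty proper coalition S, and x(N) = v(N).
    players = range(n)
    c = np.zeros(n + 1)
    c[-1] = 1.0                       # minimize eps (last variable)
    A_ub, b_ub = [], []
    for r in range(1, n):
        for S in combinations(players, r):
            row = np.zeros(n + 1)
            for i in S:
                row[i] = -1.0
            row[-1] = -1.0            # -x(S) - eps <= -v(S)
            A_ub.append(row)
            b_ub.append(-v(frozenset(S)))
    row_eq = np.ones(n + 1)
    row_eq[-1] = 0.0                  # efficiency: x(N) = v(N)
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=[row_eq],
                  b_eq=[v(frozenset(players))],
                  bounds=[(None, None)] * (n + 1))
    return res.x[:n], res.x[-1]

x, eps = least_core(3, lambda S: 1.0 if len(S) >= 2 else 0.0)
print(x, eps)   # equal split (1/3 each) with eps = 1/3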
0
0
cs.GT 2026-05-08 Recognition

Auctions embed ads in LLMs without distorting content

Mechanism Design for Quality-Preserving LLM Advertising

RAG-based endogenous reserves and screened VCG payments raise revenue while keeping responses close to clean outputs.

Figure from the paper
abstract
Embedding advertisements into large language model (LLM) outputs introduces a fundamental tension: revenue optimization can distort content and degrade user experience. Existing approaches largely ignore this trade-off, often forcing irrelevant ads into responses. We propose a quality-preserving auction framework that explicitly integrates content fidelity into the mechanism design. Built on retrieval-augmented generation (RAG), our approach treats organic content as a reference and derives an endogenous reserve price that screens out ads with non-positive marginal social welfare contributions. We develop a KL-regularized single-allocation mechanism with Myerson payments and a screened VCG multi-allocation mechanism, both satisfying dominant-strategy incentive compatibility and individual rationality. Experiments across diverse scenarios demonstrate that our mechanisms outperform existing baselines in metrics such as revenue per ad and semantic similarity to no-ad responses. Our results establish a new paradigm for LLM advertising that enables monetization without compromising output quality.
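A single-slot toy version of the screening idea: score each ad by reported value minus a content-fidelity cost, show nothing when no score is positive, and charge the critical bid. The sketch below is a standard monotone-allocation-plus-critical-payment construction, not the paper's KL-regularized or screened VCG mechanisms:

def screened_ad_auction(bids, quality_costs):
    # Score = reported value minus content-fidelity cost (e.g. a KL-style
    # distance from the organic answer). Highest positive score wins and
    # pays its critical bid, which gives dominant-strategy truthfulness.
    scores = [b - q for b, q in zip(bids, quality_costs)]
    w = max(range(len(bids)), key=lambda j: scores[j])
    if scores[w] <= 0:
        return None, 0.0              # endogenous reserve: show no ad
    runner_up = max((scores[j] for j in range(len(bids)) if j != w), default=0.0)
    payment = quality_costs[w] + max(0.0, runner_up)
    return w, payment

print(screened_ad_auction([5.0, 4.0, 1.0], [1.5, 3.5, 0.2]))  # (0, 2.3)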
0
0
cs.GT 2026-05-08 2 theorems

Adaptive scalarization choice steers vector games toward preferred equilibria

Online Scalarization in Vector-Valued Games

A slow outer learner picks from candidate linear combinations so a fast inner bandit converges closer to the outcome favored by a hidden true weight vector.

Figure from the paper
abstract
We study repeated multi-player vector-valued games in which a player observes a payoff vector each round and evaluates outcomes through linear scalarizations of those vectors. Different from most prior works, the choice of scalarization is treated as an online decision variable rather than a fixed modeling decision. We propose a bi-level learning framework in which an outer learner chooses a scalarization from a finite candidate class on a slow timescale, while a faster inner bandit no-regret learner selects actions using the scalar feedback induced by the chosen scalarization. Performance of this approach is defined with respect to a certain true weight vector, and the deployed scalarizations act as control signals that shape the induced payoff trajectory. We provide implementable algorithms based on bandit online mirror descent with stabilized importance weighting, and we derive finite-time performance guarantees in the form of sublinear regret bounds. Experiments on a vector-valued extension of a canonical game show that convergence to the preferred equilibrium rises from roughly $50\%$ under non-adaptive scalarization to about $80\%$ under our proposed method.
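The inner learner can be as simple as an Exp3-style bandit fed the scalarized payoff selected by the outer level. A sketch with plain importance weighting; the paper's stabilized weighting and the outer-level update are not modeled here:

import math, random

def exp3_inner(num_actions, payoff_vector, weight_at, T=5000, eta=0.05, seed=0):
    # Inner no-regret learner: at round t the outer level supplies a
    # scalarization w_t; only the scalar w_t . r(a_t) is used for learning.
    rng = random.Random(seed)
    logits = [0.0] * num_actions
    for t in range(T):
        mx = max(logits)
        w = [math.exp(l - mx) for l in logits]
        Z = sum(w)
        probs = [x / Z for x in w]
        a = rng.choices(range(num_actions), weights=probs)[0]
        scalar = sum(wi * ri for wi, ri in zip(weight_at(t), payoff_vector(a)))
        logits[a] += eta * scalar / probs[a]   # importance-weighted update
    mx = max(logits)
    w = [math.exp(l - mx) for l in logits]
    Z = sum(w)
    return [x / Z for x in w]

# Two actions with 2-d payoffs; the outer level holds one candidate weight fixed.
payoffs = {0: (1.0, 0.0), 1: (0.0, 1.0)}
print(exp3_inner(2, lambda a: payoffs[a], lambda t: (0.7, 0.3)))
# concentrates on action 0 under this scalarization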
0
0
cs.GT 2026-05-08

Folk theorem for LLMs sustains any rational outcome as ε-equilibrium

Sustaining Cooperation in Populations Guided by AI: A Folk Theorem for LLMs

Hidden advisor identities and indirect observation still allow all feasible payoffs in repeated agent interactions guided by shared models.

abstract
Large language models (LLMs) are increasingly used to provide instructions to many agents who interact with one another. Such shared reliance couples agents who appear to act independently: they may in fact be guided by a common model. This coupling can change the prospects for cooperation among agents with misaligned incentives. We study settings in which multiple LLMs each advise a population of clients who participate in instances of an underlying game, creating strategic interaction at the level of the LLMs themselves. This induces a meta-game among the LLMs, mediated through clients. We first analyze the one-shot setting, where shared instructions can change equilibrium behavior only when an LLM may influence more than one role in the same interaction; in such cases, cooperation may emerge, and the effect of client share can be beneficial, harmful, or non-monotone, depending on the base game. Our main result concerns the repeated setting. We prove a folk theorem for LLMs: despite indirect observation and the clients' inability to identify which LLM advised their opponents, all feasible and individually rational outcomes can be sustained as $\varepsilon$-equilibria. The result does not follow from the standard folk theorem and requires new proof techniques. Together, these results show that shared LLM guidance can sustain cooperation among populations of agents even when the underlying incentives are misaligned.
0
0
cs.GT 2026-05-08

Sequential trials with subsidies raise social utility by over 35%

Optimizing Social Utility in Sequential Experiments

A belief-process model lets regulators compute the subsidy level that encourages valuable but uncertain products without full upfront costs.

Figure from the paper
abstract
Regulatory approval of products in high-stakes domains such as drug development requires statistical evidence of safety and efficacy through large-scale randomized controlled trials. However, the high financial cost of these trials may deter developers who lack absolute certainty in their product's efficacy, ultimately stifling the development of `moonshot' products that could offer high social utility. To address this inefficiency, in this paper, we introduce a statistical protocol for experimentation where the product developer (the agent) conducts a randomized controlled trial sequentially and the regulator (the principal) partially subsidizes its cost. By modeling the protocol using a belief Markov decision process, we show that the agent's optimal strategy can be found efficiently using dynamic programming. Further, we show that the social utility is a piecewise linear and convex function over the subsidy level the principal selects, and thus the socially optimal subsidy can also be found efficiently using divide-and-conquer. Simulation experiments using publicly available data on antibiotic development and approval demonstrate that our statistical protocol can be used to increase social utility by more than $35$$\%$ relative to standard, non-sequential protocols.
0
0
cs.GT 2026-05-08

EFX fails even for symmetric submodular valuations with three agents

Counterexamples to EFX for Submodular and Subadditive Valuations

Eight-good instances yield no allocation that meets the EFX criterion when agents differ only by how goods are labeled.

abstract
The existence of EFX allocations is a fundamental question in fair division. In this paper, we construct a three-agent, eight-good instance with monotone subadditive valuations such that no allocation satisfies $\alpha$-EFX for any $\alpha > \frac{1}{\sqrt[6]{2}} \approx 0.89$. We also provide a closely related three-agent, eight-good instance with submodular (in fact weighted coverage) valuations for which no EFX allocation exists. A key feature of our construction is its symmetry: the agents' valuations are identical up to a relabeling of the goods. Thus, EFX can fail even when agents differ only in how the goods are labeled. This symmetry makes the counterexamples compact and human-verifiable, yielding simple combinatorial obstructions to the existence of EFX.
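At this scale the claim is machine-checkable by brute force: with 3 agents and 8 goods there are only 3^8 = 6561 allocations. A sketch of an alpha-EFX checker over arbitrary valuation oracles; the paper's specific weighted-coverage valuations are not reproduced here:

from itertools import product

def is_efx(alloc, valuations, alpha=1.0):
    # alloc: tuple of frozensets; valuations[i]: callable on frozensets.
    for i in range(len(alloc)):
        vi = valuations[i]
        mine = vi(alloc[i])
        for j in range(len(alloc)):
            if i == j:
                continue
            for g in alloc[j]:        # EFX: envy must vanish after removing
                if mine < alpha * vi(alloc[j] - {g}):   # any single good
                    return False
    return True

def exists_efx(num_goods, valuations, alpha=1.0):
    agents = len(valuations)
    for assignment in product(range(agents), repeat=num_goods):
        bundles = [frozenset(g for g in range(num_goods) if assignment[g] == k)
                   for k in range(agents)]
        if is_efx(bundles, valuations, alpha):
            return bundles
    return None

# Additive example (EFX always exists here; the counterexamples require
# carefully constructed submodular/subadditive valuations):
vals = [lambda S, w=w: sum(w[g] for g in S)
        for w in ([3,1,1,1,1,1,1,1], [1,3,1,1,1,1,1,1], [1,1,3,1,1,1,1,1])]
print(exists_efx(8, vals) is not None)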
0
0
cs.GT 2026-05-08

Agents learn Nash equilibria independently with local observations

Independent Learning of Nash Equilibria in Partially Observable Markov Potential Games with Decoupled Dynamics

In potential games with independent state transitions, finite-history policies yield quasi-polynomial convergence without communication.

abstract
We study Nash equilibrium learning in partially observable Markov games (POMGs), a multi-agent reinforcement learning framework in which agents cannot fully observe the underlying state. Prior work in this setting relies on centralization or information sharing, and suffers from sample and computational complexity that scales exponentially in the number of players. We focus on a subclass of POMGs with independent state transitions, where agents remain coupled through their rewards, and assume that the underlying fully observed Markov game is a Markov potential game. For this class, we present an independent learning algorithm in which players, observing only their own actions and observations and without communication, jointly converge to an approximate Nash equilibrium. Due to partial observability, optimal policies may in general depend on the full action-observation history. Under a filter stability assumption, we show that policies based on finite history windows provide sufficient approximation guarantees. This enables us to approximate the POMG by a surrogate Markov game that is near-potential, leading to quasi-polynomial sample and computational complexity for independent Nash equilibrium learning in the underlying POMG.
0
0
cs.GT 2026-05-08

Core committees exist for approval elections with up to five voters

Core Existence in Approval-Based Committee Elections with up to Five Voter Types

A rounding technique shows stable proportional outcomes always exist and are efficiently computable for small voter counts.

abstract
In an approval-based committee election, the task is to select a committee of up to $k$ candidates from a set of $m$ candidates based on the preferences of $n$ voters, each of whom approves a subset of the candidates. A central open question is whether there always exists a committee in the core, a stability notion capturing proportional representation. We prove core non-emptiness for all approval-based committee elections with at most five voters. The proof is based on affine monoid methods and shows that, for $n\le5$, every fractional committee admits a deterministic rounding to an integral committee that preserves each voter's utility up to floors. We extend our argument to the weighted voter setting, which implies core existence for instances with up to five distinct approval sets. In all these cases, a core committee can be computed in polynomial time. We show that our technique cannot be extended as-is: our rounding method breaks down for $n=6$, and for $n=3$ when applied to more general models with additive valuations or non-unit candidate costs.
0
0
cs.GT 2026-05-08 Recognition

Method yields undominated ex ante stable school lotteries

A Simple Method for School Choice Lotteries

ETE reassignment on constrained efficient matchings produces fair random assignments no other ex ante stable lottery can beat.

abstract
This note proposes a simple polynomial-time method for constructing an ex ante stable school-choice lottery satisfying equal treatment of equals. The method applies the ETE reassignment to a constrained efficient stable matching and yields a lottery that is not ordinally dominated by any other ex ante stable lottery.
0
0
cs.GT 2026-05-08

Blockchain procurement loss scales logarithmically with fault costs

Adversarial procurement in blockchains

Optimal mechanisms lose only the logarithm of the liveness-fault cost, scaled by the adversarial fraction, suggesting single-worker-plus-committee designs.

Figure from the paper
abstract
An emerging blockchain protocol design pattern leverages the asymmetry between the computational effort in performing versus verifying tasks. For example, cryptographic validity proofs (e.g., SNARKS) require the prover to expend significant effort demonstrating the correctness of their claim, while the verifiers benefit from extremely easy validation. The operationalization of this paradigm requires efficiently soliciting the performance of expensive tasks in pseudonymous, adversarial environments. We formalize this as a mechanism design question. The protocol balances the economic cost of a liveness fault, where the work is not completed, with the payments required to incentivize specific behavior from candidate suppliers. We show that the loss of the optimal protocol scales logarithmically in the cost of a liveness fault, scaled up by the adversarial fraction of the network. Further, we find that the optimal equilibria have an intuitive structure, allowing us to provide concrete advice to practitioners. Specifically, in many regimes, the optimum designates a single, random node as the primary worker and a committee as a fallback, which is reminiscent of leader-based consensus mechanisms. We also characterize the asymptotic regimes where having negative payments (i.e., slashing in blockchain parlance) is especially helpful.
0
0
cs.GT 2026-05-07 Recognition

Platforms balance profit and welfare via pricing, matching, and bundling

Pricing, Matching, and Bundling: an Equilibrium Analysis of Online Platforms

Thesis shows the three design levers reinforce one another in equilibrium models of marketplaces and services.

Figure from the paper
abstract
Modern online platforms such as marketplaces, ride-hailing services, and food-delivery systems serve a dual role: they are both markets where participants interact and transact, and operators that design and govern how these markets function. These platforms connect multiple sides, for example buyers, sellers, and couriers, facilitating access that would otherwise be difficult to achieve. By setting the rules of the market, platforms determine who participates, how interactions take place, and how value is created and distributed. In response to these rules, participants may behave strategically, deciding whether to join the platform and which transactions to pursue. This thesis studies how platform design affects market outcomes through three key levers: pricing that determines participants' gains when operating on a platform; matching that governs which interactions are feasible among participants; and bundling that shapes the structure of supply when the platform itself acts as a market participant. Across these levers, the goal in this thesis is to understand how platforms can be designed to balance platform profitability with overall market welfare. The first part of this thesis studies pricing, including both the commission fees that participants pay to a platform and the prices associated with each transaction. The second part of this thesis studies matching. By shaping recommendation systems and consumer search, platforms influence which transactions take place. The third part of this thesis analyzes bundling. As a marketplace operator, a platform may be able to source products from sellers and offer them as bundled packages to buyers. Collectively, this thesis shows how pricing, matching, and bundling serve as complementary design levers through which platforms can shape market outcomes.
0
0
cs.GT 2026-05-07

Recognizing graph supports for preferences is NP-hard with few edges

When Graph Traversal Meets Structured Preferences: Unified Framework and Complexity Results

For all six standard traversal methods, deciding if voter rankings arise from a small or low-degree graph is computationally hard; the DFS tree-support case is polynomial.

Figure from the paper
abstract
Preference restrictions have played a significant role in computational social choice. This paper studies a framework that connects preference restrictions with classical graph search paradigms. We model candidates as vertices of a graph and interpret the preference ordering of each voter as the outcome of traversing the graph according to a graph search. We focus on six fundamental paradigms: breadth-first search (BFS), depth-first search (DFS), lexicographic breadth-first search (LexBFS), lexicographic depth-first search (LexDFS), maximum cardinality search (MCS), and maximal neighborhood search (MNS). Within this framework, we study the problem of determining whether a given preference profile admits a graph support subject to structural restrictions, that is, whether there exists a graph such that each preference ordering can be generated by traversing the graph under the chosen paradigm. For all considered paradigms, we show that this problem is NP-hard when the graph support is required to have at most $k$ edges, where $k$ is a given integer. We further extend these hardness results to the case where the graph support is required to have maximum degree $k$. For DFS, we prove that recognizing whether a preference profile admits a tree support can be solved in polynomial time. Moreover, existing results imply polynomial-time solvability of the problem for all remaining graph traversals, except BFS and LexBFS, for which the complexity remains open.
0
0
cs.GT 2026-05-07

PJR+ is the minimal proportionality notion under four mild axioms

An Axiomatic Analysis of Proportionality Notions in Approval-Based Multiwinner Voting

Any notion obeying monotonicity, independence of losers, robustness to satisfied voters, and lower quota must refine PJR+.

Figure from the paper
abstract
Even though proportional representation is a fundamental goal in multiwinner voting and a plethora of proportionality notions has been introduced, the normative justifications for choosing one notion over another remain poorly understood. We address this by introducing the axiomatic study of proportionality notions in the approval-based multiwinner voting setting. That is, we define axioms (or desirable properties) that ``good'' proportionality notions should possess. Using these axioms, we then provide axiomatic characterizations of two prominent recently introduced notions: PJR+ and EJR+ [Brill and Peters 2023]. Our characterization proceeds in two parts. Firstly, we provide a characterization of refinements of PJR+ and EJR+. That is, we define axioms such that any notion satisfying these axioms must imply PJR+ (or EJR+, respectively). In particular, the fundamental axiom distinguishing PJR+ and EJR+ from their predecessors PJR and EJR is the classical axiom of monotonicity. Secondly, we introduce our framework of witness-based proportionality notions, that is, proportionality notions that certify ``misrepresentation'' via a witness set of misrepresented voters. In this class, we provide characterizations of PJR+ and EJR+ as the strongest (assuming certain axioms). Thus, by putting both directions together we obtain exact characterizations of both notions. Among our results, it may be worth highlighting that any notion satisfying mild conditions (monotonicity, independence of losers, robustness to fully satisfied voters, and lower quota) refines PJR+. In this sense, PJR+ turns out to be the canonical minimal requirement that one may impose on proportionality.
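For reference, the polynomial-time check for the closely related EJR+ notion is only a few lines; a PJR+ check replaces the per-voter condition with one on the group's union of approved winners. A sketch assuming approval sets as Python sets:

def ejr_plus_violation(approvals, committee, k):
    # approvals: list of sets (one per voter); committee: iterable of winners.
    # W violates EJR+ iff some ell >= 1 and candidate c outside W admit a
    # group of at least ell*n/k voters who all approve c and each of whom
    # has fewer than ell approved committee members (Brill and Peters 2023).
    n = len(approvals)
    W = set(committee)
    outside = set().union(*approvals) - W
    for c in outside:
        sat = [len(A & W) for A in approvals if c in A]
        for ell in range(1, k + 1):
            if sum(1 for s in sat if s < ell) * k >= ell * n:
                return c, ell
    return None   # committee satisfies EJR+

print(ejr_plus_violation([{1, 2}, {1, 2}, {3}, {4}], committee=[3, 4], k=2))
# returns a witness (c, 1): half the voters approve an unelected candidate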
0
0
cs.GT 2026-05-06

Light storage limits turn content-provider competition into a potential game

Decentralized Edge Caching under Budget and Storage Constraints: A Game-Theoretic Approach

Decentralized equilibria are guaranteed and reachable; binding constraints still converge in simulations but widen profit gaps and boost the bargaining power of edge devices.

Figure from the paper
abstract
The rapid growth of mobile social networks (MSNs) has significantly increased the demand for low-latency and reliable content delivery, motivating the deployment of edge caching systems. In practice, multiple content providers (CPs) compete for the limited storage resources of edge devices (EDs), while facing heterogeneous budgets and operational costs. This paper investigates a decentralized multi-CP edge caching framework that jointly accounts for CP budget constraints, ED storage limitations, and strategic interactions among all entities. We formulate the interaction between CPs and EDs as a hierarchical game, combining a Stackelberg model for CP-ED interactions with a non-cooperative game among competing CPs. Under light storage constraints, we show that CP competition constitutes an exact potential game, ensuring the existence of a pure-strategy Nash equilibrium and enabling decentralized convergence. When storage constraints are binding, the resulting game loses this structure; nevertheless, extensive simulations demonstrate stable and efficient convergence in practice. Through a comprehensive numerical evaluation, we show that convergence behavior is primarily driven by CP competition rather than the scale of edge infrastructure. We further reveal that storage scarcity fundamentally alters economic outcomes, amplifying inequality among CPs while increasing the relative bargaining power of EDs. The proposed framework provides a scalable and economically grounded solution for decentralized resource allocation in multi-provider edge caching systems.
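The practical payoff of the exact-potential property is that sequential best responses cannot cycle. A generic congestion-game sketch, not the paper's CP-ED model, showing decentralized convergence to a pure Nash equilibrium:

def best_response_dynamics(num_players, resources, delay, iters=100):
    # Exact potential game: each player picks one resource; a resource with
    # load x costs delay(x). Rosenthal's potential guarantees that sequential
    # best responses terminate at a pure Nash equilibrium.
    choice = [0] * num_players
    for _ in range(iters):
        stable = True
        for p in range(num_players):
            loads = [sum(1 for q in range(num_players)
                         if q != p and choice[q] == r) for r in range(resources)]
            best = min(range(resources), key=lambda r: delay(loads[r] + 1))
            if delay(loads[best] + 1) < delay(loads[choice[p]] + 1):
                choice[p] = best
                stable = False
        if stable:
            return choice
    return choice

print(best_response_dynamics(6, 3, delay=lambda x: x * x))  # balances 2/2/2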
0
0
cs.GT 2026-05-06

Honest reports beat any misreport under power-p scores for d<=4

Honest Reporting in Scored Oversight: True-KL0 Property via the Prekopa Principle

Exact gain formula G=-R U together with log-concavity of the loss integral proves unconditional incentive compatibility for every M>1.

abstract
We prove the True-KL$_0$ property for a parametric family of heterogeneous scoring rules arising in scored elicitation mechanisms (AI oversight, forecasting competitions, expert surveys). A $d$-dimensional agent with private type $M>1$ reports to a principal who evaluates via a power-$p$ pseudospherical scoring rule, $p \in (d,d+1)$; $M$ captures the agent's information quality relative to a reference. An exact formula $G(M,M') = -R(M,p,d) U(M|M)$ shows DSIC unconditionally: honest reporting maximises expected score for every $M>1$, without distributional assumptions. True-KL$_0$, the property $R(M,p,d)<1$ for all $M>1$, $d \in \{2,3,4\}$, $p \in (d,d+1)$, gives an explicit gain-magnitude bound: the best misreport is always worse than the honest score itself. Two structural tools drive the proof: (i) a substitution $y=(x+1)/(x-1)$ rewrites the loss integral $I_L$ as $\int_1^M F(y)(M^2-y^2)^{d/2} dy$ with $M$-independent weight $F(y)>0$, isolating all $M$-dependence in a single convex factor; (ii) Prekopa's theorem on log-concavity preservation establishes that $I_L$ is log-concave in $M$, the key step in the unimodality proof for $R$. For $d=2$ the log-concavity proof is fully algebraic. For $d \in \{3,4\}$ the Prekopa argument (analytic, covering $M \le M_{cut}(d,p) \le 20$) combines with a certified high-precision numerical step on the residual region $M \in [M_{cut}, 20]$, closed by a large-$M$ asymptotic for $M>20$. We also characterise the dimensional boundary: True-KL$_0$ holds unconditionally for all $p \in (d,d+1)$ when $d \le 4$, but fails above a critical threshold $p_{crit}(d) \in (d,d+1)$ for $d \ge 5$; for $d=5$ we locate $p_{crit}(5) \in (5.5718, 5.5750)$ via high-precision mpmath evaluation (half-width 0.0016, not interval-certified).
0
0
cs.GT 2026-05-06

LLMs in security dilemmas reproduce multipolar conflict and unraveling

Multi-Agent Strategic Games with LLMs

Experiments show more players raise conflict risk, finite rounds trigger immediate defection, and messages enable reciprocity.

abstract
This paper asks whether large language models (LLMs) can be used to study the strategic foundations of conflict and cooperation. I introduce LLMs as experimental subjects in a repeated security dilemma and evaluate whether they reproduce canonical mechanisms from international relations theory. The baseline game is extended along three theoretically central dimensions: multipolarity, finite time horizons, and the availability of communication. Across multiple models, the results exhibit systematic and consistent patterns: multipolarity increases the likelihood of conflict, finite horizons induce universal unraveling consistent with backward-induction logic, and communication reduces conflict by enabling signaling and reciprocity. Beyond observed behavior, the design provides access to agents' private reasoning and public messages, allowing choices to be linked to underlying strategic logics such as preemption, cooperation under uncertainty, and trust-building. The contribution is primarily methodological. LLM-based experiments offer a scalable, transparent, and replicable approach to probing theoretical mechanisms.
0
0
cs.GT 2026-05-06

Delegation conserves credit capacity in pseudonymous lending

Unsecured Lending via Delegated Underwriting

Sponsors reallocate existing borrowing power to new users, defaults stay local to sponsor paths, and earned-credit caps deter repay-then-default.

Figure from the paper
abstract
We develop a mechanism for unsecured lending among pseudonymous users that does not rely on collateral, legal identity, or centralized underwriting. New borrowers enter only through sponsors who delegate part of their own credit capacity, so onboarding a new account reallocates existing borrowing power rather than minting new capacity. Default losses flow back along the sponsor path, while repayment creates earned credit that expands future borrowing capacity. We prove that delegation conserves aggregate credit capacity, that revocation and default remain local to a unique sponsor path, and that a simple cap on earned-credit growth makes repay-then-default weakly unprofitable.
0
0
cs.GT 2026-05-06

Any graph admits a balanced k-partition that is approximately envy-free and core-stable

Some Improved Results on Fair and Balanced Graph Partitions

The partition keeps individual envy within O(max{√Δ, k²} ln n) and coalitional deviation within (k + o(k)), with efficient algorithms when balancedness is slightly relaxed.

abstract
We consider the problem of partitioning an undirected graph (representing a social network) over $n$ nodes and max degree $\Delta$ into $k$ equally sized parts. Each node in the graph, representing an agent, derives utility proportional to the number of their neighbors in their assigned part. Our goal is to find a balanced partitioning that is fair. The two notions of fairness we consider are the core and envy-freeness. A partition is envy-free if no node gains utility from moving to a different part, and a partition is in the core if no set of $n/k$ nodes can deviate to form a new part with all nodes gaining in utility. We show that there exists a balanced partition which is both $O(\max\{\sqrt{\Delta}, k^2\} \ln n)$-approximately envy-free and in the $(k + o(k))$-approximate core. Taken separately, these two guarantees are comparable to (and in some cases, better than) the best known envy-freeness and core guarantees for this problem. Moreover, we show that these desirable partitions can be computed efficiently if we slightly relax the balancedness constraint. In addition, when $k = 2$, we show that a $(1.618 + o(1))$-core exists, and a $(2 + \varepsilon)$-core can be computed in polynomial time. The last two results make progress on two open questions from Li et al. [AAAI, 2023].
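Auditing a given partition is straightforward. A sketch that computes the empirical envy approximation factor, using max(own utility, 1) in the denominator as a convention for zero-utility nodes (an assumption; the paper's exact definition is not reproduced here):

def envy_ratio(adjacency, parts):
    # adjacency: dict node -> set of neighbours; parts: list of node sets.
    # A node's utility is its number of neighbours inside its own part.
    # Returns the largest ratio (utility in some other part) / (own utility).
    part_of = {v: k for k, part in enumerate(parts) for v in part}
    alpha = 1.0
    for v, nbrs in adjacency.items():
        own = sum(1 for u in nbrs if part_of[u] == part_of[v])
        for k in range(len(parts)):
            if k == part_of[v]:
                continue
            there = sum(1 for u in nbrs if part_of[u] == k)
            alpha = max(alpha, there / max(own, 1))   # convention for own == 0
    return alpha

graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}, 3: {4}, 4: {3}, 5: set()}
print(envy_ratio(graph, [{0, 1, 2}, {3, 4, 5}]))  # 1.0: nobody envies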
0
0
cs.GT 2026-05-05

Neural menus deliver strategy-proof matching under quotas

MenuNet: A Strategy-Proof Mechanism for Matching Markets

MenuNet outputs probabilistic options for each agent and assigns via sequential choice, cutting envy below RSD and waste below DA.

Figure from the paper
abstract
Strategy-proofness is a fundamental desideratum in mechanism design, ensuring truthful reporting and robust participation. Stability is another central requirement in matching markets, widely adopted in applications such as school choice and labor market clearing. In practice, however, these markets are invariably governed by complex distributional constraints, ranging from diversity quotas and regional balance to global capacity slacks, under which stable matchings often fail to exist. This raises a fundamental question: how to distribute unavoidable instability across agents while preserving strategy-proofness? To address this, we propose MenuNet, a strategy-proof mechanism design framework based on a neural representation of menus. Rather than directly constructing assignments, MenuNet learns to generate personalized probabilistic menus, from which assignments are realized via a structured sequential choice rule that guarantees strategy-proofness by construction. By decomposing stability into fairness (no envy) and non-wastefulness, our approach models these properties as vector-valued quantities and optimizes their distribution through differentiable objectives, providing a principled trade-off between competing axioms. Empirically, MenuNet navigates this trade-off effectively: it consistently outperforms Random Serial Dictatorship (RSD) in terms of envy and Deferred Acceptance (DA) in terms of waste, while maintaining scalability and computational efficiency. These results suggest that learning-based menu mechanisms provide a flexible and scalable paradigm for mechanism design in highly constrained, real-world environments.
0
0
cs.GT 2026-05-05

Informed player solves Nash strategies at 10 Hz in 8D LQ games

Fast Strategy Solving for the Informed Player in Two-Player Zero-Sum Linear-Quadratic Differential Games with One-Sided Information

Bi-level optimization over signaling actions and game-tree LQR controls yields real-time equilibria under payoff uncertainty and random disturbances.

Figure from the paper
abstract
We study finite-horizon two-player zero-sum differential games with one-sided payoff information ($G$), where the informed player (P1) knows the game payoff, while P2 only has a public belief over a finite set of possible payoffs. In this case, P1's Nash equilibrium (NE) behavioral strategy may control the release of the type information or even resort to manipulating P2's belief. Previous studies revealed an atomic structure of the NE of $G$ with general nonlinear dynamics and payoffs, leading to tractable NE approximation. Implementing such approximation schemes for real-time sub-game solving, however, has not been achieved, yet is desired for applications where sim-to-real gaps exist and robust control is required. This paper improves the computational efficiency of sub-game solving for P1 during $G$ with linear dynamics and quadratic losses. Specifically, we show that P1's NE computation can be formulated as a bi-level optimization problem where the outer level optimizes the "signaling" strategy, i.e., when and how to reveal information through control, and the inner level is a game-tree LQR that solves for the optimal closed-loop control. This bi-level problem is solved via an adjoint-enabled backpropagation scheme: A "backward" LQR pass is followed by a "forward" gradient descent pass for improving the signaling. We apply the proposed algorithm to approximate NEs for variants of a homing problem with an 8D state space, 2D action spaces, and a discrete time horizon of $K=10$. The algorithm achieves $\approx$10Hz sub-game solving, enabling robust game-theoretic planning under information asymmetry and random disturbances.
0
0
cs.GT 2026-05-04

Belief game converts imperfect-recall equilibria into perfect-recall solvable form

Efficient representations for team and imperfect-recall equilibrium computation

The equivalent game is represented as a team belief DAG that supports efficient regret minimization and state-of-the-art benchmark results.

Figure from the paper
abstract
Equilibrium finding in two-player zero-sum games with perfect recall is a well-studied topic that has led to many breakthroughs in computational game theory. This paper aims to generalize such techniques to (timeable) two-player zero-sum games with imperfect recall, or equivalently to two-team zero-sum games. In this setting, the problem of computing a mixed-strategy Nash equilibrium (or, equivalently, a team maxmin equilibrium with correlation) is known to be NP-hard. We connect the imperfect-recall setting with its perfect-recall counterpart through a novel construction we call the belief game. This is a perfect-recall game equivalent to a given (timeable) two-player zero-sum game with imperfect recall. The belief game may be exponentially larger than the original game but can be solved using any standard method. We then show that the strategy spaces of the two players in the belief game can be directly represented as a DAG, leading to a possibly exponential speedup. We call this the team belief DAG (TB-DAG). The TB-DAG simultaneously enjoys essentially optimal parameterized complexity bounds and the advantages of efficient regret minimization techniques. Along the way, we show $\Delta_2^P$-completeness and $\Sigma_2^P$-completeness of finding Nash equilibria in both mixed and behavioral strategies for the class of games we consider. Experimentally, we show that the TB-DAG, when paired with existing learning techniques, yields state-of-the-art performance on a wide variety of benchmark team games.
0
0
cs.GT 2026-05-04

Marginal ad value model yields optimal regret in second-price auctions

The (Marginal) Value of a Search Ad: An Online Causal Framework for Repeated Second-price Auctions

Defining value as the outcome gain from winning versus losing enables algorithms that strictly improve on first-price learning.

Figure from the paper
abstract
Existing auto-bidding algorithms in digital advertising often treat the value of an ad opportunity as the revenue obtained when an ad is shown and/or clicked, and bid accordingly. This can lead to wasteful spending because the true value is the marginal gain from paid exposure: even without winning a sponsored slot, an advertiser may still earn revenue via an organic search result (e.g., on Google or Amazon). Motivated by recent work, we model ad value as a treatment effect--the outcome difference between winning and losing the auction--and study online learning for bidding in second-price (Vickrey) auctions under this causal perspective. We develop algorithms that attain rate-optimal regret under several feedback models. A key ingredient exploits the information revealed by the second-price payment rule, which strictly improves regret relative to analogous learning problems in first-price auctions.
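The causal notion of value suggests a simple plug-in bidder: estimate outcomes separately on won and lost auctions and bid the difference, which is weakly dominant in a second-price auction once the estimate is accurate. This naive sketch only illustrates the treatment-effect definition; the paper's algorithms additionally exploit the observed second-price payments to obtain optimal regret:

class MarginalValueBidder:
    # Bid the current estimate of the causal ad value in a repeated
    # second-price auction: the gap between average outcomes when the
    # auction was won (paid exposure) and lost (organic result only).
    def __init__(self, prior=0.0):
        self.win_sum = self.win_n = 0.0
        self.lose_sum = self.lose_n = 0.0
        self.prior = prior

    def bid(self):
        win_avg = self.win_sum / self.win_n if self.win_n else self.prior
        lose_avg = self.lose_sum / self.lose_n if self.lose_n else 0.0
        return max(win_avg - lose_avg, 0.0)

    def update(self, won, outcome):
        if won:
            self.win_sum += outcome; self.win_n += 1
        else:
            self.lose_sum += outcome; self.lose_n += 1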
1 0
0
cs.GT 2026-05-04

Beliefs guarantee non-empty cores in symmetric partition games

Partition function form games with probabilistic beliefs

Specific probability distributions over outsider partitions make the core non-empty even when beliefs are inconsistent with actual play.

Figure from the paper
abstract
We revisit games in partition function form, i.e. cooperative games where the payoff of a coalition depends on the partition of the entire set of players. We assume that each coalition computes its worth having probabilistic beliefs over the coalitional behavior of the outsiders, i.e., it assigns various probability distributions over the set of partitions that the outsiders can form. These beliefs are not necessarily consistent with respect to the actual choices of the outsiders. We apply this framework to symmetric partition function form games characterized by either positive or negative externalities and we derive conditions on coalitional beliefs that guarantee the non-emptiness of the core of the induced games.
0
0
cs.GT 2026-05-04

Low-stake attackers degrade pools to profit from token drops

Your Loss is My Gain: Low Stake Attacks on Liquid Staking Pools

Data shows pool performance predicts LST returns, enabling short positions that succeed over 50 percent of the time in simulations.

Figure from the paper
abstract
Permissionless Proof-of-Stake (PoS) economic security is predicated on the high cost of violating consensus safety or liveness. We show that liquid staking introduces additional risks that are not captured by standard PoS economic security arguments. Through an empirical study of Ethereum data, we find that the operational performance of liquid staking pools is positively associated with subsequent normalized liquid staking token (LST) returns. Motivated by this, we present a cross-layer attack: a low-stake adversary can manipulate the consensus protocol to degrade a target pool's performance and take application-layer positions that profit if the market reprices the corresponding LST in-line with the historically observed association. To make the consensus layer manipulation concrete, we develop a deep reinforcement learning (DRL) framework to automatically discover attack strategies. Our evaluation shows that the learned strategies can recover near-optimal theoretical attacks and uncover new manipulation behaviors that significantly degrade target pool performance. We further characterize feasible application-layer monetization channels and analyze leveraged shorting in detail using Monte Carlo simulations, showing that such attacks can be profitable with over one-half probability for LSTs of major staking pools. Our findings reveal a previously overlooked attack surface in PoS systems with liquid staking and expose a gap between consensus and economic security.
0
0
cs.GT 2026-05-04

Homogeneous groups bound game interactions to minority size

Induced Representations in Cooperative Games with Homogeneous Groups of Players

Induced representations filter the kernel and recover the unique Shapley value for two groups under standard axioms.

abstract
Oftentimes, the Shapley value becomes infeasible for games with many players. However, establishing symmetry allows for polynomial-time computation. To examine this reduction, we identify the spectrum of homogeneous group games by using an induced representation from a Young subgroup. We then prove that such games are supported solely by irreducible representations, via the Littlewood-Richardson rule, where the depth of interactions is strictly bounded by the size of the minority group. Therefore, the algebraic structure of the game filters out the complexities of the general kernel $W$. We then show that this filtration constrains any symmetric linear value to a specific subspace. This recovers the Shapley value uniquely for $m=2$ under standard axioms. Finally, we explore applications to the UN Security Council and complementary goods markets to illustrate the practical power of this approach.
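The computational point can be made concrete: when worth depends only on how many members of each group join a coalition, the 2^n-term Shapley formula collapses to O(m_A * m_B) terms. A sketch for two homogeneous groups (function names illustrative):

from math import comb, factorial

def shapley_two_groups(m_a, m_b, v):
    # v(a, b): worth of any coalition with a members of group A and b of B.
    # Symmetry groups the Shapley sum by coalition composition.
    n = m_a + m_b
    def phi(m_own, m_other, worth):
        total = 0.0
        for a in range(m_own):            # own-group members already present
            for b in range(m_other + 1):  # other-group members present
                ways = comb(m_own - 1, a) * comb(m_other, b)
                weight = factorial(a + b) * factorial(n - a - b - 1) / factorial(n)
                total += ways * weight * (worth(a + 1, b) - worth(a, b))
        return total
    phi_a = phi(m_a, m_b, v)
    phi_b = phi(m_b, m_a, lambda own, other: v(other, own))
    return phi_a, phi_b

# Weighted-majority example; efficiency: m_a*phi_a + m_b*phi_b == v(m_a, m_b).
pa, pb = shapley_two_groups(3, 2, lambda a, b: 1.0 if 2 * a + b >= 4 else 0.0)
print(pa, pb, 3 * pa + 2 * pb)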
0
0
cs.GT 2026-05-04

First budget-feasible mechanism approximates submodular welfare

Budget-Feasible Mechanisms for Submodular Welfare Maximization in Procurement Auctions

BFM-SWM is truthful and individually rational while guaranteeing a constant-factor approximation for welfare maximization under hard budget constraints.

Figure from the paper
abstract
Budget-feasible procurement auctions play a pivotal role in various AI-driven marketplaces, such as data acquisition and crowdsourcing, where a buyer with a limited budget seeks to procure services from strategic sellers with private costs. While numerous budget-feasible mechanisms have been proposed for the classic objective of maximizing the buyer's valuation, the more challenging and economically significant objective of social welfare maximization has only recently been studied, and existing approaches still sacrifice budget feasibility, thereby limiting their practical applicability. In this paper, we bridge this gap by proposing BFM-SWM, the first budget-feasible mechanism with provable approximation guarantees for submodular welfare maximization in procurement auctions. Our mechanism satisfies standard economic properties, including truthfulness, individual rationality, and non-negative auctioneer surplus. As a by-product, we develop BFM-VM, a variant tailored for valuation maximization, which achieves a deterministic approximation ratio of $1/(12+4\sqrt{3})$ for general submodular functions, substantially improving upon the best-known deterministic ratio of $1/64$ established by [Balkanski et al., SODA 2022], while reducing the running time from $\mathcal{O}(n^2\log n)$ to $\mathcal{O}(n\log n)$. Extensive experiments demonstrate the efficiency and effectiveness of our mechanisms.
0
0
cs.GT 2026-05-01

Offline algorithm hits Õ(1/n) rate for Nash in α-potential games

Fast Rates in α-Potential Games via Regularized Mirror Descent

Reference-anchored coverage and KL-regularized mirror descent deliver the first fast-rate offline guarantee for this structured game class.

abstract
An $\alpha$-potential game is a multi-player non-cooperative interaction in which a global potential function approximates individual player rewards up to a structural bias $\alpha$. While identifying a Nash Equilibrium (NE) in generic general-sum games is known to be computationally intractable, the potential game structure enables tractable NE identification. In this paper, we study the offline learning of NE in $\alpha$-potential games using KL regularization. To analyze this process, we propose a novel Reference-Anchored offline data coverage framework--a verifiable condition that anchors data requirements to a known reference policy rather than an unknown optimum. Building on this, we propose Offline Potential Mirror Descent (OPMD), a decentralized algorithm that achieves an accelerated $\widetilde{\mathcal{O}}(1/n)$ statistical rate, surpassing the standard $\widetilde{\mathcal{O}}(1/\sqrt{n})$ rate typical of offline multi-agent learning. This work characterizes the first fast-rate offline learning approach for $\alpha$-potential games.
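For intuition, one generic KL-regularized mirror-descent step has a closed form. The sketch below solves max_q <q, r> - lam*KL(q||ref) with a KL proximal term toward the current iterate; this is a standard update, not necessarily the paper's OPMD, whose offline estimation and coverage conditions are not modeled:

import math

def kl_regularized_md_step(p, ref, rewards, eta=0.1, lam=1.0):
    # Closed-form maximizer of eta*<q, r> - eta*lam*KL(q||ref) - KL(q||p):
    #   q(a) proportional to
    #   p(a)^(1/(1+eta*lam)) * ref(a)^(eta*lam/(1+eta*lam))
    #   * exp(eta * r(a) / (1 + eta*lam)).
    # Assumes strictly positive p and ref.
    t = 1.0 + eta * lam
    logq = [(math.log(pi) + eta * lam * math.log(ri) + eta * r) / t
            for pi, ri, r in zip(p, ref, rewards)]
    mx = max(logq)
    w = [math.exp(l - mx) for l in logq]
    Z = sum(w)
    return [x / Z for x in w]

print(kl_regularized_md_step([0.5, 0.5], [0.5, 0.5], [1.0, 0.0]))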
0
0
cs.GT 2026-05-01

Bounded relaxations reconcile fairness and non-wastefulness

Compatible k-Relaxations of Fairness and Non-Wastefulness Under Hereditary Constraints

For any fixed k, ER-k and NW-k are jointly achievable under hereditary constraints by two polynomial-time algorithms.

Figure from the paper
abstract
We study two-sided matching markets under hereditary constraints, which extend beyond simple capacity limits and arise in applications such as diversity requirements and refugee resettlement. In these settings, fairness and non-wastefulness are often incompatible, and existing approaches typically address this tension by prioritizing one property at the expense of the other. We take a different approach by relaxing both properties simultaneously in a controlled and symmetric manner. We introduce two notions indexed by an integer $k$: envy-received up to $k$ peers (ER-$k$) and non-wastefulness up to $k$ objections (NW-$k$). Our main theoretical result shows that ER-$k$ and NW-$k$ are always compatible under hereditary constraints for any fixed $k$. We provide two equivalent polynomial-time algorithms to compute such matchings: a $k$-admissible cutoff algorithm and a $k$-admissible college-proposing deferred acceptance mechanism. Finally, experimental results demonstrate that even small relaxations achieve a favorable balance between fairness and non-wastefulness.
0
0
cs.GT 2026-05-01

Minimizing average coalitional gains yields optimal-time equilibria

Computing Equilibrium beyond Unilateral Deviation

Always-existing alternatives to strong Nash admit algorithms matching a new complexity lower bound and compute maximum welfare under any fixed exploitability level.

Figure from the paper
abstract
Most familiar equilibrium concepts, such as Nash and correlated equilibrium, guarantee only that no single player can improve their utility by deviating unilaterally. They offer no guarantees against profitable coordinated deviations by coalitions. Although the literature proposes solution concepts that provide stability against multilateral deviations (e.g., strong Nash and coalition-proof equilibrium), these generally fail to exist. In this paper, we study an alternative solution concept that minimizes coalitional deviation incentives, rather than requiring them to vanish, and is therefore guaranteed to exist. Specifically, we focus on minimizing the average gain of a deviating coalition, and extend the framework to weighted-average and maximum-within-coalition gains. In contrast, the minimum-gain analogue is shown to be computationally intractable. For the average-gain and maximum-gain objectives, we prove a lower bound on the complexity of computing such an equilibrium and present an algorithm that matches this bound. Finally, we use our framework to solve the Exploitability Welfare Frontier (EWF), the maximum attainable social welfare subject to a given exploitability (the maximum gain over all unilateral deviations).
0
0
cs.GT 2026-05-01

Proportional updates let hierarchies self-evaluate from binary outcomes

Implicit Evaluation Under Minimal Information: Price Formation in Hierarchical Component Selection

Selectors read the sign of their own weight changes as quality signals for children, needing no explicit reports across levels.

Figure from the paper
abstract
We study hierarchical component selection under severe information constraints. Component quality is not directly observable, each selector observes only the outcome of the chosen pathway, and no explicit evaluation channel crosses module boundaries. We analyse a proportional-redistribution mechanism in which each selector maintains a weight vector over its children and updates that vector from observed outcomes. The sign of a parent's weight change can be read locally as an implicit binary evaluation signal by the selected child, yielding a decentralised evaluation mechanism with no explicit reporting channel. We give a full formal treatment. Proportional redistribution preserves market integrity algebraically. The sign of the weight change propagates without loss through the active path. The single-selector dynamics admit a unique interior equilibrium; for $N{=}2$ the equilibrium is exact and closed-form, while for general $N$ an equi-ratio condition yields an explicit affine equilibrium. Hierarchical composition is informationally clean, with each node's active-round dynamics identical to a standalone instance observed on a thinned clock. All structural results, the equilibrium formula, and the composition theorem are fully proved. Illustrative cases on synthetic hierarchies with up to 32,768 leaves and on three natural-hierarchy datasets confirm the mechanism's operation under constructed and applied conditions.
0
0
cs.GT 2026-05-01

Authors voluntarily enter rejection lottery to improve reviews

Can We Volunteer Out of the Peer Review Crisis?

Game analysis shows a Nash equilibrium arises when scientists value the quality of the literature they read, shrinking reviewer workload for all.

Figure from the paper
abstract
The volume of scientific manuscripts is growing faster than the capacity to evaluate them, yet the institutions that govern peer review have remained largely unchanged. The result is a widening mismatch: reviewer scarcity, noisier assessments, and declining confidence in editorial decisions. Every scientist wants better reviews, but review quality depends on the total burden, which no single author can shift. To isolate this tension, we provide a game-theoretic thought experiment: a voluntary lottery in which authors accept a chance of random pre-review rejection, reducing reviewer burden and improving the quality of surviving evaluations. We show that a Nash equilibrium emerges in which authors voluntarily enter the lottery. Scientists who care about the literature they read, not just the papers they publish, will opt in, raising the quality of published science for all.
0
0
cs.GT 2026-05-01

Stable matchings optimize arbitrary institutional goals efficiently

Maximally Diverse Stable Matchings: Optimizing Arbitrary Institutional Objectives

Mapping set functions to edge weights lets algorithms minimize total or maximum objective values while preserving stability.

abstract
Stable matching theory is the foundation of centralized clearinghouses worldwide, from school choice programs to medical residency allocations. However, incorporating complex distributional goals-such as multi-dimensional diversity quotas or sibling co-assignment guarantees-often compromises stability or renders the problem computationally intractable. The existing literature typically addresses this tension by weakening stability to accommodate distributional constraints. In contrast, the reverse question remains largely unexplored: if we restrict attention to stable matchings, to what extent can such distributional objectives be achieved? In this paper, we resolve this tension by introducing a general, polynomial-time algorithmic framework to optimize arbitrary institutional (or even two-sided) objectives within the set of stable matchings. We prove that for any polynomial-time computable set functions $g_i$ evaluating the assigned students at institutions $i \in I$, a stable matching minimizing either the utilitarian objective $\sum_{i\in I} g_i$ or the egalitarian objective $\max_{i\in I} g_i$ can be found efficiently. Our approach leverages the structural properties of stable matchings, mapping arbitrary set functions to linear edge weights. We apply this theorem to efficiently solve major open practical problems: finding stable matchings that minimally violate overlapping diversity quotas (under both total and maximum violations) and maximizing the number of sibling families assigned to the same institution. Even when the distributional objective is prioritized, our algorithm helps to quantify the ``price of stability'', i.e., the gap between the maximally diverse matching and the maximally diverse stable matching.
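On toy instances the optimization question can be answered by enumeration: list all stable matchings and evaluate the objective on each. A brute-force sketch for small one-to-one markets; the paper's polynomial-time algorithm avoids enumeration by exploiting the structure of the set of stable matchings:

from itertools import permutations

def best_stable_matching(prefs_s, prefs_i, objective):
    # prefs_s[s]: student s's ranking of institutions; prefs_i[i]: institution
    # i's ranking of students. Enumerate all one-to-one matchings, keep the
    # stable ones, return one minimizing objective(mu), where mu[s] is the
    # institution of student s. Brute force, for tiny markets only.
    n = len(prefs_s)
    rank_i = [{s: r for r, s in enumerate(p)} for p in prefs_i]
    def stable(mu):
        holder = {mu[s]: s for s in range(n)}
        for s in range(n):
            for i in prefs_s[s]:
                if i == mu[s]:
                    break                 # no institution s prefers blocks
                if rank_i[i][s] < rank_i[i][holder[i]]:
                    return False          # (s, i) is a blocking pair
        return True
    best, best_val = None, float("inf")
    for mu in permutations(range(n)):
        if stable(mu) and objective(mu) < best_val:
            best, best_val = mu, objective(mu)
    return best, best_val

# objective can encode any set function on assigned students, e.g. the total
# violation of overlapping diversity quotas.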
0
0
cs.GT 2026-05-01

Approximate Fisher equilibria need PCP-for-PPAD conjecture for hardness

Fisher Markets with Approximately Optimal Bundles and the Need for a PCP Theorem for PPAD

The first natural problem whose constant-factor intractability is equivalent to proving the conjecture itself.

Figure from the paper
abstract
We study the problem of computing a competitive equilibrium with approximately optimal bundles in Fisher markets with separable piecewise-linear concave (SPLC) utility functions, meaning that every buyer receives a $(1-\delta)$-optimal bundle, instead of a perfectly optimal one. We establish the first intractability result for the problem by showing that it is PPAD-hard for some constant $\delta > 0$, assuming the PCP-for-PPAD conjecture. This hardness result holds even if all buyers have identical budgets (competitive equilibrium with equal incomes), linear capped utilities, and even if we also allow $\varepsilon$-approximate clearing instead of perfect clearing, for any constant $\varepsilon < 1/9$. Importantly, we show that the PCP-for-PPAD conjecture is in fact required to show hardness for constant $\delta$: showing PPAD-hardness for finding such approximate market equilibria in a broad class of markets encompassing those generated by our hardness result would prove the conjecture. This is the first natural problem where the conjecture is provably required to establish hardness for it.
0
0
cs.GT 2026-04-30

Randomized mechanisms guarantee 1/log n MMS ex-post

Truthful-in-Expectation Mechanisms for MMS Approximation

TIE mechanisms for indivisible goods deliver near-optimal fairness using only ordinal rankings and run in polynomial time.

Figure from the paper
abstract
We study fair allocation of indivisible goods among strategic agents with additive valuations. Motivated by impossibility results for deterministic truthful mechanisms, we focus on randomized mechanisms that are Truthful-in-Expectation (TIE). From a fairness perspective, we seek to guarantee every agent a large fraction of their Maximin Share (MMS) ex-post. Among other results, Bu and Tao [FOCS 2025] presented a TIE mechanism that guarantees $\frac{1}{n}$-MMS ex-post. First, we present an ordinal TIE mechanism that guarantees $\frac{1}{H_n + 2}$-MMS ex-post, where $H_n$ is the $n$-th harmonic number ($H_n \simeq \ln n$). This is nearly best possible for ordinal mechanisms, as even non-truthful ordinal allocation algorithms cannot obtain an approximation better than $\frac{1}{H_n}$. We then show that with just a small amount of additional cardinal information, the ex-post guarantee can be improved to $\Omega(\frac{1}{\log\log n})$-MMS, at the cost of relaxing the incentive requirement to $(1-\varepsilon(n))$-TIE for negligible $\varepsilon(n)$. Finally, for two agents, we present a TIE mechanism that is $\frac{2}{3}$-MMS ex-post. All our mechanisms are ex-ante proportional (thus also providing "Best-of-Both-Worlds" results) and run in polynomial time. Moreover, all our results extend to the truncated proportional share (TPS), which is at least as large as the MMS. Our two-agent $\frac{2}{3}$-TPS result is best possible for the TPS.
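For small instances the maximin share itself is easy to compute exactly, which is useful for sanity-checking guarantees like these. A brute-force sketch for additive valuations:

from itertools import product

def maximin_share(values, n):
    # MMS of an agent with additive values: the best, over all partitions of
    # the goods into n bundles, of the worst bundle. Exponential in the
    # number of goods, so only for small sanity checks.
    best = float("-inf")
    for assignment in product(range(n), repeat=len(values)):
        bundle_vals = [0.0] * n
        for g, bundle in enumerate(assignment):
            bundle_vals[bundle] += values[g]
        best = max(best, min(bundle_vals))
    return best

print(maximin_share([8, 7, 6, 5, 4, 3, 2, 1], 3))  # 12.0: {8,4}, {7,5}, {6,3,2,1}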
0
0
cs.GT 2026-04-30

LLMs compute Nash equilibria internally but suppress them

What Suppresses Nash Equilibrium Play in Large Language Models? Mechanistic Evidence and Causal Control

Mechanistic analysis shows early encoding of opponent moves and late prosocial override that causal edits can reverse.

Figure from the paper
abstract
LLM agents are known to deviate from Nash equilibria in strategic interactions, but nobody has looked inside the model to understand why, or asked whether the deviation can be reversed. We do both. Working with four open-source models (Llama-3 and Qwen2.5, 8B to 72B parameters) playing four canonical two-player games, we establish the behavioral picture through self-play and cross-play experiments, then open up the 32-layer Llama-3-8B model and examine what actually happens during a strategic decision. The mechanistic findings are clear. Opponent history is encoded with near-perfect fidelity at the first layer (96% probe accuracy) and consumed progressively, while Nash action encoding is weak throughout, never exceeding 56%. There is no dedicated Nash module. Instead, the model privately favors the Nash action through most of its forward pass, but a prosocial override rooted in pretraining on human text concentrated in the final layers reverses this, reaching 84% probability of cooperation at layer 30. Injecting a learned Nash direction into the residual stream shifts behavior bidirectionally and causally, confirmed through concept clamping. The behavioral experiments surface six scale- and architecture-dependent findings, the most notable being that chain-of-thought reasoning worsens Nash play in small models but achieves near-perfect Nash play above 70B parameters. The cross-play experiments reveal three phenomena invisible in self-play: a small model can unravel any partner's cooperation by defecting early; two large models reinforce each other's cooperative instincts indefinitely; and who moves first determines which Nash equilibrium the system reaches. LLMs do not lack Nash-playing competence. They compute it, then suppress it.
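The causal intervention described here is a residual-stream edit. A PyTorch-style sketch using a forward hook; the layer index, tensor layout, and the learned nash_direction vector are all assumptions, since model internals vary:

import torch

def add_steering_hook(block, direction, scale):
    # Shift the residual stream along a learned direction at one layer.
    # Layout (batch, seq, hidden) and tuple-shaped block outputs are
    # assumptions; adjust for the model at hand.
    d = direction / direction.norm()
    def hook(module, inputs, output):
        hidden = output[0] if isinstance(output, tuple) else output
        steered = hidden + scale * d.to(hidden.dtype)
        if isinstance(output, tuple):
            return (steered,) + tuple(output[1:])
        return steered
    return block.register_forward_hook(hook)

# Hypothetical usage with a Llama-style module tree and a probe-derived vector:
# handle = add_steering_hook(model.model.layers[24], nash_direction, scale=4.0)
# ...generate and measure cooperation vs. defection rates...
# handle.remove()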
cs.GT 2026-04-30

Category labels alone bound welfare gaps and cut misreporting

MISES: Minimal Information Sufficiency for Effective Service

Demand-derived categories achieve tight coordination bounds while aggregate metrics outperform full per-agent data under homogeneity.

Category-based coordination mechanisms allocate resources by mapping a declared service category to a fixed resource profile, without observing individual demand types. We establish three results for this class of mechanisms. First, the relative welfare gap $\Delta$ satisfies a tight two-sided bound in terms of the aggregate within-category allocation variance $\varepsilon$: $(\alpha/2W^*)\varepsilon \le \Delta \le (\beta/2W^*)\varepsilon$. Second, the expected misreporting gain is bounded by the same $\varepsilon$ without assumptions on agent strategy; demand-derived categories minimise both welfare loss and misreporting incentive simultaneously. Third, aggregate outcome metrics strictly dominate per-agent metrics for service-level detection under a homogeneity condition, for all parameter values, with a finite-sample power gap of $O(1/m)$. At any fixed $K$, the demand-derived category label is the sufficient statistic for coordination: collecting per-agent data beyond the category label adds noise to the detection problem without reducing the welfare gap. However, welfare and detection impose structurally opposed demands on $K$: welfare improves with finer categories, detection worsens. The designer faces a feasibility band $[K_{\min}, K_{\max}]$ and must choose $K$ within it as a value judgement. We claim that any protocol achieving welfare gap $\Delta \le \varepsilon^*$ and missed-detection rate $\le \beta^*$ requires at least $H_{\mathrm{lb}}(\varepsilon^*, \beta^*)$ bits of category entropy. We illustrate the mechanism on a synthetic population of 50,000 demand vectors and five weeks of production performance-management data from four anonymised operator networks (28,249 cells).
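
As a worked instance of the two-sided bound (the parameter values below are invented, since the abstract reports none): with variance $\varepsilon = 0.4$, optimum welfare $W^* = 100$, $\alpha = 1$ and $\beta = 3$, the welfare gap $\Delta$ must lie between 0.002 and 0.006.

def welfare_gap_bounds(alpha, beta, eps, w_star):
    # Tight two-sided bound: (alpha/2W*) * eps <= Delta <= (beta/2W*) * eps.
    return (alpha / (2 * w_star)) * eps, (beta / (2 * w_star)) * eps

print(welfare_gap_bounds(alpha=1.0, beta=3.0, eps=0.4, w_star=100.0))
# approximately (0.002, 0.006)
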
cs.GT 2026-04-29

Credit beyond full collateral works in decentralized micropayments

Credit Limits beyond Full Collateralization in Decentralized Micropayments: Incentive Conditions

Repeated buyer-merchant games with verifiable settlements use continuation value to prevent default and reduce locked capital.

In decentralized non-custodial micropayments, the central challenge is not whether payments can be executed directly, but under what conditions such systems can offer credit limits without requiring full collateral backing. Existing approaches typically tie available credit to posted collateral, causing liquidity requirements to scale with transaction volume and settlement exposure and limiting the practical usefulness of credit-based micropayments. This paper characterizes the incentive conditions under which credit-based non-custodial micropayments can operate beyond full collateralization while remaining incentive compatible. We model repeated buyer--merchant interactions under public monitoring and identify the roles of bounded exposure, verifiable settlement outcomes, and continuation value in deterring strategic default under non-custodial execution. The resulting characterization clarifies the trade-off between capital efficiency and the enforcement conditions required to sustain under-collateralized credit expansion without custodial trust. As an illustrative application-layer instantiation, an Arbitrum Nitro prototype provides execution-level evidence that the settlement, commitment, and incentive-enforcement paths of a credit-limit-based design can be realized with low on-chain overhead.
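
The continuation-value argument is the standard repeated-game one: a buyer repays as long as the one-shot gain from defaulting on the credit line is smaller than the discounted value of staying in the relationship. A minimal sketch of that incentive bound (the stationary per-period surplus model is an assumption for illustration, not the paper's full characterization):

def max_sustainable_credit(per_period_surplus, delta):
    # Defaulting yields the outstanding credit L once; repaying preserves a
    # continuation value of delta/(1-delta) * surplus. Credit is incentive
    # compatible whenever L stays below that continuation value.
    return delta / (1 - delta) * per_period_surplus

# A buyer earning 5 units of surplus per period at discount factor 0.95 can
# sustain a credit limit of about 95 -- far beyond any posted collateral:
print(max_sustainable_credit(5.0, 0.95))  # about 95
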
cs.GT 2026-04-29

Delay-averse agents stabilize equilibria in time-dependent job scheduling

Job-Scheduling Games with Time-Dependent Processing Times

Pure Nash equilibria exist and are easy to find when agents dislike delay, but with non-delay-averse jobs, deciding equilibrium existence is NP-complete even on identical machines.

Job-scheduling games have traditionally assumed fixed processing times. However, in many realistic environments, ranging from cyber-security response to high-frequency trading, a task's duration depends on its starting time. We study job-scheduling games with time-dependent processing times, where job lengths are linear functions of their start times, exhibiting either positive deterioration (increasing length) or negative deterioration (decreasing length). We analyze these games under various coordination mechanisms and priority policies. By introducing the concept of delay-averse agents, we provide a unifying framework to characterize equilibrium existence. For delay-averse jobs, we show that stability is maintained and pure Nash equilibria (NE) can be computed efficiently. In contrast, for non-delay-averse jobs, we demonstrate that a NE may not exist, and prove that deciding its existence is NP-complete, even on identical machines - a fundamental departure from classical coordination mechanisms. Regarding equilibrium inefficiency, we show that the Price of Anarchy (PoA) can be significantly higher than in environments with fixed processing times. To mitigate this, we propose and analyze three coordination mechanisms: SBPT (Shortest Basic Processing Time), which reduces the PoA in games with positive deterioration to a constant, and SDR (Smallest Deterioration Rate) and LBDR (Largest Basic-Deterioration Ratio) for negative deterioration, which achieve tight constant PoA bounds of $2$ and $\max\{\frac{e}{e-1}, 2-\frac{1}{m}\}$, respectively. Our results bridge the gap between centralized time-dependent scheduling and decentralized game-theoretic analysis.
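
With linear deterioration, a job starting at time t runs for a + b*t, so the cost of any deviation depends on the whole prefix of the schedule, which is what breaks the classical analysis. A small sketch rolling a schedule forward, illustrating why ordering by basic processing time (the priority behind the SBPT mechanism) matters:

def completion_times(jobs):
    # jobs: list of (a, b); a job starting at time t runs for a + b*t.
    t = 0.0
    finished = []
    for a, b in jobs:
        t = t + a + b * t
        finished.append(t)
    return finished

# The same two positively deteriorating jobs in two orders: the schedule
# that runs the shorter basic time first finishes earlier.
print(completion_times([(2, 0.5), (4, 0.5)]))  # [2.0, 7.0]
print(completion_times([(4, 0.5), (2, 0.5)]))  # [4.0, 8.0]
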
cs.GT 2026-04-29

Game model computes optimal audits against coordinated agents

Optimally Auditing Adversarial Agents

Efficient algorithms find policies that minimize principal losses when agents choose worst-case reports in adaptive or budgeted settings.

Fraud can pose a challenge in many resource allocation domains, including social service delivery and credit provision. For example, agents may misreport private information in order to gain benefits or access to credit. To mitigate this, a principal can design strategic audits to verify claims and penalize misreporting. In this paper, we introduce a general model of audit policy design as a principal-agent game with multiple agents, where the principal commits to an audit policy, and agents collectively choose an equilibrium that minimizes the principal's utility. We examine both adaptive and non-adaptive settings, depending on whether the principal's policy can be responsive to the distribution of agent reports. Our work provides efficient algorithms for computing optimal audit policies in both settings and extends these results to a setting with limited audit budgets.
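
A common baseline behind such models: if misreporting gains agent i an amount g_i and a detected misreport costs penalty p_i, then any audit probability x_i with x_i * p_i >= g_i deters the misreport. The cheapest deterring policy is then immediate; this simplified deterrence rule is an illustrative assumption, not the paper's equilibrium computation:

def min_deterring_audit_probs(gains, penalties):
    # Smallest audit probability that deters each agent: x_i = g_i / p_i,
    # capped at 1 (agents whose gain exceeds the penalty are always audited).
    return [min(1.0, g / p) for g, p in zip(gains, penalties)]

probs = min_deterring_audit_probs(gains=[2.0, 5.0, 1.0],
                                  penalties=[10.0, 4.0, 8.0])
print(probs)       # [0.2, 1.0, 0.125]
print(sum(probs))  # 1.325 expected audits -- compare against the budget
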
cs.GT 2026-04-28

Linear program finds equilibrium of deceptive resource game

Asymmetric-Information Resource Allocation Games: An LP Approach to Purposeful Deception

Defender's optimal mix of real and decoy allocations emerges directly from solving one LP for the Perfect Bayesian Nash Equilibrium.

In this work, we introduce the Deceptive Resource Allocation Game (DRAG), which studies purposeful deception within a Bayesian game framework. In DRAG, a Defender allocates resources across the true asset and several decoys to influence an Attacker's beliefs and actions, with the goal of diverting the Attacker away from the true asset. We seek to characterize purposeful deception, whereby the Defender deceives only when doing so improves its performance. To this end, we solve for the Perfect Bayesian Nash Equilibrium (PBNE) of the corresponding game. We show that, despite the coupled belief-policy interdependence, the problem admits an efficient, non-iterative linear programming formulation. Numerical results demonstrate that the resulting policies naturally balance effective allocation and belief manipulation, giving rise to purposeful and emergent deceptive behaviors.
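
The allocation core of such games is LP-friendly because, once beliefs are fixed, the Defender faces a matrix game. A sketch solving the zero-sum hide-and-seek layer with scipy (the paper's single LP additionally encodes belief consistency for the PBNE; this shows only the standard max-min LP):

import numpy as np
from scipy.optimize import linprog

def solve_zero_sum(A):
    # Row player maximizes the value v of payoff matrix A:
    # max v  s.t.  (A^T x)_j >= v for all columns j,  sum(x) = 1,  x >= 0.
    m, n = A.shape
    c = np.zeros(m + 1)
    c[-1] = -1.0                                  # linprog minimizes -v
    A_ub = np.hstack([-A.T, np.ones((n, 1))])     # v - (A^T x)_j <= 0
    A_eq = np.hstack([np.ones((1, m)), np.zeros((1, 1))])
    res = linprog(c, A_ub=A_ub, b_ub=np.zeros(n), A_eq=A_eq, b_eq=[1.0],
                  bounds=[(0, None)] * m + [(None, None)])
    return res.x[:m], res.x[-1]

# Defender hides an asset at one of three sites, earning 1 if the Attacker
# strikes elsewhere; the optimal mix is uniform with value 2/3.
x, v = solve_zero_sum(np.ones((3, 3)) - np.eye(3))
print(np.round(x, 3), round(v, 3))  # [0.333 0.333 0.333] 0.667
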
cs.GT 2026-04-28

No-swap-regret often loses to no-regret due to N-times slower learning

Hierarchies of No-regret Algorithms

The stronger guarantee produces lower player utility in many games because it updates strategies less frequently than standard no-regret.

Our paper studies the setting of players using no-regret algorithms in various two-player games. We address whether having stronger regret guarantees or playing against an opponent with weaker regret guarantees yields higher utilities for the player in question. We consider a hierarchy of algorithms from weakest to strongest: uniform random play, no-regret, and no-swap-regret. We find, counterintuitively, that in many games, no-swap-regret is a worse choice for players (and gives better utility for their opponents). We find the root cause of this phenomenon to be a difference in effective learning rate between the two algorithms, where the no-swap-regret algorithms learn $N$ times slower than no-regret algorithms. To address this, we attempt to equalize learning rates, leading to closer utility between no-regret and no-swap-regret players. Finally, we show that for certain random games with $7$ actions per player, no-swap-regret algorithms can perform noticeably better than no-regret algorithms in a manner that cannot be explained away by unfairly adjusted learning rates.
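
The effective-learning-rate effect is easiest to see in the multiplicative-weights (Hedge) family, where the step size eta directly controls how fast play concentrates on good actions. A self-contained Hedge sketch against a fixed opponent (the game and values are illustrative; the N-times-slower comparison is the paper's):

import numpy as np

def hedge_average_play(payoff_of_each_action, n_actions, rounds, eta):
    # Multiplicative weights: play the normalized weight distribution,
    # then scale each weight by exp(eta * that action's payoff).
    w = np.ones(n_actions)
    plays = []
    for _ in range(rounds):
        p = w / w.sum()
        plays.append(p)
        w = w * np.exp(eta * payoff_of_each_action(p))
    return np.mean(plays, axis=0)

# Matching pennies against a fixed 70/30 opponent: action 0 is the best
# response, and a larger eta concentrates on it much sooner.
opponent = np.array([0.7, 0.3])
payoffs = lambda p: np.array([opponent[0] - opponent[1],
                              opponent[1] - opponent[0]])
for eta in (0.05, 0.5):
    print(eta, np.round(hedge_average_play(payoffs, 2, 200, eta), 3))
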
cs.GT 2026-04-28

Strongly polynomial algorithm computes Arctic Auction equilibria

A Strongly Polynomial Algorithm for Arctic Auctions

The method adapts Orlin's scaling procedure to quasi-linear utilities, allowing fast repeated runs for parameter tuning in asset exchanges.

Our main contribution is a strongly polynomial algorithm for computing an equilibrium for the Arctic Auction, which is the quasi-linear extension of the linear Fisher market model. We build directly on Orlin's strongly polynomial algorithm for the linear Fisher market (Orlin, 2010). The first combinatorial polynomial algorithm for the linear Fisher market was based on the primal-dual paradigm (Devanur et al., 2008). This was followed by Orlin's scaling-based algorithms. The Arctic Auction (Klemperer 2018) was developed for the Government of Iceland to allow individuals to exchange blocked offshore assets. It is a variant of the product-mix auction (Klemperer 2008, 2010, 2018) that was designed for, and used by, the Bank of England, to allocate liquidity efficiently across banks pledging heterogeneous collateral of varying quality. Our work was motivated by the fact that banks often need to run Arctic Auctions under many different settings of the parameters in order to home in on the right one, making it essential to find a time-efficient algorithm for the Arctic Auction.
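
For intuition about the underlying linear Fisher market, a few lines of proportional-response dynamics (each buyer re-splits its budget in proportion to the utility earned from each good) already converge to equilibrium prices. This is a simple known heuristic for the linear case, not Orlin's strongly polynomial algorithm or its quasi-linear Arctic extension:

import numpy as np

def fisher_prices(U, budgets, iters=2000):
    # Linear Fisher market with unit supplies: U[i, j] is buyer i's value
    # per unit of good j. Proportional response: spend next round in
    # proportion to the utility earned from each good this round.
    n, m = U.shape
    bids = np.outer(budgets, np.ones(m) / m)
    for _ in range(iters):
        prices = bids.sum(axis=0)              # price = total spend on good
        utility = U * (bids / prices)          # u_ij = U_ij * x_ij
        bids = budgets[:, None] * utility / utility.sum(axis=1, keepdims=True)
    return bids.sum(axis=0)

# Buyer 0 wants only good 0; buyer 1 values both goods equally.
U = np.array([[1.0, 0.0],
              [1.0, 1.0]])
print(np.round(fisher_prices(U, np.array([1.0, 1.0])), 3))  # [1. 1.]
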
cs.GT 2026-04-28

Linear contracts beat any other under distributional ambiguity

Distributional Robustness of Linear Contracts

When only expected signals per effort are known, a linear contract matches or exceeds every alternative in the worst case over all consistent distributions.

Linear contracts are ubiquitous in practice, yet optimal contract theory often prescribes complex, nonlinear structures. We provide a distributional robustness justification for linear contracts. We study a principal-agent problem where the agent exerts costly effort across multiple tasks, generating a stochastic signal upon which the principal conditions payment. The principal faces distributional ambiguity: she knows the expected signal for each effort level, but not the full distribution. She seeks a contract maximizing her worst-case payoff over all distributions consistent with this partial knowledge. Our main result shows that linear contracts are optimal for such a principal. For any contract, there exists a linear contract achieving weakly higher worst-case payoff. The proof introduces the concavification approach built around the notion of self-inducing actions; these are actions where an affine contract simultaneously induces the action as optimal and supports the concave envelope of payments from above. We show that self-inducing actions always exist as maximizers of the gap between the concave envelope and agent's cost function. We extend these results to multi-party settings. In common agency with multiple principals, we show that affine contracts improve all principals' worst-case payoffs. In team production with multiple agents, we establish a complementary necessity result: if any agent's contract is non-affine, the unique ex-post robust equilibrium is zero effort. Finally, we show that homogeneous utility and cost functions yield tractable characterizations, enabling closed-form approximation ratios and a sharp boundary between computational tractability results.
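
The basic mechanics: a linear contract pays the agent a share alpha of the realized signal, the agent best-responds by maximizing alpha * E[signal | effort] - cost(effort), and the principal keeps the (1 - alpha) residual of the induced expectation. A toy sketch of this structure (the effort grid and costs below are invented):

def principal_payoff(alpha, efforts):
    # efforts: list of (expected_signal, cost). The agent picks the effort
    # maximizing alpha * expected_signal - cost; the principal keeps the rest.
    chosen = max(efforts, key=lambda e: alpha * e[0] - e[1])
    return (1 - alpha) * chosen[0]

efforts = [(0.0, 0.0), (4.0, 1.0), (10.0, 5.0)]
for alpha in (0.2, 0.5, 0.7):
    print(alpha, principal_payoff(alpha, efforts))
# 0.2 induces no effort; 0.5 induces the middle effort (payoff 2.0);
# 0.7 induces the top effort (payoff about 3.0).
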
cs.GT 2026-04-28

Correlated equilibria verification is P-complete in reachability games

Verification of Correlated Equilibria in Concurrent Reachability Games

The broader concept is P-complete to verify, while its subgame-perfect refinement fits in log-squared space; the gap disappears when inputs are given succinctly via Bayesian networks.

As part of an effort to apply the rigorous guarantees of formal verification to multi-agent systems, the field of equilibrium analysis, also called rational verification, studies equilibria in multiplayer games to reason about system-level properties such as safety and scalability. While most prior work focuses on deterministic settings, recent probabilistic extensions enable the use of richer equilibrium concepts. In this paper, we study one such equilibrium concept -- correlated equilibria -- and introduce a natural refinement -- subgame-perfect correlated equilibria -- in the context of the verification problem. We characterize the computational complexity of verifying such equilibria and show a somewhat surprising separation (under standard complexity-theoretic assumptions): despite being more general, correlated equilibria yield a strictly harder P-complete verification problem than the subgame-perfect correlated equilibria verification problem, which can be solved in log-squared-space. We further analyze the setting where inputs are given succinctly via Bayesian networks, as the study of succinct representations is an important direction to connect static complexity-theoretic analysis to real-world program representations, and show that this complexity gap disappears under such representations.
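
For a one-shot normal-form game, verifying a correlated equilibrium is a finite set of linear swap inequalities: conditional on each recommended action, no deviation may raise a player's expected payoff. A numpy sketch for two players (the paper's setting, concurrent reachability games with a subgame-perfect refinement, is much richer):

import numpy as np

def is_correlated_eq(U1, U2, mu, tol=1e-9):
    # mu[a, b]: probability the mediator recommends row a to player 1 and
    # column b to player 2. Check every (recommendation, deviation) pair.
    n1, n2 = mu.shape
    for a in range(n1):
        for d in range(n1):
            if np.dot(mu[a], U1[d] - U1[a]) > tol:   # player 1 gains by d
                return False
    for b in range(n2):
        for d in range(n2):
            if np.dot(mu[:, b], U2[:, d] - U2[:, b]) > tol:
                return False
    return True

# Chicken with actions (Dare, Swerve): the uniform lottery over (D,S),
# (S,D) and (S,S) is the classic correlated equilibrium.
U1 = np.array([[0.0, 7.0], [2.0, 6.0]])
U2 = np.array([[0.0, 2.0], [7.0, 6.0]])
mu = np.array([[0.0, 1/3], [1/3, 1/3]])
print(is_correlated_eq(U1, U2, mu))  # True
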
cs.GT 2026-04-28

Independent signals enable near-full surplus extraction in second-price auctions

Private Private Information in Second-Price Auction

A seller designs ex-ante independent bidder signals with correlated valuation priors to reach revenue close to total welfare with a strict equilibrium.

Classic results show that even an arbitrarily small correlation across bidders' information can enable full surplus extraction in auctions and related mechanism design settings. Motivated by this fragility, we study the information independence in a second-price auction when the seller commits to a private private information structure, meaning bidders' signals are independent ex ante, while bidders share a symmetric and arbitrarily correlated prior distribution over their valuations. We first show that the seller optimal efficient outcome with full surplus extraction can always be implemented by a private private information structure that admits a Bayes Nash equilibrium. However, this equilibrium may not be stable. We then further construct a private private information structure that achieves revenue arbitrarily close to maximum welfare while admitting a strict equilibrium. At the same time, we establish an impossibility result: under private private information, in general, bidder surplus cannot achieve maximal welfare exactly, and we characterize necessary and sufficient conditions on the prior distribution under which bidder surplus can be made arbitrarily close to maximal welfare. We then explore which other efficient outcomes are achievable under private private information. Finally, moving beyond private private information, we provide a complete characterization of the achievable pairs (bidder surplus, seller revenue) under general information structures.
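
For scale on the gap that such information design closes: with two i.i.d. uniform bidders bidding truthfully, a plain second-price auction earns the expected second-highest value (1/3) while welfare is the expected highest value (2/3); the paper's engineered signal structures push revenue toward the welfare line. A quick simulation of that baseline:

import numpy as np

rng = np.random.default_rng(0)
values = rng.uniform(0.0, 1.0, size=(100_000, 2))  # two i.i.d. bidders

welfare = values.max(axis=1).mean()   # winner's value, E[max] = 2/3
revenue = values.min(axis=1).mean()   # second-highest bid, E[min] = 1/3
print(round(welfare, 3), round(revenue, 3))  # about 0.667 vs 0.333
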

browse all of cs.GT → full archive · search · sub-categories