Pith · machine review for the scientific record

arXiv: 2604.15373 · v1 · submitted 2026-04-15 · 💻 cs.MA · cs.AI · cs.IT · cs.LG · math.IT


InfoChess: A Game of Adversarial Inference and a Laboratory for Quantifiable Information Control


Pith reviewed 2026-05-10 11:59 UTC · model grok-4.3

classification 💻 cs.MA · cs.AI · cs.IT · cs.LG · math.IT
keywords InfoChess · adversarial game · information inference · partial observability · reinforcement learning · belief entropy · multi-agent systems · uncertainty quantification

The pith

InfoChess is a game where the only way to score is by accurately guessing the opponent's hidden king position through controlled visibility.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes InfoChess, a board game designed so that players must compete on the basis of information alone. Pieces move solely to change what areas of the board each player can see, and the winner is determined by how well each maintains a probabilistic estimate of where the opponent's king is located at every step. A reinforcement learning agent trained on the game beats a series of heuristic opponents that use progressively more detailed models of the other player's behavior. The authors also introduce measures such as belief entropy and predictive log score to separate different sources of uncertainty in the players' knowledge. This setup creates a clean laboratory for examining how agents gather and act on information when they cannot see the full state of the world.
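The two measures named above have standard information-theoretic forms; a minimal sketch, assuming a player's belief is a discrete distribution over board squares (function names are illustrative, not the paper's API):

```python
import numpy as np

def belief_entropy(belief):
    """Shannon entropy (bits) of a belief distribution over king squares:
    high when the player is uncertain, zero when the king is pinned down."""
    p = belief[belief > 0]
    return float(-(p * np.log2(p)).sum())

def predictive_log_score(belief, true_square):
    """Log score log2 p(true square): a strictly proper scoring rule,
    maximized in expectation only by reporting the true distribution."""
    return float(np.log2(belief[true_square]))

# A uniform belief over an 8x8 board carries 6 bits of uncertainty
# and scores log2(1/64) = -6 against any true square.
uniform = np.full((8, 8), 1 / 64)
```

Separating the two matters: entropy measures how uncertain a player *is*, while the log score measures how *right* that uncertainty turns out to be against the king's actual position.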

Core claim

We propose InfoChess, a symmetric adversarial game that elevates competitive information acquisition to the primary objective. There is no piece capture, removing material incentives that would otherwise confound the role of information. Instead, pieces are used to alter visibility. Players are scored on their probabilistic inference of the opponent's king location over the duration of the game. To explore the space of strategies for playing InfoChess, we introduce a hierarchy of heuristic agents defined by increasing levels of opponent modeling, and train a reinforcement learning agent that outperforms these baselines. Leveraging the discrete structure of the game, we analyze gameplay through natural information-theoretic characterizations that include belief entropy, oracle cross entropy, and predictive log score under the action-induced observation channel.

What carries the argument

The action-induced observation channel, which determines how each move changes the visible board and thereby updates each player's belief distribution over the opponent's king location.
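A belief update under such a channel is ordinary Bayesian filtering; a minimal sketch, assuming a grid board where each turn either reveals the king on a visible square or rules those squares out (all names and the simplified observation model are hypothetical, not the paper's implementation):

```python
import numpy as np

def update_belief(belief, visible, king_seen_at=None):
    """One Bayesian update of a belief over the opponent's king location.

    belief: (n, n) array summing to 1, prior over king squares.
    visible: (n, n) boolean mask of squares the player currently sees.
    king_seen_at: (row, col) if the king is observed, else None.
    Assumes at least one unseen square remains when the king is not observed.
    """
    posterior = belief.copy()
    if king_seen_at is not None:
        # King observed: the belief collapses onto the observed square.
        posterior[:] = 0.0
        posterior[king_seen_at] = 1.0
    else:
        # King not observed: visible squares are ruled out; renormalize
        # the remaining mass over the unseen squares.
        posterior[visible] = 0.0
        posterior /= posterior.sum()
    return posterior

belief = np.full((8, 8), 1 / 64)        # uniform prior over an 8x8 board
visible = np.zeros((8, 8), dtype=bool)
visible[:2, :] = True                    # suppose the top two ranks are seen
belief = update_belief(belief, visible)  # king not seen there
```

The strategic content of the game then lives in choosing moves that shape `visible` for both players: widening one's own mask while keeping the opponent's posterior flat.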

Load-bearing premise

The game mechanics of using pieces solely to alter visibility, combined with the inference-based scoring, successfully isolate information control as the dominant strategic element without introducing unaccounted confounding factors from board geometry or movement rules.

What would settle it

Run many games and check whether agents that ignore king inference, instead optimizing for other visible patterns, still achieve high scores; if they do not, the scoring truly isolates information control.

Figures

Figures reproduced from arXiv: 2604.15373 by Kieran A. Murphy.

Figure 1. The mechanics of InfoChess. (a) The starting configuration on an … [caption truncated at source]
Figure 2. Pairwise score distributions. Each scatter plot dis… [caption truncated at source]
Figure 3. Per-turn characterization. The curves show the mean of the quantity, averaged per turn per agent over 250 games per … [caption truncated at source]
Figure 4. Non-king movement by agent. For each agent in … [caption truncated at source]
read the original abstract

We propose InfoChess, a symmetric adversarial game that elevates competitive information acquisition to the primary objective. There is no piece capture, removing material incentives that would otherwise confound the role of information. Instead, pieces are used to alter visibility. Players are scored on their probabilistic inference of the opponent's king location over the duration of the game. To explore the space of strategies for playing InfoChess, we introduce a hierarchy of heuristic agents defined by increasing levels of opponent modeling, and train a reinforcement learning agent that outperforms these baselines. Leveraging the discrete structure of the game, we analyze gameplay through natural information-theoretic characterizations that include belief entropy, oracle cross entropy, and predictive log score under the action-induced observation channel. These measures disentangle epistemic uncertainty, calibration mismatch, and uncertainty induced by adversarial movement. The design of InfoChess renders it a testbed for studying multi-agent inference under partial observability. We release code for the environment and agents, and a public interface to encourage further study.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, and this is the friction.

Referee Report

1 major / 3 minor

Summary. The paper proposes InfoChess, a symmetric adversarial game that prioritizes competitive information acquisition by eliminating piece capture and using pieces exclusively to modify visibility on a discrete board. Players receive scores based on the accuracy of their probabilistic inferences about the opponent's king location throughout the game. A hierarchy of heuristic agents with progressively deeper opponent modeling is defined, an RL agent is trained and shown to outperform these baselines, and information-theoretic metrics (belief entropy, oracle cross-entropy, and predictive log score under the action-induced observation channel) are introduced to disentangle epistemic uncertainty, calibration mismatch, and uncertainty arising from adversarial movement. The game is presented as a testbed for multi-agent inference under partial observability, with code and a public interface released.

Significance. If the isolation of information control holds and the metrics prove robust, InfoChess supplies a clean, quantifiable laboratory for studying adversarial inference and information theory in partially observable multi-agent systems. The explicit release of the environment, agents, and interface is a strength that supports reproducibility and community follow-up work. The RL outperformance result, if statistically solid, illustrates that learned strategies can exploit the information-centric objective beyond hand-crafted heuristics.

major comments (1)
  1. [Section 2 (Game Design)] The central claim that InfoChess 'elevates competitive information acquisition to the primary objective' with 'no piece capture, removing material incentives' and pieces used 'solely to alter visibility' requires that board geometry and movement rules introduce no independent positional or visibility biases. The manuscript provides no ablation, symmetry analysis, or control experiment (e.g., fixed vs. randomized starting configurations or comparison across board sizes) to demonstrate separability of geometric effects from inference strategy. This is load-bearing for both the reported RL outperformance and the assertion that the three metrics cleanly disentangle the listed uncertainty sources.
minor comments (3)
  1. [Abstract and Section 4] The abstract and Section 4 (Analysis) would benefit from explicit equations or pseudocode for the three information-theoretic measures to make the disentanglement claim immediately verifiable.
  2. [Section 3 (Agents and Training)] The description of the RL training procedure (hyperparameters, observation space, reward shaping) is referenced but lacks sufficient detail for exact reproduction even with the released code.
  3. [Figures in Section 4] Figure captions and axis labels in the gameplay analysis plots should explicitly state the number of episodes or trials averaged and any error bars used.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for the careful review and for highlighting the importance of validating that geometric factors do not confound the information-control objective in InfoChess. We address the single major comment below and commit to strengthening the manuscript accordingly.

read point-by-point responses
  1. Referee: Section 2 (Game Design): The central claim that InfoChess 'elevates competitive information acquisition to the primary objective' with 'no piece capture, removing material incentives' and pieces used 'solely to alter visibility' requires that board geometry and movement rules introduce no independent positional or visibility biases. The manuscript provides no ablation, symmetry analysis, or control experiment (e.g., fixed vs. randomized starting configurations or comparison across board sizes) to demonstrate separability of geometric effects from inference strategy. This is load-bearing for both the reported RL outperformance and the assertion that the three metrics cleanly disentangle the listed uncertainty sources.

    Authors: We agree that explicit verification of separability strengthens the central claims. InfoChess is constructed with complete symmetry: identical movement and visibility rules for both players, a uniform square grid, and randomized initial king placements that eliminate fixed positional advantages. Because every agent (heuristic and RL) experiences the identical geometry and observation model, performance differentials arise from differences in inference and opponent modeling rather than from board layout. The three metrics are defined relative to the action-induced observation channel, which directly encodes how each move alters visibility; belief entropy captures epistemic uncertainty in the current belief, oracle cross-entropy measures calibration to ground truth, and the predictive log score isolates the residual uncertainty attributable to adversarial movement. Nevertheless, the current manuscript does not contain the requested ablations. In the revision we will add: (i) a symmetry check demonstrating statistically equivalent results when player roles are exchanged, (ii) performance comparisons across board sizes (e.g., 6×6 and 8×8), and (iii) controlled experiments contrasting fixed versus randomized starting configurations. These additions will quantify any residual geometric contribution and confirm that the reported RL gains and metric interpretations remain valid. revision: yes
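The three measures the rebuttal describes verbally have standard information-theoretic forms. In conventional notation (the paper's exact definitions may differ), with $b_t$ the player's belief over king squares, $q_t$ the ground-truth distribution induced by the opponent's actual play, and $x_t$ the realized king square:

```latex
% Hedged sketch in standard notation; not the paper's verbatim definitions.
H(b_t)      = -\sum_{x} b_t(x) \log b_t(x)   % belief entropy: epistemic uncertainty
H(q_t, b_t) = -\sum_{x} q_t(x) \log b_t(x)   % oracle cross-entropy: calibration to ground truth
s_t         = \log b_t(x_t)                  % predictive log score at the realized square
```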

Circularity Check

0 steps flagged

No significant circularity detected in derivation chain

full rationale

The paper's central contribution is the definition of InfoChess as a new symmetric game with explicit mechanics (no capture, visibility alteration via pieces, scoring via probabilistic king inference). Heuristic agents are constructed by increasing opponent-modeling levels, an RL agent is trained to outperform them, and information-theoretic measures (belief entropy, oracle cross-entropy, predictive log score) are applied directly to the game's observation channel. None of these steps reduces, under the paper's own equations or definitions, to fitted parameters renamed as predictions, to self-referential loops, or to load-bearing self-citations. The analysis is a direct, non-circular application of standard information theory to the newly defined discrete state space, so the derivation stands on its own rather than leaning on external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 1 invented entity

The paper's central claim rests on defining a novel game and applying standard reinforcement learning plus information theory; no free parameters are introduced in the abstract, and the only invented element is the game itself.

axioms (2)
  • domain assumption: Chess-like discrete board and movement rules provide a suitable structure for partial observability.
    Invoked implicitly when describing visibility changes and king inference on the board.
  • domain assumption: Reinforcement learning can successfully train agents to optimize inference scores in this environment.
    Required for the claim that the RL agent outperforms the heuristic hierarchy.
invented entities (1)
  • InfoChess game (no independent evidence)
    purpose: Laboratory for studying adversarial information acquisition and inference under partial observability.
    Newly defined construct whose properties are asserted to enable the described analysis.

pith-pipeline@v0.9.0 · 5474 in / 1457 out tokens · 51097 ms · 2026-05-10T11:59:44.507166+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

18 extracted references · 4 canonical work pages · 1 internal anchor

  1. [1]

    Bowen Baker, Ingmar Kanitscheider, Todor Markov, Yi Wu, Glenn Powell, Bob McGrew, and Igor Mordatch. 2020. Emergent tool use from multi-agent autocurricula. In International Conference on Learning Representations.

  2. [2]

    Nolan Bard, Jakob N Foerster, Sarath Chandar, Neil Burch, Marc Lanctot, H Francis Song, Emilio Parisotto, Vincent Dumoulin, Subhodeep Moitra, Edward Hughes, et al. 2020. The Hanabi challenge: A new frontier for AI research. Artificial Intelligence 280 (2020), 103216.

  3. [3]

    Christopher Berner, Greg Brockman, Brooke Chan, Vicki Cheung, Przemysław Dębiak, Christy Dennison, David Farhi, Quirin Fischer, Shariq Hashme, Chris Hesse, et al. 2019. Dota 2 with large scale deep reinforcement learning. arXiv preprint arXiv:1912.06680 (2019).

  4. [4]

    Noam Brown and Tuomas Sandholm. 2018. Superhuman AI for heads-up no-limit poker: Libratus beats top professionals. Science 359, 6374 (2018), 418–424.

  5. [5]

    Noam Brown and Tuomas Sandholm. 2019. Superhuman AI for multiplayer poker. Science 365, 6456 (2019), 885–890.

  6. [6]

    Chess.com. 2026. What is Fog Of War Chess? https://support.chess.com/en/articles/8708650-what-is-fog-of-war-chess Accessed: 2026-02-24.

  7. [7]

    Thomas M Cover and Joy A Thomas. 1999. Elements of Information Theory. John Wiley & Sons.

  8. [8]

    Elizabeth Gilmour, Noah Plotkin, and Leslie N Smith. 2021. An approach to partial observability in games: Learning to both act and observe. In 2021 IEEE Conference on Games (CoG). IEEE, 1–5.

  9. [9]

    Tilmann Gneiting and Adrian E Raftery. 2007. Strictly proper scoring rules, prediction, and estimation. Journal of the American Statistical Association 102, 477 (2007), 359–378.

  10. [10]

    Neil Houlsby, Ferenc Huszár, Zoubin Ghahramani, and Máté Lengyel. 2011. Bayesian active learning for classification and preference learning. arXiv preprint arXiv:1112.5745 (2011).

  11. [11]

    Leslie Pack Kaelbling, Michael L Littman, and Anthony R Cassandra. 1998. Planning and acting in partially observable stochastic domains. Artificial Intelligence 101, 1-2 (1998), 99–134.

  12. [12–13]

    Julien Perolat, Bart De Vylder, Daniel Hennes, Eugene Tarassov, Florian Strub, Vincent de Boer, Paul Muller, Jerome T Connor, Neil Burch, Thomas Anthony, et al. 2022. Mastering the game of Stratego with model-free multiagent reinforcement learning. Science 378, 6623 (2022), 990–996.

  14. [14]

    David Brine Pritchard. 2000. Popular Chess Variants. Batsford.

  15. [15]

    Freddie Bickford Smith, Andreas Kirsch, Sebastian Farquhar, Yarin Gal, Adam Foster, and Tom Rainforth. 2023. Prediction-oriented Bayesian active learning. In International Conference on Artificial Intelligence and Statistics. PMLR, 7331–7348.

  16. [16]

    Oriol Vinyals, Igor Babuschkin, Wojciech M Czarnecki, Michaël Mathieu, Andrew Dudzik, Junyoung Chung, David H Choi, Richard Powell, Timo Ewalds, Petko Georgiev, et al. 2019. Grandmaster level in StarCraft II using multi-agent reinforcement learning. Nature 575, 7782 (2019), 350–354.

  17. [17]

    Ronald J Williams. 1992. Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine Learning 8, 3 (1992), 229–256.

  18. [18]

    Brian Hu Zhang and Tuomas Sandholm. 2025. General search techniques without common knowledge for imperfect-information games, and application to superhuman Fog of War chess. arXiv preprint arXiv:2506.01242 (2025).