pith. machine review for the scientific record.

arxiv: 2605.13426 · v2 · submitted 2026-05-13 · 💻 cs.LG · math.AG

Recognition: 3 Lean theorem links

Strategic PAC Learnability via Geometric Definability

Authors on Pith: no claims yet

Pith reviewed 2026-05-15 05:18 UTC · model grok-4.3

classification 💻 cs.LG math.AG
keywords strategic classification · PAC learnability · geometric definability · first-order formulas · R_exp · VC dimension · sample complexity · cost functions

The pith

A geometric definability assumption ensures strategic classification remains PAC learnable with sample complexity governed by formula complexity.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Strategic classification lets agents alter their features at a cost to change the classifier's output. Without added structure, this can turn a learnable problem into an unlearnable one: there exist hypothesis classes of VC dimension 1 on the real line whose induced strategic versions have infinite VC dimension even under interval-based costs. The paper restores guarantees by requiring that both the original hypothesis class and the neighborhoods created by the cost function are definable by first-order formulas over the reals with exponentiation. This condition covers many standard settings, including ell_p distances, Wasserstein distance, and common divergences. Under this assumption, the induced class stays PAC learnable, and its sample complexity is bounded by the complexity of the defining formulas.
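To make the induction concrete, here is a minimal sketch of the strategic operation described above, for the tame case of a threshold hypothesis with interval neighborhoods. The function names (`threshold`, `induce_strategic`) are illustrative choices, not the paper's notation.

```python
# An agent at x may move anywhere within the interval neighborhood
# N_x = [x - r, x + r]; the induced classifier reports 1 iff some
# reachable point is accepted by the base hypothesis.

def threshold(theta):
    """Base hypothesis: accept iff x >= theta (VC dimension 1)."""
    return lambda x: x >= theta

def induce_strategic(h, r, grid):
    """Brute-force induced classifier over a discrete grid of candidate moves."""
    def h_strategic(x):
        return any(h(xp) for xp in grid if abs(x - xp) <= r)
    return h_strategic

grid = [i / 100 for i in range(-200, 201)]  # fine grid on [-2, 2]
h = threshold(0.5)
h_str = induce_strategic(h, r=0.25, grid=grid)
assert h_str(0.25) and not h(0.25)  # 0.25 can reach the accepted point 0.5
assert not h_str(0.24)              # 0.24 cannot reach any accepted point
```

For thresholds, the induced class is just a shifted threshold, which is why the paper's counterexample needs a more intricate VC-dimension-1 class to break learnability.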

Core claim

Both the hypothesis class and the cost-induced neighborhood relation can be defined by first-order formulas over R_exp. Under this assumption, learnability is preserved: if the base class is PAC learnable, then so is the induced strategic class, and the sample complexity is controlled by the definability parameters of the formulas.

What carries the argument

Geometric definability: the requirement that hypotheses and cost-induced neighborhoods are expressible by first-order formulas over the reals with exponentiation, which bounds the combinatorial complexity of the induced strategic class.
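Schematically, and in notation of our own choosing rather than the paper's, the assumption and the induced class look like:

```latex
% Base hypotheses and cost-induced neighborhoods are R_exp-definable:
%   h_\theta(x) = 1 \iff \varphi(x,\theta), \qquad x' \in N_x \iff \psi(x,x').
% The induced strategic hypothesis accepts x iff some reachable point is accepted:
\[
  h^{\mathrm{str}}_{\theta}(x) = 1
  \;\iff\;
  \exists x' \,\bigl( \psi(x, x') \wedge \varphi(x', \theta) \bigr).
\]
% Existential quantification keeps the family first-order over R_exp, and
% o-minimality of R_exp then gives finite VC dimension for the induced class.
```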

If this is right

  • The result applies to linear classifiers paired with ell_p costs and to Wasserstein or divergence-based costs.
  • Sample complexity scales with the quantifier depth and formula size in the definitions rather than with unrestricted strategic modifications.
  • Strategic behavior cannot inflate VC dimension to infinity when the definability condition holds.
  • Many information-theoretic and geometric cost structures fall inside the covered regime.
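As a worked instance of the first bullet (our own sketch, not the paper's construction): for a linear classifier with an ell_2 manipulation budget r, the induced strategic classifier is again linear, with the boundary shifted by r times the weight norm.

```python
# Hedged sketch: with an l2 ball of manipulation radius r, the agent's best
# response moves distance <= r along w / ||w||, adding r * ||w|| to the margin,
# so the induced class is the same linear class with a shifted boundary.
# Function names are illustrative, not from the paper.
import math

def linear(w, b):
    return lambda x: sum(wi * xi for wi, xi in zip(w, x)) + b >= 0

def induced_linear(w, b, r):
    norm = math.sqrt(sum(wi * wi for wi in w))
    return lambda x: sum(wi * xi for wi, xi in zip(w, x)) + b + r * norm >= 0

w, b, r = [3.0, 4.0], -10.0, 1.0   # ||w|| = 5
h, h_str = linear(w, b), induced_linear(w, b, r)
x = [1.0, 1.5]                      # base margin: 3 + 6 - 10 = -1
assert not h(x) and h_str(x)        # rejected before, reachable after: -1 + 5 >= 0
```

The shift leaves the VC dimension of the class unchanged, consistent with the claim that definable costs cannot inflate complexity.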

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • System designers could deliberately select or approximate cost functions to satisfy the definability condition and thereby retain learnability.
  • The same definability lens might apply to other tame structures in real geometry beyond R_exp.
  • Efficient learning algorithms could exploit the explicit first-order definitions to compute optimal responses or sample bounds directly.

Load-bearing premise

Both the hypothesis class and the cost-induced neighborhood relation can be defined by first-order formulas over the reals with exponentiation.

What would settle it

An explicit hypothesis class and cost function, both first-order definable over R_exp, for which the induced strategic class has infinite VC dimension would falsify the preservation claim.

Figures

Figures reproduced from arXiv: 2605.13426 by Alexander Shlimovich, Elizaveta Nesterova, Nir Rosenfeld, Shay Moran, Yuval Filmus.

Figure 1. Each anchor p_i has its own private cell U_i. For the set F_S = {q_i^S : i ∈ [n]}, the point q_i^S is placed inside N_{p_i} exactly when i ∈ S, and outside N_{p_i} otherwise. Thus (1_{F_S})^N ∩ {p_1, …, p_n} = {p_i : i ∈ S}.
read the original abstract

Strategic classification studies learning settings in which individuals can modify their features, at a cost, in order to influence the classifier's decision. A central question is how the sample complexity of the induced (strategic) hypothesis class depends on the complexities of the underlying hypothesis class and the cost structure governing feasible manipulations. Prior work has shown that in several natural settings, such as linear classifiers with norm costs, the induced complexity can be controlled. We begin by showing that such guarantees fail in general - even in simple cases: there exist hypothesis classes of VC dimension $1$ on the real line such that, even under the simplest interval neighborhoods, the induced class has infinite VC dimension. Thus, strategic behavior can turn an easy learning problem into a non-learnable one. To overcome this, we introduce structure via a geometric definability assumption: both the hypothesis class and the cost-induced neighborhood relation can be defined by first-order formulas over $\mathbb{R}_{\mathtt{exp}}$. Intuitively, this means that hypotheses and costs can be described using arithmetic operations, exponentiation, logarithms, and comparisons. This captures a broad range of natural classes and cost functions, including $\ell_p$ distances, Wasserstein distance, and information-theoretic divergences. Under this assumption, we prove that learnability is preserved, with sample complexity controlled by the complexity of the defining formulas.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 3 minor

Summary. The paper shows that strategic classification can turn a VC-dimension 1 hypothesis class into one with infinite VC-dimension, even with simple interval neighborhoods on the real line. It then introduces a geometric definability assumption: both the hypothesis class and the cost-induced neighborhood relation are first-order definable over the real exponential field R_exp. Under this assumption, the induced strategic hypothesis class remains PAC-learnable, with sample complexity controlled by the syntactic complexity of the defining formulas.

Significance. If the results hold, this supplies a broad, model-theoretic sufficient condition for preserving PAC learnability under strategic manipulation. By invoking o-minimality of R_exp, any first-order definable family has finite VC-dimension, and the strategic induction step preserves definability via existential quantification over the neighborhood relation; the resulting sample-complexity bound is then governed by formula complexity rather than case-by-case analysis. The counterexample demonstrates that the assumption is necessary in general. The approach covers natural families such as ell_p balls, Wasserstein distances, and information-theoretic divergences.
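For reference, the textbook realizable-case PAC bound that the formula-complexity control plugs into (standard form, not quoted from the paper):

```latex
% If the induced strategic class has VC dimension d, then
\[
  m(\epsilon, \delta) \;=\; O\!\left(\frac{d \log(1/\epsilon) + \log(1/\delta)}{\epsilon}\right)
\]
% samples suffice for realizable PAC learning. The paper's contribution is
% to bound d by the syntactic complexity of the defining R_exp-formulas,
% rather than by case-by-case geometric analysis.
```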

minor comments (3)
  1. [§3] §3 (counterexample construction): the explicit hypothesis class and interval neighborhood relation that produce infinite VC-dimension should be stated with a concrete formula or diagram so that the infinite shattering argument can be verified directly.
  2. [Theorem 4.2] Theorem 4.2 (preservation): the sample-complexity bound is stated in terms of formula complexity, but the precise dependence (e.g., via the number of quantifier alternations or the degree of the defining polynomials) is not displayed; adding an explicit corollary relating syntactic size to the Sauer-Shelah bound would strengthen the result.
  3. [§2] Notation: the symbol R_exp is introduced without a reference to the standard definition of the real exponential field; a brief parenthetical or footnote citing van den Dries or Marker would aid readers unfamiliar with o-minimality.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive and accurate summary of our results on strategic PAC learnability under geometric definability, as well as for recommending minor revision. No specific major comments were raised in the report.

Circularity Check

0 steps flagged

No significant circularity identified

full rationale

The derivation relies on the external fact that the structure R_exp is o-minimal, which is a known model-theoretic property implying finite VC-dimension for any first-order definable family of sets. The paper shows that the strategic hypothesis class remains first-order definable over R_exp by closing under existential quantification over the definable neighborhood relation, thereby inheriting finite VC-dimension and a sample-complexity bound governed by formula complexity. This is a direct application of standard results on o-minimal structures and definability preservation; it does not reduce any claim to a fitted parameter, self-referential definition, or load-bearing self-citation. The counterexample for general failure is constructed explicitly without reference to the positive result, keeping the positive theorem self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the geometric definability assumption as the key domain assumption that ensures finite induced VC dimension; no free parameters or invented entities are introduced.

axioms (1)
  • domain assumption Both the hypothesis class and the cost-induced neighborhood relation can be defined by first-order formulas over R_exp.
    This is the load-bearing assumption introduced to guarantee that strategic behavior does not inflate complexity to infinity.

pith-pipeline@v0.9.0 · 5549 in / 1181 out tokens · 41692 ms · 2026-05-15T05:18:21.035347+00:00 · methodology

discussion (0)


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

58 extracted references · 58 canonical work pages
