arxiv: 2605.13840 · v1 · submitted 2026-05-13 · 📊 stat.ML · cs.DS· cs.LG· math.ST· stat.CO· stat.TH

Recognition: no theorem link

What is Learnable in Valiant's Theory of the Learnable?

Steve Hanneke , Anay Mehrotra , Grigoris Velegkas , Manolis Zampetakis

Authors on Pith no claims yet

Pith reviewed 2026-05-14 17:30 UTC · model grok-4.3

classification 📊 stat.ML cs.DScs.LGmath.STstat.COstat.TH

keywords Valiant learning modelmembership queriessample compressionlearnability characterizationpositive exampleshalfspacesPAC learningquery complexity

0 comments

The pith

A class is learnable in Valiant's model exactly when every realizable positive sample admits a poly-size adaptive query-compression scheme.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Valiant's 1984 model gives the learner only positive examples, permits membership queries, and requires the output hypothesis to contain no false positives. For any finite domain the paper proves that a concept class is learnable in this model if and only if every realizable positive sample can be certified by a polynomial-size adaptive scheme of membership queries. The same characterization places Valiant's learnability strictly between ordinary PAC learnability and the query-free version of Valiant's model. The strict separation survives when the domain is permitted to be infinite. Halfspaces in d dimensions, previously unlearnable without queries, become learnable once membership queries are allowed.

Core claim

For every finite domain, a class is learnable in Valiant's model if and only if every realizable positive sample can be certified by a poly-size adaptive query-compression scheme. This places Valiant's learnability strictly between PAC learnability and the version of Valiant's model without membership queries. The same strict sandwich persists for arbitrary domains. Halfspaces in d dimensions are learnable with a poly(d) sample and poly(d) polylog(1/epsilon) query algorithm, and at least Omega(d) samples or queries are necessary.

What carries the argument

poly-size adaptive query-compression scheme: a short interactive protocol in which the learner uses membership queries to certify that a given positive sample is consistent with some concept in the class

If this is right

Learnability in Valiant's model is strictly stronger than the query-free variant but strictly weaker than PAC learnability.
d-dimensional halfspaces, which are not learnable without queries, become learnable once membership queries are permitted.
Halfspaces admit an algorithm using poly(d) samples and poly(d) polylog(1/epsilon) queries, with an Omega(d) lower bound on the combined number of samples and queries.
The same strict sandwich between PAC and query-free Valiant learnability holds on arbitrary, possibly infinite domains.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Membership queries can change which classes are qualitatively learnable, rather than merely improving sample or computational efficiency.
The compression perspective may be used to design algorithms for other geometric classes by focusing on interactive certification of positive samples.
The absence of an exact characterization for infinite domains highlights the need for new techniques that handle unbounded domains without losing the sandwich property.

Load-bearing premise

The learner must output a hypothesis with no false positives, and the domain must be finite to obtain the exact if-and-only-if characterization.

What would settle it

A finite-domain concept class in which some realizable positive sample lacks a polynomial-size adaptive query-compression scheme yet the class is still learnable in Valiant's model.

Figures

Figures reproduced from arXiv: 2605.13840 by Anay Mehrotra, Grigoris Velegkas, Manolis Zampetakis, Steve Hanneke.

**Figure 1.** Figure 1: Outline of the proof of Theorem 1.1. 3.2.2 Uniform Success Over All Potential Targets In this section, we show that any learnable class in Valiant’s model admits a learner whose output, with high probability, is simultaneously valid for every hypothesis that remains consistent with the realized interaction. Definition 12 (Uniform Success Over All Potential Targets). A learner L has uniform success over all… view at source ↗

read the original abstract

Valiant's 1984 paper is widely credited with introducing the PAC learning model, but it, in fact, introduced a different model: unlike PAC learning, the learner receives only positives, may issue membership queries, and must output a hypothesis with no false positives. Prior work characterized variants, including the case without queries. We revisit Valiant's original model and ask: *Which classes are learnable in it?* For every finite domain, including Valiant's Boolean-hypercube setting, we show that a class is learnable if and only if every realizable positive sample can be certified by a poly-size adaptive query-compression scheme. This is a new variant of sample compression where the learner certifies samples via a short interaction with the membership oracle. Our characterization shows that learnability in Valiant's model is strictly sandwiched between learnability in the PAC model and the variant of Valiant's model without membership queries. This is one of the rare cases where introducing membership queries changes the set of learnable classes, and not just the sample or computational complexity. Next, we study the natural extension of the model to arbitrary domains. While we do not obtain an exact characterization, our techniques readily generalize and show that the same strict sandwiching persists. Finally, we show that $d$-dimensional halfspaces, which are not learnable without queries, are learnable with queries: we give a $\mathrm{poly}(d) \tilde{O}(1/\epsilon)$ sample and $\mathrm{poly}(d) \mathrm{polylog}(1/\epsilon)$ query algorithm, and prove that at least $\Omega(d)$ samples or queries are necessary. To our knowledge, this is the first algorithm for halfspaces in Valiant's model. Together, these results uncover a surprisingly rich theory behind Valiant's original notion of learnability and introduce ideas that may be of independent interest in learning theory.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

Valiant's original model gets an exact characterization via adaptive query compression for finite domains, plus the first halfspace algorithm.

read the letter

The main takeaway is that this paper pins down learnability in Valiant's 1984 positives-only model with membership queries. For finite domains a class is learnable exactly when every realizable positive sample has a poly-size adaptive query-compression certificate. That places the model strictly between PAC learnability and the no-query version, which is a real distinction rather than just a complexity shift. They also give the first algorithm for halfspaces: poly(d) samples and polylog(1/epsilon) queries, with an Omega(d) lower bound on samples or queries combined. The compression notion is new and seems to fit the no-false-positive requirement directly from the model definition. The sandwich result holds for general domains too, even without a full characterization there. The halfspace bounds look tight and the algorithm is concrete. The work stays faithful to Valiant's setup without adding extra assumptions. One soft spot is that the full compression equivalence proof is not visible in the abstract, so the details need checking, but the claims line up internally with no circularity. The infinite-domain part is only a partial extension, which they note. Citations are standard for the area. This is for people working on learning theory foundations, query models, or positive-only settings. A reader who cares about how model variants change the learnable classes will get something from it. It deserves peer review because the central if-and-only-if statement and the halfspace result are specific enough to be worth referee time.

Referee Report

0 major / 3 minor

Summary. The paper revisits Valiant's 1984 model (positives only, membership queries allowed, hypothesis with no false positives) and gives an exact characterization for finite domains: a class is learnable iff every realizable positive sample admits a poly-size adaptive query-compression scheme. This places Valiant's model strictly between PAC learnability and the no-query variant. The paper supplies a poly(d) sample / polylog(1/ε) query algorithm for d-dimensional halfspaces (with Ω(d) lower bound) and shows the sandwiching persists for arbitrary domains.

Significance. If the central claims hold, the work supplies a clean, query-sensitive characterization of learnability in Valiant's original model and demonstrates that membership queries can change the set of learnable classes rather than merely their complexity. The adaptive query-compression notion is new and may be reusable; the halfspace algorithm is the first explicit result in this model and comes with nearly tight bounds.

minor comments (3)

[Abstract] Abstract, sentence on the characterization: the phrase 'poly-size adaptive query-compression scheme' is introduced without an immediate pointer to its formal definition; a parenthetical reference to the relevant section would improve readability.
[Halfspace section] Halfspace algorithm paragraph: the sample bound is written 'poly(d) Õ(1/ε)'; clarify whether the Õ hides logarithmic factors in d, in 1/ε, or both, and state the precise dependence on the dimension in the theorem statement.
[Arbitrary domains section] The extension to arbitrary domains states that 'the same strict sandwiching persists' but does not obtain an exact characterization; a short remark on why the finite-domain proof technique fails to extend exactly would help readers assess the scope of the result.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the supportive review and recommendation of minor revision. The referee's summary accurately reflects the paper's contributions on the characterization of learnability in Valiant's model and the halfspace results. No specific major comments were raised in the report.

Circularity Check

0 steps flagged

No significant circularity

full rationale

The paper derives an if-and-only-if characterization for finite domains by defining an adaptive query-compression scheme directly from the model's constraints (positive samples only, membership queries, no false positives in the output hypothesis) and proving equivalence to learnability without any reduction to fitted parameters, self-referential equations, or load-bearing self-citations. The halfspace algorithm is presented as an independent constructive example that satisfies the new scheme, and the sandwiching between PAC and query-free variants follows from the explicit definitions rather than renaming or smuggling prior ansatzes. The derivation chain remains self-contained against the stated model.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 1 invented entities

The central claims rest on the standard definition of Valiant's model and the finite-domain restriction for the exact characterization; the query-compression scheme is a new construct introduced by the paper.

axioms (1)

domain assumption Finite domain for the exact if-and-only-if characterization
The abstract states the characterization holds for every finite domain.

invented entities (1)

poly-size adaptive query-compression scheme no independent evidence
purpose: Certifies that a positive sample is realizable via short membership-query interaction
New technical device introduced to express the learnability condition.

pith-pipeline@v0.9.0 · 5676 in / 1229 out tokens · 45204 ms · 2026-05-14T17:30:54.054013+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

9 extracted references · 9 canonical work pages

[1]

Neural Net Algorithms That Learn in Polynomial Time from Examples and Queries

URL: https://doi.org/10.1023/A:1022821128753 (cit. on pp. 2, 7, 13). [Bau91] Eric B. Baum. “Neural Net Algorithms That Learn in Polynomial Time from Examples and Queries”. In: IEEE Transactions on Neural Networks 2.1 (1991), pp. 5–19 (cit. on p. 13). [BBL05] Stéphane Boucheron, Olivier Bousquet, and Gábor Lugosi. “Theory of Classification: A Sur- vey of S...

work page doi:10.1023/a:1022821128753 1991
[2]

The true sample complexity of active learning

URL: https://doi.org/10.1007/PL00013833 (cit. on p. 14). [BHV10] Maria-Florina Balcan, Steve Hanneke, and Jennifer Wortman Vaughan. “The true sample complexity of active learning”. In: Machine learning 80.2 (2010), pp. 111–139 (cit. on pp. 2, 13). 44 [BKMS21] Mark Braverman, Gillat Kol, Shay Moran, and Raghuvansh R. Saxena. “Near Optimal Dis- tributed Lea...

work page doi:10.1007/pl00013833 2010
[3]

In: Proceedings of the 2021 ACM-SIAM Symposium on Discrete Algorithms (SODA), pp

URL: https://proceedings.mlr.press/v291/cai25b.html (cit. on p. 13). [CP25] Moses Charikar and Chirag Pabbaraju. “Exploring Facets of Language Generation in the Limit”. In: Proceedings of Thirty Eighth Conference on Learning Theory . Vol. 291. Proceedings of Machine Learning Research. PMLR, 2025, pp. 854–887. URL: https://proceedings. mlr.press/v291/chari...

work page doi:10.1137/1 2025
[4]

Valiant , title =

URL: https://doi.org/10.1145/1968.1972 (cit. on pp. 1, 2). [Val84b] Leslie Gabriel Valiant. “Deductive learning”. In: Philosophical Transactions of the Royal Society of London. Series A, Mathematical and Physical Sciences 312.1522 (1984), pp. 441–446 (cit. on p. 1). [VC71] Vladimir Vapnik and Alexey Chervonenkis. “On the Uniform Convergence of Relative Fr...

work page doi:10.1145/1968.1972 1968
[5]

if h ⋆ ∈ K, then S ⊆ DS,K ⊆ supp(h⋆)

work page
[6]

if S ̸⊆ DS,K, then h⋆ /∈ K. Proof. Suppose h⋆ ∈ K . Every point in S is positive for h⋆, so h⋆ ∈ VK(S). Hence ΣS,K is a valid query compression scheme for S, and the realized transcript rS,K is realizable with respect to VK(S). The defining property of query compression gives S ⊆ DS,K. Moreover, since h⋆ itself generated the transcript rS,K, we have h⋆ ∈ ...

work page
[7]

Hence, by a Chernoff bound, with probability at least 1 − η at least 3k/5 runs are successful

(Successful run) Each run is successful with probability at least 2/3, and these events are independent across j. Hence, by a Chernoff bound, with probability at least 1 − η at least 3k/5 runs are successful. Let E denote this event. No false positives. Assume E holds and fix any x ∈ {0, 1}d with h⋆(x) = 0. Every successful hj satisfies hj(x) = 0, so at m...

work page
[8]

Initialize H0 ≡ 0

We now describe the learner. Initialize H0 ≡ 0. For each scale s = 0, 1, . . . ,T, given Hs, let ps := Pr x∼D (Hs(x) = 0) . Because D is compatible with h⋆, this is exactly the D-mass of the current residual positive region. Testing the residual mass. Draw As := C1 log 1/ηs τs fresh samples from D, where C1 > 0 is a suf- ficiently large absolute constant....

work page
[9]

If s = 0, then p0 ≤ 1, so p1 ≤ 1 3 = τ0

Therefore Hs+1 = Hs ∨ hs has no false positives, and ps+1 = Pr x∼D (Hs(x) = 0 and hs(x) = 0) = ps Pr x∼µs (hs(x) = 0) ≤ ps 3 . If s = 0, then p0 ≤ 1, so p1 ≤ 1 3 = τ0. If s ≥ 1, then the induction hypothesis gives ps ≤ τs−1, and hence ps+1 ≤ τs−1 3 = τs. This completes the induction. Correctness. By the no-false-positive invariant and (10), on E the final...

work page