Online learning of smooth functions on $\mathbb{R}$

Alexander Wang; Jesse Geneson; Kuldeep Singh

arxiv: 2604.03525 · v1 · submitted 2026-04-04 · 💻 cs.LG

Online learning of smooth functions on mathbb{R}

Jesse Geneson , Kuldeep Singh , Alexander Wang This is my paper

Pith reviewed 2026-05-13 19:03 UTC · model grok-4.3

classification 💻 cs.LG

keywords online learningadversarial learningsmooth functionsabsolutely continuousill-posednessreal linelocality constraints

0 comments

The pith

An adversary can force infinite cumulative loss when learning smooth functions on the real line in the standard online model.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that the standard adversarial online learning setup for real-valued functions on the entire real line is ill-posed for the class of absolutely continuous functions whose derivative satisfies an L^q integrability bound. For any loss exponent p at least 1 and any smoothness parameter q greater than 1, the adversary possesses a strategy that drives the learner's total p-loss to infinity. This motivates three modified scenarios that confine the adversary to nearby points, apply penalties only locally, or down-weight losses from distant queries. In the first two scenarios the authors derive sharp characterizations of achievable loss in multiple regimes. In the weighted scenario a clean threshold separates weights that still permit infinite loss from those that yield finite sharp bounds, at least when p equals q equals 2. The natural slice generalization to functions on R^d remains ill-posed for every dimension two and higher even after the same modifications.

Core claim

The standard model of online learning against an adversary is ill-posed for the class G_q of absolutely continuous functions whose q-th power of the derivative integrates to at most one: for every p ≥ 1 and q > 1 the adversary can force the cumulative p-loss to be infinite. Under three locality-respecting modifications of the model, finite and sometimes sharp loss bounds become available in one dimension for certain parameter values, while the corresponding slice classes on R^d for d ≥ 2 remain ill-posed.

What carries the argument

The class G_q of absolutely continuous functions with ∫|f'|^q dx ≤ 1, together with the three modified scenarios that restrict the adversary's choice of distant points or down-weight distant losses by a decaying function g.

If this is right

In the unrestricted model no learner can keep total loss finite against an adaptive adversary on R.
Scenarios 1 and 2 admit explicit characterizations of the minimal achievable loss for several pairs of p and q.
Exponential decay of the weight g guarantees finite loss for p = q = 2 while slower decay still permits infinite loss.
The slice classes G_{q,d} on R^d for d ≥ 2 admit infinite loss under all three modified scenarios.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Well-posed online learning of smooth functions appears to require either a bounded domain or explicit locality constraints that prevent arbitrary jumps.
Similar ill-posedness is likely to appear for other function classes on unbounded domains unless comparable distance restrictions are imposed.
Practical algorithms could adopt distance-based weighting schemes to realize the finite bounds that appear for rapidly decaying g.

Load-bearing premise

That the adversary is allowed to select query points arbitrarily far from all previous observations without any penalty or restriction on distance.

What would settle it

An explicit sequence of query points together with values consistent with a single function in G_q such that every learner strategy produces cumulative p-loss diverging to infinity.

read the original abstract

We study adversarial online learning of real-valued functions on $\mathbb{R}$. In each round the learner is queried at $x_t\in\mathbb{R}$, predicts $\hat y_t$, and then observes the true value $f(x_t)$; performance is measured by cumulative $p$-loss $\sum_{t\ge 1}|\hat y_t-f(x_t)|^p$. For the class \[ \mathcal{G}_q=\Bigl\{f:\mathbb{R}\to\mathbb{R}\ \text{absolutely continuous}:\ \int_{\mathbb{R}}|f'(x)|^q\,dx\le 1\Bigr\}, \] we show that the standard model becomes ill-posed on $\mathbb{R}$: for every $p\ge 1$ and $q>1$, an adversary can force infinite loss. Motivated by this obstruction, we analyze three modified learning scenarios that limit the influence of queries that are far from previously observed inputs. In Scenario 1 the adversary must choose each new query within distance $1$ of some past query. In Scenario 2 the adversary may query anywhere, but the learner is penalized only on rounds whose query lies within distance $1$ of a past query. In Scenario 3 the loss in round $t$ is multiplied by a weight $g(\min_{j<t}|x_t-x_j|)$. We obtain sharp characterizations for Scenarios 1-2 in several regimes. For Scenario 3 we identify a clean threshold phenomenon: if $g$ decays too slowly, then the adversary can force infinite weighted loss. In contrast, for rapidly decaying weights such as $g(z)=e^{-cz}$ we obtain finite and sharp guarantees in the quadratic case $p=q=2$. Finally, we study a natural multivariable slice generalization $\mathcal{G}_{q,d}$ of $\mathcal{G}_q$ on $\mathbb{R}^d$ and show a sharp dichotomy: while the one-dimensional case admits finite opt-values in certain regimes, for every $d\ge 2$ the slice class $\mathcal{G}_{q,d}$ is too permissive, and even under Scenarios 1-3 an adversary can force infinite loss.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper shows the standard online learning model for L^q-smooth functions is ill-posed on R and restores well-posedness with three modified scenarios that have sharp thresholds.

read the letter

The core result is that the usual adversarial online learning setup for absolutely continuous functions with bounded L^q derivative norm becomes ill-posed on the whole real line: for any p >=1 and q>1 the adversary can force infinite cumulative p-loss. The argument is that the derivative budget can be spread over arbitrarily wide intervals so that the incremental cost of a unit change at a new distant point goes to zero. That is a clean obstruction and it explains why bounded-domain results do not extend. The three modified scenarios fix this by restricting queries to distance-1 neighborhoods, penalizing only local queries, or exponentially discounting distant loss. In the first two the authors give sharp characterizations in several regimes; in the third they show a threshold on the decay rate of the weight g, with the exponential case g(z)=e^{-c z} yielding finite sharp bounds when p=q=2. The slice-class extension to R^d is also useful: it is well-behaved in d=1 under some regimes but ill-posed for every d>=2 even under the modified scenarios. These are genuine new observations that do not collapse to prior bounded-domain work. The modeling restrictions are somewhat artificial, but they are chosen precisely to block the spreading construction, so the motivation holds. The claims look internally consistent; the stress-test derivation of the ill-posedness matches the abstract and there is no circularity. The paper stays at the level of existence of finite versus infinite values rather than giving explicit algorithms, which is a limitation if one wants implementable bounds, but that is a matter of scope rather than a flaw. This is for people working on online learning over continuous or unbounded domains. It is solid enough to deserve referee time.

Referee Report

0 major / 2 minor

Summary. The paper studies adversarial online learning of real-valued functions from the class G_q of absolutely continuous functions on R with integral |f'|^q dx bounded by 1. It proves that the standard model is ill-posed for every p >= 1 and q > 1, as an adversary can force infinite cumulative p-loss by spreading the derivative budget over increasingly distant transition intervals. Three modified scenarios are introduced to restore well-posedness: (1) restricting new queries to distance-1 neighborhoods of past points, (2) penalizing only local queries, and (3) weighting losses by a decaying function g of the minimum distance to prior queries. Sharp characterizations are given for Scenarios 1-2 in several regimes, a threshold phenomenon is identified for Scenario 3 (infinite loss if g decays too slowly, finite sharp guarantees for g(z) = e^{-c z} when p = q = 2), and a slice generalization G_{q,d} on R^d is shown to remain ill-posed for all d >= 2 even under the modified scenarios.

Significance. If the results hold, the work establishes a fundamental ill-posedness obstruction for online learning of smooth functions over unbounded domains and supplies concrete, analyzable modifications that recover finite regret bounds with explicit thresholds. The sharp dichotomy between d = 1 and d >= 2, together with the explicit constructions and minimax arguments, provides clear guidance on when and how domain connectivity and loss localization matter. These contributions are valuable for the theory of online learning in continuous spaces.

minor comments (2)

[Abstract] The abstract states that sharp characterizations are obtained for Scenarios 1-2 'in several regimes' but does not list the regimes; adding an explicit enumeration (e.g., p = q = 2, or p = 1) would improve readability.
[Abstract] Notation for the weight function g in Scenario 3 is introduced without an immediate example; inserting the exponential case g(z) = e^{-c z} right after the definition would help readers follow the subsequent threshold statement.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive review, detailed summary of our contributions, and recommendation to accept the manuscript. We are pleased that the significance of the ill-posedness obstruction for the standard model, the sharp characterizations under the three modified scenarios, and the d=1 versus d>=2 dichotomy have been recognized.

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The derivations consist of explicit adversarial constructions showing ill-posedness of the standard model for G_q when q>1, followed by direct minimax analysis under the three modified scenarios that restrict query neighborhoods or apply weights. These steps rely on the integral bound defining G_q and the scenario restrictions themselves; no step reduces a claimed bound or prediction to a quantity defined in terms of the result, nor invokes self-citations as load-bearing premises. The multivariable dichotomy for d>=2 likewise follows from independent geometric arguments on orthogonal directions. The paper is self-contained against its stated modeling assumptions.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The claims rest on the standard real-analysis definition of absolute continuity and the integral bound that defines G_q, plus the three new modeling restrictions introduced in the paper; no free parameters or new postulated entities appear.

axioms (1)

domain assumption Functions in G_q are absolutely continuous with integral |f'(x)|^q dx <=1
This is the central smoothness class whose online learnability is analyzed.

pith-pipeline@v0.9.0 · 5698 in / 1328 out tokens · 80932 ms · 2026-05-13T19:03:48.757690+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

14 extracted references · 14 canonical work pages

[1]

D. Angluin. Queries and concept learning. Machine Learning 2 (1988) 319-342

work page 1988
[2]

Auer, P.M

P. Auer, P.M. Long, W. Maass, and G.J. Woeginger. On the complexity of function learning. Machine Learning 18 (1995) 187-230

work page 1995
[3]

Auer and P.M

P. Auer and P.M. Long. Structural results about on-line learning models with and with- out queries. Machine Learning 36 (1999) 147-181

work page 1999
[4]

R. Feng, J. Geneson, A. Lee, and E. Slettnes. Sharp bounds on the price of bandit feedback for several models of mistake-bounded online learning. Theoretical Computer Science 965C (2023) 113980

work page 2023
[5]

Filmus, S

Y. Filmus, S. Hanneke, I. Mehalel, and S. Moran. Bandit-feedback online multiclass classification: variants and tradeoffs. NeurIPS 2024

work page 2024
[6]

J. Geneson. A note on the price of bandit feedback for mistake-bounded online learning. Theoretical Computer Science 874 (2021) 42-45. 33

work page 2021
[7]

Geneson, M

J. Geneson, M. Li, and L. Tang. Mistake-bounded online learning with operation caps. ArXiv preprint (2025)

work page 2025
[8]

Geneson and L

J. Geneson and L. Tang, Bounds on the price of feedback for mistake-bounded online learning. CoRR abs/2401.05794 (2024)

work page arXiv 2024
[9]

Geneson and E

J. Geneson and E. Zhou. Online learning of smooth functions. Theoretical Computer Science 979C (2023) 114203

work page 2023
[10]

Kimber and P

D. Kimber and P. M. Long. On-line learning of smooth functions of a single variable. Theoretical Computer Science 148 (1995) 141-156

work page 1995
[11]

Littlestone

N. Littlestone. Learning quickly when irrelevant attributes abound: a new linear- threshold algorithm. Machine Learning 2 (1988) 285-318

work page 1988
[12]

P. M. Long. Improved bounds about on-line learning of smooth functions of a single variable. Theoretical Computer Science, 241 (2000) 25-35

work page 2000
[13]

P.M. Long. New bounds on the price of bandit feedback for mistake-bounded online multiclass learning. Theoretical Computer Science 808 (2020) 159-163

work page 2020
[14]

W. Xie. Worst-case error bounds for online learning of smooth functions. ArXiv preprint (2025). 34

work page 2025

[1] [1]

D. Angluin. Queries and concept learning. Machine Learning 2 (1988) 319-342

work page 1988

[2] [2]

Auer, P.M

P. Auer, P.M. Long, W. Maass, and G.J. Woeginger. On the complexity of function learning. Machine Learning 18 (1995) 187-230

work page 1995

[3] [3]

Auer and P.M

P. Auer and P.M. Long. Structural results about on-line learning models with and with- out queries. Machine Learning 36 (1999) 147-181

work page 1999

[4] [4]

R. Feng, J. Geneson, A. Lee, and E. Slettnes. Sharp bounds on the price of bandit feedback for several models of mistake-bounded online learning. Theoretical Computer Science 965C (2023) 113980

work page 2023

[5] [5]

Filmus, S

Y. Filmus, S. Hanneke, I. Mehalel, and S. Moran. Bandit-feedback online multiclass classification: variants and tradeoffs. NeurIPS 2024

work page 2024

[6] [6]

J. Geneson. A note on the price of bandit feedback for mistake-bounded online learning. Theoretical Computer Science 874 (2021) 42-45. 33

work page 2021

[7] [7]

Geneson, M

J. Geneson, M. Li, and L. Tang. Mistake-bounded online learning with operation caps. ArXiv preprint (2025)

work page 2025

[8] [8]

Geneson and L

J. Geneson and L. Tang, Bounds on the price of feedback for mistake-bounded online learning. CoRR abs/2401.05794 (2024)

work page arXiv 2024

[9] [9]

Geneson and E

J. Geneson and E. Zhou. Online learning of smooth functions. Theoretical Computer Science 979C (2023) 114203

work page 2023

[10] [10]

Kimber and P

D. Kimber and P. M. Long. On-line learning of smooth functions of a single variable. Theoretical Computer Science 148 (1995) 141-156

work page 1995

[11] [11]

Littlestone

N. Littlestone. Learning quickly when irrelevant attributes abound: a new linear- threshold algorithm. Machine Learning 2 (1988) 285-318

work page 1988

[12] [12]

P. M. Long. Improved bounds about on-line learning of smooth functions of a single variable. Theoretical Computer Science, 241 (2000) 25-35

work page 2000

[13] [13]

P.M. Long. New bounds on the price of bandit feedback for mistake-bounded online multiclass learning. Theoretical Computer Science 808 (2020) 159-163

work page 2020

[14] [14]

W. Xie. Worst-case error bounds for online learning of smooth functions. ArXiv preprint (2025). 34

work page 2025