Online learning of smooth functions on mathbb{R}
Pith reviewed 2026-05-13 19:03 UTC · model grok-4.3
The pith
An adversary can force infinite cumulative loss when learning smooth functions on the real line in the standard online model.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The standard model of online learning against an adversary is ill-posed for the class G_q of absolutely continuous functions whose q-th power of the derivative integrates to at most one: for every p ≥ 1 and q > 1 the adversary can force the cumulative p-loss to be infinite. Under three locality-respecting modifications of the model, finite and sometimes sharp loss bounds become available in one dimension for certain parameter values, while the corresponding slice classes on R^d for d ≥ 2 remain ill-posed.
What carries the argument
The class G_q of absolutely continuous functions with ∫|f'|^q dx ≤ 1, together with the three modified scenarios that restrict the adversary's choice of distant points or down-weight distant losses by a decaying function g.
If this is right
- In the unrestricted model no learner can keep total loss finite against an adaptive adversary on R.
- Scenarios 1 and 2 admit explicit characterizations of the minimal achievable loss for several pairs of p and q.
- Exponential decay of the weight g guarantees finite loss for p = q = 2 while slower decay still permits infinite loss.
- The slice classes G_{q,d} on R^d for d ≥ 2 admit infinite loss under all three modified scenarios.
Where Pith is reading between the lines
- Well-posed online learning of smooth functions appears to require either a bounded domain or explicit locality constraints that prevent arbitrary jumps.
- Similar ill-posedness is likely to appear for other function classes on unbounded domains unless comparable distance restrictions are imposed.
- Practical algorithms could adopt distance-based weighting schemes to realize the finite bounds that appear for rapidly decaying g.
Load-bearing premise
That the adversary is allowed to select query points arbitrarily far from all previous observations without any penalty or restriction on distance.
What would settle it
An explicit sequence of query points together with values consistent with a single function in G_q such that every learner strategy produces cumulative p-loss diverging to infinity.
read the original abstract
We study adversarial online learning of real-valued functions on $\mathbb{R}$. In each round the learner is queried at $x_t\in\mathbb{R}$, predicts $\hat y_t$, and then observes the true value $f(x_t)$; performance is measured by cumulative $p$-loss $\sum_{t\ge 1}|\hat y_t-f(x_t)|^p$. For the class \[ \mathcal{G}_q=\Bigl\{f:\mathbb{R}\to\mathbb{R}\ \text{absolutely continuous}:\ \int_{\mathbb{R}}|f'(x)|^q\,dx\le 1\Bigr\}, \] we show that the standard model becomes ill-posed on $\mathbb{R}$: for every $p\ge 1$ and $q>1$, an adversary can force infinite loss. Motivated by this obstruction, we analyze three modified learning scenarios that limit the influence of queries that are far from previously observed inputs. In Scenario 1 the adversary must choose each new query within distance $1$ of some past query. In Scenario 2 the adversary may query anywhere, but the learner is penalized only on rounds whose query lies within distance $1$ of a past query. In Scenario 3 the loss in round $t$ is multiplied by a weight $g(\min_{j<t}|x_t-x_j|)$. We obtain sharp characterizations for Scenarios 1-2 in several regimes. For Scenario 3 we identify a clean threshold phenomenon: if $g$ decays too slowly, then the adversary can force infinite weighted loss. In contrast, for rapidly decaying weights such as $g(z)=e^{-cz}$ we obtain finite and sharp guarantees in the quadratic case $p=q=2$. Finally, we study a natural multivariable slice generalization $\mathcal{G}_{q,d}$ of $\mathcal{G}_q$ on $\mathbb{R}^d$ and show a sharp dichotomy: while the one-dimensional case admits finite opt-values in certain regimes, for every $d\ge 2$ the slice class $\mathcal{G}_{q,d}$ is too permissive, and even under Scenarios 1-3 an adversary can force infinite loss.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper studies adversarial online learning of real-valued functions from the class G_q of absolutely continuous functions on R with integral |f'|^q dx bounded by 1. It proves that the standard model is ill-posed for every p >= 1 and q > 1, as an adversary can force infinite cumulative p-loss by spreading the derivative budget over increasingly distant transition intervals. Three modified scenarios are introduced to restore well-posedness: (1) restricting new queries to distance-1 neighborhoods of past points, (2) penalizing only local queries, and (3) weighting losses by a decaying function g of the minimum distance to prior queries. Sharp characterizations are given for Scenarios 1-2 in several regimes, a threshold phenomenon is identified for Scenario 3 (infinite loss if g decays too slowly, finite sharp guarantees for g(z) = e^{-c z} when p = q = 2), and a slice generalization G_{q,d} on R^d is shown to remain ill-posed for all d >= 2 even under the modified scenarios.
Significance. If the results hold, the work establishes a fundamental ill-posedness obstruction for online learning of smooth functions over unbounded domains and supplies concrete, analyzable modifications that recover finite regret bounds with explicit thresholds. The sharp dichotomy between d = 1 and d >= 2, together with the explicit constructions and minimax arguments, provides clear guidance on when and how domain connectivity and loss localization matter. These contributions are valuable for the theory of online learning in continuous spaces.
minor comments (2)
- [Abstract] The abstract states that sharp characterizations are obtained for Scenarios 1-2 'in several regimes' but does not list the regimes; adding an explicit enumeration (e.g., p = q = 2, or p = 1) would improve readability.
- [Abstract] Notation for the weight function g in Scenario 3 is introduced without an immediate example; inserting the exponential case g(z) = e^{-c z} right after the definition would help readers follow the subsequent threshold statement.
Simulated Author's Rebuttal
We thank the referee for their positive review, detailed summary of our contributions, and recommendation to accept the manuscript. We are pleased that the significance of the ill-posedness obstruction for the standard model, the sharp characterizations under the three modified scenarios, and the d=1 versus d>=2 dichotomy have been recognized.
Circularity Check
No significant circularity detected
full rationale
The derivations consist of explicit adversarial constructions showing ill-posedness of the standard model for G_q when q>1, followed by direct minimax analysis under the three modified scenarios that restrict query neighborhoods or apply weights. These steps rely on the integral bound defining G_q and the scenario restrictions themselves; no step reduces a claimed bound or prediction to a quantity defined in terms of the result, nor invokes self-citations as load-bearing premises. The multivariable dichotomy for d>=2 likewise follows from independent geometric arguments on orthogonal directions. The paper is self-contained against its stated modeling assumptions.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption Functions in G_q are absolutely continuous with integral |f'(x)|^q dx <=1
Reference graph
Works this paper leans on
-
[1]
D. Angluin. Queries and concept learning. Machine Learning 2 (1988) 319-342
work page 1988
- [2]
-
[3]
P. Auer and P.M. Long. Structural results about on-line learning models with and with- out queries. Machine Learning 36 (1999) 147-181
work page 1999
-
[4]
R. Feng, J. Geneson, A. Lee, and E. Slettnes. Sharp bounds on the price of bandit feedback for several models of mistake-bounded online learning. Theoretical Computer Science 965C (2023) 113980
work page 2023
- [5]
-
[6]
J. Geneson. A note on the price of bandit feedback for mistake-bounded online learning. Theoretical Computer Science 874 (2021) 42-45. 33
work page 2021
-
[7]
J. Geneson, M. Li, and L. Tang. Mistake-bounded online learning with operation caps. ArXiv preprint (2025)
work page 2025
-
[8]
J. Geneson and L. Tang, Bounds on the price of feedback for mistake-bounded online learning. CoRR abs/2401.05794 (2024)
-
[9]
J. Geneson and E. Zhou. Online learning of smooth functions. Theoretical Computer Science 979C (2023) 114203
work page 2023
-
[10]
D. Kimber and P. M. Long. On-line learning of smooth functions of a single variable. Theoretical Computer Science 148 (1995) 141-156
work page 1995
-
[11]
N. Littlestone. Learning quickly when irrelevant attributes abound: a new linear- threshold algorithm. Machine Learning 2 (1988) 285-318
work page 1988
-
[12]
P. M. Long. Improved bounds about on-line learning of smooth functions of a single variable. Theoretical Computer Science, 241 (2000) 25-35
work page 2000
-
[13]
P.M. Long. New bounds on the price of bandit feedback for mistake-bounded online multiclass learning. Theoretical Computer Science 808 (2020) 159-163
work page 2020
-
[14]
W. Xie. Worst-case error bounds for online learning of smooth functions. ArXiv preprint (2025). 34
work page 2025
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.