Recognition: no theorem link
Mini-Batch Stochastic Krasnosel'skiui-Mann Algorithm for Nonexpansive Fixed Point Problems
Pith reviewed 2026-05-10 17:48 UTC · model grok-4.3
The pith
The mini-batch stochastic Krasnosel'skiĭ-Mann algorithm converges almost surely to a fixed point of a nonexpansive mapping when batch sizes increase appropriately.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper establishes that the Krasnosel'skiĭ-Mann iteration remains almost surely convergent when the exact nonexpansive mapping is replaced by an unbiased mini-batch stochastic estimator and the batch sizes are increased at a rate that satisfies the required summability conditions. A rate-of-convergence result is also derived for the resulting sequence of iterates.
What carries the argument
The mini-batch stochastic mapping, an unbiased estimator of the nonexpansive mapping, inserted into the Krasnosel'skiĭ-Mann recursion together with an increasing batch-size schedule.
If this is right
- The algorithm becomes applicable to large-scale fixed-point problems where full mapping evaluations are prohibitive.
- Almost-sure convergence is retained under standard assumptions on step sizes and batch-size growth.
- Convergence-rate bounds give concrete guidance on the number of iterations needed for a target accuracy.
- The method extends classical deterministic fixed-point theory to stochastic settings without losing the fixed-point guarantee.
Where Pith is reading between the lines
- The same unbiased-estimator construction could be applied to other fixed-point iterations such as Mann or Halpern schemes.
- In practice, the batch-size schedule can be tuned to balance per-iteration cost against overall runtime in large optimization tasks.
- Variance-reduction techniques might be layered on top to obtain faster rates than those currently proved.
- Numerical tests on concrete nonexpansive operators arising in feasibility or equilibrium problems would quantify the computational savings.
Load-bearing premise
The mini-batch estimator must be unbiased for the true nonexpansive mapping and the batch sizes must increase without bound in a controlled manner that makes the accumulated noise vanish.
What would settle it
Apply the algorithm to the projection onto a simple convex set with a known fixed point and check whether the iterates fail to approach that point when the batch sizes follow the prescribed increasing schedule.
read the original abstract
The Krasnosel'ski\u\i-Mann algorithm is a well-known method for finding fixed points of a nonexpansive mapping with strong theoretical guarantees. However, there are practical large-scale problems to which this algorithm cannot be applied. Here, to resolve the issue caused by the computational difficulty of the mapping, we define a computable mini-batch stochastic mapping, which is a unbiased estimator of the nonexpansive mapping, and implement it in the Krasnosel'ski\u\i-Mann algorithm. We show that the algorithm with increasing batch sizes converges almost surely to a fixed point of the nonexpansive mapping. We also perform a convergence rate analysis on the algorithm.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes a mini-batch stochastic Krasnosel'skiĭ-Mann algorithm for fixed-point problems with nonexpansive mappings. It defines a computable mini-batch stochastic mapping as an unbiased estimator of the original mapping and proves that the resulting iteration with increasing batch sizes converges almost surely to a fixed point; a convergence-rate analysis is also included.
Significance. If the proofs are complete, the work is a useful extension of classical deterministic fixed-point methods to computationally expensive large-scale settings. The almost-sure convergence result under an unbiased mini-batch estimator and the accompanying rate analysis provide a theoretically grounded stochastic approximation scheme that could be applied in optimization and equilibrium problems where full mapping evaluations are prohibitive.
major comments (1)
- [§3, Theorem 3.1] §3, Theorem 3.1 (almost-sure convergence): the statement assumes only that the batch-size sequence satisfies b_n → ∞. For the Robbins-Siegmund or supermartingale argument to yield a.s. convergence, the perturbation term requires an explicit summability condition such as ∑ α_n² / b_n < ∞ (or an equivalent moment bound on the conditional variance of the mini-batch estimator). Without this condition being stated and verified in the theorem hypotheses, the almost-sure convergence claim is not fully supported even under unbiasedness.
minor comments (2)
- [Abstract] Abstract: 'a unbiased estimator' should read 'an unbiased estimator'.
- [Notation] Notation: the step-size sequence is denoted α_n in some places and a_n in others; consistent use throughout would improve readability.
Simulated Author's Rebuttal
We thank the referee for the careful reading and the precise observation regarding the hypotheses needed for almost-sure convergence. We address the comment below and will revise the manuscript to incorporate the necessary condition.
read point-by-point responses
-
Referee: [§3, Theorem 3.1] §3, Theorem 3.1 (almost-sure convergence): the statement assumes only that the batch-size sequence satisfies b_n → ∞. For the Robbins-Siegmund or supermartingale argument to yield a.s. convergence, the perturbation term requires an explicit summability condition such as ∑ α_n² / b_n < ∞ (or an equivalent moment bound on the conditional variance of the mini-batch estimator). Without this condition being stated and verified in the theorem hypotheses, the almost-sure convergence claim is not fully supported even under unbiasedness.
Authors: We agree that the current formulation of Theorem 3.1 states only b_n → ∞ and that an additional summability condition on the variance of the mini-batch estimator is required to close the supermartingale argument. In the revised version we will add the hypothesis ∑_{n=1}^∞ α_n² / b_n < ∞ to Theorem 3.1. We will also include a short verification that this condition is satisfied for the standard step-size sequences (e.g., α_n = 1/n) paired with batch sizes that grow at least linearly (e.g., b_n = n), thereby making the almost-sure convergence claim fully rigorous under the stated unbiasedness assumption. revision: yes
Circularity Check
Derivation is self-contained; no reduction to fitted inputs or self-citations.
full rationale
The paper defines a mini-batch stochastic mapping as an unbiased estimator of the nonexpansive operator and invokes standard almost-sure convergence arguments for stochastic approximation under the stated growth condition on batch sizes. No equation or theorem in the provided abstract or description reduces the claimed a.s. convergence or rate result to a tautological fit, a self-referential definition, or a load-bearing self-citation whose content is unverified outside the paper. The unbiasedness assumption and nonexpansiveness are external to the algorithm construction and supply independent grounding for the supermartingale or Robbins-Siegmund step. The result therefore does not collapse by construction to its inputs.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption The underlying mapping is nonexpansive (i.e., Lipschitz continuous with constant 1)
- domain assumption The mini-batch stochastic mapping is an unbiased estimator of the original nonexpansive mapping
invented entities (1)
-
mini-batch stochastic mapping
no independent evidence
Forward citations
Cited by 3 Pith papers
-
Stochastic Krasnoselskii-Mann Iterations: Convergence without Uniformly Bounded Variance
Stochastic Krasnoselskii-Mann iterations converge almost surely weakly with finite variance only at a single fixed point, recovering best-known rates without uniform variance bounds.
-
Stochastic Krasnoselskii-Mann Iterations: Convergence without Uniformly Bounded Variance
Stochastic Krasnoselskii-Mann iterations converge almost surely and with rates under finite variance at a single fixed point rather than uniform variance bounds, recovering optimal complexity and providing first such ...
-
Mini-Batch Stochastic Halpern Algorithm for Nonexpansive Fixed point Problems
A mini-batch stochastic Halpern algorithm converges in mean square to the closest fixed point when using diminishing step sizes and increasing batch sizes.
Reference graph
Works this paper leans on
-
[1]
In: The Twelfth Inter- national Conference on Learning Representations (2024)
Ahn, K., Cheng, X., Song, M., Yun, C., Jadbabaie, A., Sra, S.: Linear attention is (maybe) all you need (to understand transformer optimization). In: The Twelfth Inter- national Conference on Learning Representations (2024). URLhttps://openreview. net/forum?id=0uI5415ry7
2024
-
[2]
Israel Journal of Mathematics26, 137–150 (1977)
Baillon, J.B., Haddad, G.: Quelques propri´ et´ es des op´ erateurs angle-born´ es etn- cycliquement monotones. Israel Journal of Mathematics26, 137–150 (1977)
1977
-
[3]
Battash, B., Wolf, L., Lindenbaum, O.: Revisiting the noise model of stochastic gradient descent. In: S. Dasgupta, S. Mandt, Y. Li (eds.) Proceedings of The 27th International Conference on Artificial Intelligence and Statistics,Proceedings of Machine Learning Research, vol. 238, pp. 4780–4788. PMLR (2024). URLhttps://proceedings.mlr. press/v238/battash24a.html
2024
-
[4]
SIAM Review38(3), 367–426 (1996) Title Suppressed Due to Excessive Length 21
Bauschke, H.H., Borwein, J.M.: On projection algorithms for solving convex feasibility problems. SIAM Review38(3), 367–426 (1996) Title Suppressed Due to Excessive Length 21
1996
-
[5]
Springer, New York (2017)
Bauschke, H.H., Combettes, P.L.: Convex Analysis and Monotone Operator Theory in Hilbert Spaces, 2nd edn. Springer, New York (2017)
2017
-
[6]
Springer, Berlin (2007)
Berinde, V.: Iterative Approximation of Fixed Points. Springer, Berlin (2007)
2007
-
[7]
Athena Scientific, Cambridge, MA (2003)
Bertsekas, D.P., Nedi´ c, A., Ozdaglar, A.E.: Convex Analysis and Optimization. Athena Scientific, Cambridge, MA (2003)
2003
-
[8]
SIAM Journal on Control and Optimization62(1), 191–219 (2024)
Bravo, M., Cominetti, R.: Stochastic fixed-point iterations for nonexpansive maps: Con- vergence and error bounds. SIAM Journal on Control and Optimization62(1), 191–219 (2024). DOI 10.1137/22M1515550. URLhttps://doi.org/10.1137/22M1515550
-
[9]
SIAM Journal on Optimization25(2), 1221–1248 (2015)
Combettes, P.L., Pesquet, J.C.: Stochastic quasi-Fej´ er block-coordinate fixed point iter- ations with random sweeping. SIAM Journal on Optimization25(2), 1221–1248 (2015). DOI 10.1137/140971233. URLhttps://doi.org/10.1137/140971233
-
[10]
Israel Journal of Mathematics199(2), 757–772 (2014)
Cominetti, R., Soto, J.A., Vaisman, J.: On the rate of convergence of Krasnosel’ski˘ ı- Mann iterations and their connection with sums of Bernoullis. Israel Journal of Mathematics199(2), 757–772 (2014). DOI 10.1007/s11856-013-0045-4. URLhttps: //doi.org/10.1007/s11856-013-0045-4
-
[11]
Garg, S., Zhanson, J., Parisotto, E., Prasad, A., Kolter, Z., Lipton, Z., Balakrishnan, S., Salakhutdinov, R., Ravikumar, P.: On proximal policy optimization’s heavy-tailed gradients. In: M. Meila, T. Zhang (eds.) Proceedings of the 38th International Con- ference on Machine Learning,Proceedings of Machine Learning Research, vol. 139, pp. 3610–3619. PMLR ...
2021
-
[12]
Handbook of convergence theorems for (stochastic) gradient methods,
Garrigos, G., Gower, R.M.: Handbook of convergence theorems for (stochastic) gradient methods (2024). URLhttps://arxiv.org/abs/2301.11235
-
[13]
Cambridge Studies in Advanced Mathematics
Goebel, K., Kirk, W.A.: Topics in Metric Fixed Point Theory. Cambridge Studies in Advanced Mathematics. Cambridge University Press, New York (1990)
1990
-
[14]
Dekker, New York and Basel (1984)
Goebel, K., Reich, S.: Uniform Convexity, Hyperbolic Geometry, and Nonexpansive Mappings. Dekker, New York and Basel (1984)
1984
-
[15]
Journal of Mathematical Analysis and Applications40(2), 369–372 (1972)
Groetsch, C.: A note on segmenting Mann iterates. Journal of Mathematical Analysis and Applications40(2), 369–372 (1972). DOI https://doi.org/10.1016/ 0022-247X(72)90056-X. URLhttps://www.sciencedirect.com/science/article/ pii/0022247X7290056X
-
[16]
Journal of Optimization Theory and Applications148, 580–592 (2011)
Iiduka, H.: Iterative algorithm for solving triple-hierarchical constrained optimization problem. Journal of Optimization Theory and Applications148, 580–592 (2011)
2011
-
[17]
Uspekhi Matematicheskikh Nauk10, 123–127 (1955)
Krasnosel’ski˘ ı, M.A.: Two remarks on the method of successive approximations. Uspekhi Matematicheskikh Nauk10, 123–127 (1955)
1955
-
[18]
Proceedings of American Mathematical Society4, 506–510 (1953)
Mann, W.R.: Mean value methods in iteration. Proceedings of American Mathematical Society4, 506–510 (1953)
1953
-
[19]
Simsekli, U., Sagun, L., Gurbuzbalaban, M.: A tail-index analysis of stochastic gradient noise in deep neural networks. In: K. Chaudhuri, R. Salakhutdinov (eds.) Proceedings of the 36th International Conference on Machine Learning,Proceedings of Machine Learning Research, vol. 97, pp. 5827–5837. PMLR (2019). URLhttps://proceedings. mlr.press/v97/simsekli19a.html
2019
-
[20]
John Wiley & Sons Inc (1998)
Stark, H., Yang, Y.: Vector Space Projections: A Numerical Approach to Signal and Image Processing. John Wiley & Sons Inc (1998)
1998
-
[21]
Yokohama Publishers, Yokohama (2000)
Takahashi, W.: Nonlinear Functional Analysis. Yokohama Publishers, Yokohama (2000)
2000
-
[22]
Transactions on Machine Learning Research (2025)
Umeda, H., Iiduka, H.: Increasing both batch size and learning rate accelerates stochas- tic gradient descent. Transactions on Machine Learning Research (2025). URL https://openreview.net/forum?id=sbmp55k6iE
2025
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.