arxiv: 2605.21263 · v1 · pith:QYGSY5QWnew · submitted 2026-05-20 · 💻 cs.LG

Nonparametric Learning and Earning with One-Point Feedback under Nonstationarity

Pith reviewed 2026-05-21 05:00 UTC · model grok-4.3

classification 💻 cs.LG
keywords: dynamic pricing · nonparametric learning · one-point feedback · nonstationary environments · regret bounds · restarting mechanism · meta-learning

The pith

A nonparametric pricing method learns demand from single revenue observations per period and adapts to market shifts via restarts, bounding revenue loss by time horizon and variation size.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper develops a framework for sellers to set prices dynamically when only the revenue from one chosen price is observed each period and when customer demand can change over time. It updates prices using gradient approximations built directly from those single revenue readings, without assuming any particular shape for the demand curve. To cope with shifts, the method periodically restarts the learning process to discard outdated data, and adds a meta-learning layer that hedges across different restart schedules when the pace of change is unknown. If the guarantees hold, total revenue lost compared to knowing the full demand function grows only with the length of the selling horizon and the total amount the market has varied.

Core claim

By constructing revenue-based gradient approximations from one observation per period and incorporating a restarting mechanism that periodically refreshes the learning process, the seller's cumulative revenue loss relative to a fully informed benchmark depends on both the time horizon and the magnitude of market variation.
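
For readers who prefer the claim in symbols, the block below is one common way to formalize "cumulative revenue loss relative to a fully informed benchmark". It is an assumption made for illustration, not a formula quoted from the paper: r_t denotes the period-t expected revenue function, p_t the posted price, \mathcal{X} the feasible price set, and R_T the resulting dynamic regret over T periods.

```latex
\[
  R_T \;=\; \sum_{t=1}^{T} \Bigl( \max_{p \in \mathcal{X}} r_t(p) \;-\; \mathbb{E}\bigl[ r_t(p_t) \bigr] \Bigr)
\]
```

Under this reading, the benchmark re-optimizes the price in every period, which is why the loss cannot be controlled by the horizon alone and a measure of how much r_t moves has to enter the bound.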

What carries the argument

Revenue-based gradient approximations from one observation per period, combined with a restarting mechanism that periodically refreshes the learning process to discount outdated information.
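
The update described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' algorithm: it uses the classical one-point estimator (revenue times a random unit direction, scaled by dimension over perturbation radius) with a Euclidean projected ascent step, whereas the paper itself works with online mirror ascent. The names `one_point_step`, `observe_revenue`, `eta`, and `delta` are hypothetical.

```python
import math
import random


def random_unit_vector(dim, rng):
    """Draw a direction uniformly from the unit sphere in `dim` dimensions."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm > 1e-12:
            return [x / norm for x in v]


def project_to_box(p, lo, hi):
    """Clip each coordinate of p to the interval [lo, hi]."""
    return [min(max(x, lo), hi) for x in p]


def one_point_step(p, observe_revenue, eta, delta, lo, hi, rng):
    """Play one period: post a perturbed price, observe revenue once, ascend.

    Assumes hi - lo >= 2 * delta. Returns (posted_price, next_base_price).
    """
    dim = len(p)
    u = random_unit_vector(dim, rng)
    # Keep the base point delta inside the box so the posted price stays feasible.
    base = project_to_box(p, lo + delta, hi - delta)
    posted = [b + delta * ui for b, ui in zip(base, u)]
    y = observe_revenue(posted)  # the single revenue observation for this period
    # One-point estimate of the gradient of a smoothed version of the revenue.
    grad = [(dim / delta) * y * ui for ui in u]
    nxt = [b + eta * g for b, g in zip(base, grad)]
    return posted, project_to_box(nxt, lo + delta, hi - delta)
```

A call such as `one_point_step([0.5, 0.5], revenue_oracle, 0.01, 0.1, 0.0, 1.0, random.Random(0))` plays one period; `revenue_oracle` stands in for whatever noisy revenue signal the seller actually receives. The estimate has high variance because it is built from a single scalar, which is the price paid for needing only one observation per period.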

If this is right

  • Cumulative revenue loss scales with both the time horizon and the total variation in market conditions.
  • The procedure requires no parametric assumption on the demand function.
  • A meta-learning layer allows adaptation when the degree of nonstationarity is unknown.
  • Simulation results on synthetic and real-world data show practical effectiveness.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same restart-and-meta structure could be tested in other limited-feedback sequential problems such as inventory control under drifting demand.
  • Platforms could deploy the method to maintain pricing performance across seasonal cycles without requiring manual tuning of restart frequency.
  • Adding occasional side observations, such as competitor prices, might tighten the loss bounds further.

Load-bearing premise

The restarting mechanism effectively discounts outdated information so that learning can track changes in the underlying demand relationship.
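
The restart idea can be made concrete with a small wrapper. The sketch below assumes a fixed batch length and a hypothetical `make_learner` factory whose objects expose a `step(t)` method returning the revenue earned in period t; neither name comes from the paper, and the paper's actual schedule may differ.

```python
def run_with_restarts(horizon, batch_length, make_learner):
    """Run a learner for `horizon` periods, rebuilding it every `batch_length` periods.

    Rebuilding the learner throws away everything it learned in earlier
    batches, which is how stale information about old demand is discarded.
    """
    revenues = []
    learner = None
    for t in range(horizon):
        if t % batch_length == 0:
            learner = make_learner()  # fresh state at the start of each batch
        revenues.append(learner.step(t))
    return revenues
```

Short batches react quickly to change but waste periods relearning; long batches learn more within a batch but carry outdated information for longer. That trade-off is what makes the batch length depend on how fast the market moves.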

What would settle it

In a controlled setting with known abrupt demand shifts, removing the restarts produces revenue loss that grows linearly with the number of changes instead of staying bounded by the variation measure.

Figures

Figures reproduced from arXiv: 2605.21263 by Feng Xu, Jian-qiang Hu, Jiaqiao Hu, Xiangyu Yang.

Figure 1: Overview of the proposed methods: Algorithm …
Figure 2: No Variation
    Excerpt captured with the caption: "… gradient ascent on the feasible box X with a fixed step size η = 0.01. At each round, the perturbation radius δ = 0.1. The meta-learner updates expert weights via exponential weighting with rate ε = 0.5. Bandit feedback is corrupted by an additive Gaussian noise N(0, 0.1^2). As a naive baseline, we also include a random policy that selects actions uniformly from X. All results are averaged …"
Figure 3: Low Variation
Figure 4: High Variation
Figure 5: Variation=10, b path
    Excerpt captured with the caption: "… samples adaptively to the moving high-value regions. In contrast, the benchmark methods become less responsive under stronger nonstationarity, which leads to less efficient exploration and inferior tracking performance. [Section 6.3, Real-world Nonstationary Pricing Experiment Using Walmart Dataset] To further evaluate the proposed policy in a more realistic nonstationary demand environment, we c…"
Figure 6: Variation=10, sample heatmaps
Figure 7: Variation=10, action trajectories
Figure 8: Variation=10, regret
Figure 9: Variation=40, b path
Figure 10: Variation=40, sample heatmaps
Figure 11: Variation=40, action trajectories
Figure 12: Variation=40, regret
Figure 13: Real-world nonstationary environment
    Excerpt captured with the caption: "[Section 7, Conclusion] This paper studies a nonparametric dynamic pricing problem in a nonstationary environment with one-point feedback. The seller observes only realized revenues from a single posted price in each period, while the underlying revenue functions may change over time. We analyze this problem through a hierarchical construction that combines online mirror ascent w…"
Figure 14: Variation=0, b path
Figure 15: Variation=0, sample heatmaps
Figure 16: Variation=0, action trajectories
Figure 17: Variation=0, regret
Figure 18: Variation=20, b path
Figure 19: Variation=20, sample heatmaps
Figure 20: Variation=20, action trajectories
Figure 21: Variation=20, regret
Figure 22: Variation=30, b path
Figure 23: Variation=30, sample heatmaps
Figure 24: Variation=30, action trajectories
Figure 25: Variation=30, regret
Original abstract

Firms increasingly rely on dynamic pricing to respond to evolving customer demand, yet in many applications they observe only the revenue generated by a single posted price in each period. At the same time, market conditions may shift gradually or abruptly due to changes in customer preferences, competition, or external shocks. These features create two intertwined challenges: learning the revenue-demand relationship from limited feedback and adapting pricing decisions to a changing environment. We study how a seller can learn and earn effectively under these constraints, without assuming a specific parametric form for demand. We develop a learning framework that updates prices using revenue-based gradient approximations constructed from one observation per period. To address environmental changes, we incorporate a restarting mechanism that periodically refreshes the learning process so that outdated information is discounted. When the degree of nonstationarity is unknown, we further introduce a meta-learning layer to adaptively hedge across multiple restarting schedules. We provide performance guarantees for our approach, showing how cumulative revenue loss relative to a fully informed benchmark depends on both the time horizon and the magnitude of market variation. Simulation experiments using synthetic and real-world data illustrate the effectiveness of the proposed procedures.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper studies nonparametric dynamic pricing with one-point (revenue-only) feedback in nonstationary environments. It proposes a framework that constructs revenue-based gradient approximations from single observations per period, incorporates periodic restarts to discount outdated information, and adds a meta-learning layer that hedges across multiple restarting schedules when the degree of nonstationarity is unknown. Performance guarantees are claimed showing that cumulative revenue loss relative to a fully informed benchmark scales with the time horizon T and a measure of market variation V; the claims are illustrated with synthetic and real-world simulations.

Significance. If the stated regret bounds hold, the work provides a useful nonparametric extension of online learning techniques to nonstationary pricing problems with minimal feedback. The adaptive restarting-plus-meta-learning construction and the explicit dependence on a variation measure V are technically interesting and practically relevant for revenue management applications. The simulation results on real data add empirical support, though the theoretical contribution would be strengthened by matching lower bounds or comparisons to parametric baselines.

major comments (2)
  1. [§3] (variation measure definition): the paper introduces a specific measure V of market variation to obtain the claimed O(T^{2/3} + V)-type bound, but it is unclear whether this V is equivalent to the standard total-variation or Lipschitz notions used in the nonstationary bandit literature; an explicit comparison or reduction would clarify whether the bound is novel or recovers known rates.
  2. [§4.2] (meta-learning analysis): the regret decomposition for the adaptive hedging layer over restarting schedules appears to rely on the variation being bounded within each restart interval; the proof sketch should explicitly state the assumption on intra-interval variation and show how the meta-regret term remains sublinear when V is unknown.
minor comments (2)
  1. [Abstract] The abstract states that performance guarantees exist but does not mention the precise rate or the variation measure V; adding one sentence with the dependence on T and V would improve readability for readers who stop at the abstract.
  2. [Simulation section] Figure captions for the simulation results should include the exact parameter settings (e.g., number of restarts, meta-learning rates) used to generate each curve so that the experiments are fully reproducible from the text alone.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the thoughtful review and the recommendation for minor revision. We address each major comment below and will revise the manuscript accordingly to improve clarity.

  1. Referee: [§3] (variation measure definition): the paper introduces a specific measure V of market variation to obtain the claimed O(T^{2/3} + V)-type bound, but it is unclear whether this V is equivalent to the standard total-variation or Lipschitz notions used in the nonstationary bandit literature; an explicit comparison or reduction would clarify whether the bound is novel or recovers known rates.

    Authors: We appreciate this suggestion for clarification. Our variation measure V is defined as the sum over time of the total variation in the revenue curve, specifically V := sum_{t=1}^{T-1} sup_p |r_t(p) - r_{t+1}(p)| where r_t is the revenue function at time t. This is a natural extension of the total variation for functions. In the revised version, we will include a remark in Section 3 explicitly relating V to the standard notions: when the demand functions are Lipschitz continuous with constant L, our V is bounded by L times the total variation in the demand parameters, thus recovering the standard rates in the literature. This comparison highlights that our bound is novel in the nonparametric one-point feedback setting but consistent with prior work. revision: yes
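
    The definition of V quoted in this response can be evaluated numerically for a known sequence of revenue functions. The sketch below is illustrative only: it replaces the supremum over prices with a maximum over a finite grid (so it can underestimate the true V), and the names `variation_budget`, `revenue_fns`, and `price_grid` are hypothetical.

```python
def variation_budget(revenue_fns, price_grid):
    """Approximate V = sum_t sup_p |r_t(p) - r_{t+1}(p)| on a finite price grid.

    `revenue_fns` is a list of callables r_1, ..., r_T; the supremum over
    prices is replaced by a maximum over `price_grid`.
    """
    total = 0.0
    for r_now, r_next in zip(revenue_fns, revenue_fns[1:]):
        total += max(abs(r_now(p) - r_next(p)) for p in price_grid)
    return total


# Toy example: r_t(p) = p * (a_t - p) with the intercept a_t jumping once.
intercepts = [1.0, 1.0, 2.0]
revenue_fns = [lambda p, a=a: p * (a - p) for a in intercepts]
print(variation_budget(revenue_fns, [0.0, 0.5, 1.0]))  # 1.0
```

    In the toy example the first two revenue functions coincide and the third differs from the second by exactly p, so the only contribution is the largest grid price, 1.0.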

  2. Referee: [§4.2] (meta-learning analysis): the regret decomposition for the adaptive hedging layer over restarting schedules appears to rely on the variation being bounded within each restart interval; the proof sketch should explicitly state the assumption on intra-interval variation and show how the meta-regret term remains sublinear when V is unknown.

    Authors: Thank you for this observation. In our analysis, we assume that the variation within each restart interval of length tau is at most V * (tau / T), which follows from the definition of V as the total variation. For the meta-learning layer, we employ a standard exponential weights algorithm over a grid of possible restart frequencies, and the meta-regret is bounded by O(sqrt(K log T)) where K is the number of schedules, independent of V. When V is unknown, the adaptive choice ensures the overall regret remains O(T^{2/3} + V). We will expand the proof sketch in the appendix to explicitly state this assumption and derive the sublinear meta-regret term. revision: yes
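
    The hedging layer mentioned in this response can be illustrated with a textbook exponential-weights update over a grid of restart lengths. This is a simplified sketch under loud assumptions: the geometric grid is a guess at what "a grid of possible restart frequencies" might look like, rewards are assumed to lie in [0, 1], and the update assumes a reward is available for every schedule each round, which a pure one-point feedback setting does not provide directly. The function names are hypothetical.

```python
import math


def geometric_grid(horizon):
    """Candidate restart lengths 1, 2, 4, ... up to the horizon (an assumed grid)."""
    grid, length = [], 1
    while length <= horizon:
        grid.append(length)
        length *= 2
    return grid


def hedge_update(weights, rewards, rate):
    """Exponential-weights update for rewards in [0, 1]; returns normalized weights.

    Schedules that earned more revenue this round gain probability mass
    at a speed controlled by `rate`.
    """
    scaled = [w * math.exp(rate * r) for w, r in zip(weights, rewards)]
    total = sum(scaled)
    return [s / total for s in scaled]


schedules = geometric_grid(10)                 # [1, 2, 4, 8]
weights = [1.0 / len(schedules)] * len(schedules)
weights = hedge_update(weights, [0.2, 0.9, 0.4, 0.1], rate=0.5)
```

    After the update the second schedule holds the largest weight because it reported the highest reward; repeating the update over many rounds concentrates mass on whichever restart length has performed best so far.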

Circularity Check

0 steps flagged

No significant circularity; bounds derived from independent variation measure and restart schedule


The paper defines a market variation measure V externally from the sequence of demand functions, then constructs a restarting-plus-meta-learning procedure whose regret analysis yields an explicit dependence on both T and V. This dependence is obtained by standard online-learning arguments applied to the restarted nonparametric gradient estimates; it is not obtained by fitting V to the regret or by renaming an internal quantity. No load-bearing step reduces by construction to a fitted parameter or to a self-citation whose content is the target bound itself. The derivation therefore remains self-contained against the stated external parameters.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract provides no explicit free parameters, axioms, or invented entities; the approach relies on unspecified regularity conditions for the revenue function and on the existence of a variation measure that can be bounded.

pith-pipeline@v0.9.0 · 5734 in / 1091 out tokens · 41240 ms · 2026-05-21T05:00:42.183222+00:00 · methodology


