Optimal Contextual Pricing under Agnostic Non-Lipschitz Demand

Jianyu Xu, Yu-Xiang Wang

Authors on Pith no claims yet

Pith reviewed 2026-05-08 15:07 UTC · model grok-4.3

classification 💻 cs.LG econ.EMstat.ML

keywords pricingagnosticcontextualnon-lipschitzregretunderalgorithmdemand

0 comments

The pith

A new polynomial-time algorithm achieves Õ(T^{2/3}) regret for contextual dynamic pricing under linear valuations and agnostic non-Lipschitz noise, matching known lower bounds.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The work studies online pricing where each customer has a value that depends linearly on observed features, but the noise in whether they buy can create demand curves with jumps and point masses. These breaks prevent standard smoothing techniques from working across different customer contexts. Earlier algorithms could only guarantee regret that grew like T to the three-quarters power. The proposed Conservative-Markdown Redirect-UCB Pricing method uses randomized estimation of the linear parameters, careful probing of possible prices on a grid while staying conservative, and then one-step redirection using confidence bounds. This combination yields regret that scales as T to the two-thirds power, which matches the best possible rate known from earlier lower-bound results. The approach remains computationally efficient and works even when the noise distribution is completely unknown except for being bounded.

Core claim

Our algorithm achieves Õ(T^{2/3}) optimal regret, matching the known lower bounds of Kleinberg and Leighton (2003) up to logarithmic factors and improving over the previous upper bound of Xu and Wang (2022).

Load-bearing premise

The setting assumes linear valuations with bounded-support agnostic noise whose induced demand may be non-Lipschitz with arbitrary jumps and atoms; if this model does not capture the discontinuities, the regret guarantee may not hold.

read the original abstract

We study contextual dynamic pricing with linear valuations and bounded-support agnostic noise, whose induced demand curve may be non-Lipschitz with arbitrary jumps and atoms. Such discontinuities break the cross-context interpolation arguments used by smooth-demand pricing algorithms, while the best previous method achieved only $\tilde O(T^{3/4})$ regret. We propose Conservative-Markdown Redirect-UCB Pricing, a polynomial-time algorithm that combines randomized parameter estimation, conservative residual-grid probing, and confidence-based one-step redirection. Our algorithm achieves $\tilde O(T^{2/3})$ optimal regret, matching the known lower bounds of Kleinberg and Leighton (2003) up to logarithmic factors and improving over the previous upper bound of Xu and Wang (2022). Under stochastic well-conditioned contexts, this closes the long-existing open regret gap in linear-valuation contextual pricing under agnostic non-Lipschitz noise distribution.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 4 minor

Summary. The paper studies contextual dynamic pricing with linear valuations and bounded-support agnostic noise inducing non-Lipschitz demand with jumps and atoms. It proposes the Conservative-Markdown Redirect-UCB Pricing algorithm using randomized parameter estimation, conservative residual-grid probing, and confidence-based one-step redirection. The central claim is that this polynomial-time method achieves Õ(T^{2/3}) regret under stochastic well-conditioned contexts, matching the Kleinberg-Leighton (2003) lower bound up to logs and improving on the prior Õ(T^{3/4}) upper bound.

Significance. If the analysis holds, the result is significant: it closes the long-standing regret gap for this setting by attaining the information-theoretically optimal rate without smoothness or Lipschitz assumptions on demand. The explicit handling of discontinuities via probing and redirection, together with the polynomial-time guarantee, strengthens the theoretical foundation for dynamic pricing under realistic agnostic noise models.

minor comments (4)

§3 (algorithm description): the pseudocode for Conservative-Markdown Redirect-UCB Pricing would be clearer if the redirection step explicitly shows how it interacts with the residual grid to control the probability of atom hits.
Theorem 4.1 (main regret bound): state the explicit dependence on context dimension d and noise support size B; the current Õ notation leaves open whether these enter only logarithmically.
Related Work: add a short paragraph contrasting the new techniques against the cross-context interpolation used in smooth-demand papers, to highlight precisely where prior arguments fail.
Assumption 2 (well-conditioned contexts): clarify whether the stochastic-context assumption is necessary for the T^{2/3} rate or merely simplifies the analysis; a remark on potential extensions would be useful.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive summary of our work, recognition of its significance in closing the long-standing regret gap, and recommendation for minor revision. We are pleased that the polynomial-time Õ(T^{2/3}) regret result under agnostic non-Lipschitz noise is viewed as strengthening the theoretical foundation for dynamic pricing.

Circularity Check

0 steps flagged

No significant circularity

full rationale

The paper proposes a new algorithm (Conservative-Markdown Redirect-UCB Pricing) with explicit components including randomized parameter estimation, residual-grid probing, and one-step redirection. Its central claim of Õ(T^{2/3}) regret is derived from the analysis of this algorithm under the stated model assumptions (linear valuations with bounded-support agnostic noise), matching an external lower bound from Kleinberg and Leighton (2003). The reference to Xu and Wang (2022) serves only as a benchmark for improvement over the prior Õ(T^{3/4}) upper bound and is not used as a load-bearing premise or self-referential input in the new derivation. No self-definitional reductions, fitted inputs renamed as predictions, or ansatz smuggling are present in the described chain.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on the standard domain assumption of linear valuations and bounded-support agnostic noise; no free parameters or invented entities are introduced in the abstract.

axioms (1)

domain assumption Linear valuations with bounded-support agnostic noise inducing possibly non-Lipschitz demand
Explicitly stated as the problem setting in the abstract.

pith-pipeline@v0.9.0 · 5445 in / 1212 out tokens · 39364 ms · 2026-05-08T15:07:39.147234+00:00 · methodology

Optimal Contextual Pricing under Agnostic Non-Lipschitz Demand

Core claim

Load-bearing premise

discussion (0)