Online Localized Conformal Prediction
Pith reviewed 2026-05-11 01:46 UTC · model grok-4.3
The pith
Localizing calibration to similar covariates produces narrower valid online prediction sets.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Online Localized Conformal Prediction (OLCP) pairs online adaptation of the conformal threshold with localization of nonconformity scores to covariate-similar points. OLCP-Hedge further casts bandwidth choice as an online expert aggregation task solved by constrained convex optimization. Both procedures come with long-run coverage guarantees and, in simulations and real-data experiments, deliver valid coverage with narrower prediction sets than global baselines.
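The pairing described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' algorithm: it assumes a scalar covariate, absolute-residual nonconformity scores, a Gaussian kernel for localization, and an ACI-style step on the miscoverage level projected onto [0, 1]. The function names and the defaults for `gamma`, `bandwidth`, and `warmup` are hypothetical.

```python
import math


def kernel_weights(x_hist, x_new, bandwidth):
    """Gaussian kernel weights of past scalar covariates around x_new.

    The tiny additive constant (an assumption of this sketch) keeps the
    total weight positive even when every past point is far away.
    """
    return [math.exp(-0.5 * ((x - x_new) / bandwidth) ** 2) + 1e-12 for x in x_hist]


def weighted_quantile(scores, weights, level):
    """Smallest score whose normalized cumulative weight reaches `level`."""
    pairs = sorted(zip(scores, weights))
    total = sum(w for _, w in pairs)
    cum = 0.0
    for s, w in pairs:
        cum += w / total
        if cum >= level:
            return s
    return pairs[-1][0]  # floating-point shortfall when level is 1.0


def localized_online_cp(xs, ys, preds, alpha=0.1, gamma=0.05, bandwidth=0.5, warmup=20):
    """Return per-step coverage indicators and interval radii."""
    scores = [abs(y - p) for y, p in zip(ys, preds)]
    alpha_t = alpha
    covered, radii = [], []
    for t in range(warmup, len(ys)):
        # Localization: weight past scores by covariate similarity to xs[t].
        w = kernel_weights(xs[:t], xs[t], bandwidth)
        q = weighted_quantile(scores[:t], w, 1.0 - alpha_t)
        err = 1.0 if scores[t] > q else 0.0
        covered.append(1.0 - err)
        radii.append(q)
        # Online adaptation: a miss lowers alpha_t, which widens the next set.
        alpha_t = min(1.0, max(0.0, alpha_t + gamma * (alpha - err)))
    return covered, radii
```

The interval at step `t` would be `preds[t] ± radii[i]`. A global baseline in the same style corresponds to giving every past score equal weight, which is what a very large `bandwidth` approaches.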
What carries the argument
Covariate-dependent localization of nonconformity scores together with online convex optimization for automatic bandwidth hedging.
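To make the hedging idea concrete, here is a small sketch that uses plain exponential weights (Hedge) over a fixed grid of bandwidth experts. It is a deliberately simpler stand-in for the constrained online convex optimization scheme the paper describes. The pinball loss, the learning rate `eta`, and all function names are assumptions made for illustration.

```python
import math


def pinball_loss(threshold, score, alpha):
    """Quantile (pinball) loss at level 1 - alpha for a proposed threshold."""
    diff = score - threshold
    return (1.0 - alpha) * diff if diff > 0 else -alpha * diff


def hedge_update(weights, losses, eta):
    """One exponential-weights step: scale each expert by exp(-eta * loss)."""
    scaled = [w * math.exp(-eta * loss) for w, loss in zip(weights, losses)]
    total = sum(scaled)
    return [w / total for w in scaled]


def aggregate(weights, thresholds):
    """Combine the experts' thresholds by their current weights."""
    return sum(w * q for w, q in zip(weights, thresholds))
```

Each expert would be one localization bandwidth proposing its own threshold at every step. After the true score is revealed, `hedge_update` shifts weight toward bandwidths whose thresholds incurred lower loss, so no single bandwidth has to be fixed in advance.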
If this is right
- Long-run coverage remains valid even without exchangeability.
- Prediction sets become narrower by using only locally relevant calibration data.
- Bandwidth sensitivity is reduced by treating localization radius as an online expert problem.
- The methods outperform standard adaptive conformal inference on heterogeneous data streams.
Where Pith is reading between the lines
- The localization idea could extend to streaming settings such as online reinforcement learning where state similarity matters.
- Dynamic or learned localization radii might further improve efficiency beyond fixed-bandwidth hedging.
- Narrower valid sets would directly reduce over-conservatism in sequential forecasting and control tasks.
Load-bearing premise
The underlying online data process must allow the localized scores and online updates to produce long-run coverage at the nominal level.
What would settle it
A long online sequence in which the empirical coverage rate of the localized method falls materially below the nominal level while the global baseline does not.
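One way to operationalize this check is sketched below. The tolerance `tol` that decides what counts as "materially below" is an assumption of this sketch, as are the function names; the source does not specify a threshold.

```python
def coverage_gap(covered, alpha):
    """Nominal coverage minus empirical coverage (positive means under-coverage)."""
    empirical = sum(1 for c in covered if c) / len(covered)
    return (1.0 - alpha) - empirical


def settles_against(local_covered, global_covered, alpha, tol):
    """True if the localized method under-covers by more than tol
    while the global baseline stays within tol of the nominal level."""
    local_gap = coverage_gap(local_covered, alpha)
    global_gap = coverage_gap(global_covered, alpha)
    return local_gap > tol and global_gap <= tol
```

Both inputs are sequences of per-step coverage indicators from the same data stream, so the comparison isolates the effect of localization.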
Original abstract
Conformal prediction is a framework that provides valid uncertainty quantification for general models with exchangeable data. However, in the online learning and time-series settings, exchangeability is not satisfied. Existing online conformal methods, such as adaptive conformal inference (ACI), can achieve long-run validity, yet they remain inefficient under covariate heterogeneity because they rely on global calibration. We propose Online Localized Conformal Prediction (OLCP), which combines online adaptation with covariate-dependent localization to better reflect heterogeneity. To reduce sensitivity to the localization bandwidth, we further develop OLCP-Hedge, which performs bandwidth selection as an online expert aggregation problem using a constrained online convex optimization framework. Importantly, we provide coverage guarantees for both algorithms and demonstrate through simulations and real-data experiments that the proposed methods attain valid long-run coverage with narrower prediction sets than existing baselines.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes Online Localized Conformal Prediction (OLCP), which augments online conformal methods with covariate-dependent localization to handle heterogeneity, and OLCP-Hedge, which treats bandwidth selection as an online expert aggregation problem solved via constrained online convex optimization. It asserts coverage guarantees for both algorithms and reports that simulations and real-data experiments show valid long-run coverage together with narrower prediction sets than global baselines such as adaptive conformal inference.
Significance. If the stated coverage guarantees hold, the work would meaningfully extend online conformal prediction to heterogeneous streaming settings by replacing global calibration with localized, adaptively tuned neighborhoods. The framing of bandwidth selection as an online convex optimization problem with expert aggregation is a clean technical contribution that could reduce manual tuning. Empirical demonstrations of narrower sets while preserving coverage would be practically useful, provided the theoretical conditions are made explicit and verifiable.
major comments (3)
- [§3] §3 (Coverage Guarantees): The long-run coverage claim for OLCP rests on sublinear regret of the online update applied to a local empirical quantile. Under covariate heterogeneity the effective local sample size is controlled by the bandwidth and the local density; the manuscript does not state density or mixing conditions that would guarantee the local sample size grows sufficiently fast for the quantile to concentrate. Without such conditions the regret argument does not automatically translate into long-run coverage at level α.
- [§4.1] §4.1 (OLCP-Hedge Algorithm): The constrained online convex optimization formulation for bandwidth selection is presented, but the regret bound is stated only with respect to the expert loss; it is not shown that the selected bandwidth sequence preserves the coverage property of the underlying OLCP procedure when the local sample size is small. A concrete bound linking the Hedge regret to the deviation of the local quantile would be needed to support the joint guarantee.
- [Table 2, Figure 3] Table 2 and Figure 3 (Real-data experiments): The reported coverage is close to the nominal level on average, yet the experiments do not stratify results by local density or by regions where the bandwidth yields fewer than, say, 50 observations. If coverage degrades in low-density strata, the claim of uniformly valid long-run coverage would be weakened.
minor comments (2)
- The notation for the localization kernel and the online update rule is introduced without a compact summary table; adding one would improve readability.
- The abstract states that OLCP-Hedge 'performs bandwidth selection as an online expert aggregation problem' but does not mention the specific loss function or the constraint set; these details appear only later and should be previewed.
Simulated Author's Rebuttal
We thank the referee for the detailed and constructive review. The comments highlight important aspects of the theoretical conditions and empirical validation that we will strengthen in the revision. We address each major comment below.
Point-by-point responses
- Referee: [§3] §3 (Coverage Guarantees): The long-run coverage claim for OLCP rests on sublinear regret of the online update applied to a local empirical quantile. Under covariate heterogeneity the effective local sample size is controlled by the bandwidth and the local density; the manuscript does not state density or mixing conditions that would guarantee the local sample size grows sufficiently fast for the quantile to concentrate. Without such conditions the regret argument does not automatically translate into long-run coverage at level α.
  Authors: We agree that the long-run coverage argument requires the local empirical quantile to concentrate, which depends on the effective local sample size growing sufficiently fast. The current manuscript implicitly relies on the bandwidth choice to ensure this but does not state explicit conditions. We will add assumptions on the covariate density being bounded away from zero in the relevant support and on the mixing rate of the underlying process. Under these conditions we will show that the sublinear regret of the online update implies the desired long-run coverage at level α, and we will revise the statement and proof of the relevant theorem in Section 3.
  Revision: yes
- Referee: [§4.1] §4.1 (OLCP-Hedge Algorithm): The constrained online convex optimization formulation for bandwidth selection is presented, but the regret bound is stated only with respect to the expert loss; it is not shown that the selected bandwidth sequence preserves the coverage property of the underlying OLCP procedure when the local sample size is small. A concrete bound linking the Hedge regret to the deviation of the local quantile would be needed to support the joint guarantee.
  Authors: The referee correctly notes that the existing regret bound is stated relative to the best fixed expert and does not yet explicitly connect to coverage preservation when local samples are limited. We will add a supporting lemma that uses the sublinear regret of the constrained Hedge algorithm to bound the probability that the selected bandwidth yields an insufficient local sample size. This will establish that the coverage property of the base OLCP procedure is preserved with high probability, thereby completing the joint guarantee for OLCP-Hedge. The new analysis will appear in Section 4.
  Revision: yes
- Referee: [Table 2, Figure 3] Table 2 and Figure 3 (Real-data experiments): The reported coverage is close to the nominal level on average, yet the experiments do not stratify results by local density or by regions where the bandwidth yields fewer than, say, 50 observations. If coverage degrades in low-density strata, the claim of uniformly valid long-run coverage would be weakened.
  Authors: We acknowledge that average coverage does not fully address potential variation across density regimes. We will augment the experimental section with a new stratification of both coverage and interval width by estimated local density and by bins of realized local sample size. This will include separate reporting for low-density regions (e.g., fewer than 50 local observations). Any observed degradation will be discussed explicitly as a limitation of the method in heterogeneous settings.
  Revision: yes
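The stratified reporting discussed in this exchange could look roughly like the following sketch. It assumes a scalar covariate and Gaussian kernel weights, and it measures local sample size with the Kish effective sample size, (Σw)² / Σw². These choices and the function names are illustrative assumptions; only the cutoff of 50 local observations comes from the referee's comment.

```python
import math


def effective_sample_size(x_hist, x_new, bandwidth):
    """Kish effective sample size of Gaussian kernel weights around x_new."""
    w = [math.exp(-0.5 * ((x - x_new) / bandwidth) ** 2) for x in x_hist]
    s1 = sum(w)
    s2 = sum(v * v for v in w)
    return (s1 * s1 / s2) if s2 > 0 else 0.0


def stratified_report(ess, covered, widths, edges):
    """Coverage and mean width per bin of effective local sample size.

    Bin i collects the steps with edges[i] <= ess < edges[i + 1].
    Each row is (low, high, count, coverage, mean_width); empty bins
    report None for coverage and width.
    """
    report = []
    for low, high in zip(edges[:-1], edges[1:]):
        idx = [i for i, n in enumerate(ess) if low <= n < high]
        if not idx:
            report.append((low, high, 0, None, None))
            continue
        cov = sum(covered[i] for i in idx) / len(idx)
        wid = sum(widths[i] for i in idx) / len(idx)
        report.append((low, high, len(idx), cov, wid))
    return report
```

With `edges = [0, 50, math.inf]`, the first row isolates the low-density stratum, so a coverage shortfall there is visible even when the overall average sits near the nominal level.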
Circularity Check
No significant circularity; coverage guarantees remain independent of method definitions.
Full rationale
The provided abstract and context contain no quoted equations or self-citations that reduce the claimed long-run coverage to a fitted parameter, self-defined quantity, or prior author result by construction. OLCP and OLCP-Hedge are defined via localization plus online convex optimization, with coverage asserted as a separate guarantee (likely from regret analysis) and supported by simulations. This matches the reader's assessment of no reduction to inputs. No load-bearing self-citation chains or ansatz smuggling appear in the given text.
Excerpts from the paper
A Proof of Proposition 3.1. Proof. By definition, $z_t = \alpha_t + \gamma(\alpha - \mathrm{err}_t)$, $L_t = (-z_t)_+$, $U_t = (z_t - 1)_+$. Since projection onto $[0,1]$ satisfies $\Pi_{[0,1]}(z \dots$
Missing values are filled by interpolation followed by forward/backward filling. The series is split chronologically into 70% training, 10% validation, and 20% testing, giving 913, 130, and 262 observations, respectively. The response is standardized using the training mean and standard deviation, and intervals are constructed on this standardized scale. ...