Selective Conformal Risk Control

Wenge Guo; Yunpeng Xu; Zhi Wei

arXiv: 2512.12844 · v2 · submitted 2025-12-14 · cs.LG · cs.AI

Yunpeng Xu, Wenge Guo, Zhi Wei

Pith reviewed 2026-05-16 22:05 UTC · model grok-4.3

keywords: conformal prediction, selective classification, risk control, uncertainty quantification, prediction sets, distribution-free guarantees, calibration, machine learning

The pith

Selective Conformal Risk Control shrinks prediction sets by filtering to confident samples before calibration.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes Selective Conformal Risk Control (SCRC), a two-stage framework that first selects confident samples and then applies conformal risk control only to that subset. The aim is to produce smaller prediction sets than standard conformal prediction while retaining distribution-free coverage guarantees. Two algorithms are developed: SCRC-T computes thresholds jointly across calibration and test samples for exact finite-sample guarantees, and SCRC-I uses only calibration data for faster, PAC-style probabilistic guarantees. Experiments on two public datasets report that both variants meet the target coverage and risk levels with nearly identical performance, with SCRC-I slightly more conservative on risk.

Core claim

The central claim is that formulating uncertainty control as selective classification followed by conformal risk control on the selected subset allows construction of calibrated prediction sets that achieve target coverage and risk levels, with SCRC-T providing exact finite-sample guarantees via joint thresholds and SCRC-I offering efficient PAC-style guarantees.
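For orientation, the generic conformal risk control rule from the cited Angelopoulos et al. reference can be written as follows. This is the standard form, stated under the assumptions that the per-sample loss L_i(λ) is non-increasing in λ and bounded above by B, and that the n calibration losses and the test loss are exchangeable. The symbols λ, B, and R̂_n are notation chosen here for illustration; the paper's SCRC-T and SCRC-I thresholds apply this kind of rule to the selected subset and may differ in detail.

```latex
\hat{R}_n(\lambda) = \frac{1}{n}\sum_{i=1}^{n} L_i(\lambda),
\qquad
\hat{\lambda} = \inf\Bigl\{\lambda : \frac{n}{n+1}\,\hat{R}_n(\lambda) + \frac{B}{n+1} \le \alpha \Bigr\},
\qquad
\mathbb{E}\bigl[L_{n+1}(\hat{\lambda})\bigr] \le \alpha .
```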

What carries the argument

The two-stage process of first selecting confident samples via a selection rule and then applying conformal risk control on that subset, implemented in the joint-threshold SCRC-T variant and the calibration-only SCRC-I variant.
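The two-stage idea can be sketched in a few lines. The example below is a minimal illustration and not the paper's SCRC-T or SCRC-I algorithm: it assumes a fixed confidence threshold `xi` on the top-class probability as the selection rule, the score 1 − p(true label) as the nonconformity score, and plain split conformal calibration on the selected calibration points. The data are synthetic, and all function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def synthetic_batch(n, n_classes=5):
    """Hypothetical data: random softmax outputs, labels sampled from them."""
    logits = rng.normal(scale=3.0, size=(n, n_classes))
    probs = np.exp(logits - logits.max(axis=1, keepdims=True))
    probs /= probs.sum(axis=1, keepdims=True)
    u = rng.random(n)  # inverse-CDF sampling of one label per row
    labels = np.minimum((probs.cumsum(axis=1) < u[:, None]).sum(axis=1), n_classes - 1)
    return probs, labels

def select_confident(probs, xi):
    """Stage 1 (assumed rule): keep samples whose top probability is at least xi."""
    return probs.max(axis=1) >= xi

def conformal_quantile(scores, alpha):
    """Order statistic of rank ceil((n + 1)(1 - alpha)); infinite if n is too small."""
    n = len(scores)
    k = int(np.ceil((n + 1) * (1 - alpha)))
    return np.inf if k > n else np.sort(scores)[k - 1]

xi, alpha = 0.7, 0.1
cal_probs, cal_labels = synthetic_batch(2000)
test_probs, test_labels = synthetic_batch(2000)

cal_keep = select_confident(cal_probs, xi)
test_keep = select_confident(test_probs, xi)

# Stage 2: calibrate only on the selected calibration points
sel_cal_probs = cal_probs[cal_keep]
cal_scores = 1.0 - sel_cal_probs[np.arange(len(sel_cal_probs)), cal_labels[cal_keep]]
qhat = conformal_quantile(cal_scores, alpha)

# Prediction sets for selected test points: all labels with score <= qhat
sel_test_probs = test_probs[test_keep]
sets = (1.0 - sel_test_probs) <= qhat
covered = sets[np.arange(len(sets)), test_labels[test_keep]]

print("selected fraction:", test_keep.mean())
print("coverage on selected:", covered.mean())
print("mean set size on selected:", sets.sum(axis=1).mean())
```

A fixed threshold is the easy case, because selected calibration and test points then remain exchangeable with each other. A threshold tuned on the data is where the paper's joint (SCRC-T) or PAC-style (SCRC-I) treatment becomes relevant.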

If this is right

Prediction sets become more compact than those produced by standard conformal prediction while still meeting coverage targets.
SCRC-T delivers exact finite-sample coverage guarantees through joint threshold computation over calibration and test samples.
SCRC-I achieves similar performance with PAC-style probabilistic guarantees and lower computational cost.
Both methods maintain the desired risk levels on the selected subset across tested datasets.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The approach could support real-time deployment in domains like medical diagnostics by reducing set sizes without losing reliability.
Different selection criteria might be tested to see how they trade off set size against guarantee tightness.
The framework might combine with other uncertainty methods to handle structured outputs such as sequences or graphs.

Load-bearing premise

The selection of confident samples preserves the exchangeability properties needed for the conformal guarantees to hold on the selected subset.

What would settle it

Finding that empirical coverage on the selected test samples drops below the target level in experiments where the selection rule introduces dependence that violates exchangeability between calibration and test points.
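A check of this kind could be set up as a repeated-trial harness that measures coverage on the selected test points and reports a standard error. The sketch below is an assumption-laden illustration: it uses synthetic data, a fixed selection threshold `xi`, and the score 1 − p(true label), none of which are taken from the paper. To probe the failure mode described above, the selection line would be replaced by a data-dependent rule.

```python
import numpy as np

def one_trial(rng, n_cal=500, n_test=500, n_classes=5, xi=0.7, alpha=0.1):
    """Coverage on selected test points for one random calibration/test draw."""
    def draw(n):
        logits = rng.normal(scale=3.0, size=(n, n_classes))
        probs = np.exp(logits - logits.max(axis=1, keepdims=True))
        probs /= probs.sum(axis=1, keepdims=True)
        u = rng.random(n)  # inverse-CDF sampling of the true label
        labels = np.minimum((probs.cumsum(axis=1) < u[:, None]).sum(axis=1), n_classes - 1)
        true_p = probs[np.arange(n), labels]
        return probs.max(axis=1), 1.0 - true_p  # (confidence, nonconformity score)

    cal_conf, cal_score = draw(n_cal)
    test_conf, test_score = draw(n_test)
    cal_sel = cal_score[cal_conf >= xi]    # swap in a data-dependent rule here
    test_sel = test_score[test_conf >= xi]
    if len(cal_sel) == 0 or len(test_sel) == 0:
        return np.nan
    k = int(np.ceil((len(cal_sel) + 1) * (1 - alpha)))
    qhat = np.inf if k > len(cal_sel) else np.sort(cal_sel)[k - 1]
    # The true label is in the set exactly when its score is at most qhat
    return float(np.mean(test_sel <= qhat))

rng = np.random.default_rng(1)
alpha = 0.1
cov = np.array([one_trial(rng, alpha=alpha) for _ in range(200)])
cov = cov[~np.isnan(cov)]
mean = cov.mean()
stderr = cov.std(ddof=1) / np.sqrt(len(cov))
print(f"coverage on selected: {mean:.3f} +/- {stderr:.3f} (target {1 - alpha:.2f})")
print("evidence of under-coverage:", mean + 3 * stderr < 1 - alpha)
```

The three-standard-error rule in the last line is an arbitrary convention for this sketch; a formal test would fix the number of trials and the significance level in advance.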

Figures

Figures reproduced from arXiv: 2512.12844 by Wenge Guo, Yunpeng Xu, Zhi Wei.

**Figure 2.** CIFAR-10: Risk control at different values of α with ξ = 0.7.

**Figure 3.** CIFAR-10: Comparison of different selection score functions.

**Figure 4.** DR Detection: Coverage control at different values of

**Figure 5.** DR Detection: Risk control at different values of

**Figure 6.** DR Detection: Comparison of different selection score functions.

Original abstract

Reliable uncertainty quantification is essential for deploying machine learning systems in high-stakes domains. Conformal prediction provides distribution-free coverage guarantees but often produces overly large prediction sets, limiting its practical utility. To address this issue, we propose Selective Conformal Risk Control (SCRC), a unified framework that integrates conformal prediction with selective classification. The framework formulates uncertainty control as a two-stage problem: the first stage selects confident samples for prediction, and the second stage applies conformal risk control on the selected subset to construct calibrated prediction sets. We develop two algorithms under this framework. The first, SCRC-T, preserves exchangeability by computing thresholds jointly over calibration and test samples, offering exact finite-sample guarantees. The second, SCRC-I, is a calibration-only variant that provides PAC-style probabilistic guarantees while being more computationally efficient. Experiments on two public datasets show that both methods achieve the target coverage and risk levels, with nearly identical performance, while SCRC-I exhibits slightly more conservative risk control but superior computational practicality. Our results demonstrate that selective conformal risk control offers an effective and efficient path toward compact, reliable uncertainty quantification.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper gives a two-stage selective conformal risk control setup with one exact-guarantee algorithm and one faster PAC version, but the exchangeability step for exact coverage looks unproven.

The main takeaway is a framework that first selects confident points via some score threshold and then runs conformal risk control only on that subset to get smaller prediction sets. SCRC-T computes thresholds jointly on calibration and test points to claim exact finite-sample coverage, while SCRC-I works from calibration data alone for better speed and PAC-style bounds. Experiments on two public datasets report that both hit the target coverage and risk levels with similar performance.

Referee Report

2 major / 2 minor

Summary. The paper proposes Selective Conformal Risk Control (SCRC), a two-stage framework that first selects confident samples via a data-dependent rule and then applies conformal risk control on the selected subset to produce calibrated prediction sets with controlled risk. It introduces SCRC-T, which jointly computes thresholds over the combined calibration and test pool to claim exact finite-sample coverage guarantees, and SCRC-I, a calibration-only variant providing PAC-style probabilistic guarantees with improved efficiency. Experiments on two public datasets are reported to achieve the target coverage and risk levels for both methods.

Significance. If the coverage claims hold, the work offers a principled route to smaller prediction sets than standard conformal methods by incorporating selective classification, which could improve practical utility in high-stakes settings without sacrificing distribution-free guarantees. The joint-threshold approach in SCRC-T, if rigorously justified, would be a notable technical contribution over naive post-selection conformal methods.

major comments (2)

[SCRC-T algorithm and guarantee statement] The exact finite-sample coverage claim for SCRC-T rests on the assertion that joint threshold computation over the full calibration+test pool preserves exchangeability for the data-dependent selected subset. No derivation is supplied showing that the coverage inequality continues to hold after selection when the selection rule depends on the same scores or nonconformity values used for thresholding; standard conformal arguments apply to the full exchangeable pool but do not automatically transfer to the induced random subset. A detailed proof or counter-example analysis is required in the SCRC-T section.
[Experiments] The experimental section reports that both methods achieve target coverage and risk levels on two public datasets but supplies no error bars, no description of data splits or exclusion rules, and no comparison against standard conformal baselines or selective classification methods without conformal control. These omissions make it impossible to assess whether the observed performance supports the claimed advantage in compactness while preserving guarantees.
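The baseline comparison requested here could look roughly like the following sketch, which contrasts mean prediction-set size for standard split conformal prediction on all samples with a select-then-calibrate variant on confident samples only. Everything in it is an assumption made for illustration: the synthetic data, the fixed threshold `xi`, the score 1 − p(label), and the helper names. It does not reproduce the paper's experiments.

```python
import numpy as np

rng = np.random.default_rng(2)

def draw(n, n_classes=5):
    """Hypothetical data: random softmax outputs, labels sampled from them."""
    logits = rng.normal(scale=3.0, size=(n, n_classes))
    probs = np.exp(logits - logits.max(axis=1, keepdims=True))
    probs /= probs.sum(axis=1, keepdims=True)
    u = rng.random(n)
    labels = np.minimum((probs.cumsum(axis=1) < u[:, None]).sum(axis=1), n_classes - 1)
    return probs, labels

def qhat(probs, labels, alpha):
    """Split conformal threshold on scores 1 - p(true label)."""
    scores = np.sort(1.0 - probs[np.arange(len(labels)), labels])
    k = int(np.ceil((len(scores) + 1) * (1 - alpha)))
    return np.inf if k > len(scores) else scores[k - 1]

def mean_set_size(probs, q):
    """Average number of labels whose score 1 - p is at most q."""
    return ((1.0 - probs) <= q).sum(axis=1).mean()

alpha, xi = 0.1, 0.7
cal_p, cal_y = draw(5000)
test_p, _ = draw(5000)

# Baseline: calibrate and predict on every sample
size_all = mean_set_size(test_p, qhat(cal_p, cal_y, alpha))

# Selective variant: calibrate and predict on confident samples only
cal_keep = cal_p.max(axis=1) >= xi
test_keep = test_p.max(axis=1) >= xi
size_sel = mean_set_size(test_p[test_keep], qhat(cal_p[cal_keep], cal_y[cal_keep], alpha))

print(f"standard conformal, all samples: mean set size {size_all:.2f}")
print(f"select-then-calibrate, {test_keep.mean():.0%} selected: mean set size {size_sel:.2f}")
```

The two numbers describe different populations (all samples versus selected samples), so a fair report would also state the selected fraction and what happens to the rejected samples.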

minor comments (2)

[Method overview] Notation for the selection function and the joint threshold computation should be introduced with explicit definitions before the guarantee statements to improve readability.
[Abstract and §1] The abstract and introduction should clarify whether the risk control is on the selected subset only or includes a risk term for the rejected samples.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. We address each major comment below and will revise the paper to incorporate the requested clarifications and additions.

Referee: [SCRC-T algorithm and guarantee statement] The exact finite-sample coverage claim for SCRC-T rests on the assertion that joint threshold computation over the full calibration+test pool preserves exchangeability for the data-dependent selected subset. No derivation is supplied showing that the coverage inequality continues to hold after selection when the selection rule depends on the same scores or nonconformity values used for thresholding; standard conformal arguments apply to the full exchangeable pool but do not automatically transfer to the induced random subset. A detailed proof or counter-example analysis is required in the SCRC-T section.

Authors: We agree that the current manuscript states the exact finite-sample coverage for SCRC-T without supplying a full derivation. In the revised version we will add a rigorous proof in the SCRC-T section. The proof will show that joint threshold selection over the combined calibration and test pool preserves the exchangeability of the selected subset even when the selection rule is a function of the same nonconformity scores, by explicitly tracking the dependence and verifying that the coverage inequality still holds via the standard conformal argument applied to the augmented pool. We will also include a brief discussion of edge cases and any necessary counter-example checks. revision: yes
Referee: [Experiments] The experimental section reports that both methods achieve target coverage and risk levels on two public datasets but supplies no error bars, no description of data splits or exclusion rules, and no comparison against standard conformal baselines or selective classification methods without conformal control. These omissions make it impossible to assess whether the observed performance supports the claimed advantage in compactness while preserving guarantees.

Authors: We acknowledge these omissions limit the interpretability of the results. In the revision we will add error bars from repeated runs with different random seeds, provide explicit descriptions of the data splits and any exclusion rules applied, and include direct comparisons against standard conformal prediction baselines as well as selective classification methods that do not use conformal risk control. These additions will allow readers to evaluate both the compactness gains and the empirical validity of the coverage and risk guarantees. revision: yes

Circularity Check

0 steps flagged

No significant circularity; guarantees rest on standard exchangeability without reduction to fitted inputs or self-citations

The paper's central claim is that SCRC-T achieves exact finite-sample coverage by jointly computing thresholds over the combined calibration and test pool before selection, thereby preserving exchangeability. This is presented as a direct consequence of the standard conformal prediction exchangeability assumption applied to the joint set, rather than any self-definitional loop, fitted parameter renamed as prediction, or load-bearing self-citation. No equations in the provided abstract or description reduce the coverage guarantee to a quantity defined by the selection rule itself. SCRC-I is explicitly distinguished as providing only PAC-style bounds. The derivation chain therefore remains self-contained against external conformal theory benchmarks and does not trigger any of the enumerated circularity patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the standard conformal-prediction assumption of exchangeability and on the definition of risk-control levels; no free parameters, new entities, or ad-hoc axioms are introduced in the abstract.

axioms (1)

Domain assumption: data samples are exchangeable.
Invoked to obtain finite-sample guarantees for SCRC-T and PAC-style guarantees for SCRC-I.
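As a reminder of what this axiom buys in the plain, non-selective case, the standard split conformal guarantee can be stated as below. The notation (score function s, calibration scores s_i, threshold q̂) is chosen here for illustration and is not taken from the paper. The statement assumes the n calibration scores and the test score are exchangeable and that ⌈(n+1)(1−α)⌉ ≤ n.

```latex
s_i = s(X_i, Y_i), \qquad
\hat{q} = s_{(\lceil (n+1)(1-\alpha) \rceil)}, \qquad
C(x) = \{\, y : s(x, y) \le \hat{q} \,\}
\;\Longrightarrow\;
\mathbb{P}\bigl( Y_{n+1} \in C(X_{n+1}) \bigr) \ge 1 - \alpha .
```

Here s_{(k)} denotes the k-th smallest calibration score. The open question raised in the referee report is whether the same argument still applies once both calibration and test points are filtered by a data-dependent selection rule.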

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean: absolute_floor_iff_bare_distinguishability
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: Lemma 1. Suppose (X_1, Y_1), …, (X_{n+1}, Y_{n+1}) are exchangeable, and let I be a symmetric selection rule… Then, conditional on E_I, the subcollection {(X_i, Y_i)}_{i∈I} is exchangeable.

IndisputableMonolith/Cost/FunctionalEquation.lean: washburn_uniqueness_aczel
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: SCRC-T preserves exchangeability by computing thresholds jointly over calibration and test samples, offering exact finite-sample guarantees.

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Forward citations

Cited by 3 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Conformal Selective Acting: Anytime-Valid Risk Control for RLVR-Trained LLMs
cs.LG · 2026-05 · conditional · novelty 8.0
Conformal Selective Acting (CSA) fills a gap in conformal methods by providing per-round, pathwise-valid selective risk bounds for adaptive RLVR LLM streams under predictable updates and isotonic calibration.

ST-BCP: Tightening Coverage Bound for Backward Conformal Prediction via Non-Conformity Score Transformation
stat.ML · 2026-02 · conditional · novelty 7.0
ST-BCP tightens the coverage bound in Backward Conformal Prediction by applying a computable data-dependent transformation to nonconformity scores, reducing the average gap from 4.20% to 1.12% on benchmarks while prov...

Explainable Wastewater Digital Twins: Adaptive Context-Conditioned Structured Simulators with Self-Falsifying Decision Support
cs.AI · 2026-05 · unverdicted · novelty 5.0
CCSS-IX is a context-conditioned structured simulator for wastewater digital twins that uses adaptive expert mixing and self-falsifying conformal decision rules to reduce unsafe actions while maintaining low predictio...

Reference graph

Works this paper leans on

31 extracted references · 31 canonical work pages · cited by 3 Pith papers · 1 internal anchor

[1] Anastasios N. Angelopoulos et al. “Conformal Risk Control”. In: ICLR (2024).
[2] Anastasios N. Angelopoulos and Stephen Bates. “A Gentle Introduction to Conformal Prediction and Distribution-Free Uncertainty Quantification”. arXiv:2107.07511 (2021).
[3] Yajie Bao et al. “Selective Conformal Inference with False Coverage-Statement Rate Control”. In: Biometrika (2024).
[4] Peter L. Bartlett and Marten H. Wegkamp. “Classification with a Reject Option using a Hinge Loss”. In: Journal of Machine Learning Research 9.59 (2008), pp. 1823–1840.
[5] Charles Blundell et al. “Weight Uncertainty in Neural Networks”. In: Proceedings of the 32nd International Conference on Machine Learning. 2015.
[6] C. K. Chow. “An optimum character recognition system using decision functions”. In: IRE Transactions on Electronic Computers (1957).
[7] Adam Fisch, Tommi Jaakkola, and Regina Barzilay. “Calibrated Selective Classification”. https://arxiv.org/abs/2208.12084 (2022).
[8] Adam Fisch, Tommi Jaakkola, and Regina Barzilay. “Conformal Prediction Sets with Limited False Positives”. In: Proceedings of the 39th International Conference on Machine Learning. 2022.
[9] Vojtech Franc, Daniel Prusa, and Vaclav Voracek. “Optimal Strategies for Reject Option Classifiers”. In: Journal of Machine Learning Research 24 (2023), pp. 1–49.
[10] Yarin Gal and Zoubin Ghahramani. “Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning”. In: Proceedings of the 33rd International Conference on Machine Learning. 2016.
[11] Ulysse Gazin et al. “Selecting Informative Conformal Prediction Sets with False Coverage Rate Control”. arXiv preprint arXiv:2403.12295 (2024).
[12] Yonatan Geifman and Ran El-Yaniv. “Selective classification for deep neural networks”. In: Proceedings of the 31st International Conference on Neural Information Processing Systems (2017).
[13] Yonatan Geifman and Ran El-Yaniv. “SelectiveNet: A Deep Neural Network with an Integrated Reject Option”. In: Proceedings of the 36th International Conference on Machine Learning (2019).
[14] Chuan Guo et al. “On Calibration of Modern Neural Networks”. In: Proceedings of the 34th International Conference on Machine Learning. 2017.
[15] Kaiming He et al. “Deep Residual Learning for Image Recognition”. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2016.
[16] Martin E. Hellman. “The Nearest Neighbor Classification Rule with a Reject Option”. In: IEEE Transactions on Systems Science and Cybernetics (1970).
[17] Kilian Hendrickx et al. “Machine Learning with a Reject Option: A Survey”. In: Machine Learning 113 (2024), pp. 3073–3110.
[18] Dan Hendrycks and Kevin Gimpel. “A Baseline for Detecting Misclassified and Out-of-Distribution Examples in Neural Networks”. In: Proceedings of International Conference on Learning Representations (2017).
[19] Volodymyr Kuleshov, Nathan Fenner, and Stefano Ermon. “Accurate Uncertainties for Deep Learning Using Calibrated Regression”. In: Proceedings of the 35th International Conference on Machine Learning. 2018.
[20] Jing Lei et al. “Distribution-free Predictive Inference for Regression”. In: Journal of the American Statistical Association (2018).
[21] Weitang Liu et al. “Energy-based Out-of-distribution Detection”. In: 34th Conference on Neural Information Processing Systems (2020).
[22] Harris Papadopoulos et al. “Inductive Confidence Machines for Regression”. In: ECML. 2002.
[23] Andrea Pugnana and Salvatore Ruggieri. “AUC-based Selective Classification”. 2023.
[24] Yaniv Romano, Evan Patterson, and Emmanuel J. Candès. “Conformalized Quantile Regression”. In: Advances in Neural Information Processing Systems. 2019.
[25] Glenn Shafer and Vladimir Vovk. “A Tutorial on Conformal Prediction”. In: Journal of Machine Learning Research 9 (2008), pp. 371–421.
[26] Ryan J. Tibshirani et al. “Conformal Prediction Under Covariate Shift”. In: Advances in Neural Information Processing Systems (NeurIPS). 2019.
[27] Juozas Vaicenavicius et al. “Evaluating Model Calibration in Classification”. In: Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics. 2019.
[28] Vladimir Vovk, Alex Gammerman, and Craig Saunders. “Machine-learning applications of algorithmic randomness”. In: Sixteenth International Conference on Machine Learning (ICML) (1999).
[29] Vladimir Vovk, Alexander Gammerman, and Glenn Shafer. Algorithmic learning in a random world. Vol. 29. Springer, 2005.
[30] Yunpeng Xu, Wenge Guo, and Zhi Wei. “Conformal Risk Control for Ordinal Classification”. In: Proceedings of the 39th Conference on Uncertainty in Artificial Intelligence (UAI) (2023).
[31] Yunpeng Xu et al. “Two-stage Risk Control with Application to Ranked Retrieval”. In: Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence (IJCAI) (2025).
