Journal of Machine Learning Research
5 Pith papers cite this work.
citing papers explorer
- A Unified Theory of Conditional Coverage in Conformal Prediction with Applications
Non-asymptotic decomposition of conditional miscoverage in conformal prediction into score-estimation error, finite-sample calibration error, and intrinsic conditional-mismatch error, with guidance for model selection and extensions to covariate shift and structured data.
- Statistical Consistency and Generalization of Contrastive Representation Learning
The paper proves statistical consistency of contrastive loss to optimal ranking via an AUC criterion and derives generalization bounds O(1/m + 1/sqrt(n)) for supervised and O(1/sqrt(m) + 1/sqrt(n)) for self-supervised CRL that explain benefits of large negative sets.
- Structure from Strategic Interaction & Uncertainty: Risk Sensitive Games for Robust Preference Learning
Risk-sensitive preference games using convex risk measures produce policies that are robust across data strata and match or exceed standard Nash learning performance without added cost.
- Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models
SPIN lets a weak LLM become strong by generating its own training data from previous iterations of itself and training to prefer human-annotated responses over its own outputs; on benchmarks it outperforms even DPO trained with extra GPT-4 preference data.
- A Survey on Data-Dependent Worst-Case Generalization Bounds
The survey unifies extensions of PAC-Bayesian theory to data-dependent sets, geometric and topological complexity measures of optimization trajectories, and stability replacements for information terms into one template inequality with comparative evaluation.