
arxiv: 2501.02703 · v2 · submitted 2025-01-06 · 📊 stat.ME

Recognition: unknown

Full-conformal novelty detection

classification 📊 stat.ME
keywords: novelty, detection, e-values, data, dataset, reference, conformal, control

This paper presents a powerful methodology for flexible full-data nonparametric novelty detection that offers distribution-free false discovery rate (FDR) control guarantees. Building on the full conformal inference framework and the concept of e-values, we introduce full conformal e-values to quantify evidence for novelty relative to a given reference dataset. These e-values are then utilized by carefully crafted multiple testing procedures to identify a set of novel units out-of-sample with provable finite-sample FDR control. We showcase several instantiations of e-values, including those which employ a data-driven model selection strategy to amplify power. Furthermore, our framework is extended to address distribution shift, accommodating scenarios where novelty detection must be performed on data drawn from a shifted distribution relative to the reference dataset. In all settings, our method can perform powerfully -- outperforming existing novelty detection methods -- even with limited amounts of reference data; this is illustrated by empirical evaluations on synthetic data and an application to a malicious LLM prompts dataset.
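To make the step from e-values to a rejection set concrete, here is a minimal sketch of the generic base e-BH procedure, a standard way to turn e-values into discoveries with FDR control at level alpha. This is an illustration only: the abstract does not specify which multiple testing procedures the paper uses, and the function name `e_bh` and the toy e-values below are hypothetical, not taken from the paper.

```python
def e_bh(e_values, alpha):
    """Base e-BH sketch (an assumption, not necessarily the paper's procedure).

    Rejects the k* hypotheses with the largest e-values, where
    k* = max{k : k * e_(k) >= m / alpha} and e_(k) is the k-th largest e-value.
    Returns the sorted indices of the rejected hypotheses.
    """
    m = len(e_values)
    # Indices ordered from largest to smallest e-value.
    order = sorted(range(m), key=lambda i: e_values[i], reverse=True)
    k_star = 0
    for k, idx in enumerate(order, start=1):
        # Keep the largest k whose k-th largest e-value clears the threshold.
        if k * e_values[idx] >= m / alpha:
            k_star = k
    return sorted(order[:k_star])


# Toy e-values for five test units (made-up numbers for illustration).
toy_e_values = [60.0, 0.5, 30.0, 1.2, 8.0]
flagged = e_bh(toy_e_values, alpha=0.1)
print(flagged)
```

In this sketch a large e-value is strong evidence that the unit is novel relative to the reference data, and the threshold m / alpha scales with the number of test units m.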

This paper has not been read by Pith yet.


Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Inference for Clustering: Conformal Sets for Cluster Labels

    stat.ME 2026-04 unverdicted novelty 7.0

    Split conformal clustering with stochastic labels provides finite-sample marginal coverage guarantees for cluster label confidence sets, controlled by soft-label consistency and replace-one stability of the clustering...