Kurtosis-Guided Denoising Score Matching for Tabular Anomaly Detection
Pith reviewed 2026-05-11 00:49 UTC · model grok-4.3
The pith
Kurtosis of marginal distributions guides noise scaling to make single-scale denoising score matching a strong tabular anomaly detector.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Denoising score matching learns the gradient of the log-density by training on noise-corrupted samples, so that the score magnitude at a test point can serve as an anomaly signal. The authors introduce kurtosis-based noise scaling (K-DSM), a per-feature scheme that sets perturbation levels from the shape of each marginal distribution. This choice improves both coverage of low-density regions and precision in high-density regions. Contrary to prior claims that multi-scale or noise-conditioned training is required, a carefully trained single-scale model already yields a strong anomaly detector. On standard tabular benchmarks K-DSM reaches state-of-the-art performance in the semi-supervised setting; combined with a lightweight EMA-teacher filtering rule that removes low-density training points before each gradient step, it also performs strongly in the fully unsupervised (contaminated) setting.
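The first two sentences of the claim admit a compact numeric illustration. For 1-D standard-Gaussian data corrupted with Gaussian noise of scale sigma, the noise-smoothed density is N(0, 1 + sigma^2), so the score a DSM model should learn is known in closed form; fitting a one-parameter linear score model to the standard DSM regression target recovers it, and its magnitude is larger at points far from the data. The toy setup below (Gaussian data, linear model) is ours for illustration, not the paper's.

```python
import numpy as np

sigma = 0.5
rng = np.random.default_rng(3)
x = rng.normal(size=200_000)          # clean data ~ N(0, 1)
eps = rng.normal(size=200_000)
x_tilde = x + sigma * eps             # noise-corrupted samples

# DSM regression target for each corrupted sample: -(x_tilde - x) / sigma^2.
target = -(x_tilde - x) / sigma ** 2

# Least-squares fit of a linear score model s(x) = w * x to that target.
w = (x_tilde * target).mean() / (x_tilde ** 2).mean()

# The smoothed density N(0, 1 + sigma^2) has score s(x) = -x / (1 + sigma^2),
# so the fitted slope should land near -1 / 1.25 = -0.8.
analytic = -1.0 / (1.0 + sigma ** 2)

# Score magnitude as an anomaly signal: larger at a point far from the data.
inlier, outlier = 0.3, 5.0
```

A neural network replaces the linear model in practice, but the regression target and the "large score magnitude means low consistency" readout are the same.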
What carries the argument
Kurtosis-based noise scaling (K-DSM): a per-feature scheme that derives perturbation levels from the kurtosis of each marginal distribution to adapt noise without extra model complexity or post-hoc tuning.
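As a rough sketch of how such a scheme might look, the code below computes per-feature excess kurtosis and maps it to a noise scale. The `log1p` mapping, the `base_sigma` default, and the direction of the adjustment (more noise for heavier tails, to cover their sparse regions) are illustrative assumptions, not the paper's exact rule.

```python
import numpy as np

def excess_kurtosis(x):
    """Sample excess kurtosis of a 1-D array (~0 for Gaussian data)."""
    x = np.asarray(x, dtype=float)
    z = x - x.mean()
    return (z ** 4).mean() / (z ** 2).mean() ** 2 - 3.0

def kurtosis_noise_scales(X, base_sigma=0.1):
    """Map each feature's excess kurtosis to a perturbation scale.
    Heavier-tailed features get larger noise here; the functional
    form is a placeholder, not the paper's mapping."""
    k = np.array([excess_kurtosis(X[:, j]) for j in range(X.shape[1])])
    k = np.clip(k, 0.0, None)  # treat light-tailed features as Gaussian-like
    return base_sigma * (1.0 + np.log1p(k))

rng = np.random.default_rng(0)
X = np.column_stack([
    rng.normal(size=5000),            # near-Gaussian marginal
    rng.standard_t(df=5, size=5000),  # heavy-tailed marginal (excess kurtosis 6)
])
sigmas = kurtosis_noise_scales(X)     # second feature gets the larger scale
```

A detector built on these scales would perturb feature j with noise of scale `sigmas[j]` during single-scale DSM training.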
If this is right
- Single-scale denoising score matching becomes competitive with multi-scale methods once noise levels are set from marginal kurtosis.
- K-DSM achieves state-of-the-art performance on standard tabular anomaly benchmarks in the semi-supervised setting.
- Adding a lightweight EMA-teacher rule to filter low-density training points before each gradient step yields strong results even when training data is contaminated with anomalies.
- Data-adaptive noise scaling reduces reliance on hyperparameter tuning for anomaly detection tasks.
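The EMA-teacher filtering rule in the list above can be sketched in a few lines. The momentum value, the keep fraction, and the distance-from-median stand-in for the teacher's anomaly score are assumptions for illustration; per the abstract, the actual rule uses the teacher network's score to drop low-density points before each gradient step.

```python
import numpy as np

def ema_update(teacher, student, momentum=0.99):
    """EMA teacher parameters: theta_t <- m * theta_t + (1 - m) * theta_s."""
    return momentum * teacher + (1.0 - momentum) * student

def filter_batch(batch, anomaly_scores, keep_frac=0.9):
    """Keep the keep_frac of points the teacher deems most normal
    (lowest anomaly score) before taking a gradient step."""
    k = max(1, int(keep_frac * len(batch)))
    keep = np.argsort(anomaly_scores)[:k]
    return batch[keep]

rng = np.random.default_rng(1)
batch = rng.normal(size=(100, 2))
batch[:5] += 8.0  # contaminate the batch with a few far-away points
# stand-in teacher score: distance from the batch median (the paper
# would use the teacher's score magnitude instead)
scores = np.linalg.norm(batch - np.median(batch, axis=0), axis=1)
clean = filter_batch(batch, scores, keep_frac=0.9)  # contaminants dropped
```

The student then takes its gradient step on `clean`, and `ema_update` refreshes the teacher afterward.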
Where Pith is reading between the lines
- Similar marginal-shape statistics could guide noise selection in other score-based models applied to tabular or low-dimensional data.
- The finding that single-scale training suffices may encourage simpler architectures in score-based anomaly detection when adaptive scaling is available.
- Practitioners facing contaminated training sets could test the EMA-teacher filter as a lightweight way to bootstrap unsupervised performance.
Load-bearing premise
The kurtosis of each marginal distribution provides a reliable validation-free guide for choosing perturbation scales that improve score estimates in both low- and high-density regions.
What would settle it
A tabular dataset on which kurtosis-derived scales produce lower anomaly-detection AUROC than a single fixed scale or a validation-tuned scale would directly challenge the claim that kurtosis supplies a sufficient guide.
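The head-to-head test described above needs only an AUROC harness. The rank-based implementation below is the standard Mann-Whitney form; the `np.abs(x)` scoring rule is a stand-in for a trained detector, used only to exercise the harness.

```python
import numpy as np

def auroc(scores, labels):
    """Rank-based AUROC; labels: 1 = anomaly, higher score = more anomalous."""
    order = np.argsort(scores)
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)
    pos = labels == 1
    n_pos, n_neg = pos.sum(), (~pos).sum()
    return (ranks[pos].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

rng = np.random.default_rng(2)
x = np.concatenate([rng.normal(size=300), rng.normal(loc=6.0, size=30)])
y = np.concatenate([np.zeros(300, int), np.ones(30, int)])
score_fixed = auroc(np.abs(x), y)  # one detector's AUROC on this split
```

Scoring the same split with a kurtosis-scaled detector and a fixed-scale (or validation-tuned) detector, then comparing the two AUROC values, is exactly the experiment that would settle the claim.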
Original abstract
Denoising score matching (DSM) provides a way to learn data distributions by training a neural network to recover the score function, defined as the gradient of the log density, from noise-corrupted samples. Once trained, the score magnitude at a test point reflects how consistent that point is with the learned distribution, making it a natural anomaly signal. The key practical challenge is selecting the perturbation scale: too little noise yields unstable score estimates in sparse regions, while too much erases local structure and weakens anomaly sensitivity. This is compounded by the difficulty of hyperparameter tuning when anomalies are unknown and no validation set is available. We introduce kurtosis-based noise scaling (K-DSM), a per-feature scheme that sets noise levels from the shape of each marginal distribution, improving coverage of low-density regions and precision in high-density regions without extra model complexity. Contrary to prior claims that multi-scale or noise-conditioned training is necessary, we find that a carefully trained single-scale model is already a strong anomaly detector. On standard tabular anomaly detection benchmarks, K-DSM achieves state-of-the-art performance in the semi-supervised setting. When combined with a lightweight EMA-teacher filtering rule that removes low-density training points before each gradient step, it also achieves strong performance in the fully unsupervised (contaminated) setting, suggesting that simple, data-adaptive noise scaling enables robust anomaly detection while reducing reliance on hyperparameter tuning.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces Kurtosis-Guided Denoising Score Matching (K-DSM) for tabular anomaly detection. It sets per-feature perturbation scales in DSM using the kurtosis of each marginal distribution to balance coverage of low-density regions and precision in high-density regions. The central claims are that a single-scale model suffices (contrary to prior emphasis on multi-scale training), that this yields SOTA performance on standard benchmarks in the semi-supervised setting, and that adding a lightweight EMA-teacher filtering rule enables strong results in the fully unsupervised contaminated setting while reducing hyperparameter dependence.
Significance. If the empirical claims hold, the work is significant for anomaly detection because it offers a simple, data-adaptive, validation-free method for choosing noise scales in score matching, potentially simplifying training and deployment where validation sets are unavailable. The demonstration that single-scale DSM can be competitive is a useful counterpoint to multi-scale approaches, and the EMA filtering rule provides a practical mechanism for handling contamination. Credit is due for grounding scale selection in a computable statistic (kurtosis) without introducing new model parameters or post-hoc tuning.
major comments (3)
- [§3.2] §3.2 (kurtosis-to-scale mapping): the per-feature noise variance is defined solely from marginal kurtosis statistics; the manuscript provides neither a derivation showing invariance to feature dependence nor an ablation isolating performance when correlations dominate low-density regions, which is load-bearing for the claim that marginal kurtosis reliably improves joint score estimation quality over fixed or variance-based scales.
- [§5] §5 (experimental results): the SOTA claims in both semi-supervised and unsupervised settings rest on benchmark comparisons, yet the paper does not report ablations that remove the kurtosis component while keeping architecture and EMA filtering fixed, making it impossible to attribute gains specifically to the proposed scaling rather than other design choices.
- [§4.1] §4.1 (EMA-teacher filtering): while the rule is described as lightweight, the threshold for removing low-density points is not derived from the same kurtosis principle and appears to require its own hyperparameter; this partially undercuts the paper's emphasis on reduced tuning dependence.
minor comments (2)
- Notation for the score network and perturbation process could be standardized across sections to avoid minor inconsistencies in variable names.
- Figure captions would benefit from explicit mention of which datasets and metrics are shown, improving readability for readers scanning the experimental section.
Simulated Author's Rebuttal
We thank the referee for the careful reading and constructive feedback on our work. The comments highlight important aspects of our method's justification and empirical validation. We address each major comment below and indicate the revisions we will make to strengthen the manuscript.
Point-by-point responses
Referee: [§3.2] §3.2 (kurtosis-to-scale mapping): the per-feature noise variance is defined solely from marginal kurtosis statistics; the manuscript provides neither a derivation showing invariance to feature dependence nor an ablation isolating performance when correlations dominate low-density regions, which is load-bearing for the claim that marginal kurtosis reliably improves joint score estimation quality over fixed or variance-based scales.
Authors: We agree that the kurtosis-to-scale mapping is a heuristic grounded in marginal statistics rather than a theoretically invariant construction. The neural network still estimates the joint score on the full feature vector, and the per-feature scales are intended to provide adaptive coverage of heavy-tailed marginals without adding model parameters or conditioning. While we do not claim invariance to feature dependence, we will add a dedicated discussion in §3.2 clarifying the heuristic motivation and include an ablation that compares K-DSM against fixed-scale and variance-based variants on datasets with varying correlation strengths to isolate the contribution when dependence is strong. revision: yes
Referee: [§5] §5 (experimental results): the SOTA claims in both semi-supervised and unsupervised settings rest on benchmark comparisons, yet the paper does not report ablations that remove the kurtosis component while keeping architecture and EMA filtering fixed, making it impossible to attribute gains specifically to the proposed scaling rather than other design choices.
Authors: We acknowledge the need for clearer isolation of the kurtosis-guided scaling. The current experiments compare against external baselines but do not include an internal ablation that disables the kurtosis component while retaining the same architecture and EMA rule. We will add this ablation to §5 (and the corresponding tables) in the revision, reporting performance with fixed scales and with variance-based scales under identical training conditions to better attribute the observed gains. revision: yes
Referee: [§4.1] §4.1 (EMA-teacher filtering): while the rule is described as lightweight, the threshold for removing low-density points is not derived from the same kurtosis principle and appears to require its own hyperparameter; this partially undercuts the paper's emphasis on reduced tuning dependence.
Authors: The EMA-teacher filtering is introduced as a practical, optional mechanism specifically for the contaminated unsupervised setting to mitigate the effect of anomalies in the training data. The threshold is a single scalar chosen from a narrow default range based on empirical stability rather than extensive search, and it operates independently of the kurtosis scale selection. We will revise §4.1 to explicitly state the default threshold value, report sensitivity analysis showing limited performance variation across a small interval, and clarify that the overall method still avoids validation-set tuning for the core noise scales while using this lightweight rule only when contamination is present. revision: partial
Circularity Check
No circularity: noise scales are derived directly from data statistics, independent of the model.
full rationale
The paper's central construction computes per-feature perturbation scales from marginal kurtosis of the input data distribution before any model training occurs. This choice is presented as a fixed, validation-free preprocessing step rather than a fitted parameter or self-referential prediction. The subsequent DSM training and anomaly scoring then proceed from these externally determined scales, with performance claims resting on benchmark comparisons rather than any reduction of the target result to the input by definition or self-citation chain. No load-bearing equation equates a derived quantity back to its own construction, and no uniqueness theorem or ansatz is imported from prior author work to force the method.
Axiom & Free-Parameter Ledger
axioms (2)
- Domain assumption: the magnitude of the estimated score function at a point reflects its consistency with the learned data distribution.
- Ad hoc to paper: kurtosis of marginal distributions can be used to set perturbation scales that balance coverage and precision.