Kitchen Sink Anomaly Detection
Pith reviewed 2026-05-09 23:29 UTC · model grok-4.3
The pith
A combined kitchen sink observable set of Energy Flow Polynomials and subjettiness variables outperforms standard baselines in sensitivity to a wide range of resonant signals, with new public benchmarks released and an attribute bagging variant reducing training cost.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We find that our kitchen sink approach is the most sensitive to a broad range of signal types.
Load-bearing premise
That the chosen high-level observables remain sufficiently agnostic and performant when applied to real detector data and backgrounds rather than idealized simulations.
Original abstract
An enormous amount of R&D effort has resulted in many new resonant anomaly detection methods being proposed in recent years. However, the vast majority of previous R&D studies have suffered from two limitations: they have focused on a very small set of simulated signal benchmark models; and they have either used small sets of carefully crafted high-level jet substructure observables, which can be highly performant but are prone to model dependence, or the full collider event phase space, which is more agnostic but suffers from reduced sensitivity. In this work, we address both limitations: we formulate a number of new simulated signal benchmarks, which we make publicly available in a format fully compatible with the LHCO R&D benchmark; and we explore a high-level, yet highly agnostic, observable set consisting of Energy Flow Polynomials in addition to the usual subjettiness variables. We evaluate this "kitchen sink" observable set for both an idealized anomaly detector and the CWoLa hunting task, along with three baseline observable sets (the Baseline LHC Olympics set, subjettiness observables, and Energy Flow Polynomials). We find that our kitchen sink approach is the most sensitive to a broad range of signal types. Furthermore, we show that an attribute bagging variant, in which each ensemble member is trained on a random subset of substructure observables, yields comparable anomaly detection performance while significantly reducing training cost.
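The attribute bagging idea in the abstract (each ensemble member sees only a random subset of observables) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the function names, the toy data, and especially the stand-in linear scorer, which is not the classifier used in the paper.

```python
import random
from statistics import fmean, pvariance


def fit_member(X, y, cols):
    """Fit a toy linear scorer using only the feature columns in `cols`.

    Stand-in classifier (an assumption, not the paper's model): each feature
    gets weight (mean_signal - mean_background) / variance.
    """
    params = []
    for j in cols:
        x0 = [row[j] for row, label in zip(X, y) if label == 0]
        x1 = [row[j] for row, label in zip(X, y) if label == 1]
        mu0, mu1 = fmean(x0), fmean(x1)
        var = pvariance([row[j] for row in X]) + 1e-12  # avoid division by zero
        params.append((j, (mu1 - mu0) / var, 0.5 * (mu0 + mu1)))
    return params


def member_score(params, row):
    """Higher score means more signal-like, using this member's features only."""
    return sum(w * (row[j] - mid) for j, w, mid in params)


def fit_attribute_bagging(X, y, n_members, subset_size, seed=0):
    """Train n_members scorers, each on its own random feature subset."""
    rng = random.Random(seed)
    n_features = len(X[0])
    members = []
    for _ in range(n_members):
        cols = sorted(rng.sample(range(n_features), subset_size))
        members.append(fit_member(X, y, cols))
    return members


def ensemble_score(members, row):
    """Average the member scores to get the ensemble anomaly score."""
    return fmean([member_score(m, row) for m in members])


# Toy usage: 4 observables, 2 per member (all numbers are illustrative).
X_toy = [[0, 0, 0, 0], [1, 1, 1, 1], [2, 2, 2, 2], [3, 3, 3, 3]]
y_toy = [0, 0, 1, 1]
members = fit_attribute_bagging(X_toy, y_toy, n_members=5, subset_size=2)
```

Because each member is fitted on `subset_size` columns rather than the full observable set, the per-member training cost scales with the subset size, which is the source of the cost reduction the abstract describes.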
Editorial analysis
A structured set of objections, weighed in public.
Circularity Check
No significant circularity
full rationale
The paper is an empirical comparison study that introduces new public signal benchmarks and evaluates the sensitivity of a combined observable set (Energy Flow Polynomials plus subjettiness) against three baselines in both idealized anomaly detection and CWoLa settings. The central performance claims rest on direct numerical comparisons across these benchmarks rather than any mathematical derivation, fitted parameter, or self-referential definition. No load-bearing step reduces by construction to the paper's own inputs, self-citations, or renamed known results; the public release of the benchmarks supplies independent material for verification.
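The two evaluation settings named here differ mainly in where the background-like class comes from. The following is a minimal sketch based on the excerpts quoted in the reference graph: CWoLa hunting uses sideband data as the background template, while the idealized anomaly detector uses a sample drawn from the true signal-region background. The window values, event format, and function names are hypothetical; the paper's actual sideband definition is not reproduced on this page.

```python
def in_window(mass, window):
    """Return True if mass lies in the half-open interval [lo, hi)."""
    lo, hi = window
    return lo <= mass < hi


def cwola_training_set(data, sr, sidebands):
    """Label SR data 1 and sideband data 0; events elsewhere are dropped.

    `data` is a list of (mass, features) pairs with no truth labels.
    """
    labeled = []
    for mass, features in data:
        if in_window(mass, sr):
            labeled.append((features, 1))
        elif any(in_window(mass, sb) for sb in sidebands):
            labeled.append((features, 0))
    return labeled


def iad_training_set(data, true_background, sr):
    """Label SR data 1 and an independent sample of true SR background 0.

    `true_background` only exists in simulation, which is why the IAD is a
    benchmark rather than something that can be run on real data.
    """
    labeled = [(f, 1) for m, f in data if in_window(m, sr)]
    labeled += [(f, 0) for m, f in true_background if in_window(m, sr)]
    return labeled


# Hypothetical windows in TeV, chosen only for illustration.
SR = (3.3, 3.7)
SIDEBANDS = [(3.0, 3.3), (3.7, 4.0)]
data = [(3.5, "a"), (3.1, "b"), (3.9, "c"), (2.0, "d")]
extra_bkg = [(3.4, "t1"), (4.5, "t2")]
cwola = cwola_training_set(data, SR, SIDEBANDS)
iad = iad_training_set(data, extra_bkg, SR)
```

In both cases a classifier would then be trained to separate label 1 from label 0, so any difference in sensitivity between the two settings comes from the quality of the background template rather than from the classifier.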
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption Simulated signal and background events sufficiently represent the statistical properties of real LHC data for the purpose of comparing anomaly detection methods.
Forward citations
Cited by 1 Pith paper
-
Open LHC Monte Carlo Event Generation
A review of initiatives to make LHC Monte Carlo event generations available as open data to minimize redundant simulations and resource use.
Reference graph
Works this paper leans on
-
[1]
The Idealized Anomaly Detector. The idealized anomaly detector (IAD) [26] uses a perfect background template constructed by sampling from the true background distribution in the SR. Of course, such an ideal template is not available in real data, but it can be generated for simulated data. Therefore, the IAD simply provides an optimal benchmark for weakl...
-
[2]
CWoLa hunting. CWoLa hunting [23–25] is a fully data-driven weakly supervised anomaly detection method that uses the data in the sidebands (SBs) of the SR as the BT. The sidebands in our specific analysis are defined in Section IIID. CWoLa hunting relies on two crucial assumptions. The first is the standard bump hunt assumption that the new physics res...
2020
-
[3]
The first signal X → Y Y′ → 4q with m_Y = 100 GeV and m_{Y′} = 500 GeV has a 2 + 2 prong topology like the LHCO 2-prong signal. However, unlike in the LHCO signal where all resonantly produced particles are vector bosons, the Y′ particle is a scalar in this model. Y′ decays into a pair of bottom quarks while Y can also decay into pairs of light quarks
-
[4]
The second signal W_KK → W R → 3W with m_R = 500 GeV consists of a heavy Kaluza-Klein vector boson W_KK decaying into a W boson and a scalar radion R [46, 47]. The radion decays into two W bosons. We analyze the fully hadronic channel, where all W bosons decay into two light quarks each, resulting in a 2 + 4 prong structure
-
[5]
The third signal Z′ → T′T′ → tZtZ with m_{T′} = 400 GeV consists of a Z′ vector boson that decays into two vector-like quarks T′ [48, 49]. The T′ particles decay into a top quark and a Z boson. Again we only consider the fully hadronic channel, where all the intermediate vector bosons decay into light quarks, resulting in a 5 + 5 prong topology
-
[6]
kitchen sink
The last signal G_KK → HH → 4t with m_H = 400 GeV consists of a heavy spin-2 Randall-Sundrum graviton G_KK that decays into two Higgs-like scalars H [50]. The scalars decay into two top quarks, resulting in a 6 + 6 prong structure in the fully hadronic channel which is considered here. We use MadGraph5_aMC@NLO 3.6.2 [51] at leading order to simulate the hard process, i...
-
[7]
G. Kasieczka et al., Rept. Prog. Phys. 84, 124201 (2021), arXiv:2101.08320 [hep-ph]
-
[8]
T. Aarrestad et al., SciPost Phys. 12, 043 (2021), arXiv:2105.14027 [hep-ph]
-
[9]
G. Karagiorgi et al., Nature Reviews Physics 4, 399 (2022)
2022
- [10]
-
[11]
HEP ML Community, “A Living Review of Machine Learning for Particle Physics,”
-
[12]
ATLAS Collaboration, Phys. Rev. Lett. 125, 131801 (2020), arXiv:2005.02983 [hep-ex].
[Figure 4 placeholder: two panels, "LHCO 2-prong / IAD" and "G_KK → HH → 4t / IAD"; x-axis N_sig from 0 to 1000 with a secondary S/√B axis; y-axis max(SIC) from 0 to 70; legend entries include "Combined", "Random (EFP)", and "EFP7".] FIG. 4. We pre...
- [13]
- [14]
-
[15]
R. Gambhir et al., Phys. Rev. Lett. 135, 021902 (2025), arXiv:2502.14036 [hep-ph]
- [16]
-
[17]
E. Buhmann et al., Phys. Rev. D 109, 055015 (2023), arXiv:2310.06897 [hep-ph]
-
[18]
D. Sengupta et al., JHEP 04, 109 (2023), arXiv:2312.10130 [physics.data-an]
-
[19]
V. Mikuni and B. Nachman, Phys. Rev. D 111, L051504 (2024), arXiv:2404.16091 [hep-ph]
-
[20]
J. Thaler and K. Van Tilburg, JHEP 03, 015 (2011), arXiv:1011.2268 [hep-ph]
-
[21]
J. Thaler and K. Van Tilburg, JHEP 02, 093 (2012), arXiv:1108.2701 [hep-ph]
- [22]
-
[23]
G. Kasieczka, B. Nachman, and D. Shih, “R&D Dataset for LHC Olympics 2020 Anomaly Detection Challenge,” (2019)
2020
-
[24]
R. Das et al., “Additional signal models for the LHCO2020 R&D,” (2026)
2026
- [25]
-
[26]
T. Finke et al., Phys. Rev. D 109, 034033 (2023), arXiv:2309.13111 [hep-ph]
-
[27]
M. Freytsis, M. Perelstein, and Y. C. San, JHEP 02, 220 (2023), arXiv:2310.13057 [hep-ph]
-
[28]
L. Grinsztajn, E. Oyallon, and G. Varoquaux, (2022), arXiv:2207.08815 [cs.LG]
- [29]
- [30]
- [31]
- [32]
- [33]
-
[34]
A. Hallin et al., Phys. Rev. D 107, 114012 (2022), arXiv:2210.14924 [hep-ph]
-
[35]
T. Golling et al., Phys. Rev. D 107, 096025 (2023), arXiv:2212.11285 [hep-ph]
-
[36]
T. Golling et al., Eur. Phys. J. C 84, 241 (2023), arXiv:2307.11157 [hep-ph]
-
[37]
J. Neyman and E. S. Pearson, Phil. Trans. Roy. Soc. Lond. A 231, 289 (1933)
1933
-
[38]
B. Nachman and D. Shih, Phys. Rev. D 101, 075042 (2020), arXiv:2001.04990 [hep-ph]
-
[39]
A. Andreassen, B. Nachman, and D. Shih, Phys. Rev. D 101, 095004 (2020), arXiv:2001.05001 [hep-ph]
-
[40]
K. Benkendorfer, L. L. Pottier, and B. Nachman, Phys. Rev. D 104, 035003 (2020), arXiv:2009.02205 [hep-ph]
- [41]
-
[42]
M. Leigh et al., JHEP 12, 105 (2025), arXiv:2407.19818 [hep-ph]
-
[43]
I. Oleksiyuk, S. Voloshynovskiy, and T. Golling, JHEP 07, 177 (2025), arXiv:2503.04342 [hep-ph]
-
[44]
F. Pedregosa et al., Journal of Machine Learning Research 12, 2825 (2011)
2011
-
[45]
G. Ke et al., in Neural Information Processing Systems (2017)
2017
-
[46]
M. Hein et al., (2025), arXiv:2511.14832 [hep-ph]
-
[47]
Asymptotic formulae for likelihood-based tests of new physics
G. Cowan et al., Eur. Phys. J. C 71, 1554 (2011), arXiv:1007.1727 [physics.data-an]
2011
-
[49]
C. Bierlich et al., SciPost Phys. Codebases, 8 (2022)
2022
-
[50]
DELPHES 3, A modular framework for fast simulation of a generic collider experiment
J. de Favereau et al. (DELPHES 3), JHEP 02, 057 (2014), arXiv:1307.6346 [hep-ex]
2014
-
[51]
D. Shih, “Additional QCD background events for LHCO2020 R&D (signal region only),” (2021)
2021
-
[52]
K. Agashe et al., JHEP 01, 016 (2017), arXiv:1608.00526 [hep-ph]
-
[53]
K. Agashe et al., Phys. Rev. D 99, 075016 (2019), arXiv:1711.09920 [hep-ph]
-
[54]
LHC signatures of vector-like quarks
Y. Okada and L. Panizzi, Adv. High Energy Phys. 2013, 364936 (2013), arXiv:1207.5607 [hep-ph]
2013
-
[55]
Model Independent Framework for Searches of Top Partners
M. Buchkremer et al., Nucl. Phys. B 876, 376 (2013), arXiv:1305.4172 [hep-ph]
2013
-
[56]
Gravity particles from warped extra dimensions, predictions for LHC
A. Carvalho, (2014), arXiv:1404.0102 [hep-ph]
-
[57]
J. Alwall et al., JHEP 07, 079 (2014), arXiv:1405.0301 [hep-ph]
2014
-
[58]
The anti-k_t jet clustering algorithm
M. Cacciari, G. P. Salam, and G. Soyez, JHEP 04, 063 (2008), arXiv:0802.1189 [hep-ph]
2008
-
[59]
M. Cacciari, G. P. Salam, and G. Soyez, Eur. Phys. J. C 72, 1896 (2012), arXiv:1111.6097 [hep-ph]
2012
- [60]
- [61]
- [62]
- [63]
-
[64]
A. Andreassen et al., Phys. Rev. Lett. 124, 182001 (2020), arXiv:1911.09107 [hep-ph]
- [65]
-
[66]
T. K. Ho, IEEE Transactions on Pattern Analysis and Machine Intelligence 20, 832 (1998)
1998
-
[67]
R. Bryll, R. Gutierrez-Osuna, and F. Quek, Pattern Recognition 36, 1291 (2003)
2003
-
[68]
L. Lang, “Kitchen Sink Anomaly Detection Code,” (2026)
2026
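The figure text extracted into entry [12] plots max(SIC) against N_sig. As a minimal sketch, assuming the usual definition of the significance improvement characteristic, SIC = TPR / sqrt(FPR), the maximum over score thresholds could be computed as below. The function name and the `min_bkg_pass` guard are hypothetical; the paper's own threshold scan and any cut on background statistics are not visible on this page.

```python
import math


def max_sic(scores_sig, scores_bkg, min_bkg_pass=1):
    """Scan score thresholds and return the largest TPR / sqrt(FPR).

    Thresholds that keep fewer than `min_bkg_pass` background events are
    skipped, since the FPR estimate there is zero or too noisy.
    """
    best = 0.0
    n_sig, n_bkg = len(scores_sig), len(scores_bkg)
    for thr in sorted(set(scores_sig) | set(scores_bkg)):
        n_s = sum(s >= thr for s in scores_sig)
        n_b = sum(b >= thr for b in scores_bkg)
        if n_b < min_bkg_pass:
            continue
        tpr, fpr = n_s / n_sig, n_b / n_bkg
        best = max(best, tpr / math.sqrt(fpr))
    return best


# Toy scores (illustrative only): higher means more signal-like.
toy_sig = [0.9, 0.8, 0.7, 0.2]
toy_bkg = [0.85, 0.3, 0.2, 0.1]
toy_max_sic = max_sic(toy_sig, toy_bkg)
```

Under this definition a classifier with no separating power gives a max(SIC) of 1, reached at the loosest threshold, so values above 1 indicate that cutting on the score would raise S/√B by that factor.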