pith. machine review for the scientific record.

arxiv: 2604.25885 · v1 · submitted 2026-04-28 · ✦ hep-ph · cs.LG · hep-ex

Recognition: unknown

Explainable AI for Jet Tagging: A Comparative Study of GNNExplainer, GNNShap, and GradCAM for Jet Tagging in the Lund Jet Plane

Pahal D. Patel, Sanmay Ganguly

Pith reviewed 2026-05-07 15:52 UTC · model grok-4.3

classification ✦ hep-ph · cs.LG · hep-ex
keywords explainable AI · jet tagging · graph neural networks · Lund jet plane · QCD jet substructure · N-subjettiness · energy correlation functions

The pith

Explainability methods applied to jet-tagging graph networks show node importance correlating with classical QCD observables.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper adapts perturbation-based, Shapley-value, and gradient-based explainability techniques to a graph neural network that represents jets as Lund-plane graphs. Each node in this representation corresponds to a parton splitting with direct physical interpretation. The authors evaluate the explainers using Monte Carlo truth masks and compare the resulting node importance scores against standard jet substructure quantities such as N-subjettiness ratios and energy correlation functions. The analysis is repeated in separate transverse-momentum intervals to expose how the focus of the explanations changes between non-perturbative and perturbative regimes. Overall the measured correlations indicate that the network has captured aspects of known jet-substructure moments rather than purely statistical associations.
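
For the Shapley-value technique named above, a toy calculation may help fix ideas. The sketch below computes exact Shapley values by enumerating every coalition of a three-player game. It treats graph edges as the players, which is how the paper's appendix excerpt (quoted in the reference graph) describes GNNShap. The edge names, the scoring function `toy_score`, and all numbers are hypothetical stand-ins for a trained tagger's output; this is not the GNNShap implementation, which relies on sampling to scale to real graphs.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values by enumerating all coalitions (only feasible for a few players)."""
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude player p
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                s = frozenset(coalition)
                total += weight * (value(s | {p}) - value(s))
        phi[p] = total
    return phi


# Hypothetical edge names: three edges of a small graph act as the players.
EDGES = ["e01", "e12", "e23"]


def toy_score(kept_edges):
    """Hypothetical model score when only `kept_edges` remain in the graph."""
    score = 0.0
    if "e01" in kept_edges:
        score += 0.5
    if "e12" in kept_edges:
        score += 0.2
    if "e01" in kept_edges and "e12" in kept_edges:
        score += 0.2  # interaction term, shared equally between the two edges
    return score


phi = shapley_values(EDGES, toy_score)
for edge, contribution in phi.items():
    print(f"{edge}: {contribution:.3f}")
```

The three values sum to the score of the full graph minus the score of the empty graph, which is the efficiency property that makes Shapley attributions easy to sanity-check.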

Core claim

When GNNExplainer, GNNShap, and GradCAM are run on LundNet operating on Lund-plane graphs, the importance weights they assign to nodes exhibit a correlation with the analytic observables τ21, τ32, and energy correlation functions; the strength and pattern of these correlations shift across pT bins in the manner expected from QCD, demonstrating that the trained network learns some aspects of jet-substructure moments.
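
The kind of per-bin correlation measurement described in the core claim can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' analysis code. It assumes the node importances of each jet have already been reduced to one number per jet (the array `importance`), a reduction the page does not specify. The function names and the toy data are hypothetical; only the three pT intervals come from the abstract.

```python
import numpy as np


def pearson(x, y):
    """Pearson correlation coefficient of two 1-D arrays."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc, yc = x - x.mean(), y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))


def ordinal_ranks(x):
    """Ranks 0..n-1; ties are not averaged, which is fine for continuous toy data."""
    order = np.argsort(x)
    ranks = np.empty(len(x), dtype=float)
    ranks[order] = np.arange(len(x))
    return ranks


def spearman(x, y):
    """Spearman correlation as the Pearson correlation of the ranks."""
    return pearson(ordinal_ranks(np.asarray(x)), ordinal_ranks(np.asarray(y)))


def correlation_by_pt_bin(pt, importance, observable, bins):
    """Pearson and Spearman coefficients inside each closed pT interval."""
    table = {}
    for lo, hi in bins:
        sel = (pt >= lo) & (pt <= hi)
        table[(lo, hi)] = {
            "n": int(sel.sum()),
            "pearson": pearson(importance[sel], observable[sel]),
            "spearman": spearman(importance[sel], observable[sel]),
        }
    return table


# Synthetic stand-ins (assumptions): jet pT in GeV, a tau21-like observable,
# and a per-jet importance summary that is anti-correlated with it plus noise.
rng = np.random.default_rng(0)
n_jets = 5000
pt = rng.uniform(500.0, 1000.0, n_jets)
tau21 = rng.uniform(0.1, 0.9, n_jets)
importance = -tau21 + rng.normal(0.0, 0.3, n_jets)

PT_BINS = [(500, 700), (800, 1000), (500, 1000)]  # intervals quoted in the abstract
table = correlation_by_pt_bin(pt, importance, tau21, PT_BINS)
for pt_bin, row in table.items():
    print(pt_bin, row)
```

Reporting the sample size next to each coefficient, as the returned table does, is what lets a reader judge whether a difference between bins is larger than the statistical noise.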

What carries the argument

The Lund-plane graph representation together with the physics-informed evaluation that compares explainer node importance directly to Monte Carlo truth masks and to classical substructure observables.
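
One way to score an explainer against a Monte Carlo truth mask is sketched below. The page does not say which agreement metric the paper uses, so both metrics here (a ranking AUC and a top-k overlap) are assumptions chosen for illustration, and the node scores and mask are made-up numbers.

```python
import numpy as np


def mask_auc(importance, truth_mask):
    """Fraction of (truth node, non-truth node) pairs that the explainer ranks
    in the right order; ties count as half. Equivalent to a ROC AUC."""
    importance = np.asarray(importance, dtype=float)
    truth = np.asarray(truth_mask, dtype=bool)
    pos, neg = importance[truth], importance[~truth]
    if len(pos) == 0 or len(neg) == 0:
        return float("nan")  # undefined when the mask is all-true or all-false
    diff = pos[:, None] - neg[None, :]
    return float(((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size)


def topk_overlap(importance, truth_mask, k):
    """Fraction of the k highest-scored nodes that belong to the truth mask."""
    top = np.argsort(importance)[::-1][:k]
    return float(np.asarray(truth_mask, dtype=bool)[top].mean())


# Hypothetical jet with five Lund-plane nodes; nodes 0 and 2 are truth splittings.
node_importance = np.array([0.9, 0.1, 0.7, 0.2, 0.05])
truth_mask = np.array([1, 0, 1, 0, 0])

auc = mask_auc(node_importance, truth_mask)
top2 = topk_overlap(node_importance, truth_mask, k=2)
print(auc, top2)
```

A ranking metric of this kind depends only on the ordering of the scores, so it can be compared across explainers whose raw importance values live on different scales.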

If this is right

  • The network's internal decisions align with established QCD jet-evolution features across momentum regimes.
  • Explanation quality and focus differ systematically between non-perturbative and perturbative phase space.
  • The open-source implementation permits direct comparison of multiple explainability techniques on the same jet-tagging task.
  • Physics-informed metrics can supplement standard fidelity scores when judging explanation quality.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same correlation check could be applied to other graph-based models to detect reliance on detector-specific artifacts instead of physics.
  • If the correlations weaken for certain tagging algorithms, the method offers a concrete diagnostic for guiding architectural changes.
  • The approach supplies a template for testing whether any collider machine-learning model has internalized known theoretical structures.

Load-bearing premise

That a statistical correlation between explainer-assigned importance and classical observables such as τ21 or energy correlation functions demonstrates the network has learned genuine QCD dynamics rather than a spurious statistical association.

What would settle it

If the same correlations between explainer weights and τ21, τ32, or energy correlation functions persist in a controlled Monte Carlo sample where jet labels are assigned independently of all substructure observables, the interpretation that the network learned those physical features would be falsified.

Figures

Figures reproduced from arXiv: 2604.25885 by Pahal D. Patel, Sanmay Ganguly.

Figure 1. A schematic description of Lund Jet Plane, show...
Figure 2. The figure summarizes the analysis strategy of explainability of a ML based jet tagger in the Lund Jet Plane. We study...
Figure 3. Distribution of average LJP emission density...
Figure 4. Distribution of average weighted LJP emission density...
Figure 5. Distribution of average weighted LJP emission density...
Figure 6. The distribution of the...
Figure 7. The distribution of the...
Figure 8. Weighted feature importance, as defined through Eq. 6, is shown for all the five LJP features across the nine combinations...
Figure 9. The distribution of the...
Original abstract

Graph neural networks such as ParticleNet and transformer based networks on point clouds such as ParticleTransformer achieve state-of-the-art performance on jet tagging benchmarks at the Large Hadron Collider, yet the physical reasoning behind their predictions remains opaque. We present different methods, i.e. perturbation-based (GNNExplainer), Shapley-value-based (GNNShap), and gradient-based (GRADCam); adapted to operate on LundNet's Lund-plane graph representation. Leveraging the fact that each node in the Lund plane corresponds to a physically meaningful parton splitting, we construct Monte Carlo truth explanation masks and introduce a physics-informed evaluation framework that goes beyond standard fidelity metrics. We perform the analysis in three transverse-momentum bins ($\mathrm{p_T} \in [500,700]$, $[800,1000]$, and the inclusive region $[500,1000]$ GeV), revealing how explanation quality and focus shift between non-perturbative and perturbative regimes. We further quantify the correlation between explainer-assigned node importance and classical jet substructure observables -- $N$-subjettiness ratios $\tau_{21}$ and $\tau_{32}$ and the energy correlation functions -- establishing the degree to which the model has learned known QCD features. We find that overall the weight assigned by explainability methods has a correlation with analytic observables, with expected shift across different phase space regimes, indicating that a trained neural network indeed learns some aspects of jet-substructure moments. Our open-source implementation enables reproducible explainability studies for graph-based jet taggers.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper applies three explainability methods (GNNExplainer, GNNShap, and GradCAM) to LundNet, a graph neural network operating on the Lund jet plane representation for jet tagging. It introduces Monte Carlo truth explanation masks and a physics-informed evaluation framework, performs the analysis in three pT bins, and quantifies correlations between explainer-assigned node importance scores and classical jet substructure observables (N-subjettiness ratios τ21, τ32 and energy correlation functions). The central claim is that these correlations, which shift across perturbative and non-perturbative regimes, demonstrate that the trained network has learned aspects of jet-substructure moments.

Significance. If the reported correlations can be shown to be robust against confounding variables and not merely artifacts of the shared clustering procedure, the work would provide a concrete example of how explainability tools can link GNN predictions to established QCD observables. The open-source implementation is a clear strength that supports reproducibility and follow-up studies in high-energy physics.

major comments (3)
  1. [Abstract] Abstract: the assertion that explainer weights exhibit 'a correlation with analytic observables' is presented without any quantitative metrics (Pearson/Spearman coefficients, p-values, error bars, or sample sizes), statistical tests, or controls for confounding variables such as jet multiplicity and total pT; this leaves the data-to-claim link unverifiable from the text.
  2. [Evaluation Framework] The evaluation framework (Monte Carlo truth masks and correlation analysis with τ21/τ32/ECFs): because the Lund-plane nodes are constructed from the same clustering that defines the classical observables, any sufficiently expressive model will produce importance scores that correlate with those observables even without having internalized perturbative splitting kernels; the manuscript does not report partial-correlation controls or a null model trained on shuffled labels to distinguish genuine physics learning from spurious statistical dependence.
  3. [Results] pT-binned results: the claimed 'expected shift across different phase space regimes' is described qualitatively but without tabulated correlation values, uncertainties, or a statistical test for the significance of the shift between the [500,700], [800,1000], and inclusive bins, weakening the cross-regime interpretation.
minor comments (2)
  1. [Methods] Notation: the distinction between 'GNNShap' and standard SHAP values for graphs should be clarified in the methods section to avoid confusion with existing literature.
  2. [Conclusion] The open-source repository is mentioned but its URL and exact contents (trained models, scripts for reproducing the correlation plots) are not provided in the manuscript; adding this information would improve reproducibility.
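
The partial-correlation control requested in major comment 2 can be sketched as follows. This is one common recipe (regress both variables on the confounders, then correlate the residuals), not something taken from the manuscript. The synthetic data are built so that jet multiplicity alone drives both quantities, which is exactly the scenario the comment warns about; all variable names and numbers are hypothetical.

```python
import numpy as np


def partial_correlation(x, y, confounders):
    """Pearson correlation of x and y after linearly regressing out the confounders."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Design matrix: intercept column plus one column per confounder
    design = np.column_stack([np.ones(len(x)), np.asarray(confounders, dtype=float)])
    res_x = x - design @ np.linalg.lstsq(design, x, rcond=None)[0]
    res_y = y - design @ np.linalg.lstsq(design, y, rcond=None)[0]
    return float((res_x @ res_y) / np.sqrt((res_x @ res_x) * (res_y @ res_y)))


# Synthetic jets (assumption): node multiplicity drives both the per-jet
# importance summary and the observable; there is no direct link between them.
rng = np.random.default_rng(1)
n_jets = 4000
n_nodes = rng.integers(5, 40, n_jets).astype(float)
pt = rng.uniform(500.0, 1000.0, n_jets)
observable = 0.02 * n_nodes + rng.normal(0.0, 0.1, n_jets)
importance = 0.03 * n_nodes + rng.normal(0.0, 0.1, n_jets)

raw = float(np.corrcoef(importance, observable)[0, 1])
partial = partial_correlation(importance, observable, np.column_stack([n_nodes, pt]))
print(f"raw correlation     : {raw:+.3f}")
print(f"partial correlation : {partial:+.3f}")
```

In this toy setup the raw coefficient is large while the partial coefficient is compatible with zero, showing how a shared dependence on multiplicity can imitate a physics correlation. The linear regression only removes linear confounding, so a real analysis may need a more flexible control.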

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed report, including the positive assessment of the work's significance and the open-source implementation. We address each major comment below and will revise the manuscript to incorporate additional quantitative details and controls where appropriate.

  1. Referee: [Abstract] Abstract: the assertion that explainer weights exhibit 'a correlation with analytic observables' is presented without any quantitative metrics (Pearson/Spearman coefficients, p-values, error bars, or sample sizes), statistical tests, or controls for confounding variables such as jet multiplicity and total pT; this leaves the data-to-claim link unverifiable from the text.

    Authors: We agree that the abstract, as a concise summary, does not contain specific numerical values or statistical details. The main text and figures provide the correlation analyses with N-subjettiness ratios and energy correlation functions. To strengthen verifiability, we will revise the abstract to include representative quantitative metrics (e.g., average Pearson coefficients per pT bin) and a brief reference to the evaluation framework. revision: yes

  2. Referee: [Evaluation Framework] The evaluation framework (Monte Carlo truth masks and correlation analysis with τ21/τ32/ECFs): because the Lund-plane nodes are constructed from the same clustering that defines the classical observables, any sufficiently expressive model will produce importance scores that correlate with those observables even without having internalized perturbative splitting kernels; the manuscript does not report partial-correlation controls or a null model trained on shuffled labels to distinguish genuine physics learning from spurious statistical dependence.

    Authors: We appreciate this concern about potential confounding from the shared clustering. The Monte Carlo truth explanation masks are constructed from generator-level parton splitting information, which is independent of the clustering used for the classical observables and directly reflects physical processes. This provides grounding beyond statistical dependence. Nevertheless, to further demonstrate robustness, we will add a null-model analysis with shuffled labels in the revised manuscript and explore partial-correlation controls with respect to jet multiplicity and pT. revision: yes

  3. Referee: [Results] pT-binned results: the claimed 'expected shift across different phase space regimes' is described qualitatively but without tabulated correlation values, uncertainties, or a statistical test for the significance of the shift between the [500,700], [800,1000], and inclusive bins, weakening the cross-regime interpretation.

    Authors: The pT-binned results are currently shown via figures with qualitative discussion of the shifts between non-perturbative and perturbative regimes. We agree that tabulated values and statistical tests would improve clarity. In the revision we will add a summary table reporting Pearson and Spearman coefficients (with uncertainties) for each explainer and observable across the three bins, together with statistical tests (e.g., t-tests) assessing the significance of the observed shifts. revision: yes
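
A sketch of how per-bin uncertainties and a significance test for the shift could be produced is given below. The rebuttal mentions t-tests; this example instead uses a bootstrap interval for each coefficient and a Fisher z-transformation test for the difference between two coefficients, which is a standard substitute and not the authors' stated method. The data are synthetic and every name is hypothetical.

```python
import math

import numpy as np


def pearson(x, y):
    return float(np.corrcoef(x, y)[0, 1])


def bootstrap_interval(x, y, n_boot=500, level=0.68, seed=0):
    """Percentile bootstrap interval for the Pearson coefficient."""
    rng = np.random.default_rng(seed)
    n = len(x)
    stats = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, n)  # resample jets with replacement
        stats[b] = pearson(x[idx], y[idx])
    lo, hi = np.quantile(stats, [(1.0 - level) / 2.0, (1.0 + level) / 2.0])
    return float(lo), float(hi)


def fisher_z_difference(r1, n1, r2, n2):
    """Two-sided test that two correlations from INDEPENDENT samples are equal."""
    z = (math.atanh(r1) - math.atanh(r2)) / math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail probability
    return z, p_value


# Synthetic stand-ins for two disjoint pT bins with different correlation strength.
rng = np.random.default_rng(2)


def toy_bin(n_jets, slope):
    observable = rng.normal(size=n_jets)
    importance = slope * observable + rng.normal(size=n_jets)
    return importance, observable


imp_a, obs_a = toy_bin(2000, -0.3)  # stands in for the [500, 700] GeV bin
imp_b, obs_b = toy_bin(2000, -0.8)  # stands in for the [800, 1000] GeV bin

r_a, r_b = pearson(imp_a, obs_a), pearson(imp_b, obs_b)
lo_a, hi_a = bootstrap_interval(imp_a, obs_a)
z, p_value = fisher_z_difference(r_a, len(imp_a), r_b, len(imp_b))
print(f"r_a = {r_a:+.3f}  (68% interval {lo_a:+.3f} to {hi_a:+.3f})")
print(f"r_b = {r_b:+.3f}")
print(f"shift: z = {z:.2f}, p = {p_value:.2e}")
```

The Fisher test assumes the two samples are independent, which holds for the disjoint [500, 700] and [800, 1000] GeV bins but not for comparisons against the inclusive bin, since it contains the other two.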

Circularity Check

0 steps flagged

Empirical correlations with external observables show no circular reduction

The paper's central claim rests on measured correlations between explainer-assigned node importance scores and independently defined classical jet substructure observables (N-subjettiness ratios τ21/τ32 and energy correlation functions) computed from Monte Carlo truth. These observables are standard, externally defined QCD quantities not derived from or fitted to the GNN's predictions or parameters. The evaluation framework uses physics-informed Monte Carlo truth masks and performs the analysis across separate pT bins without any self-referential fitting, renaming of known results, or load-bearing self-citation. The Lund-plane graph representation enables node-level analysis but does not force the observed correlations by construction, as the importance scores are generated post-training and compared against external benchmarks. The derivation chain is therefore empirical and self-contained.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on the assumption that Lund-plane nodes map directly to parton splittings and that standard GNN explainability methods remain valid when applied to these graphs; no new free parameters or invented entities are introduced beyond the three off-the-shelf explainers.

axioms (2)
  • domain assumption: Each node in the Lund plane corresponds to a physically meaningful parton splitting.
    Invoked to construct Monte Carlo truth explanation masks.
  • standard math: Standard assumptions underlying GNNExplainer, GNNShap and GradCAM transfer to jet graphs.
    Required for the adaptation step described in the abstract.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Dissecting Jet-Tagger Through Mechanistic Interpretability

    hep-ph · 2026-05 · accept · novelty 8.0

    A Particle Transformer jet tagger contains a sparse six-head circuit whose source-relay-readout structure recovers most performance and whose residual stream preferentially encodes 2-prong energy correlators.

Reference graph

Works this paper leans on

62 extracted references · 51 canonical work pages · cited by 1 Pith paper · 5 internal anchors

  1. [1]

    GNNExplainer: Mutual information maximization. Given a trained GNN $\Phi$ and an input graph $G = (V, E, X)$ with prediction $\hat{Y} = \Phi(G)$, GNNExplainer seeks a subgraph $G_{\mathrm{expl}} \subseteq G$ that maximises the mutual information:

    $$\max_{G_{\mathrm{expl}}} I(Y, G_{\mathrm{expl}}) = H(Y) - H(Y \mid G_{\mathrm{expl}}), \tag{A1}$$

    where $H(Y)$ is the entropy of the prediction (constant for a fixed model and input). Minimizing $H(Y \mid G_{\mathrm{expl}})$ is equivalent to finding $G_{\mathrm{expl}}$ such that the model's prediction is most certain when conditioned on $G_{\mathrm{expl}}$ alone...

  2. [2]

    GNNShap: Shapley value computation. GNNShap constructs its interpretability scores strictly upon the axioms of cooperative game theory, specifically utilizing Shapley values. Rather than focusing on nodes or features, GNNShap primarily treats the edges of the computational graph as the "players" in the game, providing a mathematically fair and fine-grain...

  3. [3]

    Graph GradCAM: Layer-wise activation mapping. For class $c$, GradCAM computes the importance weight for channel $k$ at layer $\ell$ as:

    $$\alpha_k^{c,(\ell)} = \frac{1}{N} \sum_{i=1}^{N} \frac{\partial y^c}{\partial H_{ik}^{(\ell)}}, \tag{A6}$$

    where $N$ is the number of nodes and $H_{ik}^{(\ell)}$ is the activation of node $i$ in channel $k$ at layer $\ell$. The node-level importance is:

    $$w_i^{c,(\ell)} = \mathrm{ReLU}\left( \sum_{k=1}^{K_\ell} \alpha_k^{c,(\ell)} H_{ik}^{(\ell)} \right). \tag{A7}$$

    When aggregating acros...

  4. [4]

    LundNet details. LundNet [26] transforms the Lund-plane representation into a graph suitable for GNN processing. Each emission in the declustering tree becomes a node $v_i$ with kinematic feature vector $x_i$. Two variants are defined. LundNet constructs a $k$-nearest-neighbour graph with $k = 16$ in the five-dimensional feature space $x_i = (\ln z, \ln \Delta, \psi, \ln m, \ln k_T)_i$, ...

  5. [5]

    ParticleNet details. The working principle of ParticleNet is similar to that of LundNet: both apply the EdgeConv operation of Eq. B1. The core difference is that for ParticleNet the graph is constructed dynamically, in the LJP, at the input of each EdgeConv layer. In this case we have used 3 EdgeConv layers.

  6. [6]

    Particle Transformer details. We applied the self-attention mechanism

    $$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left( \frac{Q K^{\top}}{\sqrt{d_k}} + U \right) V \tag{B2}$$

    on the input graph from LJP to adapt Particle Transformer. We use three Embedding layers with latent space dimension $(128, 128, 128)$.

    Appendix C: Training details. All the models are trained with AdamW optimizer with weight decay factor of 0.01 and lea...

  7. [7]

    G. P. Salam, Towards Jetography, Eur. Phys. J. C 67, 637 (2010), arXiv:0906.1833 [hep-ph]

  8. [8]

    A. J. Larkoski, I. Moult, and B. Nachman, Jet Substructure at the Large Hadron Collider: A Review of Recent Advances in Theory and Machine Learning, Phys. Rept. 841, 1 (2020), arXiv:1709.04464 [hep-ph]

  9. [9]

    F. A. Dreyer, G. P. Salam, and G. Soyez, The Lund Jet Plane, JHEP 12, 064, arXiv:1807.04758 [hep-ph]

  10. [10]

    R. Kogler et al., Jet Substructure at the Large Hadron Collider: Experimental Review, Rev. Mod. Phys. 91, 045003 (2019), arXiv:1803.06991 [hep-ex]

  11. [11]

    S. Marzani, G. Soyez, and M. Spannowsky, Looking inside jets: an introduction to jet substructure and boosted-object phenomenology, Vol. 958 (Springer, 2019), arXiv:1901.10342 [hep-ph]

  12. [12]

    A. Butter et al., The Machine Learning landscape of top taggers, SciPost Phys. 7, 014 (2019), arXiv:1902.09914 [hep-ph]. [7] Search for highly energetic double Higgs boson production in the two bottom quark and two vector boson all-hadronic final state, Tech. Rep. (CERN, Geneva, 2024)

  13. [13]

    L. de Oliveira, M. Kagan, L. Mackey, B. Nachman, and A. Schwartzman, Jet-images — deep learning edition, JHEP07, 069, arXiv:1511.05190 [hep-ph]

  14. [14]

    P. T. Komiske, E. M. Metodiev, and M. D. Schwartz, Deep learning in color: towards automated quark/gluon jet discrimination, JHEP 01, 110, arXiv:1612.01551 [hep-ph]

  15. [15]

    G. Kasieczka, T. Plehn, M. Russell, and T. Schell, Deep-learning Top Taggers or The End of QCD?, JHEP 05, 006, arXiv:1701.08784 [hep-ph]

  16. [16]

    J. Shlomi, P. Battaglia, and J.-R. Vlimant, Graph neural networks in particle physics, Machine Learning: Science and Technology 2, 021001 (2021)

  17. [17]

    S. Thais, P. Calafiura, G. Chachamis, G. DeZoort, J. Duarte, S. Ganguly, M. Kagan, D. Murnane, M. S. Neubauer, and K. Terao, Graph Neural Networks in Particle Physics: Implementations, Innovations, and Challenges, in Snowmass 2021 (2022), arXiv:2203.12852 [hep-ex]

  18. [18]

    H. Qu and L. Gouskos, ParticleNet: Jet Tagging via Particle Clouds, Phys. Rev. D 101, 056019 (2020), arXiv:1902.08570 [hep-ph]

  19. [19]

    E. A. Moreno, T. Q. Nguyen, J.-R. Vlimant, O. Cerri, H. B. Newman, A. Periwal, M. Spiropulu, J. M. Duarte, and M. Pierini, Interaction networks for the identification of boosted H→bb̄ decays, Phys. Rev. D 102, 012010 (2020), arXiv:1909.12285 [hep-ex]

  20. [20]

    S. Gong, Q. Meng, J. Zhang, H. Qu, C. Li, S. Qian, W. Du, Z.-M. Ma, and T.-Y. Liu, An efficient Lorentz equivariant graph neural network for jet tagging, JHEP 07, 030, arXiv:2201.08187 [hep-ph]

  21. [21]

    H. Qu, C. Li, and S. Qian, Particle transformer for jet tagging (2024), arXiv:2202.03772 [hep-ph]

  22. [22]

    G. Piacquadio and C. Weiser, A new inclusive secondary vertex algorithm for b-jet tagging in ATLAS, J. Phys. Conf. Ser. 119, 032032 (2008)

  23. [23]

    J. Shlomi, S. Ganguly, E. Gross, K. Cranmer, Y. Lipman, H. Serviansky, H. Maron, and N. Segol, Secondary vertex finding in jets with neural networks, Eur. Phys. J. C 81, 540 (2021), arXiv:2008.02831 [hep-ex]

  24. [24]

    G. Aad et al. (ATLAS), Transforming jet flavour tagging at ATLAS, Nature Commun. 17, 541 (2026), arXiv:2505.19689 [hep-ex]

  25. [25]

    J. Thaler and K. Van Tilburg, Identifying Boosted Objects with N-subjettiness, JHEP 03, 015, arXiv:1011.2268 [hep-ph]

  26. [26]

    J. Thaler and K. Van Tilburg, Maximizing Boosted Top Identification by Minimizing N-subjettiness, JHEP 02, 093, arXiv:1108.2701 [hep-ph]

  27. [27]

    A. J. Larkoski, G. P. Salam, and J. Thaler, Energy Correlation Functions for Jet Substructure, JHEP 06, 108, arXiv:1305.0007 [hep-ph]

  28. [28]

    I. Moult, L. Necib, and J. Thaler, New Angles on Energy Correlation Functions, JHEP 12, 153, arXiv:1609.07483 [hep-ph]

  29. [29]

    A. J. Larkoski, S. Marzani, G. Soyez, and J. Thaler, Soft Drop, JHEP 05, 146, arXiv:1402.2657 [hep-ph]

  30. [30]

    J. Barnard, E. N. Dawe, M. J. Dolan, and N. Rajcic, Parton Shower Uncertainties in Jet Substructure Analyses with Deep Neural Networks, Phys. Rev. D 95, 014018 (2017), arXiv:1609.00607 [hep-ph]

  31. [31]

    F. A. Dreyer and H. Qu, Jet tagging in the Lund plane with graph networks, JHEP 03, 052, arXiv:2012.08526 [hep-ph]

  32. [32]

    Y. L. Dokshitzer, G. D. Leder, S. Moretti, and B. R. Webber, Better jet clustering algorithms, JHEP 08, 001, arXiv:hep-ph/9707323

  33. [33]

    M. Wobisch and T. Wengler, Hadronization corrections to jet cross-sections in deep inelastic scattering, in Workshop on Monte Carlo Generators for HERA Physics (Plenary Starting Meeting) (1998) pp. 270–279, arXiv:hep-ph/9907280

  34. [34]

    A. Lifson, G. P. Salam, and G. Soyez, Calculating the primary Lund Jet Plane density, JHEP 10, 170, arXiv:2007.06578 [hep-ph]

  35. [35]

    G. P. Salam, Elements of QCD for hadron colliders, in 2009 European School of High-Energy Physics (2010), arXiv:1011.5131 [hep-ph]

  36. [36]

    G. Aad et al. (ATLAS), Measurement of the Lund Jet Plane Using Charged Particles in 13 TeV Proton-Proton Collisions with the ATLAS Detector, Phys. Rev. Lett. 124, 222002 (2020), arXiv:2004.03540 [hep-ex]

  37. [37]

    A. Hayrapetyan et al. (CMS), Measurement of the primary Lund jet plane density in proton-proton collisions at √s = 13 TeV, JHEP 05, 116, arXiv:2312.16343 [hep-ex]

  38. [38]

    A. Hayrapetyan et al. (CMS), A method for correcting the substructure of multiprong jets using the Lund jet plane, JHEP 11, 038, arXiv:2507.07775 [hep-ex]

  39. [39]

    T. Cohen, J. Roloff, and C. Scherb, Dark sector showers in the Lund jet plane, Phys. Rev. D 108, L031501 (2023), arXiv:2301.07732 [hep-ph]. [35] GN3: Multi-task, Multi-modal Transformers for Jet Flavour Tagging in ATLAS, Tech. Rep. (2026)

  40. [40]

    C. Grojean, A. Paul, Z. Qian, and I. Strümke, Lessons on interpretable machine learning from particle physics, Nature Rev. Phys. 4, 284 (2022), arXiv:2203.08021 [hep-ph]

  41. [41]

    S. J. Wetzel, S. Ha, R. Iten, M. Klopotek, and Z. Liu, Interpretable Machine Learning in Physics: A Review, arXiv e-prints, arXiv:2503.23616 (2025), arXiv:2503.23616 [physics.comp-ph]

  42. [42]

    W. J. Murdoch, C. Singh, K. Kumbier, R. Abbasi-Asl, and B. Yu, Interpretable machine learning: definitions, methods, and applications, arXiv e-prints, arXiv:1901.04592 (2019), arXiv:1901.04592 [stat.ML]

  43. [43]

    A. Barredo Arrieta, N. Díaz-Rodríguez, J. Del Ser, A. Bennetot, S. Tabik, A. Barbado, S. García, S. Gil-López, D. Molina, R. Benjamins, R. Chatila, and F. Herrera, Explainable Artificial Intelligence (XAI): Concepts, Taxonomies, Opportunities and Challenges toward Responsible AI, arXiv e-prints, arXiv:1910.10045 (2019), arXiv:1910.10045 [cs.AI]

  44. [44]

    A. Wang, A. Gandrakota, J. Ngadiuba, V. Sahu, P. Bhatnagar, E. E. Khoda, and J. Duarte, Interpreting Transformers for Jet Tagging (2024), arXiv:2412.03673 [hep-ph]

  45. [45]

    A. Bogatskiy, T. Hoffman, D. W. Miller, J. T. Offermann, and X. Liu, Explainable equivariant neural networks for particle physics: PELICAN, JHEP 03, 113, arXiv:2307.16506 [hep-ph]

  46. [46]

    M. R. Islam, A. Khan, M. S. Hossain, C. B. Y. Siddiqui, M. Z. Hossan, T. Khan, M. A. Momen, A. A. Ali, and A. M. Rahman, E-PCN: Jet Tagging with Explainable Particle Chebyshev Networks Using Kinematic Features (2025), arXiv:2512.07420 [hep-ph]

  47. [47]

    S. Vent, R. Winterhalder, and T. Plehn, The Physics Behind ML-based Quark-Gluon Taggers, SciPost Phys. 20, 084 (2026), arXiv:2507.21214 [hep-ph]

  48. [48]

    P. Konar, V. S. Ngairangbam, M. Spannowsky, and D. Srivastava, Stable and interpretable jet physics with IRC-safe equivariant feature extraction, JHEP 03, 219, arXiv:2509.22059 [hep-ph]

  49. [49]

    H. Yuan, H. Yu, S. Gui, and S. Ji, Explainability in Graph Neural Networks: A Taxonomic Survey, IEEE Transactions on Pattern Analysis & Machine Intelligence 45, 5782 (2023)

  50. [51]

    R. Ying, D. Bourgeois, J. You, M. Zitnik, and J. Leskovec, GNNExplainer: Generating Explanations for Graph Neural Networks, arXiv e-prints, arXiv:1903.03894 (2019), arXiv:1903.03894 [cs.LG]

  51. [52]

    S. Akkas and A. Azad, GNNShap: Scalable and accurate GNN explanation using Shapley values, in Proceedings of the ACM Web Conference 2024, WWW ’24 (Association for Computing Machinery, New York, NY, USA, 2024) p. 827–838

  52. [53]

    P. E. Pope, S. Kolouri, M. Rostami, C. E. Martin, and H. Hoffmann, Explainability methods for graph convolutional neural networks, in 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) pp. 10764–10773

  53. [54]

    K. Amara, R. Ying, Z. Zhang, Z. Han, Y. Shan, U. Brandes, S. Schemm, and C. Zhang, GraphFramEx: Towards Systematic Evaluation of Explainability Methods for Graph Neural Networks, arXiv e-prints, arXiv:2206.09677 (2022), arXiv:2206.09677 [cs.LG]

  54. [55]

    C. Agarwal, O. Queen, H. Lakkaraju, and M. Zitnik, Evaluating explainability for graph neural networks, in Nature Scientific Data, Vol. 10 (2023) p. 144, arXiv:2208.09339 [cs.LG]

  55. [56]

    D. Adams et al., Towards an Understanding of the Correlations in Jet Substructure, Eur. Phys. J. C 75, 409 (2015), arXiv:1504.00679 [hep-ph]

  56. [57]

    M. Dasgupta, A. Fregoso, S. Marzani, and G. P. Salam, Towards an understanding of jet substructure, JHEP 09, 029, arXiv:1307.0007 [hep-ph]

  57. [58]

    A. J. Larkoski, QCD masterclass lectures on jet physics and machine learning, Eur. Phys. J. C 84, 1117 (2024), arXiv:2407.04897 [hep-ph]

  58. [59]

    M. Cacciari, G. P. Salam, and G. Soyez, The anti-kt jet clustering algorithm, JHEP 04, 063, arXiv:0802.1189 [hep-ph]

  59. [60]

    C. Bierlich et al., A comprehensive guide to the physics and usage of PYTHIA 8.3, SciPost Phys. Codeb. 2022, 8 (2022), arXiv:2203.11601 [hep-ph]

  60. [61]

    M. Bahr et al., Herwig++ Physics and Manual, Eur. Phys. J. C 58, 639 (2008), arXiv:0803.0883 [hep-ph]

  61. [62]

    Y. Wang, Y. Sun, Z. Liu, S. E. Sarma, M. M. Bronstein, and J. M. Solomon, Dynamic Graph CNN for Learning on Point Clouds, arXiv e-prints, arXiv:1801.07829 (2018), arXiv:1801.07829 [cs.CV]

  62. [63]

    A. Malara (ATLAS, CMS), Exploring jets: substructure and flavour tagging in CMS and ATLAS, PoS LHCP2024, 150 (2025), arXiv:2410.14330 [hep-ex]
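
The Graph GradCAM weights quoted in entry 3 of the reference graph, Eqs. (A6) and (A7), can be sketched numerically as follows. This is a minimal single-layer illustration under stated assumptions, not the paper's implementation. The toy class score (mean-pooled activations passed through a linear readout) is hypothetical and was chosen only because its gradient is known in closed form; a real tagger would obtain the gradient from automatic differentiation.

```python
import numpy as np


def graph_gradcam(activations, grad_score):
    """Node importance at one layer following Eqs. (A6) and (A7).

    activations: (N, K) array, activation of node i in channel k
    grad_score:  (N, K) array, derivative of the class score w.r.t. each activation
    """
    alpha = grad_score.mean(axis=0)                      # Eq. (A6): average over the N nodes
    node_importance = np.maximum(activations @ alpha, 0.0)  # Eq. (A7): ReLU of the channel sum
    return node_importance, alpha


# Toy setup (assumption): y_c = (1/N) * sum_i activations[i] @ w_c,
# so the derivative of y_c w.r.t. activations[i, k] is w_c[k] / N for every node i.
rng = np.random.default_rng(3)
N, K = 6, 4
H = rng.normal(size=(N, K))       # hypothetical layer activations
w_c = rng.normal(size=K)          # hypothetical readout weights for class c
dy_dH = np.tile(w_c / N, (N, 1))  # analytic gradient of the toy score

node_importance, alpha = graph_gradcam(H, dy_dH)
print("channel weights :", np.round(alpha, 3))
print("node importance :", np.round(node_importance, 3))
```

Because of the ReLU, nodes whose weighted activation pushes against class c receive zero importance rather than a negative score, so the map highlights only supporting evidence.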