E-PCN: Jet Tagging with Explainable Particle Chebyshev Networks Using Kinematic Features
Pith reviewed 2026-05-17 01:01 UTC · model grok-4.3
The pith
E-PCN classifies jets by building four graphs, each weighted by a different kinematic variable, and uses Grad-CAM to show that angular separation and transverse momentum together drive about 76 percent of decisions.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
E-PCN constructs four graph representations per jet, each weighted by one of angular separation Δ, transverse momentum k_T, momentum fraction z, or invariant mass squared m². Application of Grad-CAM reveals that Δ and k_T together account for approximately 76 percent of classification decisions. On the JetClass dataset with ten signal classes the network reaches 94.67 percent macro-accuracy, 96.78 percent macro-AUC, and 86.79 percent macro-AUPR, improving on the baseline PCN by 2.36 percent, 4.13 percent, and 24.88 percent respectively while supplying physically interpretable attributions.
What carries the argument
Four kinematic-weighted graph representations of each jet together with Grad-CAM attribution, allowing separate measurement of how much each kinematic variable contributes to the output class scores.
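As a rough illustration of the four-graph construction, the sketch below computes the four pairwise variables for every particle pair in a jet and returns one weighted adjacency matrix per variable. The formulas follow a common Lund-plane-style convention for pairwise features (Δ from rapidity and azimuth, k_T = min(pT) Δ, z = min(pT) / (pT_a + pT_b), m² from the summed four-momenta). The paper's exact definitions, any logarithms or normalisation it applies, and the names `Particle`, `pair_features`, and `kinematic_graphs` are assumptions made here for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Particle:
    pt: float          # transverse momentum
    y: float           # rapidity
    phi: float         # azimuthal angle in radians
    mass: float = 0.0  # massless unless stated otherwise

    def four_momentum(self):
        """Return (E, px, py, pz) built from (pt, y, phi, mass)."""
        mt = math.hypot(self.pt, self.mass)  # transverse mass
        return (
            mt * math.cosh(self.y),
            self.pt * math.cos(self.phi),
            self.pt * math.sin(self.phi),
            mt * math.sinh(self.y),
        )


def wrap_angle(dphi):
    """Map an azimuthal difference into [-pi, pi)."""
    return (dphi + math.pi) % (2.0 * math.pi) - math.pi


def pair_features(a, b):
    """Assumed pairwise definitions of delta, kt, z, and m2."""
    delta = math.hypot(a.y - b.y, wrap_angle(a.phi - b.phi))
    pt_min = min(a.pt, b.pt)
    e, px, py, pz = (u + v for u, v in zip(a.four_momentum(), b.four_momentum()))
    return {
        "delta": delta,
        "kt": pt_min * delta,
        "z": pt_min / (a.pt + b.pt),
        "m2": e * e - px * px - py * py - pz * pz,
    }


def kinematic_graphs(particles):
    """Return four symmetric weighted adjacency matrices, one per variable."""
    n = len(particles)
    graphs = {name: [[0.0] * n for _ in range(n)]
              for name in ("delta", "kt", "z", "m2")}
    for i in range(n):
        for j in range(i + 1, n):
            for name, value in pair_features(particles[i], particles[j]).items():
                graphs[name][i][j] = graphs[name][j][i] = value
    return graphs
```

A fully connected graph is used here only to keep the sketch short; the source does not say how edges are selected, so any neighbour cut is left out rather than guessed.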
If this is right
- Macro accuracy, AUC, and AUPR all rise relative to the baseline Particle Chebyshev Network on the ten-class JetClass task.
- Angular separation receives 40.72 percent and transverse momentum 35.67 percent of the total attribution weight.
- The remaining 24 percent of decisions are attributed to momentum fraction and invariant mass squared combined.
- The learned representations remain directly tied to measurable particle kinematics rather than opaque latent embeddings.
- The same four-graph construction can be applied to other graph neural networks used for jet substructure analysis.
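The attribution figures in the list above can be checked with simple arithmetic. The sketch below does that and also shows one plausible way such shares could be obtained, by normalising per-graph importance scores so they sum to 100 percent. Only the two quoted percentages come from the source; the `to_shares` helper and the idea that the paper normalises this way are assumptions.

```python
def to_shares(scores):
    """Normalise non-negative importance scores into percentage shares."""
    total = sum(scores.values())
    if total <= 0:
        raise ValueError("importance scores must have a positive sum")
    return {name: 100.0 * value / total for name, value in scores.items()}


# Percentages quoted in the text for the two dominant variables.
reported = {"delta": 40.72, "kt": 35.67}

dominant = sum(reported.values())  # share attributed to delta and kt together
remainder = 100.0 - dominant       # share left for z and m2 combined

print(f"delta + kt: {dominant:.2f} percent")
print(f"z + m2:     {remainder:.2f} percent")
```

The split of the remainder between momentum fraction and invariant mass squared is not given in the text, so it is not reproduced here.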
Where Pith is reading between the lines
- Physicists could use the same weighting scheme to test whether traditional jet algorithms already capture the same dominant variables that the network discovers.
- If the attributions remain stable across detector variations, the approach might reduce the number of kinematic inputs needed for future real-time triggers.
- Similar four-graph constructions could be tested on other high-energy datasets to see whether the 76 percent dominance of angular and transverse momentum variables generalizes.
- Direct comparison of Grad-CAM maps against permutation-based feature importance would provide an independent check on the explanation quality.
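The independent check suggested in the last point could look roughly like the following permutation test. It is a minimal sketch under stated assumptions: each sample is taken to be a dictionary keyed by graph name, `predict` stands in for the trained classifier, and accuracy is used as the metric. None of these interfaces come from the paper.

```python
import random


def accuracy(predict, samples, labels):
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(predict(s) == y for s, y in zip(samples, labels))
    return correct / len(labels)


def permutation_importance(predict, samples, labels, keys, n_repeats=5, seed=0):
    """Mean accuracy drop when one input is shuffled across samples."""
    rng = random.Random(seed)
    baseline = accuracy(predict, samples, labels)
    drops = {}
    for key in keys:
        total = 0.0
        for _ in range(n_repeats):
            column = [s[key] for s in samples]
            rng.shuffle(column)  # break the link between this input and the label
            permuted = [{**s, key: v} for s, v in zip(samples, column)]
            total += baseline - accuracy(predict, permuted, labels)
        drops[key] = total / n_repeats
    return drops
```

If the ranking of inputs by accuracy drop agreed with the Grad-CAM ranking, that would be some evidence that the attributions track what the model actually uses; disagreement would point toward an artifact of one of the two methods.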
Load-bearing premise
Grad-CAM attributions computed on the four kinematic-weighted graphs accurately reflect the causal importance of those variables inside the model's decision process rather than artifacts of the explanation method.
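For readers unfamiliar with the method, the core Grad-CAM computation is short enough to sketch. Each channel weight is the gradient of the class score averaged over positions (here, graph nodes), and the map is the ReLU of the weighted sum of activations. The function name, the list-of-lists layout, and the application to node features are assumptions; the paper's own adaptation to the four graphs, and how it turns maps into percentages, is not described in the text.

```python
def grad_cam_node_scores(activations, gradients):
    """Grad-CAM scores per node.

    activations[n][k]: activation of channel k at node n.
    gradients[n][k]:   gradient of the class score w.r.t. that activation.
    """
    n_nodes = len(activations)
    n_channels = len(activations[0])
    # Channel weights: gradients averaged over all nodes.
    alphas = [
        sum(gradients[n][k] for n in range(n_nodes)) / n_nodes
        for k in range(n_channels)
    ]
    # ReLU of the weighted activation sum keeps only positive evidence.
    return [
        max(0.0, sum(alphas[k] * activations[n][k] for k in range(n_channels)))
        for n in range(n_nodes)
    ]
```

Because the weights depend only on gradients, saturated units with near-zero gradient can receive little weight even if the model relies on them, which is one reason the premise above needs independent validation.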
What would settle it
An ablation that removes angular separation and transverse momentum features while keeping the other two, then checks whether the measured performance drop is at least three times larger than the drop obtained by removing only momentum fraction and invariant mass.
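The decision rule in this test can be written down directly. The sketch below assumes a single scalar metric measured three times (full model, dominant pair removed, minor pair removed); the function name and the handling of a non-positive minor drop are choices made here, not taken from the paper. The factor of three is close to the ratio of the reported attribution shares, roughly 76 to 24.

```python
def ablation_supports_attribution(baseline, without_dominant, without_minor,
                                  min_ratio=3.0):
    """True if removing delta and kt hurts at least min_ratio times more
    than removing z and m2."""
    drop_dominant = baseline - without_dominant
    drop_minor = baseline - without_minor
    if drop_minor <= 0:
        # Removing the minor pair did not hurt at all: any positive
        # drop for the dominant pair satisfies the criterion.
        return drop_dominant > 0
    return drop_dominant / drop_minor >= min_ratio
```

For example, with hypothetical accuracies of 0.90 (full), 0.60 (no Δ or k_T), and 0.85 (no z or m²), the drops are about 0.30 and 0.05 and the criterion is met.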
The identification and classification of collimated particle sprays, or jets, are essential for interpreting data from high-energy collider experiments. While deep learning has improved jet classification, it often lacks interpretability. We introduce the Explainable Particle Chebyshev Network (E-PCN), a graph neural network extending the Particle Chebyshev Network (PCN). E-PCN integrates kinematic variables into jet classification by constructing four graph representations per jet, each weighted by a distinct variable: angular separation ($\Delta$), transverse momentum ($k_T$), momentum fraction ($z$), and invariant mass squared ($m^2$). We use the concept of Gradient-weighted Class Activation Mapping (Grad-CAM) to determine which kinematic variables dominate classification outcomes. Analysis reveals that angular separation and transverse momentum collectively account for approximately 76% of classification decisions (40.72% and 35.67%, respectively), with momentum fraction and invariant mass contributing the remaining 24%. Evaluated on the JetClass dataset with 10 signal classes, E-PCN achieves a macro-accuracy of 94.67%, macro-AUC of 96.78%, and macro-AUPR of 86.79%, representing improvements of 2.36%, 4.13%, and 24.88% respectively over the baseline PCN implementation, while demonstrating physically interpretable feature learning.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces the Explainable Particle Chebyshev Network (E-PCN) as an extension of the Particle Chebyshev Network (PCN) for jet tagging. E-PCN constructs four graph representations per jet weighted by distinct kinematic variables: angular separation (Δ), transverse momentum (k_T), momentum fraction (z), and invariant mass squared (m²). It employs Grad-CAM to identify dominant features, reporting that angular separation and transverse momentum account for approximately 76% of classification decisions. On the JetClass dataset with 10 signal classes, E-PCN achieves a macro-accuracy of 94.67%, macro-AUC of 96.78%, and macro-AUPR of 86.79%, with improvements of 2.36%, 4.13%, and 24.88% over the baseline PCN.
Significance. If the interpretability claims hold, this contributes meaningfully to developing transparent ML models for high-energy physics applications. Jet tagging benefits from models that not only perform well but also provide insights aligned with physical quantities. The multi-graph approach using kinematic weights is a logical extension, and the reported metrics indicate competitive performance, though the explainability is the key differentiator.
major comments (2)
- [§4 (Performance Evaluation)] The abstract and results report specific performance numbers (e.g., 94.67% macro-accuracy) without error bars, standard deviations from multiple runs, or details on the training configuration and hyperparameter search. This makes it hard to evaluate the robustness of the claimed improvements over PCN and whether they are statistically meaningful.
- [§3.2 (Grad-CAM Analysis)] The central claim regarding feature importance (angular separation 40.72%, k_T 35.67%) is based on Grad-CAM applied to the four kinematic-weighted graphs. No validation is described for the faithfulness of these attributions, such as checks against gradient saturation, ablation of individual graphs, or comparison to alternative explanation techniques. This is load-bearing for the novelty of 'physically interpretable feature learning' and requires additional evidence to support that the attributions reflect causal usage in the model rather than artifacts.
minor comments (1)
- Consider adding a brief explanation or reference for the choice of the four specific kinematic variables (Δ, k_T, z, m²) and how they relate to standard jet substructure observables.
Simulated Author's Rebuttal
We thank the referee for the detailed and constructive review. The comments highlight important aspects of robustness and the validation of interpretability claims. We address each major comment below and outline the revisions planned for the next version of the manuscript.
- Referee: [§4 (Performance Evaluation)] The abstract and results report specific performance numbers (e.g., 94.67% macro-accuracy) without error bars, standard deviations from multiple runs, or details on the training configuration and hyperparameter search. This makes it hard to evaluate the robustness of the claimed improvements over PCN and whether they are statistically meaningful.
  Authors: We agree that reporting error bars and experimental details is necessary to allow proper evaluation of statistical significance and robustness. In the revised manuscript we will add standard deviations obtained from multiple independent training runs using different random seeds. We will also expand Section 4 to include the complete training configuration (optimizer, learning-rate schedule, batch size, number of epochs) and a description of the hyperparameter search procedure.
  Revision: yes
- Referee: [§3.2 (Grad-CAM Analysis)] The central claim regarding feature importance (angular separation 40.72%, k_T 35.67%) is based on Grad-CAM applied to the four kinematic-weighted graphs. No validation is described for the faithfulness of these attributions, such as checks against gradient saturation, ablation of individual graphs, or comparison to alternative explanation techniques. This is load-bearing for the novelty of 'physically interpretable feature learning' and requires additional evidence to support that the attributions reflect causal usage in the model rather than artifacts.
  Authors: We acknowledge that additional validation of the Grad-CAM attributions would strengthen the interpretability claims. The present manuscript applies Grad-CAM but does not report explicit faithfulness checks. In the revision we will include an ablation study that removes or masks each kinematic-weighted graph in turn and quantifies the resulting drop in classification metrics. We will also add a brief discussion of known Grad-CAM limitations such as gradient saturation and, space permitting, a comparison with a perturbation-based attribution method.
  Revision: partial
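The two promised revisions, per-seed error bars and a graph-masking ablation, fit naturally into one loop. The following is a hedged sketch of how such a table could be produced: `evaluate(seed, masked)` is a hypothetical callable that trains or evaluates the model with one graph masked (or none) and returns a scalar metric, and the graph names are placeholders.

```python
import statistics

GRAPHS = ("delta", "kt", "z", "m2")


def ablation_table(evaluate, seeds, graphs=GRAPHS):
    """Mean and sample standard deviation of a metric per masked graph.

    evaluate(seed, masked) -> float, where masked is a graph name or None
    (None means the full model with all four graphs).
    """
    table = {}
    for masked in (None, *graphs):
        scores = [evaluate(seed, masked) for seed in seeds]
        mean = statistics.mean(scores)
        # Sample standard deviation needs at least two runs.
        std = statistics.stdev(scores) if len(scores) > 1 else 0.0
        table[masked or "none"] = (mean, std)
    return table
```

Comparing each masked row against the "none" row, with the spread across seeds as the yardstick, would show both whether the gains over PCN exceed run-to-run noise and whether the graphs ranked highest by Grad-CAM are also the ones whose removal costs the most.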
Circularity Check
No significant circularity detected; results are empirical evaluations on held-out data
The paper reports empirical performance (94.67% macro-accuracy, etc.) from training E-PCN on the JetClass dataset and evaluating on its test split, plus post-hoc Grad-CAM attributions on the four kinematic-weighted graphs that yield the 76% figure. These quantities are measured outputs, not quantities that reduce by construction to the model definition or to fitted parameters. No derivation chain, uniqueness theorem, or self-citation is shown to be load-bearing for the accuracy or attribution numbers. The architecture choice of four separate graphs is an explicit modeling decision whose consequences are tested externally rather than presupposed.
Axiom & Free-Parameter Ledger
free parameters (1)
- graph construction and weighting hyperparameters
axioms (1)
- domain assumption: Grad-CAM produces faithful feature attributions for the GNN classifier
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
  Relation between the paper passage and the cited Recognition theorem.
  Paper passage: four graph representations per jet, each weighted by a distinct variable: angular separation (Δ), transverse momentum (k_T), momentum fraction (z), and invariant mass squared (m²)... Grad-CAM... angular separation and transverse momentum collectively account for approximately 76%
- IndisputableMonolith/Foundation/AlexanderDuality.lean · alexander_duality_circle_linking · unclear
  Relation between the paper passage and the cited Recognition theorem.
  Paper passage: Chebyshev graph convolutions... alternating ChebConv→EdgeConv
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Forward citations
Cited by 1 Pith paper
- Explainable AI for Jet Tagging: A Comparative Study of GNNExplainer, GNNShap, and GradCAM for Jet Tagging in the Lund Jet Plane
Explainability techniques applied to LundNet show that assigned node importance correlates with classical jet substructure observables such as N-subjettiness ratios and energy correlation functions, with shifts across...
Data availability
The JetClass dataset used in this study is publicly available through Zenodo, with an accompanying GitHub repository for easy integration into ML workflows.
Acknowledgments
This research is partially sup...