pith. machine review for the scientific record.

arxiv: 2605.14131 · v1 · submitted 2026-05-13 · ⚛️ physics.data-an · hep-ex · hep-ph

Recognition: 2 theorem links · Lean Theorem

Double Metric Learning for Building Directed Graphs with Chain Connections for the ATLAS ITk Detector

Authors on Pith: no claims yet

Pith reviewed 2026-05-15 02:14 UTC · model grok-4.3

classification ⚛️ physics.data-an · hep-ex · hep-ph
keywords double metric learning · directed graph construction · particle tracking · contrastive loss · ATLAS ITk · chain connections · GNN tracking

The pith

Double Metric Learning resolves contrastive loss conflicts in chain connections by learning two node representations for directed graph construction.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper addresses a conflict in standard metric learning for graph construction in particle tracking. When true edges form chains along a track, the contrastive loss pulls a node embedding toward both its predecessor and successor at once, creating an impossible objective. Double Metric Learning instead trains two independent representations per node. A directed edge from node A to node B is then added when one representation of A lies close to the other representation of B. Tests on simulated ATLAS ITk data show that this yields higher-quality graphs than single-metric learning, particularly for high transverse-momentum particles, while also recovering edge directions accurately.
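The conflict has a crisp geometric form: with a single embedding, the chain a → b → c imposes d(a, c) ≤ d(a, b) + d(b, c), so the hopping pair (a, c) can never be pushed farther apart than the two true-pair distances allow. A minimal NumPy sketch of why two embeddings escape this bound (toy hand-placed vectors, not the paper's trained model):

```python
import numpy as np

rng = np.random.default_rng(0)

def d(x, y):
    return float(np.linalg.norm(x - y))

# Single metric: one embedding per hit. For a chain a -> b -> c the
# contrastive loss wants d(a, b) and d(b, c) small but the hopping
# pair d(a, c) large. The triangle inequality forbids all three at once:
a, b, c = rng.normal(size=(3, 8))
assert d(a, c) <= d(a, b) + d(b, c) + 1e-9

# Double metric: each hit gets a source embedding u and a target
# embedding v. True pairs (u_a, v_b) and (u_b, v_c) can be made tight
# while the hopping pair (u_a, v_c) stays far: the loss never
# constrains d(v_b, v_c), so the bound
# d(u_a, v_c) <= d(u_a, v_b) + d(v_b, v_c) no longer pins it down.
u = {"a": rng.normal(size=8), "b": rng.normal(size=8)}
v = {"b": u["a"] + 1e-3, "c": u["b"] + 1e-3}   # hand-placed, not learned
print(d(u["a"], v["b"]), d(u["b"], v["c"]), d(u["a"], v["c"]))
```

The two true-pair cross-distances come out tiny while the hopping cross-distance stays at the scale of two random vectors, which is exactly the regime the single-metric loss cannot reach.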

Core claim

Double Metric Learning learns two separate embeddings for each detector hit. Directed edges are constructed by measuring distance between the first embedding of one hit and the second embedding of another. This decouples the learning objectives that conflict under ordinary contrastive loss when edges must form ordered chains.

What carries the argument

Double Metric Learning, which produces two node embeddings per hit so that directed edge decisions rest on the cross-distance between one embedding of the source and the other embedding of the target.
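The construction rule itself is a few lines. This is a toy NumPy illustration with hand-placed embeddings and an assumed distance threshold r; the paper's trained ITk model, embedding dimension, and cutoff value are not specified here:

```python
import numpy as np

def build_directed_edges(u, v, r):
    """Add edge i -> j when the source embedding u[i] lies within
    radius r of the target embedding v[j]. The cross-distance matrix
    D[i, j] = ||u_i - v_j|| is asymmetric, so edge direction comes
    for free from which embedding plays which role."""
    D = np.linalg.norm(u[:, None, :] - v[None, :, :], axis=-1)
    np.fill_diagonal(D, np.inf)          # forbid self-loops
    src, dst = np.nonzero(D < r)
    return list(zip(src.tolist(), dst.tolist()))

# Toy chain of 4 hits along a line: place v of each hit on top of u
# of its predecessor (hand-placed stand-in for what training learns).
u = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [3.0, 0.0]])
v = u - np.array([1.0, 0.0])             # v[j] coincides with u[j-1]
edges = build_directed_edges(u, v, r=0.1)
print(edges)                              # [(0, 1), (1, 2), (2, 3)]
```

Only the successive, correctly oriented pairs survive the cut: reversing a candidate edge swaps which embedding is measured against which, so (1, 0) is not recovered even though (0, 1) is.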

If this is right

  • Graph construction quality improves especially for high transverse-momentum particles.
  • Edge directions are recovered directly from the learned representations without extra post-processing.
  • The resulting directed graphs supply cleaner input to downstream GNN tracking stages.
  • The same two-embedding pattern can be applied to any tracking detector whose hits form chain-like trajectories.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The method may reduce the need for separate direction-inference modules in existing GNN pipelines.
  • It could be combined with existing embedding regularizers to further control overfitting on simulation.
  • Extension to multi-layer graphs might allow simultaneous learning of both spatial and directional relations.

Load-bearing premise

Two independent embeddings per node can be learned without one collapsing into the other or introducing bias that degrades tracking performance on real data.
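One cheap diagnostic for that premise is to monitor how similar the two learned representations are per node. A sketch, assuming the trained embeddings are available as arrays; the score and its interpretation thresholds are illustrative, not from the paper:

```python
import numpy as np

def collapse_score(u, v):
    """Mean cosine similarity between each node's source embedding
    (row of u) and target embedding (row of v). A score near 1
    suggests the two representations have collapsed into one map;
    independent representations score near 0."""
    u_n = u / np.linalg.norm(u, axis=1, keepdims=True)
    v_n = v / np.linalg.norm(v, axis=1, keepdims=True)
    return float(np.mean(np.sum(u_n * v_n, axis=1)))

rng = np.random.default_rng(1)
u = rng.normal(size=(1000, 16))
print(collapse_score(u, u))                             # ≈ 1: collapsed
print(collapse_score(u, rng.normal(size=(1000, 16))))   # ≈ 0: independent
```

Tracking this score during training would flag the failure mode the premise worries about before it silently degrades graph quality.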

What would settle it

Running the same Double Metric Learning pipeline on actual ATLAS ITk collision data and finding no improvement in graph purity or direction accuracy over single-metric learning would falsify the central claim.

Figures

Figures reproduced from arXiv: 2605.14131 by Jay Chan.

Figure 1. Illustration of true hit pair definitions: (a) cluster connection, where all hits from the same track are … [PITH_FULL_IMAGE:figures/full_fig_p003_1.png]
Figure 2. Schematic of Simple Metric Learning objectives. (a) Cluster connection: forces all track hits … [PITH_FULL_IMAGE:figures/full_fig_p004_2.png]
Figure 3. Schematic of the Double Metric Learning objective for chain connection. By utilizing an asymmetric … [PITH_FULL_IMAGE:figures/full_fig_p005_3.png]
Figure 4. Graph construction efficiency as a function of (a) the transverse momentum … [PITH_FULL_IMAGE:figures/full_fig_p006_4.png]
Figure 5. Ratio of number of reconstructed edges corresponding to two non-successive space points (“hopping … [PITH_FULL_IMAGE:figures/full_fig_p007_5.png]
Original abstract

Graph construction is an essential step in the Graph Neural Network (GNN) based tracking pipelines. The goal of the graph construction is to construct a graph that contains only the defined true edge connections between nodes (detector hits). A promising approach for the graph construction is through the Metric Learning approach, where a node representation in an embedding space is learned, and nodes are connected according to their distance in the embedding space. The loss function for the metric learning in this case is a contrastive loss encouraging the true pairs of nodes to be close to each other, and pulling away the false pairs of nodes. This approach presents a conflict of the learning objective for the hopping connections when a true edge is defined as a chain connection in a particle track. To address the conflict for this case, we propose a “Double Metric Learning” approach, where two node representations are learned. A directed graph can then be constructed based on the distance between the two representations from two nodes respectively. We test this idea with the ATLAS ITk detector at the HL-LHC using the ATLAS ITk simulation and show better graph construction performance particularly for particles with high transverse momentum compared to the Simple Metric Learning approach. We also show that Double Metric Learning is able to accurately predict edge direction.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes Double Metric Learning for constructing directed graphs from detector hits in the ATLAS ITk at the HL-LHC. By learning two independent node embeddings per hit, the method constructs directed edges from cross-distances to resolve the contrastive-loss conflict that arises for chain connections (A-B-C) along a particle track; the authors report improved graph-construction performance relative to Simple Metric Learning, especially for high-pT particles, together with accurate edge-direction prediction, all evaluated on ATLAS ITk simulation.

Significance. If the reported gains are shown to arise from the architectural resolution of the chain-connection conflict rather than from doubled embedding capacity, the technique would supply a practical improvement to the graph-construction stage of GNN-based tracking pipelines. The explicit direction prediction is a useful side benefit for downstream directed-graph algorithms. The work is therefore potentially relevant to HL-LHC tracking, but its significance is currently limited by the absence of capacity-matched controls and quantitative metrics.

major comments (2)
  1. [Abstract] The claim of 'better graph construction performance particularly for particles with high transverse momentum' is unsupported by any numerical values, error bars, baseline details, or ablation studies, leaving the central performance claim only weakly evidenced.
  2. [Results / Experiments] Experimental comparison (implicit in the abstract and results): the Simple Metric Learning baseline is not stated to have been capacity-matched (e.g., by doubling its embedding dimension or parameter count to equal that of the double-representation model), so any observed improvement could be attributable to increased model capacity rather than to the proposed mechanism for resolving contrastive-loss conflicts on chain connections.
minor comments (2)
  1. [Methods] The notation distinguishing the two learned representations per node should be introduced with explicit equations early in the methods section to improve readability.
  2. [Discussion] A brief discussion of how the directed-graph output integrates with existing GNN tracking pipelines would help readers assess downstream impact.
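The capacity-matching control the referee asks for in major comment 2 reduces to a parameter-count identity. A hypothetical sketch, since the paper's encoder architecture is not specified here: assume an MLP trunk of hidden width h that emits either two d-dimensional embeddings per hit (Double Metric Learning) or a single embedding of dimension 2d (the capacity-matched Simple Metric Learning baseline):

```python
def mlp_params(sizes):
    """Parameter count of a dense MLP: weights (m * n) plus biases (n)
    for each consecutive layer pair in `sizes`."""
    return sum(m * n + n for m, n in zip(sizes, sizes[1:]))

in_dim, h, d = 12, 256, 12   # illustrative hit-feature / hidden / embed dims

# Double Metric Learning: one trunk, output width 2d (two d-dim embeddings).
double = mlp_params([in_dim, h, h, 2 * d])
# Capacity-matched baseline: single embedding of dimension 2d -> same count.
matched_simple = mlp_params([in_dim, h, h, 2 * d])
# Naive baseline with a d-dim embedding is strictly smaller, so a win over
# it conflates the double-embedding mechanism with added capacity.
naive_simple = mlp_params([in_dim, h, h, d])

print(double, matched_simple, naive_simple)
```

Under this accounting, widening the baseline's output layer to 2d equalizes parameters exactly, isolating the architectural effect the paper claims.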

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the thorough review and valuable comments on our manuscript. We address each major comment below and will make the necessary revisions to strengthen the paper.

Point-by-point responses
  1. Referee: [Abstract] The claim of 'better graph construction performance particularly for particles with high transverse momentum' is unsupported by any numerical values, error bars, baseline details, or ablation studies, leaving the central performance claim only weakly evidenced.

    Authors: We agree with the referee that the abstract's performance claim would be stronger with supporting numerical evidence. In the revised manuscript, we will include specific quantitative results, such as efficiency and purity metrics for high-pT particles with error bars, and clarify the baseline details and any ablation studies performed. revision: yes

  2. Referee: [Results / Experiments] Experimental comparison (implicit in the abstract and results): the Simple Metric Learning baseline is not stated to have been capacity-matched (e.g., by doubling its embedding dimension or parameter count to equal that of the double-representation model), so any observed improvement could be attributable to increased model capacity rather than to the proposed mechanism for resolving contrastive-loss conflicts on chain connections.

    Authors: This is a fair criticism. The current manuscript does not explicitly describe a capacity-matched baseline for Simple Metric Learning. We will revise the experimental section to include a comparison against a capacity-matched variant of Simple Metric Learning, for example by increasing its embedding dimension to match the total parameters of the Double Metric Learning model. This will help demonstrate whether the gains arise from the double-embedding architecture's ability to resolve the chain-connection conflict in the contrastive loss. revision: yes

Circularity Check

0 steps flagged

No circularity; architectural proposal tested on external simulation

full rationale

The paper proposes Double Metric Learning as an independent architectural change that learns two node representations to construct directed edges and resolve contrastive-loss conflicts for chain connections. This is motivated by the limitations of standard metric learning and then evaluated empirically on ATLAS ITk simulation data, with reported gains versus the simple baseline. No equations, fitted parameters, or claims reduce by construction to the inputs themselves; no self-citations bear the load of the central result; and the performance claims rest on external simulation benchmarks rather than internal redefinitions or renamings.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The proposal rests on the domain assumption that two independent embeddings can be trained to encode direction without additional constraints; no free parameters or invented entities are named in the abstract.

axioms (1)
  • domain assumption: Contrastive loss applied separately to two embeddings can encode directed chain connections without conflict.
    Invoked when the paper states the double representation solves the hopping-connection problem.

pith-pipeline@v0.9.0 · 5519 in / 1127 out tokens · 36425 ms · 2026-05-15T02:14:42.641074+00:00 · methodology


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
  • matches: The paper's claim is directly supported by a theorem in the formal canon.
  • supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
  • extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
  • uses: The paper appears to rely on the theorem as machinery.
  • contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
  • unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
