arxiv: 2604.07617 · v1 · submitted 2026-04-08 · ⚛️ physics.comp-ph · physics.plasm-ph

CATAPULT: A CUDA-Accelerated Timestepper for Alpha Particles Using Local Tricubics

Alexey R. Knyazev, David Bindel, Elizabeth J. Paul, Michael Czekanski

Pith reviewed 2026-05-10 16:52 UTC · model grok-4.3

classification ⚛️ physics.comp-ph physics.plasm-ph

keywords: alpha particles, stellarators, Monte Carlo simulation, GPU acceleration, CUDA, tricubic interpolation, particle confinement, shear Alfven waves

The pith

CATAPULT delivers a CUDA timestepper that accelerates Monte Carlo alpha particle tracking in stellarators using local tricubic interpolation of the magnetic field.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces CATAPULT as a GPU implementation for timestepping alpha particles in Monte Carlo confinement calculations for stellarators. It claims the code runs significantly faster than existing parallel CPU versions while correctly handling both static equilibrium magnetic fields and time-varying shear Alfven waves. A reader would care because alpha particle losses are a key constraint on stellarator fusion performance, and faster simulations allow more particles or longer integration times to be studied in practical design work. The implementation is tested on several example stellarators and released as open source within the firm3d package.

Core claim

We introduce CATAPULT, a CUDA-Accelerated Timestepper for Alpha Particles Using Local Tricubics, for Monte Carlo calculations of alpha particle confinement in stellarators. Our GPU implementation is significantly faster than existing parallelized CPU implementations, and handles both equilibrium magnetic fields and Shear Alfven Waves. We test our implementation on several example stellarators to exhibit both the speed and correctness of our code.

What carries the argument

Local tricubic interpolation of the magnetic field inside a CUDA-accelerated particle timestepper for Monte Carlo orbit integration.
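
As a rough illustration of what local tricubic interpolation involves, the sketch below evaluates a tensor-product cubic Lagrange interpolant on a single 4×4×4 block of nodes. The function names, the equispaced node layout on the unit cube, and the sample field are assumptions made for illustration; they are not taken from CATAPULT or firm3d, and the actual code runs as CUDA kernels rather than Python.

```python
# Hypothetical sketch: tricubic (tensor-product cubic Lagrange) interpolation
# on one cell holding a 4x4x4 block of nodal values. Not CATAPULT's API.

NODES = (0.0, 1.0 / 3.0, 2.0 / 3.0, 1.0)  # assumed equispaced nodes per axis


def lagrange_weights(t):
    """Return the four cubic Lagrange basis values at local coordinate t."""
    weights = []
    for i, ni in enumerate(NODES):
        w = 1.0
        for j, nj in enumerate(NODES):
            if i != j:
                w *= (t - nj) / (ni - nj)
        weights.append(w)
    return weights


def tricubic_eval(values, x, y, z):
    """Interpolate values[i][j][k] (given on NODES^3) at the point (x, y, z)."""
    wx, wy, wz = lagrange_weights(x), lagrange_weights(y), lagrange_weights(z)
    total = 0.0
    for i in range(4):
        for j in range(4):
            for k in range(4):
                total += wx[i] * wy[j] * wz[k] * values[i][j][k]
    return total


def tabulate_cell(field):
    """Tabulate a scalar function on the 4x4x4 nodes of the unit cell."""
    return [[[field(a, b, c) for c in NODES] for b in NODES] for a in NODES]


def sample_field(x, y, z):
    """Polynomial of degree <= 3 in each variable, so it is reproduced exactly."""
    return 1.0 + x**3 - 2.0 * x * y**2 + 0.5 * y * z**3


cell = tabulate_cell(sample_field)
approx = tricubic_eval(cell, 0.37, 0.81, 0.12)
exact = sample_field(0.37, 0.81, 0.12)
```

Because only the 64 values of the current cell are touched per evaluation, the data for one particle stays small and local, which is presumably part of what makes the approach attractive on a GPU. A magnetic field would apply the same weights to each vector component.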

If this is right

Monte Carlo runs can track larger numbers of particles or reach longer times without changing hardware.
The same local-tricubic approach extends to time-dependent fields such as shear Alfven waves without losing the speed gain.
Open-source release allows direct integration into existing stellarator optimization or confinement workflows.
Performance on example devices provides a baseline for scaling to reactor-relevant sizes.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Adoption could lower the barrier to exploring alpha-particle transport across many stellarator configurations during design iterations.
Pairing the GPU timestepper with other accelerated modules might eventually support full-device, self-consistent particle-field simulations.

Load-bearing premise

Local tricubic interpolation of the magnetic field is accurate enough for the Monte Carlo timestepping, and the chosen example stellarators are representative of the general case.

What would settle it

Direct side-by-side comparison of alpha particle loss fractions or confinement times between CATAPULT and an established CPU code on the same stellarator equilibrium and wave field, or against known analytic solutions for simple test fields.
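
One way to frame such a side-by-side comparison is as a two-sample test on Monte Carlo loss fractions. The sketch below treats each particle as an independent Bernoulli trial and expresses the difference between two codes in units of its binomial standard error. The loss counts are made-up placeholders, not results from the paper; only the sample size of 32,768 particles echoes the figure captions.

```python
# Hypothetical sketch: compare loss fractions from two codes run on the same
# equilibrium. The loss counts below are placeholders, not measured data.
import math


def loss_fraction(n_lost, n_total):
    """Return the loss fraction and its binomial standard error."""
    p = n_lost / n_total
    return p, math.sqrt(p * (1.0 - p) / n_total)


def z_score(lost_a, lost_b, n_total):
    """Two-proportion z statistic using the pooled loss fraction.

    Assumes the two runs use independently sampled particles.
    """
    p_a = lost_a / n_total
    p_b = lost_b / n_total
    pooled = (lost_a + lost_b) / (2.0 * n_total)
    stderr = math.sqrt(2.0 * pooled * (1.0 - pooled) / n_total)
    return (p_a - p_b) / stderr


N = 32768
p_gpu, err_gpu = loss_fraction(1650, N)  # placeholder count for the GPU code
p_cpu, err_cpu = loss_fraction(1610, N)  # placeholder count for the CPU code
z = z_score(1650, 1610, N)
```

A value of |z| below about 2 would mean the two loss fractions are statistically indistinguishable at this sample size. If both codes start from identical initial conditions, the samples are paired rather than independent, and a particle-by-particle comparison of outcomes would be the sharper test.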

Figures

Figures reproduced from arXiv: 2604.07617 by Alexey R. Knyazev, David Bindel, Elizabeth J. Paul, Michael Czekanski.

**Figure 1.** An example cubic spline in s with two cells. Data is known on 7 nodes and continuity is guaranteed on the boundary between cells, but differentiability is not. Interpolant colored by cell. … represented on a 4×4×4 grid of points covering the cell, and we use tricubic interpolation on the grid to recover a dense representation of the magnetic field. As a particle moves from one cell to another, the data used …

**Figure 3.** Runtime is dominated by increased fidelity through tracing toler…

**Figure 2.** Resolution of loss fraction with varying fidelity for 32,768 particles using guiding …

**Figure 3.** Runtime with varying fidelity. We observe linear scaling in the number of …

**Figure 4.** GPU speed up vs. 128 CPU cores on Perlmutter. For all devices, tolerances, and …

**Figure 5.** Runtime with varying fidelity for SAWs. We observe linear scaling behavior in …

**Figure 6.** GPU speed up vs. 128 CPU cores on Perlmutter for SAWs. We observe a much …

**Figure 7.** Runtime with varying fidelity for Cartesian vacuum tracing. We observe similar …

**Figure 8.** GPU speed up vs. 128 CPU cores on Perlmutter for Cartesian vacuum tracing.
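
The Figure 1 caption can be made concrete with a one-dimensional sketch: two cells, each carrying its own cubic through four nodes, share the middle node, giving 7 nodes in total. Assuming a Lagrange form on equispaced nodes and sine-valued sample data (both illustrative choices, not necessarily what CATAPULT uses), the interpolant is continuous at the shared node while its slope generally jumps there.

```python
# Hypothetical 1D sketch: piecewise cubic on two cells sharing one node.
import math

NODES = [i / 6.0 for i in range(7)]         # 7 equispaced nodes on [0, 1]
DATA = [math.sin(4.0 * s) for s in NODES]   # assumed sample data


def cubic_through(nodes, data, s):
    """Evaluate the cubic Lagrange interpolant through 4 (node, value) pairs."""
    total = 0.0
    for i in range(4):
        w = 1.0
        for j in range(4):
            if i != j:
                w *= (s - nodes[j]) / (nodes[i] - nodes[j])
        total += w * data[i]
    return total


def interpolate(s):
    """Use nodes 0..3 in the left cell and nodes 3..6 in the right cell."""
    if s <= NODES[3]:
        return cubic_through(NODES[0:4], DATA[0:4], s)
    return cubic_through(NODES[3:7], DATA[3:7], s)


def one_sided_slopes(h=1e-6):
    """Finite-difference slopes just left and just right of the shared node."""
    s_mid = NODES[3]
    left = (interpolate(s_mid) - interpolate(s_mid - h)) / h
    right = (interpolate(s_mid + h) - interpolate(s_mid)) / h
    return left, right


s_mid = NODES[3]
# Both cell-local cubics pass through the shared nodal value.
value_gap = abs(cubic_through(NODES[0:4], DATA[0:4], s_mid)
                - cubic_through(NODES[3:7], DATA[3:7], s_mid))
slope_left, slope_right = one_sided_slopes()
slope_gap = abs(slope_left - slope_right)
```

Here `value_gap` vanishes because both cubics interpolate the same nodal value, whereas `slope_gap` stays visibly nonzero. That slope jump is the loss of differentiability the caption refers to.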

Original abstract

We introduce a CUDA-Accelerated Timestepper for Alpha Particles Using Local Tricubics (CATAPULT) for use in Monte Carlo calculations of alpha particle confinement in stellarators. Our GPU implementation is significantly faster than existing parallelized CPU implementations, and handles both equilibrium magnetic fields and Shear Alfven Waves. We test our implementation on several example stellarators to exhibit both the speed and correctness of our code. The source code is included in the firm3d Python package.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

CATAPULT is a straightforward GPU port for alpha-particle Monte Carlo in stellarators that releases its code, but the abstract gives no numbers on interpolation error or orbit fidelity, so the correctness claim stays unproven.

The paper's main contribution is a named CUDA implementation, CATAPULT, that steps alpha particles on local tricubic interpolants of stellarator fields, including both equilibrium and Shear Alfven perturbations. It is packaged inside firm3d and the authors say it runs substantially faster than existing parallel CPU codes while passing basic tests on a few example devices. Releasing the source is the clearest positive step; anyone who needs faster particle tracking for confinement studies now has a concrete starting point rather than having to build the GPU layer from scratch. The approach itself is not conceptually new, since GPU particle pushers and tricubic interpolation both exist, but the specific combination for stellarator alpha work with wave perturbations is a practical increment.

The soft spot is exactly where the stress-test note flags it. The abstract asserts both speed and correctness from tests on example stellarators, yet supplies no interpolation error norms, no check that div B remains zero to machine precision, no comparison of loss fractions or orbit invariants against a global spline or analytic reference, and no baseline timings with the same physics. Without those quantities it is impossible to judge whether the speed gain preserves the physics that actually matters for reactor design. Minor implementation details such as memory layout or kernel tuning are secondary until the fidelity question is settled.

This work is aimed at computational plasma physicists who already run Monte Carlo alpha studies and want a faster drop-in tool. A reader in that group could extract useful code and timing ideas even if the validation section needs expansion. It is coherent on its own terms and shows honest engagement with the practical problem, so it clears the bar for serious refereeing. I would send it out, with the expectation that the authors add the missing quantitative checks on interpolation accuracy and orbit conservation before acceptance.

Referee Report

2 major / 1 minor

Summary. The paper introduces CATAPULT, a CUDA-accelerated timestepper for alpha particles using local tricubic interpolation of magnetic fields for Monte Carlo confinement calculations in stellarators. It claims the GPU implementation is significantly faster than existing parallelized CPU codes, handles both equilibrium fields and Shear Alfven Wave perturbations, and demonstrates both speed and correctness via tests on several example stellarators. Source code is provided in the firm3d Python package.

Significance. If the local tricubic interpolation proves accurate and the performance claims hold under quantitative scrutiny, this could accelerate Monte Carlo alpha-particle orbit simulations in stellarators, aiding fusion reactor design studies. The combination of GPU acceleration with support for perturbed fields is a practical contribution to computational plasma physics tools.

major comments (2)

[Abstract] The assertion that tests on example stellarators exhibit correctness lacks any reported quantitative results, error metrics, baseline comparisons, or details on the test cases (e.g., specific stellarators, orbit parameters, or reference solutions).
[Tests/results description] Validation of the central interpolation method: no interpolation error norms, checks on div B = 0 preservation, field-line tracing accuracy, conservation of orbit invariants, or direct comparisons of loss fractions/confinement times against a global spline or analytic reference implementation are provided. This is load-bearing for the correctness claim.
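
A check of the kind requested here could look like the sketch below, which estimates div B by central finite differences at a set of sample points. The field is a simple analytic divergence-free example chosen for illustration; in practice `b_field` would be replaced by the interpolated field under test. All names are hypothetical and not part of firm3d.

```python
# Hypothetical sketch: finite-difference estimate of div B at sample points.
import math


def b_field(x, y, z):
    """Assumed analytic test field with zero divergence (not a stellarator)."""
    return (math.sin(x) * math.cos(y), -math.cos(x) * math.sin(y), x * y)


def leaky_field(x, y, z):
    """Deliberately non-solenoidal field with div = 1, as a negative control."""
    return (x, 0.0, 0.0)


def divergence(field, x, y, z, h=1e-4):
    """Central-difference estimate of div(field) at (x, y, z)."""
    dbx = (field(x + h, y, z)[0] - field(x - h, y, z)[0]) / (2.0 * h)
    dby = (field(x, y + h, z)[1] - field(x, y - h, z)[1]) / (2.0 * h)
    dbz = (field(x, y, z + h)[2] - field(x, y, z - h)[2]) / (2.0 * h)
    return dbx + dby + dbz


def max_divergence(field, points):
    """Largest absolute divergence estimate over the sample points."""
    return max(abs(divergence(field, *p)) for p in points)


POINTS = [(0.1 * i, 0.2 * j, 0.3 * k)
          for i in range(3) for j in range(3) for k in range(3)]
div_good = max_divergence(b_field, POINTS)
div_bad = max_divergence(leaky_field, POINTS)
```

A finite-difference estimate carries its own truncation and roundoff error, so it can only bound the divergence down to that level. For a piecewise-polynomial interpolant, the derivatives could instead be evaluated analytically within each cell.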

minor comments (1)

The manuscript would benefit from explicit description of the CUDA kernel design, memory layout for the local tricubics, and the precise stellarator configurations used in the timing and correctness tests.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their careful reading of the manuscript and constructive comments on the abstract and validation sections. We address each major comment below and indicate the revisions we will make to the next version of the manuscript.

Referee: [Abstract] The assertion that tests on example stellarators exhibit correctness lacks any reported quantitative results, error metrics, baseline comparisons, or details on the test cases (e.g., specific stellarators, orbit parameters, or reference solutions).

Authors: The abstract is written as a high-level summary of the work. Detailed descriptions of the test cases, including the specific stellarators examined, orbit parameters, and comparisons to reference solutions, are provided in the Results section of the manuscript. We agree, however, that the abstract would be strengthened by a brief reference to the quantitative aspects of the validation. We will revise the abstract to note the quantitative agreement in loss fractions and orbit invariants obtained against reference implementations.
Revision: yes

Referee: [Tests/results description] Validation of the central interpolation method: no interpolation error norms, checks on div B = 0 preservation, field-line tracing accuracy, conservation of orbit invariants, or direct comparisons of loss fractions/confinement times against a global spline or analytic reference implementation are provided. This is load-bearing for the correctness claim.

Authors: We acknowledge that the current manuscript relies primarily on qualitative agreement and visual comparisons of trajectories to demonstrate correctness of the local tricubic interpolation. Explicit quantitative metrics such as interpolation error norms, explicit checks on div B = 0, field-line tracing accuracy, conservation of orbit invariants, and side-by-side loss-fraction comparisons to a global spline reference are not reported. We will add these quantitative validations, including tabulated error norms and direct comparisons of confinement times, to the revised manuscript.
Revision: yes

Circularity Check

0 steps flagged

No circularity: implementation and performance paper with no derivation chain

The paper introduces a CUDA-accelerated timestepper (CATAPULT) for Monte Carlo alpha-particle calculations in stellarators, emphasizing GPU speed gains over CPU codes and functionality for equilibrium fields plus Shear Alfven waves. All claims rest on direct code implementation, runtime benchmarks, and empirical tests on example stellarators rather than any mathematical derivation, fitted parameters, or predictions. No equations, ansatzes, uniqueness theorems, or self-citations are invoked as load-bearing steps that reduce to the inputs by construction; the work is therefore self-contained.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

This is a computational implementation paper. No free parameters, physical axioms, or invented entities are described in the abstract.

pith-pipeline@v0.9.0 · 5390 in / 1220 out tokens · 54363 ms · 2026-05-10T16:52:54.578794+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/RealityFromDistinction.lean reality_from_one_distinction unclear
We use tricubic interpolation on the grid to recover a dense representation of the magnetic field... Dormand-Prince 5 (DP5) adaptive timestepper
IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear
Our GPU implementation is significantly faster than existing parallelized CPU implementations
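
The first quoted passage above mentions a Dormand-Prince 5 (DP5) adaptive timestepper. For readers unfamiliar with it, the sketch below shows the general shape of such a scheme: a 5th-order step with an embedded 4th-order error estimate that drives the step size. The coefficients are the standard Dormand-Prince tableau; the controller (absolute tolerance only, safety factor 0.9) and all function names are simplifying assumptions, not the logic used in CATAPULT or Boost.Odeint.

```python
# Hypothetical sketch: Dormand-Prince 5(4) adaptive integration of y' = f(t, y).
import math

C = (0.0, 1/5, 3/10, 4/5, 8/9, 1.0, 1.0)
A = (
    (),
    (1/5,),
    (3/40, 9/40),
    (44/45, -56/15, 32/9),
    (19372/6561, -25360/2187, 64448/6561, -212/729),
    (9017/3168, -355/33, 46732/5247, 49/176, -5103/18656),
    (35/384, 0.0, 500/1113, 125/192, -2187/6784, 11/84),
)
B5 = (35/384, 0.0, 500/1113, 125/192, -2187/6784, 11/84, 0.0)
B4 = (5179/57600, 0.0, 7571/16695, 393/640, -92097/339200, 187/2100, 1/40)


def dp5_step(f, t, y, h):
    """One DP5 step for a list-valued state; returns (y5, error estimate)."""
    dim = len(y)
    k = []
    for i in range(7):
        yi = [y[n] + h * sum(A[i][j] * k[j][n] for j in range(i))
              for n in range(dim)]
        k.append(f(t + C[i] * h, yi))
    y5 = [y[n] + h * sum(B5[i] * k[i][n] for i in range(7)) for n in range(dim)]
    err = max(abs(h * sum((B5[i] - B4[i]) * k[i][n] for i in range(7)))
              for n in range(dim))
    return y5, err


def integrate(f, t0, y0, t_end, tol=1e-8, h=0.1):
    """Integrate from t0 to t_end, accepting steps whose error is below tol."""
    t, y, steps = t0, list(y0), 0
    while t < t_end:
        h = min(h, t_end - t)
        y_new, err = dp5_step(f, t, y, h)
        if err <= tol:              # accept the step
            t, y, steps = t + h, y_new, steps + 1
        # Rescale h from the error estimate, bounded to avoid wild jumps.
        factor = 0.9 * (tol / err) ** 0.2 if err > 0.0 else 5.0
        h *= min(5.0, max(0.2, factor))
    return y, steps


def decay(t, y):
    """Test problem y' = -y with exact solution exp(-t)."""
    return [-y[0]]


y_end, n_steps = integrate(decay, 0.0, [1.0], 1.0)
```

In a particle tracer, each particle would carry its own `t` and `h`, so particles in difficult regions take smaller steps without slowing the rest. How that per-particle adaptivity maps onto GPU threads is not described in the text available here.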

Reference graph

Works this paper leans on

15 extracted references · 2 canonical work pages

[1] P. Bonofiglo, D. Dudt, C. Swanson, Fast ion confinement in quasi-axisymmetric stellarator equilibria, Nuclear Fusion 65 (2) (2025) 026050

[2] D. Bindel, M. Landreman, M. Padidar, Direct optimization of fast-ion confinement in stellarators, Plasma Physics and Controlled Fusion 65 (6) (2023) 065012

[3] M. Landreman, B. Medasani, F. Wechsung, A. Giuliani, R. Jorge, C. Zhu, Simsopt: a flexible framework for stellarator optimization, Journal of Open Source Software 6 (65) (2021) 3525

[4] C. G. Albert, S. V. Kasilov, W. Kernbichler, Accelerated methods for direct computation of fusion alpha particle losses within stellarator optimization, Journal of Plasma Physics 86 (2) (2020) 815860201

[5] R. B. White, M. S. Chance, Hamiltonian guiding center drift orbit calculation for plasmas of arbitrary cross section, The Physics of Fluids 27 (10) (1984) 2455–2467

[6] E. Hirvijoki, O. Asunta, T. Koskela, T. Kurki-Suonio, J. Miettunen, S. Sipilä, A. Snicker, S. Äkäslompolo, ASCOT: Solving the kinetic equation of minority particle species in tokamak plasmas, Computer Physics Communications 185 (4) (2014) 1310–1321

[7] D. Zarzoso, D. del Castillo-Negrete, R. Lacroix, P.-E. Bernard, S. Touzet, Transport and losses of fusion-born alpha particles in the presence of tearing modes using the new toroidal accelerated particle simulator (TAPaS), Plasma Physics and Controlled Fusion 64 (4) (2022) 044003

[8] R. G. Littlejohn, Variational principles of guiding centre motion, Journal of Plasma Physics 29 (1) (1983) 111–125

[9] L.-M. Imbert-Gérard, E. J. Paul, A. M. Wright, An introduction to stellarators: from magnetic fields to symmetries and optimization, SIAM, 2024

[10] J. R. Dormand, P. J. Prince, A family of embedded Runge-Kutta formulae, Journal of Computational and Applied Mathematics 6 (1) (1980) 19–26

[11] Boost.Numeric.Odeint, Boost.Odeint: Solving ordinary differential equations in C++, https://www.boost.org/libs/numeric/odeint (2025)

[12] B. Medasani, F. Wechsung, M. Landreman, R. Jorge, E. Paul, A. Knyazev, sominlee1211, XB, R. Gaur, C. Zhu, C. Albert, A. Giuliani, D. Stańczak-Marikin, Z. Qu, daringli, tmqian, Columbiastellaratortheory/firm3d: Initial release (2025). doi:10.5281/zenodo.17243333. URL https://doi.org/10.5281/zenodo.17243333

[13] E. Paul, A. Bhattacharjee, M. Landreman, D. Alex, J. Velasco, R. Nies, Energetic particle loss mechanisms in reactor-scale equilibria close to quasisymmetry, Nuclear Fusion 62 (12) (2022) 126054

[14] M. Landreman, S. Buller, M. Drevlak, Optimization of quasi-symmetric stellarators with self-consistent bootstrap current and energetic particle confinement, Physics of Plasmas 29 (8) (2022)

[15] A. R. Knyazev, A. Lachmann, A. Goodman, A. Hyder, M. Czekanski, D. Spong, E. Paul, On shear Alfvén wave-induced energetic ion transport in optimized stellarators, arXiv preprint arXiv:2603.03118 (2026)