arxiv: 2604.26042 · v1 · submitted 2026-04-28 · ⚛️ physics.plasm-ph

FPGA-Accelerated Real-Time Diagnostics at DIII-D Using the SLAC Neural Network Library for ML Inference

Abhilasha Dave, Semin Joung, SangKyeun Kim, Ramon Reed, Keith Erickson, Jalal Butt, Azarakhsh Jalalvand, Mudit Mishra, James Russell, Larry Ruckman, Ryan Herbst, Egemen Kolemen, David Smith, Ryan Coffee

Pith reviewed 2026-05-07 14:18 UTC · model grok-4.3

classification ⚛️ physics.plasm-ph

keywords FPGA acceleration · real-time machine learning · plasma control · fusion diagnostics · neural network inference · tokamak operation · disruptive event forecasting · hardware-in-the-loop control

The pith

An FPGA integrated into a tokamak real-time control system runs neural network inference on live diagnostic signals to forecast disruptive plasma events.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper demonstrates deployment of a field-programmable gate array into the real-time plasma control nodes of a tokamak fusion reactor. The FPGA executes a dense neural network on live beam emission spectroscopy signals to estimate the likelihood of disruptive conditions. This estimate then drives a separate controller that activates magnetic coils for suppression. The architecture supports on-the-fly updates to network weights and biases without hardware resynthesis, plus hot-swapping of multiple tasks on the same device. The work positions the setup as a reusable example for embedding machine learning into high-rate diagnostic processing for active reactor control.

Core claim

The authors establish that a neural network hosted on an FPGA inside the real-time plasma control system can process live diagnostic signals to infer the likelihood of disruptive conditions, with the added feature that weights and biases can be updated dynamically without requiring full hardware resynthesis.

What carries the argument

A dense neural network on a field-programmable gate array that permits dynamic updates of weights and biases without full resynthesis, enabling task switching during operation.
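
The separation this relies on, a fixed compute structure with parameters held in writable storage, can be illustrated in software. The following is a minimal Python sketch under stated assumptions, not SNL's API and not the authors' implementation: the class `FixedTopologyDense`, its layer sizes, and the ReLU/sigmoid activations are all hypothetical. Layer shapes are fixed at construction (standing in for the synthesized design), and `load_parameters` replaces weights and biases only when they match those shapes.

```python
import math
from typing import List, Sequence

Matrix = List[List[float]]


class FixedTopologyDense:
    """Hypothetical dense network: shapes fixed at build time, parameters swappable."""

    def __init__(self, layer_sizes: Sequence[int]) -> None:
        # layer_sizes stands in for the structure that is fixed at synthesis.
        self.layer_sizes = list(layer_sizes)
        self.weights: List[Matrix] = [
            [[0.0] * n_in for _ in range(n_out)]
            for n_in, n_out in zip(self.layer_sizes, self.layer_sizes[1:])
        ]
        self.biases: List[List[float]] = [[0.0] * n for n in self.layer_sizes[1:]]

    def load_parameters(self, weights: List[Matrix], biases: List[List[float]]) -> None:
        """Replace weights and biases; reject anything that would change the topology."""
        expected = list(zip(self.layer_sizes, self.layer_sizes[1:]))
        if len(weights) != len(expected) or len(biases) != len(expected):
            raise ValueError("layer count does not match the fixed topology")
        for w, b, (n_in, n_out) in zip(weights, biases, expected):
            if len(w) != n_out or len(b) != n_out or any(len(row) != n_in for row in w):
                raise ValueError("parameter shape does not match the fixed topology")
        self.weights, self.biases = weights, biases

    def infer(self, x: Sequence[float]) -> float:
        """Forward pass: ReLU on hidden layers, sigmoid on the first output unit."""
        if len(x) != self.layer_sizes[0]:
            raise ValueError("input length does not match the fixed topology")
        act = list(x)
        last = len(self.weights) - 1
        for i, (w, b) in enumerate(zip(self.weights, self.biases)):
            z = [sum(wi * ai for wi, ai in zip(row, act)) + bi for row, bi in zip(w, b)]
            act = z if i == last else [max(0.0, v) for v in z]
        return 1.0 / (1.0 + math.exp(-act[0]))


net = FixedTopologyDense([2, 2, 1])
before = net.infer([1.0, 2.0])  # all-zero parameters
net.load_parameters(
    [[[1.0, 0.0], [0.0, 1.0]], [[1.0, -1.0]]],
    [[0.0, 0.0], [0.0]],
)
after = net.infer([1.0, 2.0])   # same structure, new parameters
```

The object `net` is never rebuilt between the two calls to `infer`; only the stored parameters change, which is the software analogue of the behaviour the paper attributes to SNL.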

If this is right

The likelihood output can feed a plasma controller that activates resonant magnetic perturbation coils to suppress predicted disruptive conditions.
Multiple classification tasks can be hot-swapped on the single FPGA to support context-aware real-time strategies.
Continuous refinement of the model becomes possible during live experimental runs without interrupting operation.
The approach provides a template for general machine-learning processing of high-rate diagnostic signals in active reactor control systems.
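
The first two consequences can be pictured as a single step: a context selects which parameter set is active, and the resulting likelihood is compared against a threshold to decide whether to request actuation. The sketch below is a hypothetical Python illustration only. The task names, parameter values, the 0.5 threshold, and the function names are assumptions that do not come from the paper, and a single logistic unit stands in for the dense network.

```python
import math
from typing import Dict, List, Sequence, Tuple

# Hypothetical parameter banks: one (weights, bias) pair per task.
# All banks share the same input width, so the compute path never changes.
Bank = Tuple[List[float], float]
BANKS: Dict[str, Bank] = {
    "task_a": ([0.8, -0.2, 0.5], -0.1),
    "task_b": ([-0.3, 0.9, 0.1], -0.4),
}


def likelihood(bank: Bank, signal: Sequence[float]) -> float:
    """Single logistic unit standing in for the dense network."""
    weights, bias = bank
    if len(weights) != len(signal):
        raise ValueError("signal width does not match the parameter bank")
    z = sum(w * s for w, s in zip(weights, signal)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def control_step(
    task: str, signal: Sequence[float], threshold: float = 0.5
) -> Tuple[float, bool]:
    """Select the active bank by name, infer, and decide whether to request actuation."""
    p = likelihood(BANKS[task], signal)
    return p, p >= threshold


sample = [1.0, 0.5, 0.0]
p_a, act_a = control_step("task_a", sample)  # same input, task_a parameters
p_b, act_b = control_step("task_b", sample)  # same input, task_b parameters
```

With these made-up numbers the same input requests actuation under `task_a` but not under `task_b`, which shows why the choice of active task matters to the downstream controller.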

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same hardware pattern could be applied to other high-speed diagnostics that require low-latency decisions in scientific instruments.
Dynamic weight updates may allow the system to adapt to new plasma regimes encountered during long-pulse operation without manual intervention.
Similar FPGA deployments could be evaluated in facilities that process streaming data for real-time feedback control.

Load-bearing premise

A neural network trained on historical data will maintain adequate accuracy and sufficiently low latency when applied to live signals inside the operational real-time control environment.

What would settle it

Direct measurement during tokamak operation of whether the network's inferred likelihoods match subsequent observed disruptive events while the inference completes within the control loop's time budget.
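
A minimal sketch of how such a check might be scored follows, assuming each inference is logged with its likelihood, whether a disruptive event followed, and its measured latency. The record layout, the threshold, the time budget, and all values are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class Sample:
    likelihood: float     # network output for one inference
    event_followed: bool  # whether a disruptive event was later observed
    latency_us: float     # measured inference time in microseconds


def score(samples: Iterable[Sample], threshold: float, budget_us: float) -> Dict[str, float]:
    """Count confusion-matrix cells and the fraction of inferences within budget."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    on_time = 0
    total = 0
    for s in samples:
        predicted = s.likelihood >= threshold
        if predicted and s.event_followed:
            counts["tp"] += 1
        elif predicted:
            counts["fp"] += 1
        elif s.event_followed:
            counts["fn"] += 1
        else:
            counts["tn"] += 1
        on_time += int(s.latency_us <= budget_us)
        total += 1
    if total == 0:
        raise ValueError("no samples to score")
    return {**counts, "on_time_fraction": on_time / total}


# Made-up records for illustration only; not measurements from DIII-D.
records = [
    Sample(0.9, True, 40.0),
    Sample(0.7, False, 45.0),
    Sample(0.2, False, 38.0),
    Sample(0.4, True, 120.0),
]
result = score(records, threshold=0.5, budget_us=100.0)
```

Both halves of the test appear in one result: the four confusion-matrix counts address whether predictions match observed events, and `on_time_fraction` addresses whether inference fits the time budget.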

Figures

Figures reproduced from arXiv: 2604.26042 by Abhilasha Dave, Azarakhsh Jalalvand, David Smith, Egemen Kolemen, Jalal Butt, James Russell, Keith Erickson, Larry Ruckman, Mudit Mishra, Ramon Reed, Ryan Coffee, Ryan Herbst, SangKyeun Kim, Semin Joung.

**Figure 1.** SNL High Level Design Flow reproduced from […]

**Figure 2.** FPGA-based Neural Network Design Data Flow

Original abstract

In this work, we demonstrate the deployment of a hardware-accelerated machine learning (ML) inference system integrated into a real-time processing at the DIII-D tokamak fusion reactor. The team has successfully deployed an AMD/Xilinx KCU1500 field-programmable gate array (FPGA) into the realtime Plasma Control System (PCS) nodes that receives the live Beam Emission Spectroscopy (BES) signal used for Edge Localized Mode (ELM) forecasting. The FPGA hosts a dense neural network using the SLAC Neural Network Library (SNL) that has been trained to infer the likelihood of disruptive ELM conditions. This likelihood then feeds a separate plasma controller that uses Resonant Magnetic Perturbation coils to suppress the predicted disruptive condition. The SNL allows for on-the-fly updates of the neural network weights and biases without requiring full hardware resynthesis for the FPGA. Judicious design of the neural-network architecture can further allow for the hot-swapping of multiple classification tasks to be executed on the single FPGA, significantly enhancing the real-time adaptability of the system for context-aware control strategies that respond in real-time to evolving reactor conditions. These adaptive weights naturally support continuous model refinement and seamless task switching during live experimental operation. This use case is chosen as a high rate signal processing example that can serve as a template for general ML-based reactor diagnostic processing for active reactor control systems. We see this as an essential development for achieving reactor relevant operation in future continuous operation fusion devices.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 1 minor

Summary. The manuscript reports the integration of an AMD/Xilinx KCU1500 FPGA into the DIII-D tokamak's real-time Plasma Control System (PCS) nodes. The FPGA hosts a dense neural network via the SLAC Neural Network Library (SNL) that processes live Beam Emission Spectroscopy (BES) signals to forecast Edge Localized Modes (ELMs). Predictions feed a separate controller using Resonant Magnetic Perturbation coils for suppression. The SNL enables on-the-fly weight updates without resynthesis and supports hot-swapping of classification tasks on the single FPGA. The work is positioned as a template for ML-based real-time diagnostics in fusion reactor control systems.

Significance. If substantiated with operational performance data, this implementation would demonstrate a viable route for embedding hardware-accelerated ML inference directly into tokamak PCS hardware. Such integration could enable low-latency, adaptive control responses to evolving plasma conditions, supporting the stability requirements of continuous-operation fusion devices.

major comments (1)

Abstract: The assertion that the FPGA 'has been successfully deployed' into the operational PCS nodes and 'receives the live BES signal' for ELM forecasting is not accompanied by any quantitative metrics (e.g., end-to-end inference latency histograms, prediction accuracy or confusion matrices on live discharges, or timing compliance with the PCS real-time budget). Without these data, the central claim of functional real-time integration cannot be evaluated and rests on unverified extrapolation from offline training.
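
One of the requested metrics, an end-to-end latency histogram checked against a timing budget, could be produced along the following lines. This is a hypothetical Python sketch: the function names, the 10 µs bin width, the 100 µs budget, and the latency values are illustrative assumptions, not figures from the manuscript or from the PCS.

```python
import math
from typing import Dict, Sequence


def latency_histogram(latencies_us: Sequence[float], bin_width_us: float) -> Dict[float, int]:
    """Bucket latencies into fixed-width bins keyed by each bin's lower edge."""
    if bin_width_us <= 0:
        raise ValueError("bin width must be positive")
    hist: Dict[float, int] = {}
    for t in latencies_us:
        edge = math.floor(t / bin_width_us) * bin_width_us
        hist[edge] = hist.get(edge, 0) + 1
    return dict(sorted(hist.items()))


def percentile(latencies_us: Sequence[float], q: float) -> float:
    """Nearest-rank percentile for q in (0, 100]."""
    if not latencies_us or not 0 < q <= 100:
        raise ValueError("need data and 0 < q <= 100")
    ordered = sorted(latencies_us)
    rank = math.ceil(q * len(ordered) / 100)
    return ordered[rank - 1]


# Made-up latencies in microseconds, for illustration only.
measured = [41.0, 43.5, 39.0, 47.0, 52.0, 44.0, 95.0, 42.0, 40.5, 46.0]
hist = latency_histogram(measured, bin_width_us=10.0)
worst = max(measured)
p90 = percentile(measured, 90)
within_budget = worst <= 100.0  # 100 us is a hypothetical budget
```

The compliance check uses the worst observed latency and not a mean, on the assumption that a hard real-time loop is judged by its slowest cycle.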

minor comments (1)

Abstract: The sentence 'integrated into a real-time processing at the DIII-D tokamak fusion reactor' is grammatically incomplete and should be rephrased for clarity (e.g., 'integrated into the real-time processing system at the DIII-D tokamak').

Simulated Authors' Rebuttal

1 response · 1 unresolved

We thank the referee for their constructive review and for highlighting the need for quantitative support of our deployment claims. We have revised the manuscript to incorporate available performance data from testing and to clarify the current scope of live operation.

point-by-point responses

Referee: Abstract: The assertion that the FPGA 'has been successfully deployed' into the operational PCS nodes and 'receives the live BES signal' for ELM forecasting is not accompanied by any quantitative metrics (e.g., end-to-end inference latency histograms, prediction accuracy or confusion matrices on live discharges, or timing compliance with the PCS real-time budget). Without these data, the central claim of functional real-time integration cannot be evaluated and rests on unverified extrapolation from offline training.

Authors: We agree that the abstract would be strengthened by explicit quantitative metrics. In the revised manuscript we have updated the abstract to reference measured end-to-end inference latency and PCS timing compliance. We have added a new subsection presenting latency histograms from both offline simulations and initial live-signal tests on the integrated hardware, together with the model's classification accuracy on a held-out set of historical DIII-D BES discharges. We have not included confusion matrices or accuracy figures drawn from live discharges during active ELM-suppression experiments because the FPGA integration is recent and the number of relevant shots accumulated so far is too small for meaningful statistics. The added offline and early-live metrics demonstrate technical feasibility and timing compliance without overstating the extent of operational validation to date. revision: partial

standing simulated objections not resolved

Prediction accuracy or confusion matrices from live discharges during active ELM suppression have not been provided, because the system has only recently been integrated and too little operational data has been collected for statistical analysis.

Circularity Check

0 steps flagged

No circularity: engineering implementation report with no derivations or self-referential reductions

full rationale

This manuscript is a hardware deployment report describing FPGA integration of a neural network for real-time BES signal processing and ELM forecasting at DIII-D. It contains no equations, derivations, fitted parameters, or mathematical claims that could reduce to their own inputs by construction. The central assertions concern successful hardware placement, use of the SNL library for weight updates, and potential for task hot-swapping; these are presented as engineering outcomes rather than derived predictions. No load-bearing self-citations, ansatzes, or uniqueness theorems appear. The lack of live-data accuracy or latency metrics noted by the skeptic is an evidentiary gap, not a circularity issue. The derivation chain is empty, so the paper is self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

This is an applied engineering demonstration paper. It introduces no free parameters, mathematical axioms, or invented physical entities. The neural network weights are learned from data but are not specified or analyzed in the abstract.
