APU-Accelerated Large Eddy Simulation with the Discontinuous Galerkin Solver GALÆXI
Pith reviewed 2026-06-26 19:37 UTC · model grok-4.3
The pith
GALÆXI performs accurate wall-resolved large eddy simulation of a transonic compressor cascade on AMD MI300A APUs, capturing shock-wave and turbulent boundary-layer interactions.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By linking hardware optimization on AMD MI300A APUs, software implementation of the DGSEM framework, and physical validation, the work shows that GALÆXI can accurately capture complex shock-wave/turbulent boundary-layer interactions in a wall-resolved large eddy simulation of a transonic compressor cascade.
What carries the argument
The architecture-agnostic DGSEM framework GALÆXI, which enables GPU acceleration on APUs and integration of wall-modeled LES algorithms while preserving high-order accuracy.
Load-bearing premise
The integration of wall-modeled LES algorithms into the GPU-accelerated DGSEM framework preserves the physical accuracy of the original method on the transonic compressor cascade without introducing unquantified numerical artifacts.
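For readers unfamiliar with wall modeling, here is a minimal sketch of a generic equilibrium wall model built on the logarithmic law of the wall. It is not taken from the paper, whose wall-model formulation is not described on this page. The function names, the constants KAPPA = 0.41 and B = 5.2, and the Newton iteration are illustrative assumptions.

```python
import math

KAPPA = 0.41  # von Karman constant (assumed textbook value)
B = 5.2       # log-law intercept (assumed textbook value)


def friction_velocity_log_law(u, y, nu, tol=1e-12, max_iter=50):
    """Solve u / u_tau = ln(y * u_tau / nu) / KAPPA + B for u_tau (Newton).

    u  : wall-parallel velocity sampled at wall distance y
    nu : kinematic viscosity
    Only meaningful when the sampling point lies in the logarithmic layer.
    """
    if u <= 0.0 or y <= 0.0 or nu <= 0.0:
        raise ValueError("u, y and nu must be positive")
    u_tau = math.sqrt(nu * u / y)  # laminar estimate as starting guess
    for _ in range(max_iter):
        log_term = math.log(y * u_tau / nu) / KAPPA + B
        residual = u_tau * log_term - u
        slope = log_term + 1.0 / KAPPA  # d(residual) / d(u_tau)
        step = residual / slope
        u_tau -= step
        if abs(step) <= tol * u_tau:
            return u_tau
    raise RuntimeError("Newton iteration did not converge")


def wall_shear_stress(rho, u, y, nu):
    """Return tau_w = rho * u_tau**2 from the log-law friction velocity."""
    return rho * friction_velocity_log_law(u, y, nu) ** 2
```

A wall-modeled solver would impose a stress of this kind as a boundary condition instead of resolving the near-wall layer. That is why the premise matters: any error in the modeled stress feeds directly into the resolved flow.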
What would settle it
A direct comparison of predicted shock positions, wall shear stress distributions, or turbulence statistics from the compressor cascade simulation against experimental measurements that reveals discrepancies beyond expected numerical error would falsify the accuracy claim.
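One way to make such a comparison concrete is a relative discrepancy norm between a simulated profile and a reference profile sampled at the same stations. The sketch below is an illustration under that assumption. The function names and the tolerance argument are hypothetical and do not come from the paper.

```python
import math


def relative_l2_discrepancy(simulated, reference):
    """Relative L2 distance between two profiles sampled at identical stations."""
    if len(simulated) != len(reference) or not reference:
        raise ValueError("profiles must be non-empty and of equal length")
    num = sum((s - r) ** 2 for s, r in zip(simulated, reference))
    den = sum(r ** 2 for r in reference)
    if den == 0.0:
        raise ValueError("reference profile must not be identically zero")
    return math.sqrt(num / den)


def within_tolerance(simulated, reference, tolerance):
    """True if the discrepancy does not exceed the stated error budget."""
    return relative_l2_discrepancy(simulated, reference) <= tolerance
```

The tolerance would have to be derived from the measurement uncertainty and an estimate of the numerical error. Neither is given on this page.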
Original abstract
The exascale computing era, driven by heterogeneous GPU architectures, requires a fundamental redesign of traditional CFD solvers to fully leverage those heterogeneous systems. The discontinuous Galerkin spectral element method (DGSEM) provides an ideal foundation for this transition due to its high-order accuracy and local computational stencil. This work presents recent advances in the development and application of the architecture-agnostic DGSEM framework GALÆXI by linking hardware optimization, software implementation, and physical validation. The performance of GALÆXI on the AMD MI300A Accelerated Processing Units (APUs) featured on the Hunter supercomputer is analyzed. Specifically, evaluations of the strong and weak scaling performance and the impact of the compute partitioning modes available on the AMD MI300As are performed. Second, the strategy used to integrate the algorithms necessary for wall-modeled large eddy simulations into the GPU-accelerated framework is outlined. Validation of those algorithms is presented in the form of a plane turbulent channel testcase. Finally, the solver is applied to a demanding flow problem in the form of a wall-resolved large eddy simulation of a transonic compressor cascade. The results from this investigation demonstrate the capabilities of GALÆXI to accurately capture complex shock-wave/turbulent boundary-layer interactions.
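The strong and weak scaling evaluations mentioned in the abstract are usually summarized as parallel efficiencies. The sketch below uses the common textbook definitions. The function and parameter names are illustrative, and no timings from the paper are reproduced here.

```python
def strong_scaling_efficiency(n_ref, t_ref, n, t):
    """Fixed total problem size: ideal run time shrinks by n_ref / n.

    n_ref, t_ref : device count and run time of the baseline run
    n, t         : device count and run time of the scaled run
    """
    return (t_ref * n_ref) / (t * n)


def weak_scaling_efficiency(t_ref, t):
    """Fixed problem size per device: ideal run time stays constant."""
    return t_ref / t
```

A value of 1.0 corresponds to ideal scaling in both cases. Values below 1.0 indicate overhead from communication or reduced device utilization.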
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper presents advances in the GALÆXI DGSEM framework for APU-accelerated LES on AMD MI300A hardware. It reports strong/weak scaling and partitioning-mode performance, describes integration of wall-modeled LES algorithms, validates those algorithms on a plane turbulent channel, and applies the solver (as wall-resolved LES) to a transonic compressor cascade, claiming that the results demonstrate accurate capture of complex shock-wave/turbulent boundary-layer interactions.
Significance. If the scaling results and accuracy claims hold, the work provides a practical demonstration of porting a high-order DGSEM solver to emerging APU architectures and applying it to a demanding turbomachinery flow. The explicit linkage of hardware optimization, WMLES implementation, and a complex-geometry application is a strength for exascale CFD efforts.
major comments (1)
- [Abstract and application section] Abstract (final sentence) and application paragraph: the central claim that the results demonstrate accurate capture of shock-wave/turbulent boundary-layer interactions in the transonic compressor cascade lacks support. The manuscript validates the WMLES integration only on the plane channel test case and then applies wall-resolved LES to the cascade; no quantitative comparisons to experimental data or reference simulations are described for cascade-specific quantities such as shock location, surface pressure distributions, or boundary-layer profiles.
minor comments (1)
- [Title and abstract] The notation GAL{Æ}XI in the title and abstract should be rendered consistently as GALÆXI throughout.
Simulated Author's Rebuttal
We thank the referee for the constructive review. We address the single major comment below.
- Referee: [Abstract and application section] Abstract (final sentence) and application paragraph: the central claim that the results demonstrate accurate capture of shock-wave/turbulent boundary-layer interactions in the transonic compressor cascade lacks support. The manuscript validates the WMLES integration only on the plane channel test case and then applies wall-resolved LES to the cascade; no quantitative comparisons to experimental data or reference simulations are described for cascade-specific quantities such as shock location, surface pressure distributions, or boundary-layer profiles.
  Authors: We agree that the manuscript validates the WMLES algorithms solely on the plane turbulent channel and performs a wall-resolved LES of the transonic compressor cascade without providing quantitative comparisons (e.g., shock location, surface pressure, or boundary-layer profiles) to experiments or reference data. The final sentence of the abstract therefore overstates what the presented results support. We will revise the abstract and the application-section paragraph to state that the cascade simulation illustrates the solver's capability on a complex geometry involving shock-turbulence interactions, while removing the claim of demonstrated accuracy for this specific case.
  Revision: yes
Circularity Check
No circularity; performance and validation are direct external measurements
The manuscript reports direct hardware scaling measurements on AMD MI300A APUs and validates the WMLES integration against the standard plane turbulent channel benchmark before applying the solver (as wall-resolved LES) to the cascade. No equations, fitted parameters, or predictions are presented that reduce by construction to the paper's own inputs or prior self-citations. The central claims rest on external benchmarks and standard test cases rather than on self-referential definitions or load-bearing self-citation chains, so the work can be assessed independently against external references.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Standard mathematical properties of the discontinuous Galerkin spectral element method hold on the target hardware