PKDGRAV3: Beyond Trillion Particle Cosmological Simulations for the Next Era of Galaxy Surveys
6 Pith papers cite this work. Polarity classification is still indexing.
abstract
We report on the successful completion of a 2 trillion particle cosmological simulation to z=0 run on the Piz Daint supercomputer (CSCS, Switzerland), using 4000+ GPU nodes for a little less than 80 h of wall-clock time or 350,000 node hours. Using multiple benchmarks and performance measurements on the US Oak Ridge National Laboratory Titan supercomputer, we demonstrate that our code, PKDGRAV3, delivers, to our knowledge, the fastest time-to-solution for large-scale cosmological N-body simulations. This was made possible by using the Fast Multipole Method in conjunction with individual and adaptive particle time steps, both deployed efficiently (and for the first time) on supercomputers with GPU-accelerated nodes. The very low memory footprint of PKDGRAV3 allowed us to run the first ever benchmark with 8 trillion particles on Titan, and to achieve perfect scaling up to 18000 nodes and a peak performance of 10 Pflops.
citation-role summary
roles: background 1
citation-polarity summary
polarities: background 1
citing papers explorer
- Perturbation-theory informed integrators for cosmological simulations
  LPT-matched integrators for cosmological simulations outperform FastPM with O(1-100) timesteps while convergence is limited to order 3/2 post-shell-crossing due to acceleration field irregularity.
- Modeling nonlinear scales for dynamical dark energy cosmologies with COLA
  COLA-based hybrid emulator reproduces nonlinear power spectrum boosts in w0wa models to <2% error vs EuclidEmulator2 and produces <0.3σ shifts in LSST-like cosmic shear parameter constraints.
- Euclid preparation: Testing multi-field inflation with galaxy power spectrum and bispectrum
  Validates redshift-space power spectrum and bispectrum analysis on Abacus-PNG mocks to recover unbiased f_NL constraints for Euclid spectroscopic sample.
- Fast Radio Bursts as Cosmological Probes
  FRBs serve as cosmological probes via dispersion measure, scattering, and Faraday rotation to constrain baryon distribution, expansion history, magnetic fields, and fundamental physics effects.
- OpenGadget3 GPU solver tests
  GPU version of OpenGadget3 matches CPU results across multiple test suites with chip-to-chip speedups of 2-5x.
- Large-scale structures of the Universe: physics, phenomenology, statistics
  Lecture series on the physics, phenomenology, and statistics of large-scale cosmic structure evolution and non-Gaussian predictions.