Machine learning–accelerated computational fluid dynamics

4 Pith papers cite this work. Polarity classification is still indexing.

4 representative citing papers:
-
ShardTensor: Domain Parallelism for Scientific Machine Learning
ShardTensor is a domain-parallelism system for SciML that enables flexible scaling of extreme-resolution spatial datasets by removing the constraint of batch size one per device.
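The core idea behind domain parallelism is that the spatial field itself is partitioned across devices, rather than each device holding one full batch element. A minimal single-process NumPy sketch of that idea (illustrative only, not ShardTensor's actual API): shards of a 1-D periodic field exchange one-cell halos so that a Laplacian stencil applied per shard reproduces the global result exactly.

```python
import numpy as np

def shard_with_halo(field, n_shards, halo=1):
    """Split a 1-D field into contiguous shards, each padded with
    `halo` ghost cells copied from neighboring shards (periodic BCs).
    In a real multi-device setting this copy would be a halo exchange."""
    chunks = np.array_split(field, n_shards)
    n = len(chunks)
    padded = []
    for i, c in enumerate(chunks):
        left = chunks[(i - 1) % n][-halo:]   # ghost cells from left neighbor
        right = chunks[(i + 1) % n][:halo]   # ghost cells from right neighbor
        padded.append(np.concatenate([left, c, right]))
    return padded

def laplacian(u):
    # global periodic 1-D Laplacian stencil: u[i+1] - 2*u[i] + u[i-1]
    return np.roll(u, -1) - 2 * u + np.roll(u, 1)

def sharded_laplacian(field, n_shards):
    # apply the stencil independently on each padded shard, keep only the
    # interior cells (ghost cells are discarded), then reassemble
    shards = shard_with_halo(field, n_shards)
    return np.concatenate([s[2:] - 2 * s[1:-1] + s[:-2] for s in shards])

f = np.sin(np.linspace(0, 2 * np.pi, 16, endpoint=False))
assert np.allclose(sharded_laplacian(f, 4), laplacian(f))
```

With the halo exchange in place, each shard's stencil output is bitwise-consistent with the unsharded computation, which is what lets the spatial resolution scale with the number of devices instead of being capped by per-device memory.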
-
MENO: MeanFlow-Enhanced Neural Operators for Dynamical Systems
MENO enhances neural operators with MeanFlow to restore multi-scale accuracy in dynamical system predictions while keeping inference costs low, achieving up to 2x better power spectrum accuracy and 12x faster inference than diffusion-enhanced baselines on phase-field, Kolmogorov flow, and active-m…
-
LASER: Learning Active Sensing for Continuum Field Reconstruction
LASER trains a reinforcement learning policy inside a latent dynamics model to choose sensor placements that improve reconstruction of continuum fields under sparsity.
-
Adaptation of AI-accelerated CFD Simulations to the IPU platform
Porting AI-accelerated CFD model training to IPU-POD16 yields 34% data-feeding speedup and scales throughput to 2805 samples/s on 16 IPUs despite inter-IPU communication limits.