Optimal shape design as a material distribution problem
3 Pith papers cite this work. Polarity classification is still indexing.
fields
cs.CE 3
years
2026 3
citing papers explorer
- Matrix-Free 3D SIMP Topology Optimization with Fused Gather-GEMM-Scatter Kernels
  A fused gather-GEMM-scatter CUDA kernel achieves 4.6-7.3x end-to-end speedup and 3.2-4.9x lower energy for matrix-free 3D SIMP topology optimization on an RTX 4090 compared to three-stage baselines.
- Large Language Models as Optimization Controllers: Adaptive Continuation for SIMP Topology Optimization
  An LLM acting as a real-time controller for SIMP topology optimization parameters outperforms fixed schedules and heuristics, delivering 5.7-18.1% lower compliance on 2D and 3D benchmarks.
- Sequential topology optimization: SIMP initialization for level-set boundary refinement
  A sequential topology optimization approach uses SIMP results to initialize level-set refinement via signed distance function transfer on 3D meshes, achieving comparable compliance with up to 4.6x speedup on benchmarks.