https://www.microsoft.com/en-us/research/blog/deepspeed-extreme-scale-model-training-for-everyone/

DeepSpeed: Extreme-scale model training for everyone · 2020

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

GSPMD: General and Scalable Parallelization for ML Computation Graphs

cs.DC · 2021-05-10 · unverdicted · novelty 6.0

GSPMD automatically infers tensor partitioning from limited user annotations to parallelize single-device ML programs across thousands of TPUs, reporting 50-62% utilization for up to trillion-parameter models.

citing papers explorer

Showing 1 of 1 citing paper.

GSPMD: General and Scalable Parallelization for ML Computation Graphs · cs.DC · 2021-05-10 · unverdicted · none · ref 4
