ShardTensor is a domain-parallelism system for SciML that enables flexible scaling of extreme-resolution spatial datasets by removing the constraint of batch size one per device.
Lecture 6.5-rmsprop: Divide the gradient by a running average of its recent magnitude
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
citation-role summary
background 1
citation-polarity summary
years
2026 2verdicts
UNVERDICTED 2roles
background 1polarities
background 1representative citing papers
A double-Bayesian framework derives an optimal learning rate for neural network training via two antagonistic Bayesian processes.
citing papers explorer
-
ShardTensor: Domain Parallelism for Scientific Machine Learning
ShardTensor is a domain-parallelism system for SciML that enables flexible scaling of extreme-resolution spatial datasets by removing the constraint of batch size one per device.
-
Training Neural Networks with Optimal Double-Bayesian Learning
A double-Bayesian framework derives an optimal learning rate for neural network training via two antagonistic Bayesian processes.