Recognition: unknown
Accelerating Minibatch Stochastic Gradient Descent using Stratified Sampling
Original abstract
Stochastic Gradient Descent (SGD) is a popular optimization method which has been applied to many important machine learning tasks such as Support Vector Machines and Deep Neural Networks. In order to parallelize SGD, minibatch training is often employed. The standard approach is to uniformly sample a minibatch at each step, which often leads to high variance. In this paper we propose a stratified sampling strategy, which divides the whole dataset into clusters with low within-cluster variance; we then take examples from these clusters using a stratified sampling technique. It is shown that the convergence rate can be significantly improved by the algorithm. Encouraging experimental results confirm the effectiveness of the proposed method.
This paper has not been read by Pith yet.
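The strategy described in the abstract can be illustrated with a short sketch: cluster the training set once so that within-cluster variance is low, then draw each minibatch by taking examples from every cluster. The code below is a minimal illustration of that idea, not the authors' exact algorithm; the use of k-means via scikit-learn, the proportional per-cluster quotas, the function names, and the toy logistic-regression update are all assumptions chosen for concreteness.

```python
# Minimal sketch (assumptions, not the paper's exact method): stratified
# minibatch sampling for SGD. Cluster once with k-means, then fill each
# minibatch with a quota from every cluster proportional to its size.
import numpy as np
from sklearn.cluster import KMeans

def build_strata(X, n_clusters=10, seed=0):
    """Partition the dataset into clusters with low within-cluster variance."""
    labels = KMeans(n_clusters=n_clusters, random_state=seed, n_init=10).fit_predict(X)
    return [np.where(labels == k)[0] for k in range(n_clusters)]

def stratified_minibatch(strata, batch_size, rng):
    """Draw a minibatch with per-cluster quotas proportional to cluster size."""
    n = sum(len(s) for s in strata)
    picks = []
    for s in strata:
        # Rounding means the realized batch size can differ slightly from batch_size.
        quota = max(1, int(round(batch_size * len(s) / n)))
        picks.append(rng.choice(s, size=min(quota, len(s)), replace=False))
    return np.concatenate(picks)

# Usage: plug the sampled indices into an ordinary minibatch SGD loop
# (here a toy logistic-regression objective on synthetic data).
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20))
y = rng.integers(0, 2, size=1000)
strata = build_strata(X, n_clusters=10)
w = np.zeros(X.shape[1])
for step in range(100):
    batch = stratified_minibatch(strata, batch_size=32, rng=rng)
    p = 1.0 / (1.0 + np.exp(-X[batch] @ w))          # predicted probabilities
    grad = X[batch].T @ (p - y[batch]) / len(batch)   # minibatch gradient
    w -= 0.1 * grad                                    # SGD step
```

With proportional allocation the minibatch gradient remains (approximately) unbiased while the between-cluster component of its variance is removed; this variance reduction is what the abstract credits for the improved convergence rate.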
Forward citations
Cited by 2 Pith papers
- Convergence of Riemannian Stochastic Gradient Descents: Varying Batch Sizes And Nonstandard Batch Forming
  Convergence theorems are established for Riemannian SGD with iteration-varying probability spaces, applying to varying batch sizes and unbiased batch forming schemes.
- Variance Matters: Improving Domain Adaptation via Stratified Sampling
  VaRDASS improves unsupervised domain adaptation by using stratified sampling to reduce variance in discrepancy estimation for measures like correlation alignment and MMD, with derived error bounds, an optimality proof...