Accelerating Minibatch Stochastic Gradient Descent using Stratified Sampling
3 Pith papers cite this work. Polarity classification is still indexing.
abstract
Stochastic Gradient Descent (SGD) is a popular optimization method which has been applied to many important machine learning tasks such as Support Vector Machines and Deep Neural Networks. In order to parallelize SGD, minibatch training is often employed. The standard approach is to uniformly sample a minibatch at each step, which often leads to high variance. In this paper we propose a stratified sampling strategy, which divides the whole dataset into clusters with low within-cluster variance; we then take examples from these clusters using a stratified sampling technique. It is shown that the convergence rate can be significantly improved by the algorithm. Encouraging experimental results confirm the effectiveness of the proposed method.
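The sampling step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's algorithm: it assumes the cluster labels have already been computed by some clustering routine, and it allocates minibatch slots in proportion to cluster size, which is one common stratified allocation and may differ from the allocation rule the paper derives. The names `stratified_minibatch`, `cluster_ids`, and `batch_size` are hypothetical.

```python
import random
from collections import defaultdict


def stratified_minibatch(cluster_ids, batch_size, rng=None):
    """Draw a minibatch using per-cluster quotas (hypothetical helper).

    cluster_ids[i] is the precomputed cluster label of example i.
    Returns a list of (index, weight) pairs. Summing weight * gradient
    over the pairs gives an unbiased estimate of the mean gradient
    over the whole dataset.
    """
    rng = rng or random.Random()

    # Group example indices by cluster label.
    clusters = defaultdict(list)
    for idx, label in enumerate(cluster_ids):
        clusters[label].append(idx)

    n = len(cluster_ids)
    batch = []
    for members in clusters.values():
        # Assumption: proportional allocation, with at least one draw per
        # cluster. Rounding means the final batch may differ slightly
        # from batch_size.
        quota = max(1, round(batch_size * len(members) / n))
        quota = min(quota, len(members))

        # Uniform sampling without replacement inside the cluster.
        chosen = rng.sample(members, quota)

        # Weight = (cluster share of dataset) / (draws from this cluster).
        weight = len(members) / (n * quota)
        batch.extend((i, weight) for i in chosen)
    return batch


# Usage: ten examples, six in cluster 0 and four in cluster 1.
labels = [0] * 6 + [1] * 4
batch = stratified_minibatch(labels, batch_size=5, rng=random.Random(0))
```

Because each cluster contributes a fixed number of draws, the only randomness left in the gradient estimate comes from variation inside clusters, which is why low within-cluster variance matters for this kind of scheme.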
verdicts
UNVERDICTED 3
representative citing papers
citing papers explorer
- Variance Matters: Improving Domain Adaptation via Stratified Sampling
  VaRDASS improves unsupervised domain adaptation by using stratified sampling to reduce variance in discrepancy estimation for measures like correlation alignment and MMD, with derived error bounds, an optimality proof for MMD under assumptions, and a k-means style algorithm.
- Convergence of Riemannian Stochastic Gradient Descents: Varying Batch Sizes And Nonstandard Batch Forming
  Convergence theorems are established for Riemannian SGD with iteration-varying probability spaces, applying to varying batch sizes and unbiased batch forming schemes.
- Submodular Batch Selection for Training Deep Neural Networks
  A greedy submodular maximization method for mini-batch selection in DNN training yields better generalization than SGD on standard datasets.