pith. machine review for the scientific record.

arxiv: 1803.07300 · v3 · submitted 2018-03-20 · 💻 cs.LG · math.OC · stat.ML

Recognition: unknown

Risk and parameter convergence of logistic regression

Authors on Pith: no claims yet
classification 💻 cs.LG · math.OC · stat.ML
keywords data · descent · gradient · direction · iterates · logistic · offset
read the original abstract

Gradient descent, when applied to the task of logistic regression, outputs iterates which are biased to follow a unique ray defined by the data. The direction of this ray is the maximum margin predictor of a maximal linearly separable subset of the data; the gradient descent iterates converge to this ray in direction at the rate $\mathcal{O}(\ln\ln t / \ln t)$. The ray does not pass through the origin in general, and its offset is the bounded global optimum of the risk over the remaining data; gradient descent recovers this offset at a rate $\mathcal{O}((\ln t)^2 / \sqrt{t})$.
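The abstract's central claim — the iterate norm diverges while the direction converges to a data-determined ray — can be sketched numerically. The toy dataset below is hypothetical (not from the paper): it is linearly separable and symmetric about the first axis, so the maximum-margin direction is [1, 0] by construction.

```python
import numpy as np

# Hypothetical separable toy data: by symmetry, the max-margin direction is [1, 0].
X = np.array([[2.0, 1.0], [2.0, -1.0], [-2.0, 1.0], [-2.0, -1.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])

# Gradient descent on the (unregularized) mean logistic loss
#   L(w) = mean_i log(1 + exp(-y_i <w, x_i>)).
w = np.zeros(2)
lr = 0.5
for t in range(20000):
    margins = y * (X @ w)
    # grad L(w) = -mean_i y_i x_i * sigmoid(-y_i <w, x_i>)
    grad = -(X * (y / (1.0 + np.exp(margins)))[:, None]).mean(axis=0)
    w -= lr * grad

norm = np.linalg.norm(w)
direction = w / norm
print(norm)       # keeps growing (roughly like ln t) -- no finite minimizer
print(direction)  # stabilizes at the max-margin direction, here [1, 0]
```

Since the data are separable, the risk has no finite minimizer, so the norm grows without bound; only the normalized iterate `w / ||w||` converges, which is the "ray" behavior the abstract describes.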

This paper has not been read by Pith yet.

discussion (0)


Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Efficient Logistic Regression with Mixture of Sigmoids

    cs.LG 2026-04 unverdicted novelty 7.0

    EW with Gaussian prior matches the optimal O(d log(Bn)) regret for online logistic regression at O(B^3 n^5) cost and converges geometrically to a truncated Gaussian vote in the large-B separable regime.

  2. The Effect of Mini-Batch Noise on the Implicit Bias of Adam

    cs.LG 2026-02 unverdicted novelty 6.0

    Mini-batch noise reverses how Adam's β2 controls anti-regularization, making default momentum values suitable for small batches but requiring β1 closer to β2 for large batches to favor flatter minima.