Scalable and Calibrated Sampling for Bayesian Generalized Linear Mixed Model via Stochastic Gradient Markov Chain Monte Carlo

Andrea Agazzi; Felipe A. Medeiros; Samuel I. Berchuck; Youngsoo Baek

arxiv: 2403.03007 · v4 · pith:UOI2SK5Onew · submitted 2024-03-05 · stat.CO · stat.ME · stat.ML

keywords: gradient · carlo · data · glmms · monte · posterior · accurate · bayesian

Generalized linear mixed models (GLMMs) are widely used for analyzing correlated data, particularly in large-scale biomedical and social science applications. Scalable Bayesian inference for GLMMs is challenging due to an intractable marginal likelihood and a high computational cost incurred by conventional Markov chain Monte Carlo (MCMC) methods. We develop a stochastic gradient MCMC (SGMCMC) algorithm tailored to GLMMs that enables accurate posterior inference in the large-sample regime. Our approach uses Fisher's identity to construct a (biased) Monte Carlo estimator of the gradient of the marginal log-likelihood, making SGMCMC feasible when direct gradient computation is impossible. We analyze the additional variability, introduced by both data subsampling and gradient approximation, to derive a post-hoc covariance correction that yields properly calibrated posterior uncertainty. We show through simulated studies that the proposed method provides accurate posterior means and variances in settings with a large number of groups, outperforming existing approaches, including control variate methods. We further demonstrate the method's practical utility in an analysis of electronic health records data, where accounting for variance inflation materially changes scientific conclusions.
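To make the role of Fisher's identity concrete, the following is a minimal sketch, not the authors' implementation. Fisher's identity states that the gradient of the marginal log-likelihood equals the posterior expectation, over the random effects, of the complete-data score. The sketch assumes a random-intercept logistic GLMM with parameters theta = (beta, log_tau), approximates that expectation by self-normalized importance sampling with the random-effect prior as proposal (one possible biased estimator, chosen here for brevity), and plugs it into a plain stochastic gradient Langevin update over minibatches of groups. The function names, the Gaussian prior on theta, the step size, and the batch size are all hypothetical, and the post-hoc covariance correction described in the abstract is not included.

```python
import numpy as np

def sigmoid(z):
    # Numerically stable logistic function.
    return np.exp(-np.logaddexp(0.0, -z))

def fisher_grad_group(y, X, beta, log_tau, rng, n_mc=200):
    """Estimate the gradient of log p(y_i | beta, log_tau) for one group
    via Fisher's identity, using self-normalized importance sampling
    (an illustrative choice, not necessarily the paper's estimator)."""
    tau = np.exp(log_tau)
    b = tau * rng.standard_normal(n_mc)        # proposal: prior N(0, tau^2)
    eta = (X @ beta)[None, :] + b[:, None]     # (n_mc, n_i) linear predictors
    # Bernoulli log-likelihood of the group for each draw = log weights.
    log_w = np.sum(y * eta - np.logaddexp(0.0, eta), axis=1)
    w = np.exp(log_w - log_w.max())
    w /= w.sum()                               # self-normalization introduces bias
    # Complete-data score evaluated at each draw of the random intercept.
    g_beta = (y[None, :] - sigmoid(eta)) @ X   # (n_mc, p)
    g_log_tau = b**2 / tau**2 - 1.0            # d/d(log tau) of log N(b; 0, tau^2)
    return np.concatenate([w @ g_beta, [w @ g_log_tau]])

def sgld_step(theta, groups, rng, step=1e-4, batch=8, prior_sd=10.0):
    """One SGLD update of theta = (beta, log_tau) on a minibatch of groups.
    Assumes an independent N(0, prior_sd^2) prior on each coordinate."""
    idx = rng.choice(len(groups), size=batch, replace=False)
    grad = -theta / prior_sd**2                # gradient of the log prior
    scale = len(groups) / batch                # reweight minibatch to full data
    for i in idx:
        y, X = groups[i]
        grad += scale * fisher_grad_group(y, X, theta[:-1], theta[-1], rng)
    noise = rng.standard_normal(theta.shape)
    return theta + 0.5 * step * grad + np.sqrt(step) * noise
```

Both the group subsampling in `sgld_step` and the Monte Carlo approximation in `fisher_grad_group` add noise to the gradient. These are the two sources of extra variability that the abstract says the covariance correction accounts for.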

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Safe, Scalable, and Accurate Bayes Posterior Sampling for Large-Data Generalized Linear Mixed Models
stat.ME · 2026-04 · unverdicted · novelty 6.0

A stochastic mirror Langevin dynamics sampler with subsampling and Wasserstein-based post-processing yields accurate posterior samples and variance estimates for large-scale Bayesian GLMMs.