Recognition: no theorem link
On cross-validation for small area estimators
Pith reviewed 2026-05-12 02:16 UTC · model grok-4.3
The pith
A decomposition of cross-validated squared error separates identifiable bias from bounded unidentifiable parts, enabling reliable comparisons of small area estimators under complex survey designs.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Decomposing the cross-validated squared error into an identifiable bias term and unidentifiable components that can be bounded makes model comparisons for small area estimators under complex survey designs more robust and interpretable; in simulations the framework outperforms conventional cross-validation and attaches uncertainty measures to its comparisons.
What carries the argument
The decomposition of the cross-validated squared error into an identifiable bias term and bounded unidentifiable components. It carries the argument by making explicit which parts of the error can be estimated directly and which can only be bounded under the survey design.
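The logic behind such a decomposition can be illustrated with a minimal simulation. This is a generic identity about validating against noisy direct estimates, not the paper's actual decomposition, and every constant below is invented: when the held-out target is a design-noisy direct estimate rather than the truth, the observable squared error equals the error against the truth plus the target's variance.

```python
import numpy as np

rng = np.random.default_rng(0)
n_areas, n_reps = 50, 20_000
sigma_d = 0.5  # sd of each area's direct (survey-weighted) estimate; invented

theta = rng.normal(0.0, 1.0, n_areas)  # true area-level quantities (unknown in practice)

mse_proxy = np.empty(n_reps)  # squared error against the noisy direct estimates
mse_truth = np.empty(n_reps)  # squared error against the truth (simulation only)
for r in range(n_reps):
    direct = theta + rng.normal(0.0, sigma_d, n_areas)   # held-out direct estimates
    model = 0.8 * theta + rng.normal(0.0, 0.3, n_areas)  # stand-in model-based estimator
    mse_proxy[r] = np.mean((model - direct) ** 2)
    mse_truth[r] = np.mean((model - theta) ** 2)

# Because the direct-estimate noise is independent of the model,
# E[(model - direct)^2] = E[(model - theta)^2] + sigma_d^2.
print(mse_proxy.mean())                 # observable CV error
print(mse_truth.mean() + sigma_d ** 2)  # truth error plus target variance
```

The offset `sigma_d ** 2` is constant here only because the noise is homoskedastic and independent of the estimator; under a real multi-stage design that term varies by area and is entangled with the weights, which is what makes parts of it unidentifiable and motivates bounding them.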
If this is right
- Conventional leave-one-area-out cross-validation can produce misleading rankings of small area estimators.
- The framework permits direct comparisons between area-level and unit-level small area estimation models.
- Uncertainty quantification accompanies the model selection process for small area estimators.
- More trustworthy model choices improve subnational estimates such as female literacy rates from survey data.
Where Pith is reading between the lines
- Adoption could lower errors in policy decisions that rely on subnational health or literacy indicators.
- The bounding approach may apply to validation tasks in other domains where ground truth is unavailable.
- Testing under different sample sizes or survey complexities would clarify the method's robustness limits.
- The work points toward greater emphasis on design-aware validation throughout survey-based statistics.
Load-bearing premise
That the proposed decomposition can effectively bound the unidentifiable components in a way that supports reliable model comparisons under complex survey designs.
What would settle it
A simulation where the true best model is known in advance, checking whether the proposed cross-validation selects it more often than leave-one-area-out cross-validation, or external validation data on literacy rates that contradicts one set of rankings but not the other.
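A settling simulation of this kind can be sketched in a few lines. This is a toy version with invented distributions and a simple shrinkage estimator, not the paper's design: because the truth is known, one can count how often ranking two estimators by error against independent held-out direct estimates agrees with ranking them against the truth.

```python
import numpy as np

rng = np.random.default_rng(1)
m, sigma, n_sims = 30, 1.0, 2000  # areas, direct-estimate sd, replicates (all invented)
agree = 0
for _ in range(n_sims):
    theta = rng.normal(0.0, 0.7, m)            # true area rates (known in simulation only)
    y_fit = theta + rng.normal(0.0, sigma, m)  # data used to build the estimators
    y_val = theta + rng.normal(0.0, sigma, m)  # independent held-out direct estimates
    est_a = y_fit                              # model A: the direct estimator itself
    est_b = 0.33 * y_fit                       # model B: shrinkage (near-optimal here)
    cv_pick = np.mean((est_a - y_val) ** 2) < np.mean((est_b - y_val) ** 2)
    truth_pick = np.mean((est_a - theta) ** 2) < np.mean((est_b - theta) ** 2)
    agree += int(cv_pick == truth_pick)

rate = agree / n_sims
print(rate)  # fraction of replicates where CV and the truth rank the models alike
```

Even with an independent validation copy, validation noise flips the ranking in some replicates; a decisive comparison needs the gap between models to exceed the validation uncertainty, which is what the paper's bounds are meant to certify.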
Original abstract
Subnational monitoring of public health often relies on household surveys where data are sparse at the desired spatial resolution. Small area estimation (SAE) methods address this challenge by borrowing strength across areas and incorporating auxiliary information. However, comparing these estimators remains difficult in the absence of ground truth. We propose a cross-validation framework for evaluating small area estimators that accommodates complex survey designs. Our approach enables model-agnostic comparisons between area-level and unit-level SAE models. Central to our framework is a decomposition of the cross-validated squared error, which reveals both identifiable bias and unidentifiable components that can be bounded. Our theoretical results and simulation studies show that conventional approaches, such as leave-one-area-out cross-validation, can yield misleading model rankings, whereas the proposed approach offers more robust and interpretable model comparison with uncertainty quantification. We demonstrate the framework through a case study comparing SAE models estimating the subnational female literacy rate using Demographic and Health Surveys from Zambia.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes a cross-validation framework for small area estimation (SAE) under complex survey designs. It decomposes the cross-validated squared error into an identifiable bias term and unidentifiable components (from sampling design, area-level effects, and residuals) that are bounded to yield uncertainty intervals for comparing area-level versus unit-level SAE models. Theoretical results and simulations are used to argue that leave-one-area-out CV produces misleading rankings, while the proposed approach is more robust and interpretable. The framework is demonstrated on a case study estimating subnational female literacy rates from Zambian DHS data.
Significance. If the bounds on unidentifiable components prove sufficiently tight, the work would meaningfully advance model selection for SAE in sparse survey settings, a frequent challenge in public health applications. Credit is due for the model-agnostic decomposition, explicit uncertainty quantification, accommodation of complex designs, and the combination of theory, simulations, and real-data illustration. These elements address a genuine gap, though the practical utility hinges on the tightness of the derived bounds relative to model differences.
major comments (2)
- [Theoretical decomposition] Theoretical decomposition section: The claim that the decomposition enables reliable model comparisons rests on the bounds for unidentifiable components (sampling weights, cluster effects, residuals) being tight enough to produce non-overlapping intervals. In multi-stage designs such as DHS, these components are entangled with inclusion probabilities; if the bounds remain wide (as is common with small effective sample sizes per area), the intervals will overlap and the method will not overturn misleading LOAO rankings. A concrete demonstration that the bounds are decisive under the paper's assumptions is needed.
- [Simulation studies] Simulation studies: The simulations must report the proportion of cases in which the proposed intervals produce decisive (non-overlapping) rankings when conventional CV fails, and the coverage properties of the bounds under varying effective sample sizes and design effects. Without these diagnostics, the evidence that the approach is 'more robust' is incomplete.
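Both requested diagnostics are cheap to compute once per-replicate interval endpoints are available. A schematic version follows, with the interval construction and every constant invented rather than taken from the paper: decisiveness is the fraction of intervals for the between-model error gap that exclude zero, and coverage is the fraction that contain the true gap.

```python
import numpy as np

rng = np.random.default_rng(2)
n_sims = 5000
true_gap = 0.4   # true MSE difference, model B minus model A (invented)
bound = 0.3      # half-width contributed by the unidentifiable components (invented)

decisive = 0
covered = 0
for _ in range(n_sims):
    est_gap = true_gap + rng.normal(0.0, 0.15)  # estimated identifiable part of the gap
    lo, hi = est_gap - bound, est_gap + bound
    decisive += int(lo > 0 or hi < 0)           # interval excludes zero: ranking decisive
    covered += int(lo <= true_gap <= hi)        # interval contains the true gap

decisive_rate = decisive / n_sims
coverage_rate = covered / n_sims
print(decisive_rate, coverage_rate)
```

Reporting these two rates across effective sample sizes and design effects would directly address whether the bounds are tight enough to matter in practice.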
minor comments (2)
- [Methods] The notation for the cross-validated squared error decomposition and the bounding procedure should be presented with explicit definitions of all terms (e.g., how the unidentifiable variance is bounded) to improve readability.
- [Case study] In the case study, provide more detail on the specific complex survey features (stratification, clustering, weights) and how they enter the bounds.
Simulated Author's Rebuttal
We thank the referee for their constructive comments and for recognizing the contributions of our cross-validation framework for small area estimation. We will revise the manuscript to provide the requested concrete demonstrations and additional simulation diagnostics, which will strengthen the evidence for the practical utility of the proposed bounds and comparisons.
Point-by-point responses
-
Referee: Theoretical decomposition section: The claim that the decomposition enables reliable model comparisons rests on the bounds for unidentifiable components (sampling weights, cluster effects, residuals) being tight enough to produce non-overlapping intervals. In multi-stage designs such as DHS, these components are entangled with inclusion probabilities; if the bounds remain wide (as is common with small effective sample sizes per area), the intervals will overlap and the method will not overturn misleading LOAO rankings. A concrete demonstration that the bounds are decisive under the paper's assumptions is needed.
Authors: We agree that the usefulness of the framework for overturning LOAO rankings depends on the bounds being sufficiently tight in relevant settings. Our theoretical decomposition derives explicit, design-based bounds on the unidentifiable components that remain valid under the multi-stage sampling assumptions used in the paper, including entanglement with inclusion probabilities. The existing simulations already include designs that approximate DHS-style multi-stage sampling and show instances of non-overlapping intervals that produce correct rankings where LOAO does not. In the revision we will add a dedicated table and accompanying text that directly quantifies bound widths relative to observed CV-error differences across the simulated scenarios, thereby providing the concrete demonstration requested under the paper's assumptions. revision: yes
-
Referee: Simulation studies: The simulations must report the proportion of cases in which the proposed intervals produce decisive (non-overlapping) rankings when conventional CV fails, and the coverage properties of the bounds under varying effective sample sizes and design effects. Without these diagnostics, the evidence that the approach is 'more robust' is incomplete.
Authors: We acknowledge that explicit summary statistics on decisiveness and coverage would make the simulation evidence more complete and easier to interpret. The current simulations already vary effective sample sizes and design effects while illustrating the superiority of the proposed intervals over LOAO, but we will expand the results section to include (i) the proportion of replicates in which the uncertainty intervals yield non-overlapping rankings when LOAO rankings are misleading, and (ii) empirical coverage rates of the derived bounds across the range of effective sample sizes and design effects examined. These additions will be presented in new tables or figures in the revised manuscript. revision: yes
Circularity Check
No circularity: new decomposition of CV squared error is derived independently and validated externally
full rationale
The paper's central contribution is a novel decomposition of cross-validated squared error into an identifiable bias term plus bounded unidentifiable components arising from survey design, area effects, and residuals. This decomposition is presented as a first-principles theoretical result for complex sampling, supported by simulation studies and a real-data case study on Zambian DHS literacy rates. No equation reduces by construction to a fitted parameter renamed as a prediction, no load-bearing premise rests on self-citation, and no uniqueness theorem or ansatz is smuggled in from prior author work. Conventional leave-one-area-out CV is critiqued on external grounds rather than tautologically. The framework therefore remains self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (1)
- [domain assumption] The cross-validated squared error decomposes into an identifiable bias term and unidentifiable components that can be bounded.
Reference graph
Works this paper leans on
- [3] Alfons, A., Kraft, S., Templ, M. & Filzmoser, P. (2010). Simulation of synthetic population data for household surveys with application to EU-SILC. Research Report CS-2010-1, Department of Statistics and Probability Theory, Vienna University of Technology.
- [5] Bradley, J. R., Holan, S. H. & Wikle, C. K. (2015). Multivariate spatio-temporal models for high-dimensional areal data with application to longitudinal employer-household dynamics. The Annals of Applied Statistics 9, 1761–1791.
- [6] Brown, G., Chambers, R., Heady, P. & Heasman, D. (2001). Evaluation of small area estimation methods: an application to unemployment estimates from the UK LFS. In Proceedings of Statistics Canada Symposium, vol. 2001. Statistics Canada.
- [8] Corsi, D. J., Neuman, M., Graham, W. B. & Subramanian, S. (2012). The Demographic and Health Surveys program: an overview. International Journal of Epidemiology 41, 1602–1613.
- [9] Datta, G. S. & Lahiri, P. (2000). A unified measure of uncertainty of estimated best linear unbiased predictors in small area estimation problems. Statistica Sinica, 613–627.
- [10] Dharamshi, A., Neufeld, A., Motwani, K., Gao, L. L., Witten, D. & Bien, J. (2025). Generalized data thinning using sufficient statistics. Journal of the American Statistical Association 120, 511–523.
- [11] Dong, Q., Li, Z. R., Wu, Y., Boskovic, A. & Wakefield, J. (2024). surveyPrev: Mapping the Prevalence of Binary Indicators using Survey Data in Small Areas. R package version 1.0.0.
- [12] Dong, Q., Wu, Y., Li, Z. R. & Wakefield, J. (2026). Toward a principled workflow for prevalence mapping using household survey data. Journal of Survey Statistics and Methodology, smaf048.
- [13] Dong, T. & Wakefield, J. (2021). Modeling and presentation of health and demographic indicators in a low- and middle-income countries context. Vaccine 39, 2584–2594.
- [14] Dorfman, A. H. (2018). Towards a routine external evaluation protocol for small area estimation. International Statistical Review 86, 259–274.
- [15] Efron, B. (2004). The estimation of prediction error: covariance penalties and cross-validation. Journal of the American Statistical Association 99, 619–632.
- [16] Fabrizi, E. & Lahiri, P. (2013). A design-based approximation to the Bayes information criterion in finite population sampling. Statistica 73, 289–301.
- [17] Fay, R. & Herriot, R. (1979). Estimates of income for small places: an application of James–Stein procedures to census data. Journal of the American Statistical Association 74, 269–277.
- [18] Franco, C. & Maitra, P. (2023). Combining surveys in small area estimation using area-level models. Wiley Interdisciplinary Reviews: Computational Statistics 15, e1613.
- [20] Hájek, J. (1971). Discussion of "An essay on the logical foundations of survey sampling, part I" by D. Basu. In Foundations of Statistical Inference, V. Godambe & D. Sprott, eds. Toronto: Holt, Rinehart and Winston.
- [21] Holbrook, A., Lumley, T. & Gillen, D. (2020). Estimating prediction error for complex samples. Canadian Journal of Statistics 48, 204–221.
- [22] Iparragirre, A., Lumley, T., Barrio, I. & Arostegui, I. (2023). Variable selection with lasso regression for complex survey data. Stat 12, e578.
- [24] Kawano, S., Parker, P. A. & Li, Z. R. (2026). On data thinning for model validation in small area estimation. arXiv preprint arXiv:2604.04141.
- [25] Khan, S. & Hancioglu, A. (2019). Multiple Indicator Cluster Surveys: delivering robust data on children and women across the globe. Studies in Family Planning 50, 279–286.
- [26] Kuh, S., Kennedy, L., Chen, Q. & Gelman, A. (2024). Using leave-one-out cross-validation (LOO) in a multilevel regression and poststratification (MRP) workflow: a cautionary tale. Statistics in Medicine 43, 953–982.
- [27] Lahiri, P. & Pramanik, S. (2019). Evaluation of synthetic small-area estimators using design-based methods. Austrian Journal of Statistics 48, 43–57.
- [28] Li, Z. R., Hsiao, Y., Godwin, J., Martin, B. D., Wakefield, J. & Clark, S. J. (2019). Changes in the spatial distribution of the under-five mortality rate: small-area analysis of 122 DHS surveys in 262 subregions of 35 countries in Africa. PLoS One 14, e0210645.
- [30] Lohr, S. L. & Rao, J. (2009). Jackknife estimation of mean squared error of small area predictors in nonlinear mixed models. Biometrika 96, 457–468.
- [31] Lumley, T. & Scott, A. (2014). Tests for regression models fitted to survey data. Australian & New Zealand Journal of Statistics 56, 1–14.
- [32] Lumley, T. & Scott, A. (2015). AIC and BIC for modeling with complex survey data. Journal of Survey Statistics and Methodology 3, 1–18.
- [33] Lumley, T. & Scott, A. (2017). Fitting regression models to survey data. Statistical Science, 265–278.
- [34] Mercer, L., Wakefield, J., Pantazis, A., Lutambi, A., Mosanja, H. & Clark, S. (2015). Small area estimation of childhood mortality in the absence of vital registration. Annals of Applied Statistics 9, 1889–1905.
- [35] Merfeld, J. D., Chen, H., Lahiri, P. & Newhouse, D. (2024). Small area estimation with geospatial data: a primer. Background Document ESA/STAT/AC.394/BG-3p, United Nations Statistical Commission. Prepared for the 56th Session of the United Nations Statistical Commission (March 2025) by the Inter-Secretariat Working Group on Household Surveys.
- [36] Merfeld, J. D., Newhouse, D. L., Weber, M. & Lahiri, P. (2022). Combining survey and geospatial data can significantly improve gender-disaggregated estimates of labor market outcomes.
- [37] Neufeld, A., Dharamshi, A., Gao, L. L. & Witten, D. (2024). Data thinning for convolution-closed distributions. Journal of Machine Learning Research 25, 1–35.
- [39] Rao, J. & Molina, I. (2015). Small Area Estimation, Second Edition. New York: John Wiley.
- [40] Rao, J. N. & Scott, A. J. (1981). The analysis of categorical data from complex sample surveys: chi-squared tests for goodness of fit and independence in two-way tables. Journal of the American Statistical Association 76, 221–230.
- [41] Rao, J. N. & Scott, A. J. (1984). On chi-squared tests for multiway contingency tables with cell proportions estimated from survey data. The Annals of Statistics, 46–60.
- [42] Riebler, A., Sørbye, S., Simpson, D. & Rue, H. (2016). An intuitive Bayesian spatial model for disease mapping that accounts for scaling. Statistical Methods in Medical Research 25, 1145–1165.
- [43] Rivest, L.-P. & Belmonte, E. (2000). A conditional mean squared error of small area estimators. Survey Methodology 26, 67–78.
- [44] Rue, H., Martino, S. & Chopin, N. (2009). Approximate Bayesian inference for latent Gaussian models using integrated nested Laplace approximations (with discussion). Journal of the Royal Statistical Society, Series B 71, 319–392.
- [45] Saha, U. R., Das, S., Baffour, B. & Chandra, H. (2023). Small area estimation of age-specific and total fertility rates in Bangladesh. Spatial Demography 11, 2.
- [47] Steorts, R. C., Schmid, T. & Tzavidis, N. (2020). Smoothing and benchmarking for small area estimation. International Statistical Review 88, 580–598.
- [48] The DHS Program (2026). The Demographic and Health Surveys program. http://www.dhsprogram.com. Accessed: 2026-04-08.
- [49] Thomas, D. R. & Rao, J. (1987). Small-sample comparisons of level and power for simple goodness-of-fit statistics under cluster sampling. Journal of the American Statistical Association 82, 630–636.
- [50] Trevisani, M., Torelli, N. et al. (2017). A comparison of hierarchical Bayesian models for small area estimation of counts. Open Journal of Statistics 7, 521–550.
- [51] UNICEF (2026). Multiple Indicator Cluster Surveys. http://mics.unicef.org. Accessed: 2026-04-08.
- [52] Vehtari, A., Mononen, T., Tolvanen, V., Sivula, T. & Winther, O. (2016). Bayesian leave-one-out cross-validation approximations for Gaussian latent variable models. Journal of Machine Learning Research 17, 1–38.
- [53] Wakefield, J., Fuglstad, G.-A., Riebler, A., Godwin, J., Wilson, K. & Clark, S. (2019). Estimating under-five mortality in space and time in a developing world context. Statistical Methods in Medical Research 28, 2614–2634.
- [55] Wakefield, J., Jiang, J. & Wu, Y. (2026). Automatic variance adjustment for small area estimation. arXiv preprint arXiv:2602.14387.
- [56] Wakefield, J., Okonek, T. & Pedersen, J. (2020). Small area estimation for disease prevalence mapping. International Statistical Review 88, 398–418.
- [57] Wieczorek, J., Guerin, C. & McMahon, T. (2022). K-fold cross-validation for complex sample surveys. Stat 11, e454.
- [58] Wieczorek, O. & Franco, C. (2013). Small area estimation evaluation strategies: an application to the American Community Survey. In Proceedings of the Joint Statistical Meetings, Section on Survey Research Methods. American Statistical Association, Alexandria, VA.
- [59] WorldPop (2018). Global high resolution population denominators project, funded by the Bill and Melinda Gates Foundation (OPP1134076).
- [60] Wu, Y., Li, Z. R., Mayala, B., Wang, H., Gao, P., Paige, J., Fuglstad, G.-A., Moe, C., Godwin, J., Donohue, R., Janocha, B., Croft, T. & Wakefield, J. (2021). Spatial Modeling for Subnational Administrative Level 2 Small-Area Estimation. DHS Spatial Analysis Reports No. 21. Rockville, Maryland, USA.
discussion (0)