Geospatial foundation-model embeddings improve population estimation unevenly across space and scale
Pith reviewed 2026-05-10 14:44 UTC · model grok-4.3
The pith
Geospatial foundation model embeddings improve subnational population estimates unevenly across space and scale.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Population Dynamics Foundation Model (PDFM) embeddings often capture settlement context more effectively than harmonized geospatial covariates, yielding better population predictions under geographically structured validation. The advantage depends on geography and scale: performance degrades under spatial aggregation mismatch, and the embeddings transfer across scales less flexibly than the covariates.
What carries the argument
The PDFM embeddings, reusable representations learned from multifaceted and heterogeneous geospatial data sources, benchmarked directly against assembled covariates for predictive modeling of population.
If this is right
- PDFM is most advantageous where the geospatial covariates weakly characterise settlement context, such as larger and less-developed subnational areas.
- Embeddings provide less flexible transfer across spatial aggregations than geospatial covariates.
- Geospatial foundation-model representations can improve population estimation in data-poor settings.
- Benefits break down predictably under spatial scale mismatch, revealing a limitation of current geospatial AI.
Where Pith is reading between the lines
- Hybrid models that combine embeddings with traditional covariates may be needed to handle varied geographies reliably.
- The scale-coupling problem suggests developing multi-resolution training objectives for future geospatial foundation models.
- Similar transfer limitations could appear in other spatial prediction tasks that rely on foundation-model embeddings.
- Targeted collection of ground-truth population data could be prioritized in regions where embeddings currently underperform.
Load-bearing premise
The PDFM embeddings capture settlement context more informatively than harmonized geospatial covariates without scale-specific biases introduced by the foundation model's pretraining data or aggregation choices.
What would settle it
A new test set of large, less-developed subnational areas where PDFM embeddings produce zero or negative improvement in predictive fit, or where they transfer across mismatched spatial aggregations worse than the covariates.
Original abstract
Reliable subnational population estimates are essential for applications, yet remain difficult where censuses are sparse, outdated or spatially coarse. Existing population-mapping workflows rely on hand-built geospatial covariates, such as settlement extent, night-time lights, and environmental conditions, which must be assembled and harmonised across scales and geographies. Geospatial foundation models offer an alternative by learning reusable representations of place from more multifaceted and heterogeneous data sources. Here, we benchmark Population Dynamics Foundation Model (PDFM) embeddings against the harmonised geospatial covariates for subnational population estimation in Brazil, Nigeria and the United States. Under geographically structured validation, PDFM increased predictive fit by a median of 20.1% (IQR: 10.0-33.2%, across country-model comparisons) reduction in unexplained variance, and reduced Kullback-Leibler divergence by 23.2% (9.2-26.2%). However, these gains were uneven. PDFM was most advantageous where the geospatial covariates weakly characterised settlement context, such as larger and less-developed subnational areas. Moreover, PDFM performance was scale-coupled with embeddings providing less flexible transfer across spatial aggregations than geospatial covariates. These findings showed that geospatial foundation-model representations of place can improve population estimation in data poor settings, but their benefits break down predictably under spatial scale mismatch, revealing a fundamental limitation of current geospatial AI.
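The two headline metrics can be illustrated with a minimal sketch. The paper's exact definitions are not reproduced on this page, so the formulas here are assumptions: the reduction in unexplained variance is taken as the relative drop in 1 - R² when moving from the covariate model to the PDFM model, and the Kullback-Leibler divergence compares observed and predicted population shares across units. All function names and numbers are hypothetical.

```python
import math

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

def unexplained_variance_reduction(r2_baseline, r2_candidate):
    """Relative reduction in unexplained variance (1 - R^2), as a fraction."""
    return ((1.0 - r2_baseline) - (1.0 - r2_candidate)) / (1.0 - r2_baseline)

def kl_divergence(observed, predicted, eps=1e-12):
    """KL(P || Q), with P and Q the population shares of observed and predicted counts."""
    total_obs, total_pred = sum(observed), sum(predicted)
    kl = 0.0
    for o, p in zip(observed, predicted):
        p_share = o / total_obs
        q_share = max(p / total_pred, eps)  # guard against log of zero
        if p_share > 0:
            kl += p_share * math.log(p_share / q_share)
    return kl

# Hypothetical populations for four subnational units.
observed = [100.0, 200.0, 300.0, 400.0]
pred_cov = [150.0, 150.0, 350.0, 350.0]   # covariate-based model (assumed values)
pred_pdfm = [120.0, 180.0, 320.0, 380.0]  # embedding-based model (assumed values)

r2_cov = r_squared(observed, pred_cov)
r2_pdfm = r_squared(observed, pred_pdfm)
gain = unexplained_variance_reduction(r2_cov, r2_pdfm)
kl_cov = kl_divergence(observed, pred_cov)
kl_pdfm = kl_divergence(observed, pred_pdfm)
```

Expressing the gain relative to the baseline's unexplained variance keeps comparisons meaningful when the covariate model already fits well, which may be why the abstract reports it this way.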
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper benchmarks embeddings from the Population Dynamics Foundation Model (PDFM) against harmonized geospatial covariates for subnational population estimation in Brazil, Nigeria, and the United States. Under geographically structured validation, it reports a median 20.1% reduction in unexplained variance (IQR 10.0-33.2%) and a 23.2% reduction in Kullback-Leibler divergence (9.2-26.2%). The improvements are uneven across space and scale: PDFM performs best in larger, less-developed areas but transfers across spatial aggregations less flexibly than traditional covariates.
Significance. If the results are confirmed, this work demonstrates that geospatial foundation models can offer advantages over conventional covariates in population mapping, especially in data-poor contexts, while also identifying key limitations related to spatial scale that must be addressed for broader applicability. This has practical significance for improving demographic estimates used in policy and humanitarian efforts.
major comments (2)
- [Methods] Detailed procedures for extracting and aggregating PDFM embeddings to subnational units, as well as the exact harmonization steps for geospatial covariates, are not described. This omission makes it difficult to determine whether the reported performance gains stem from the embeddings themselves or from differences in aggregation methods, directly impacting the validity of the central claim of uneven improvements.
- [Results] The manuscript should provide more explicit evidence that aggregation procedures were matched exactly between PDFM embeddings and geospatial covariates. Without this, the 20.1% median improvement could be partly attributable to scale-specific summarization choices rather than superior settlement context capture, as suggested by the noted scale-coupling and the abstract's own qualification on transfer flexibility.
minor comments (1)
- [Abstract] The abstract is clear but could specify the number of subnational units or models compared to give context to the IQR ranges reported for the improvements.
Simulated Author's Rebuttal
We thank the referee for their constructive and detailed comments, which highlight important aspects of methodological transparency. We have revised the manuscript to address the concerns about aggregation procedures and have added the requested details and evidence. Below we respond point by point.
Point-by-point responses
- Referee: [Methods] Detailed procedures for extracting and aggregating PDFM embeddings to subnational units, as well as the exact harmonization steps for geospatial covariates, are not described. This omission makes it difficult to determine whether the reported performance gains stem from the embeddings themselves or from differences in aggregation methods, directly impacting the validity of the central claim of uneven improvements.
Authors: We agree that the original manuscript did not give sufficient detail on these steps, which limits reproducibility and could raise questions about the source of the observed gains. In the revised version we have added a new Methods subsection ('PDFM Embedding Extraction and Covariate Harmonization') that specifies: (i) the PDFM API query parameters and embedding dimensionality, (ii) the exact spatial aggregation (area-weighted mean pooling of embeddings within each subnational polygon), and (iii) the full harmonization pipeline for the geospatial covariates (source datasets, reprojection to a common grid, temporal alignment, and normalization). We also include a direct statement that identical aggregation logic was applied to both feature sets. These additions allow readers to confirm that performance differences arise from the embeddings rather than procedural mismatches.
revision: yes
- Referee: [Results] The manuscript should provide more explicit evidence that aggregation procedures were matched exactly between PDFM embeddings and geospatial covariates. Without this, the 20.1% median improvement could be partly attributable to scale-specific summarization choices rather than superior settlement context capture, as suggested by the noted scale-coupling and the abstract's own qualification on transfer flexibility.
Authors: We accept this critique and have strengthened the Results section accordingly. We now include an explicit paragraph and a supplementary table that document the matched aggregation functions (area-weighted means for both embeddings and covariates) and report a sensitivity check in which alternative summarization choices (e.g., median pooling) were tested; the relative advantage of PDFM remains stable. While we retain the abstract's qualification on scale-coupling, the added evidence demonstrates that the 20.1% median reduction in unexplained variance is not an artifact of mismatched summarization. We have also cross-referenced these details in the discussion of uneven spatial performance.
revision: yes
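The matched aggregation described in the rebuttal can be sketched as follows. This is an illustration only: the authors' actual implementation is not shown on this page, and the function names, the two-dimensional vectors, and the overlap areas are all hypothetical. The point is that one pooling function is applied to any per-cell feature vector, whether embedding or covariate, so both feature sets are summarized identically.

```python
from statistics import median

def area_weighted_mean(vectors, areas):
    """Pool per-cell feature vectors into one vector, weighting each cell by its overlap area."""
    total_area = sum(areas)
    if total_area <= 0:
        raise ValueError("total overlap area must be positive")
    dim = len(vectors[0])
    return [
        sum(v[d] * a for v, a in zip(vectors, areas)) / total_area
        for d in range(dim)
    ]

def median_pool(vectors):
    """Alternative summarization for a sensitivity check: per-dimension median, ignoring area."""
    dim = len(vectors[0])
    return [median(v[d] for v in vectors) for d in range(dim)]

# Hypothetical example: three source cells overlapping one subnational polygon.
embeddings = [[1.0, 0.0], [3.0, 2.0], [5.0, 4.0]]
overlap_km2 = [1.0, 1.0, 2.0]  # assumed overlap areas with the target polygon

pooled_mean = area_weighted_mean(embeddings, overlap_km2)
pooled_median = median_pool(embeddings)
```

Comparing downstream model fit under `area_weighted_mean` and `median_pool` is one way to run the kind of sensitivity check the authors describe.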
Circularity Check
No circularity in empirical benchmark of foundation-model embeddings
full rationale
The paper reports an empirical benchmark comparing PDFM embeddings to harmonized geospatial covariates for subnational population estimation in Brazil, Nigeria and the United States. Results are obtained via geographically structured validation measuring reductions in unexplained variance and KL divergence on held-out data. No derivations, equations, fitted parameters renamed as predictions, or self-citation chains appear in the load-bearing steps; the claimed improvements are measured directly against external data splits rather than being forced by construction or internal definitions.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Geographically structured validation sufficiently prevents spatial autocorrelation leakage in performance estimates.
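One common way to implement geographically structured validation is spatial blocking, where neighbouring units are always held out together. The sketch below is an assumption about how such a scheme might look, not the paper's procedure; the block size, fold count, coordinates, and function names are hypothetical.

```python
from collections import defaultdict

def spatial_block_folds(coords, block_size_deg, n_folds):
    """Assign each unit to a fold via the grid block containing its (lon, lat) centroid."""
    blocks = defaultdict(list)
    for idx, (lon, lat) in enumerate(coords):
        key = (int(lon // block_size_deg), int(lat // block_size_deg))
        blocks[key].append(idx)
    folds = [[] for _ in range(n_folds)]
    # Deal whole blocks out round-robin so no block is split between folds.
    for i, key in enumerate(sorted(blocks)):
        folds[i % n_folds].extend(blocks[key])
    return folds

def train_test_splits(folds):
    """Yield (train_indices, test_indices), holding out one fold of blocks at a time."""
    for k, test in enumerate(folds):
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, test

# Hypothetical centroids (lon, lat) for six subnational units.
coords = [(0.1, 0.1), (0.2, 0.3), (1.5, 0.2), (1.7, 0.4), (0.3, 1.6), (1.2, 1.9)]
folds = spatial_block_folds(coords, block_size_deg=1.0, n_folds=2)
splits = list(train_test_splits(folds))
```

Whether blocking of this kind removes enough spatial autocorrelation depends on the block size relative to the autocorrelation range, which is the substance of the assumption listed above.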
Reference graph
Works this paper leans on
-
[1]
WorldPop, School of Geography and Environmental Sciences, University of Southampton, United Kingdom
-
[2]
Geographic Data Science Lab, Department of Geography and Planning, School of Environmental Sciences, University of Liverpool, United Kingdom # Corresponding authors: Wenbin Zhang (wb.zhang@soton.ac.uk), and Shengjie Lai (Shengjie.Lai@soton.ac.uk) Abstract Reliable subnational population estimates are essential for applications, yet remain difficult where...
-
[3]
Such representations are appealing for spatial demography because they compress not only otherwise difficult-to-access behavioural signals, such as aggregated search and activity patterns, but also geospatial and contextual data into a single reusable representation of place, thereby reducing the need for downstream applications to assemble and align mul...
-
[4]
The resulting embeddings are fixed, location-specific representations that can be used as predictors in downstream tasks. The goal of this study was not to retrain or modify PDFM, but to evaluate the predictive utility of these precomputed embeddings for population modelling relative to established geospatial covariates. PDFM embeddings were generated se...
-
[5]
and the US (N = 39649) using the same methodological pipeline. In each country, embeddings were produced from country-specific models built on the relevant administrative geography and available place-based input data. Input signals were time-matched across countries to October 2023 to improve consistency. Consequently, cross-country differences in mode...
-
[6]
Candidate polygons were then identified based on spatial proximity to these coordinates, including the nearest polygon and its neighbouring polygons. Within this candidate set, we calculated string similarity between the embedding place name and polygon names using the Jaro-Winkler distance
-
[7]
We evaluated similarity both with the full name and with higher-level region names removed (e.g., removing state names when embedded in district names). The final match was determined by combining spatial proximity and name similarity: the nearest polygon was selected unless another candidate showed substantially higher name similarity beyond a predefin...