Geospatial foundation-model embeddings improve population estimation unevenly across space and scale
Pith reviewed 2026-05-10 14:44 UTC · model grok-4.3
The pith
Geospatial foundation model embeddings improve subnational population estimates unevenly across space and scale.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
PDFM embeddings often capture settlement context more effectively than harmonized geospatial covariates, yielding better population predictions under geographically structured validation. The advantage depends on geography and scale, however: performance degrades under spatial aggregation mismatch, and the embeddings transfer less flexibly across scales than the covariates.
What carries the argument
The PDFM embeddings, reusable representations learned from multifaceted and heterogeneous geospatial data sources, benchmarked directly against assembled covariates for predictive modeling of population.
If this is right
- PDFM is most advantageous where the geospatial covariates weakly characterise settlement context, such as larger and less-developed subnational areas.
- Embeddings provide less flexible transfer across spatial aggregations than geospatial covariates.
- Geospatial foundation-model representations can improve population estimation in data-poor settings.
- Benefits break down predictably under spatial scale mismatch, revealing a limitation of current geospatial AI.
Where Pith is reading between the lines
- Hybrid models that combine embeddings with traditional covariates may be needed to handle varied geographies reliably.
- The scale-coupling problem suggests developing multi-resolution training objectives for future geospatial foundation models.
- Similar transfer limitations could appear in other spatial prediction tasks that rely on foundation-model embeddings.
- Targeted collection of ground-truth population data could be prioritized in regions where embeddings currently underperform.
Load-bearing premise
The PDFM embeddings capture settlement context more informatively than harmonized geospatial covariates without scale-specific biases introduced by the foundation model's pretraining data or aggregation choices.
What would settle it
A new test set of large, less-developed subnational areas where PDFM embeddings produce zero or negative improvement in predictive fit, or where they transfer across mismatched spatial aggregations worse than the covariates.
Original abstract
Reliable subnational population estimates are essential for applications, yet remain difficult where censuses are sparse, outdated or spatially coarse. Existing population-mapping workflows rely on hand-built geospatial covariates, such as settlement extent, night-time lights, and environmental conditions, which must be assembled and harmonised across scales and geographies. Geospatial foundation models offer an alternative by learning reusable representations of place from more multifaceted and heterogeneous data sources. Here, we benchmark Population Dynamics Foundation Model (PDFM) embeddings against the harmonised geospatial covariates for subnational population estimation in Brazil, Nigeria and the United States. Under geographically structured validation, PDFM improved predictive fit, with a median 20.1% reduction in unexplained variance (IQR: 10.0-33.2%, across country-model comparisons), and reduced Kullback-Leibler divergence by 23.2% (9.2-26.2%). However, these gains were uneven. PDFM was most advantageous where the geospatial covariates weakly characterised settlement context, such as larger and less-developed subnational areas. Moreover, PDFM performance was scale-coupled, with embeddings providing less flexible transfer across spatial aggregations than geospatial covariates. These findings show that geospatial foundation-model representations of place can improve population estimation in data-poor settings, but their benefits break down predictably under spatial scale mismatch, revealing a fundamental limitation of current geospatial AI.
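The two headline metrics in the abstract can be made concrete with a short sketch. It assumes that "reduction in unexplained variance" means the relative drop in 1 - R² when moving from the covariate model to the PDFM model, and that Kullback-Leibler divergence is computed between observed and predicted population shares across units. Both readings are assumptions rather than the paper's stated definitions, and every function name below is hypothetical.

```python
import math
from statistics import median, quantiles


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def unexplained_variance_reduction(r2_baseline, r2_candidate):
    """Percent reduction in (1 - R^2) from baseline to candidate (assumed definition)."""
    unexplained_base = 1.0 - r2_baseline
    unexplained_cand = 1.0 - r2_candidate
    return 100.0 * (unexplained_base - unexplained_cand) / unexplained_base


def kl_divergence(observed, predicted, eps=1e-12):
    """KL(P || Q), with P and Q the observed and predicted population shares."""
    total_obs, total_pred = sum(observed), sum(predicted)
    kl = 0.0
    for o, p in zip(observed, predicted):
        p_share = o / total_obs
        q_share = max(p / total_pred, eps)  # guard against zero predictions
        if p_share > 0:
            kl += p_share * math.log(p_share / q_share)
    return kl


def median_and_iqr(values):
    """Median with first and third quartiles, e.g. across country-model comparisons."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return median(values), q1, q3
```

Under this reading, a model pair with R² of 0.80 for covariates and 0.84 for embeddings would count as a 20% reduction in unexplained variance, even though R² itself rises by only 0.04. That is worth keeping in mind when interpreting the reported median.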
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper benchmarks embeddings from the Population Dynamics Foundation Model (PDFM) against harmonized geospatial covariates for subnational population estimation in Brazil, Nigeria, and the United States. Under geographically structured validation, it reports a median 20.1% reduction in unexplained variance (IQR 10.0-33.2%) and a 23.2% reduction in Kullback-Leibler divergence. The improvements are uneven across space and scale: PDFM performs best in larger, less-developed areas but transfers less flexibly across spatial aggregations than traditional covariates.
Significance. If the results are confirmed, this work demonstrates that geospatial foundation models can offer advantages over conventional covariates in population mapping, especially in data-poor contexts, while also identifying key limitations related to spatial scale that must be addressed for broader applicability. This has practical significance for improving demographic estimates used in policy and humanitarian efforts.
major comments (2)
- [Methods] Detailed procedures for extracting and aggregating PDFM embeddings to subnational units, as well as the exact harmonization steps for geospatial covariates, are not described. This omission makes it difficult to determine whether the reported performance gains stem from the embeddings themselves or from differences in aggregation methods, directly impacting the validity of the central claim of uneven improvements.
- [Results] The manuscript should provide more explicit evidence that aggregation procedures were matched exactly between PDFM embeddings and geospatial covariates. Without this, the 20.1% median improvement could be partly attributable to scale-specific summarization choices rather than superior settlement context capture, as suggested by the noted scale-coupling and the abstract's own qualification on transfer flexibility.
minor comments (1)
- [Abstract] The abstract is clear but could specify the number of subnational units or models compared to give context to the IQR ranges reported for the improvements.
Simulated Author's Rebuttal
We thank the referee for their constructive and detailed comments, which highlight important aspects of methodological transparency. We have revised the manuscript to address the concerns about aggregation procedures and have added the requested details and evidence. Below we respond point by point.
Point-by-point responses
- Referee: [Methods] Detailed procedures for extracting and aggregating PDFM embeddings to subnational units, as well as the exact harmonization steps for geospatial covariates, are not described. This omission makes it difficult to determine whether the reported performance gains stem from the embeddings themselves or from differences in aggregation methods, directly impacting the validity of the central claim of uneven improvements.
Authors: We agree that the original manuscript did not describe these steps in sufficient detail, which limits reproducibility and could raise questions about the source of the observed gains. In the revised version we have added a new Methods subsection ('PDFM Embedding Extraction and Covariate Harmonization') that specifies: (i) the PDFM API query parameters and embedding dimensionality, (ii) the exact spatial aggregation (area-weighted mean pooling of embeddings within each subnational polygon), and (iii) the full harmonization pipeline for the geospatial covariates (source datasets, reprojection to a common grid, temporal alignment, and normalization). We also state explicitly that identical aggregation logic was applied to both feature sets. These additions allow readers to confirm that performance differences arise from the embeddings rather than procedural mismatches. Revision: yes
- Referee: [Results] The manuscript should provide more explicit evidence that aggregation procedures were matched exactly between PDFM embeddings and geospatial covariates. Without this, the 20.1% median improvement could be partly attributable to scale-specific summarization choices rather than superior settlement context capture, as suggested by the noted scale-coupling and the abstract's own qualification on transfer flexibility.
Authors: We accept this critique and have strengthened the Results section accordingly. We now include an explicit paragraph and a supplementary table that document the matched aggregation functions (area-weighted means for both embeddings and covariates) and report a sensitivity check in which alternative summarization choices (e.g., median pooling) were tested; the relative advantage of PDFM remains stable. While we retain the abstract's qualification on scale-coupling, the added evidence demonstrates that the 20.1% median reduction in unexplained variance is not an artifact of mismatched summarization. We have also cross-referenced these details in the discussion of uneven spatial performance. revision: yes
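The simulated rebuttal refers to area-weighted mean pooling and to a median-pooling sensitivity check. A minimal sketch of what those two summarisation choices could look like follows. It assumes each subnational polygon overlaps several source cells, each carrying a feature vector and an overlap area. The function names and input layout are hypothetical and are not taken from the paper or from any PDFM tooling.

```python
from statistics import median


def area_weighted_mean(vectors, areas):
    """Pool per-cell feature vectors into one vector, weighting by overlap area."""
    if len(vectors) != len(areas) or not vectors:
        raise ValueError("vectors and areas must be non-empty and the same length")
    total_area = sum(areas)
    if total_area <= 0:
        raise ValueError("total overlap area must be positive")
    dim = len(vectors[0])
    return [
        sum(vec[d] * area for vec, area in zip(vectors, areas)) / total_area
        for d in range(dim)
    ]


def median_pool(vectors):
    """Unweighted per-dimension median, as an alternative summarisation."""
    dim = len(vectors[0])
    return [median(vec[d] for vec in vectors) for d in range(dim)]


# Two cells overlapping one polygon with areas 1 and 3 (illustrative values only).
pooled = area_weighted_mean([[1.0, 0.0], [3.0, 4.0]], [1.0, 3.0])
```

Applying the same pooling function to both the embeddings and the covariates is what would make the comparison a matched one in the sense the referee asks for.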
Circularity Check
No circularity in empirical benchmark of foundation-model embeddings
Full rationale
The paper reports an empirical benchmark comparing PDFM embeddings to harmonized geospatial covariates for subnational population estimation in Brazil, Nigeria and the United States. Results are obtained via geographically structured validation measuring reductions in unexplained variance and KL divergence on held-out data. No derivations, equations, fitted parameters renamed as predictions, or self-citation chains appear in the load-bearing steps; the claimed improvements are measured directly against external data splits rather than being forced by construction or internal definitions.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Geographically structured validation sufficiently prevents spatial autocorrelation leakage in performance estimates.
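One common way to implement geographically structured validation is spatial block cross-validation, where neighbouring units are kept in the same fold so that a model is never tested on units adjacent to its training data. The sketch below is an illustration under that assumption. The paper's actual fold construction is not described on this page, and the grid-block scheme, block size, and function names here are hypothetical.

```python
import math
from collections import defaultdict


def block_id(lon, lat, block_size_deg):
    """Map a centroid to the coarse grid block that contains it."""
    return (math.floor(lon / block_size_deg), math.floor(lat / block_size_deg))


def spatial_block_folds(centroids, block_size_deg=1.0, n_folds=5):
    """Return a fold index per unit; all units in one block share a fold."""
    blocks = defaultdict(list)
    for idx, (lon, lat) in enumerate(centroids):
        blocks[block_id(lon, lat, block_size_deg)].append(idx)

    folds = [0] * len(centroids)
    fold_sizes = [0] * n_folds
    # Assign the largest blocks first, each to the currently smallest fold.
    for key in sorted(blocks, key=lambda k: (-len(blocks[k]), k)):
        target = fold_sizes.index(min(fold_sizes))
        for idx in blocks[key]:
            folds[idx] = target
        fold_sizes[target] += len(blocks[key])
    return folds
```

Larger blocks reduce leakage from spatial autocorrelation but leave fewer independent groups to split, so the block size is itself a choice that can affect the reported performance estimates.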