arXiv: 2604.03456 · v1 · submitted 2026-04-03 · cs.LG · cs.CY

Earth Embeddings Reveal Diverse Urban Signals from Space

Wenjing Gong, Udbhav Srivastava, Yuchen Wang, Yuhao Jia, Qifan Wu, Weishan Bai, Yifan Yang, Xiao Huang, Xinyue Ye

Pith reviewed 2026-05-13 19:25 UTC · model grok-4.3

keywords: earth embeddings, satellite imagery, urban indicators, remote sensing, neighborhood monitoring, supervised learning, built environment

The pith

Satellite-derived Earth embeddings predict neighborhood urban indicators such as health burdens and commuting modes across U.S. cities.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper evaluates whether compact vector representations extracted from satellite images, called Earth embeddings, can serve as low-cost proxies for traditional census and survey data on urban conditions. Using a single supervised learning setup, it tests three families of embeddings on the task of predicting 14 neighborhood indicators spanning crime, income, health, and travel across six U.S. metropolitan areas from 2020 to 2023. The embeddings capture substantial variation, performing best on outcomes tied to physical structure such as chronic health burdens and dominant commuting modes, while showing weaker results for behaviors more shaped by policy or individual choice such as cycling rates. Performance remains stable from year to year yet differs noticeably between cities, and compact 64-dimensional versions retain more useful signal than dimensionality-reduced versions of larger embeddings. This points to a practical route for frequent, scalable neighborhood monitoring aligned with sustainable development goals.

Core claim

Earth embeddings from models such as AlphaEarth, Prithvi, and Clay, when fed into a unified supervised learning framework, predict 14 neighborhood-level urban indicators with meaningful accuracy; skill is highest for outcomes directly linked to built-environment structure including chronic health burdens and dominant commuting modes, remains comparatively stable across years, and varies across cities in ways associated with urban form.

What carries the argument

A unified supervised learning framework that treats Earth embeddings as input features to predict urban indicators, with systematic comparisons across embedding families, evaluation settings (global, city-wise, year-wise, city-year), and controlled dimensionality reductions.
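The four evaluation settings can be read as four ways of partitioning the same tract-level records before fitting and scoring. The sketch below illustrates that reading only. The page does not give the paper's exact definitions, so the `SETTINGS` mapping, the `group_records` helper, the record fields, and the placeholder city names are all assumptions.

```python
from collections import defaultdict

# Hypothetical key functions, one per evaluation setting named in the paper.
# Each maps a record to the group it would be fitted and scored in.
SETTINGS = {
    "global": lambda rec: "all",
    "city-wise": lambda rec: rec["city"],
    "year-wise": lambda rec: rec["year"],
    "city-year": lambda rec: (rec["city"], rec["year"]),
}


def group_records(records, setting):
    """Partition records into the groups implied by one evaluation setting."""
    key_fn = SETTINGS[setting]
    groups = defaultdict(list)
    for rec in records:
        groups[key_fn(rec)].append(rec)
    return dict(groups)


# Toy records with placeholder city names and years; no real data.
records = [
    {"city": city, "year": year, "tract": tract}
    for city in ("city_a", "city_b", "city_c")
    for year in (2020, 2021)
    for tract in range(2)
]

for name in SETTINGS:
    print(name, len(group_records(records, name)))
```

Under this reading, the global setting pools everything into one group, while the city-year setting yields one group per city and year, which is where small sample sizes would bite first.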

If this is right

Earth embeddings provide scalable, frequently updatable features for neighborhood-scale urban monitoring aligned with SDG targets.
Predictive performance is strongest for indicators most directly tied to visible built-environment structure.
Cross-city differences in accuracy track urban form in task-specific ways.
Compact 64-dimensional embeddings remain more informative than 64-dimensional reductions of larger models.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The approach could support low-cost tracking of urban change in regions lacking regular census coverage.
Task-specific performance gaps suggest opportunities to refine embeddings by incorporating more fine-scale behavioral cues.
Linking accuracy variation to measurable urban-form metrics could guide future embedding design.

Load-bearing premise

The chosen supervised learning framework and selected indicators accurately reflect transferable urban signals from the embeddings without substantial domain shift or unmeasured confounding across cities and years.

What would settle it

Finding that the embeddings lose all predictive power above simple baselines for health burdens and commuting modes when tested on a new city or future year outside the six metropolitan areas and 2020-2023 window.
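The test described above amounts to a held-out-city comparison against a constant baseline. The following is a minimal sketch of that comparison on synthetic data, not the paper's protocol. The closed-form ridge learner, the helper names `fit_ridge`, `r2`, and `leave_one_city_out`, the `alpha` value, and the placeholder city labels are assumptions made for illustration.

```python
import numpy as np


def fit_ridge(X, y, alpha):
    """Closed-form ridge regression with an unpenalised intercept."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    w = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ (y - y_mean))
    return w, y_mean - x_mean @ w


def r2(y_true, y_pred):
    """Coefficient of determination relative to the held-out mean."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


def leave_one_city_out(X, y, city, alpha=1.0):
    """For each city, train on the others and score model vs. mean baseline."""
    results = {}
    for held_out in np.unique(city):
        test = city == held_out
        w, b = fit_ridge(X[~test], y[~test], alpha)
        model_r2 = r2(y[test], X[test] @ w + b)
        # Baseline: predict the training-set mean for every held-out tract.
        base_r2 = r2(y[test], np.full(test.sum(), y[~test].mean()))
        results[str(held_out)] = (model_r2, base_r2)
    return results


# Synthetic stand-in for embeddings and one indicator; no real data.
rng = np.random.default_rng(0)
X = rng.normal(size=(600, 8))
y = X @ rng.normal(size=8) + 0.1 * rng.normal(size=600)
city = np.repeat(np.array(["city_a", "city_b", "city_c"]), 200)

results = leave_one_city_out(X, y, city)
for name, (model_r2, base_r2) in results.items():
    print(name, round(model_r2, 3), round(base_r2, 3))
```

The claim would be in trouble if, on real data for an unseen city or year, the first number were no better than the second for health burdens and commuting modes.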

Original abstract

Conventional urban indicators derived from censuses, surveys, and administrative records are often costly, spatially inconsistent, and slow to update. Recent geospatial foundation models enable Earth embeddings, compact satellite image representations transferable across downstream tasks, but their utility for neighborhood-scale urban monitoring remains unclear. Here, we benchmark three Earth embedding families, AlphaEarth, Prithvi, and Clay, for urban signal prediction across six U.S. metropolitan areas from 2020 to 2023. Using a unified supervised-learning framework, we predict 14 neighborhood-level indicators spanning crime, income, health, and travel behavior, and evaluate performance under four settings: global, city-wise, year-wise, and city-year. Results show that Earth embeddings capture substantial urban variation, with the highest predictive skill for outcomes more directly tied to built-environment structure, including chronic health burdens and dominant commuting modes. By contrast, indicators shaped more strongly by fine-scale behavior and local policy, such as cycling, remain difficult to infer. Predictive performance varies markedly across cities but remains comparatively stable across years, indicating strong spatial heterogeneity alongside temporal robustness. Exploratory analysis suggests that cross-city variation in predictive performance is associated with urban form in task-specific ways. Controlled dimensionality experiments show that representation efficiency is critical: compact 64-dimensional AlphaEarth embeddings remain more informative than 64-dimensional reductions of Prithvi and Clay. This study establishes a benchmark for evaluating Earth embeddings in urban remote sensing and demonstrates their potential as scalable, low-cost features for SDG-aligned neighborhood-scale urban monitoring.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

Earth embeddings give usable signals for some built-environment urban indicators, but the cross-city performance gaps could be driven by unmeasured confounders rather than purely transferable features.

The paper's main contribution is a head-to-head benchmark of AlphaEarth, Prithvi, and Clay embeddings on 14 neighborhood indicators across six U.S. metros and four years. It shows that the embeddings, especially the compact 64-dimensional AlphaEarth versions, predict health burdens and commuting modes better than cycling or policy-heavy variables, with performance holding steady over time but varying sharply by city. The controlled dimensionality comparison and the four evaluation settings (global, city-wise, year-wise, city-year) are the clearest new pieces of evidence here. They also run an exploratory check linking performance differences to urban form, which is a reasonable next step for this kind of work. That gives applied remote-sensing and urban-data people a concrete baseline they can build on for low-cost SDG tracking proxies. The temporal robustness finding is useful and not obvious in advance. The setup is straightforward and the claims stay within what the abstract actually reports.

The soft spots sit in the methods and controls. The abstract gives no information on data splits, error bars, or statistical tests, so it is impossible to judge how stable the numbers really are. More importantly, the marked city-to-city differences are presented without mention of city fixed effects, policy covariates, or domain-adaptation steps. That leaves the stress-test concern standing: the apparent skill for health and commuting could partly reflect city-specific correlations rather than general embedding signals. If the full paper does not add those checks, the transferability story weakens.

This is the kind of incremental benchmarking paper that belongs in a remote-sensing or urban-informatics venue. A reader already working on satellite-derived urban features would get practical value from the comparison and the dimensionality results. It is not foundational, but it is solid enough to deserve referee time once the missing robustness details are supplied. I would send it to review with requests for the exact training protocol, confidence intervals, and any city-level controls they applied.

Referee Report

3 major / 1 minor

Summary. The manuscript benchmarks three Earth embedding families (AlphaEarth, Prithvi, Clay) for predicting 14 neighborhood-level urban indicators spanning crime, income, health, and travel behavior across six U.S. metropolitan areas (2020–2023). Using a unified supervised-learning framework, performance is evaluated under global, city-wise, year-wise, and city-year settings. The central claims are that embeddings capture substantial urban variation (strongest for built-environment outcomes such as chronic health burdens and dominant commuting modes), that performance varies markedly across cities but is stable across years, and that compact 64-dimensional AlphaEarth embeddings outperform dimensionality-reduced versions of the other models.

Significance. If the empirical results hold after addressing methodological transparency, the work supplies a useful benchmark for geospatial foundation models in urban remote sensing. It demonstrates the feasibility of low-cost, scalable neighborhood-scale monitoring aligned with SDG indicators and isolates the practical importance of representation efficiency and temporal robustness versus spatial heterogeneity.

major comments (3)

[Abstract / Methods] The performance differences and exploratory associations are reported without details on data splits, error bars, statistical tests, or exact model specifications (e.g., base learner, regularization, hyperparameter search). These omissions are load-bearing for the claim that embeddings deliver transferable urban signals.
[Results] Marked cross-city performance variation is reported without city fixed effects, policy covariates, or explicit domain-adaptation steps beyond the four evaluation settings. This leaves open the possibility that predictive skill for health and commuting indicators is driven by unmeasured city-level confounders rather than generalizable features from the satellite embeddings.
[Methods] The unified supervised-learning framework is not specified with respect to handling of domain shift across cities/years or the precise construction of the 64-dimensional reductions used in the controlled dimensionality experiments; without these details the efficiency claim for AlphaEarth cannot be fully evaluated.

minor comments (1)

[Abstract] Consider reporting the total number of neighborhoods and the exact temporal coverage per city to help readers gauge the scale and balance of the benchmark.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive comments, which have helped us improve the clarity and transparency of the manuscript. We have revised the paper to address the major concerns regarding methodological details, potential confounders, and domain shift. Our point-by-point responses follow.

Referee: [Abstract / Methods] The performance differences and exploratory associations are reported without details on data splits, error bars, statistical tests, or exact model specifications (e.g., base learner, regularization, hyperparameter search). These omissions are load-bearing for the claim that embeddings deliver transferable urban signals.

Authors: We agree that these details are essential for reproducibility and to substantiate our claims. In the revised manuscript we have added a new subsection in Methods titled 'Supervised Learning Pipeline' that specifies: (i) data splits as random 70/15/15 train/validation/test partitions with city-year stratification to prevent leakage; (ii) error bars as standard deviations across 5-fold cross-validation; (iii) statistical comparisons via paired t-tests with Bonferroni correction, with p-values now reported in all result tables; and (iv) exact model specifications (ridge regression with L2 regularization, regularization strength selected by grid search over {0.01, 0.1, 1, 10, 100} on the validation fold). The abstract has been updated to reference the cross-validated evaluation protocol. revision: yes
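The pipeline the simulated authors describe (a 70/15/15 split stratified by city-year, ridge regression, and a grid over {0.01, 0.1, 1, 10, 100} scored on the validation portion) can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' code. The helper names `stratified_split`, `fit_ridge`, `mse`, and `select_alpha`, the closed-form solver, the use of validation MSE as the selection criterion, and the synthetic data are all hypothetical. The 5-fold error bars and paired t-tests mentioned in the response are omitted.

```python
import numpy as np

ALPHAS = [0.01, 0.1, 1.0, 10.0, 100.0]  # grid quoted in the response


def stratified_split(strata, rng, fractions=(0.70, 0.15)):
    """Split indices into train/val/test inside each city-year stratum."""
    train, val, test = [], [], []
    for label in np.unique(strata):
        idx = rng.permutation(np.flatnonzero(strata == label))
        n_train = int(round(fractions[0] * len(idx)))
        n_val = int(round(fractions[1] * len(idx)))
        train.extend(idx[:n_train])
        val.extend(idx[n_train:n_train + n_val])
        test.extend(idx[n_train + n_val:])  # remainder, about 15%
    return np.array(train), np.array(val), np.array(test)


def fit_ridge(X, y, alpha):
    """Closed-form ridge regression with an unpenalised intercept."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    w = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ (y - y_mean))
    return w, y_mean - x_mean @ w


def mse(y_true, y_pred):
    return float(np.mean((y_true - y_pred) ** 2))


def select_alpha(X, y, train, val, alphas=ALPHAS):
    """Pick the regularization strength with the lowest validation MSE."""
    scores = {}
    for alpha in alphas:
        w, b = fit_ridge(X[train], y[train], alpha)
        scores[alpha] = mse(y[val], X[val] @ w + b)
    return min(scores, key=scores.get), scores


# Synthetic stand-in: 4 city-year strata of 100 tracts, 16-d features.
rng = np.random.default_rng(0)
strata = np.repeat(
    np.array(["city_a-2020", "city_a-2021", "city_b-2020", "city_b-2021"]), 100
)
X = rng.normal(size=(len(strata), 16))
y = X @ rng.normal(size=16) + 0.5 * rng.normal(size=len(strata))

train, val, test = stratified_split(strata, rng)
best_alpha, val_scores = select_alpha(X, y, train, val)
w, b = fit_ridge(X[train], y[train], best_alpha)
test_mse = mse(y[test], X[test] @ w + b)
print(best_alpha, round(test_mse, 3))
```

Whether the final model is refit on train plus validation before testing is not stated in the response; the sketch refits on the training portion only.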
Referee: [Results] Marked cross-city performance variation is reported without city fixed effects, policy covariates, or explicit domain-adaptation steps beyond the four evaluation settings. This leaves open the possibility that predictive skill for health and commuting indicators is driven by unmeasured city-level confounders rather than generalizable features from the satellite embeddings.

Authors: We acknowledge the possibility of city-level confounders. Our design intentionally omits additional covariates to isolate the raw predictive signal contained in the embeddings. The city-wise and city-year settings already provide within-city estimates that are less affected by between-city differences, while the global setting quantifies transfer. We have added a paragraph in the Discussion section explicitly noting the absence of city fixed effects or policy covariates and discussing how unmeasured factors (e.g., local zoning or enforcement) may contribute to observed cross-city variation. We also report an exploratory correlation between performance gaps and urban-form metrics (density, land-use entropy) to partially address the concern. No further domain-adaptation steps were introduced, as that would alter the benchmark's focus on zero-shot transferability. revision: partial
Referee: [Methods] The unified supervised-learning framework is not specified with respect to handling of domain shift across cities/years or the precise construction of the 64-dimensional reductions used in the controlled dimensionality experiments; without these details the efficiency claim for AlphaEarth cannot be fully evaluated.

Authors: We have expanded the Methods section to clarify both points. Domain shift is addressed solely through the four evaluation settings (global for cross-city transfer, city-wise for local performance, year-wise for temporal stability, and city-year for joint generalization); no explicit adaptation techniques such as adversarial training or fine-tuning were applied. For the dimensionality-controlled experiments, the 64-dimensional versions of Prithvi and Clay were obtained by PCA on the original embeddings, retaining the top 64 principal components (cumulative explained variance now stated as 82% for Prithvi and 76% for Clay in the supplementary material). AlphaEarth embeddings were used in their native 64-dimensional form. These additions allow direct evaluation of the efficiency claim. revision: yes
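The reduction described in this response (PCA keeping the top 64 principal components, with cumulative explained variance reported) can be sketched with a plain SVD. This is a rough illustration only. The native dimensionality of the Prithvi and Clay embeddings is not stated on this page, so the 256-dimensional input is a placeholder, and the `pca_reduce` helper and the synthetic data are assumptions. Whether the components were fitted on training data only or on all tracts is also not specified.

```python
import numpy as np


def pca_reduce(X, k):
    """Project centred data onto its top-k principal components.

    Returns the reduced matrix and the fraction of total variance
    captured by those k components.
    """
    Xc = X - X.mean(axis=0)
    # Rows of Vt are principal directions, ordered by singular value.
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = float(np.sum(S[:k] ** 2) / np.sum(S ** 2))
    return Xc @ Vt[:k].T, explained


# Synthetic stand-in for a larger embedding: low-rank signal plus noise.
# The 256-d width is a placeholder, not a property of any real model.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 32))
mixing = rng.normal(size=(32, 256))
X_large = latent @ mixing + 0.1 * rng.normal(size=(500, 256))

Z, explained = pca_reduce(X_large, 64)
print(Z.shape, round(explained, 3))
```

The explained-variance figure is what makes the comparison with native 64-dimensional embeddings interpretable: a reduction that discards roughly a fifth to a quarter of the variance, as the stated 82% and 76% imply, is competing at a handicap that the reader should keep in mind.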

Circularity Check

0 steps flagged

No circularity: empirical benchmarking of pre-trained embeddings on external urban indicators

The paper conducts supervised benchmarking of three external pre-trained Earth embedding models (AlphaEarth, Prithvi, Clay) to predict 14 neighborhood indicators drawn from independent census, survey, and administrative sources across six U.S. cities and four years. Performance is measured via standard metrics under global, city-wise, year-wise, and city-year splits with no equations, fitted parameters renamed as predictions, or self-referential definitions. No load-bearing self-citations, uniqueness theorems, or ansatzes imported from the authors' prior work appear; the embeddings are treated as fixed inputs and the indicators as external ground truth. The central claims rest on observable predictive skill differences (e.g., higher for built-environment outcomes) that remain testable against held-out data and do not reduce to the input embeddings by construction. This is a standard transfer-learning evaluation with no derivation chain that collapses to its own inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Only the abstract is available; the central claim rests on the unstated assumption that the embeddings encode transferable urban signals and that the supervised setup isolates those signals from city-specific confounders.

axioms (1)

Domain assumption: Earth embeddings are transferable across downstream urban prediction tasks without major retraining
Invoked by the unified supervised-learning framework and cross-city evaluation

Reference graph

Works this paper leans on

66 extracted references · 66 canonical work pages · 1 internal anchor

[1]

World Health Organization (2010)

World Health Organization: Hidden Cities: Unmasking and Overcoming Health Inequities in Urban Settings. World Health Organization (2010)

work page 2010
[2]

UN-Habitat (2024)

UN-Habitat: World Cities Report 2024: Cities and Climate Action|UN-Habitat. UN-Habitat (2024)

work page 2024
[3]

United Nations (2025)

United Nations: World Urbanization Prospects 2025: Summary of Results|Population Division. United Nations (2025)

work page 2025
[4]

https://www.undp.org/sustainable- development-goals

UNDP: United Nations Development Programme. https://www.undp.org/sustainable- development-goals

work page
[5]

United Nations

United Nations: Unsdg|Leave No One Behind. United Nations

work page
[6]

United Nations

United Nations: IAEG-SDGs — SDG Indicators. United Nations

work page
[7]

https://www.census.gov/programs-surveys/decennial-census/about.html (2021)

Bureau, U.C.: About the Decennial Census of Population and Housing. https://www.census.gov/programs-surveys/decennial-census/about.html (2021)

work page 2021
[8]

https://www.census.gov/programs- surveys/ahs.html (2025)

Bureau, U.C.: American Housing Survey (AHS). https://www.census.gov/programs- surveys/ahs.html (2025)

work page 2025
[9]

https://www.census.gov/topics/research/guidance/restricted- use-microdata/administrative-data.html (2024)

Bureau, U.C.: Administrative Data. https://www.census.gov/topics/research/guidance/restricted- use-microdata/administrative-data.html (2024)

work page 2024
[10]

Proceedings of the National Academy of Sciences120(27), 2220417120 (2023) https://doi.org/10.1073/pnas.2220417120

Fan, Z., Zhang, F., Loo, B.P.Y., Ratti, C.: Urban visual intelligence: Uncovering hidden city profiles with street view images. Proceedings of the National Academy of Sciences120(27), 2220417120 (2023) https://doi.org/10.1073/pnas.2220417120

work page doi:10.1073/pnas.2220417120 2023
[11]

Landscape and Urban Planning215, 104217 (2021) https://doi.org/10.1016/j.landurbplan.2021.104217

Biljecki, F., Ito, K.: Street view imagery in urban analytics and GIS: A review. Landscape and Urban Planning215, 104217 (2021) https://doi.org/10.1016/j.landurbplan.2021.104217

work page doi:10.1016/j.landurbplan.2021.104217 2021
[12]

Computers, Environment and Urban Systems117, 102253 (2025) https://doi.org/10.1016/j.compenvurbsys.2025.102253

Fan, Z., Feng, C.-C., Biljecki, F.: Coverage and bias of street view imagery in mapping the urban environment. Computers, Environment and Urban Systems117, 102253 (2025) https://doi.org/10.1016/j.compenvurbsys.2025.102253

work page doi:10.1016/j.compenvurbsys.2025.102253 2025
[13]

In: 2009 IEEE 12th International Conference on Computer Vision, pp

Frome, A., Cheung, G., Abdulkader, A., Zennaro, M., Wu, B., Bissacco, A., Adam, H., Neven, H., Vincent, L.: Large-scale privacy protection in Google Street View. In: 2009 IEEE 12th International Conference on Computer Vision, pp. 2373–2380 (2009). https: //doi.org/10.1109/ICCV.2009.5459413 13

work page doi:10.1109/iccv.2009.5459413 2009
[14]

Proceedings of the National Academy of Sciences114(50), 13108–13113 (2017) https://doi.org/10.1073/pnas.1700035114

Gebru, T., Krause, J., Wang, Y., Chen, D., Deng, J., Aiden, E.L., Fei-Fei, L.: Using deep learning and Google Street View to estimate the demographic makeup of neighborhoods across the United States. Proceedings of the National Academy of Sciences114(50), 13108–13113 (2017) https://doi.org/10.1073/pnas.1700035114

work page doi:10.1073/pnas.1700035114 2017
[15]

Remote Sensing of Environment210, 113–143 (2018) https: //doi.org/10.1016/j.rse.2018.03.017

Rom´ an, M.O., Wang, Z., Sun, Q., Kalb, V., Miller, S.D., Molthan, A., Schultz, L., Bell, J., Stokes, E.C., Pandey, B., Seto, K.C., Hall, D., Oda, T., Wolfe, R.E., Lin, G., Golpayegani, N., Devadiga, S., Davidson, C., Sarkar, S., Praderas, C., Schmaltz, J., Boller, R., Stevens, J., Ramos Gonz´ alez, O.M., Padilla, E., Alonso, J., Detr´ es, Y., Armstrong, ...

work page doi:10.1016/j.rse.2018.03.017 2018
[16]

Remote Sensing of Environment280, 113195 (2022) https://doi.org/10.1016/j.rse.2022.113195

Wulder, M.A., Roy, D.P., Radeloff, V.C., Loveland, T.R., Anderson, M.C., Johnson, D.M., Healey, S., Zhu, Z., Scambos, T.A., Pahlevan, N., Hansen, M., Gorelick, N., Crawford, C.J., Masek, J.G., Hermosilla, T., White, J.C., Belward, A.S., Schaaf, C., Woodcock, C.E., Hunt- ington, J.L., Lymburner, L., Hostert, P., Gao, F., Lyapustin, A., Pekel, J.-F., Strobl...

work page doi:10.1016/j.rse.2022.113195 2022
[17]

Drusch, U

Drusch, M., Del Bello, U., Carlier, S., Colin, O., Fernandez, V., Gascon, F., Hoersch, B., Isola, C., Laberinti, P., Martimort, P., Meygret, A., Spoto, F., Sy, O., Marchese, F., Bargellini, P.: Sentinel-2: ESA’s Optical High-Resolution Mission for GMES Operational Services. Remote Sensing of Environment120, 25–36 (2012) https://doi.org/10.1016/j.rse.2011.11.026

work page doi:10.1016/j.rse.2011.11.026 2012
[18]

Mellander, C., Lobo, J., Stolarick, K., Matheson, Z.: Night-Time Light Data: A Good Proxy Measure for Economic Activity? PLoS ONE10(10), 0139779 (2015) https://doi.org/10.1371/ journal.pone.0139779

work page 2015
[19]

IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing13, 5251–5263 (2020) https://doi.org/10.1109/JSTARS

Stark, T., Wurm, M., Zhu, X.X., Taubenb¨ ock, H.: Satellite-Based Mapping of Urban Poverty With Transfer-Learned Slum Morphologies. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing13, 5251–5263 (2020) https://doi.org/10.1109/JSTARS. 2020.3018862

work page doi:10.1109/jstars 2020
[20]

The World Bank Economic Review36(2), 382–412 (2022) https://doi.org/10.1093/wber/lhab015

Engstrom, R., Hersh, J., Newhouse, D.: Poverty from Space: Using High Resolution Satellite Imagery for Estimating Economic Well-being. The World Bank Economic Review36(2), 382–412 (2022) https://doi.org/10.1093/wber/lhab015

work page doi:10.1093/wber/lhab015 2022
[21]

IEEE Geoscience and Remote Sensing Magazine 4(2), 41–57 (2016) https://doi.org/10.1109/MGRS.2016.2548504

Tuia, D., Persello, C., Bruzzone, L.: Domain Adaptation for the Classification of Remote Sens- ing Data: An Overview of Recent Advances. IEEE Geoscience and Remote Sensing Magazine 4(2), 41–57 (2016) https://doi.org/10.1109/MGRS.2016.2548504

work page doi:10.1109/mgrs.2016.2548504 2016
[22]

arXiv (2025)

Xiao, A., Xuan, W., Wang, J., Huang, J., Tao, D., Lu, S., Yokoya, N.: Foundation Models for Remote Sensing and Earth Observation: A Survey. arXiv (2025). https://doi.org/10.48550/ arXiv.2410.16602

work page arXiv 2025
[23]

Alphaearth foundations: An embedding field model for accurate and efficient global mapping from sparse label data.arXiv preprint arXiv:2507.22291, 2025

Brown, C.F., Kazmierski, M.R., Pasquarella, V.J., Rucklidge, W.J., Samsikova, M., Zhang, C., Shelhamer, E., Lahera, E., Wiles, O., Ilyushchenko, S., Gorelick, N., Zhang, L.L., Alj, S., Schechter, E., Askay, S., Guinan, O., Moore, R., Boukouvalas, A., Kohli, P.: AlphaEarth Foundations: An Embedding Field Model for Accurate and Efficient Global Mapping from...

work page doi:10.48550/arxiv.2507.22291 2025
[24]

a rXiv preprint arXiv:2412.02732 (2024)

Szwarcman, D., Roy, S., Fraccaro, P., G´ ıslason, T.E., Blumenstiel, B., Ghosal, R., Oliveira, P.H., Almeida, J.L.d.S., Sedona, R., Kang, Y., Chakraborty, S., Wang, S., Gomes, C., Kumar, A., Truong, M., Godwin, D., Lee, H., Hsu, C.-Y., Asanjan, A.A., Mujeci, B., Shidham, D., Keenan, T., Arevalo, P., Li, W., Alemohammad, H., Olofsson, P., Hain, C., Kennedy...

work page doi:10.48550/arxiv.2412.02732 2025
[25]

https://clay- foundation.github.io/model/

Clay Foundation Model — Clay Foundation Model. https://clay- foundation.github.io/model/

work page
[26]

Klemmer, K., Rolf, E., Russwurm, M., Camps-Valls, G., Czerkawski, M., Ermon, S., Francis, A., Jacobs, N., Kerner, H.R., Mackey, L., Mai, G., Aodha, O.M., Reichstein, M., Robinson, C., Rolnick, D., Shelhamer, E., Sitzmann, V., Tuia, D., Zhu, X.: Earth Embeddings: Towards AI-centric Representations of our Planet (2025)

work page 2025
[27]

Drusch, U

Fang, H., Liang, S., Li, W., Chen, Y., Ma, H., Xu, J., Ma, Y., He, T., Tian, F., Zhang, F., Liang, H.: Generating an annual 30 m rice cover product for monsoon Asia (2018–2023) using harmonized Landsat and Sentinel-2 data and the NASA-IBM geospatial foundation model. Remote Sensing of Environment335, 115256 (2026) https://doi.org/10.1016/j.rse. 2026.115256

work page doi:10.1016/j.rse 2018
[28]

https://doc.arcgis.com/en/pretrained-models/latest/imagery/introduction-to-prithvi-flood- segmentation.htm?utm source=chatgpt.com (2024)

Introduction to the Model—ArcGIS Pretrained Models|Documentation. https://doc.arcgis.com/en/pretrained-models/latest/imagery/introduction-to-prithvi-flood- segmentation.htm?utm source=chatgpt.com (2024)

work page 2024
[29]

Wiratama, W., Chong, M.K., Lim, Y.L., Ho, C.J.: Comparative Analysis of Fine-Tuned Foundation Models for Land Cover Classification using Sentinel-2 Imagery, Study Area: Sumatra and Kalimantan, Indonesia. The International Archives of the Photogramme- try, Remote Sensing and Spatial Information SciencesXL VIII-G-2025, 1559–1564 (2025) https://doi.org/10.51...

work page doi:10.5194/isprs-archives-xlviii-g-2025-1559-2025 2025
[30]

Remote Sensing17(20) (2025) https://doi.org/10.3390/rs17203472

Alvarez, C.I., Vaca, C.A.U., Llumipanta, N.A.E.: Machine Learning for Urban Air Quality Prediction Using Google AlphaEarth Foundations Satellite Embeddings: A Case Study of Quito, Ecuador. Remote Sensing17(20) (2025) https://doi.org/10.3390/rs17203472

work page doi:10.3390/rs17203472 2025
[31]

In: NeurIPS 2025 Workshop on Tackling Climate Change with Machine Learning (2025)

Ashfaq, H., Arsal, M., Ashfaq, A.: Theory-guided deep learning with alphaearth embeddings for flash flood prediction in data-scarce regions. In: NeurIPS 2025 Workshop on Tackling Climate Change with Machine Learning (2025). https://www.climatechange.ai/papers/neurips2025/98

work page 2025
[32]

Harvesting AlphaEarth: Benchmarking the Geospatial Foundation Model for Agricultural Downstream Tasks

Ma, Y., Shen, Y., Swatantran, A., Lobell, D.B.: Harvesting AlphaEarth: Benchmarking the Geospatial Foundation Model for Agricultural Downstream Tasks. arXiv (2025). https://doi. org/10.48550/arXiv.2601.00857

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2601.00857 2025
[33]

Current Environmental Health Reports9(1), 80–89 (2022) https://doi.org/10.1007/ s40572-022-00336-w

Smith, G.S., Anjum, E., Francis, C., Deanes, L., Acey, C.: Climate Change, Environ- mental Disasters, and Health Inequities: The Underlying Role of Structural Inequali- ties. Current Environmental Health Reports9(1), 80–89 (2022) https://doi.org/10.1007/ s40572-022-00336-w

work page 2022
[34]

Journal of International Development35(7), 1753–1768 (2023) https://doi.org/10.1002/jid

Hall, O., Dompae, F., Wahab, I., Dzanku, F.M.: A review of machine learning and satel- lite imagery for poverty prediction: Implications for development research and applications. Journal of International Development35(7), 1753–1768 (2023) https://doi.org/10.1002/jid. 3751

work page doi:10.1002/jid 2023
[35]

Urban Forestry & Urban Greening117, 129264 (2026) https://doi.org/10.1016/j.ufug.2026

Gong, W., Wu, L., Zhu, C., Song, Y., Ye, X.: Revealing park visitation under dual environ- mental threats in a socially stratified city: Evidence from smartphone mobility data in Dallas. Urban Forestry & Urban Greening117, 129264 (2026) https://doi.org/10.1016/j.ufug.2026. 129264

work page doi:10.1016/j.ufug.2026 2026
[36]

Nature Communications16(1), 10372 (2025) https://doi.org/10.1038/s41467-025-65373-z

Xu, Y., Gao, S., Huang, Q., G¨ o¸ cmen, A., Zhu, Q., Zhang, F.: Predicting human mobility 15 flows in cities using deep learning on satellite imagery. Nature Communications16(1), 10372 (2025) https://doi.org/10.1038/s41467-025-65373-z

work page doi:10.1038/s41467-025-65373-z 2025
[37]

JAMA Cardiology9(6), 556–564 (2024) https://doi

Chen, Z., Dazard, J.-E., Khalifa, Y., Motairek, I., Kreatsoulas, C., Rajagopalan, S., Al- Kindi, S.: Deep Learning–Based Assessment of Built Environment From Satellite Images and Cardiometabolic Disease Prevalence. JAMA Cardiology9(6), 556–564 (2024) https://doi. org/10.1001/jamacardio.2024.0749

work page doi:10.1001/jamacardio.2024.0749 2024
[38]

JAMA Network Open 7(12), 2449113 (2024) https://doi.org/10.1001/jamanetworkopen.2024.49113

Yi, L., Harnois-Leblanc, S., Rifas-Shiman, S.L., Suel, E., Pescador Jimenez, M., Lin, P.-I.D., Hystad, P., Hankey, S., Zhang, W., Hivert, M.-F., Oken, E., Aris, I.M., James, P.: Satellite- Based and Street-View Green Space and Adiposity in US Children. JAMA Network Open 7(12), 2449113 (2024) https://doi.org/10.1001/jamanetworkopen.2024.49113

work page doi:10.1001/jamanetworkopen.2024.49113 2024
[39]

An improved NExT-DMD for efficient automated operational modal analysis.Applied Mathematical Modelling2026,156, 116823

Pucher, J., Dill, J., Handy, S.: Infrastructure, programs, and policies to increase bicycling: An international review. Preventive Medicine50, 106–125 (2010) https://doi.org/10.1016/j. ypmed.2009.07.028

work page doi:10.1016/j 2010
[40]

Transport Reviews 36(1), 9–27 (2016) https://doi.org/10.1080/01441647.2015.1069908

Buehler, R., Dill, J.: Bikeway Networks: A Review of Effects on Cycling. Transport Reviews 36(1), 9–27 (2016) https://doi.org/10.1080/01441647.2015.1069908

work page doi:10.1080/01441647.2015.1069908 2016
[41]

Science353(6301), 790–794 (2016) https://doi.org/10.1126/science.aaf7894

Jean, N., Burke, M., Xie, M., Alampay Davis, W.M., Lobell, D.B., Ermon, S.: Combining satellite imagery and machine learning to predict poverty. Science353(6301), 790–794 (2016) https://doi.org/10.1126/science.aaf7894

work page doi:10.1126/science.aaf7894 2016
[42]

Random Structures & Algorithms22(1), 60–65 (2003) https://doi.org/10.1002/rsa.10073

Dasgupta, S., Gupta, A.: An elementary proof of a theorem of Johnson and Lindenstrauss. Random Structures & Algorithms22(1), 60–65 (2003) https://doi.org/10.1002/rsa.10073

work page doi:10.1002/rsa.10073 2003
[43]

In: IGARSS 2024 - 2024 IEEE International Geoscience and Remote Sensing Symposium, pp

Blumenstiel, B., Moor, V., Kienzler, R., Brunschwiler, T.: Multi-Spectral Remote Sens- ing Image Retrieval Using Geospatial Foundation Models. In: IGARSS 2024 - 2024 IEEE International Geoscience and Remote Sensing Symposium, pp. 7286–7291 (2024). https: //doi.org/10.1109/IGARSS53475.2024.10641903

work page doi:10.1109/igarss53475.2024.10641903 2024
[44]

In: Proceedings of the 31st International Conference on Neural Information Processing Systems

Lakshminarayanan, B., Pritzel, A., Blundell, C.: Simple and scalable predictive uncertainty estimation using deep ensembles. In: Proceedings of the 31st International Conference on Neural Information Processing Systems. NIPS’17, pp. 6405–6416. Curran Associates Inc., Red Hook, NY, USA (2017)

[45]

Morgan, R.E., Smith, E.L.: The National Crime Victimization Survey and National Incident-Based Reporting System: A Complementary Picture of Crime in 2022. Bureau of Justice Statistics (2023)

[46]

Zhou, K., Liu, Z., Qiao, Y., Xiang, T., Loy, C.C.: Domain Generalization: A Survey. IEEE Transactions on Pattern Analysis and Machine Intelligence 45(4), 4396–4415 (2023) https://doi.org/10.1109/TPAMI.2022.3195549

[47]

Pettersson, M.B., Kakooei, M., Ortheden, J., Johansson, F.D., Daoud, A.: Time Series of Satellite Imagery Improve Deep Learning Estimates of Neighborhood-Level Poverty in Africa. In: Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, pp. 6165–6173. International Joint Conferences on Artificial Intelligence Organiza...

[48]

Cong, Y., Khanna, S., Meng, C., Liu, P., Rozi, E., He, Y., Burke, M., Lobell, D.B., Ermon, S.: SatMAE: Pre-training Transformers for Temporal and Multi-Spectral Satellite Imagery. Advances in Neural Information Processing Systems 35, 197–211 (2022)

[49]

Al-Emadi, S.A., Yang, Y., Ofli, F.: Analysing Satellite Imagery Classification under Spatial Domain Shift across Geographic Regions. International Journal of Computer Vision 133(11), 7672–7709 (2025) https://doi.org/10.1007/s11263-025-02518-z

[50]

Ovadia, Y., Fertig, E., Ren, J., Nado, Z., Sculley, D., Nowozin, S., Dillon, J.V., Lakshminarayanan, B., Snoek, J.: Can you trust your model’s uncertainty? Evaluating predictive uncertainty under dataset shift. Curran Associates Inc., Red Hook, NY, USA (2019)

[51]

Buzzelli, M.: Modifiable Areal Unit Problem. International Encyclopedia of Human Geography, 169–173 (2020) https://doi.org/10.1016/B978-0-08-102295-5.10406-8

[52]

Marquet, O., Hipp, J.A., Alberico, C., Huang, J.-H., Fry, D., Mazak, E., Lovasi, G.S., Floyd, M.F.: Short-term associations between objective crime, park-use, and park-based physical activity in low-income neighborhoods. Preventive Medicine 126, 105735 (2019)

[53]

U.S. Census Bureau: American Community Survey Data. https://www.census.gov/programs-surveys/acs/data.html (2025)

[54]

Centers for Disease Control and Prevention: Data|Centers for Disease Control and Prevention. https://data.cdc.gov/ (2025)

[55]

Zdaniuk, B.: Ordinary least-squares (OLS) model. In: Encyclopedia of Quality of Life and Well-Being Research, pp. 4515–4517. Springer, Dordrecht (2014). https://doi.org/10.1007/978-94-007-0753-5_2008

[56]

Breiman, L.: Random Forests. Machine Learning 45(1), 5–32 (2001) https://doi.org/10.1023/A:1010933404324

[57]

Chen, T., Guestrin, C.: XGBoost: A Scalable Tree Boosting System. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 785–794 (2016). https://doi.org/10.1145/2939672.2939785

[58]

Ke, G., Meng, Q., Finley, T., Wang, T., Chen, W., Ma, W., Ye, Q., Liu, T.-Y.: LightGBM: A highly efficient gradient boosting decision tree. In: Proceedings of the 31st International Conference on Neural Information Processing Systems. NIPS’17, pp. 3149–3157. Curran Associates Inc., Red Hook, NY, USA (2017)

[59]

Hu, M.-C., Pavlicova, M., Nunes, E.V.: Zero-inflated and Hurdle Models of Count Data with Extra Zeros: Examples from an HIV-Risk Reduction Intervention Trial. The American Journal of Drug and Alcohol Abuse 37(5), 367–375 (2011) https://doi.org/10.3109/00952990.2011.597280

[60]

U.S. Census Bureau: Metropolitan and Micropolitan Statistical Areas Population Totals: 2020-2024. https://www.census.gov/data/tables/time-series/demo/popest/2020s-total-metro-and-micro-statistical-areas.html

[61]

US EPA, OLEM: Smart Location Mapping. https://www.epa.gov/smartgrowth/smart-location-mapping (2021)

[62]

Jolliffe, I.T.: Principal Component Analysis. Springer Series in Statistics. Springer, New York (2002). https://doi.org/10.1007/b98835

[63]

Lawley, D.N., Maxwell, A.E.: Factor Analysis as a Statistical Method. Journal of the Royal Statistical Society. Series D (The Statistician) 12(3), 209–229 (1962) https://doi.org/10.2307/2986915

[64]

Schölkopf, B., Smola, A., Müller, K.-R.: Nonlinear Component Analysis as a Kernel Eigenvalue Problem. Neural Computation 10(5), 1299–1319 (1998) https://doi.org/10.1162/089976698300017467

[65]

Tenenbaum, J.B., Silva, V., Langford, J.C.: A Global Geometric Framework for Nonlinear Dimensionality Reduction. Science 290(5500), 2319–2323 (2000) https://doi.org/10.1126/science.290.5500.2319

[66]

Bingham, E., Mannila, H.: Random projection in dimensionality reduction: Applications to image and text data. In: Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. KDD ’01, pp. 245–250. Association for Computing Machinery, New York, NY, USA (2001). https://doi.org/10.1145/502512.502546
