AlphaEarth Foundations: An embedding field model for accurate and efficient global mapping from sparse label data

Christopher F. Brown, Michal R. Kazmierski, Valerie J. Pasquarella, William J. Rucklidge, Masha Samsikova, Chenhui Zhang · 2025 · cs.CV · arXiv 2507.22291

28 Pith papers cite this work. Polarity classification is still indexing.

abstract

Unprecedented volumes of Earth observation data are continually collected around the world, but high-quality labels remain scarce given the effort required to make physical measurements and observations. This has led to considerable investment in bespoke modeling efforts translating sparse labels into maps. Here we introduce AlphaEarth Foundations, an embedding field model yielding a highly general, geospatial representation that assimilates spatial, temporal, and measurement contexts across multiple sources, enabling accurate and efficient production of maps and monitoring systems from local to global scales. The embeddings generated by AlphaEarth Foundations are the only to consistently outperform a suite of other well-known/widely accepted featurization approaches tested on a diverse set of mapping evaluations without re-training. We have released a dataset of global, annual, analysis-ready embedding field layers from 2017 through 2024.

citation-role summary

background 3 · other 1

citation-polarity summary

background 3 · unclear 1

representative citing papers

ChronoEarth-492K: A Large Scale and Long Horizon Spatiotemporal Hyperspectral Earth Observation Dataset and Benchmark

cs.CV · 2026-05-15 · conditional · novelty 8.0

Introduces ChronoEarth-492K, a 492K-patch temporally calibrated hyperspectral dataset from the EO-1 Hyperion archive spanning 2001-2017, plus a benchmark for static, short-horizon, and long-horizon spatiotemporal tasks using open geospatial products.

Better Together: Evaluating the Complementarity of Earth Embedding Models

cs.CV · 2026-05-18 · unverdicted · novelty 7.0

Fusing embeddings from four Earth models (AlphaEarth, Tessera, GeoCLIP, SatCLIP) outperforms the best single model on four of six tasks, with gains depending on task and location.

TRAJGANR: Trajectory-Centric Urban Multimodal Learning via Geospatially Aligned Neural Representations

cs.CV · 2026-05-07 · unverdicted · novelty 7.0

TrajGANR learns continuous neural representations of trajectories to enable fine-grained alignment with street-view images and locations in a joint multimodal self-supervised objective, outperforming prior geospatial MSSL methods on urban mobility and road tasks.

UNIGEOCLIP: Unified Geospatial Contrastive Learning

cs.CV · 2026-04-13 · unverdicted · novelty 7.0

UNIGEOCLIP creates a unified embedding for aerial imagery, street views, elevation, text, and coordinates via all-to-all contrastive alignment plus a scaled lat-long encoder, outperforming single-modality and coordinate baselines on geospatial tasks.

Cross-Scale Pretraining: Enhancing Self-Supervised Learning for Low-Resolution Satellite Imagery for Semantic Segmentation

cs.CV · 2026-01-19 · unverdicted · novelty 7.0

A new spatial affinity component for self-supervised pretraining leverages high-resolution imagery to enhance mid-resolution satellite image representations and segmentation performance.

SpectralEarth-FM: Bringing Hyperspectral Imagery into Multimodal Earth Observation Pretraining

cs.CV · 2026-05-20 · unverdicted · novelty 6.0

SpectralEarth-FM is a multisensor hierarchical transformer pretrained on a 40TB co-located HSI-MSI-SAR dataset using a JEPA-style objective and reports state-of-the-art results on hyperspectral and standard EO benchmarks.

FLUXtrapolation: A benchmark on extrapolating ecosystem fluxes

cs.LG · 2026-05-19 · unverdicted · novelty 6.0

FLUXtrapolation is a benchmark for domain generalization in ecosystem flux upscaling using temporal, spatial, and temperature-based extrapolation scenarios, with pilot results showing model separation on tail and multi-scale metrics.

In-context learning enables continental-scale subsurface temperature prediction from sparse local observations

cs.LG · 2026-05-15 · unverdicted · novelty 6.0

A transformer-based in-context learning model predicts continental-scale subsurface temperatures from sparse borehole observations, outperforming physics and interpolation baselines while adapting to new regions with 20 examples.

Does Your Wildfire Prediction Model Actually Work, or Just Score Well?

cs.LG · 2026-05-14 · unverdicted · novelty 6.0 · 2 refs

Introduces WILDFIRE-FM and a fixed-contract evaluation framework demonstrating that wildfire model transfer conclusions depend strongly on evaluation design and task formulation.

No One Knows the State of the Art in Geospatial Foundation Models

cs.CV · 2026-05-12 · accept · novelty 6.0

An audit of 152 papers reveals that geospatial foundation models lack standardized evaluations, training controls, and weight releases, so no one knows the state of the art.

Predictive and Prescriptive AI toward Optimizing Wildfire Suppression

math.OC · 2026-05-06 · unverdicted · novelty 6.0

A new optimization algorithm with double machine learning for wildfire spread estimation enables better crew assignments that reduce total area burned.

A Proxy Consistency Loss for Grounded Fusion of Earth Observation and Location Encoders

cs.CV · 2026-04-20 · unverdicted · novelty 6.0

A proxy consistency loss trains location encoders on proxy geographic data to outperform direct input fusion or frozen embeddings for air quality and poverty mapping with sparse labels.

When Earth Foundation Models Meet Diffusion: An Application to Land Surface Temperature Super-Resolution

cs.CV · 2026-04-18 · unverdicted · novelty 6.0

EFDiff conditions a diffusion model with Prithvi-EO-2.0 geospatial embeddings via cross-attention to achieve 32x LST super-resolution, outperforming baselines on a global Landsat dataset.

Exploring the potential of AlphaEarth and TESSERA embeddings for Fine-scale Local Climate Zone Mapping: A case study across five cities in Switzerland

cs.LG · 2026-06-18 · unverdicted · novelty 5.0

TESSERA embeddings achieve the highest IoU (0.77-0.82) for 10m LCZ mapping across Swiss cities and outperform Sentinel-1/2 and AlphaEarth, though year-to-year transfer remains challenging.

Continuous biome representations from Earth observation embeddings

q-bio.QM · 2026-06-09 · unverdicted · novelty 5.0

Linear classifier on Clay v1.5 embeddings produces continuous biome probabilities that raise mean per-species AUC for occurrence prediction from 0.570 (discrete labels) to 0.618 on 10,015 Brazilian forest plots.

Mini-JEPA Foundation Model Fleet Enables Agentic Hydrologic Intelligence

cs.LG · 2026-05-13 · unverdicted · novelty 5.0

A fleet of sensor-specialized 22M-parameter JEPA models routed by an LLM improves LLM-as-judge scores on hydrologic questions over AlphaEarth alone with Cohen's d of 1.10.

Toward a Scientific Discovery Engine for Weather and Climate Data: A Visual Analytics Workbench for Embedding-Based Exploration

physics.data-an · 2026-05-01 · conditional · novelty 5.0

A visual analytics workbench enables scientists to explore, query, and verify embedding-based similarity searches on weather and climate data by tracing results back to physical evidence.

Transferable Human Mobility Network Reconstruction with neuroGravity

cs.AI · 2026-04-26 · unverdicted · novelty 5.0

neuroGravity reconstructs transferable human mobility networks from basic urban data via physics-informed deep learning, with transferability predicted by a spatial income segregation index.

Unlocking Multi-Spectral Data for Multi-Modal Models with Guided Inputs and Chain-of-Thought Reasoning

cs.CV · 2026-04-22 · unverdicted · novelty 5.0

A prompting-based adaptation technique lets RGB-trained LMMs process multi-spectral inputs and deliver strong zero-shot gains on remote-sensing benchmarks.

Structure-Semantic Decoupled Modulation of Global Geospatial Embeddings for High-Resolution Remote Sensing Mapping

cs.CV · 2026-04-21 · unverdicted · novelty 5.0

SSDM decouples global geospatial embeddings into structural modulation and semantic injection pathways to improve accuracy and consistency in high-resolution remote sensing land cover mapping.

HuiYanEarth-SAR: A Foundation Model for High-Fidelity and Low-Cost Global Remote Sensing Imagery Generation

cs.CV · 2026-04-13 · unverdicted · novelty 5.0

HuiYanEarth-SAR is a foundation model that generates realistic global SAR imagery from geographic coordinates alone by integrating geospatial semantics and implicit scattering characteristics.

Location Is All You Need: Continuous Spatiotemporal Neural Representations of Earth Observation Data

cs.CV · 2026-04-08 · unverdicted · novelty 5.0

LIANet encodes multi-temporal Earth observation data into a coordinate-based neural field that supports label-only fine-tuning for downstream tasks without access to raw imagery.

Earth Embeddings Reveal Diverse Urban Signals from Space

cs.LG · 2026-04-03 · unverdicted · novelty 5.0

Earth embeddings from satellite images predict neighborhood-level urban indicators with higher accuracy for built-environment outcomes than for behavior-driven ones, showing city-specific variation but year-to-year stability.

FireScope: Wildfire Risk Raster Prediction with a Chain-of-Thought Oracle

cs.CV · 2025-11-21 · unverdicted · novelty 5.0 · 2 refs

FireScope trains a VLM on US data to output wildfire risk rasters with reasoning traces and shows improved cross-continental performance on European events compared with prior approaches.

citing papers explorer

Showing 28 of 28 citing papers.

ChronoEarth-492K: A Large Scale and Long Horizon Spatiotemporal Hyperspectral Earth Observation Dataset and Benchmark
cs.CV · 2026-05-15 · conditional · none · ref 1 · internal anchor
Introduces ChronoEarth-492K, a 492K-patch temporally calibrated hyperspectral dataset from the EO-1 Hyperion archive spanning 2001-2017, plus a benchmark for static, short-horizon, and long-horizon spatiotemporal tasks using open geospatial products.

Better Together: Evaluating the Complementarity of Earth Embedding Models
cs.CV · 2026-05-18 · unverdicted · none · ref 3 · internal anchor
Fusing embeddings from four Earth models (AlphaEarth, Tessera, GeoCLIP, SatCLIP) outperforms the best single model on four of six tasks, with gains depending on task and location.

TRAJGANR: Trajectory-Centric Urban Multimodal Learning via Geospatially Aligned Neural Representations
cs.CV · 2026-05-07 · unverdicted · none · ref 3 · internal anchor
TrajGANR learns continuous neural representations of trajectories to enable fine-grained alignment with street-view images and locations in a joint multimodal self-supervised objective, outperforming prior geospatial MSSL methods on urban mobility and road tasks.

UNIGEOCLIP: Unified Geospatial Contrastive Learning
cs.CV · 2026-04-13 · unverdicted · none · ref 5 · internal anchor
UNIGEOCLIP creates a unified embedding for aerial imagery, street views, elevation, text, and coordinates via all-to-all contrastive alignment plus a scaled lat-long encoder, outperforming single-modality and coordinate baselines on geospatial tasks.

Cross-Scale Pretraining: Enhancing Self-Supervised Learning for Low-Resolution Satellite Imagery for Semantic Segmentation
cs.CV · 2026-01-19 · unverdicted · none · ref 8 · internal anchor
A new spatial affinity component for self-supervised pretraining leverages high-resolution imagery to enhance mid-resolution satellite image representations and segmentation performance.

SpectralEarth-FM: Bringing Hyperspectral Imagery into Multimodal Earth Observation Pretraining
cs.CV · 2026-05-20 · unverdicted · none · ref 11 · internal anchor
SpectralEarth-FM is a multisensor hierarchical transformer pretrained on a 40TB co-located HSI-MSI-SAR dataset using a JEPA-style objective and reports state-of-the-art results on hyperspectral and standard EO benchmarks.

FLUXtrapolation: A benchmark on extrapolating ecosystem fluxes
cs.LG · 2026-05-19 · unverdicted · none · ref 26 · internal anchor
FLUXtrapolation is a benchmark for domain generalization in ecosystem flux upscaling using temporal, spatial, and temperature-based extrapolation scenarios, with pilot results showing model separation on tail and multi-scale metrics.

In-context learning enables continental-scale subsurface temperature prediction from sparse local observations
cs.LG · 2026-05-15 · unverdicted · none · ref 26 · internal anchor
A transformer-based in-context learning model predicts continental-scale subsurface temperatures from sparse borehole observations, outperforming physics and interpolation baselines while adapting to new regions with 20 examples.

Does Your Wildfire Prediction Model Actually Work, or Just Score Well?
cs.LG · 2026-05-14 · unverdicted · none · ref 3 · 2 links · internal anchor
Introduces WILDFIRE-FM and a fixed-contract evaluation framework demonstrating that wildfire model transfer conclusions depend strongly on evaluation design and task formulation.

No One Knows the State of the Art in Geospatial Foundation Models
cs.CV · 2026-05-12 · accept · none · ref 9 · internal anchor
An audit of 152 papers reveals that geospatial foundation models lack standardized evaluations, training controls, and weight releases, so no one knows the state of the art.

Predictive and Prescriptive AI toward Optimizing Wildfire Suppression
math.OC · 2026-05-06 · unverdicted · none · ref 135 · internal anchor
A new optimization algorithm with double machine learning for wildfire spread estimation enables better crew assignments that reduce total area burned.

A Proxy Consistency Loss for Grounded Fusion of Earth Observation and Location Encoders
cs.CV · 2026-04-20 · unverdicted · none · ref 1 · internal anchor
A proxy consistency loss trains location encoders on proxy geographic data to outperform direct input fusion or frozen embeddings for air quality and poverty mapping with sparse labels.

When Earth Foundation Models Meet Diffusion: An Application to Land Surface Temperature Super-Resolution
cs.CV · 2026-04-18 · unverdicted · none · ref 1 · internal anchor
EFDiff conditions a diffusion model with Prithvi-EO-2.0 geospatial embeddings via cross-attention to achieve 32x LST super-resolution, outperforming baselines on a global Landsat dataset.

Exploring the potential of AlphaEarth and TESSERA embeddings for Fine-scale Local Climate Zone Mapping: A case study across five cities in Switzerland
cs.LG · 2026-06-18 · unverdicted · none · ref 3 · internal anchor
TESSERA embeddings achieve the highest IoU (0.77-0.82) for 10m LCZ mapping across Swiss cities and outperform Sentinel-1/2 and AlphaEarth, though year-to-year transfer remains challenging.

Continuous biome representations from Earth observation embeddings
q-bio.QM · 2026-06-09 · unverdicted · none · ref 3 · internal anchor
Linear classifier on Clay v1.5 embeddings produces continuous biome probabilities that raise mean per-species AUC for occurrence prediction from 0.570 (discrete labels) to 0.618 on 10,015 Brazilian forest plots.

Mini-JEPA Foundation Model Fleet Enables Agentic Hydrologic Intelligence
cs.LG · 2026-05-13 · unverdicted · none · ref 1 · internal anchor
A fleet of sensor-specialized 22M-parameter JEPA models routed by an LLM improves LLM-as-judge scores on hydrologic questions over AlphaEarth alone with Cohen's d of 1.10.

Toward a Scientific Discovery Engine for Weather and Climate Data: A Visual Analytics Workbench for Embedding-Based Exploration
physics.data-an · 2026-05-01 · conditional · none · ref 3 · internal anchor
A visual analytics workbench enables scientists to explore, query, and verify embedding-based similarity searches on weather and climate data by tracing results back to physical evidence.

Transferable Human Mobility Network Reconstruction with neuroGravity
cs.AI · 2026-04-26 · unverdicted · none · ref 38 · internal anchor
neuroGravity reconstructs transferable human mobility networks from basic urban data via physics-informed deep learning, with transferability predicted by a spatial income segregation index.

Unlocking Multi-Spectral Data for Multi-Modal Models with Guided Inputs and Chain-of-Thought Reasoning
cs.CV · 2026-04-22 · unverdicted · none · ref 29 · internal anchor
A prompting-based adaptation technique lets RGB-trained LMMs process multi-spectral inputs and deliver strong zero-shot gains on remote-sensing benchmarks.

Structure-Semantic Decoupled Modulation of Global Geospatial Embeddings for High-Resolution Remote Sensing Mapping
cs.CV · 2026-04-21 · unverdicted · none · ref 2 · internal anchor
SSDM decouples global geospatial embeddings into structural modulation and semantic injection pathways to improve accuracy and consistency in high-resolution remote sensing land cover mapping.

HuiYanEarth-SAR: A Foundation Model for High-Fidelity and Low-Cost Global Remote Sensing Imagery Generation
cs.CV · 2026-04-13 · unverdicted · none · ref 20 · internal anchor
HuiYanEarth-SAR is a foundation model that generates realistic global SAR imagery from geographic coordinates alone by integrating geospatial semantics and implicit scattering characteristics.

Location Is All You Need: Continuous Spatiotemporal Neural Representations of Earth Observation Data
cs.CV · 2026-04-08 · unverdicted · none · ref 4 · internal anchor
LIANet encodes multi-temporal Earth observation data into a coordinate-based neural field that supports label-only fine-tuning for downstream tasks without access to raw imagery.

Earth Embeddings Reveal Diverse Urban Signals from Space
cs.LG · 2026-04-03 · unverdicted · none · ref 23 · internal anchor
Earth embeddings from satellite images predict neighborhood-level urban indicators with higher accuracy for built-environment outcomes than for behavior-driven ones, showing city-specific variation but year-to-year stability.

FireScope: Wildfire Risk Raster Prediction with a Chain-of-Thought Oracle
cs.CV · 2025-11-21 · unverdicted · none · ref 9 · 2 links · internal anchor
FireScope trains a VLM on US data to output wildfire risk rasters with reasoning traces and shows improved cross-continental performance on European events compared with prior approaches.

Urban Heat MiniCubes: An AI-Ready dataset for urban heat research
physics.ao-ph · 2026-06-10 · unverdicted · none · ref 18 · internal anchor
Releases a publicly available, collocated multi-sensor dataset of Landsat, Sentinel-1, GOES-R and microwave observations for urban heat studies across 48 cities.

Mapping Tomato Cropping Systems in California Using AlphaEarth Geospatial Embeddings and Deep Learning Analysis
eess.IV · 2026-05-20 · unverdicted · none · ref 26 · internal anchor
A U-Net segmentation model trained on 64-band AlphaEarth embedding chips achieves 99.19% pixel accuracy and 99.04% F1 on an independent test set for distinguishing tomato from non-tomato fields in California.

Agentic AI for Remote Sensing: Technical Challenges and Research Directions
cs.CV · 2026-04-27 · unreviewed · ref 10 · 2 links · internal anchor

Seeing SDG 6 from space: local-scale monitoring of piped water and sewage system access across Africa using satellite imagery and self-supervised learning
cs.CV · 2024-11-28 · unreviewed · ref 19 · internal anchor
