The Impact of Federated Learning on Distributed Remote Sensing Archives
Pith reviewed 2026-05-10 16:05 UTC · model grok-4.3
The pith
FedProx outperforms FedAvg for deeper models under label skew on remote sensing data while LeNet offers the best accuracy-communication trade-off.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Under controlled non-IID label-skew conditions on the UC Merced multi-label dataset, FedProx outperforms FedAvg for deeper convolutional architectures, bulk synchronous parallel (BSP) training approaches centralized accuracy at the cost of high sequential communication, and LeNet provides the best accuracy-communication trade-off among the evaluated models.
What carries the argument
A joint comparison of three federated algorithms (FedAvg, FedProx, BSP) across three CNN depths (LeNet, AlexNet, ResNet-34) under varying client counts, client fractions, batch sizes, and label-skew partitions.
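To make the difference between the two main algorithms concrete, here is a minimal sketch, not the paper's implementation. It assumes plain gradient descent on flat weight vectors, and the names `local_update`, `weighted_average`, `grad_fn`, `lr`, `steps`, and `mu` are illustrative. FedProx adds a proximal term (mu/2)·||w − w_global||² to each client's local objective, so `mu = 0` recovers the FedAvg local step; the server-side average weighted by client sample counts is the same in both.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def local_update(
    w_global: Sequence[float],
    grad_fn: Callable[[Sequence[float]], Vector],
    lr: float,
    steps: int,
    mu: float = 0.0,
) -> Vector:
    """Run `steps` gradient steps on one client, starting from the global model.

    mu = 0.0 gives the FedAvg local step; mu > 0.0 adds the FedProx
    proximal gradient mu * (w - w_global), which pulls the local model
    back toward the global one.
    """
    w = list(w_global)
    for _ in range(steps):
        g = grad_fn(w)
        w = [
            wi - lr * (gi + mu * (wi - w0))
            for wi, gi, w0 in zip(w, g, w_global)
        ]
    return w


def weighted_average(
    client_weights: Sequence[Sequence[float]],
    num_samples: Sequence[int],
) -> Vector:
    """Server aggregation: average client models weighted by sample count."""
    total = sum(num_samples)
    dim = len(client_weights[0])
    return [
        sum(n * w[i] for w, n in zip(client_weights, num_samples)) / total
        for i in range(dim)
    ]
```

With a toy client loss such as 0.5·(w − 3)², a positive `mu` leaves the local model closer to the global starting point than `mu = 0` does, which is the drift-limiting effect FedProx relies on under heterogeneous client data.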
Load-bearing premise
The artificial non-IID label skew created on the UC Merced dataset mirrors the geographic and label variations in real distributed remote sensing archives.
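This summary does not spell out the paper's exact partitioning scheme. As an illustration of what a label-skew split can look like, the sketch below uses the shard-based scheme common in the federated learning literature: sort samples by label, cut the sorted list into shards, and deal a few shards to each client. Treating the smallest label index of a multi-label sample as its primary label is an assumption made here for simplicity, and `label_skew_partition` and its parameters are hypothetical names.

```python
import random
from typing import Dict, List, Sequence, Set


def label_skew_partition(
    labels: Sequence[Set[int]],
    num_clients: int,
    shards_per_client: int = 2,
    seed: int = 0,
) -> Dict[int, List[int]]:
    """Split sample indices across clients so each client sees few labels.

    `labels[i]` is the non-empty set of label indices for sample i.
    Fewer shards per client means stronger label skew.
    """
    num_shards = num_clients * shards_per_client
    shard_size = len(labels) // num_shards
    if shard_size == 0:
        raise ValueError("not enough samples for the requested shards")

    # Sort by primary label (assumption: smallest label index), then by index
    # so the result is deterministic.
    order = sorted(range(len(labels)), key=lambda i: (min(labels[i]), i))

    # Cut the sorted order into contiguous shards; leftovers join the last one.
    shards = [
        order[s * shard_size:(s + 1) * shard_size] for s in range(num_shards)
    ]
    shards[-1].extend(order[num_shards * shard_size:])

    # Deal shards to clients in a seeded random order.
    shard_ids = list(range(num_shards))
    random.Random(seed).shuffle(shard_ids)

    partition: Dict[int, List[int]] = {}
    for c in range(num_clients):
        own = shard_ids[c * shards_per_client:(c + 1) * shards_per_client]
        partition[c] = [i for s in own for i in shards[s]]
    return partition
```

A split like this controls only the label distribution per client. It does not reproduce the spectral, temporal, or sensor differences that a geographic split of real archives would introduce, which is the gap the premise above depends on.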
What would settle it
Partitioning actual Sentinel-1 or Sentinel-2 imagery by geographic region, applying the same federated protocols, and checking whether FedProx still outperforms FedAvg for deeper models would confirm or refute the results.
Original abstract
Remote sensing archives are inherently distributed: Earth observation missions such as Sentinel-1, Sentinel-2, and Sentinel-3 have collectively accumulated more than 5 petabytes of imagery, stored and processed across many geographically dispersed platforms. Training machine learning models on such data in a centralized fashion is impractical due to data volume, sovereignty constraints, and geographic distribution. Federated learning (FL) addresses this by keeping data local and exchanging only model updates. A central challenge for remote sensing is the non-IID nature of Earth observation data: label distributions vary strongly by geographic region, degrading the convergence of standard FL algorithms. In this paper, we conduct a systematic empirical study of three FL strategies -- FedAvg, FedProx, and bulk synchronous parallel (BSP) -- applied to multi-label remote sensing image classification under controlled non-IID label-skew conditions. We evaluate three convolutional neural network (CNN) architectures of increasing depth (LeNet, AlexNet, and ResNet-34) and analyze the joint effect of algorithm choice, model capacity, client fraction, client count, batch size, and communication cost. Experiments on the UC Merced multi-label dataset show that FedProx outperforms FedAvg for deeper architectures under data heterogeneity, that BSP approaches centralized accuracy at the cost of high sequential communication, and that LeNet provides the best accuracy-communication trade-off for the dataset scale considered.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper conducts a systematic empirical study of three federated learning strategies (FedAvg, FedProx, and bulk synchronous parallel) for multi-label remote sensing image classification. It evaluates three CNN architectures of increasing depth (LeNet, AlexNet, ResNet-34) on the UC Merced dataset under controlled non-IID label-skew partitions, while sweeping hyperparameters including client fraction, client count, batch size, and communication cost. The reported findings are that FedProx outperforms FedAvg for deeper architectures under heterogeneity, BSP approaches centralized accuracy at the expense of high sequential communication, and LeNet yields the best accuracy-communication trade-off for the dataset scale considered.
Significance. If the results hold under more representative conditions, the work supplies actionable guidance on algorithm and architecture selection for federated training on distributed Earth-observation data, quantifying concrete trade-offs between accuracy and communication overhead in non-IID regimes. The comprehensive hyperparameter analysis and direct comparison against centralized baselines are strengths of the empirical design.
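The accuracy-communication trade-off mentioned above can be approximated with simple arithmetic. The sketch below is a back-of-the-envelope estimate, not the paper's cost model: it assumes every participating client downloads and uploads the full uncompressed model once per round as 32-bit floats. The parameter counts are approximate figures for the standard published configurations of each network and may differ from the variants trained in the paper; `round_traffic_bytes` and `APPROX_PARAMS` are illustrative names.

```python
# Approximate parameter counts for the standard configurations (assumption:
# the paper's variants, with a different output layer, may differ).
APPROX_PARAMS = {
    "LeNet-5": 60_000,
    "AlexNet": 61_000_000,
    "ResNet-34": 21_800_000,
}


def round_traffic_bytes(
    num_params: int,
    num_clients: int,
    client_fraction: float,
    bytes_per_param: int = 4,
) -> int:
    """Estimate total bytes moved in one federated round.

    At least one client participates per round. Each participant receives
    the global model and sends back an update of the same size, hence the
    factor of 2.
    """
    participants = max(1, round(client_fraction * num_clients))
    return 2 * participants * num_params * bytes_per_param


if __name__ == "__main__":
    for name, n in APPROX_PARAMS.items():
        mib = round_traffic_bytes(n, num_clients=10, client_fraction=0.5) / 2**20
        print(f"{name}: {mib:.1f} MiB per round")
```

Under these assumptions per-round traffic scales linearly with parameter count, so a model roughly a thousand times smaller than AlexNet needs roughly a thousand times less bandwidth per round. That is the kind of gap behind the LeNet trade-off claim.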
major comments (1)
- [Abstract] The motivating use case is explicitly the non-IID structure of Sentinel-1/2/3 archives arising from geographic region, acquisition time, spectral characteristics, and land-cover co-occurrence at continental scale. All quantitative claims, however, derive from label-skew partitions of the UC Merced dataset (2100 high-resolution aerial photographs from a narrow set of U.S. locations). Because UC Merced is neither satellite imagery nor geographically distributed at the scale of operational EO archives, the observed performance ordering (FedProx > FedAvg for deeper nets; LeNet best trade-off) may not transfer; a direct test on geographically partitioned Sentinel tiles would be required to support the title and abstract claims.
minor comments (2)
- The abstract and experimental summary omit explicit statements on the number of independent runs, random seeds, and whether error bars or statistical tests accompany the reported accuracy rankings; these details are needed to assess robustness of the relative ordering between FedProx and FedAvg.
- A short paragraph in the conclusions or discussion section explicitly acknowledging that UC Merced serves only as a controlled proxy and that real Sentinel heterogeneity may alter the conclusions would improve clarity without altering the empirical contribution.
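The first minor comment asks for independent runs, seeds, and statistical tests. One way such a report could be produced is sketched below, assuming each algorithm is run once per seed and the same seeds are used for both algorithms so the runs can be paired. The function names are hypothetical, and the paired t statistic is one reasonable choice of test, not something the paper is known to use.

```python
import math
import statistics
from typing import Sequence, Tuple


def summarize(scores: Sequence[float]) -> Tuple[float, float]:
    """Return (mean, sample standard deviation) of per-seed scores."""
    mean = statistics.mean(scores)
    std = statistics.stdev(scores) if len(scores) > 1 else 0.0
    return mean, std


def paired_t_statistic(a: Sequence[float], b: Sequence[float]) -> float:
    """Paired t statistic for per-seed scores of two algorithms.

    a[i] and b[i] must come from the same seed. A large positive value
    suggests `a` is consistently better than `b` across seeds.
    """
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need at least two paired runs of equal length")
    diffs = [x - y for x, y in zip(a, b)]
    mean_d = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    if sd == 0.0:
        # Every seed shows the same difference: no variance to scale by.
        return 0.0 if mean_d == 0.0 else math.copysign(math.inf, mean_d)
    return mean_d / (sd / math.sqrt(len(diffs)))
```

The statistic has `len(a) - 1` degrees of freedom. Reporting it alongside the mean and standard deviation for each algorithm would let readers judge whether the FedProx versus FedAvg ordering is stable across seeds.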
Simulated Author's Rebuttal
We thank the referee for the careful reading and for identifying the important gap between the motivating application and the experimental setting. We agree that the claims require clarification and will revise the abstract, introduction, and discussion accordingly while preserving the core empirical contributions on the UC Merced benchmark.
Point-by-point responses
- Referee: [Abstract] The motivating use case is explicitly the non-IID structure of Sentinel-1/2/3 archives arising from geographic region, acquisition time, spectral characteristics, and land-cover co-occurrence at continental scale. All quantitative claims, however, derive from label-skew partitions of the UC Merced dataset (2100 high-resolution aerial photographs from a narrow set of U.S. locations). Because UC Merced is neither satellite imagery nor geographically distributed at the scale of operational EO archives, the observed performance ordering (FedProx > FedAvg for deeper nets; LeNet best trade-off) may not transfer; a direct test on geographically partitioned Sentinel tiles would be required to support the title and abstract claims.
Authors: We acknowledge that UC Merced is a controlled, small-scale aerial benchmark rather than a direct proxy for continental-scale Sentinel archives. The label-skew partitioning was chosen to isolate the impact of non-IID label distributions in a reproducible way, following common practice in federated learning studies on remote-sensing classification. We agree that the specific performance ordering may not generalize to geographically partitioned satellite tiles with additional spectral and temporal heterogeneity. In the revised manuscript we will (i) rephrase the abstract and title to state that the study examines federated learning on a standard multi-label remote-sensing benchmark under controlled label skew, motivated by the challenges of distributed Earth-observation archives, and (ii) add an explicit limitations paragraph that discusses the dataset choice and calls for future work on real Sentinel partitions. We cannot, however, conduct new experiments on full Sentinel tiles within the scope of this revision.
revision: yes
- Conducting additional experiments on geographically partitioned Sentinel-1/2/3 tiles at operational scale, due to data-access, storage, and computational constraints.
Circularity Check
No derivation chain; purely empirical comparisons with no fitted predictions or self-referential claims
The manuscript contains no equations, first-principles derivations, or claims that a quantity is predicted from a model. All reported results are direct experimental measurements of accuracy, communication cost, and convergence on controlled label-skew partitions of the UC Merced dataset. No parameters are fitted and then reused as predictions, no uniqueness theorems are invoked, and no self-citations form load-bearing premises. The work is therefore self-contained as an empirical benchmark study; the skeptic's concern about dataset representativeness is a question of external validity, not circularity within the paper's own logic.
TABLE I. EXPERIMENTAL SETUP: PARAMETERS
DL Model | FL Algorithm | Epochs | Clients | Batch Size | C-Fraction | Skewness | Client Epochs | Small Skew
LeNet | Centralized | 100 | NA | 4 | NA | NA | NA | NA
ResNe...