arxiv: 2604.23776 · v1 · submitted 2026-04-26 · 💻 cs.CV · cs.AI

Recognition: unknown

From Noisy Historical Maps to Time-Series Oil Palm Mapping Without Annotation in Malaysia and Indonesia (2020-2024)

Nuttaset Kuapanich, Juepeng Zheng, Bohan Shi, Jiaying Liu, Jiayin Jiang, Jiatao Huang, Shenghan Tan, Qingmei Li, Haohuan Fu

Pith reviewed 2026-05-08 06:37 UTC · model grok-4.3

classification 💻 cs.CV cs.AI

keywords oil palm mapping · Sentinel-2 · time-series mapping · label noise · U-Net · deep learning · Indonesia · Malaysia

The pith

A deep learning method generates 10-meter oil palm maps for Indonesia and Malaysia from 2020 to 2024 using only noisy 100-meter historical labels and Sentinel-2 imagery.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper argues that a U-Net model trained with Determinant-based Mutual Information (DMI) loss can convert coarse historical oil palm labels into high-resolution time-series maps without collecting fresh manual annotations. This matters because oil palm plantations drive both economic activity and rapid land-use change in Southeast Asia, where existing maps lack the detail and recency needed for effective monitoring. The approach addresses the mismatch between 100-meter labels and 10-meter Sentinel-2 images by reducing the effect of label noise during training. Validation against 2,058 manually verified points yields overall accuracies of roughly 60 to 71 percent depending on the year. The resulting maps show coverage peaking in 2022 and declining by 2024, while plantations continue to expand into flooded vegetation areas.
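To make the resolution mismatch concrete: one 100-meter label cell covers a 10 × 10 block of 10-meter pixels, so a naive nearest-neighbour upsampling gives all 100 pixels the same label regardless of what they contain. The sketch below illustrates this. The function name `upsample_labels` and the toy grid are assumptions for illustration, and this summary does not say how the authors actually resample their labels.

```python
def upsample_labels(coarse, factor=10):
    """Nearest-neighbour upsampling: repeat each coarse cell factor x factor times."""
    fine = []
    for row in coarse:
        # Stretch the row horizontally, then repeat it vertically.
        fine_row = [value for value in row for _ in range(factor)]
        fine.extend(list(fine_row) for _ in range(factor))
    return fine


# Hypothetical 2 x 2 patch of 100 m labels (1 = oil palm, 0 = other).
coarse = [[1, 0],
          [0, 1]]
fine = upsample_labels(coarse)  # 20 x 20 grid at 10 m
```

Every fine pixel inherits its parent cell's label, so any sub-100-meter boundary inside a cell becomes label noise at 10-meter resolution.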

Core claim

The framework uses a U-Net architecture optimized with Determinant-based Mutual Information loss on Sentinel-2 imagery to generate 10-meter oil palm plantation maps for 2020, 2022, and 2024 without new annotations. Despite the noise introduced by training on 100-meter historical labels, the method yields overall accuracies of 70.64 percent in 2020, 63.53 percent in 2022, and 60.06 percent in 2024 when tested on 2,058 manually verified points. The maps indicate that oil palm coverage in Indonesia and Malaysia reached a maximum in 2022 and then declined by 2024, with transition analysis showing continued plantation expansion into flooded vegetation despite some stabilization in rotations with other crop types.

What carries the argument

U-Net architecture optimized by Determinant-based Mutual Information (DMI) loss, which reduces the impact of noisy labels caused by resolution mismatch between coarse historical maps and fine satellite imagery.
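A minimal sketch of the DMI loss for the binary case (oil palm versus other), following the general form in Xu et al. (2019): the negative log of the absolute determinant of the empirical joint matrix between predicted class probabilities and noisy labels. The function name `dmi_loss`, the `eps` stabilizer, and the toy inputs are illustrative assumptions, not the authors' implementation.

```python
import math


def dmi_loss(probs, labels, eps=1e-3):
    """probs: [p_other, p_palm] per pixel; labels: noisy 0/1 label per pixel."""
    n = len(labels)
    # 2 x 2 empirical joint matrix: rows = predicted class, columns = noisy label.
    joint = [[0.0, 0.0], [0.0, 0.0]]
    for p, y in zip(probs, labels):
        joint[0][y] += p[0] / n
        joint[1][y] += p[1] / n
    det = joint[0][0] * joint[1][1] - joint[0][1] * joint[1][0]
    # eps keeps the log finite when predictions carry no label information.
    return -math.log(abs(det) + eps)


labels = [0, 0, 1, 1]
informative = [[0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.1, 0.9]]
uninformative = [[0.5, 0.5]] * 4

loss_informative = dmi_loss(informative, labels)
loss_uninformative = dmi_loss(uninformative, labels)
```

Predictions that are statistically independent of the labels drive the determinant to zero and the loss to its maximum, while predictions that track the labels lower it.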

If this is right

Oil palm plantations can be tracked at 10-meter resolution over multiple years without repeated manual labeling campaigns.
The maps show plantation area peaking in 2022 and declining by 2024.
Transition analysis indicates ongoing expansion into flooded vegetation areas alongside some stabilization with other crops.
The produced datasets are released publicly to support monitoring of sustainability and deforestation.
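The transition analysis mentioned above amounts to counting, pixel by pixel, the class in an earlier map and the class in a later one. A minimal sketch under stated assumptions: the class names follow the Sankey categories in Figure 12, while `count_transitions` and the toy maps are hypothetical. The only fixed quantity is the pixel area, since a 10 m × 10 m pixel covers 100 m², or 0.01 ha.

```python
from collections import Counter

PIXEL_AREA_HA = 0.01  # a 10 m x 10 m pixel covers 100 m^2 = 0.01 ha


def count_transitions(earlier, later):
    """Count (class_before, class_after) pairs over two aligned, flattened maps."""
    if len(earlier) != len(later):
        raise ValueError("maps must be aligned and equally sized")
    return Counter(zip(earlier, later))


# Hypothetical four-pixel maps, for illustration only.
map_2020 = ["flooded_vegetation", "crops", "oil_palm", "flooded_vegetation"]
map_2024 = ["oil_palm", "crops", "oil_palm", "oil_palm"]

transitions = count_transitions(map_2020, map_2024)
gained_from_flooded_ha = (
    transitions[("flooded_vegetation", "oil_palm")] * PIXEL_AREA_HA
)
```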

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same noise-robust training approach could be tested on mapping other crops or land covers where only coarse historical data is available.
Combining the outputs with additional satellite bands or temporal consistency checks might raise accuracy in future applications.
The observed expansion into flooded areas could be cross-checked against independent deforestation reports to assess environmental impact.
Similar frameworks might reduce annotation costs for land-use studies in other rapidly changing tropical regions.

Load-bearing premise

The Determinant-based Mutual Information loss sufficiently compensates for noise when training on 10-meter imagery using 100-meter historical labels.

What would settle it

A large independent set of high-resolution ground truth points collected in the same region showing accuracies below 50 percent or maps that fail to detect the reported 2022 peak and 2024 decline would disprove the central claim.
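To gauge how much of the gap between the reported accuracies and the 50 percent threshold could be sampling error, one can compute a normal-approximation 95% interval for each accuracy with n = 2,058. This is a sketch that assumes the validation points behave like independent random samples. The function name `accuracy_interval` is illustrative.

```python
import math


def accuracy_interval(accuracy, n, z=1.96):
    """Normal-approximation confidence interval for a proportion."""
    half_width = z * math.sqrt(accuracy * (1.0 - accuracy) / n)
    return accuracy - half_width, accuracy + half_width


N_POINTS = 2058
REPORTED = [(2020, 0.7064), (2022, 0.6353), (2024, 0.6006)]

for year, acc in REPORTED:
    low, high = accuracy_interval(acc, N_POINTS)
    print(f"{year}: {acc:.4f} (95% CI {low:.4f} to {high:.4f})")
```

Under these assumptions each interval is roughly two percentage points either side of the reported value, so even the 2024 figure stays above 50 percent once sampling error is considered.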

Figures

Figures reproduced from arXiv: 2604.23776 by Bohan Shi, Haohuan Fu, Jiatao Huang, Jiaying Liu, Jiayin Jiang, Juepeng Zheng, Nuttaset Kuapanich, Qingmei Li, Shenghan Tan.

**Figure 1.** Precision comparison of oil palm plantation maps between 100 meter and 10 meter resolution.

**Figure 2.** Process for Developing a High-Resolution Oil Palm Plantation Map.

**Figure 3.** Geographic distribution of the 2,058 validation points used for accuracy assessment across the study area.

**Figure 4.** Oil palm detection results in Malaysia across different models: (a) Sentinel-2 imagery, (b) AOPD noisy label data used for model training, and (c)-(h) oil palm detection outputs from respective models.

**Figure 5.** Oil palm detection results in Indonesia across different models: (a) Sentinel-2 imagery, (b) AOPD noisy label data used for model training, and (c)-(h) oil palm detection outputs from respective models.

**Figure 6.** Visual comparison between (a) Sentinel-2 satellite imagery, (b) Ground Truth (AOPD), (c) Results from the U-Net model trained with DMI Loss (learning with noisy labels), and (d) Results from the U-Net model trained with Binary Cross-Entropy Loss (standard supervision).

**Figure 7.** Temporal Validation of Predictive Model Performance (2020-2024) with CEIC dataset.

**Figure 8.** Oil Palm Plantation Area of Malaysia by Region and Year.

**Figure 9.** Oil Palm Plantation Area of Indonesia by Region and Year.

**Figure 10.** Oil Palm Plantation Maps in Malaysia: (a) Plantation area by region in 2024 and (b) Percentage change in plantation area between 2020 and 2024. Legend classes: (a) plantation area 2024: < 0.15, 0.15 to 0.5, 0.5 to 1.0, 1.0 to 2.0, > 2.0 million ha; (b) area change rate 2020-2024: < -2.5%, -2.5% to 0%, 0% to 2%, 2% to 10%, > 10%.

**Figure 11.** Oil Palm Plantation Maps in Indonesia: (a) Plantation area by region in 2024 and (b) Percentage change in plantation area between 2020 and 2024.

**Figure 12.** Sankey diagram illustrating land cover transitions (in 10-meter resolution pixels) between oil palm and other classes, including built area, crops, flooded vegetation, rangeland, and water.

Original abstract

Accurate monitoring of oil palm plantations is critical for balancing economic development with environmental conservation in Southeast Asia. However, existing plantation maps often suffer from low spatial resolution and a lack of recent temporal coverage, impeding effective surveillance of rapid land-use changes. In this study, we propose a deep learning framework to generate 10-meter resolution oil palm plantation maps for Indonesia and Malaysia from 2020 to 2024, utilizing Sentinel-2 imagery without requiring new manual annotations. To address the resolution mismatch between coarse 100-meter historical labels and 10-meter imagery, we employ a U-Net architecture optimized with Determinant-based Mutual Information (DMI). This approach effectively mitigates the influence of label noise. We validated our method against 2,058 manually verified points, achieving overall accuracies of 70.64%, 63.53%, and 60.06% for the years 2020, 2022, and 2024, respectively. Our comprehensive analysis reveals that oil palm coverage in the region peaked in 2022 before experiencing a decline in 2024. Furthermore, land cover transition analysis highlights a concerning trajectory of plantation expansion into flooded vegetation areas, despite a general stabilization in rotations with other crop types. These high-resolution maps provide essential data for monitoring sustainability commitments and deforestation dynamics in the region, and the generated datasets are made publicly available at https://doi.org/10.5281/zenodo.17768444.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper delivers a practical 10 m oil palm mapping pipeline for Malaysia and Indonesia 2020-2024 trained on noisy 100 m labels, but the evidence that DMI loss actually extracts reliable fine-scale detail is missing key controls.

The paper gives a working pipeline for turning existing 100 m historical oil palm labels into 10 m maps of Indonesia and Malaysia for 2020, 2022, and 2024 using Sentinel-2 and a U-Net with Determinant-based Mutual Information loss. The authors skip new manual labels, validate on 2,058 independent points, report overall accuracies of 70.64%, 63.53%, and 60.06% for the three years, and release the maps publicly. Some land-cover transition analysis is included as well.

Referee Report

3 major / 2 minor

Summary. The paper claims to develop an annotation-free deep learning method for creating 10 m resolution time-series maps of oil palm plantations in Malaysia and Indonesia using Sentinel-2 data from 2020 to 2024. It uses a U-Net with Determinant-based Mutual Information (DMI) loss to handle noise from 100 m historical labels. Validation on 2,058 independent points shows overall accuracies of 70.64% (2020), 63.53% (2022), and 60.06% (2024), along with analysis of plantation expansion trends and land cover transitions. The maps and data are made publicly available.

Significance. If the DMI loss effectively mitigates label noise to enable reliable sub-100m mapping, this would provide a scalable, low-annotation-cost approach for monitoring oil palm dynamics and sustainability in Southeast Asia, with the public dataset release adding practical value. The temporal trend analysis could inform land-use policy if the maps are shown to be accurate at fine scales.

major comments (3)

[Methods (DMI loss)] The Methods section on the DMI loss provides no ablation study replacing DMI with standard cross-entropy loss (or simple upsampling + CE). This is load-bearing for the central claim, as the reported accuracies on 2,058 points could arise from the U-Net learning dominant coarse patterns without DMI's specific noise mitigation.
[Results (validation)] The Results section reporting overall accuracies of 70.64%/63.53%/60.06% includes no baseline comparisons, confusion matrices, per-class metrics, or spatial error analysis. Without these, it is unclear whether the model recovers sub-100 m boundaries or simply reproduces the 100 m label field plus texture.
[Methods/Experiments] No noise-rate sensitivity analysis or demonstration (e.g., qualitative map comparisons or quantitative boundary metrics) is given to show that DMI enables recovery of details finer than the input 100 m labels. This directly undermines the 'without new annotations' and temporal-trend conclusions.
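The confusion matrix and per-class metrics requested in the second major comment can all be derived from four counts. The sketch below treats oil palm as the positive class. The function name `binary_metrics` and the eight toy points are hypothetical and carry no values from the paper.

```python
def binary_metrics(truth, pred):
    """Confusion counts and derived metrics, with oil palm (1) as the positive class."""
    pairs = list(zip(truth, pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "confusion": {"tp": tp, "tn": tn, "fp": fp, "fn": fn},
        "overall_accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy validation points: 1 = oil palm, 0 = other.
truth = [1, 1, 1, 0, 0, 0, 0, 0]
pred = [1, 1, 0, 1, 0, 0, 0, 0]
metrics = binary_metrics(truth, pred)
```

In this toy case overall accuracy is 0.75 while oil palm precision and recall are both 2/3, which illustrates how a single accuracy figure can hide weaker performance on the class of interest.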

minor comments (2)

[Abstract] The abstract would benefit from specifying the number of validation points per year and class balance to contextualize the overall accuracies.
[Figures] Figure captions and legends should explicitly note any scale bars, coordinate systems, and whether visualizations compare predictions against the coarse labels to illustrate fine-scale recovery.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their constructive comments, which have helped us identify areas to strengthen our manuscript. We address each major comment point by point below, and we will incorporate the suggested additions in the revised version.

Referee: [Methods (DMI loss)] The Methods section on the DMI loss provides no ablation study replacing DMI with standard cross-entropy loss (or simple upsampling + CE). This is load-bearing for the central claim, as the reported accuracies on 2,058 points could arise from the U-Net learning dominant coarse patterns without DMI's specific noise mitigation.

Authors: We agree that demonstrating the specific contribution of the DMI loss through an ablation study is important to support our central claim. In the revised manuscript, we will add an ablation study comparing the U-Net trained with DMI loss against one trained with standard cross-entropy loss, including quantitative results on the validation set and qualitative map examples. revision: yes

Referee: [Results (validation)] The Results section reporting overall accuracies of 70.64%/63.53%/60.06% includes no baseline comparisons, confusion matrices, per-class metrics, or spatial error analysis. Without these, it is unclear whether the model recovers sub-100 m boundaries or simply reproduces the 100 m label field plus texture.

Authors: We acknowledge the need for more comprehensive validation metrics to clarify the model's ability to recover fine-scale details. We will revise the Results section to include baseline comparisons (such as direct upsampling of the 100 m labels), full confusion matrices, per-class metrics (precision, recall, F1-score), and a spatial error analysis to assess boundary recovery. revision: yes

Referee: [Methods/Experiments] No noise-rate sensitivity analysis or demonstration (e.g., qualitative map comparisons or quantitative boundary metrics) is given to show that DMI enables recovery of details finer than the input 100 m labels. This directly undermines the 'without new annotations' and temporal-trend conclusions.

Authors: We recognize that additional experiments are required to show the benefit of DMI for recovering finer details. In the revision, we will include a sensitivity analysis to different noise rates in the labels, along with qualitative comparisons of maps produced with and without DMI, and quantitative boundary metrics (e.g., Hausdorff distance or edge precision). These additions will bolster the support for our annotation-free approach and the validity of the temporal analyses. revision: yes
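For readers unfamiliar with the Hausdorff distance named in the last response, here is a brute-force sketch for small sets of boundary pixel coordinates. It is illustrative only: the function names and coordinates are assumptions, and a production evaluation would likely use an optimized library routine.

```python
import math


def directed_hausdorff(a, b):
    """Largest distance from any point in a to its nearest point in b."""
    return max(min(math.dist(p, q) for q in b) for p in a)


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


# Hypothetical boundary pixels as (row, col) indices on a 10 m grid.
predicted_edge = [(0, 0), (0, 1), (0, 2)]
reference_edge = [(0, 0), (0, 1), (3, 2)]

distance_px = hausdorff(predicted_edge, reference_edge)
distance_m = distance_px * 10  # assumes one pixel spans 10 m
```

Because the metric reports the single worst boundary mismatch, it is sensitive to isolated errors, which is one reason to pair it with an averaged measure such as the edge precision the authors also mention.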

Circularity Check

0 steps flagged

No significant circularity; derivation relies on external DMI loss and independent validation

The paper trains a U-Net with Determinant-based Mutual Information loss on noisy 100 m historical labels to produce 10 m Sentinel-2 oil-palm maps, then reports accuracies on 2,058 separately verified points. No equations, self-citations, or ansatzes are shown that reduce the generated maps or accuracies to quantities fitted from the same inputs by construction. The DMI component is treated as an imported technique for noise mitigation rather than a self-derived result, and the temporal-trend conclusions follow from the output maps rather than presupposing them. This is the common case of an applied deep-learning pipeline whose central claim stands or falls on empirical performance rather than definitional equivalence.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the domain assumption that DMI can decouple noisy coarse labels from fine imagery without introducing systematic bias; no free parameters or invented entities are explicitly introduced beyond standard neural network training.

axioms (1)

domain assumption: Determinant-based Mutual Information loss mitigates label noise from resolution mismatch. Invoked to justify training without new annotations.

pith-pipeline@v0.9.0 · 5600 in / 1259 out tokens · 41381 ms · 2026-05-08T06:37:38.585670+00:00 · methodology

Reference graph

Works this paper leans on

6 extracted references · 3 canonical work pages

[1]

sustainable

Alam, A. S. A. F., Er, A. C., & Begum, H. (2015). Malaysian oil palm industry: Prospect and problem. Journal of Food, Agriculture and Environment , 1313, 143–148. Baum, E. B., & Wilczek, F. (1988). Supervised learning of probability distributions by neural networks. In Neural Information Processing Systems (pp. 52–61). Bottou, L. (2010). Large-scale machi...

work page arXiv 2015
[2]

Ioffe, S., & Szegedy, C. (2015). Batch normalization: accelerating deep network training by reducing internal covariate shift. In Proceedings of the 32nd International Conference on Machine Learning - Volume 37 ICML’15 (pp. 448–456). JMLR.org. Kong, Y. (2020). Dominantly truthful multi-task peer prediction with a constant numb...

2015
[3]

doi: 10.5281/zenodo.17768444

URL: https://doi.org/10.5281/zenodo.17768444. doi: 10.5281/zenodo.17768444. Kuok Ho, D. T., & Qahtani, H. (2020). Sustainability of oil palm plantations in malaysia. Environment, Development and Sustainability ,

work page doi:10.5281/zenodo.17768444 2020
[4]

Li, Z., He, W., Cheng, M., Hu, J., Yang, G., & Zhang, H. (2023). SinoLC-1: the first 1 m resolution national-scale land-cover map of China created with a deep learning framework and open-access data. Earth System Science Data. Li, Z., He, W., Li, J., Lu, F., & Zhang, H. (2024). Learning without exact guidance: Updating large-scale high-resolution land ...

work page arXiv 2023
[5]

Xu, Y., Cao, P., Kong, Y., & Wang, Y. (2019). L_DMI: a novel information-theoretic loss function for training deep nets robust to label noise. Red Hook, NY, USA: Curran Associates Inc. Xu, Y., Yu, L., Li, W., Ciais, P., Cheng, Y., & Gong, P. (2020). Annual oil palm plantation maps in Malaysia and Indonesia from 2001 to

2019
[6]

Earth System Science Data, (pp. 847–867). Zhao, Q., Yu, L., Li, X., Xu, Y., Du, Z., Kanniah, K., Li, C., Cai, W., Lin, H., Peng, D., & et al. (2024). The expansion and remaining suitable areas of global oil palm plantations. Global Sustainability, 7, e9. Zhao, W., & Chellappa, R. (2006). Chapter 21 - multimodal biometrics: Augmenting face with other c...

2024