arXiv: 2605.19931 · v1 · pith:PV2FRIXWnew · submitted 2026-05-19 · 💻 cs.CV · cs.AI · cs.LG

StruMPL: Multi-task Dense Regression under Disjoint Partial Supervision and MNAR Labels

Pith reviewed 2026-05-20 06:52 UTC · model grok-4.3

keywords aboveground biomass · multi-task regression · MNAR labels · inverse propensity weighting · lidar · remote sensing · physics-informed model · forest mapping

The pith

A multi-task model with shared encoding, propensity correction, and allometric physics recovers accurate forest biomass from disjoint lidar and plot labels despite MNAR data.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper formalizes the task of dense regression for forest aboveground biomass when lidar supplies structure at millions of sites but no biomass values, while ground plots supply biomass at thousands of biased locations with no structure metrics. It introduces StruMPL, which routes a shared encoder through regression, imputation, and propensity heads and adds a learnable module that applies known allometric laws directly to the model's pixel-wise outputs. Training uses an augmented inverse-propensity-weighted pseudo-outcome whose stop-gradients on the propensity and imputation terms are shown to keep the loss bounded while recovering the correct weighted stationary points. If the approach holds, existing incompatible Earth-observation and field datasets can be combined at continental scales for carbon and ecosystem monitoring without requiring new fully labeled samples.
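
To make "disjoint partial supervision" concrete, the minimal Python sketch below computes a per-task masked squared error on a toy batch in which no site carries both labels. The function name `masked_mse`, the toy values, and the equal weighting of the two tasks are illustrative assumptions, not the paper's implementation. It shows only the uncorrected masked loss that a propensity correction would then reweight.

```python
from typing import Optional, Sequence


def masked_mse(preds: Sequence[float], labels: Sequence[Optional[float]]) -> float:
    """Mean squared error over the sites where a label exists (None = missing)."""
    sq_errors = [(p - y) ** 2 for p, y in zip(preds, labels) if y is not None]
    # A task with no labelled site in the batch contributes nothing.
    return sum(sq_errors) / len(sq_errors) if sq_errors else 0.0


# Hypothetical toy batch of four sites: the first two are lidar footprints
# (canopy height only), the last two are ground plots (biomass only).
height_labels = [12.0, 25.0, None, None]  # metres
agb_labels = [None, None, 80.0, 210.0]    # Mg/ha
height_preds = [10.0, 26.0, 18.0, 30.0]
agb_preds = [60.0, 150.0, 90.0, 200.0]

height_loss = masked_mse(height_preds, height_labels)  # (4 + 1) / 2
agb_loss = masked_mse(agb_preds, agb_labels)           # (100 + 100) / 2
total_loss = height_loss + agb_loss                    # equal task weights (assumed)
```

Every site still receives a prediction for both variables. The masks only decide which predictions are scored, which is why a biased set of plot locations can bias the biomass head.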

Core claim

StruMPL addresses multi-task dense regression under heterogeneous disjoint partial supervision with MNAR labels and inter-task physical constraints. A shared encoder feeds per-variable regression, imputation, and propensity heads, together with a learnable physics module that evaluates biome-specific allometric laws on the model's own predictions at every pixel. Training uses an Augmented IPW (AIPW) pseudo-outcome loss with stop-gradients on the propensity and on the imputation baseline, which enables joint optimisation while keeping the loss bounded.
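
The physics module can be pictured with a small sketch. The summary does not state which allometric form the paper uses, so the power law AGB = a · H^b, the log-space residual, and the names `allometric_agb` and `physics_loss` are all assumptions made for illustration. The parameters `log_a` and `b` stand in for the learnable, biome-specific coefficients.

```python
import math
from typing import Sequence


def allometric_agb(height_m: float, log_a: float, b: float) -> float:
    """Assumed power-law allometry: AGB = a * H**b, with a = exp(log_a)."""
    return math.exp(log_a) * height_m ** b


def physics_loss(agb_preds: Sequence[float], height_preds: Sequence[float],
                 log_a: float, b: float) -> float:
    """Mean squared log-space gap between predicted AGB and the allometry
    evaluated on the model's own predicted height, pixel by pixel.
    Inputs must be strictly positive because of the logarithms."""
    gaps = [
        (math.log(agb) - (log_a + b * math.log(h))) ** 2
        for agb, h in zip(agb_preds, height_preds)
    ]
    return sum(gaps) / len(gaps)


# Hypothetical coefficients: a = 0.5, b = 2.0, so H = 10 m implies AGB = 50.
log_a, b = math.log(0.5), 2.0
consistent = physics_loss([50.0], [10.0], log_a, b)     # predictions obey the law
inconsistent = physics_loss([100.0], [10.0], log_a, b)  # AGB is twice the law
```

Because the residual is computed from predictions rather than labels, a term of this kind can be evaluated at every pixel, including sites that carry no label for either variable.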

What carries the argument

The Augmented IPW pseudo-outcome with stop-gradients on the propensity and imputation baseline, which recovers IPW-weighted stationary points under the joint physical constraints.
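
A small simulation illustrates why propensity weighting matters when labelled sites are a biased sample. The sketch below uses the textbook AIPW pseudo-outcome, m + r · (y − m) / π, with a known propensity and a deliberately imperfect imputation. The data-generating process, the variable names, and the assumption that selection depends only on an observed covariate are illustrative choices, not the paper's setup.

```python
import random

random.seed(0)
N = 20000

sites = []
for _ in range(N):
    x = random.random()                             # observed covariate in [0, 1)
    y = 50.0 + 200.0 * x + random.gauss(0.0, 10.0)  # true biomass-like target
    pi = 0.9 - 0.8 * x                              # high-y sites are rarely labelled
    r = 1 if random.random() < pi else 0            # 1 = label observed
    m = 40.0 + 180.0 * x                            # imperfect imputation baseline
    sites.append((y, pi, r, m))

# Target quantity: the mean over ALL sites (unknown in practice).
true_mean = sum(y for y, _, _, _ in sites) / N

# Naive estimate: average only the labelled sites.
observed = [y for y, _, r, _ in sites if r == 1]
naive_mean = sum(observed) / len(observed)

# IPW: reweight each labelled site by 1 / propensity.
ipw_mean = sum(r * y / pi for y, pi, r, _ in sites) / N

# AIPW: imputation everywhere, plus a reweighted residual where labelled.
aipw_mean = sum(m + r * (y - m) / pi for y, pi, r, m in sites) / N

print(f"true={true_mean:.1f} naive={naive_mean:.1f} "
      f"ipw={ipw_mean:.1f} aipw={aipw_mean:.1f}")
```

In this toy setting the naive mean sits well below the true mean because high-value sites are undersampled, while both weighted estimates land close to it. The AIPW form additionally uses the imputation to reduce variance, since only the residual y − m is divided by the propensity.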

If this is right

  • StruMPL yields lower AGB RMSE and bias than ablation variants and the closest published method on two ecologically distinct biomes.
  • The AIPW component reduces bias in high-AGB strata by approximately 54 percent in stratified analysis.
  • The architecture successfully integrates spaceborne lidar canopy structure with MNAR ground-plot biomass under disjoint supervision and known allometric constraints.
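
A stratified analysis of this kind can be sketched as follows. The helper names, the equal-count strata, and the toy predictor that shrinks towards the mean are assumptions for illustration; none of the numbers below come from the paper.

```python
import math
from typing import List, Sequence, Tuple


def stratified_errors(y_true: Sequence[float], y_pred: Sequence[float],
                      n_strata: int = 5) -> List[Tuple[float, float]]:
    """Return (bias, rmse) per equal-count stratum of the reference values.
    Assumes at least n_strata samples so that no stratum is empty."""
    order = sorted(range(len(y_true)), key=lambda i: y_true[i])
    results = []
    for k in range(n_strata):
        lo = k * len(order) // n_strata
        hi = (k + 1) * len(order) // n_strata
        errs = [y_pred[i] - y_true[i] for i in order[lo:hi]]
        bias = sum(errs) / len(errs)
        rmse = math.sqrt(sum(e * e for e in errs) / len(errs))
        results.append((bias, rmse))
    return results


def relative_bias_reduction(bias_baseline: float, bias_corrected: float) -> float:
    """Fraction by which the absolute bias shrinks relative to a baseline."""
    return (abs(bias_baseline) - abs(bias_corrected)) / abs(bias_baseline)


# Hypothetical data: a predictor that regresses towards the mean, so it
# overestimates low values and underestimates high ones.
y_true = [10.0 * k for k in range(1, 11)]
y_pred = [0.5 * y + 27.5 for y in y_true]
strata = stratified_errors(y_true, y_pred)

# Hypothetical top-stratum biases for a baseline and a corrected model.
reduction = relative_bias_reduction(-20.0, -10.0)
```

Reporting bias per stratum rather than overall matters because positive and negative errors at the two ends of the range can cancel in an aggregate bias figure.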

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same propensity-plus-imputation heads could be applied to other remote-sensing tasks that combine dense but unlabeled sensor data with sparse, biased ground truth.
  • Making the physics module itself learnable from data rather than fixed allometrics might allow transfer across more biomes without manual recalibration.
  • Stratified bias reduction observed here suggests similar weighting schemes could mitigate selection effects in other ecological mapping problems where high-value regions are undersampled.

Load-bearing premise

The Augmented IPW pseudo-outcome with stop-gradients on the propensity and imputation baseline enables joint optimisation to recover IPW-weighted stationary points while keeping the loss bounded.
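
One way to read this premise is through the textbook AIPW pseudo-outcome, written out below. The notation is an assumption, since the summary does not reproduce the paper's equations: f_θ is the regression head, m̂ the imputation head, π̂ the propensity head, r_i the label-observed indicator, and sg[·] the stop-gradient operator.

```latex
% Assumed notation; r_i y_i is taken to be 0 wherever the label is missing.
\begin{align}
\tilde{y}_i &= \mathrm{sg}[\hat{m}(x_i)]
  + \frac{r_i}{\mathrm{sg}[\hat{\pi}(x_i)]}\,
    \bigl(y_i - \mathrm{sg}[\hat{m}(x_i)]\bigr), \\
\mathcal{L}(\theta) &= \frac{1}{N}\sum_{i=1}^{N}
  \bigl(f_\theta(x_i) - \tilde{y}_i\bigr)^2, \\
\nabla_\theta \mathcal{L} &= \frac{2}{N}\sum_{i=1}^{N}
  \Bigl[\frac{r_i}{\hat{\pi}(x_i)}\bigl(f_\theta(x_i) - y_i\bigr)
  + \Bigl(1 - \frac{r_i}{\hat{\pi}(x_i)}\Bigr)
    \bigl(f_\theta(x_i) - \hat{m}(x_i)\bigr)\Bigr]\,
  \nabla_\theta f_\theta(x_i).
\end{align}
```

The first bracketed term is the IPW-weighted residual. The second has conditional mean zero whenever π̂(x) equals the true labelling probability, because E[r | x] = π(x). Under the stop-gradients, this loss sends no gradient to π̂ or m̂ through the pseudo-outcome, so it cannot be lowered by distorting either head.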

What would settle it

If an ablation that removes the stop-gradients produces unbounded loss or fails to recover the IPW-weighted stationary points on the same training distribution, while the full model remains stable, the necessity claim would be falsified.
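
The mechanical effect of removing the stop-gradient on the propensity can be checked numerically on a single labelled site. The sketch below uses the textbook AIPW squared loss and a central finite difference. The function names and the numeric values are hypothetical, and the sketch does not reproduce the paper's analysis.

```python
def aipw_loss(f: float, y: float, r: int, pi: float, m: float) -> float:
    """Squared error against the textbook AIPW pseudo-outcome."""
    pseudo = m + r * (y - m) / pi
    return (f - pseudo) ** 2


def d_loss_d_pi(f: float, y: float, r: int, pi: float, m: float,
                eps: float = 1e-6) -> float:
    """Central finite difference of the loss w.r.t. the propensity, i.e. the
    signal the propensity head would receive WITHOUT a stop-gradient."""
    return (aipw_loss(f, y, r, pi + eps, m)
            - aipw_loss(f, y, r, pi - eps, m)) / (2.0 * eps)


# Hypothetical labelled site: prediction 100, label 150, imputation 120.
f, y, r, pi, m = 100.0, 150.0, 1, 0.5, 120.0

leak = d_loss_d_pi(f, y, r, pi, m)

# Analytic check: dL/dpi = 2 * (f - pseudo) * r * (y - m) / pi**2.
pseudo = m + r * (y - m) / pi
analytic = 2.0 * (f - pseudo) * r * (y - m) / pi ** 2

# With a stop-gradient, pi is treated as a constant inside the loss, so the
# supervised term contributes exactly zero gradient to the propensity head.
with_stop_gradient = 0.0
```

Here the derivative is large and negative, so a gradient step would raise the propensity purely to pull the pseudo-outcome towards the current prediction, regardless of how likely the site was to be labelled. An ablation of the kind described above would test whether this leak destabilises training in practice.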

Figures

Figures reproduced from arXiv: 2605.19931 by Casey M. Ryan, Juan Alberto Molina-Valero, Reza M. Asiyabi, Steven Hancock, The SEOSAW Partnership.

Figure 1: Overview of StruMPL. A shared encoder feeds regression, imputation and propensity heads.
Figure 2: AGB RMSE (bars, left axis) and AGB bias (red markers, right axis) for all model configu…
Figure 3: Bias and RMSE on AGB across five quantiles for naive masked MSE, IPW, and AIPW
Figure 4: Propensity calibration on the Spain test set. Predicted…
Original abstract

Estimating forest aboveground biomass (AGB) from Earth observation combines two structurally incompatible label sources: spaceborne lidar provides canopy structure at millions of locations but no biomass estimate, and ground-based plots provide biomass at thousands of biased locations but no metrics of structure. No single training sample carries labels for all target variables, plot labels are missing not at random (MNAR), and biomass is linked to the structural variables by known but biome-specific allometric laws. We formalise this as multi-task dense regression under heterogeneous disjoint partial supervision with MNAR labels and inter-task physical constraints, and propose StruMPL to address it jointly. A shared encoder feeds per-variable regression, imputation, and propensity heads for spatial MNAR correction, and a learnable physics module that evaluates the inter-task constraint on the model's own predictions at every pixel. The supervised loss uses an Augmented IPW (AIPW) pseudo-outcome with stop-gradients on the propensity and on the imputation baseline; we show analytically and empirically that both are necessary for joint optimisation to recover IPW-weighted stationary points while keeping the loss bounded. On two ecologically distinct biomes, StruMPL outperforms ablation variants and the closest published method on AGB RMSE and bias, with a stratified analysis showing AIPW reduces high-AGB bias by ~54%.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The manuscript proposes StruMPL for multi-task dense regression of forest aboveground biomass (AGB) and structural variables under disjoint partial supervision with MNAR labels and inter-task allometric constraints. A shared encoder drives regression, imputation, and propensity heads; a learnable physics module enforces constraints on the model's own predictions at each pixel. The core technical contribution is an Augmented IPW (AIPW) pseudo-outcome loss that applies stop-gradients to the propensity and imputation heads, with an analytical claim that this recovers IPW-weighted stationary points while keeping the loss bounded. Empirical results on two biomes report outperformance versus ablations and prior methods on AGB RMSE and bias, including a stratified ~54% reduction in high-AGB bias.

Significance. If the AIPW stop-gradient construction is shown to isolate the IPW terms even after gradients from the differentiable physics module are included, the framework would provide a principled route to joint optimization under heterogeneous supervision and physical constraints. The reported bias reduction and ablation comparisons supply concrete evidence of practical utility for ecological remote-sensing tasks. The absence of explicit equations or proofs in the abstract, however, leaves the load-bearing analytical claim difficult to assess from the provided material.

major comments (1)
  1. [Abstract] The analytical claim that the AIPW pseudo-outcome with stop-gradients on the propensity and imputation heads recovers IPW-weighted stationary points while keeping the loss bounded does not address the fact that the learnable physics module evaluates inter-task constraints directly on the model's predictions and therefore passes gradients back into the regression, imputation, and propensity heads. No derivation or argument is supplied showing that the stop-gradient construction still isolates the IPW terms once this additional differentiable path is present.
minor comments (2)
  1. [Experiments] Dataset descriptions and label statistics for the two biomes are not detailed enough to allow independent verification of the stratified high-AGB bias analysis.
  2. [Experiments] Ablation tables should explicitly quantify performance when stop-gradients are removed from the propensity or imputation heads, rather than reporting only the full model versus generic variants.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive comment on the analytical claim in the abstract. The point regarding gradient flow from the learnable physics module is well taken, and we address it directly below. We will revise the manuscript to strengthen the presentation of the derivation.

Point-by-point responses
  1. Referee: the analytical claim that the AIPW pseudo-outcome with stop-gradients on the propensity and imputation heads recovers IPW-weighted stationary points while keeping the loss bounded does not address the fact that the learnable physics module evaluates inter-task constraints directly on the model's predictions and therefore passes gradients back into the regression, imputation, and propensity heads. No derivation or argument is supplied showing that the stop-gradient construction still isolates the IPW terms once this additional differentiable path is present.

    Authors: We appreciate the referee drawing attention to this interaction. The manuscript derives the stationary-point property for the AIPW term under stop-gradients on the propensity and imputation heads, but the current text does not explicitly re-derive the result after including the additional gradient path through the differentiable physics module. We will add a concise appendix derivation showing that the stop-gradients continue to isolate the IPW weighting for the supervised loss even when the physics loss back-propagates through the regression outputs: the physics term depends only on the regression predictions (not on the stopped propensity or imputation values inside the AIPW expression), so the overall gradient with respect to the propensity and imputation parameters retains the IPW-weighted form. The empirical ablations already include the physics module and confirm that removing the stop-gradients degrades performance, providing supporting evidence. We will also move the key equations from the abstract into the main text for clarity.

    Revision: yes

Circularity Check

0 steps flagged

No significant circularity; derivation self-contained

The paper's central technical claim is an analytical demonstration that the Augmented IPW pseudo-outcome with stop-gradients on propensity and imputation heads recovers IPW-weighted stationary points while bounding the loss under joint optimization. This demonstration is presented as internal to the manuscript (abstract states 'we show analytically and empirically'), with the learnable physics module introduced as an additional differentiable component rather than a redefinition of the IPW terms. No equations or steps reduce the claimed stationary-point recovery to a fitted parameter or self-citation by construction; the empirical gains are evaluated against external AGB benchmarks and ablation variants. The derivation therefore remains independent of its own fitted outputs and does not match any enumerated circularity pattern.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Review is limited to the abstract; no explicit free parameters, axioms, or invented entities can be audited in detail. The approach relies on domain knowledge of allometric laws and MNAR mechanisms but provides no enumeration of fitted values or unproven assumptions.

axioms (1)
  • domain assumption: Biomass is linked to structural variables by known but biome-specific allometric laws.
    Stated in the abstract as the basis for the inter-task physical constraints.

pith-pipeline@v0.9.0 · 5793 in / 1271 out tokens · 46451 ms · 2026-05-20T06:52:41.364585+00:00 · methodology
