arxiv: 2605.04413 · v1 · submitted 2026-05-06 · 💻 cs.LG · stat.ME


Counterfactual identifiability beyond global monotonicity: non-monotone triangular structural causal models

Pengcheng Tan, Jiang Chen, Dehui Du


Pith reviewed 2026-05-08 17:46 UTC · model grok-4.3


keywords: models, triangular, counterfactual, global, monotonicity, structural, causal, identifiability


The pith

Non-monotone triangular structural causal models achieve complete counterfactual identifiability through mechanism-wise invertibility and context-independent inverse transport.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that triangular structural causal models can recover all counterfactuals even without global monotonicity, provided each mechanism is invertible on its own and its inverse mapping stays the same regardless of the surrounding context. This matters for systems like robotic contact, where the same external noise can produce opposite effects depending on the current state, breaking the monotonicity that most prior identifiability proofs require. The authors replace global monotonicity with two local conditions, prove that those conditions are equivalent to the exogenous noise variables being isomorphic across possible worlds, and show that this equivalence delivers full counterfactual identifiability. They also supply a counterexample demonstrating that local invertibility without the context-independent transport condition is not enough. The theory is instantiated in a neural architecture called CausalInverter, which combines triangular invertible layers, orientation gates, and transport-stability regularization. It yields gains on synthetic non-monotonic mechanisms and on the MuJoCo Door task; on MuJoCo Push, where non-monotonicity is weaker, the abstract reports that low-data baseline predictors remain competitive or better.

Core claim

In non-monotone triangular structural causal models, mechanism-wise invertibility together with context-independent inverse transport is equivalent to exogenous isomorphism and therefore guarantees complete counterfactual identifiability.

What carries the argument

Mechanism-wise invertibility and context-independent inverse transport, which together replace global monotonicity while preserving the triangular recursion that allows unique inversion of each causal mechanism.
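To make the recursion concrete, here is a minimal sketch of abduction, action, and prediction in a two-variable triangular model whose second mechanism flips orientation with context. The functions `gate`, `m`, and `s` and the specific formulas are illustrative assumptions, not the paper's mechanisms, and the sketch does not check the paper's formal transport condition; it only shows that per-mechanism inversion along the causal order recovers the noise even though the response to `u2` is not globally monotone.

```python
# Hypothetical two-variable triangular SCM (illustrative, not from the paper):
#   v1 = u1
#   v2 = m(v1) + s(v1) * gate(v1) * u2
# The sign of dv2/du2 depends on v1, so the mechanism is not globally monotone in u2.

def gate(v1):
    """Assumed orientation gate: +1 for v1 >= 0, -1 otherwise."""
    return 1.0 if v1 >= 0 else -1.0

def m(v1):
    """Assumed location term."""
    return 0.5 * v1

def s(v1):
    """Assumed scale term, strictly positive for every context."""
    return 1.0 + v1 * v1

def forward(u1, u2, do_v1=None):
    """Run the mechanisms in causal order; do_v1 overrides the first mechanism."""
    v1 = u1 if do_v1 is None else do_v1
    v2 = m(v1) + s(v1) * gate(v1) * u2
    return v1, v2

def abduct(v1, v2):
    """Recover the exogenous noise by inverting each mechanism in causal order."""
    u1 = v1
    u2 = (v2 - m(v1)) / (s(v1) * gate(v1))
    return u1, u2

def counterfactual(v1_obs, v2_obs, do_v1):
    """Abduction, then action (do_v1), then prediction with the recovered noise."""
    u1, u2 = abduct(v1_obs, v2_obs)
    return forward(u1, u2, do_v1=do_v1)

# Observed world: u = (1.0, 0.25) gives (v1, v2) = (1.0, 1.0).
print(forward(1.0, 0.25))                      # (1.0, 1.0)
# Counterfactual world: same noise, but v1 is set to -1.0.
print(counterfactual(1.0, 1.0, do_v1=-1.0))    # (-1.0, -1.0)
```

The inversion in `abduct` is unique only because the gate is treated as known and fixed; that is the role the summary assigns to the transport condition.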

If this is right

The stated conditions are equivalent to exogenous isomorphism.
Local invertibility by itself does not suffice for identifiability.
Counterfactual recovery improves systematically as non-monotonicity increases when the structural bias is enforced.
Perfect event-level counterfactual recovery is achieved on the MuJoCo Door environment.
On the same task, continuous angle error is lower than for a Transformer baseline, and recovery is more stable than for Transformer and conditional-flow predictors.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The result opens a middle ground between globally monotone triangular models and fully unconstrained black-box predictors, suggesting that many embodied interaction systems may still be identifiable with modest structural constraints.
Enforcing transport stability during training could serve as a practical regularizer for counterfactual reasoning in robotics and control even when the underlying dynamics are known to be non-monotonic.
The same local conditions might be testable in non-triangular or cyclic causal graphs, where the triangular ordering is absent but per-mechanism invertibility remains feasible.
In regimes where non-monotonicity is weak the advantage shrinks, indicating that practitioners can choose between the structural bias and simpler models according to the observed degree of non-monotonicity.

Load-bearing premise

Every mechanism must be invertible with an inverse whose transport does not change with context; if either property fails for even one mechanism the equivalence to exogenous isomorphism and the identifiability result both collapse.
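A small illustration of why invertibility in each context is not enough on its own. The sketch below is a textbook-style construction, not the paper's counterexample: two hypothetical models, `mech_a` and `mech_b`, are both invertible in `u2` for every value of `v1`, and with symmetric noise (assumed standard normal here) they induce the same conditional distribution of `v2` given `v1`. They nevertheless disagree on the counterfactual, because nothing ties the orientation of the inverse in one context to the orientation in another.

```python
import random

def mech_a(v1, u2):
    """Model A: v2 = u2, same orientation in every context."""
    return u2

def mech_b(v1, u2):
    """Model B: v2 = u2 when v1 >= 0, and v2 = -u2 otherwise."""
    return u2 if v1 >= 0 else -u2

def inv_a(v1, v2):
    """Inverse of model A's mechanism in the context v1."""
    return v2

def inv_b(v1, v2):
    """Inverse of model B's mechanism in the context v1."""
    return v2 if v1 >= 0 else -v2

def counterfactual(mech, inv, v1_obs, v2_obs, v1_cf):
    """Abduct u2 in the observed context, then replay it in the new context."""
    u2 = inv(v1_obs, v2_obs)
    return mech(v1_cf, u2)

def sample_v2(mech, v1, n=20000, seed=0):
    """Draw v2 under do(v1) with symmetric noise u2 ~ N(0, 1)."""
    rng = random.Random(seed)
    return [mech(v1, rng.gauss(0.0, 1.0)) for _ in range(n)]

def moments(xs):
    """First and second empirical moments."""
    mean = sum(xs) / len(xs)
    second = sum(x * x for x in xs) / len(xs)
    return mean, second

# The two models look alike in the flipped context, up to sampling noise.
mean_a, second_a = moments(sample_v2(mech_a, -1.0))
mean_b, second_b = moments(sample_v2(mech_b, -1.0))

# The same observation (v1=1.0, v2=0.5) yields different counterfactuals at v1=-1.0.
cf_a = counterfactual(mech_a, inv_a, 1.0, 0.5, -1.0)   # 0.5
cf_b = counterfactual(mech_b, inv_b, 1.0, 0.5, -1.0)   # -0.5
print(cf_a, cf_b)
```

Observational or interventional samples cannot separate the two models, so an additional constraint on how inverses line up across contexts is needed before the counterfactual is pinned down.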

What would settle it

A concrete triangular SCM in which every mechanism satisfies invertibility and context-independent inverse transport yet at least one counterfactual query still admits multiple distinct solutions.

Figures

Figures reproduced from arXiv: 2605.04413 by Dehui Du, Jiang Chen, Pengcheng Tan.

**Figure 1.** From fixed orientation to stable exogenous alignment.

**Figure 2.** Synthetic main panel. … configurations within each mechanism family, we find that on both threshold-flip and smooth-flip mechanisms, our CF-MSE is significantly lower than both TM-SCM Quantile and BSCM-Flow Contextual (all p < 10^{-10}), with bootstrap confidence intervals for the mean difference staying strictly below zero; see Appendix

**Figure 3.** Door main results. Formal multi-seed Push evaluation. To verify that the Push conclusion is not a single-split accident, we rerun the formal balanced protocol with seeds {7, 17, 27}, producing 96 counterfactual queries. This robustness replication follows the previously stress-tested linear/context comparison rather than re-training every stronger neural baseline across all seeds. Our method attains CF box…

**Figure 4.** The logical structure of the theory

**Figure 5.** Stability results beyond the single main split.

**Figure 6.** Per-task main result visualizations

**Figure 7.** Door per-query continuous error versus final threshold margin for three representative

**Figure 8.** Stage-wise and mode-conditioned diagnostics.

**Figure 9.** Physical ablations for CausalInverter

**Figure 10.** Counterfactual storyboards for Push and Door.

**Figure 11.** Predicted versus environment-truth counterfactuals.

**Figure 12.** Continuous synthetic bridge from globally monotone to strongly non-monotone mecha

**Figure 13.** Synthetic ablations for Experiment 1.1

Original abstract

Structural causal models provide a unified semantics for interventions and counterfactuals, but most identifiability results rely on restrictive assumptions like global monotonicity, which are often violated in embodied interaction, where the same exogenous perturbation can induce opposite responses under different contact contexts. We ask what structure still suffices once global monotonicity is dropped. We introduce non-monotone triangular structural causal models (NM-TM-SCM), which retain triangular recursion but replace global monotonicity with mechanism-wise invertibility and context-independent inverse transport. We prove that these conditions are equivalent to exogenous isomorphism and imply complete counterfactual identifiability, and we give a counterexample showing that local invertibility alone is insufficient. We instantiate the theory in CausalInverter, with triangular invertible layers, orientation gates, and transport-stability regularization. On synthetic non-monotonic mechanisms, the structural bias yields systematic counterfactual gains as non-monotonicity increases. On MuJoCo Door, our model achieves perfect event-level counterfactual recovery, lowers continuous angle error relative to a Transformer baseline, and delivers substantially more stable recovery than Transformer and conditional-flow predictors. On MuJoCo Push, where non-monotonicity is weaker, the same low-data predictors remain competitive or better, consistent with a bias-variance boundary. These results identify a broader identifiable regime between globally monotone triangular models and unconstrained black-box world models.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper relaxes global monotonicity in triangular SCMs to per-mechanism invertibility plus context-independent transport and proves this yields full counterfactual identifiability.


The main takeaway is that triangular SCMs can drop global monotonicity and still deliver complete counterfactual identifiability. The authors replace it with mechanism-wise invertibility and a context-independent inverse transport condition, show these are equivalent to exogenous isomorphism, and supply a counterexample where local invertibility alone fails. That equivalence and the counterexample are the clearest new pieces relative to the monotone triangular literature. The CausalInverter implementation with triangular invertible layers, orientation gates, and transport-stability regularization is a direct way to put the conditions into practice. On synthetic non-monotonic mechanisms the structural bias improves counterfactual recovery as non-monotonicity rises. On MuJoCo Door it reaches perfect event-level recovery and more stable continuous predictions than Transformer or conditional-flow baselines; on Push, where non-monotonicity is milder, the advantage narrows as expected. The theory is grounded in the stated assumptions rather than data-dependent fitting, which is a strength. The main soft spot is that the reported MuJoCo gains lack error bars, data-split details, or significance tests in the abstract, so the practical magnitude is hard to judge precisely. The context-independent transport requirement is also strong and may need case-by-case checking in noisier settings. Readers working on causal models for robotics or physical systems will find the identifiable regime useful. It sits between the restrictive monotone triangular models and fully black-box predictors without claiming more than the assumptions support. I would send this to peer review because the theoretical claim is sharp and the experiments target the relevant tasks.

Referee Report

0 major / 3 minor

Summary. The paper introduces non-monotone triangular structural causal models (NM-TM-SCMs) that retain triangular recursion but replace global monotonicity with mechanism-wise invertibility and context-independent inverse transport. It proves these conditions are equivalent to exogenous isomorphism (hence complete counterfactual identifiability), supplies a counterexample showing local invertibility alone is insufficient, and instantiates the theory in CausalInverter (triangular invertible layers, orientation gates, transport-stability regularization). Experiments on synthetic non-monotonic mechanisms and MuJoCo Door/Push tasks report improved counterfactual recovery that scales with non-monotonicity.

Significance. If the equivalence holds, the result meaningfully enlarges the identifiable regime for counterfactuals in SCMs, covering non-monotonic mechanisms that arise in embodied interaction and robotics. The formal proof plus counterexample, together with the reproducible bias-variance pattern on MuJoCo, constitute a clear advance over globally monotone triangular models while remaining more structured than black-box predictors.

minor comments (3)

[Experiments] Experiments section: the MuJoCo Door and Push results state 'perfect event-level counterfactual recovery' and 'substantially more stable recovery' but omit the number of independent runs, error bars or confidence intervals, train/validation/test splits, and any statistical significance tests against the Transformer and conditional-flow baselines.
[Method / CausalInverter] The definition of 'event-level' recovery and the precise form of the transport-stability regularization term are not fully specified in the provided text; adding these would improve reproducibility.
[Synthetic experiments] The abstract claims systematic gains 'as non-monotonicity increases,' yet the main text should include a quantitative plot or table relating the degree of non-monotonicity (e.g., via a controlled parameter) to the observed counterfactual error reduction.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive summary, significance assessment, and recommendation of minor revision. No specific major comments were raised in the report.

Circularity Check

0 steps flagged

No significant circularity; derivation is self-contained


The central result is a mathematical proof establishing equivalence between mechanism-wise invertibility plus context-independent inverse transport in NM-TM-SCMs and exogenous isomorphism, which yields complete counterfactual identifiability. This equivalence is derived directly from the stated structural assumptions on triangular recursion without any reduction to fitted parameters, data-dependent predictions, or self-referential definitions. The provided counterexample for insufficiency of local invertibility alone is an independent construction. No load-bearing self-citations, smuggled ansatzes, or renaming of known results appear in the derivation chain. Experimental sections use regularization on synthetic and MuJoCo data but do not underpin the theoretical identifiability claim, which remains independent of those fits.

Axiom & Free-Parameter Ledger

0 free parameters · 3 axioms · 2 invented entities

The central claim rests on retaining triangular recursion while introducing two new domain assumptions that replace global monotonicity; these assumptions are not derived but posited to restore identifiability.

axioms (3)

Triangular recursion structure of the SCM (standard math). Inherited from prior triangular SCM literature and retained without change.
Mechanism-wise invertibility for each structural equation (domain assumption). New weakening of global monotonicity; invoked to enable noise recovery.
Context-independent inverse transport (domain assumption). Required for the equivalence to exogenous isomorphism; stated as necessary in the abstract.

invented entities (2)

NM-TM-SCM (no independent evidence). Purpose: model class that relaxes global monotonicity while preserving identifiability. Newly defined class whose properties are proven in the paper.
CausalInverter (no independent evidence). Purpose: neural implementation using triangular invertible layers and orientation gates. Practical instantiation of the theoretical model.



Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

Cost.FunctionalEquation / Foundation.AlphaCoordinateFixation · washburn_uniqueness_aczel · unclear

v_i = m(v_{<i}) + s(v_{<i})·q(u_i; v_{<i}) with learned λ_cyc, λ_tr, λ_ori and ridge regularizers 2e-3, 1e-2, ...

Foundation.Atomicity / Foundation.ArithmeticFromLogic · topoSort_respects · echoes

echoes: this paper passage has the same mathematical shape or conceptual pattern as the Recognition theorem, but is not a direct formal dependency.

Theorem 1 proof: triangular induction along the causal order; injectivity/surjectivity of Γ_M by recursive inversion
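The recursive inversion mentioned in this passage can be written out for the location-scale form quoted in the first entry above. This is a sketch under assumptions the quoted passages do not state: that s(v_{<i}) > 0 for every context, and that q(·; v_{<i}) is a bijection in its first argument. The superscript "cf" marks counterfactual values and is notation introduced here for illustration.

```latex
\begin{aligned}
v_i &= m(v_{<i}) + s(v_{<i})\, q(u_i;\, v_{<i})
  && \text{(forward mechanism)} \\
u_i &= q^{-1}\!\left( \frac{v_i - m(v_{<i})}{s(v_{<i})};\; v_{<i} \right)
  && \text{(abduction, given } v_{<i} \text{ already known)} \\
v_i^{\mathrm{cf}} &= m\!\left(v_{<i}^{\mathrm{cf}}\right)
  + s\!\left(v_{<i}^{\mathrm{cf}}\right) q\!\left(u_i;\, v_{<i}^{\mathrm{cf}}\right)
  && \text{(prediction with the recovered } u_i \text{)}
\end{aligned}
```

Applying the second line for i = 1, 2, ... in causal order recovers every u_i from an observed v, since each step depends only on coordinates that come earlier in the ordering; this is the shape of the triangular induction, not a reproduction of the paper's proof.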

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
