pith. machine review for the scientific record.

arXiv: 2605.06288 · v1 · submitted 2026-05-07 · 📊 stat.ME · cs.AI


A Topological Sorting Criterion for Random Causal Directed Acyclic Graphs

Alexander G. Reisach, Antoine Chambaz, Gilles Blanchard, Sebastian Weichwald

Pith reviewed 2026-05-08 07:23 UTC · model grok-4.3

classification 📊 stat.ME · cs.AI
keywords causal discovery · directed acyclic graphs · random graphs · topological sorting · Markov equivalence class · Erdős-Rényi · scale-free networks · causal order

The pith

In standard random causal DAGs, the number of relatives increases monotonically along the causal order.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that DAGs generated by first imposing a topological order and then sampling edges from Erdős-Rényi or scale-free models have a built-in pattern: the set of nodes reachable via open paths, called relatives, grows larger the further one moves along the causal order. This monotonic growth appears frequently in the random graphs used to benchmark causal discovery methods. As a result, estimating the number of relatives from data and sorting nodes by that estimate often recovers the true causal order. When the increase is strict, the graph has a unique Markov equivalence class. The authors propose time-series DAG sampling as one way to generate alternatives that may lack this pattern.

Core claim

In DAGs generated by imposing an order on Erdős-Rényi and scale-free random graphs, the set of nodes reachable via open paths, termed relatives, increases monotonically along the causal order. Sorting by the estimated number of relatives recovers the causal order. A strict increase of relatives along the causal order leads to a singular Markov equivalence class. Sampling time-series DAGs is proposed as a possible alternative generation method.

What carries the argument

The set of relatives (nodes reachable via open paths) and its monotonic increase in count along the imposed causal order in these random DAGs.

If this is right

  • Sorting nodes by estimated number of relatives provides an effective proxy for recovering the causal order in many common simulation settings.
  • A strict monotonic increase in relatives implies the Markov equivalence class is singular.
  • The pattern is prevalent under standard procedures for generating random causal DAGs.
  • Time-series DAG sampling offers an alternative that may avoid this monotonic property.
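Sorting by relatives as an order-recovery proxy can be scored in the style of the sortability metrics this line of work uses. Below is a sketch of such a score computed from per-node relative counts; it is an illustrative pairwise version, not necessarily the paper's exact rel-sortability definition:

```python
def rel_sortability(rel_counts):
    """Fraction of pairs (i, j), with i preceding j in the true causal
    order, whose relative counts do not contradict that order, i.e.
    rel_counts[i] <= rel_counts[j]. A value of 1.0 means sorting nodes
    by their relative count can recover the causal order."""
    n = len(rel_counts)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    ok = sum(1 for i, j in pairs if rel_counts[i] <= rel_counts[j])
    return ok / len(pairs)
```

For example, counts that strictly increase along the order give a score of 1.0, while counts that strictly decrease give 0.0; ties never count against the order.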

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Causal discovery algorithms tested on these DAGs may benefit from the implicit ordering information carried by relative counts, affecting how performance is interpreted.
  • Simple sorting on estimated relatives could serve as a baseline comparator for more complex causal order methods.
  • Using generation procedures without this monotonicity would create stricter tests for causal discovery algorithms.

Load-bearing premise

The monotonic increase in relatives is a reliable property of standard random DAG generation by imposing order then sampling edges, and the number of relatives can be accurately estimated from finite observational data.
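The estimation half of this premise can be made concrete: under faithfulness, marginal dependence marks exactly the relatives, so a crude plug-in estimator counts the variables whose marginal correlation with a node clears a threshold. A toy linear Gaussian chain illustrates the idea; the model, threshold, and test choice here are ours, not the paper's:

```python
import math
import random

def simulate_chain_scm(n_samples, rng, beta=2.0):
    """Toy linear Gaussian SCM on the chain X0 -> X1 -> X2
    (a hypothetical model for illustration, not the paper's setup)."""
    x0 = [rng.gauss(0, 1) for _ in range(n_samples)]
    x1 = [beta * a + rng.gauss(0, 1) for a in x0]
    x2 = [beta * b + rng.gauss(0, 1) for b in x1]
    return [x0, x1, x2]

def corr(a, b):
    """Pearson correlation of two equal-length samples."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = math.sqrt(sum((x - ma) ** 2 for x in a))
    sb = math.sqrt(sum((y - mb) ** 2 for y in b))
    return cov / (sa * sb)

def estimated_relatives(data, v, thresh=0.2):
    """Plug-in estimate: under faithfulness, marginal dependence with X_v
    marks a relative, so count variables whose |correlation| with X_v
    clears the (arbitrarily chosen) threshold."""
    return sum(
        1
        for u in range(len(data))
        if u != v and abs(corr(data[u], data[v])) > thresh
    )
```

In the chain, every pair of variables shares an ancestor, so with enough samples each node's estimated relative count is 2; how this estimator degrades with small samples or near-unfaithful coefficients is exactly what the premise leaves open.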

What would settle it

A generated random DAG from the Erdős-Rényi or scale-free procedure where the number of relatives does not increase monotonically along the imposed order, or data showing that sorting by estimated relatives fails to recover the order accurately.
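The first falsification route can be automated: sample order-imposed ER DAGs and check whether the relative counts ever fail to be non-decreasing. A sketch of such a harness, using the shared-ancestor reading of relatives and a forward dynamic program over the imposed order (helper names are ours):

```python
import itertools
import random

def relatives_monotone(n, p, rng):
    """Sample one order-imposed ER DAG and report whether the number of
    relatives is non-decreasing along the imposed order 0 < 1 < ... < n-1."""
    edges = {(i, j) for i, j in itertools.combinations(range(n), 2) if rng.random() < p}
    an = {}
    for v in range(n):  # edges only point forward, so parent sets are already built
        s = {v}
        for q in range(v):
            if (q, v) in edges:
                s |= an[q]
        an[v] = s
    rel = [sum(1 for u in range(n) if u != v and an[u] & an[v]) for v in range(n)]
    return all(rel[i] <= rel[i + 1] for i in range(n - 1))

if __name__ == "__main__":
    rng = random.Random(1)
    trials = [relatives_monotone(8, 0.3, rng) for _ in range(500)]
    print(sum(trials) / len(trials))  # empirical prevalence of the pattern
```

A single sampled graph returning False would be the counterexample asked for; the empty and complete DAGs are trivially monotone (all counts equal), so any failure must come from intermediate densities.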

Figures

Figures reproduced from arXiv: 2605.06288 by Alexander G. Reisach, Antoine Chambaz, Gilles Blanchard, Sebastian Weichwald.

Figure 1: Rel-sortability of random ER and SF DAGs with …
Figure 2: Comparison of rel-sortability to other sortabilities in ER DAGs.
Figure 3: Comparative SID (lower is better) performance of rel-SortnRegress on ER DAGs.
Figure 4: A directed and cyclic summary graph and corresponding time-unrolled DAG for …
Figure 5: Comparison of rel-sortability to other sortabilities in SF DAGs.
Figure 6: Comparative SID (lower is better) performance of rel-SortnRegress on SF DAGs.
Figure 7 compares the different sortabilities on data from non-standardized SCMs as discussed in Section 2.2. We observe that sortability by variance is extremely high for ER and SF DAGs, corroborating the findings in Reisach et al. (2021), and suggesting information in the variance as an explanation for the performance difference of the DAGMA algorithm between the standardized settings (sSCM) in Figures 3a and 6a …
Original abstract

Random directed acyclic graphs (DAGs) based on imposing an order on Erdős-Rényi and scale-free random graphs are widely used for evaluating causal discovery algorithms. We show that in such DAGs, the set of nodes reachable via open paths, termed relatives, increases monotonically along the causal order. We assess the prevalence of this pattern numerically, and demonstrate that it can be exploited for causal order recovery via sorting by the estimated number of relatives. We note that many simulations in the literature feature settings where this yields an excellent proxy for the causal order, and show that a strict increase of relatives along the causal order leads to a singular Markov equivalence class. We propose sampling time-series DAGs as a possible alternative and discuss implications for causal discovery algorithms and their evaluation on synthetic data.
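The proposed alternative can be illustrated with a minimal unrolling construction: a lag-1 summary graph, even a cyclic one as in Figure 4, unrolls into a time-indexed graph whose edges all point forward in time and which is therefore acyclic. This construction is our illustrative reading of time-series DAG sampling, not necessarily the paper's exact procedure:

```python
def unroll(summary_edges, T):
    """Unroll lag-1 summary edges (a -> b read as a_t -> b_{t+1}) into a
    graph over time-indexed nodes (name, t). The summary graph may be
    cyclic, yet every unrolled edge increases the time index by one, so
    the unrolled graph is a DAG by construction."""
    return {((a, t), (b, t + 1)) for a, b in summary_edges for t in range(T - 1)}
```

For instance, the cyclic summary graph A -> B, B -> A over three time steps unrolls into four edges, each crossing from step t to step t+1; the time index itself witnesses acyclicity.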

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, and this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper claims that in random causal DAGs generated by imposing a topological order on Erdős-Rényi or scale-free graphs, the set of nodes reachable via open paths (termed 'relatives') increases monotonically along the causal order. It numerically assesses the prevalence of this monotonicity, shows that sorting nodes by the estimated number of relatives can recover the causal order (and is an excellent proxy in many literature simulations), proves that strict monotonicity implies a singular Markov equivalence class, proposes time-series DAG sampling as an alternative generation method, and discusses implications for evaluating causal discovery algorithms.

Significance. If the monotonicity property holds in the population and the number of relatives can be reliably estimated, the result would identify a structural bias in standard synthetic DAG generators that makes causal order recovery trivial in many simulation settings, thereby improving the design and interpretation of benchmarks for causal discovery methods. The suggestion of time-series alternatives and the Markov equivalence result are useful contributions to the literature on random graph models for causal inference.

major comments (2)
  1. [numerical assessment section] The central claim that sorting by the estimated number of relatives recovers the causal order (abstract and numerical assessment section) is load-bearing on accurate recovery of d-connections from finite observational data, but the manuscript provides no details on the conditional independence tests used, sample sizes, or robustness to test errors and faithfulness violations; this leaves the practical utility of the sorting procedure unverified even when the population monotonicity holds.
  2. [Markov equivalence section] While the proof that strict increase of relatives along the order yields a singular equivalence class is noted, the manuscript does not quantify how often strict monotonicity occurs under the ER/scale-free generators or discuss whether the result extends beyond the specific random graph models considered.
minor comments (2)
  1. [abstract] The abstract states a numerical assessment of prevalence but the manuscript should include a table or figure with exact simulation parameters (number of nodes, edge probabilities, number of replicates) to allow reproducibility.
  2. [introduction] Notation for 'relatives' and 'open paths' should be defined more formally with reference to d-separation in the main text before the numerical results.

Simulated Author's Rebuttal

2 responses · 0 unresolved

Thank you for the constructive referee report. We address each major comment below and will revise the manuscript accordingly to improve clarity and completeness.

read point-by-point responses
  1. Referee: The central claim that sorting by the estimated number of relatives recovers the causal order (abstract and numerical assessment section) is load-bearing on accurate recovery of d-connections from finite observational data, but the manuscript provides no details on the conditional independence tests used, sample sizes, or robustness to test errors and faithfulness violations; this leaves the practical utility of the sorting procedure unverified even when the population monotonicity holds.

    Authors: We agree that details on finite-sample estimation are needed to substantiate practical utility. The current numerical assessment demonstrates prevalence of the monotonicity property using true population d-connections, with sorting presented as a conceptual exploitation. We will revise the numerical assessment section to specify the conditional independence tests (e.g., partial correlation tests), sample sizes, and add simulations assessing robustness to test errors and faithfulness violations. revision: yes

  2. Referee: While the proof that strict increase of relatives along the order yields a singular equivalence class is noted, the manuscript does not quantify how often strict monotonicity occurs under the ER/scale-free generators or discuss whether the result extends beyond the specific random graph models considered.

    Authors: The numerical assessment already evaluates prevalence of monotonicity for ER and scale-free generators; we will revise to explicitly report the frequency of strict monotonicity. The Markov equivalence proof is general for any DAG with strict monotonicity in relatives and does not rely on the specific generators. We will add discussion clarifying this generality while noting time-series sampling as one alternative to avoid the bias in practice. revision: yes

Circularity Check

0 steps flagged

No significant circularity; derivation is self-contained from graph generation process

full rationale

The paper establishes a monotonicity property of the 'relatives' set (nodes reachable via open paths) directly from the standard random DAG generation procedure (impose topological order, then sample edges via ER or scale-free models). This is shown analytically for the population case and assessed via numerical prevalence checks on generated graphs. The proposal to sort by estimated relatives count for order recovery follows as an application, without any step where a fitted parameter or self-citation is redefined as the output. Estimation from data is discussed as a practical step but does not enter the core derivation by construction. No self-definitional loops, fitted-input predictions, or load-bearing self-citations appear in the chain.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The analysis relies on standard definitions from causal graphical models and random graph theory; no new free parameters or invented entities are introduced in the abstract.

axioms (1)
  • domain assumption Random causal DAGs are generated by first imposing a total order on nodes and then sampling edges from Erdős-Rényi or scale-free models while respecting the order.
    This is the standard construction used in the causal discovery simulation literature referenced in the abstract.

pith-pipeline@v0.9.0 · 5437 in / 1482 out tokens · 44816 ms · 2026-05-08T07:23:14.269215+00:00 · methodology


Reference graph

Works this paper leans on

76 extracted references

  1. Bryan Andrews and Erich Kummerfeld. Better Simulations for Validating Causal Discovery with the DAG-Adaptation of the Onion Method. 2024 (cit. on pp. 2, 6).
  2. Charles Assaad. Causal Discovery between time series. PhD thesis. Université Grenoble Alpes, July 2021 (cit. on p. 8).
  3. Christine W. Bang and Vanessa Didelez. Do we become wiser with time? On causal equivalence with tiered background knowledge. In: Uncertainty in Artificial Intelligence. PMLR, 2023, pp. 119–129 (cit. on p. 8).
  4. Albert-László Barabási and Réka Albert. Emergence of Scaling in Random Networks. In: Science 286.5439 (1999), pp. 509–512 (cit. on p. 1).
  5. Kevin Bello, Bryon Aragam, and Pradeep Ravikumar. DAGMA: Learning DAGs via M-matrices and a Log-Determinant Acyclicity Characterization. In: Advances in Neural Information Processing Systems 35 (2022), pp. 8226–8239 (cit. on p. 6).
  6. Gilles Blanchard, Nicolas Curien, Klara Krause, and Alexander G. Reisach. A phase transition in Erdős-Barak random graphs. arXiv preprint. 2025 (cit. on p. 2).
  7. Philippe Brouillard, Chandler Squires, Jonas Wahl, Konrad P. Kording, Karen Sachs, Alexandre Drouin, and Dhanya Sridhar. The Landscape of Causal Discovery Data: Grounding Causal Discovery in Real-World Applications. In: Causal Learning and Reasoning (CLeaR). 2025 (cit. on pp. 1, 8).
  8. Nicolas Broutin, Nina Kamcev, and Gábor Lugosi. Increasing paths in random temporal graphs. In: The Annals of Applied Probability 34.6 (2024), pp. 5498–5521 (cit. on p. 2).
  9. Yuxiao Cheng, Ziqian Wang, Tingxiong Xiao, Qin Zhong, Jinli Suo, and Kunlun He. CausalTime: Realistically Generated Time-series for Benchmarking of Causal Discovery. In: The Twelfth International Conference on Learning Representations. 2023 (cit. on p. 8).
  10. Philip A. Dawid. Beware of the DAG! In: Causality: Objectives and Assessment. PMLR, 2010, pp. 59–86 (cit. on p. 2).
  11. Paul Erdős, Alfréd Rényi, et al. On the Evolution of Random Graphs. In: Publ. Math. Inst. Hung. Acad. Sci 5.1 (1960), pp. 17–60 (cit. on p. 1).
  12. Juan L. Gamella, Jonas Peters, and Peter Bühlmann. Causal chambers as a real-world physical testbed for AI methodology. In: Nature Machine Intelligence 7.1 (2025), pp. 107–118 (cit. on p. 8).
  13. Dan Geiger, Thomas Verma, and Judea Pearl. Identifying independence in Bayesian networks. In: Networks 20.5 (1990), pp. 507–534 (cit. on p. 2).
  14. Amanda Gentzel, Dan Garant, and David Jensen. The Case for Evaluating Causal Models Using Interventional Measures and Empirical Data. In: Advances in Neural Information Processing Systems (NeurIPS). Vol. 32. 2019 (cit. on p. 1).
  15. Konstantin Göbler, Tobias Windisch, Mathias Drton, Tim Pychynski, Martin Roth, and Steffen Sonntag. causalAssembly: Generating Realistic Production Data for Benchmarking Causal Discovery. In: Causal Learning and Reasoning. PMLR, 2024, pp. 609–642 (cit. on p. 8).
  16. Christina Heinze-Deml, Marloes H. Maathuis, and Nicolai Meinshausen. Causal Structure Learning. In: Annual Review of Statistics and Its Application 5 (2018), pp. 371–391 (cit. on p. 1).
  17. Rebecca J. Herman, Jonas Wahl, Urmi Ninad, and Jakob Runge. Unitless Unrestricted Markov-Consistent SCM Generation: Better Benchmark Datasets for Causal Discovery. In: Causal Learning and Reasoning. PMLR, 2025, pp. 1506–1531 (cit. on pp. 2, 6, 8).
  18. Patrik O. Hoyer, Dominik Janzing, Joris M. Mooij, Jonas Peters, Bernhard Schölkopf, et al. Nonlinear causal discovery with additive noise models. In: Advances in Neural Information Processing Systems. Vol. 21. Citeseer, 2008, pp. 689–696 (cit. on p. 5).
  19. Antti Hyttinen, Frederick Eberhardt, and Patrik O. Hoyer. Learning Linear Cyclic Causal Models with Latent Variables. In: Journal of Machine Learning Research 13.109 (2012), pp. 3387–3439 (cit. on p. 8).
  20. Erik Jahn, Frederick Eberhardt, and Leonard J. Schulman. Lower bounds on the size of Markov equivalence classes. In: arXiv preprint (2025) (cit. on p. 7).
  21. Neville K. Kitson, Anthony C. Constantinou, Zhigao Guo, Yang Liu, and Kiattikun Chobtham. A survey of Bayesian Network structure learning. In: Artificial Intelligence Review (2023), pp. 1–94 (cit. on p. 1).
  22. Christopher Meek. Causal inference and causal explanation with background knowledge. In: Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence. UAI'95. Morgan Kaufmann Publishers Inc., 1995, pp. 403–410 (cit. on p. 7).
  23. Joris M. Mooij, Dominik Janzing, Tom Heskes, and Bernhard Schölkopf. On Causal Discovery with Cyclic Additive Noise Models. In: Advances in Neural Information Processing Systems (NeurIPS). Vol. 24. 2011 (cit. on p. 8).
  24. Radha Nagarajan and Marco Scutari. Benchmarking Constraint-Based Bayesian Structure Learning Algorithms: Role of Network Topology. 2025 (cit. on p. 2).
  25. Weronika Ormaniec, Scott Sussex, Lars Lorch, Bernhard Schölkopf, and Andreas Krause. Standardizing Structural Causal Models. In: International Conference on Learning Representations. 2025 (cit. on pp. 2, 6).
  26. Judea Pearl. Causality: Models, Reasoning, and Inference. Cambridge University Press, 2009 (cit. on pp. 1, 2, 5).
  27. Jonas Peters and Peter Bühlmann. Structural Intervention Distance for Evaluating Causal Graphs. In: Neural Computation 27.3 (2015), pp. 771–799 (cit. on p. 7).
  28. Jonas Peters, Dominik Janzing, and Bernhard Schölkopf. Causal Inference on Time Series using Restricted Structural Equation Models. In: Advances in Neural Information Processing Systems (NeurIPS). Vol. 26. 2013 (cit. on p. 8).
  29. Jonas Peters, Dominik Janzing, and Bernhard Schölkopf. Elements of Causal Inference: Foundations and Learning Algorithms. MIT Press, 2017 (cit. on p. 2).
  30. Alexander G. Reisach, Christof Seiler, and Sebastian Weichwald. Beware of the Simulated DAG! Causal Discovery Benchmarks May Be Easy to Game. In: Advances in Neural Information Processing Systems (NeurIPS). Vol. 34. 2021 (cit. on pp. 1, 3, 5, 7).
  31. Alexander G. Reisach, Alberto Suárez, Sebastian Weichwald, and Antoine Chambaz. The Case for Time in Causal DAGs. In: Philosophy of Science (2026) (cit. on p. 8).
  32. Alexander G. Reisach, Myriam Tami, Christof Seiler, Antoine Chambaz, and Sebastian Weichwald. A Scale-Invariant Sorting Criterion to Find a Causal Order in Additive Noise Models. In: Advances in Neural Information Processing Systems (NeurIPS). Vol. 36. 2023 (cit. on pp. 2, 3, 5, 6).
  33. Paul Rolland, Volkan Cevher, Matthäus Kleindessner, Chris Russell, Dominik Janzing, Bernhard Schölkopf, and Francesco Locatello. Score Matching Enables Causal Discovery of Nonlinear Additive Noise Models. In: International Conference on Machine Learning. PMLR, 2022, pp. 18741–18753 (cit. on p. 3).
  34. Dominik Schmid and Allan Sly. On the number and size of Markov equivalence classes of random directed acyclic graphs. 2022 (cit. on p. 2).
  35. Peter Spirtes, Clark Glymour, and Richard Scheines. Causation, Prediction, and Search. MIT Press, 2001 (cit. on pp. 1, 3, 5).
  36. Marc Teyssier and Daphne Koller. Ordering-Based Search: A Simple and Effective Algorithm for Learning Bayesian Networks. In: Proceedings of the Twenty-First Conference on Uncertainty in Artificial Intelligence. AUAI Press, 2005, pp. 584–590 (cit. on p. 5).
  37. Matthew J. Vowels, Necati Cihan Camgoz, and Richard Bowden. D'Ya Like DAGs? A Survey on Structure Learning and Causal Discovery. In: ACM Computing Surveys 55.4 (2022) (cit. on p. 1).
  38. Marcel Wienöbst, Leonard Henckel, and Sebastian Weichwald. Embracing Discrete Search: A Reasonable Approach to Causal Structure Learning. In: International Conference on Learning Representations. 2025 (cit. on p. 6).
  39. Bryan Andrews and Erich Kummerfeld. Better Simulations for Validating Causal Discovery with the DAG-Adaptation of the Onion Method. 2024 (cit. on p. 16).
  40. Yashas Annadani, Nick Pawlowski, Joel Jennings, Stefan Bauer, Cheng Zhang, and Wenbo Gong. BayesDAG: Gradient-Based Posterior Inference for Causal Discovery. In: 36 (2023). Ed. by A. Oh, T. Naumann, A. Globerson, K. Saenko, M. Hardt, and S. Levine, pp. 1738–1763 (cit. on p. 15).
  41. Kevin Bello, Bryon Aragam, and Pradeep Ravikumar. DAGMA: Learning DAGs via M-matrices and a Log-Determinant Acyclicity Characterization. In: Advances in Neural Information Processing Systems 35 (2022), pp. 8226–8239 (cit. on pp. 15, 16).
  42. Philippe Brouillard, Sébastien Lachapelle, Alexandre Lacoste, Simon Lacoste-Julien, and Alexandre Drouin. Differentiable Causal Discovery from Interventional Data. In: Advances in Neural Information Processing Systems. Ed. by H. Larochelle, M. Ranzato, R. Hadsell, M.F. Balcan, and H. Lin. Vol. 33. Curran Associates, Inc., 2020, pp. 21865–21877 (cit. on p. 15).
  43. Peter Bühlmann, Jonas Peters, and Jan Ernest. CAM: Causal Additive Models, High-Dimensional Order Search and Penalized Regression. In: The Annals of Statistics 42.6 (2014), pp. 2526–2556 (cit. on p. 15).
  44. Gábor Csárdi and Tamás Nepusz. The igraph software package for complex network research. In: InterJournal, Complex Systems (2006), p. 1695 (cit. on p. 16).
  45. Tristan Deleu, António Góis, Chris Emezue, Mansi Rankawat, Simon Lacoste-Julien, Stefan Bauer, and Yoshua Bengio. Bayesian structure learning with generative flow networks. In: Uncertainty in Artificial Intelligence. PMLR, 2022, pp. 518–528 (cit. on p. 15).
  46. Anish Dhir, Matthew Ashman, James Requeima, and Mark van der Wilk. A Meta-Learning Approach to Bayesian Causal Discovery. In: The Thirteenth International Conference on Learning Representations. 2025 (cit. on p. 15).
  47. Aric A. Hagberg, Daniel A. Schult, and Pieter J. Swart. Exploring Network Structure, Dynamics, and Function using NetworkX. In: Proceedings of the 7th Python in Science Conference. Ed. by Gaël Varoquaux, Travis Vaught, and Jarrod Millman. Pasadena, CA USA, Aug. 2008, pp. 11–15 (cit. on p. 16).
  48. Charles R. Harris, K. Jarrod Millman, Stéfan J. van der Walt, Ralf Gommers, Pauli Virtanen, David Cournapeau, Eric Wieser, Julian Taylor, Sebastian Berg, Nathaniel J. Smith, Robert Kern, Matti Picus, Stephan Hoyer, Marten H. van Kerkwijk, Matthew Brett, Allan Haldane, Jaime Fernández del Río, Mark Wiebe, Pearu Peterson, Pierre Gérard-Marchant, Kevin S...
  49. Leonard Henckel, Theo Würtzen, and Sebastian Weichwald. Adjustment Identification Distance: A gadjid for Causal Structure Learning. In: The 40th Conference on Uncertainty in Artificial Intelligence. 2024 (cit. on p. 16).
  50. Rebecca J. Herman, Jonas Wahl, Urmi Ninad, and Jakob Runge. Unitless Unrestricted Markov-Consistent SCM Generation: Better Benchmark Datasets for Causal Discovery. In: Causal Learning and Reasoning. PMLR, 2025, pp. 1506–1531 (cit. on p. 16).
  51. Biwei Huang, Kun Zhang, Yizhu Lin, Bernhard Schölkopf, and Clark Glymour. Generalized score functions for causal discovery. In: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 2018, pp. 1551–1560 (cit. on p. 15).
  52. J. D. Hunter. Matplotlib: A 2D graphics environment. In: Computing in Science & Engineering 9.3 (2007), pp. 90–95 (cit. on p. 16).
  53. Antti Hyttinen, Frederick Eberhardt, and Matti Järvisalo. Constraint-based Causal Discovery: Conflict Resolution with Answer Set Programming. In: UAI. 2014, pp. 340–349 (cit. on p. 15).
  54. Markus Kalisch and Peter Bühlmann. Estimating High-Dimensional Directed Acyclic Graphs with the PC-Algorithm. In: Journal of Machine Learning Research 8.3 (2007) (cit. on p. 15).
  55. Jang-Hyun Kim, Claudia Skok Gibbs, Sangdoo Yun, Hyun Oh Song, and Kyunghyun Cho. Large-Scale Targeted Cause Discovery via Learning from Simulated Data. In: Transactions on Machine Learning Research (2025) (cit. on p. 15).
  56. Sébastien Lachapelle, Philippe Brouillard, Tristan Deleu, and Simon Lacoste-Julien. Gradient-based neural DAG learning. In: International Conference on Learning Representations. 2020 (cit. on p. 15).
  57. Lars Lorch, Jonas Rothfuss, Bernhard Schölkopf, and Andreas Krause. DiBS: Differentiable Bayesian Structure Learning. In: Advances in Neural Information Processing Systems 34 (2021), pp. 24111–24123 (cit. on p. 15).
  58. Sara Magliacane, Tom Claassen, and Joris M. Mooij. Ancestral causal inference. In: Advances in Neural Information Processing Systems 29 (2016) (cit. on p. 15).
  59. Christopher Meek. Causal inference and causal explanation with background knowledge. In: Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence. UAI'95. Morgan Kaufmann Publishers Inc., 1995, pp. 403–410 (cit. on p. 14).
  60. Francesco Montagna, Nicoletta Noceti, Lorenzo Rosasco, Kun Zhang, and Francesco Locatello. Scalable Causal Discovery with Score Matching. In: Conference on Causal Learning and Reasoning. PMLR, 2023, pp. 752–771 (cit. on p. 15).
  61. Ignavier Ng, Amir Emad Ghassami, and Kun Zhang. On the role of sparsity and DAG constraints for learning linear DAGs. In: Advances in Neural Information Processing Systems. Vol. 33. 2020 (cit. on p. 15).
  62. Ignavier Ng, Biwei Huang, and Kun Zhang. Structure learning with continuous optimization: A sober look and beyond. In: Causal Learning and Reasoning. PMLR, 2024, pp. 71–105 (cit. on p. 15).
  63. Weronika Ormaniec, Scott Sussex, Lars Lorch, Bernhard Schölkopf, and Andreas Krause. Standardizing Structural Causal Models. In: International Conference on Learning Representations. 2025 (cit. on p. 15).
  64. pandas Development Team. pandas-dev/pandas: Pandas. Version latest. Feb. 2020 (cit. on p. 16).
  65. Jonas Peters and Peter Bühlmann. Structural Intervention Distance for Evaluating Causal Graphs. In: Neural Computation 27.3 (2015), pp. 771–799 (cit. on p. 15).
  66. Jonas Peters, Joris M. Mooij, Dominik Janzing, and Bernhard Schölkopf. Causal discovery with continuous additive noise models. In: (2014) (cit. on p. 15).
  67. Alexander G. Reisach, Christof Seiler, and Sebastian Weichwald. Beware of the Simulated DAG! Causal Discovery Benchmarks May Be Easy to Game. In: Advances in Neural Information Processing Systems (NeurIPS). Vol. 34. 2021 (cit. on p. 17).
  68. Alexander G. Reisach, Myriam Tami, Christof Seiler, Antoine Chambaz, and Sebastian Weichwald. A Scale-Invariant Sorting Criterion to Find a Causal Order in Additive Noise Models. In: Advances in Neural Information Processing Systems (NeurIPS). Vol. 36. 2023 (cit. on p. 16).
  69. Paul Rolland, Volkan Cevher, Matthäus Kleindessner, Chris Russell, Dominik Janzing, Bernhard Schölkopf, and Francesco Locatello. Score Matching Enables Causal Discovery of Nonlinear Additive Noise Models. In: International Conference on Machine Learning. PMLR, 2022, pp. 18741–18753 (cit. on p. 15).
  70. Shohei Shimizu, Takanori Inazumi, Yasuhiro Sogawa, Aapo Hyvärinen, Yoshinobu Kawahara, Takashi Washio, Patrik O. Hoyer, and Kenneth Bollen. DirectLiNGAM: A direct method for learning a linear non-Gaussian structural equation model. In: Journal of Machine Learning Research 12 (2011), pp. 1225–1248 (cit. on p. 15).
  71. Peter Spirtes and Christopher Meek. Learning Bayesian networks with discrete variables from data. In: KDD. Vol. 1. 1995, pp. 294–299 (cit. on p. 15).
  72. Jussi Viinikka, Antti Hyttinen, Johan Pensar, and Mikko Koivisto. Towards Scalable Bayesian Learning of Causal DAGs. In: Advances in Neural Information Processing Systems 33 (2020), pp. 6584–6594 (cit. on p. 15).
  73. Michael L. Waskom. seaborn: statistical data visualization. In: Journal of Open Source Software 6.60 (2021), p. 3021 (cit. on p. 16).
  74. Marcel Wienöbst, Leonard Henckel, and Sebastian Weichwald. Embracing Discrete Search: A Reasonable Approach to Causal Structure Learning. In: International Conference on Learning Representations. 2025 (cit. on pp. 15, 16).
  75. Yue Yu, Jie Chen, Tian Gao, and Mo Yu. DAG-GNN: DAG Structure Learning with Graph Neural Networks. In: International Conference on Machine Learning. PMLR, 2019, pp. 7154–7163 (cit. on p. 15).
  76. Xun Zheng, Bryon Aragam, Pradeep K. Ravikumar, and Eric P. Xing. DAGs with NO TEARS: Continuous Optimization for Structure Learning. In: Advances in Neural Information Processing Systems. Vol. 32. 2018, pp. 9472–9483 (cit. on pp. 15, 16).