arxiv: 2602.07915 · v2 · submitted 2026-02-08 · cs.LG · cs.AI · stat.ME · stat.ML


CausalCompass: Evaluating the Robustness of Time-Series Causal Discovery in Misspecified Scenarios

Huiyang Yi , Xiaojian Shen , Yonggang Wu , Duxin Chen , He Wang , Wenwu Yu


Pith reviewed 2026-05-16 06:25 UTC · model grok-4.3


keywords: causal discovery, time series, robustness, benchmark, assumption violation, deep learning, misspecification, causal inference


The pith

The CausalCompass benchmark shows that deep learning methods achieve the best overall performance in time-series causal discovery when assumptions are violated, although no single method wins in every scenario.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper creates CausalCompass to test time-series causal discovery algorithms in settings where standard modeling assumptions fail. It runs representative methods through eight distinct violation scenarios and finds that performance varies widely with no single winner in every case. Deep learning approaches achieve the strongest results across the full set of tests. The work adds hyperparameter checks and ablation studies to explain why those methods hold up better. This setup gives practitioners a clearer way to pick algorithms for real data where assumptions rarely hold perfectly.

Core claim

CausalCompass is a flexible benchmark framework for assessing the robustness of time-series causal discovery methods under violations of modeling assumptions. Experiments across eight scenarios show that no method attains optimal performance in all settings, yet deep learning-based approaches exhibit superior overall performance. The framework also reveals that NTS-NOTEARS depends heavily on standardized preprocessing and that ablation studies clarify the sources of deep learning strength under misspecification.

What carries the argument

CausalCompass, a benchmark that applies eight specific assumption-violation scenarios to representative time-series causal discovery algorithms and measures their performance.
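To make "measures their performance" concrete, the sketch below scores an estimated graph against a known ground-truth graph. It assumes AUROC over directed, off-diagonal edges as the metric, which is a common choice in this literature but is not confirmed here as the paper's exact protocol. The function name `edge_auroc` and the toy matrices are hypothetical.

```python
from typing import List

def edge_auroc(true_adj: List[List[int]], scores: List[List[float]]) -> float:
    """AUROC over off-diagonal directed edges (pairwise Mann-Whitney form)."""
    pos, neg = [], []
    n = len(true_adj)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # assumption of this sketch: self-loops are not scored
            (pos if true_adj[i][j] else neg).append(scores[i][j])
    if not pos or not neg:
        raise ValueError("need at least one edge and one non-edge")
    # Count (edge, non-edge) pairs where the true edge gets the higher score;
    # ties count as half a win.
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Toy input, made up for illustration (not data from the paper).
true_adj = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
scores = [[0.0, 0.9, 0.1], [0.2, 0.0, 0.8], [0.3, 0.1, 0.0]]
auroc = edge_auroc(true_adj, scores)  # 1.0: both true edges outrank every non-edge
```

A threshold-free metric of this kind lets methods that output continuous edge strengths be compared without tuning a cutoff per method.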

If this is right

Practitioners should favor deep learning methods when applying time-series causal discovery to data likely to violate standard assumptions.
Hyperparameter sensitivity must be checked because performance rankings shift with different settings.
Standardization preprocessing should be applied by default for methods like NTS-NOTEARS to avoid poor results in the unprocessed case.
Future algorithm design should prioritize robustness mechanisms that deep learning appears to exploit under misspecification.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Combined violations occurring together in one dataset would be a natural next test to see whether the deep learning advantage persists.
Adding real-world datasets with unknown violation patterns could validate whether the simulated scenarios predict actual performance.
The flexibility of deep learning in capturing nonlinear dependencies may explain its edge and suggest targeted improvements for classical methods.

Load-bearing premise

The eight chosen assumption-violation scenarios adequately represent the range and severity of misspecifications that occur in real-world time-series data.
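As an illustration of what a controlled assumption violation can look like, the sketch below simulates a linear VAR(1) process with a known coefficient matrix and then hides one variable, which turns it into a latent confounder for the remaining ones. This is not the paper's data generator. The functions `simulate_var1` and `hide_variable`, the coefficients, and the seed are all assumptions chosen for illustration.

```python
import random
from typing import List

def simulate_var1(coef: List[List[float]], steps: int, seed: int = 0) -> List[List[float]]:
    """Simulate x_t = coef @ x_{t-1} + Gaussian noise; coef[i][j] is the effect of j on i."""
    rng = random.Random(seed)
    n = len(coef)
    x = [0.0] * n
    series = []
    for _ in range(steps):
        x = [sum(coef[i][j] * x[j] for j in range(n)) + rng.gauss(0.0, 1.0)
             for i in range(n)]
        series.append(x)
    return series

def hide_variable(series: List[List[float]], hidden: int) -> List[List[float]]:
    """Latent-confounding violation: drop one column so its influence is unobserved."""
    return [[v for k, v in enumerate(row) if k != hidden] for row in series]

# Hypothetical system: variable 0 drives variables 1 and 2 (lower-triangular, stable).
coef = [[0.5, 0.0, 0.0],
        [0.4, 0.5, 0.0],
        [0.4, 0.0, 0.5]]
data = simulate_var1(coef, steps=200)
observed = hide_variable(data, hidden=0)  # variables 1 and 2 now share a hidden cause
```

Because the ground-truth graph is known by construction, any spurious edge a method reports between the two remaining variables can be attributed to the injected violation.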

What would settle it

Observing that a non-deep-learning method achieves the highest average score across the eight scenarios on new or real-world datasets would challenge the claim of deep learning superiority.
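The test described above amounts to averaging each method's score over scenarios and checking which method ranks first. A minimal sketch follows; the method names, scenario names, and numbers are placeholders and are not results from the paper.

```python
from statistics import mean
from typing import Dict, List, Tuple

def rank_by_average(scores: Dict[str, Dict[str, float]]) -> List[Tuple[str, float]]:
    """Return (method, mean score over scenarios) pairs, best first."""
    averages = {m: mean(per_scenario.values()) for m, per_scenario in scores.items()}
    return sorted(averages.items(), key=lambda kv: kv[1], reverse=True)

# Placeholder numbers for illustration only; they are NOT results from the paper.
toy_scores = {
    "method_a": {"scenario_1": 0.90, "scenario_2": 0.60},
    "method_b": {"scenario_1": 0.70, "scenario_2": 0.70},
}
ranking = rank_by_average(toy_scores)
```

In this toy table `method_a` has the higher average even though `method_b` wins `scenario_2`, which mirrors how a method can lead overall without being optimal in every setting.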

Figures

Figures reproduced from arXiv: 2602.07915 by Duxin Chen, He Wang, Huiyang Yi, Wenwu Yu, Xiaojian Shen, Yonggang Wu.

**Figure 2.** Experimental results under the linear and nonlinear settings across the vanilla scenario and …

**Figure 3.** Experimental results under the nonlinear settings across the vanilla scenario and eight …

**Figure 4.** Experimental results under the linear and nonlinear settings across the vanilla scenario and …

**Figure 5.** Experimental results under the linear and nonlinear settings across the vanilla scenario and …

**Figure 6.** Experimental results under the nonlinear settings across the vanilla scenario and eight …

**Figure 7.** Experimental results under the linear and nonlinear settings across the vanilla scenario and …

**Figure 8.** Experimental results under the linear and nonlinear settings across the vanilla scenario …

**Figure 9.** Experimental results under the nonlinear settings across the vanilla scenario and eight …

**Figure 10.** Experimental results under the linear and nonlinear settings across the vanilla scenario and …

**Figure 11.** Experimental results under the linear and nonlinear settings across the vanilla scenario …

**Figure 12.** Experimental results under the nonlinear settings across the vanilla scenario and eight …

**Figure 13.** Experimental results under the linear and nonlinear settings across the vanilla scenario …

**Figure 14.** Experimental results under the linear and nonlinear settings across the vanilla scenario …

**Figure 15.** Experimental results under the nonlinear settings across the vanilla scenario and eight …

**Figure 16.** Experimental results under the linear and nonlinear settings across the vanilla scenario …

**Figure 17.** Experimental results under the linear and nonlinear settings across the vanilla scenario …

**Figure 18.** Experimental results under the nonlinear settings across the vanilla scenario and eight …

Original abstract

Causal discovery from time series is a fundamental task in machine learning. However, its widespread adoption is hindered by a reliance on untestable causal assumptions and by the lack of robustness-oriented evaluation in existing benchmarks. To address these challenges, we propose CausalCompass, a flexible and extensible benchmark framework designed to assess the robustness of time-series causal discovery (TSCD) methods under violations of modeling assumptions. To demonstrate the practical utility of CausalCompass, we conduct extensive benchmarking of representative TSCD algorithms across eight assumption-violation scenarios. Our experimental results indicate that no single method consistently attains optimal performance across all settings. Nevertheless, the methods exhibiting superior overall performance across diverse scenarios are almost invariably deep learning-based approaches. We further provide hyperparameter sensitivity analyses to deepen the understanding of these findings. We additionally conduct ablation experiments to explain the strong performance of deep learning-based methods under assumption violations. We also find, somewhat surprisingly, that NTS-NOTEARS relies heavily on standardized preprocessing in practice, performing poorly in the vanilla setting but exhibiting strong performance after standardization. Finally, our work aims to provide a comprehensive and systematic evaluation of TSCD methods under assumption violations, thereby facilitating their broader adoption in real-world applications. The user-friendly implementation, documentation and datasets are available at https://anonymous.4open.science/r/CausalCompass-anonymous-5B4F/.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

CausalCompass adds a useful benchmark for TSCD robustness under violations, with DL methods showing stronger average performance, but the eight synthetic scenarios lack validation against real data distributions.


The paper's main contribution is CausalCompass, a new extensible benchmark that runs standard TSCD algorithms across eight assumption-violation scenarios and reports that deep learning methods tend to hold up better overall while no single method dominates. It also flags the heavy dependence of NTS-NOTEARS on standardized preprocessing and includes sensitivity and ablation checks to probe why DL approaches are more stable. The released code and datasets make the experiments straightforward to inspect or extend. These elements give practitioners concrete data on method behavior when linearity, stationarity, or no-confounding assumptions break. The soft spot is that the scenarios are purely synthetic with no quantitative comparison to statistical signatures in actual time series from finance, neuroscience, or climate. Without that check, it remains unclear whether the chosen violations are representative in severity or type, which could affect the DL superiority claim if more realistic regimes were used. The experimental design itself is transparent and avoids circularity, relying on generated data and off-the-shelf algorithms. This work is aimed at researchers who pick or improve TSCD methods for applied settings and want robustness evidence beyond standard benchmarks. It shows honest engagement with the assumption literature and reproducible results, so it deserves a serious referee even if the scenario justification needs tightening in revision.

Referee Report

2 major / 1 minor

Summary. The paper introduces CausalCompass, a flexible benchmark framework for evaluating the robustness of time-series causal discovery (TSCD) methods under violations of standard modeling assumptions. Through benchmarking of representative algorithms across eight synthetic assumption-violation scenarios, the authors report that no single method is optimal in all cases, but that the methods with the best overall performance are almost invariably deep learning-based. The work includes hyperparameter sensitivity analyses, ablation studies to explain DL advantages, and a specific observation that NTS-NOTEARS performs poorly without standardization but strongly with it.

Significance. If the empirical rankings hold, CausalCompass provides a much-needed tool for systematic robustness assessment in TSCD, where untestable assumptions often limit practical adoption. The emphasis on open implementation, documentation, and datasets supports reproducibility and community use. The finding that DL methods are more resilient under misspecification could inform method selection in domains like finance and neuroscience, though this depends on the scenarios' fidelity to real data.

major comments (2)

[Experimental Setup / Scenario Design] The headline claim that deep learning-based TSCD methods exhibit superior overall performance across diverse misspecified settings is load-bearing on the eight chosen violation scenarios being representative. The manuscript provides no quantitative validation (e.g., matching of statistical signatures such as nonlinearity strength or latent confounding levels) that these synthetic regimes align with misspecifications observed in real-world time-series from target domains.
[Results and Discussion] The finding that NTS-NOTEARS relies heavily on standardized preprocessing (poor in vanilla setting, strong after standardization) is presented as surprising; this raises the question of whether preprocessing choices were uniformly applied across all methods, which could affect the fairness of the DL vs. non-DL ranking in the results section.

minor comments (1)

[Abstract] The abstract states the repository link as anonymous; replace with a permanent DOI or GitHub link in the final version.

Simulated Author's Rebuttal

2 responses · 0 unresolved

Thank you for the constructive feedback on our manuscript. We address each major comment below and outline the revisions we will make to strengthen the paper.


Referee: [Experimental Setup / Scenario Design] The headline claim that deep learning-based TSCD methods exhibit superior overall performance across diverse misspecified settings is load-bearing on the eight chosen violation scenarios being representative. The manuscript provides no quantitative validation (e.g., matching of statistical signatures such as nonlinearity strength or latent confounding levels) that these synthetic regimes align with misspecifications observed in real-world time-series from target domains.

Authors: We thank the referee for this important observation. The eight scenarios were selected to represent common assumption violations from the TSCD literature (nonlinearity, latent confounding, non-stationarity, etc.) in a controlled manner. We did not perform quantitative statistical signature matching to specific real-world datasets, as the benchmark prioritizes synthetic control for isolating misspecification effects. In the revision we will expand the scenario design section with additional motivation and references to target domains (finance, neuroscience), explicitly acknowledge the synthetic nature of the regimes, and discuss limitations regarding real-world alignment. revision: partial
Referee: [Results and Discussion] The finding that NTS-NOTEARS relies heavily on standardized preprocessing (poor in vanilla setting, strong after standardization) is presented as surprising; this raises the question of whether preprocessing choices were uniformly applied across all methods, which could affect the fairness of the DL vs. non-DL ranking in the results section.

Authors: We confirm that preprocessing was applied uniformly across all methods. The vanilla setting uses raw data with no standardization, while the standardized setting applies z-score normalization consistently to every algorithm before model fitting. The NTS-NOTEARS result was obtained under these identical conditions. We will revise the results section to explicitly document the preprocessing pipeline, state that it was identical for all methods, and clarify the experimental settings to remove any ambiguity about fairness. revision: yes
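The standardized setting described in this response can be sketched as a per-variable z-score applied once to the data before any method sees it. The function name `zscore_columns` is hypothetical, and the use of the population standard deviation is an assumption, since the response does not say which variant was used.

```python
from statistics import fmean, pstdev
from typing import List

def zscore_columns(series: List[List[float]]) -> List[List[float]]:
    """Standardize each variable (column) to zero mean and unit variance."""
    columns = list(zip(*series))           # rows are time steps, columns are variables
    means = [fmean(col) for col in columns]
    stds = [pstdev(col) for col in columns]  # population std: an assumption of this sketch
    if any(s == 0.0 for s in stds):
        raise ValueError("constant column cannot be standardized")
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in series]

# Toy series with two variables on very different scales (made-up values).
raw = [[1.0, 100.0], [2.0, 300.0], [3.0, 200.0]]
standardized = zscore_columns(raw)
```

Applying the same transformed array to every algorithm keeps the vanilla versus standardized comparison a difference in data only, not in per-method pipelines.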

Circularity Check

0 steps flagged

No circularity: purely empirical benchmarking on synthetic data

full rationale

The paper contains no mathematical derivation chain, first-principles predictions, or fitted parameters that are later renamed as outputs. It defines eight synthetic assumption-violation scenarios, generates data from them, runs existing TSCD algorithms (including DL-based ones), and reports performance metrics. All results are direct experimental measurements on the generated data; no step reduces by construction to a self-defined quantity or to a self-citation whose content is unverified. The central claim (DL methods show superior aggregate performance) is therefore an empirical observation, not a tautology. The representativeness of the eight scenarios is a separate validity question outside the scope of circularity analysis.
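The generate, run, and score procedure described in this rationale can be pictured as a grid over scenarios and methods. The sketch below is a generic harness under that reading, not the CausalCompass API. `run_grid` and the stand-in scenario, method, and metric are hypothetical placeholders.

```python
from typing import Callable, Dict, List

Series = List[List[float]]
Matrix = List[List[float]]

def run_grid(
    scenarios: Dict[str, Callable[[], Series]],
    methods: Dict[str, Callable[[Series], Matrix]],
    metric: Callable[[Matrix], float],
) -> Dict[str, Dict[str, float]]:
    """Score every method on every scenario; returns results[method][scenario]."""
    results: Dict[str, Dict[str, float]] = {name: {} for name in methods}
    for scen_name, make_data in scenarios.items():
        data = make_data()  # generated once, so all methods see identical data
        for meth_name, fit in methods.items():
            results[meth_name][scen_name] = metric(fit(data))
    return results

# Stand-ins for illustration only: a fixed two-variable series, a method that
# predicts no edges, and a metric that just sums the predicted edge scores.
scenarios = {"vanilla": lambda: [[0.0, 1.0], [1.0, 0.0]]}
methods = {"constant_baseline": lambda data: [[0.0] * len(data[0]) for _ in data[0]]}
metric = lambda m: float(sum(map(sum, m)))
results = run_grid(scenarios, methods, metric)
```

Generating each scenario's data once and passing it to every method is the design choice that keeps the comparison free of per-method data differences.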

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The work relies on standard domain assumptions from causal discovery literature (e.g., that controlled violations can be generated to test robustness) but introduces no new free parameters, axioms, or invented entities beyond the benchmark itself.

axioms (1)

domain assumption: Standard causal assumptions such as no hidden confounders and correct model specification are frequently violated in practice.
This premise motivates the need for the robustness benchmark.



Reference graph

Works this paper leans on

84 extracted references · 84 canonical work pages · 2 internal anchors

[1]

Regime identification for improving causal analysis in non-stationary timeseries.arXiv preprint arXiv:2405.02315, 2024

Wasim Ahmad, Maha Shadaydeh, and Joachim Denzler. Regime identification for improving causal analysis in non-stationary timeseries.arXiv preprint arXiv:2405.02315, 2024

work page arXiv 2024
[2]

Temporal causal modeling with graphical granger methods

Andrew Arnold, Yan Liu, and Naoki Abe. Temporal causal modeling with graphical granger methods. InProceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 66–75, 2007

work page 2007
[3]

Survey and evaluation of causal discovery methods for time series.Journal of Artificial Intelligence Research, 73:767–819, 2022

Charles K Assaad, Emilie Devijver, and Eric Gaussier. Survey and evaluation of causal discovery methods for time series.Journal of Artificial Intelligence Research, 73:767–819, 2022

work page 2022
[4]

The use of the area under the roc curve in the evaluation of machine learning algorithms.Pattern recognition, 30(7):1145–1159, 1997

Andrew P Bradley. The use of the area under the roc curve in the evaluation of machine learning algorithms.Pattern recognition, 30(7):1145–1159, 1997

work page 1997
[5]

Tangent space causal inference: Leveraging vector fields for causal discovery in dynamical systems.Advances in Neural Information Processing Systems, 37:120078–120102, 2024

Kurt Butler, Daniel Waxman, and Petar Djuric. Tangent space causal inference: Leveraging vector fields for causal discovery in dynamical systems.Advances in Neural Information Processing Systems, 37:120078–120102, 2024

work page 2024
[6]

Triad constraints for learning causal structure of latent variables

Ruichu Cai, Feng Xie, Clark Glymour, Zhifeng Hao, and Kun Zhang. Triad constraints for learning causal structure of latent variables. InAdvances in Neural Information Processing Systems, volume 32, 2019

work page 2019
[7]

Causal discoveries for high dimensional mixed data.Statistics in Medicine, 41(24):4924–4940, 2022

Zhanrui Cai, Dong Xi, Xuan Zhu, and Runze Li. Causal discoveries for high dimensional mixed data.Statistics in Medicine, 41(24):4924–4940, 2022

work page 2022
[8]

Chapman and hall/CRC, 2019

Chris Chatfield and Haipeng Xing.The analysis of time series: an introduction with R. Chapman and hall/CRC, 2019

work page 2019
[9]

Addressing information asymmetry: Deep temporal causal- ity discovery for mixed time series.IEEE Transactions on Pattern Analysis and Machine Intelligence, 2025

Jiawei Chen and Chunhui Zhao. Addressing information asymmetry: Deep temporal causal- ity discovery for mixed time series.IEEE Transactions on Pattern Analysis and Machine Intelligence, 2025

work page 2025
[10]

Neural ordinary differential equations.Advances in neural information processing systems, 31, 2018

Ricky TQ Chen, Yulia Rubanova, Jesse Bettencourt, and David K Duvenaud. Neural ordinary differential equations.Advances in neural information processing systems, 31, 2018

work page 2018
[11]

Cuts: Neural causal discovery from irregular time-series data.arXiv preprint arXiv:2302.07458, 2023

Yuxiao Cheng, Runzhao Yang, Tingxiong Xiao, Zongren Li, Jinli Suo, Kunlun He, and Qionghai Dai. Cuts: Neural causal discovery from irregular time-series data.arXiv preprint arXiv:2302.07458, 2023

work page arXiv 2023
[12]

Cuts+: High-dimensional causal discovery from irregular time-series

Yuxiao Cheng, Lianglong Li, Tingxiong Xiao, Zongren Li, Jinli Suo, Kunlun He, and Qionghai Dai. Cuts+: High-dimensional causal discovery from irregular time-series. InProceedings of the AAAI Conference on Artificial Intelligence, pages 11525–11533, 2024

work page 2024
[13]

Search for additive nonlinear time series causal models.Journal of Machine Learning Research, 9(5), 2008

Tianjiao Chu, Clark Glymour, and Greg Ridgeway. Search for additive nonlinear time series causal models.Journal of Machine Learning Research, 9(5), 2008

work page 2008
[14]

A seasonal-trend decomposition procedure based on loess (with discussion).J

STL Cleveland. A seasonal-trend decomposition procedure based on loess (with discussion).J. Off. Stat, 6(3), 1990

work page 1990
[15]

Copula pc algorithm for causal discovery from mixed data

Ruifei Cui, Perry Groot, and Tom Heskes. Copula pc algorithm for causal discovery from mixed data. InJoint European conference on machine learning and knowledge discovery in databases, pages 377–392. Springer, 2016

work page 2016
[16]

Haoyue Dai, Peter Spirtes, and Kun Zhang. Independence testing-based approach to causal discovery under measurement error and linear non-Gaussian models.Advances in Neural Information Processing Systems, 35:27524–27536, 2022. 10

work page 2022
[17]

The relationship between precision-recall and roc curves

Jesse Davis and Mark Goadrich. The relationship between precision-recall and roc curves. In Proceedings of the 23rd international conference on Machine learning, pages 233–240, 2006

work page 2006
[18]

On causal discovery from time series data using fci.Proba- bilistic graphical models, 16, 2010

Doris Entner and Patrik O Hoyer. On causal discovery from time series data using fci.Proba- bilistic graphical models, 16, 2010

work page 2010
[19]

Timegraph: Synthetic benchmark datasets for robust time-series causal discovery

Muhammad Hasan Ferdous, Emam Hossain, and Md Osman Gani. Timegraph: Synthetic benchmark datasets for robust time-series causal discovery. InProceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining V . 2, pages 5425–5435, 2025

work page 2025
[20]

Causal discovery of gene regulation with incomplete data.Journal of the Royal Statistical Society Series A: Statistics in Society, 183(4): 1747–1775, 2020

Ronja Foraita, Juliane Friemel, Kathrin Günther, Thomas Behrens, Jörn Bullerdiek, Rolf Nimzyk, Wolfgang Ahrens, and Vanessa Didelez. Causal discovery of gene regulation with incomplete data.Journal of the Royal Statistical Society Series A: Statistics in Society, 183(4): 1747–1775, 2020

work page 2020
[21]

Causal discovery for non-stationary non-linear time series data using just-in-time modeling

Daigo Fujiwara, Kazuki Koyama, Keisuke Kiritoshi, Tomomi Okawachi, Tomonori Izumitani, and Shohei Shimizu. Causal discovery for non-stationary non-linear time series data using just-in-time modeling. InConference on Causal Learning and Reasoning, pages 880–894. PMLR, 2023

work page 2023
[22]

MissDAG: Causal discovery in the presence of missing data with continuous additive noise models

Erdun Gao, Ignavier Ng, Mingming Gong, Li Shen, Wei Huang, Tongliang Liu, Kun Zhang, and Howard Bondell. MissDAG: Causal discovery in the presence of missing data with continuous additive noise models. InAdvances in Neural Information Processing Systems, volume 35, pages 5024–5038, 2022

work page 2022
[23]

Meta-d2ag: Causal graph learning with interventional dynamic data

Tian Gao, Songtao Lu, Junkyu Lee, Elliot Nelson, Debarun Bhattacharjya, Yue Yu, and Miao Liu. Meta-d2ag: Causal graph learning with interventional dynamic data. InThe Thirty-ninth Annual Conference on Neural Information Processing Systems, 2025

work page 2025
[24]

High-recall causal discovery for autocorrelated time series with latent confounders.Advances in neural information processing systems, 33:12615–12625, 2020

Andreas Gerhardus and Jakob Runge. High-recall causal discovery for autocorrelated time series with latent confounders.Advances in neural information processing systems, 33:12615–12625, 2020

work page 2020
[25]

Review of causal discovery methods based on graphical models.Frontiers in Genetics, 10:524, 2019

Clark Glymour, Kun Zhang, and Peter Spirtes. Review of causal discovery methods based on graphical models.Frontiers in Genetics, 10:524, 2019

work page 2019
[26]

Causal discovery from temporal data: An overview and new perspectives.ACM Computing Surveys, 57 (4):1–38, 2024

Chang Gong, Chuzhe Zhang, Di Yao, Jingping Bi, Wenbin Li, and Yongjun Xu. Causal discovery from temporal data: An overview and new perspectives.ACM Computing Surveys, 57 (4):1–38, 2024

work page 2024
[27]

Investigating causal relations by econometric models and cross-spectral methods.Econometrica: journal of the Econometric Society, pages 424–438, 1969

Clive WJ Granger. Investigating causal relations by econometric models and cross-spectral methods.Econometrica: journal of the Econometric Society, pages 424–438, 1969

work page 1969
[28]

Causaldynamics: A large-scale benchmark for structural discovery of dynamical causal models.arXiv preprint arXiv:2505.16620, 2025

Benjamin Herdeanu, Juan Nathaniel, Carla Roesch, Jatan Buch, Gregor Ramien, Johannes Haux, and Pierre Gentine. Causaldynamics: A large-scale benchmark for structural discovery of dynamical causal models.arXiv preprint arXiv:2505.16620, 2025

work page arXiv 2025
[29]

Identification of time-dependent causal model: A gaussian process treatment

Biwei Huang, Kun Zhang, and Bernhard Schölkopf. Identification of time-dependent causal model: A gaussian process treatment. InIJCAI, pages 3561–3568, 2015

work page 2015
[30]

Causal discovery and forecasting in nonstationary environments with state-space models

Biwei Huang, Kun Zhang, Mingming Gong, and Clark Glymour. Causal discovery and forecasting in nonstationary environments with state-space models. InInternational conference on machine learning, pages 2901–2910. Pmlr, 2019

work page 2019
[31]

Causal discovery from heterogeneous/nonstationary data.Journal of Machine Learning Research, 21(89):1–53, 2020

Biwei Huang, Kun Zhang, Jiji Zhang, Joseph Ramsey, Ruben Sanchez-Romero, Clark Glymour, and Bernhard Schölkopf. Causal discovery from heterogeneous/nonstationary data.Journal of Machine Learning Research, 21(89):1–53, 2020

work page 2020
[32]

Causal discovery from subsampled time series data by constraint optimization

Antti Hyttinen, Sergey Plis, Matti Järvisalo, Frederick Eberhardt, and David Danks. Causal discovery from subsampled time series data by constraint optimization. InConference on Probabilistic Graphical Models, pages 216–227. PMLR, 2016

work page 2016
[33]

Estimation of a structural vector autoregression model using non-gaussianity.Journal of Machine Learning Research, 11 (5), 2010

Aapo Hyvärinen, Kun Zhang, Shohei Shimizu, and Patrik O Hoyer. Estimation of a structural vector autoregression model using non-gaussianity.Journal of Machine Learning Research, 11 (5), 2010. 11

work page 2010
[34]

Efficient Causal Graph Discovery Using Large Language Models

Thomas Jiralerspong, Xiaoyin Chen, Yash More, Vedant Shah, and Yoshua Bengio. Efficient causal graph discovery using large language models.arXiv preprint arXiv:2402.01207, 2024

work page internal anchor Pith review Pith/arXiv arXiv 2024
[35]

Extensive chaos in the lorenz-96 model.Chaos: An interdisciplinary journal of nonlinear science, 20(4), 2010

Alireza Karimi and Mark R Paul. Extensive chaos in the lorenz-96 model.Chaos: An interdisciplinary journal of nonlinear science, 20(4), 2010

work page 2010
[36]

A logic for causal inference in time series with discrete and continuous variables

Samantha Kleinberg. A logic for causal inference in time series with discrete and continuous variables. InIJCAI Proceedings-International Joint Conference on Artificial Intelligence, volume 22, page 943, 2011

work page 2011
[37]

Improving bayesian network structure learning in the presence of measurement error.Journal of Machine Learning Research, 23(324): 1–28, 2022

Yang Liu, Anthony C Constantinou, and Zhigao Guo. Improving bayesian network structure learning in the presence of measurement error.Journal of Machine Learning Research, 23(324): 1–28, 2022

work page 2022
[38]

Position: The causal revolution needs scientific pragmatism.arXiv preprint arXiv:2406.02275, 2024

Joshua Loftus. Position: The causal revolution needs scientific pragmatism.arXiv preprint arXiv:2406.02275, 2024

work page arXiv 2024
[39]

Robustness of algorithms for causal structure learning to hyperparameter choice

Damian Machlanski, Spyridon Samothrakis, and Paul S Clarke. Robustness of algorithms for causal structure learning to hyperparameter choice. InCausal Learning and Reasoning, pages 703–739. PMLR, 2024

work page 2024
[40]

Causal structure learning from multivariate time series in settings with unmeasured confounding

Daniel Malinsky and Peter Spirtes. Causal structure learning from multivariate time series in settings with unmeasured confounding. InProceedings of 2018 ACM SIGKDD workshop on causal discovery, pages 23–47. PMLR, 2018

work page 2018
[41]

Spacetime: Causal discovery from non-stationary time series

Sarah Mameche, Lénaïg Cornanguer, Urmi Ninad, and Jilles Vreeken. Spacetime: Causal discovery from non-stationary time series. InProceedings of the AAAI Conference on Artificial Intelligence, volume 39, pages 19405–19413, 2025

work page 2025
[42]

Interpretable models for granger causality using self-explaining neural networks.arXiv preprint arXiv:2101.07600, 2021

Riˇcards Marcinkevi ˇcs and Julia E V ogt. Interpretable models for granger causality using self-explaining neural networks.arXiv preprint arXiv:2101.07600, 2021

work page arXiv 2021
[43]

Assumption violations in causal discovery and the robustness of score matching

Francesco Montagna, Atalanti Mastakouri, Elias Eulig, Nicoletta Noceti, Lorenzo Rosasco, Dominik Janzing, Bryon Aragam, and Francesco Locatello. Assumption violations in causal discovery and the robustness of score matching. InAdvances in Neural Information Processing Systems, volume 36, 2023

work page 2023
[44]

Causal discovery with attention-based convo- lutional neural networks.Machine Learning and Knowledge Extraction, 1(1):19, 2019

Meike Nauta, Doina Bucur, and Christin Seifert. Causal discovery with attention-based convo- lutional neural networks.Machine Learning and Knowledge Extraction, 1(1):19, 2019

work page 2019
[45]

On the role of sparsity and DAG constraints for learning linear DAGs

Ignavier Ng, AmirEmad Ghassami, and Kun Zhang. On the role of sparsity and DAG constraints for learning linear DAGs. InAdvances in Neural Information Processing Systems, volume 33, pages 17943–17954, 2020

work page 2020
[46]

Structure learning with continuous optimization: A sober look and beyond

Ignavier Ng, Biwei Huang, and Kun Zhang. Structure learning with continuous optimization: A sober look and beyond. InCausal Learning and Reasoning, pages 71–105. PMLR, 2024

work page 2024
[47]

DYNOTEARS: Structure learning from time-series data

Roxana Pamfil, Nisara Sriwattanaworachai, Shaan Desai, Philip Pilgerstorfer, Konstantinos Georgatzis, Paul Beaumont, and Bryon Aragam. DYNOTEARS: Structure learning from time-series data. InInternational Conference on Artificial Intelligence and Statistics, pages 1595–1605. PMLR, 2020

work page 2020
[48]

Cambridge university press, 2009

Judea Pearl.Causality. Cambridge university press, 2009

work page 2009
[49]

Basic books, 2018

Judea Pearl and Dana Mackenzie.The Book of Why: The New Science of Cause and Effect. Basic books, 2018

work page 2018
[50]

Jonas Peters, Dominik Janzing, and Bernhard Schölkopf. Elements of Causal Inference: Foundations and Learning Algorithms. The MIT Press, 2017
[51]

Audrey Poinsot, Panayiotis Panayiotou, Alessandro Leite, Nicolas Chesneau, Özgür Şimşek, and Marc Schoenauer. Position: Causal machine learning requires rigorous synthetic experiments for broader adoption. arXiv preprint arXiv:2508.08883, 2025
[52]

Vineet K Raghu, Joseph D Ramsey, Alison Morris, Dimitrios V Manatakis, Peter Sprites, Panos K Chrysanthis, Clark Glymour, and Panayiotis V Benos. Comparison of strategies for scalable causal discovery of latent variable models from mixed data. International Journal of Data Science and Analytics, 6(1):33–45, 2018
[53]

Alexander Reisach, Christof Seiler, and Sebastian Weichwald. Beware of the simulated DAG! causal discovery benchmarks may be easy to game. In Advances in Neural Information Processing Systems, volume 34, pages 27772–27784, 2021
[54]

Jakob Runge. Causal network reconstruction from time series: From theoretical assumptions to practical estimation. Chaos: An Interdisciplinary Journal of Nonlinear Science, 28(7), 2018
[55]

Jakob Runge. Discovering contemporaneous and lagged causal relations in autocorrelated nonlinear time series datasets. In Conference on Uncertainty in Artificial Intelligence, pages 1388–1397. PMLR, 2020
[56]

Jakob Runge, Peer Nowack, Marlene Kretschmer, Seth Flaxman, and Dino Sejdinovic. Detecting and quantifying causal associations in large nonlinear time series datasets. Science Advances, 5(11):eaau4996, 2019
[57]

Jakob Runge, Andreas Gerhardus, Gherardo Varando, Veronika Eyring, and Gustau Camps-Valls. Causal inference for time series. Nature Reviews Earth & Environment, 4(7):487–505, 2023
[58]

Agathe Sadeghi, Achintya Gopal, and Mohammad Fesanghary. Causal discovery from non-stationary time series. International Journal of Data Science and Analytics, 19(1):33–59, 2025
[59]

Richard Scheines and Joseph Ramsey. Measurement error and causal discovery. In CEUR Workshop Proceedings, volume 1792, page 1, 2017
[60]

Shohei Shimizu, Patrik O Hoyer, Aapo Hyvärinen, Antti Kerminen, and Michael Jordan. A linear non-Gaussian acyclic model for causal discovery. Journal of Machine Learning Research, 7(10), 2006
[61]

Ali Shojaie and Emily B Fox. Granger causality: A review and recent advances. Annual Review of Statistics and Its Application, 9(1):289–319, 2022
[62]

Peter Spirtes and Clark Glymour. An algorithm for fast recovery of sparse causal graphs. Social Science Computer Review, 9(1):62–72, 1991
[63]

Peter Spirtes, Clark Glymour, and Richard Scheines. Causation, Prediction, and Search. MIT Press, 2001
[64]

Gideon Stein, Maha Shadaydeh, Jan Blunk, Niklas Penzel, and Joachim Denzler. CausalRivers – scaling up benchmarking of causal discovery for real-world time-series. arXiv preprint arXiv:2503.17452, 2025
[65]

Gideon Stein, Niklas Penzel, Tristan Piater, and Joachim Denzler. TCD-arena: Assessing robustness of time series causal discovery methods against assumption violations. In The Fourteenth International Conference on Learning Representations, 2026. URL https://openreview.net/forum?id=MtdrOCLAGY
[66]

George Sugihara, Robert May, Hao Ye, Chih-hao Hsieh, Ethan Deyle, Michael Fogarty, and Stephan Munch. Detecting causality in complex ecosystems. Science, 338(6106):496–500, 2012
[67]

Xiangyu Sun, Oliver Schulte, Guiliang Liu, and Pascal Poupart. NTS-NOTEARS: Learning nonparametric DBNs with prior knowledge. arXiv preprint arXiv:2109.04286, 2021
[68]

Floris Takens. Detecting strange attractors in turbulence. In Dynamical Systems and Turbulence, Warwick 1980: proceedings of a symposium held at the University of Warwick 1979/80, pages 366–381. Springer, 2006
[69]

Alex Tank, Ian Covert, Nicholas Foti, Ali Shojaie, and Emily B Fox. Neural Granger causality. IEEE Transactions on Pattern Analysis and Machine Intelligence, 44(8):4267–4279, 2021
[70]

Michail Tsagris, Giorgos Borboudakis, Vincenzo Lagani, and Ioannis Tsamardinos. Constraint-based causal discovery with mixed data. International Journal of Data Science and Analytics, 6(1):19–30, 2018
[71]

Ruibo Tu, Cheng Zhang, Paul Ackermann, Karthika Mohan, Hedvig Kjellström, and Kun Zhang. Causal discovery in the presence of missing data. In The 22nd International Conference on Artificial Intelligence and Statistics, pages 1762–1770. PMLR, 2019
[72]

Yuhao Wang, Vlado Menkovski, Hao Wang, Xin Du, and Mykola Pechenizkiy. Causal discovery from incomplete data: a deep learning approach. arXiv preprint arXiv:2001.05343, 2020
[73]

Wei Wenjuan, Feng Lu, and Liu Chunchen. Mixed causal structure discovery with application to prescriptive pricing. In Proceedings of the 27th International Joint Conference on Artificial Intelligence, pages 5126–5134, 2018
[74]

Feng Xie, Ruichu Cai, Biwei Huang, Clark Glymour, Zhifeng Hao, and Kun Zhang. Generalized independent noise condition for estimating latent variable causal graphs. In Advances in Neural Information Processing Systems, volume 33, pages 14891–14902, 2020
[75]

Yuqin Yang, AmirEmad Ghassami, Mohamed Nafea, Negar Kiyavash, Kun Zhang, and Ilya Shpitser. Causal discovery in linear latent variable models subject to measurement error. Advances in Neural Information Processing Systems, 35:874–886, 2022
[76]

Yuqin Yang, Mohamed Nafea, Negar Kiyavash, Kun Zhang, and AmirEmad Ghassami. Causal discovery in linear models with unobserved variables and measurement error. arXiv preprint arXiv:2407.19426, 2024
[77]

Huiyang Yi, Yanyan He, Duxin Chen, Mingyu Kang, He Wang, and Wenwu Yu. The robustness of differentiable causal discovery in misspecified scenarios. In The Thirteenth International Conference on Learning Representations, 2025
[78]

Alessio Zanga, Alice Bernasconi, Peter JF Lucas, Hanny Pijnenborg, Casper Reijnen, Marco Scutari, and Fabio Stella. Causal discovery with missing data in a multicentric clinical study. In International Conference on Artificial Intelligence in Medicine, pages 40–44. Springer, 2023
[79]

Alessio Zanga, Alice Bernasconi, Peter JF Lucas, Hanny Pijnenborg, Casper Reijnen, Marco Scutari, and Anthony C Constantinou. Federated causal discovery with missing data in a multicentric study on endometrial cancer. Journal of Biomedical Informatics, page 104877, 2025
[80]

Yan Zeng, Shohei Shimizu, Hidetoshi Matsui, and Fuchun Sun. Causal discovery for linear mixed data. In Conference on Causal Learning and Reasoning, pages 994–1009. PMLR, 2022

Showing first 80 references.