Causal Software Engineering: A Vision and Roadmap
Pith reviewed 2026-05-08 17:47 UTC · model grok-4.3
The pith
Causal models should systematically inform decisions across the full software engineering lifecycle instead of relying on correlations alone.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We propose Causal Software Engineering (CSE) as a future paradigm in which causal models and causal reasoning systematically inform activities across the software lifecycle, augmenting existing practices with explicit assumptions, uncertainty-aware effect estimates, and counterfactual diagnosis. We outline a causal-first workflow view spanning development and operations, a staged roadmap for tools and organizational adoption, and an evaluation and benchmark agenda for measuring progress.
What carries the argument
Causal models that support interventional and counterfactual queries, applied to software engineering data throughout the development and operations lifecycle.
Load-bearing premise
Causal models can be practically constructed and validated from the noisy, incomplete, and socio-technical data typical in software engineering contexts.
What would settle it
A real-world software project where causal models are built from available data and used to guide decisions yet produce no measurable improvement in outcomes, reliability, or understanding of change effects compared with standard correlational methods.
Original abstract
Software engineering increasingly involves making high-stakes decisions under uncertainty, using signals from code, field data, and socio-technical processes. Recent AI-driven support (e.g., anomaly detection, predictive analytics, AIOps, as well as LLM-based agents) has amplified engineers' ability to detect patterns and synthesize content and recommendations, but many critical questions are interventional or counterfactual: What is the expected impact of changing a load-balancing strategy? Would an outage have been avoided under a different release plan? Correlational models answer "what tends to co-occur"; they struggle to answer "what would happen if we act." We propose Causal Software Engineering (CSE) as a future paradigm in which causal models and causal reasoning systematically inform activities across the software lifecycle, augmenting existing practices with explicit assumptions, uncertainty-aware effect estimates, and counterfactual diagnosis. We outline (i) a causal-first workflow view spanning development and operations, (ii) a staged roadmap for tools and organizational adoption, and (iii) an evaluation and benchmark agenda for measuring progress.
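The abstract's contrast between "what tends to co-occur" and "what would happen if we act" can be illustrated with a small simulation. This is a minimal sketch, not anything taken from the paper: the scenario (operators preferring a new load-balancing strategy under high traffic), the effect sizes, and the function names `simulate`, `naive_difference`, and `adjusted_difference` are all assumptions made for illustration. It compares a plain difference in means with a stratified (backdoor-adjusted) estimate when traffic confounds the choice of strategy.

```python
import random
from statistics import mean


def simulate(n=20000, seed=0):
    """Generate toy (high_traffic, new_strategy, latency_ms) observations.

    Assumed data-generating process: the new strategy lowers latency by 20 ms,
    high traffic adds 100 ms, and operators pick the new strategy far more
    often when traffic is high (this is the confounding).
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        high_traffic = rng.random() < 0.5
        new_strategy = rng.random() < (0.8 if high_traffic else 0.2)
        latency = (
            100.0
            + (100.0 if high_traffic else 0.0)
            + (-20.0 if new_strategy else 0.0)
            + rng.gauss(0, 10)
        )
        rows.append((high_traffic, new_strategy, latency))
    return rows


def naive_difference(rows):
    """Correlational answer: mean latency with the new strategy minus without."""
    treated = [y for _, t, y in rows if t]
    control = [y for _, t, y in rows if not t]
    return mean(treated) - mean(control)


def adjusted_difference(rows):
    """Interventional answer via stratification on the observed confounder."""
    total = 0.0
    for level in (False, True):
        stratum = [(t, y) for z, t, y in rows if z == level]
        treated = [y for t, y in stratum if t]
        control = [y for t, y in stratum if not t]
        # Weight each stratum's effect by its share of the data.
        total += (mean(treated) - mean(control)) * len(stratum) / len(rows)
    return total


if __name__ == "__main__":
    rows = simulate()
    print("naive difference (ms):   ", round(naive_difference(rows), 1))
    print("adjusted difference (ms):", round(adjusted_difference(rows), 1))
```

In this toy setup the naive comparison makes the new strategy look harmful, because it is mostly observed under high traffic, while the adjusted estimate lands near the assumed 20 ms improvement. The adjustment only works here because the confounder is observed and the causal structure is known by construction.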
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes Causal Software Engineering (CSE) as a future paradigm in which causal models and causal reasoning systematically inform activities across the software lifecycle. It augments existing correlational AI practices (anomaly detection, predictive analytics, AIOps, LLM agents) by emphasizing explicit assumptions, uncertainty-aware effect estimates, and counterfactual diagnosis for interventional questions such as the impact of changing load-balancing strategies or release plans. The paper outlines (i) a causal-first workflow spanning development and operations, (ii) a staged roadmap for tools and organizational adoption, and (iii) an evaluation and benchmark agenda.
Significance. If the vision is realized, CSE could shift software engineering from pattern detection to reliable interventional and counterfactual reasoning, improving high-stakes decisions under uncertainty in socio-technical systems. As a vision and roadmap paper without empirical results or formal derivations, its primary contribution is framing an open research direction and evaluation agenda rather than delivering validated methods.
major comments (1)
- The central feasibility claim—that causal models can be practically constructed and validated from the noisy, incomplete, and socio-technical data typical in software engineering—is presented as an open challenge in the roadmap but is load-bearing for the entire proposal. No concrete identifiability conditions, data requirements, or integration mechanisms with existing SE artifacts (e.g., logs, issue trackers, code repositories) are specified, leaving the transition from correlational to causal inference underspecified.
minor comments (2)
- The abstract and workflow description reference LLM-based agents but do not clarify how causal reasoning would be integrated with or augment them; a brief illustrative example in the workflow section would improve clarity.
- The evaluation agenda section would benefit from explicit metrics or benchmark tasks that distinguish causal from correlational performance (e.g., counterfactual accuracy on synthetic SE scenarios).
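One way to read the second minor comment is as a request for a benchmark in which ground-truth counterfactuals are known by construction. The following is a hedged sketch of such a task, not a benchmark proposed by the paper or the referee: the structural equation, the `EFFECT` values, and the names `make_benchmark`, `fit_correlational`, `fit_stratified`, and `counterfactual_mae` are hypothetical. Each synthetic unit records its observed latency and the latency it would have had under the opposite decision with the same noise, so predictors can be scored on counterfactual error.

```python
import random
from statistics import mean

# Assumed true effect of a change on latency (ms), per traffic level.
EFFECT = {False: -10.0, True: -30.0}


def latency(high_traffic, changed, noise):
    """Structural equation of the toy system (an assumption of this sketch)."""
    base = 100.0 + (100.0 if high_traffic else 0.0)
    return base + (EFFECT[high_traffic] if changed else 0.0) + noise


def make_benchmark(n=20000, seed=1):
    """Create units with both the observed and the counterfactual outcome."""
    rng = random.Random(seed)
    units = []
    for _ in range(n):
        z = rng.random() < 0.5                   # high traffic?
        t = rng.random() < (0.8 if z else 0.2)   # change applied (confounded by z)
        u = rng.gauss(0, 10)                     # unit-level noise, shared by both worlds
        units.append({"z": z, "t": t, "y": latency(z, t, u),
                      "y_cf": latency(z, not t, u)})
    return units


def diff_in_means(units):
    return (mean(r["y"] for r in units if r["t"])
            - mean(r["y"] for r in units if not r["t"]))


def fit_correlational(units):
    """Predict the flipped outcome using the raw treated-vs-untreated gap."""
    delta = diff_in_means(units)
    return lambda r: r["y"] + delta * (-1 if r["t"] else 1)


def fit_stratified(units):
    """Predict the flipped outcome using per-stratum effect estimates."""
    delta = {z: diff_in_means([r for r in units if r["z"] == z])
             for z in (False, True)}
    return lambda r: r["y"] + delta[r["z"]] * (-1 if r["t"] else 1)


def counterfactual_mae(predict, units):
    """Mean absolute error against the known counterfactual outcomes."""
    return mean(abs(predict(r) - r["y_cf"]) for r in units)


if __name__ == "__main__":
    units = make_benchmark()
    train, test = units[:10000], units[10000:]
    print("correlational MAE:", round(counterfactual_mae(fit_correlational(train), test), 1))
    print("stratified MAE:   ", round(counterfactual_mae(fit_stratified(train), test), 1))
```

The metric separates the two predictors only because the simulator exposes `y_cf`; on real engineering data that quantity is never observed, which is why the comment points to synthetic scenarios.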
Simulated Author's Rebuttal
We thank the referee for their constructive review and for recognizing the potential of Causal Software Engineering as a framing for future research. We address the single major comment below and propose targeted revisions to strengthen the manuscript.
Point-by-point responses
Referee: The central feasibility claim—that causal models can be practically constructed and validated from the noisy, incomplete, and socio-technical data typical in software engineering—is presented as an open challenge in the roadmap but is load-bearing for the entire proposal. No concrete identifiability conditions, data requirements, or integration mechanisms with existing SE artifacts (e.g., logs, issue trackers, code repositories) are specified, leaving the transition from correlational to causal inference underspecified.
Authors: We agree that the practical construction and validation of causal models from typical SE data constitutes a load-bearing challenge for the proposal. As a vision and roadmap paper, the manuscript deliberately positions this as an open research direction rather than a resolved capability, consistent with its stated contribution of outlining a paradigm and evaluation agenda. To reduce underspecification while remaining within the vision-paper scope, we will add a concise subsection in the roadmap that (i) sketches identifiability conditions adapted from causal inference literature (e.g., partial identification under bounded unobserved confounding in controlled deployment settings and use of instrumental variables from release policies), (ii) outlines minimal data requirements (e.g., timestamped logs with intervention markers and version-control provenance), and (iii) provides illustrative integration mechanisms with existing artifacts such as mapping issue-tracker metadata to causal graph nodes or using A/B-test infrastructure for effect estimation. These additions will clarify the transition without asserting solved feasibility.
Revision: partial
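The rebuttal's mention of instrumental variables derived from release policies can be made concrete with a toy example. This is a sketch under strong assumptions and is not drawn from the manuscript: it supposes a release policy that assigns services to an early rollout wave at random, that the wave affects error counts only through feature exposure (the exclusion restriction), and that the effect is constant across services. The field names `early_wave`, `exposed`, and `errors`, the confounder `fragile`, and all numeric values are hypothetical. It uses the standard Wald ratio estimator.

```python
import random
from statistics import mean

TRUE_EFFECT = -8.0  # assumed effect of feature exposure on error count (toy value)


def simulate_release_log(n=40000, seed=2):
    """Toy release log with an unlogged confounder (service fragility)."""
    rng = random.Random(seed)
    log = []
    for _ in range(n):
        early_wave = rng.random() < 0.5   # instrument: randomized by release policy
        fragile = rng.random() < 0.5      # unobserved confounder, never written to the log
        p_exposed = 0.2 + 0.5 * early_wave + 0.2 * fragile
        exposed = rng.random() < p_exposed
        errors = 50.0 + TRUE_EFFECT * exposed + 30.0 * fragile + rng.gauss(0, 5)
        log.append({"early_wave": early_wave, "exposed": exposed, "errors": errors})
    return log


def group_mean(log, key, flag, field):
    """Mean of `field` over records where `key` equals `flag`."""
    return mean(float(r[field]) for r in log if r[key] == flag)


def naive_estimate(log):
    """Exposed-vs-unexposed gap; biased because fragility drives both."""
    return (group_mean(log, "exposed", True, "errors")
            - group_mean(log, "exposed", False, "errors"))


def wald_iv_estimate(log):
    """Wald estimator: effect of the wave on errors / effect of the wave on exposure."""
    reduced_form = (group_mean(log, "early_wave", True, "errors")
                    - group_mean(log, "early_wave", False, "errors"))
    first_stage = (group_mean(log, "early_wave", True, "exposed")
                   - group_mean(log, "early_wave", False, "exposed"))
    return reduced_form / first_stage


if __name__ == "__main__":
    log = simulate_release_log()
    print("naive estimate:  ", round(naive_estimate(log), 2))
    print("Wald IV estimate:", round(wald_iv_estimate(log), 2))
```

Note that the log contains only timestamp-free toy fields; in practice the intervention markers and version-control provenance that the authors list as minimal data requirements would be needed to establish which wave and which exposure each record belongs to, and the exclusion restriction would have to be argued rather than assumed.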
Circularity Check
No significant circularity identified
Full rationale
The manuscript is a forward-looking vision and roadmap paper that proposes Causal Software Engineering as a future paradigm without presenting any derivations, equations, fitted parameters, predictions, or formal results. Its content consists of a high-level workflow outline, adoption stages, and an evaluation agenda; no load-bearing claim reduces to an input by construction, self-definition, or self-citation chain. The central proposal is explicitly aspirational and defers feasibility questions, so no circularity patterns apply.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: Correlational models are insufficient for answering interventional and counterfactual questions in software engineering.
invented entities (1)
- Causal Software Engineering (CSE): no independent evidence