Causality Is Key to Understand and Balance Multiple Goals in Trustworthy ML and Foundation Models
Ensuring trustworthiness in machine learning (ML) systems is crucial as they become increasingly embedded in high-stakes domains. This paper advocates for integrating causal methods into machine learning to navigate the trade-offs among key principles of trustworthy ML, including fairness, privacy, robustness, accuracy, and explainability. While these objectives should ideally be satisfied simultaneously, they are often addressed in isolation, leading to conflicts and suboptimal solutions. Drawing on existing applications of causality in ML that successfully align goals such as fairness and accuracy or privacy and robustness, this paper argues that a causal approach is essential for balancing multiple competing objectives in both trustworthy ML and foundation models. Beyond highlighting these trade-offs, we examine how causality can be practically integrated into ML and foundation models, offering solutions to enhance their reliability and interpretability. Finally, we discuss the challenges, limitations, and opportunities in adopting causal frameworks, paving the way for more accountable and ethically sound AI systems.
Forward citations
Cited by 3 Pith papers
- Trustworthy AI Suffers from Invariance Conflicts and Causality is The Solution
  Causality provides a unifying framework for resolving trade-offs in trustworthy AI by managing invariance conflicts under changes to the data-generating process.
- Trustworthy AI Suffers from Invariance Conflicts and Causality is The Solution
  Causality resolves trade-offs in trustworthy AI by treating them as invariance conflicts under different data-generating process changes.
- Causality as the Statistical Conscience of Artificial Intelligence: From Pearl's Ladder to Trustworthy Machines
  Causality is required for out-of-distribution generalization in AI, with a necessity theorem and unified causal estimators proposed to fix failure modes like hallucination and reward hacking.