Transferring Information Across Interventions in Causal Bayesian Optimization

Mohammad Ali Javidian

arXiv: 2606.01457 · v1 · pith:W6IDKTRGnew · submitted 2026-05-31 · cs.AI · cs.LG · stat.ML

Pith reviewed 2026-06-28 16:46 UTC · model grok-4.3

classification: cs.AI · cs.LG · stat.ML

keywords: causal Bayesian optimization, intervention transfer, causal kernel, linear Gaussian models, regret bounds, information gain, shared parameters, causal graph

The pith

Coupling different interventions through shared causal parameters allows evidence from one to improve estimates of others.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces graph-coupled causal Bayesian optimization to link intervention effects via uncertainty about shared causal parameters in a causal kernel. This transfers information across interventions instead of learning each in isolation. For identifiable linear Gaussian causal models, the kernel has low rank bounded by the number of shared parameters, leading to information gain that grows only logarithmically in the optimization horizon. The resulting regret bound separates errors from optimization, causal estimation, and intervention choice. A reader would care because it reuses sparse interventional data more efficiently when direct interventions are limited.

Core claim

In identifiable linear Gaussian causal models, the proposed causal kernel has low rank, bounded by the number of shared parameters rather than the size of the intervention menu. This yields an information-gain bound that grows only logarithmically in the optimization horizon, and a regret bound that cleanly separates three sources of error: optimization, causal estimation, and the choice of which intervention sets to consider. Nonlinear and adaptive extensions are described.

What carries the argument

The graph-coupled causal kernel, which ties intervention effects together through uncertainty over a small set of shared causal parameters.

If this is right

Information collected from one intervention improves estimates for related interventions.
Information gain grows logarithmically with the optimization horizon rather than with the number of interventions.
Regret bounds separate optimization error, causal estimation error, and errors from choosing intervention sets.
Performance gains are clearest when direct interventions on target parents are unavailable and data must be reused across many candidate interventions.
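
The first consequence in the list above can be illustrated with a minimal sketch. It assumes each intervention effect is a linear function a_i^T θ of a shared parameter vector θ with a Gaussian prior. The menu matrix `A`, the noise variance, and the observed value are hypothetical, and the update is ordinary Bayesian linear regression rather than the paper's implementation.

```python
import numpy as np

def posterior_theta(A_obs, y_obs, prior_mean, prior_cov, noise_var):
    """Gaussian posterior over theta given y_obs = A_obs @ theta + noise."""
    S = A_obs @ prior_cov @ A_obs.T + noise_var * np.eye(len(y_obs))
    gain = prior_cov @ A_obs.T @ np.linalg.inv(S)
    mean = prior_mean + gain @ (y_obs - A_obs @ prior_mean)
    cov = prior_cov - gain @ A_obs @ prior_cov
    return mean, cov

def effect_variances(A, cov):
    """Variance of each intervention effect a_i^T theta under covariance cov."""
    return np.einsum("id,de,ie->i", A, cov, A)

# Hypothetical menu: row i maps the shared parameters to the effect of intervention i.
A = np.array([
    [1.0, 0.0, 0.0],   # intervention 0 depends on theta_0 only
    [1.0, 1.0, 0.0],   # intervention 1 shares theta_0 with intervention 0
    [0.0, 1.0, 1.0],   # intervention 2 shares nothing with intervention 0
    [1.0, 0.0, 1.0],   # intervention 3 shares theta_0 with intervention 0
])
prior_mean = np.zeros(3)
prior_cov = np.eye(3)
noise_var = 0.1

# Observe a single noisy outcome of intervention 0 (value is made up).
A_obs = A[[0]]
y_obs = np.array([0.8])
post_mean, post_cov = posterior_theta(A_obs, y_obs, prior_mean, prior_cov, noise_var)

prior_vars = effect_variances(A, prior_cov)
post_vars = effect_variances(A, post_cov)
print("prior variances:    ", prior_vars)
print("posterior variances:", post_vars)
```

In this toy setup the variance drops for interventions 1 and 3, which share a parameter with the observed intervention, and stays unchanged for intervention 2, which does not. Transfer happens only through the shared parameters.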

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The separation of error sources could help allocate resources between learning the causal model and optimizing the objective.
This coupling might reduce the need for many direct interventions in systems with large intervention menus.
Adaptive versions could dynamically update the shared parameters as more data arrives.

Load-bearing premise

A known causal graph is available together with observational data, and the system is an identifiable linear Gaussian causal model.

What would settle it

Observing that the information gain grows linearly with the number of interventions instead of logarithmically with the horizon in a linear Gaussian system would falsify the low-rank property of the kernel.
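
A rough numerical version of this check can be sketched as follows, assuming a kernel of the form A Cov(θ) A^T over a finite menu and Gaussian observation noise. The menu size, dimension, noise variance, and the random query sequence are all hypothetical. The comparison curve (d/2) log(1 + Tκ/(dσ²)) is the standard bound for a rank-d linear kernel, not a formula quoted from the paper.

```python
import numpy as np

def information_gain(K, noise_var):
    """0.5 * log det(I + K / noise_var) for the Gram matrix K of the queried points."""
    T = K.shape[0]
    _, logdet = np.linalg.slogdet(np.eye(T) + K / noise_var)
    return 0.5 * logdet

rng = np.random.default_rng(0)
menu_size, d, noise_var = 200, 3, 0.5

# Hypothetical linear map from shared parameters to intervention effects.
A = rng.normal(size=(menu_size, d))
cov_theta = np.eye(d)
K_menu = A @ cov_theta @ A.T          # low-rank kernel over the whole menu
kappa = K_menu.diagonal().max()       # largest prior variance on the menu

horizons = [10, 50, 100, 200, 400]
picks = rng.integers(0, menu_size, size=max(horizons))  # arbitrary query sequence

gains, bounds = [], []
for T in horizons:
    idx = picks[:T]
    K_T = K_menu[np.ix_(idx, idx)]
    gains.append(information_gain(K_T, noise_var))
    bounds.append(0.5 * d * np.log(1.0 + T * kappa / (d * noise_var)))

for T, g, b in zip(horizons, gains, bounds):
    print(f"T={T:4d}  information gain={g:7.3f}  rank-d bound={b:7.3f}")
```

The sketch evaluates one arbitrary query sequence rather than the maximizing one, so it only shows that the gain stays under the logarithmic curve. A gain that kept growing in proportion to the number of distinct interventions queried would point to a kernel that is not low rank.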

Figures

Figures reproduced from arXiv: 2606.01457 by Mohammad Ali Javidian.

**Figure 1.** Theory-aligned diagnostics on the linear-Gaussian chain. Left to right: finite rank of the causal kernel, analytic versus numerical …

**Figure 2.** Cross-set transfer stress tests. Left: linear confounded-funnel setting, where GC-CBO substantially reduces held-out intervention …

**Figure 3.** ECOLI70 no-parent intervention experiments for target …

**Figure 4.** Convergence across benchmark scenarios: toy chain graph, synthetic DAG from [1], healthcare PSA from [1], ECOLI70 from [27] …

Original abstract

Bayesian optimization is a popular way to optimize expensive systems, where every experiment, simulation, or intervention costs time or money. In its standard form, it treats the variables we control as plain inputs to a black box and cannot tell apart mere correlation from a real cause and effect. Causal Bayesian optimization closes part of this gap by using a known causal graph together with observational data to decide which variables are worth intervening on. Existing methods, however, learn the effect of each possible intervention almost in isolation, even though in a causal system these effects usually share the same underlying mechanisms. We propose graph-coupled causal Bayesian optimization, which ties the different intervention effects together through the uncertainty we have about a small set of shared causal parameters. The result is a causal kernel that lets evidence collected from one intervention improve our estimate of related interventions. For identifiable linear Gaussian causal models, we show that this kernel has low rank, bounded by the number of shared parameters rather than by the size of the intervention menu. This in turn yields an information-gain bound that grows only logarithmically in the optimization horizon, and a regret bound that cleanly separates three sources of error: optimization, causal estimation, and the choice of which intervention sets to consider. We also describe nonlinear and adaptive extensions. Across theory-aligned Gaussian systems, shared-mechanism stress tests, and standard causal optimization benchmarks, the method keeps the benefits of causal Bayesian optimization while transferring information across related interventions, with the clearest gains when direct interventions on the target's parents are unavailable and sparse interventional data must be reused across a large family of candidate interventions.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper builds a graph-coupled causal kernel that shares information across interventions through a low-dimensional set of shared parameters, producing a low-rank structure and logarithmic information-gain bounds for linear Gaussian models.

The main thing to know is that this paper constructs a causal kernel for Bayesian optimization that ties interventions together through uncertainty over a small set of shared causal parameters. For identifiable linear Gaussian models this kernel has rank bounded by the dimension of those parameters rather than the size of the intervention set.

The new element is the explicit construction that makes each intervention effect an affine function of the same θ. The Gram matrix then takes the form G = A Cov(θ) A^T, so standard finite-rank kernel results give the logarithmic information-gain bound in T. The regret decomposition cleanly isolates optimization error, causal estimation error from the fixed observational data, and menu-selection error. The stress-test note holds up; there is no circular dependence on the adaptive choices.
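
The rank claim in the letter can be checked numerically with a short sketch. The sizes, the random matrix `A`, and the covariance `cov_theta` below are hypothetical stand-ins. The only structure assumed is the one stated above: a Gram matrix of the form A Cov(θ) A^T with one row of A per intervention.

```python
import numpy as np

rng = np.random.default_rng(0)
n_interventions, d = 50, 4   # menu size and number of shared parameters (made up)

# Row i of A maps the shared parameters theta to the effect of intervention i.
A = rng.normal(size=(n_interventions, d))

# Any positive-definite covariance over theta will do for the illustration.
L = rng.normal(size=(d, d))
cov_theta = L @ L.T + 1e-3 * np.eye(d)

# Gram matrix over the whole menu; the affine offsets do not enter the covariance.
G = A @ cov_theta @ A.T

rank = np.linalg.matrix_rank(G)
eigvals = np.linalg.eigvalsh(G)          # ascending order
n_significant = int(np.sum(eigvals > 1e-8 * eigvals[-1]))

print("Gram matrix shape:", G.shape)
print("numerical rank:", rank, "(bound: d =", d, ")")
print("eigenvalues above tolerance:", n_significant)
```

The Gram matrix is 50 by 50, yet its rank cannot exceed the 4 shared parameters, which is why enlarging the menu does not add degrees of freedom to the kernel.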

The paper does well by keeping the causal structure explicit and by showing empirical gains on theory-aligned Gaussian systems and standard benchmarks, especially in regimes where direct interventions on parents are unavailable and data must be reused.

The soft spots are the strong assumptions: a known causal graph plus identifiability in the linear Gaussian class. The nonlinear and adaptive extensions are sketched but do not carry the same rank and bound guarantees. Experimental details on data handling and baselines would need checking for full reproducibility.

This is for researchers already working on causal extensions to Bayesian optimization. A reader interested in structured information transfer in expensive optimization would get concrete value from the kernel and the bound derivation.

It deserves peer review because the central construction and the resulting bounds are specific and checkable under the stated conditions.

Referee Report

1 major / 2 minor

Summary. The paper claims to introduce graph-coupled causal Bayesian optimization, which uses a causal kernel based on shared causal parameters to transfer information across interventions. For identifiable linear Gaussian causal models with known graph and observational data, the kernel has low rank bounded by the number of shared parameters, yielding logarithmic information-gain bounds in the optimization horizon and a regret bound separating optimization, causal estimation, and menu-selection errors. Nonlinear and adaptive extensions are described, with empirical gains on benchmarks especially when direct interventions on parents are unavailable.

Significance. If the results hold, this advances causal Bayesian optimization by enabling information transfer across interventions via shared mechanisms, leading to more efficient optimization in causal systems. The logarithmic bound and error decomposition are notable strengths, as is the focus on identifiable models. The empirical validation on theory-aligned systems and standard benchmarks adds practical value.

major comments (1)

§4.2 (regret bound): the decomposition isolates causal estimation error from fixed observational data, but the paper must explicitly verify that the matrix A in the low-rank Gram matrix G = A Cov(θ) A^T remains independent of the adaptive intervention choices; otherwise the information-gain bound may not separate cleanly from menu-selection error.

minor comments (2)

Abstract: the term 'graph-coupled causal Bayesian optimization' is introduced without a one-sentence definition or pointer to its formal definition in §3.
§5 (experiments): the description of benchmark setups and any data exclusion rules should be expanded with explicit pseudocode or parameter values to support reproducibility of the reported gains.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the positive assessment, the recommendation of minor revision, and the helpful comment on the regret bound. We address the point below.

Referee: §4.2 (regret bound): the decomposition isolates causal estimation error from fixed observational data, but the paper must explicitly verify that the matrix A in the low-rank Gram matrix G = A Cov(θ) A^T remains independent of the adaptive intervention choices; otherwise the information-gain bound may not separate cleanly from menu-selection error.

Authors: We agree that explicit verification of this independence is needed for a clean separation. In our construction the causal graph is known and fixed, the observational data are fixed, and the intervention menu (the finite set of candidate interventions) is chosen a priori and fixed. The matrix A is assembled from the known graph structure and the fixed menu; it encodes the linear mapping from the shared parameters θ to each intervention effect and therefore does not depend on the sequence of adaptively chosen interventions. Consequently G = A Cov(θ) A^T is a fixed low-rank kernel over the intervention space. The information-gain bound is taken with respect to this fixed kernel, while the regret decomposition isolates (i) optimization error arising from the BO acquisition procedure, (ii) causal estimation error arising from the fixed observational data, and (iii) menu-selection error arising from the a-priori choice of the intervention menu. We will add a short clarifying paragraph and a direct statement of A’s independence in §4.2.

Revision: yes

Circularity Check

0 steps flagged

No significant circularity identified

The paper constructs a causal kernel by coupling intervention effects through a shared low-dimensional parameter vector θ in identifiable linear Gaussian models. The claimed low-rank property (rank bounded by dim(θ)) follows directly as a linear-algebra consequence of the kernel definition itself (Gram matrix of the form A Cov(θ) A^T). The subsequent information-gain bound (logarithmic in horizon) is obtained by applying standard results on maximum information gain for finite-rank kernels; the regret decomposition isolates optimization, causal-estimation, and menu-selection errors without any fitted parameter being renamed as a prediction or any self-citation serving as the sole justification. No self-definitional loop, fitted-input-as-prediction, or load-bearing self-citation appears in the derivation chain. The result is mathematically self-contained given the stated model class.
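
The linear-algebra step and the finite-rank information-gain argument referred to above can be written out as follows. This is a sketch of the standard argument under notation introduced here, not a transcription of the paper's proof: a(x) is the coefficient vector of intervention x, Σ = Cov(θ) with θ in R^d, σ² is the observation-noise variance, A_T stacks the a(x_t) for the T queried interventions, and κ is the largest prior variance a(x)^T Σ a(x) over the menu.

```latex
% Assumed notation (not taken from the paper): f(x) = a(x)^\top \theta + b(x),
% \theta \in \mathbb{R}^d, \Sigma = \mathrm{Cov}(\theta),
% \kappa = \max_x a(x)^\top \Sigma\, a(x), noise variance \sigma^2.
\begin{align*}
k(x, x') &= a(x)^\top \Sigma\, a(x'), \qquad
G = A \Sigma A^\top
\;\Longrightarrow\;
\operatorname{rank}(G) \le \min\{\operatorname{rank}(A), \operatorname{rank}(\Sigma)\} \le d, \\[4pt]
I(y_{1:T}; f)
&= \tfrac{1}{2} \log\det\!\big(I_T + \sigma^{-2} A_T \Sigma A_T^\top\big)
 = \tfrac{1}{2} \log\det\!\big(I_d + \sigma^{-2} \Sigma^{1/2} A_T^\top A_T \Sigma^{1/2}\big) \\
&\le \tfrac{d}{2} \log\!\Big(1 + \tfrac{1}{d\,\sigma^{2}} \operatorname{tr}\big(A_T \Sigma A_T^\top\big)\Big)
 \le \tfrac{d}{2} \log\!\Big(1 + \tfrac{T \kappa}{d\,\sigma^{2}}\Big).
\end{align*}
```

The second equality is Sylvester's determinant identity, and the first inequality applies the arithmetic-geometric mean inequality to the d eigenvalues of the matrix inside the determinant. The bound depends on the menu only through κ, so it grows like d log T regardless of how many interventions the menu contains.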

Axiom & Free-Parameter Ledger

1 free parameter · 2 axioms · 0 invented entities

Abstract-only review; ledger reflects only elements explicitly named in the abstract. Full parameter counts, exact assumptions, and any additional entities would require the manuscript body.

free parameters (1)

shared causal parameters
Uncertainty over a small set of shared causal parameters is used to couple intervention effects; no specific fitted numerical values are given.

axioms (2)

domain assumption: Known causal graph is available
Method starts from a known causal graph together with observational data.
domain assumption: System is an identifiable linear Gaussian causal model
Theoretical low-rank kernel and bounds are shown for this class.

Reference graph

Works this paper leans on

34 extracted references · 2 linked inside Pith

[1] Virginia Aglietti, Xiaoyu Lu, Andrei Paleyes, and Javier González. Causal Bayesian optimization. In Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, pages 3155–3164. PMLR, 2020.

[2] Virginia Aglietti, Alan Malek, Ira Ktena, and Silvia Chiappa. Constrained causal Bayesian optimization. In International Conference on Machine Learning, pages 304–321. PMLR, 2023.

[3] Raul Astudillo and Peter I Frazier. Thinking inside the box: A tutorial on grey-box Bayesian optimization. In 2021 Winter Simulation Conference (WSC), pages 1–15. IEEE, 2021.

[4] Ilija Bogunovic and Andreas Krause. Misspecified Gaussian process bandit optimization. Advances in Neural Information Processing Systems, 34:3004–3015, 2021.

[5] Nicola Branchini, Virginia Aglietti, Neil Dhir, and Theodoros Damoulas. Causal entropy optimization. In International Conference on Artificial Intelligence and Statistics, pages 8586–8605. PMLR, 2023.

[6] Samuel Daulton, Maximilian Balandat, and Eytan Bakshy. Differentiable expected hypervolume improvement for parallel multi-objective Bayesian optimization. Advances in Neural Information Processing Systems, 33:9851–9864, 2020.

[7] Samuel Daulton, Maximilian Balandat, and Eytan Bakshy. Parallel Bayesian optimization of multiple noisy objectives with expected hypervolume improvement. Advances in Neural Information Processing Systems, 34:2187–2200, 2021.

[8] Bach Do and Ruda Zhang. Multi-fidelity Bayesian optimization in engineering design. arXiv preprint arXiv:2311.13050, 2023.

[9] David Eriksson and Martin Jankowiak. High-dimensional Bayesian optimization with sparse axis-aligned subspaces. In Uncertainty in Artificial Intelligence, pages 493–503. PMLR, 2021.

[10] Peter I. Frazier. A tutorial on Bayesian optimization. arXiv preprint arXiv:1807.02811, 2018.

[11] Peter I Frazier and Jialei Wang. Bayesian optimization for materials design. In Information Science for Materials Discovery and Design, pages 45–75. Springer, 2015.

[12] Roman Garnett. Bayesian Optimization. Cambridge University Press, 2023.

[13] Limor Gultchin, Virginia Aglietti, Alexis Bellot, and Silvia Chiappa. Functional causal Bayesian optimization. In Uncertainty in Artificial Intelligence, pages 756–765. PMLR, 2023.

[14] Sunil Gupta, Santu Rana, Svetha Venkatesh, et al. Regret bounds for expected improvement algorithms in Gaussian process bandit optimization. In International Conference on Artificial Intelligence and Statistics, pages 8715–8737. PMLR, 2022.

[15] Riley J Hickman, Gary Tom, Yunheng Zou, Matteo Aldeghi, and Alán Aspuru-Guzik. Anubis: Bayesian optimization with unknown feasibility constraints for scientific experimentation. Digital Discovery, 4(8):2104–2122, 2025.

[16] Roger A. Horn and Charles R. Johnson. Matrix Analysis. Cambridge University Press, 2nd edition, 2013.

[17] Md Abir Hossen, Mohammad Ali Javidian, Vignesh Narayanan, Jason M O’Kane, and Pooyan Jamshidi. Multi-objective multi-fidelity Bayesian optimization with causal priors. arXiv preprint arXiv:2602.00788, 2026.

[18] Shogo Iwazaki. Improved regret bounds for Gaussian process upper confidence bound in Bayesian optimization. Advances in Neural Information Processing Systems, 38:96922–96964, 2026.

[19] Luuk Jacobs and Mohammad Ali Javidian. Extending multi-source Bayesian optimization with causality principles. arXiv preprint arXiv:2602.14791, 2026.

[20] Parnian Kassraie and Andreas Krause. Neural contextual bandits without regret. In International Conference on Artificial Intelligence and Statistics, pages 240–278. PMLR, 2022.

[21] Qiaohao Liang, Aldair E Gongora, Zekun Ren, Armi Tiihonen, Zhe Liu, Shijing Sun, James R Deneault, Daniil Bash, Flore Mekki-Berrada, Saif A Khan, et al. Benchmarking the performance of Bayesian optimization across multiple experimental materials science domains. npj Computational Materials, 7(1):188, 2021.

[22] Judea Pearl. Causality: Models, Reasoning, and Inference. Cambridge University Press, 2nd edition, 2009.

[23] Florian Pfisterer, Lennart Schneider, Julia Moosbauer, Martin Binder, and Bernd Bischl. Yahpo gym-an efficient multi-objective multi-fidelity benchmark for hyperparameter optimization. In International Conference on Automated Machine Learning, pages 3–1. PMLR, 2022.

[24] Carl Edward Rasmussen and Christopher K. I. Williams. Gaussian Processes for Machine Learning. MIT Press, 2006.

[25] Shaogang Ren and Xiaoning Qian. Causal Bayesian optimization via exogenous distribution learning. arXiv preprint arXiv:2402.02277, 2024.

[26] Jeremy Roberts and Mohammad Ali Javidian. Causalbo: A python package for causal Bayesian optimization. In SoutheastCon 2024, pages 1370–1375. IEEE, 2024.

[27] Marco Scutari. Learning Bayesian networks with the bnlearn R package. Journal of Statistical Software, 35:1–22, 2010.

[28] Bobak Shahriari, Kevin Swersky, Ziyu Wang, Ryan P Adams, and Nando De Freitas. Taking the human out of the loop: A review of Bayesian optimization. Proceedings of the IEEE, 104(1):148–175, 2015.

[29] Ilya Shpitser and Judea Pearl. Identification of joint interventional distributions in recursive semi-Markovian causal models. In Proceedings of the Twenty-First National Conference on Artificial Intelligence, 2006.

[30] Niranjan Srinivas, Andreas Krause, Sham M. Kakade, and Matthias Seeger. Information-theoretic regret bounds for Gaussian process optimization in the bandit setting. IEEE Transactions on Information Theory, 58(5):3250–3265, 2012.

[31] Scott Sussex, Pier Giuseppe Sessa, Anastasia Makarova, and Andreas Krause. Adversarial causal Bayesian optimization. In International Conference on Learning Representations, volume 2024, pages 19332–19353, 2024.

[32] Sattar Vakili, Kia Khezeli, and Victor Picheny. On information gain and regret bounds in Gaussian process bandits. In Proceedings of the 24th International Conference on Artificial Intelligence and Statistics. PMLR, 2021.

[33] Qian Xie, Raul Astudillo, Peter I Frazier, Ziv Scully, and Alexander Terenin. Cost-aware Bayesian optimization via the Pandora’s box Gittins index. Advances in Neural Information Processing Systems, 37:115523–115562, 2024.

[34] Yilin Xie, Shiqiang Zhang, Joel A. Paulson, and Calvin Tsay. Global optimization of Gaussian process acquisition functions using a piecewise-linear kernel approximation. arXiv preprint arXiv:2410.16893, 2024.
