Robust Simulation Based Inference Through Robust Optimal Transport

Anirban Bhattacharya; Debdeep Pati; Lekha Patel; Peter Matthew Jacobs

arxiv: 2605.18741 · v1 · pith:YTLBUDDZnew · submitted 2026-05-18 · 📊 stat.ME · stat.CO

Pith reviewed 2026-05-20 07:57 UTC · model grok-4.3

keywords: simulation based inference, robust optimal transport, model misspecification, stochastic subgradient, bootstrap uncertainty, geometric contamination, total variation robustness

The pith

A Kullback-Leibler informed robust optimal transport divergence allows consistent parameter recovery in simulation-based inference under combined geometric and total variation misspecification.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a method for simulation-based inference that stays stable when the true data distribution deviates from the model in both geometric ways such as shifts in location and in total variation such as changes in probability mass. It introduces a divergence measure based on optimal transport that incorporates Kullback-Leibler information and draws from empirical likelihood ideas. A stochastic sub-gradient algorithm computes the semi-discrete version of this divergence with a convergence guarantee, and a parallelized procedure adds bootstrap resampling to quantify uncertainty in the inferred parameters. The authors prove mathematically that the divergence is robust specifically under joint geometric plus total variation contamination and demonstrate the approach on a benchmark simulation task. If the claims hold, researchers could obtain useful parameter estimates from simulators even when perfect model match is unrealistic.

Core claim

The central claim is that the Kullback-Leibler informed robust optimal transport divergence is robust under joint geometric plus total variation type contamination between the true distribution $P$ and the closest model $P_{\theta^*}$. This property is shown mathematically, and it supports a stochastic sub-gradient ascent algorithm for estimating the semi-discrete version together with a bootstrap-based parallelized algorithm that delivers parameter estimates and uncertainty quantification for simulation-based inference.

What carries the argument

The Kullback-Leibler informed robust Optimal Transport divergence, which blends optimal transport costs with a robustness adjustment informed by KL divergence to quantify discrepancy between simulated and observed data.
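
For concreteness, the divergence can be written as it is quoted in the "Lean theorems" section of this page. The two bounds in the comments are a reader's gloss that follows directly from the definition (take $Q = P_1$, and note both terms are nonnegative); they are not statements taken from the paper.

```latex
\ell_\lambda(P_1, P_2) := \inf_{Q \ll P_1}
  \left[ \frac{1}{\lambda}\,\mathrm{KL}(Q, P_1) + W_2^2(P_2, Q) \right]
% Choosing Q = P_1 makes the KL term vanish, so
%   0 \le \ell_\lambda(P_1, P_2) \le W_2^2(P_2, P_1).
% A small \lambda penalizes reweighting heavily (Q stays near P_1);
% a large \lambda lets Q discard mass of P_1 that is expensive to transport.
```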

If this is right

The divergence yields parameter estimates that remain consistent under the stated form of misspecification.
The stochastic sub-gradient ascent procedure converges when applied to the semi-discrete robust optimal transport divergence.
Bootstrap resampling on top of the minimum divergence estimator produces reliable uncertainty quantification.
The overall procedure applies directly to complex benchmark simulation-based inference tasks.
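
To make the second point more tangible, here is a minimal sketch of stochastic subgradient ascent for a semi-discrete problem of this type. It is not the authors' algorithm. It assumes that the observed data form a uniform empirical measure on atoms $y_i$, that the standard semi-discrete optimal transport dual can be combined with the Gibbs variational formula for the KL term, and that the infimum and supremum may be exchanged, which gives the concave objective $F(v) = E_{X \sim P_\theta}[\min_i (|X - y_i|^2 - v_i)] - \frac{1}{\lambda}\log \sum_i \frac{1}{n} e^{-\lambda v_i}$. The names `tilted_weights`, `robust_ot_dual_sga`, and `sample_model` are hypothetical.

```python
import numpy as np

def tilted_weights(v, lam):
    """Weights of the reweighted data measure Q implied by potentials v
    (uniform base weights are an assumption of this sketch)."""
    logits = -lam * v
    w = np.exp(logits - logits.max())   # subtract the max for numerical stability
    return w / w.sum()

def robust_ot_dual_sga(y, sample_model, lam=1.0, n_steps=2000, step0=1.0, seed=0):
    """Averaged stochastic subgradient ascent on the assumed dual objective F(v)."""
    rng = np.random.default_rng(seed)
    v = np.zeros(len(y))        # one dual potential per observed atom y_i
    v_bar = np.zeros(len(y))    # running average of the iterates
    for t in range(1, n_steps + 1):
        x = sample_model(rng)                      # one simulator draw X ~ P_theta
        i_star = int(np.argmin((x - y) ** 2 - v))  # atom whose shifted cell contains x
        grad = tilted_weights(v, lam)              # gradient of the log-sum-exp term
        grad[i_star] -= 1.0                        # subgradient of the min term
        v += step0 * t ** -0.75 * grad             # step sizes: sum diverges, squares sum converges
        v_bar += (v - v_bar) / t
    return v_bar

# Usage: 20 clean observations plus one gross outlier, standard normal simulator.
data_rng = np.random.default_rng(1)
y_obs = np.concatenate([data_rng.normal(size=20), [50.0]])
v_hat = robust_ot_dual_sga(y_obs, lambda rng: rng.normal(), lam=1.0)
q_hat = tilted_weights(v_hat, lam=1.0)   # the outlier's weight falls below 1/n
```

In this sketch the outlier is never the nearest atom to a simulated draw, so its potential only grows and its tilted weight shrinks, which is the down-weighting behaviour one would expect from a KL-penalized reweighting.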

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same divergence construction could be adapted to other simulation-based tasks such as model selection if similar robustness properties are established.
The parallel bootstrap structure indicates that uncertainty quantification can scale with available simulation budgets on distributed hardware.
Connections to empirical likelihood may allow borrowing finite-sample efficiency techniques from that literature.
Testing on scientific simulators known to exhibit exactly geometric plus total variation misspecification would provide a direct check on practical utility.
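
The parallel bootstrap structure mentioned above can be sketched as follows. This is an illustration under loud assumptions, not the paper's B-MRSW procedure: it swaps in the plain one-dimensional Wasserstein-2 distance (sorted coupling) for the robust divergence, uses a toy $N(\theta, 1)$ location model fitted by grid search, and uses a thread pool as a stand-in for distributed workers. All function names are hypothetical.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def w2_squared_1d(a, b):
    """Squared Wasserstein-2 distance between two equal-size 1D samples."""
    return float(np.mean((np.sort(a) - np.sort(b)) ** 2))

def min_distance_estimate(data, grid, base_noise):
    """Grid-search minimum-distance fit of the toy N(theta, 1) location model."""
    losses = [w2_squared_1d(data, theta + base_noise) for theta in grid]
    return float(grid[int(np.argmin(losses))])

def bootstrap_estimates(data, n_boot=50, seed=0, workers=4):
    """Refit the estimator on n_boot resamples of the data, in parallel."""
    rng = np.random.default_rng(seed)
    n = len(data)
    grid = np.linspace(-3.0, 3.0, 121)
    base_noise = rng.normal(size=n)   # common random numbers for the simulator
    # Regular bootstrap: resample the observed data with replacement.
    resamples = [rng.choice(data, size=n, replace=True) for _ in range(n_boot)]
    # Replicates are independent, so they can be mapped over a worker pool.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        fits = list(pool.map(
            lambda d: min_distance_estimate(d, grid, base_noise), resamples))
    return np.array(fits)
```

The spread of the returned array (for example its standard deviation or empirical quantiles) is what a bootstrap-based uncertainty statement would be built from; each replicate costs one full minimum-distance fit, which is why the replicates are the natural unit of parallelism.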

Load-bearing premise

The true data-generating distribution differs from the closest model only through a combination of geometric and total variation discrepancies.

What would settle it

A case where the parameter estimator becomes inconsistent or loses coverage when the contamination includes components outside the joint geometric and total variation class would falsify the robustness guarantee.

Figures

Figures reproduced from arXiv: 2605.18741 by Anirban Bhattacharya, Debdeep Pati, Lekha Patel, Peter Matthew Jacobs.

**Figure 1.** In iid statistical inference, there is a statistical model, $\{P_{\theta} : \theta \in \Theta\}$, with the measures defined on a ground space Y. Data $Y_1, \ldots, Y_n \overset{iid}{\sim} P$, and the parameters need to be inferred from the data. However, robustness to multiple forms of contamination is not the only challenge faced by the practitioner. Another common challenge is that the likelihood functions associated to distributions in the mod…

**Figure 2.** Illustration of the full B-MRSW inference pipeline and comparison to Minimum …

**Figure 3.** Panel A: Visualization of the quantile functions of the clean distribution (G-and-K …

**Figure 4.** See Section D.1.1 for full details. Panel titles: "Selection Diagnostic: Elbow Plot of W2(F, Q( ))" and "B-MRSW vs min Wasserstein-2 (densities of bootstrap sample estimates)". Axes in the second panel: x, Density. Legend: B-MRSW bootstrap fitted density; min W2 bootstrap fitted density; Clean: Normal(0.0, 1.0) …

**Figure 5.** See Section D.1.2 for details. … bootstrap samples of the MRSW estimator using Algorithm 2; in addition we also collect 100 bootstrap samples from the Minimum Wasserstein 2 Estimator. (3): Using $\lambda^*$, run MRSW on the empirical measure of the data itself (which corresponds to running Algorithm 2 but with the empirical measure rather than constructing a bootstrap sample of the empirical measure), from which we …

**Figure 6.** See Section D.1.3 for details. D.2 g-and-k experiment: The contaminated data generating distribution for the experiment is $P = (1 - \epsilon) Q + \epsilon \delta_z$, where $Y \sim Q$ if $X \sim \mathrm{GandK}(a = 3, b = 1, g = 2, k = 0.5)$ and $Y = \lfloor X / \rho \rfloor \rho$. We set $\rho = \epsilon = 0.05$ and $z = 50$. We set the initial value to $(a_0, b_0, g_0, k_0) = (5, 0.15, 0.05, 0.05)$, with bounds $(-10, 10)$ for $a$, $(0.1, 10)$ for $b$, $(0.03, 40)$ for $g$, and $(0.05, 3.0)$ for $k$. For the…
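
The contaminated g-and-k distribution described in the Figure 6 text can be simulated with a short sketch like the one below. It assumes the usual g-and-k quantile parameterization with the conventional constant `c = 0.8`, which the excerpt does not state, and the function names are hypothetical. Rounding down to a grid of width $\rho$ plays the role of the geometric perturbation, and the point mass at $z$ plays the role of the total variation perturbation.

```python
import numpy as np
from statistics import NormalDist

def gandk_quantile(u, a, b, g, k, c=0.8):
    """g-and-k quantile function; c = 0.8 is the conventional value (assumed here)."""
    z = np.array([NormalDist().inv_cdf(float(ui)) for ui in np.ravel(u)])
    return a + b * (1.0 + c * np.tanh(g * z / 2.0)) * (1.0 + z ** 2) ** k * z

def sample_contaminated(n, eps=0.05, rho=0.05, z_out=50.0, seed=0):
    """Draw n points from P = (1 - eps) * Q + eps * delta_z as in the quoted setup."""
    rng = np.random.default_rng(seed)
    u = np.clip(rng.uniform(size=n), 1e-12, 1.0 - 1e-12)  # keep inv_cdf inside (0, 1)
    x = gandk_quantile(u, a=3.0, b=1.0, g=2.0, k=0.5)
    y = np.floor(x / rho) * rho            # Q: clean draws rounded down to a grid of width rho
    is_outlier = rng.uniform(size=n) < eps
    y[is_outlier] = z_out                  # with probability eps, replace by the point mass at z
    return y, is_outlier

y, is_outlier = sample_contaminated(2000)
```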

Original abstract

When a statistical model $\{P_{\theta} : \theta \in \Theta\}$ lacks analytically tractable likelihoods, parametric statistical inference based on data generated from an unknown underlying distribution $P$ can still be performed as long as simulations from the model are possible. This approach is called Simulation Based Inference (SBI). Statistical models are rarely exactly correct (that is, $P \notin \{P_{\theta}: \theta \in \Theta\}$), and Robust SBI focuses on inferring a reasonable parameter even under model mis-specification. We focus on the setting where $P$ possesses potentially both geometric and Total Variation type discrepancies from $P_{\theta^*}$. For this problem, we use a Kullback-Leibler informed robust Optimal Transport divergence, motivated by Empirical Likelihood considerations. We introduce a stochastic sub-gradient ascent algorithm with a convergence guarantee for estimating the semi-discrete version of this robust Optimal Transport divergence, and design a parallelized SBI algorithm which employs the regular bootstrap on top of minimum semi-discrete robust Optimal Transport for parameter uncertainty quantification. We demonstrate mathematically why the divergence is robust under a joint geometric plus Total Variation type contamination and then illustrate the robustness of inferences on a complex benchmark SBI task.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper gives a KL-informed robust OT divergence for SBI under geometric plus TV contamination, with a subgradient solver and bootstrap, but the guarantees are narrow and may not match typical benchmarks.

This paper introduces a robust optimal transport divergence for simulation-based inference when the true distribution differs from the model through both geometric and total variation discrepancies. The authors motivate it from empirical likelihood ideas, prove robustness for that specific joint contamination, give a stochastic subgradient algorithm with a convergence guarantee for the semi-discrete case, and wrap it in a bootstrap for uncertainty quantification on a complex benchmark task.

Referee Report

3 major / 2 minor

Summary. The paper proposes a Kullback-Leibler informed robust Optimal Transport divergence for robust simulation-based inference (SBI) under model misspecification where the data-generating distribution P differs from the closest model P_θ* via a combination of geometric and total variation discrepancies. It introduces a stochastic sub-gradient ascent algorithm with a convergence guarantee for the semi-discrete version of this divergence, a parallelized SBI procedure that uses the regular bootstrap on top of minimum semi-discrete robust OT for parameter uncertainty quantification, a mathematical demonstration that the divergence is robust under joint geometric plus TV contamination, and an illustration of the method on a complex benchmark SBI task.

Significance. If the mathematical robustness result holds under the stated contamination model and the convergence guarantee is rigorous, the work provides a theoretically motivated approach to robust SBI that could improve reliability of inferences when simulations are available but the model is misspecified in geometrically structured ways. The combination of an explicit robustness proof for a specific contamination class with a practical bootstrap-based uncertainty procedure is a strength, though its impact depends on how commonly the assumed discrepancy form appears in real SBI applications.

major comments (3)

[Abstract and §4] Robustness demonstration: the claim that the divergence is robust under joint geometric plus Total Variation type contamination is load-bearing for the overall conclusion, yet the explicit contamination model, the precise definition of the joint discrepancy, and the full derivation of the robustness bound are not visible in the abstract. Without these, it is impossible to verify whether the guarantee applies to the misspecification present in the complex benchmark task.
[Abstract and algorithm section] The stochastic sub-gradient ascent algorithm is stated to have a convergence guarantee, but no explicit error bounds, step-size conditions, or rate of convergence are provided. This is central because the practical SBI procedure relies on reliable estimation of the semi-discrete robust OT divergence.
[Benchmark illustration] The empirical results on the complex SBI task are presented as demonstrating robustness, but the paper does not show that the misspecification in that benchmark matches the geometric-plus-TV form for which the mathematical guarantee is derived. If the benchmark contains other discrepancies (e.g., support mismatch or likelihood shape differences), the theoretical justification does not directly support the observed performance.

minor comments (2)

[Notation] Notation for the semi-discrete robust OT divergence should be introduced with a clear equation number early in the manuscript to improve readability.
[Uncertainty quantification] The description of the parallelized bootstrap procedure would benefit from a small pseudocode block or explicit reference to the number of bootstrap replicates used in the experiments.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We are grateful to the referee for their constructive comments, which have helped us identify areas where the manuscript can be improved for clarity and rigor. Below, we provide point-by-point responses to the major comments and outline the revisions we plan to make.

Referee: [Abstract and §4] Robustness demonstration: the claim that the divergence is robust under joint geometric plus Total Variation type contamination is load-bearing for the overall conclusion, yet the explicit contamination model, the precise definition of the joint discrepancy, and the full derivation of the robustness bound are not visible in the abstract. Without these, it is impossible to verify whether the guarantee applies to the misspecification present in the complex benchmark task.

Authors: We thank the referee for highlighting this point. The explicit contamination model and the definition of the joint geometric plus TV discrepancy are introduced in Section 2 and formalized in Section 4, where the full derivation of the robustness bound is provided. To address the concern about visibility, we will revise the abstract to include a concise statement of the contamination model and the robustness guarantee. Additionally, we will add a paragraph in the benchmark section discussing how the misspecification in the complex SBI task corresponds to the assumed joint discrepancy form, thereby making the applicability of the theoretical result to the empirical example explicit.

Revision: partial

Referee: [Abstract and algorithm section] The stochastic sub-gradient ascent algorithm is stated to have a convergence guarantee, but no explicit error bounds, step-size conditions, or rate of convergence are provided. This is central because the practical SBI procedure relies on reliable estimation of the semi-discrete robust OT divergence.

Authors: The manuscript provides a convergence guarantee for the stochastic subgradient ascent algorithm applied to the semi-discrete robust OT divergence, establishing almost-sure convergence under standard stochastic approximation conditions. We agree that more details would be beneficial. In the revision, we will explicitly state the step-size conditions (e.g., the requirements for the learning rate sequence) and clarify that the guarantee is for convergence to the optimal value rather than providing finite-time error bounds or rates, as deriving the latter would necessitate stronger assumptions on the objective function that are not generally satisfied here. This clarification will be added to the algorithm section.

Revision: yes

Referee: [Benchmark illustration] The empirical results on the complex SBI task are presented as demonstrating robustness, but the paper does not show that the misspecification in that benchmark matches the geometric-plus-TV form for which the mathematical guarantee is derived. If the benchmark contains other discrepancies (e.g., support mismatch or likelihood shape differences), the theoretical justification does not directly support the observed performance.

Authors: We acknowledge the importance of linking the empirical demonstration to the theoretical contamination model. The complex benchmark task is selected to exhibit both geometric distortions in the data distribution and total variation discrepancies due to model misspecification. In the revised version, we will provide a more detailed characterization of the misspecification in the benchmark, explaining its alignment with the joint geometric and TV contamination for which robustness is proven. This will help readers see that the observed robust performance is supported by the theory.

Revision: yes
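
As a point of reference for the second response, "standard stochastic approximation conditions" on the learning rate usually mean Robbins-Monro type requirements. The block below states that common form as an assumption; the page does not reproduce the exact conditions used in the manuscript, and the symbol $\alpha_t$ for the step size at iteration $t$ is introduced here only for illustration.

```latex
\sum_{t=1}^{\infty} \alpha_t = \infty,
\qquad
\sum_{t=1}^{\infty} \alpha_t^2 < \infty,
\qquad
\text{e.g. } \alpha_t = \alpha_0\, t^{-\gamma} \text{ with } \gamma \in (1/2,\, 1].
```

The first condition lets the iterates travel arbitrarily far from a poor starting point, and the second makes the accumulated gradient noise summable. Conditions of this kind typically support almost-sure convergence statements but not finite-time rates, which matches the scope the authors describe.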

Circularity Check

0 steps flagged

Derivation self-contained from OT/KL definitions with independent robustness proof

full rationale

The paper defines the robust OT divergence directly from first-principles combination of Optimal Transport and Kullback-Leibler terms, motivated by Empirical Likelihood considerations. It then derives a stochastic sub-gradient algorithm with stated convergence guarantee and provides a separate mathematical demonstration that this divergence is robust specifically under joint geometric plus Total Variation contamination. No equation reduces the target quantity to a fitted parameter by construction, no self-citation is invoked as load-bearing for the central robustness claim, and the benchmark illustration follows from the derived properties rather than presupposing them. The derivation chain remains independent of its own outputs.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

Information is limited to the abstract; the ledger therefore records only the domain assumptions stated there and treats the new divergence as the primary addition.

axioms (2)

domain assumption: Simulations from each $P_{\theta}$ can be generated on demand.
Standard premise of all simulation-based inference stated in the opening sentence of the abstract.
domain assumption: The discrepancy between $P$ and $P_{\theta^*}$ takes the specific joint geometric plus total variation form.
Invoked when the authors claim mathematical robustness under that contamination model.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel
Relation between the paper passage and the cited Recognition theorem: unclear

We use a Kullback-Leibler informed robust Optimal Transport divergence... $\ell_\lambda(P_1, P_2) := \inf_{Q \ll P_1} \left[ \frac{1}{\lambda} \mathrm{KL}(Q, P_1) + W_2^2(P_2, Q) \right]$

IndisputableMonolith/Foundation/RealityFromDistinction.lean reality_from_one_distinction
Relation between the paper passage and the cited Recognition theorem: unclear

Lemma 2.4 (Robustness to G+H Contamination) ... $\ell_\lambda(P, P_{\theta^*}) \le \inf_{\theta} \ell_\lambda(P, P_\theta) + \epsilon + \epsilon^2/\lambda + \rho^2$
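
To get a feel for the size of the slack term in the quoted Lemma 2.4 inequality, the arithmetic below plugs in illustrative numbers. Two assumptions are made here: the quoted (and elided) inequality is taken at face value, and $\epsilon$ and $\rho$ are set to $0.05$, the values used in the g-and-k experiment described in the Figure 6 text. The values of $\lambda$ are hypothetical.

```latex
\epsilon + \frac{\epsilon^2}{\lambda} + \rho^2
  \;=\; 0.05 + \frac{0.0025}{\lambda} + 0.0025
  \;=\;
  \begin{cases}
    0.055  & \lambda = 1, \\
    0.0775 & \lambda = 0.1 .
  \end{cases}
```

Under these assumptions the term linear in $\epsilon$ dominates unless $\lambda$ is small, in which case the $\epsilon^2/\lambda$ term grows.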

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

139 extracted references · 139 canonical work pages · 3 internal anchors

[1]

The 22nd international conference on artificial intelligence and statistics , pages=

Sample complexity of sinkhorn divergences , author=. The 22nd international conference on artificial intelligence and statistics , pages=. 2019 , organization=

work page 2019
[2]

SIAM Review , volume=

Semidual regularized optimal transport , author=. SIAM Review , volume=. 2018 , publisher=

work page 2018
[3]

Advances in neural information processing systems , volume=

Stochastic optimization for large-scale optimal transport , author=. Advances in neural information processing systems , volume=

work page
[4]

Foundations and Trends

Computational optimal transport: With applications to data science , author=. Foundations and Trends. 2019 , publisher=

work page 2019
[5]

The Journal of Machine Learning Research , volume=

Adaptivity of averaged stochastic gradient descent to local strong convexity for logistic regression , author=. The Journal of Machine Learning Research , volume=. 2014 , publisher=

work page 2014
[6]

Kodai mathematical journal , volume=

Elementary proof for Sion's minimax theorem , author=. Kodai mathematical journal , volume=. 1988 , publisher=

work page 1988
[7]

Journal of the European Mathematical Society , volume=

Convergence of a Newton algorithm for semi-discrete optimal transport , author=. Journal of the European Mathematical Society , volume=

work page
[8]

2024 , publisher=

Statistical inference , author=. 2024 , publisher=

work page 2024
[9]

2019 , publisher=

A probability path , author=. 2019 , publisher=

work page 2019
[10]

Principles of mathematical analysis , author=. 3rd ed. , year=

work page
[11]

Decreasing Entropic Regularization Averaged Gradient for Semi-Discrete Optimal Transport , author=

work page
[12]

Asymptotic distribution and convergence rates of stochastic algorithms for entropic optimal transportation between probability measures , author=

work page
[13]

2023 , publisher=

Bayesian optimization , author=. 2023 , publisher=

work page 2023
[14]

International Conference on Artificial Intelligence and Statistics , pages=

Nearly tight convergence bounds for semi-discrete entropic optimal transport , author=. International Conference on Artificial Intelligence and Statistics , pages=. 2022 , organization=

work page 2022
[15]

arXiv preprint arXiv:2510.25287 , year=

Stochastic Optimization in Semi-Discrete Optimal Transport: Convergence Analysis and Minimax Rate , author=. arXiv preprint arXiv:2510.25287 , year=

work page arXiv
[16]

Advances in Neural Information Processing Systems , volume=

A combinatorial algorithm for the semi-discrete optimal transport problem , author=. Advances in Neural Information Processing Systems , volume=

work page
[17]

Bayesian Analysis , volume=

Robust probabilistic inference via a constrained transport metric , author=. Bayesian Analysis , volume=. 2025 , publisher=

work page 2025
[18]

2025 , publisher=

Measure theory and fine properties of functions , author=. 2025 , publisher=

work page 2025
[19]

Lecture Notes for ECE563 (UIUC) and , volume=

Lecture notes on information theory , author=. Lecture Notes for ECE563 (UIUC) and , volume=. 2014 , publisher=

work page 2014
[20]

2025 , publisher=

Statistical optimal transport , author=. 2025 , publisher=

work page 2025
[21]

2008 , publisher=

Probability theory: a comprehensive course , author=. 2008 , publisher=

work page 2008
[22]

2009 , publisher=

Optimal transport: old and new , author=. 2009 , publisher=

work page 2009
[23]

Adam: A Method for Stochastic Optimization

Adam: A method for stochastic optimization , author=. arXiv preprint arXiv:1412.6980 , year=

work page internal anchor Pith review Pith/arXiv arXiv
[24]

Advances in neural information processing systems , volume=

Gans trained by a two time-scale update rule converge to a local nash equilibrium , author=. Advances in neural information processing systems , volume=

work page
[25]

Advances in Neural Information Processing Systems , volume=

On robust optimal transport: Computational complexity and barycenter computation , author=. Advances in Neural Information Processing Systems , volume=

work page
[26]

Breakthroughs in statistics: Methodology and distribution , pages=

Robust estimation of a location parameter , author=. Breakthroughs in statistics: Methodology and distribution , pages=. 1992 , publisher=

work page 1992
[27]

SIAM Journal on Computing , volume=

Robust estimators in high-dimensions without the computational intractability , author=. SIAM Journal on Computing , volume=. 2019 , publisher=

work page 2019
[28]

Information and Inference: A Journal of the IMA , volume=

Robust W-GAN-based estimation under Wasserstein contamination , author=. Information and Inference: A Journal of the IMA , volume=. 2023 , publisher=

work page 2023
[29]

Advances in Neural Information Processing Systems , volume=

Outlier-robust distributionally robust optimization via unbalanced optimal transport , author=. Advances in Neural Information Processing Systems , volume=

work page
[30]

Mathematics of computation , volume=

Scaling algorithms for unbalanced optimal transport problems , author=. Mathematics of computation , volume=

work page
[31]

2017 , school=

Unbalanced optimal transport: Models, numerical methods, applications , author=. 2017 , school=

work page 2017
[32]

Operations Research , volume=

Wasserstein distributionally robust optimization and variation regularization , author=. Operations Research , volume=. 2024 , publisher=

work page 2024
[33]

The 22nd international conference on artificial intelligence and statistics , pages=

Sequential neural likelihood: Fast likelihood-free inference with autoregressive flows , author=. The 22nd international conference on artificial intelligence and statistics , pages=. 2019 , organization=

work page 2019
[34]

International conference on machine learning , pages=

Automatic posterior transformation for likelihood-free inference , author=. International conference on machine learning , pages=. 2019 , organization=

work page 2019
[35]

Advances in neural information processing systems , volume=

Fast -free inference of simulation models with bayesian conditional density estimation , author=. Advances in neural information processing systems , volume=

work page
[36]

Symposium on Advances in Approximate Bayesian Inference , pages=

MMD-Bayes: Robust Bayesian estimation via maximum mean discrepancy , author=. Symposium on Advances in Approximate Bayesian Inference , pages=. 2020 , organization=

work page 2020
[37]

arXiv preprint arXiv:2104.03889 , year=

Generalized Bayesian likelihood-free inference , author=. arXiv preprint arXiv:2104.03889 , year=

work page arXiv
[38]

Electronic Journal of Statistics , volume=

Generalized Bayesian likelihood-free inference , author=. Electronic Journal of Statistics , volume=. 2024 , publisher=

work page 2024
[40]

International Conference on Artificial Intelligence and Statistics , pages=

Robust Bayesian inference for simulator-based models via the MMD posterior bootstrap , author=. International Conference on Artificial Intelligence and Statistics , pages=. 2022 , organization=

work page 2022
[41]

Statistical Science , volume=

Distributionally robust optimization and robust statistics , author=. Statistical Science , volume=. 2025 , publisher=

work page 2025
[42]

Journal of the American statistical Association , volume=

Better bootstrap confidence intervals , author=. Journal of the American statistical Association , volume=. 1987 , publisher=

work page 1987
[43]

Information and Inference: A Journal of the IMA , volume=

On parameter estimation with the Wasserstein distance , author=. Information and Inference: A Journal of the IMA , volume=. 2019 , publisher=

work page 2019
[44]

International Conference on Machine Learning , pages=

Outlier-robust optimal transport , author=. International Conference on Machine Learning , pages=. 2021 , organization=

work page 2021
[46]

Advances in Neural Information Processing Systems , volume=

Outlier-robust wasserstein dro , author=. Advances in Neural Information Processing Systems , volume=

work page
[47]

2001 , publisher=

Empirical likelihood , author=. 2001 , publisher=

work page 2001
[48]

Annual Review of Statistics and its Application , volume=

A review of empirical likelihood , author=. Annual Review of Statistics and its Application , volume=. 2021 , publisher=

work page 2021
[49]

Journal of the American Statistical Association , volume=

Bayesian estimation and comparison of moment condition models , author=. Journal of the American Statistical Association , volume=. 2018 , publisher=

work page 2018
[50]

1999 , publisher=

Elements of information theory , author=. 1999 , publisher=

work page 1999
[51]

Biometrika , volume=

Bayesian exponentially tilted empirical likelihood , author=. Biometrika , volume=. 2005 , publisher=

work page 2005
[54]

The book of GENESIS: exploring realistic neural models with the GEneral NEural SImulation System , pages=

The hodgkin—huxley model , author=. The book of GENESIS: exploring realistic neural models with the GEneral NEural SImulation System , pages=. 1998 , publisher=

work page 1998
[55]

Scientific reports , volume=

Predicting the epidemic threshold of the susceptible-infected-recovered model , author=. Scientific reports , volume=. 2016 , publisher=

work page 2016
[57]

Annual Review of Ecology and Systematics , volume=

Lotka-Volterra population models , author=. Annual Review of Ecology and Systematics , volume=. 1978 , publisher=

work page 1978
[58]

The Bernstein-von-Mises theorem under misspecification , author=

work page
[59]

Annual review of ecology, evolution, and systematics , volume=

Approximate Bayesian computation in evolution and ecology , author=. Annual review of ecology, evolution, and systematics , volume=. 2010 , publisher=

work page 2010
[60]

Journal of the American Statistical Association , volume=

Approximate Bayesian computation: a nonparametric perspective , author=. Journal of the American Statistical Association , volume=. 2010 , publisher=

work page 2010
[61]

Genetics , volume=

Approximate Bayesian computation in population genetics , author=. Genetics , volume=. 2002 , publisher=

work page 2002
[62]

Systematic biology , volume=

Fundamentals and recent developments in approximate Bayesian computation , author=. Systematic biology , volume=. 2017 , publisher=

work page 2017
[63]

Journal of the Royal Statistical Society Series B: Statistical Methodology , volume=

Approximate Bayesian computation with the Wasserstein distance , author=. Journal of the Royal Statistical Society Series B: Statistical Methodology , volume=. 2019 , publisher=

work page 2019
[64]

Journal of the American Statistical Association , year=

Robust Bayesian inference via coarsening , author=. Journal of the American Statistical Association , year=

work page
[65]

Bayesian fractional posteriors , author=

work page
[66]

Journal of the Royal Statistical Society Series B: Statistical Methodology , volume=

A general framework for updating belief distributions , author=. Journal of the Royal Statistical Society Series B: Statistical Methodology , volume=. 2016 , publisher=

work page 2016
[67]

Biometrika , volume=

General Bayesian updating and the loss-likelihood bootstrap , author=. Biometrika , volume=. 2019 , publisher=

work page 2019
[68]

Journal of the royal statistical society: series D (the Statistician) , volume=

Markov chain Monte Carlo method and its application , author=. Journal of the royal statistical society: series D (the Statistician) , volume=. 1998 , publisher=

work page 1998
[69]

International conference on machine learning , pages=

On gradient descent ascent for nonconvex-concave minimax problems , author=. International conference on machine learning , pages=. 2020 , organization=

work page 2020
[70]

Evolutionary computation , volume=

Reducing the time complexity of the derandomized evolution strategy with covariance matrix adaptation (CMA-ES) , author=. Evolutionary computation , volume=. 2003 , publisher=

work page 2003
[71]

Towards a new evolutionary computation: Advances in the estimation of distribution algorithms , pages=

The CMA evolution strategy: a comparing review , author=. Towards a new evolutionary computation: Advances in the estimation of distribution algorithms , pages=. 2006 , publisher=

work page 2006
[72]

Advances in Neural Information Processing Systems , volume=

Robust optimal transport with applications in generative modeling and domain adaptation , author=. Advances in Neural Information Processing Systems , volume=

work page
[73] Simulation-based inference methods for particle physics. Artificial Intelligence for High Energy Physics, 2022.
[74] Simulation-based inference: A survey with special reference to panel data models. Journal of Econometrics, 1993.
[76] Statistical inference: the minimum distance approach. 2011.
[77] Nelder-Mead algorithm. Scholarpedia.
[78] On choosing and bounding probability metrics. International Statistical Review, 2002.
[79] MMD GAN: Towards deeper understanding of moment matching network. Advances in Neural Information Processing Systems.
[80] The Bayesian bootstrap. The Annals of Statistics, 1981.
[81] Randomized stochastic gradient descent ascent. International Conference on Artificial Intelligence and Statistics, 2022.
[82] Neural methods for amortized inference. Annual Review of Statistics and Its Application, 2025.
[83] Wasserstein wormhole: Scalable optimal transport distance with transformers. arXiv.
[84] Yogesh Balaji, Rama Chellappa, and Soheil Feizi. Robust optimal transport with applications in generative modeling and domain adaptation. Advances in Neural Information Processing Systems, 33:12934--12944, 2020.
[85] Ayanendranath Basu, Hiroyuki Shioya, and Chanseok Park. Statistical inference: the minimum distance approach. CRC Press, 2011.
[86] Mark A Beaumont. Approximate Bayesian computation in evolution and ecology. Annual Review of Ecology, Evolution, and Systematics, 41(1):379--406, 2010.
