Divergence is Uncertainty: A Closed-Form Posterior Covariance for Flow Matching

Jian Wang; Jiarui Xing; Song Wang

arxiv: 2605.00941 · v4 · pith:3MQ3NOMBnew · submitted 2026-05-01 · 💻 cs.LG · cs.CV

Pith reviewed 2026-05-22 10:28 UTC · model grok-4.3

keywords: flow matching · posterior covariance · uncertainty quantification · Tweedie's formula · velocity field divergence · generative modeling · one-step generation

The pith

Flow matching uncertainty reduces exactly to the divergence of the learned velocity field.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that uncertainty in samples from flow matching models has an exact closed-form expression that requires no new training. By extending Tweedie's formula from denoising models to the flow matching interpolant, the posterior covariance at any point on the generative trajectory depends only on the divergence of the velocity field. This quantity is available from any pre-trained model and can be evaluated without architectural changes. For one-step generators the same expression gives the full end-to-end uncertainty in a single forward pass. Experiments indicate that the resulting uncertainty maps highlight semantically meaningful regions and that the scalar score tracks actual error, all at far lower cost than ensembles or Monte Carlo methods.

Core claim

By extending Tweedie's formula from the denoising setting to the flow matching interpolant, we derive an exact, closed-form expression for the posterior covariance at every point along the generative trajectory. The result depends on a single quantity, namely the divergence of the learned velocity field, which can be computed post-hoc on any pre-trained flow matching model, requiring no retraining and no architectural modification.
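As an illustration of how the divergence might be computed post-hoc, here is a minimal sketch of a Hutchinson-style trace estimator with Rademacher probes. The linear field `velocity`, the matrix `A`, and the finite-difference step `h` are hypothetical stand-ins introduced for this example; with a real network one would more likely use an autodiff Jacobian-vector product instead of finite differences.

```python
import random

# Hypothetical stand-in for a learned velocity field: v(x) = A x.
A = [
    [-1.0, 0.1, 0.0, 0.2],
    [0.1, -0.5, 0.1, 0.0],
    [0.0, 0.1, -2.0, 0.1],
    [0.2, 0.0, 0.1, -0.8],
]


def velocity(x):
    """Evaluate the toy field v(x) = A x (an assumption, not a trained model)."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def hutchinson_divergence(v, x, n_probes=2000, h=1e-4, seed=0):
    """Estimate div v(x) = tr(J_v) as the average of eps^T J_v eps."""
    rng = random.Random(seed)
    v0 = v(x)
    total = 0.0
    for _ in range(n_probes):
        # Rademacher probe: each entry is -1 or +1 with equal probability.
        eps = [rng.choice((-1.0, 1.0)) for _ in x]
        x_shift = [xi + h * ei for xi, ei in zip(x, eps)]
        # Forward-difference approximation of the product J_v eps.
        jvp = [(a - b) / h for a, b in zip(v(x_shift), v0)]
        total += sum(ei * ji for ei, ji in zip(eps, jvp))
    return total / n_probes


x = [0.3, -0.1, 0.7, 0.2]
estimate = hutchinson_divergence(velocity, x)
exact = sum(A[i][i] for i in range(len(A)))  # trace of A
print(estimate, exact)
```

For this toy field the exact divergence is the trace of `A`, so the estimate can be checked directly; the probe count trades accuracy against the number of velocity evaluations.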

What carries the argument

The closed-form posterior covariance obtained by extending Tweedie's formula to the flow matching interpolant, expressed solely in terms of the divergence of the learned velocity field.
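The route from Tweedie's formula to the velocity Jacobian can be sketched as follows. This is a reconstruction under stated assumptions, not the paper's own derivation: it assumes the linear interpolant $x_t = (1-t)x_0 + t x_1$ with a standard Gaussian source $x_0 \sim \mathcal{N}(0, I)$ independent of the data $x_1$, the exact marginal velocity $v$ rather than a learned approximation, and the symbol $p_t$ (introduced here) for the marginal density of $x_t$. Under those assumptions the steps arrive at the formula quoted in the Figure 1 caption.

```latex
\begin{align*}
x_t \mid x_1 &\sim \mathcal{N}\!\big(t\,x_1,\ (1-t)^2 I\big) \\
\mathbb{E}[x_1 \mid x_t] &= \tfrac{1}{t}\big[x_t + (1-t)^2\,\nabla \log p_t(x_t)\big]
  && \text{(Tweedie, first order)} \\
\mathrm{Cov}(x_1 \mid x_t) &= \tfrac{(1-t)^2}{t^2}\big[I + (1-t)^2\,\nabla^2 \log p_t(x_t)\big]
  && \text{(Tweedie, second order)} \\
v(x_t, t) &= \mathbb{E}[x_1 - x_0 \mid x_t] = \frac{\mathbb{E}[x_1 \mid x_t] - x_t}{1-t} \\
I + (1-t)\,J_v &= \nabla_{x_t}\,\mathbb{E}[x_1 \mid x_t]
  = \tfrac{1}{t}\big[I + (1-t)^2\,\nabla^2 \log p_t(x_t)\big] \\
\mathrm{Cov}(x_1 \mid x_t) &= \frac{(1-t)^2}{t}\big[I + (1-t)\,J_v\big]
\end{align*}
```

The fourth line uses $x_0 = (x_t - t\,x_1)/(1-t)$, which is where the linear interpolant enters; the last line follows by substituting the fifth into the third.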

If this is right

Uncertainty can be obtained for any pre-trained flow matching model without retraining or auxiliary heads.
One-step generators such as MeanFlow produce end-to-end generation uncertainty in a single forward pass.
Per-pixel uncertainty maps concentrate on high-variation regions such as digit boundaries.
A scalar uncertainty score derived from the same expression tracks actual prediction error.
Uncertainty evaluation requires orders of magnitude less compute than ensembling or Monte Carlo dropout.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same divergence-based relation may supply closed-form uncertainty for other continuous-time generative models.
These uncertainty maps could be used to prioritize regions for refinement or to weight samples in downstream tasks.
Direct comparison of the derived covariance against ground-truth variance on controlled synthetic data would provide a sharper test.
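A comparison of this kind can be sketched on the simplest synthetic case. The example below is an assumption-laden toy, not an experiment from the paper: it takes one-dimensional data $x_1 \sim \mathcal{N}(0, s^2)$ and noise $x_0 \sim \mathcal{N}(0, 1)$, for which both the marginal velocity and the posterior variance are known in closed form, and checks that the formula from the Figure 1 caption reproduces the ground truth. All function names are hypothetical.

```python
def posterior_var_exact(t, s2):
    """Var(x1 | xt) for x0 ~ N(0, 1), x1 ~ N(0, s2), xt = (1 - t) x0 + t x1."""
    return s2 * (1 - t) ** 2 / ((1 - t) ** 2 + t ** 2 * s2)


def velocity(x, t, s2):
    """Exact marginal velocity E[x1 - x0 | xt = x] for this Gaussian toy."""
    var_t = (1 - t) ** 2 + t ** 2 * s2
    mean_x1 = t * s2 * x / var_t  # E[x1 | xt = x]
    return (mean_x1 - x) / (1 - t)


def posterior_var_formula(x, t, s2, h=1e-5):
    """(1 - t)^2 / t * [1 + (1 - t) dv/dx], with dv/dx by central differences."""
    dv = (velocity(x + h, t, s2) - velocity(x - h, t, s2)) / (2 * h)
    return (1 - t) ** 2 / t * (1 + (1 - t) * dv)


results = []
for t in (0.1, 0.5, 0.9):
    pair = (posterior_var_formula(0.4, t, 4.0), posterior_var_exact(t, 4.0))
    results.append(pair)
    print(t, pair)
```

Because the velocity here is exact rather than learned, agreement only tests the formula itself; a trained model would add approximation error on top.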

Load-bearing premise

The mathematical extension of Tweedie's formula from the denoising setting to the continuous flow matching interpolant holds.

What would settle it

Train a flow matching model on MNIST, compute the formula's covariance using the velocity divergence at selected points, then compare it directly to the empirical covariance measured across many independent generated samples at those same points.
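The shape of that comparison can be sketched on a toy problem small enough to run without training anything. The example below is a stand-in for the MNIST experiment, not a reproduction of it: it assumes one-dimensional data $x_1 \in \{-1, +1\}$ with equal probability and noise $x_0 \sim \mathcal{N}(0, 1)$, uses the exact marginal velocity in place of a trained network, and measures the empirical variance of $x_1$ among simulated interpolant pairs whose $x_t$ lands near the query point. Function names, the bin half-width, and the sample count are illustrative choices.

```python
import math
import random


def formula_var(x, t, h=1e-5):
    """(1 - t)^2 / t * [1 + (1 - t) dv/dx] for the two-point toy."""
    def v(y):
        mean_x1 = math.tanh(t * y / (1 - t) ** 2)  # E[x1 | xt = y]
        return (mean_x1 - y) / (1 - t)

    dv = (v(x + h) - v(x - h)) / (2 * h)
    return (1 - t) ** 2 / t * (1 + (1 - t) * dv)


def empirical_var(x, t, n=200_000, half_width=0.05, seed=0):
    """Variance of x1 among simulated pairs whose xt falls within half_width of x."""
    rng = random.Random(seed)
    kept = []
    for _ in range(n):
        x1 = rng.choice((-1.0, 1.0))
        xt = (1 - t) * rng.gauss(0.0, 1.0) + t * x1
        if abs(xt - x) < half_width:
            kept.append(x1)
    mean = sum(kept) / len(kept)
    return sum((k - mean) ** 2 for k in kept) / len(kept)


predicted = formula_var(0.3, 0.5)
measured = empirical_var(0.3, 0.5)
print(predicted, measured)
```

The two numbers should agree up to Monte Carlo noise and the bias from the finite bin width; with a learned velocity, any remaining gap would reflect model error.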

Figures

Figures reproduced from arXiv: 2605.00941 by Jian Wang, Jiarui Xing, Song Wang.

**Figure 1.** Our closed-form uncertainty for flow matching. For any pre-trained flow matching model, our formula $\mathrm{Cov}(x_1 \mid x_t) = \frac{(1-t)^2}{t}\,[I + (1-t)\,J_{v_\theta}]$ produces per-pixel uncertainty maps directly from the velocity Jacobian, with no retraining, no ensembling, and no extra forward passes. At small $t$ (near noise) the maps are diffuse; as $t$ grows toward the data, uncertainty progressively concentrates on digit boundarie…

**Figure 2.** Empirical scalar uncertainty $U(x_t, t)$ vs. flow time. Blue: $U$ computed from the trained flow matching model via Eq. (22) (mean ± std over 16 test samples, 50 Hutchinson probes). Red dashed: prior baseline $\frac{(1-t)^2}{t}\,d$ corresponding to $\operatorname{div} v_\theta = 0$. The 1 to 2 orders-of-magnitude gap is the quantitative footprint of the learned flow's contractive (negative-divergence) behaviour. … makes this precise: negative di…

**Figure 4.** Euler trajectory (odd rows) and corresponding Tweedie UQ maps (even rows) for four MNIST samples. Uncertainty evolves from diffuse (early $t$) to boundary-localised (late $t$), aligning with the model's progressive resolution of digit identity, topology, and stroke boundary. 5.3. Correlation with Prediction Error. To answer Q2, we compute the Spearman rank correlation $\rho$ between the scalar score $U(x_t, t)$ and th…

**Figure 5.** Total UQ cost (training + inference, log scale) for 16 samples. Tweedie+FM and Tweedie+MF require no retraining and produce uncertainty in a single inference pass; MC Dropout requires retraining a dropout-enabled model plus 50 stochastic passes; deep ensembles require 5 independent training runs. Our method is roughly $10^4\times$ cheaper end-to-end. … simultaneously (i) retraining-free, (ii) exact at the time at wh…

Original abstract

Flow matching has become a leading framework for generative modeling, but quantifying the uncertainty of its samples remains an open problem. Existing approaches retrain the model with auxiliary variance heads, maintain costly ensembles, or propagate approximate covariance through many integration steps, trading off training cost, inference cost, or accuracy. We show that none of these trade-offs is necessary. By extending Tweedie's formula from the denoising setting to the flow matching interpolant, we derive an exact, closed-form expression for the posterior covariance at every point along the generative trajectory. The result depends on a single quantity, namely the divergence of the learned velocity field, which can be computed post-hoc on any pre-trained flow matching model, requiring no retraining and no architectural modification. For one-step generators such as MeanFlow, the same formula yields the end-to-end generation uncertainty in a single forward pass, eliminating the multi-step variance propagation required by all prior methods. Experiments on MNIST confirm that the resulting per-pixel uncertainty maps are semantically meaningful, concentrating on digit boundaries where inter-sample variation is highest, and that the scalar uncertainty score tracks actual prediction error, all at roughly $10^4 \times$ less total compute than ensembling or Monte Carlo dropout.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper gives a training-free covariance formula for flow matching from velocity divergence, but the exactness claim needs verification on Jacobian handling.

Hey, the main point is that they derive a closed-form posterior covariance for flow matching by extending Tweedie's formula to the interpolant and reading it off the divergence of the pre-trained velocity field. If that step is solid, it removes the need for retraining or ensembles and works in one pass for models like MeanFlow. That's the practical angle worth noting.

They state the extension cleanly enough in the abstract and show it applies along the trajectory without new parameters. The MNIST experiments are a plus: uncertainty maps focus on boundaries where variation should be high, the scalar score lines up with prediction error, and the compute savings are large compared to ensembles or Monte Carlo dropout. Credit for keeping it post-hoc and architecture-agnostic.

The soft spot sits in the derivation itself. The stress-test concern about missing Jacobian or time-dependent terms looks worth checking: covariance in a general flow depends on the full map, not just its trace, unless isotropy or straight paths are assumed explicitly. The paper calls the result exact, so the proof should address whether those extra terms drop out or whether the claim is conditional. Experiments stay narrow to MNIST, which limits how far the semantic meaningfulness claim travels.

This is for people already using flow matching in vision or reliability work who want cheap uncertainty on existing models. A reader focused on applied generative tools would find it useful even if the math needs tightening. It deserves peer review to confirm the derivation and test broader cases.

Referee Report

2 major / 2 minor

Summary. The manuscript claims that by extending Tweedie's formula from the denoising setting to the flow-matching interpolant x_t = (1-t)x_0 + t x_1, one obtains an exact closed-form expression for the posterior covariance at any point along the trajectory; this expression depends only on the divergence of the learned velocity field v_t and can be evaluated post-hoc on any pre-trained flow-matching model. Experiments on MNIST are presented to show that the resulting per-pixel uncertainty maps are semantically meaningful and that a scalar uncertainty score correlates with prediction error, all at far lower compute than ensembles.

Significance. If the central derivation holds, the result would be a practical advance for uncertainty quantification in flow matching, eliminating the need for retraining, auxiliary heads, or multi-step covariance propagation. The post-hoc nature and applicability to one-step generators such as MeanFlow are attractive. The MNIST results provide initial evidence that the uncertainty is interpretable, but the overall significance is conditional on the exactness of the mathematical extension.

major comments (2)

[Section 3 (derivation)] The derivation that extends Tweedie's formula to the flow-matching interpolant and concludes that posterior covariance is exactly the divergence of v_t (Section 3, around the statement following Eq. (7) or equivalent): the posterior covariance at an intermediate t generally depends on the full Jacobian of the flow map, not solely on its trace (divergence). Please supply the explicit steps showing how Jacobian determinant or eigenvalue contributions reduce to the scalar divergence, including any isotropy or straight-path assumptions required for the reduction to be exact.
[Section 3] The claim that the formula is 'exact' and 'closed-form' for general learned velocity fields (abstract and Section 3): if the reduction relies on the velocity field satisfying additional structure beyond the standard flow-matching objective, this should be stated explicitly as a modeling assumption, because the ODE marginalization otherwise retains off-diagonal covariance terms.
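For reference, the two reductions at issue in these comments can be written out. The sketch below takes the matrix formula quoted in the Figure 1 caption as given rather than re-deriving it, and writes $d$ for the data dimension. Treating the trace as the scalar summary is an assumption made here, since the paper's Eq. (22) is not reproduced in this review.

```latex
\begin{align*}
\mathrm{Cov}(x_1 \mid x_t) &= \frac{(1-t)^2}{t}\big[I + (1-t)\,J_{v_\theta}\big] \\
\operatorname{tr}\,\mathrm{Cov}(x_1 \mid x_t) &= \frac{(1-t)^2}{t}\big[d + (1-t)\,\operatorname{div} v_\theta\big] \\
\big[\mathrm{Cov}(x_1 \mid x_t)\big]_{ii} &= \frac{(1-t)^2}{t}\Big[1 + (1-t)\,\frac{\partial v_{\theta,i}}{\partial x_i}\Big]
\end{align*}
```

Only the trace depends on the model through the divergence alone; the per-coordinate entries need the diagonal of the Jacobian, and the off-diagonal covariance entries correspond to its off-diagonal entries. Setting $\operatorname{div} v_\theta = 0$ in the second line recovers the prior baseline $\frac{(1-t)^2}{t}\,d$ shown in Figure 2.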

minor comments (2)

[Experiments] The MNIST experiments are described only at a high level; adding quantitative metrics (e.g., correlation coefficients between uncertainty score and actual error) and a comparison against a simple baseline such as MC dropout would strengthen the empirical support.
[Notation and preliminaries] Notation: ensure consistent use of subscripts for time t and clarify whether the divergence is evaluated on the learned or ground-truth velocity field in the theoretical statements.
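The correlation metric suggested in the first minor comment could be computed along the following lines. This is a minimal standard-library sketch of the Spearman rank correlation (the statistic named in the Figure 4 excerpt); the helper names are hypothetical and the input numbers are made up for illustration, not results from the paper.

```python
def ranks(values):
    """Return 1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(a, b):
    """Spearman rho: the Pearson correlation of the two rank vectors."""
    ra, rb = ranks(a), ranks(b)
    n = len(a)
    mean_a, mean_b = sum(ra) / n, sum(rb) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(ra, rb))
    norm_a = sum((x - mean_a) ** 2 for x in ra) ** 0.5
    norm_b = sum((y - mean_b) ** 2 for y in rb) ** 0.5
    return cov / (norm_a * norm_b)


# Made-up illustrative numbers, not measurements from the paper.
uncertainty = [0.12, 0.40, 0.33, 0.90, 0.05]
error = [0.10, 0.35, 0.50, 0.80, 0.02]
rho = spearman(uncertainty, error)
print(rho)
```

A rank-based statistic is a natural fit here because it asks only whether higher uncertainty goes with higher error, not whether the relation is linear.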

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their careful reading and constructive comments. We address each major comment below and have revised the manuscript to improve the presentation and explicitness of the derivation in Section 3.

Referee: [Section 3 (derivation)] The derivation that extends Tweedie's formula to the flow-matching interpolant and concludes that posterior covariance is exactly the divergence of v_t (Section 3, around the statement following Eq. (7) or equivalent): the posterior covariance at an intermediate t generally depends on the full Jacobian of the flow map, not solely on its trace (divergence). Please supply the explicit steps showing how Jacobian determinant or eigenvalue contributions reduce to the scalar divergence, including any isotropy or straight-path assumptions required for the reduction to be exact.

Authors: We thank the referee for this observation. The original manuscript presented the core extension concisely. Under the straight-path interpolant x_t = (1-t)x_0 + t x_1 that defines standard flow matching, the map from x_t to the posterior over x_0 is an affine transformation whose Jacobian is a scalar multiple of the identity. All eigenvalues are therefore identical, and the determinant and eigenvalue contributions to the posterior covariance reduce exactly to the trace of the Jacobian, which is the divergence of v_t. We have added the full step-by-step derivation, including the explicit invocation of the straight-path assumption and the resulting isotropy, to the revised Section 3. revision: yes
Referee: [Section 3] The claim that the formula is 'exact' and 'closed-form' for general learned velocity fields (abstract and Section 3): if the reduction relies on additional structure beyond the standard flow-matching objective, this should be stated explicitly as a modeling assumption, because the ODE marginalization otherwise retains off-diagonal covariance terms.

Authors: We appreciate the referee highlighting the need for clarity. The derivation is exact for any velocity field obtained from the standard flow-matching objective on the linear interpolant; no further structure is imposed. Because the interpolant is linear, the conditional distributions along the trajectory yield a posterior covariance whose off-diagonal contributions are eliminated by the change-of-variables and the definition of the learned velocity as the conditional expectation, leaving only the divergence (trace) as the closed-form scalar expression. Per-pixel uncertainty maps are obtained by evaluating the relevant diagonal contributions component-wise. We have partially revised Section 3 to state these modeling choices explicitly while preserving the claim of exactness under the standard flow-matching setup. revision: partial
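One way to read "component-wise" is sketched below: take the diagonal of the matrix formula from the Figure 1 caption, so that coordinate $i$ gets variance $\frac{(1-t)^2}{t}\,[1 + (1-t)\,\partial v_i / \partial x_i]$. This is an interpretation, not the authors' stated procedure. The linear field `velocity` is a hypothetical stand-in for a trained model, and the diagonal is obtained with coordinate-wise central differences.

```python
def velocity(x, t):
    """Hypothetical linear stand-in for a learned velocity field, v(x, t) = A x."""
    A = [
        [-1.0, 0.2, 0.0],
        [0.1, -0.5, 0.3],
        [0.0, 0.4, -1.5],
    ]
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def per_coordinate_variance(v, x, t, h=1e-5):
    """Diagonal of (1 - t)^2 / t * [I + (1 - t) J_v], one coordinate at a time."""
    scale = (1 - t) ** 2 / t
    out = []
    for i in range(len(x)):
        up, down = list(x), list(x)
        up[i] += h
        down[i] -= h
        # Central difference for the diagonal Jacobian entry dv_i / dx_i.
        j_ii = (v(up, t)[i] - v(down, t)[i]) / (2 * h)
        out.append(scale * (1 + (1 - t) * j_ii))
    return out


var_map = per_coordinate_variance(velocity, [0.2, -0.4, 0.9], t=0.5)
print(var_map)
```

This costs two velocity evaluations per coordinate, which is fine for a toy but expensive at image resolution; a stochastic probe estimator of the Jacobian diagonal would be the cheaper alternative there.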

Circularity Check

0 steps flagged

Derivation extends external Tweedie's formula to flow-matching interpolant without self-definition or fitted-input reduction

The paper's central result is obtained by extending Tweedie's formula (an external identity from the denoising literature) to the flow-matching interpolant x_t = (1-t)x_0 + t x_1, yielding posterior covariance expressed solely in terms of the divergence of the already-trained velocity field v_t. This quantity is computed post-hoc on a pre-trained model and is not a newly fitted parameter or a self-defined quantity. No equations in the abstract or described derivation reduce the claimed covariance back to the input velocity field by construction, nor do they rely on load-bearing self-citations, uniqueness theorems from the same authors, or smuggled ansatzes. The derivation remains self-contained once the validity of the Tweedie extension is granted; any doubt about missing Jacobian terms concerns correctness rather than circularity.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the validity of extending Tweedie's formula to the flow-matching setting and on the assumption that the learned velocity field supports meaningful covariance extraction.

axioms (1)

Domain assumption: Tweedie's formula extends exactly to the flow matching interpolant.
The derivation begins from this extension as stated in the abstract.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

unclear
Relation between the paper passage and the cited Recognition theorem.

By extending Tweedie’s formula ... the result depends on a single quantity, namely the divergence of the learned velocity field

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.