pith. sign in

arxiv: 2411.10959 · v4 · pith:3R6FVH4Gnew · submitted 2024-11-17 · 💰 econ.EM · cs.LG· math.ST· stat.AP· stat.ME· stat.ML· stat.TH

Program Evaluation with Remotely Sensed Outcomes

Pith reviewed 2026-05-23 17:55 UTC · model grok-4.3

classification 💰 econ.EM cs.LGmath.STstat.APstat.MEstat.MLstat.TH
keywords causal inferenceremote sensingprogram evaluationnonparametric identificationexperimental dataobservational datatreatment effects
0
0 comments X

The pith

A nonparametric formula identifies causal effects of programs when the outcome is measured only through remote sensing by combining experimental and observational data.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a method for causal inference when the economic outcome of interest is observed imperfectly through scalable but indirect remote measures such as satellite imagery. It treats the remote variable as post-outcome, so that changes in the true economic outcome drive the remote readings. Under this structure, a simple identification formula recovers the causal parameter by fusing the treatment variation from an experiment with the predictive relationship observed in non-experimental data. The resulting estimator supports root-n inference that remains valid even if the algorithms used to process the remote data are misspecified. This setup makes it possible to evaluate interventions at low cost without requiring direct measurement of the outcome in every setting.

Core claim

Under the modeling assumption that the remotely sensed variable is caused by the economic outcome, the average treatment effect is nonparametrically identified by an explicit formula that integrates the experimental contrast in treatment assignment with the observational conditional expectation of the outcome given the remote measure; the paper supplies a corresponding estimator and inference procedure that is robust to arbitrary processing of the remote data.

What carries the argument

The nonparametric identification formula that recovers the causal parameter by combining experimental treatment assignment with the observational mapping from the remotely sensed variable to the outcome.

If this is right

  • Program evaluations can use low-cost remote data for outcomes that are expensive to measure directly.
  • The estimator converges at the parametric rate without parametric restrictions on how the remote data are processed.
  • Inference remains valid under arbitrary misspecification of the relationship between the remote measure and the outcome.
  • The approach applies to any post-outcome proxy that is predictive in observational data and observed in both experimental and observational samples.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same structure could apply to other imperfect proxies such as administrative records or survey responses that are caused by the underlying outcome.
  • Extensions might incorporate high-dimensional machine learning predictors for the observational mapping without changing the identification argument.
  • The method could be tested in settings where both direct outcome measures and remote proxies are available in the same experiment.

Load-bearing premise

Changes in the economic outcome cause changes in the remotely sensed variable rather than the reverse.

What would settle it

An auxiliary randomized trial that directly manipulates the remotely sensed variable while holding the economic outcome fixed would produce a nonzero estimate under the identification formula if the post-outcome assumption is false.

Figures

Figures reproduced from arXiv: 2411.10959 by Ashesh Rambachan, Davide Viviano, Rahul Singh.

Figure 1
Figure 1. Figure 1: We illustrate the two samples that we will use to evaluate an anti-poverty program [PITH_FULL_IMAGE:figures/full_fig_p008_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Our main assumption (Assumption 2(i)) is plausible in real data. We compare fR(R | S = e,D = 0,Y = 0) with fR(R | S = o,D = 0,Y = 0) in Figure 2b, and fR(R | S = e,D = 0,Y = 1) with fR(R|S =o,D = 0,Y = 1) in Figure 2d, using data from the Smartcard experiment conducted by Muralidharan et al. (2016) that we analyze in Section 5. Because the satellite image R∈R 4000 is high-dimensional, we visualize the dens… view at source ↗
Figure 3
Figure 3. Figure 3: Causal graph for remotely sensed variables under Assumptions [PITH_FULL_IMAGE:figures/full_fig_p012_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: In the first exercise, our method outperforms common practice in terms of average [PITH_FULL_IMAGE:figures/full_fig_p025_4.png] view at source ↗
Figure 5
Figure 5. Figure 5: In the first exercise, our method outperforms common practice in terms of root [PITH_FULL_IMAGE:figures/full_fig_p025_5.png] view at source ↗
Figure 6
Figure 6. Figure 6: Satellite images are relevant to the poverty outcomes. Each plot is for a poverty [PITH_FULL_IMAGE:figures/full_fig_p027_6.png] view at source ↗
Figure 7
Figure 7. Figure 7: Our method recovers the unbiased benchmark estimate and its [PITH_FULL_IMAGE:figures/full_fig_p028_7.png] view at source ↗
read the original abstract

We study causal inference in experiments and quasi-experiments, where the economic outcome is imperfectly measured by a remotely sensed variable. The remotely sensed variable is low-cost, scalable, and predictive of the economic outcome in observational data; examples include satellite imagery and mobile phone activity. We model the remotely sensed variable as post-outcome: variation in the economic outcome causes variation in the remotely sensed variable. For example, changes in environmental quality cause changes in satellite imagery, not vice versa. Under this assumption, we propose a formula to nonparametrically identify the causal parameter by combining experimental and observational data. We develop a method for n^{-1/2} inference that is robust to misspecification and that does not restrict the algorithms used to process remotely sensed variables.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 1 minor

Summary. The paper studies causal inference when the economic outcome of interest is imperfectly measured by a remotely sensed proxy (e.g., satellite imagery). It models the remote variable as post-outcome (outcome causes remote measure), proposes a nonparametric identification formula that fuses experimental data (treatment affects outcome and hence the remote measure) with observational data (both variables observed), and develops an n^{-1/2}-consistent inference procedure that is robust to misspecification of the remote-sensing processing algorithm.

Significance. If the identification result is valid, the approach would allow researchers to leverage low-cost, scalable remote-sensing data for program evaluation in settings where direct outcome measurement is expensive or infeasible. The robustness claim for inference is a potential strength if the conditions are fully stated.

major comments (1)
  1. [Abstract] Abstract: the nonparametric identification formula is stated to recover the causal parameter under the post-outcome assumption alone. However, transport of the conditional distribution P(remote | outcome) from the observational sample to the experimental sample is required for the formula to be valid; the post-outcome modeling establishes directionality but supplies no justification for invariance of this conditional law across contexts. This invariance is load-bearing for the central claim and is not addressed in the abstract.
minor comments (1)
  1. [Abstract] The abstract claims n^{-1/2} inference robust to misspecification without restricting the remote-sensing algorithms; the manuscript should clarify whether this robustness holds under the same invariance condition or requires additional assumptions.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for the careful reading and constructive feedback. We address the single major comment below.

read point-by-point responses
  1. Referee: [Abstract] Abstract: the nonparametric identification formula is stated to recover the causal parameter under the post-outcome assumption alone. However, transport of the conditional distribution P(remote | outcome) from the observational sample to the experimental sample is required for the formula to be valid; the post-outcome modeling establishes directionality but supplies no justification for invariance of this conditional law across contexts. This invariance is load-bearing for the central claim and is not addressed in the abstract.

    Authors: We agree that the abstract is imprecise on this point. The post-outcome assumption establishes the causal direction (outcome causes remote), while identification of the ATE further requires that the conditional distribution P(remote | outcome) can be transported from the observational to the experimental sample. This transportability assumption is maintained throughout the identification argument in the main text but is not mentioned in the abstract. We will revise the abstract to state the full set of assumptions under which the formula recovers the target parameter. revision: yes

Circularity Check

0 steps flagged

No circularity; identification formula is self-contained

full rationale

The paper proposes a nonparametric identification formula that fuses experimental data (treatment to outcome) with separate observational data (outcome to remote sensing) under an explicit post-outcome modeling assumption. This derivation relies on external data sources and standard causal transport arguments rather than any quantity fitted exclusively from the experimental sample, any self-citation chain, or a result defined in terms of itself. No load-bearing step reduces by construction to the inputs; the central claim retains independent content from the data combination and is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on one domain assumption about the causal direction between the economic outcome and the remote proxy; no free parameters or invented entities are described in the abstract.

axioms (1)
  • domain assumption The remotely sensed variable is post-outcome: variation in the economic outcome causes variation in the remotely sensed variable.
    Explicitly stated in the abstract as the modeling assumption required for the identification formula.

pith-pipeline@v0.9.0 · 5669 in / 1265 out tokens · 24633 ms · 2026-05-23T17:55:20.190064+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Making Interpretable Discoveries from Unstructured Data: A High-Dimensional Multiple Hypothesis Testing Approach

    econ.EM 2025-11 unverdicted novelty 6.0

    A new framework combines AI-derived concept embeddings with high-dimensional selective inference to enable statistically principled, interpretable discovery from unstructured data in empirical economics.

Reference graph

Works this paper leans on

27 extracted references · 27 canonical work pages · cited by 1 Pith paper

  1. [1]

    Therefore, S becomes a degenerate random variable conditional uponD =1and we conclude thatS |=(Y,R)|D=1trivially

    By the hypothesis that the observational sample is untreated,Pr(S = e|D = 1) = 1. Therefore, S becomes a degenerate random variable conditional uponD =1and we conclude thatS |=(Y,R)|D=1trivially

  2. [2]

    Weshowthat E{h(Y,R)|D =0 ,S = e}= E{h(Y,R)|D =0 ,S = o}, which impliesS |=(Y,R)|D =0as desired

    Fixanybounded, measurablefunction h(Y,R). Weshowthat E{h(Y,R)|D =0 ,S = e}= E{h(Y,R)|D =0 ,S = o}, which impliesS |=(Y,R)|D =0as desired. By SUTVA, randomized treatment assignment, and randomized sample selection, E{h(Y,R)|D=0,S=e}=E[h{Y,Y R(1)+(1−Y)R(0)}|D=0,S=e] =E(h[Y(0),Y(0)R(1)+{1−Y(0)}R(0)]|D=0,S=e) =E(h[Y(0),Y(0)R(1)+{1−Y(0)}R(0)]|S=e) =E(h[Y(0),Y(...

  3. [3]

    To begin, we write the average potential outcome in the experimental sample as µ(1):=Pr{Y(1)=1|S=e} (1) = Pr{Y(1)=1|D=1,S=e} =Pr(Y=1|D=1,S=e) = Z Pr(Y=1|R=r,D=1,S=e)f R(r|D=1,S=e)dr, where (1) follows by Assumption 1

  4. [4]

    Next we rewrite the implicit target eµ(1):= Z Pr(Y=1|R=r,S=o)f R(r|D=1,S=e)dr. Within this expression, Pr(Y=1|R,S=o) (1) = fR(R|Y=1,S=o)Pr(Y=1|S=o) fR(R|S=o) (2) = fR(R|Y=1,D=1,S=e)P(Y=1|S=o) fR(R|S=o) (3) = Pr(Y=1|R,D=1,S=e) fR(R|D=1,S=e)Pr(Y=1|S=o) Pr(Y=1|D=1,S=e)f R(R|S=o) (4) = Pr(Y=1|R,D=1,S=e) Pr(Y=1|S=o) Pr{Y(1)=1|S=e} fR(R|D=1,S=e) fR(R|S=o) where...

  5. [5]

    Within this expression, Pr(Y=1|R,D=1,S=e)f R(R|D=1,S=e)=f Y,R(Y=1,R|D=1,S=e) =f R(R|Y=1,D=1,S=e)Pr(Y=1|D=1,S=e} =f R(R|Y=1,D=1,S=e)Pr{Y(1)=1|S=e} =f R(R|Y=1,D=1,S=e)µ(1)

    Combining results, the bias for the treated potential outcome,eµ(1)−µ(1), equals Z Pr{Y(0)=1|S=e} Pr{Y(1)=1|S=e} fR(r|D=1) fR(r|D=0) −1 Pr(Y=1|R=r,D=1,S=e)f R(r|D=1,S=e)dr. Within this expression, Pr(Y=1|R,D=1,S=e)f R(R|D=1,S=e)=f Y,R(Y=1,R|D=1,S=e) =f R(R|Y=1,D=1,S=e)Pr(Y=1|D=1,S=e} =f R(R|Y=1,D=1,S=e)Pr{Y(1)=1|S=e} =f R(R|Y=1,D=1,S=e)µ(1). by the law of...

  6. [6]

    We can follow a similar argument foreµ(0). As in Step 2, eµ(0)= Z Pr(Y=1|R=r,S=o)f R(r|D=0,S=e)dr = Z Pr(Y=1|R=r,D=0,S=e) Pr(Y=1|S=o) Pr{Y(0)=1|S=e} fR(r|D=0,S=e) fR(r|S=o) fR(r|D=0,S=e)dr by identical arguments as before, replacingD = 1with D = 0. We have previously shown Pr(Y = 1|S = o) =Pr{Y (0) = 1|S = e}. Moreover, fR(R|D = 0,S = e) =fR(R|D = 0)since...

  7. [7]

    Sinceeµ(0)=µ(0), we conclude thateθ−θ=eµ(1)−µ(1)under the stated conditions

  8. [8]

    Suppose thatRis binary with R|Y,D,S=    Ywith probability1/2 1otherwise

    Next we demonstrate the bias can be positive or negative by constructing the claimed DGPs. Suppose thatRis binary with R|Y,D,S=    Ywith probability1/2 1otherwise. 44 This process impliesR |=(D,S)|Y, which satisfiesR |=S|D,Y (Assumption 2) by the weak union axiom. It also directly satisfiesR |=D|Y(Assumption 3(ii)). Under this DGP, Pr(R=1|Y=1,D=1,S=e...

  9. [9]

    By Assumption 1,fY (y|S=e,X=x,D=d)=f Y(d) (y|S=e,X=x)

    As in Lemma 1, for general(X,Y,R)and forδe d(r,x):=f R(r|S=e,X=x,D=d), δe d(r,x)= Z fR(r|S=e,X=x,D=d,Y=y)f Y (y|S=e,X=x,D=d)dy. By Assumption 1,fY (y|S=e,X=x,D=d)=f Y(d) (y|S=e,X=x). Now, however, fR(r|S=e,X=x,D=d,Y=y)=f R(r|S=o,X=x,D=d,Y=y)=:δ o y,d(r,x), where the first equality applies Assumption 2 and the second equality applies Assumption 3(i). Combi...

  10. [10]

    As in Theorem 1, we next apply Bayes’ rule to rewrite δo y,d(r,x)= Pr(Y=y,D=d,S=o|R=r,X=x)f R(r|X=x) Pr(Y=y,D=d,S=o|X=x) (6) δe d(r,x)= Pr(D=d,S=e|R=r,X=x)f R(r|X=x) Pr(D=d,S=e|X=x) . It therefore follows by cancelingfR(r|X=x)that, ford∈ {0,1}andr∈ R, Pr(D=d,S=e|R=r,X=x) Pr(D=d,S=e|X=x) − Pr(Y=0,D=d,S=o|R=r,X=x) Pr(Y=0,D=d,S=o|X=x) = Pr(Y=1,D=d,S=o|R=r,X=...

  11. [12]

    predD(R) count2 D=1,S=e + 1−pred D(R) count2 D=0,S=e # predS(R) +bθ2 init

    Learn the representation ontrain:bH(R). (a) Count marginals:count Y=1,S=o,count Y=0,S=o,count D=1,S=e,count D=0,S=e. (b) Train predictors: predY (R)estimates Pr(Y = 1|S = o, R), predD(R)estimates Pr(D=1|S=e,R), andpred S(R)estimatesPr(S=e|R), using machine learning. (c) Initially estimatebθinit =argminθEtrain[{bE(∆e|R)−bE(∆o|R)θ}2], wherebE(∆e|R)and bE(∆o...

  12. [13]

    (a) Count marginals:count Y=1,S=o,count Y=0,S=o,count D=1,S=e,count D=0,S=e

    Construct a causal estimate ontest:bθ. (a) Count marginals:count Y=1,S=o,count Y=0,S=o,count D=1,S=e,count D=0,S=e. (b) Construct a causal estimate:bθ = Etest{ b∆e bH(R)} Etest{ b∆o bH(R)} whereb∆e andb∆o are constructed from marginal probabilities according to (1): b∆e = 1{D=1,S=e} countD=1,S=e − 1{D=0,S=e} countD=0,S=e , b∆o = 1{Y=1,S=o} countY=1,S=o − ...

  13. [14]

    Divide the sample intotrainandtestfolds

  14. [15]

    Learn the representations ontrain:bH(R,x). For eachx∈ X, (a) Count conditional probabilities: count(D=d,S=e|X=x)= P i∈train1(Di =d,S i =e,X i =x)P i∈train1(Xi =x) for eachd∈ {0,1}, count(Y=y,S=o|X=x)= P i∈train1(Yi =y,S i =o,X i =x)P i∈train1(Xi =x) for eachy∈ Y. (b) Train predictors on the subsample withX=x: predY (R,x):R →[0,1] K using{(R i,Yi):i∈train,...

  15. [16]

    Construct generalized CATE estimates ontest:bθ(x). For eachx∈ X, (a) Count conditional probabilities: count(D=d,S=e|X=x)= P i∈test1(Di =d,S i =e,X i =x)P i∈test1(Xi =x) for eachd∈ {0,1}, count(Y=y,S=o|X=x)= P i∈test1(Yi =y,S i =o,X i =x)P i∈test1(Xi =x) for eachy∈ Y. (b) Compute treatment and outcome variation: b∆e(x)= 1(D=1,S=e) count(D=d,S=e|X=x) − 1(D=...

  16. [17]

    (a) Compute the generalized ATE estimate:bθ=E test{bθ(X)} (b) Compute the ATE estimate:bθ0 =PK−1 j=1 yjbθj +yK 1−PK−1 j=1 bθj

    Aggregate into ATE estimate ontest:bθ0. (a) Compute the generalized ATE estimate:bθ=E test{bθ(X)} (b) Compute the ATE estimate:bθ0 =PK−1 j=1 yjbθj +yK 1−PK−1 j=1 bθj

  17. [18]

    first stage

    Bootstrap its confidence interval:bθ0 ±c αbvn−1/2, where cα is the1 −α/2quantile of the standard Gaussian andbvn−1/2 is the bootstrap standard error ofbθ0 fixing the estimated representations. Our final estimatorbθ0 ∈R, for the ATE in the experimental sample, is asymptotically normal by direct extensions of the arguments in Section 4 and Appendix D. The k...

  18. [19]

    By the definition of˜Yε and Assumption 1, Pr( ˜Yε =y|S=e,X=x,D=d)= Z y′∈By(ε) fY (y′ |S=e,X=x,D=d)dy ′ = Z y′∈By(ε) fY(d) (y′ |S=e,X=x)dy ′ =Pr{Y(d)∈B y(ε)|S=e,X=x}

    By the law of total probability and the definition of˜Yε, fR(r|S=e,X=x,D=d)= Z fR, ˜Yε(r,y|S=e,X=x,D=d)dy = Z fR(r|S=e,X=x,D=d, ˜Yε =y)f ˜Yε(y|S=e,X=x,D=d)dy = X y∈Yε fR(r|S=e,X=x,D=d, ˜Yε =y)Pr( ˜Yε =y|S=e,X=x,D=d). By the definition of˜Yε and Assumption 1, Pr( ˜Yε =y|S=e,X=x,D=d)= Z y′∈By(ε) fY (y′ |S=e,X=x,D=d)dy ′ = Z y′∈By(ε) fY(d) (y′ |S=e,X=x)dy ′ ...

  19. [20]

    Substituting these expressions into the previous step and cancelingfR(R|X )yields the result

    We apply Bayes’ rule to rewrite fR(r|S=o,X=x, ˜Yε =y)= Pr( ˜Yε =y,S=o|R=r,X=x)f R(r|X=x) Pr( ˜Yε =y,S=o|X=x) =γ ε y(x,r)fR(r|X=x), fR(r|S=e,X=x,D=d)= Pr(D=d,S=e|R=r,X=x)f R(r|X=x) Pr(D=d,S=e|X=x) =π d(x,r)fR(r|X=x). Substituting these expressions into the previous step and cancelingfR(R|X )yields the result. Consequently,foranyfixed ε >0,wecanperformestim...

  20. [21]

    By equation (3), fR(r|S=e,X=x,D=d)= Z fR(r|S=o,X=x,Y=y)f Y(d) (y|S=e,X=x)dy

  21. [22]

    maximum accuracy

    By Bayes’ rule, fR(r|Y=y,S=o,X=x)= fY,S(y,o|R=r,X=x)f R(r|X=x) fY,S(y,o|X=x) =γ 0 y(X,R)fR(r|X=x), fR(r|D=d,S=e,X=x)= Pr(D=d,S=e|R=r,X=x)f R(r|X=x) Pr(D=d,S=e|X=x) =π d(X,R)fR(r|X=x). Substituting these expressions into the previous step and cancelingfR(r|X = x)yields the result. Proposition F.3 identifies the conditional counterfactual densityfY(d) (y|S ...

  22. [23]

    By the law of total probability, δe d,i(r):=f Ri(r|S i =e,D i =d) = Z fRi,Yi(r,y|S i =e,D i =d)dy = Z fRi(r|S i =e,D i =d,Y i =y)f Yi(y|S i =e,D i =d)dy. 74 Next, notice that fRi(r|S i =e,D i =d,Y i =y)=f Ri(r|S i =o,D i =d,Y i =y) =f Ri(r|S i =o,Y i =y) =:δ o y,i(r), wherethefirstequalityappliesAssumptionI.2andthesecondequalityappliesAssumptionI.3. Combi...

  23. [24]

    synthetic samples

    We apply Bayes’ rule to rewrite δo y,i(r):=f Ri(r|S i =o,Y i =y)= Pr(Yi =y,S i =o|R i =r)f Ri(r) Pr(Yi =y,S i =o) , δe d,i(r):=f Ri(r|S i =e,D i =d)= Pr(Di =d,S i =e|R i =r)f Ri(r) Pr(Di =d,S i =e) . Substituting these expressions into the previous step and cancelingfRi(r)gives E(∆e i |Ri)=E(∆ o i |Ri)θi, θ i =E{Y dir i (1)−Y dir i (0)|Si =e}. We conclude...

  24. [25]

    Next, notice that fR(r|S=e,D=d,Z=z,Y=y)=f R(r|S=o,D=d,Z=z,Y=y) =f R(r|S=o,Y=y) =:δ o y(r), wherethefirstequalityappliesAssumptionJ.2andthesecondequalityappliesAssumptionJ.3

    By the law of total probability, δe d,z(r):=f R(r|S=e,D=d,Z=z) = Z fR,Y (r,y|S=e,D=d,Z=z)dy = Z fR(r|S=e,D=d,Z=z,Y=y)f Y (y|S=e,D=d,Z=z)dy. Next, notice that fR(r|S=e,D=d,Z=z,Y=y)=f R(r|S=o,D=d,Z=z,Y=y) =f R(r|S=o,Y=y) =:δ o y(r), wherethefirstequalityappliesAssumptionJ.2andthesecondequalityappliesAssumptionJ.3. 83 Combining the previous displays, we arri...

  25. [26]

    Substituting these expressions into the previous step and cancelingfR(r)yields E{∆e(d,z)−∆ oα(d,z)|R}=0

    We apply Bayes’ rule to rewrite δo y(r):=f R(r|S=o,Y=y)= Pr(Y=y,S=o|R=r)f R(r) Pr(Y=y,S=o) , δe d,z(r):=f R(r|S=e,D=d,Z=z)= Pr(D=d,Z=z,S=e|R=r)f R(r) Pr(D=d,Z=z,S=e) . Substituting these expressions into the previous step and cancelingfR(r)yields E{∆e(d,z)−∆ oα(d,z)|R}=0. Corollary J.1(Representation with an instrument).Under Theorem J.1’s conditions, for...

  26. [27]

    Next, notice that fRt(r|S=e,D=d,Y t =y)=f Rt(r|S=o,D=d,Y t =y) =f Rt(r|S=o,Y t =y) =:δ o y,t(r), wherethefirstequalityappliesAssumptionJ.5andthesecondequalityappliesAssumptionJ.6

    By the law of total probability, δe d,t(r):=f Rt(r|S=e,D=d) = Z fRt,Yt(r,y|S=e,D=d)dy = Z fRt(r|S=e,D=d,Y t =y)f Yt(y|S=e,D=d)dy. Next, notice that fRt(r|S=e,D=d,Y t =y)=f Rt(r|S=o,D=d,Y t =y) =f Rt(r|S=o,Y t =y) =:δ o y,t(r), wherethefirstequalityappliesAssumptionJ.5andthesecondequalityappliesAssumptionJ.6. Combining the previous displays, we arrive at t...

  27. [28]

    Substituting these expressions into the previous step and cancelingfRt(r)yields E{∆e t(d)−∆ o t αt(d)|R t}=0

    We apply Bayes’ rule to rewrite δo y(r):=f Rt(r|Y t =y,S=o)= Pr(Yt =y,S=o|R t =r)f Rt(r) Pr(Yt =y,S=o) , δe d(r):=f Rt(r|D=d,S=e)= Pr(D=d,S=e|R t =r)f Rt(r) Pr(D=d,S=e) . Substituting these expressions into the previous step and cancelingfRt(r)yields E{∆e t(d)−∆ o t αt(d)|R t}=0. 86 Corollary J.2.Under Theorem J.2’s conditions, for anyt∈ {1,2}, d∈ {0,1} a...