Quantile and Distribution Treatment Effects on the Treated with Possibly Non-Continuous Outcomes

Emmanuel Selorm Tsyawo; Nelly K. Djuazon

arxiv: 2408.07842 · v2 · pith:5GTWVM2Cnew · submitted 2024-08-14 · 💰 econ.EM

Quantile and Distribution Treatment Effects on the Treated with Possibly Non-Continuous Outcomes

Nelly K. Djuazon , Emmanuel Selorm Tsyawo This is my paper

Pith reviewed 2026-05-23 21:34 UTC · model grok-4.3

classification 💰 econ.EM

keywords difference-in-differencesdistributional treatment effectsquantile treatment effectsnon-continuous outcomesstaggered adoptionuniform inferenceoveridentification test

0 comments

The pith

A distributional DiD method identifies treatment effects on entire outcome distributions and quantiles even when outcomes are discrete, censored, or mixed.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper sets out to identify the full distribution of treatment effects on the treated, and the quantiles of that distribution, in difference-in-differences settings where the outcome variable is not continuously distributed. It does so by showing that the distribution of untreated potential outcomes for the treated units can be recovered from the control units under a parallel-trends assumption stated directly on the cumulative distribution functions. A sympathetic reader would care because many policy-relevant outcomes, such as crime counts, employment status, or health events, take discrete or limited values, yet most existing DiD tools target only mean effects. The framework also supplies uniform inference and a specification test for the identifying assumptions, and it covers staggered adoption and the main panel and cross-section sampling designs used in applied work.

Core claim

Under distributional parallel trends and no-anticipation for the untreated potential outcomes, the counterfactual distribution for the treated group is identified from the control group; subtracting this from the observed post-treatment distribution for the treated group yields the distribution of treatment effects on the treated, which in turn delivers quantile treatment effects on the treated, with the whole procedure valid for non-continuous outcomes and extendable to staggered timing.

What carries the argument

Distributional parallel trends assumption on the cumulative distribution functions of untreated potential outcomes, which identifies the counterfactual untreated distribution for the treated group and thereby the distribution of treatment effects on the treated.

If this is right

Event probabilities for each possible outcome value can be recovered and compared across treatment status for count-valued outcomes.
Uniform confidence bands can be constructed around the entire distribution of treatment effects and around all quantiles simultaneously.
The same identification and inference results apply whether the data are repeated cross sections, unbalanced panels, rotating panels, or balanced panels.
Staggered treatment adoption is accommodated without additional assumptions beyond those used for two-period designs.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same machinery could be used to recover how a policy changes the probability that an outcome exceeds any policy-relevant threshold, not just its mean.
Applied researchers could first run the over-identification test on a placebo period before trusting distributional estimates from the main sample.
Because the method works with mixed distributions, it could be applied directly to outcomes that combine a mass at zero with a continuous positive part, such as wages or expenditures.

Load-bearing premise

The entire distribution of untreated potential outcomes would have followed the same time trend in the treated and control groups in the absence of treatment.

What would settle it

Rejection of the proposed test of functional over-identifying restrictions in an application where the economic model predicts that the restrictions should hold.

read the original abstract

Applied Difference-in-Differences studies often involve outcomes that are discrete, mixed, censored, or otherwise non-continuously distributed, while policy questions frequently concern distributional effects rather than mean effects alone. This paper develops a distributional DiD framework for identifying and conducting uniform inference on distribution and quantile treatment effects on the treated in such settings under stated identifying and regularity conditions. Identification is based on distributional parallel trends and no-anticipation assumptions, illustrated through an economic model of crime that generates count-valued untreated potential outcomes. The identification and asymptotic theory accommodate staggered treatment adoption and a general sampling scheme encompassing repeated cross-sections, unbalanced panels, rotating panels, and balanced panels. The paper also proposes a test of functional over-identifying restrictions as a diagnostic for the identifying assumptions and working-CDF specification. An empirical application to the effect of police on crime illustrates the practical relevance of the approach and shows how distributional effects can be interpreted as event-probability effects for count outcomes.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

Extends distributional DiD to non-continuous outcomes with uniform inference and a test, but the asymptotics and test properties need verification from the full derivations.

read the letter

This paper develops a distributional DiD framework that identifies quantile and distribution treatment effects on the treated for discrete, mixed, or censored outcomes. It relies on distributional parallel trends and no-anticipation assumptions, handles staggered adoption, and covers repeated cross-sections through balanced panels. It also supplies uniform inference and a functional over-identification test, with an economic model generating count-valued untreated outcomes to motivate the setup. An application to police and crime is referenced to show how distributional effects translate to event probabilities for counts. What is new is the explicit extension beyond continuous outcomes while keeping the staggered and general sampling features. The paper states the identifying assumptions clearly and ties the results to an interpretable economic example. The test for over-identifying restrictions is presented as a practical diagnostic. The soft spots are in the technical backing. The abstract asserts that the asymptotic theory delivers uniform bands and accommodates the non-continuous cases, but the regularity conditions, handling of jumps in the CDF, and proof outlines are not visible here, so it is not possible to confirm whether the uniform inference holds or whether the test has power against plausible alternatives. The empirical section is mentioned without reported estimates or robustness checks. This is aimed at applied econometricians working with distributional questions on count or discrete data. The claims are specific enough to be checked by a referee. I would send it for peer review rather than desk reject.

Referee Report

2 major / 2 minor

Summary. The paper develops a distributional DiD framework to identify and conduct uniform inference on distribution and quantile treatment effects on the treated (DTET/QTET) for possibly non-continuous outcomes. Identification rests on distributional parallel trends and no-anticipation assumptions for the untreated potential outcome distributions. The framework accommodates staggered adoption and sampling schemes including repeated cross-sections, unbalanced/rotating/balanced panels; it supplies an economic model generating count-valued untreated outcomes, proposes a functional over-identification test, and illustrates with an application to police effects on crime counts.

Significance. If the identification and uniform-inference results hold under the stated conditions, the contribution is significant for applied DiD work with discrete, censored, or mixed outcomes where mean effects alone are insufficient. The explicit economic model for count outcomes, accommodation of staggered adoption across sampling schemes, and the functional over-ID test as a diagnostic are concrete strengths that would support richer event-probability interpretations in policy applications.

major comments (2)

[§3] §3 (Identification): the claim that distributional parallel trends plus no-anticipation identify the full DTET for non-continuous outcomes is load-bearing, yet the derivation does not explicitly verify whether the assumption is stated on the CDF or on the probability mass function when support is discrete; this affects whether the QTET inversion remains valid without additional continuity corrections.
[§4] §4 (Asymptotics): the uniform bands for the QTET process rely on a functional CLT; the regularity conditions listed do not include an explicit entropy integral or covering-number bound that accounts for the jump discontinuities induced by non-continuous outcomes, which is necessary to confirm that the multiplier bootstrap remains valid uniformly over the quantile index.

minor comments (2)

[Table 1] Table 1 (sampling schemes): the description of the rotating-panel case should clarify whether the overlap probability between periods enters the influence function or is absorbed into the weighting scheme.
[Empirical application] The empirical application reports event-probability effects but does not tabulate the pointwise standard errors alongside the uniform bands, making it difficult to compare with conventional pointwise inference.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive comments on our manuscript. The suggestions regarding the identification for non-continuous outcomes and the asymptotic conditions are helpful. Below we provide point-by-point responses to the major comments.

read point-by-point responses

Referee: [§3] §3 (Identification): the claim that distributional parallel trends plus no-anticipation identify the full DTET for non-continuous outcomes is load-bearing, yet the derivation does not explicitly verify whether the assumption is stated on the CDF or on the probability mass function when support is discrete; this affects whether the QTET inversion remains valid without additional continuity corrections.

Authors: The distributional parallel trends assumption is formulated directly on the conditional CDFs of the untreated potential outcomes, as stated in Assumption 3.1: F_{Y_{it}(0)|D_i=1}(y) = F_{Y_{it}(0)|D_i=0}(y) for all y. For discrete outcomes, the CDF fully characterizes the distribution, and the PMF can be recovered from the jumps in the CDF. The DTET is identified as the difference between the observed CDF for the treated and the counterfactual CDF. The QTET is obtained via the quantile function defined as the left-inverse of the CDF, which is standard and valid for any distribution, including those with discontinuities; no continuity corrections are required because the generalized inverse handles flat parts and jumps appropriately. We agree that an explicit verification for the discrete case would strengthen the presentation and will add a remark or example in Section 3 of the revised manuscript. revision: yes
Referee: [§4] §4 (Asymptotics): the uniform bands for the QTET process rely on a functional CLT; the regularity conditions listed do not include an explicit entropy integral or covering-number bound that accounts for the jump discontinuities induced by non-continuous outcomes, which is necessary to confirm that the multiplier bootstrap remains valid uniformly over the quantile index.

Authors: We appreciate this observation. The current regularity conditions in Assumption 4.1 include that the outcome support is finite or the functions have bounded variation, which implicitly controls the complexity for discrete cases with finitely many jumps. However, to make the entropy condition explicit, we will revise the assumption to include a bound on the covering number or entropy integral that accounts for the discontinuity points. This will ensure the functional CLT and the validity of the multiplier bootstrap hold uniformly over tau in (0,1). The revision will be incorporated in the next version of the paper. revision: yes

Circularity Check

0 steps flagged

No significant circularity

full rationale

The paper's central identification result for distributional and quantile treatment effects on the treated rests on explicitly maintained assumptions (distributional parallel trends and no-anticipation) that are stated as external domain conditions rather than derived from the paper's own equations or fitted parameters. The illustrative economic model of crime is used only to motivate count-valued outcomes and does not enter the identification or inference derivations tautologically. The proposed functional over-identification test is a diagnostic tool, not a load-bearing step that reduces to self-reference. Asymptotic theory for uniform inference follows from standard regularity conditions under the maintained assumptions. No self-citation chains, ansatzes smuggled via citation, or renamings of known results appear as load-bearing elements in the derivation chain. The result is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

Central claim rests on two domain assumptions for identification; no free parameters or invented entities are mentioned in the abstract.

axioms (2)

domain assumption Distributional parallel trends assumption
Invoked to identify the counterfactual distribution of untreated potential outcomes for the treated group.
domain assumption No-anticipation assumption
Rules out behavioral changes prior to treatment that would violate the identifying conditions.

pith-pipeline@v0.9.0 · 5698 in / 1258 out tokens · 20627 ms · 2026-05-23T21:34:56.018751+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

Theorem 1 (Identification). Suppose Assumptions 1 and 2 hold, then FY01|D=1(y) is identified … Φ1(Φ−1_1(F10(y)) + Φ−1_0(F01(y)) − Φ−1_0(F00(y)))
IndisputableMonolith/Foundation/RealityFromDistinction.lean reality_from_one_distinction unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

Assumption 1 (Functional Index Parallel Trends) … Φd, eΦt known strictly increasing continuously differentiable

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

A Grid-Rate Condition for Valid Uniform Inference
econ.EM 2026-05 unverdicted novelty 6.0

L_n = ω(r_n^{1/4}) ensures valid uniform inference for twice differentiable Donsker-class functions estimated at the r_n^{1/2} rate by making grid approximation error negligible relative to stochastic variation.