pith. sign in

arxiv: 2605.20012 · v1 · pith:JPKZ57SLnew · submitted 2026-05-19 · 💰 econ.EM

Testing Heteroskedasticity Under Measurement Error

Pith reviewed 2026-05-20 04:04 UTC · model grok-4.3

classification 💰 econ.EM
keywords heteroskedasticity testmeasurement errordeconvolutionempirical processbootstrap testregression modelasymptotic distribution
0
0 comments X

The pith

A test for heteroskedasticity stays valid when regressors are observed with measurement error by deconvolving the residuals.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper constructs test statistics for detecting heteroskedasticity in linear regressions where the explanatory variables contain additive measurement error. It does so by building an empirical process from residuals that have been corrected through deconvolution with the error distribution's characteristic function. Asymptotic null distributions are derived for both ordinary smooth and supersmooth measurement error cases when the error law is known, and the approach extends to the case of unknown error distributions estimated from repeated measurements. Two multiplier bootstrap procedures deliver critical values while accounting for parameter estimation. These results matter because measurement error is common in economic data and can invalidate standard heteroskedasticity tests, leading to incorrect inference in applied work.

Core claim

The paper shows that a deconvolved residual-marked empirical process converges weakly to a centered Gaussian process under the null of homoskedasticity, delivering consistent tests against alternatives where the conditional variance depends on the regressors, with the limiting behavior holding in both ordinary smooth and supersmooth measurement error settings.

What carries the argument

The deconvolved residual-marked empirical process, formed by integrating residuals against a kernel-weighted indicator after correcting for measurement error via the characteristic function.

If this is right

  • The test controls size and has nontrivial power against heteroskedasticity even after measurement error contaminates the regressors.
  • Multiplier bootstrap methods yield critical values without direct simulation of the limiting Gaussian process.
  • The procedure remains valid when the measurement error density is estimated from repeated observations rather than assumed known.
  • Asymptotic results cover both ordinary smooth and supersmooth classes of measurement error distributions.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same deconvolution step could be applied to other integrated conditional moment tests for specification errors such as omitted variables or functional form misspecification.
  • When repeated measurements are unavailable, sensitivity checks that vary assumed error variances would help gauge robustness of the heteroskedasticity conclusion.
  • The framework suggests a route to testing conditional heteroskedasticity in models with fixed effects or other nonparametric components after suitable deconvolution adjustments.

Load-bearing premise

The measurement error distribution is known or can be consistently estimated from repeated measurements of the contaminated regressors.

What would settle it

In repeated Monte Carlo experiments with a known measurement error distribution and homoskedastic errors, the rejection rate under the null stays close to the nominal level across ordinary smooth and supersmooth cases.

read the original abstract

In this paper, we propose a novel approach to detect heteroskedasticity in regression models with regressors contaminated by measurement error. Specifically, inspired by the integrated conditional moment (ICM) approach, we construct test statistics based on a deconvolved residual-marked empirical process and establish their asymptotic properties in both ordinary smooth and supersmooth cases, assuming the measurement error distribution is known. The issue of an unknown measurement error distribution is addressed by employing estimators of the measurement error characteristic function based on repeated measurements. Furthermore, depending on whether the measurement error distribution is known or not, to obtain critical values from the case-dependent limiting null distributions, we propose two computationally attractive multiplier bootstrap methods where the "parameter estimation effect" is successfully addressed. Finally, simulation results and empirical studies about corn yields and household budget shares confirm the favorable properties of the proposed tests.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 3 minor

Summary. The manuscript proposes a test for heteroskedasticity in linear regression models with measurement error in the regressors. It constructs test statistics from a deconvolved residual-marked empirical process in the integrated conditional moment (ICM) framework and derives asymptotic null distributions for both ordinary-smooth and supersmooth measurement-error characteristic functions when the error distribution is known. For the unknown-distribution case, estimators of the error characteristic function are obtained from repeated measurements. Two multiplier bootstrap procedures are introduced to obtain critical values while addressing the parameter-estimation effect. Finite-sample behavior is examined in simulations, and the tests are applied to corn-yield and household-budget-share data.

Significance. If the asymptotic derivations and bootstrap validity hold, the paper fills a practical gap by providing a specification test that accommodates classical measurement error, a frequent feature of econometric data. The explicit treatment of ordinary-smooth versus supersmooth error distributions and the construction of estimation-effect-robust bootstraps are technically useful extensions of the ICM literature. The simulation design and two empirical illustrations add credibility and demonstrate applicability.

major comments (2)
  1. [§4.1, Theorem 1] §4.1, Theorem 1 (ordinary-smooth case): The tightness argument for the deconvolved process requires a specific decay condition on the characteristic function of the measurement error; the stated assumption appears sufficient for weak convergence but the paper does not verify that the same rate condition continues to hold after the consistent estimation of the characteristic function from repeated measurements, which is load-bearing for the bootstrap validity claim.
  2. [§5.2] §5.2, the multiplier bootstrap construction: The bootstrap is asserted to remove the parameter-estimation effect, yet the proof sketch does not display the uniform stochastic equicontinuity step that would confirm the bootstrap process converges to the same Gaussian limit as the original statistic; this step is central to justifying the critical-value procedure.
minor comments (3)
  1. [Abstract] Abstract, line 4: 'supersmooth cases' should read 'the supersmooth case' for grammatical parallelism with the preceding clause.
  2. [Section 2] Section 2, Equation (1): The independence assumption between the measurement error and the latent regressor is stated but not repeated when the repeated-measurement estimator is introduced; a brief reminder would improve readability.
  3. [Table 1] Table 1 caption: The phrase 'nominal level 5%' is repeated; a single statement at the top of the table would suffice.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading and valuable comments on our manuscript. The points raised concern the details of the tightness argument under estimated measurement-error characteristic functions and the completeness of the bootstrap proof. We address each comment below and will incorporate the necessary clarifications and additions in the revised version.

read point-by-point responses
  1. Referee: [§4.1, Theorem 1] §4.1, Theorem 1 (ordinary-smooth case): The tightness argument for the deconvolved process requires a specific decay condition on the characteristic function of the measurement error; the stated assumption appears sufficient for weak convergence but the paper does not verify that the same rate condition continues to hold after the consistent estimation of the characteristic function from repeated measurements, which is load-bearing for the bootstrap validity claim.

    Authors: We appreciate this observation. The tightness result in Theorem 1 is derived under a known measurement-error distribution satisfying the ordinary-smooth decay condition. For the repeated-measurement case, the estimator of the characteristic function converges uniformly at a rate that is faster than the bandwidth requirements of the deconvolution kernel. We will add a supporting lemma in the appendix establishing that this estimation error does not violate the decay condition or the tightness of the empirical process. The same argument will be used to confirm bootstrap validity under the estimated characteristic function. revision: yes

  2. Referee: [§5.2] §5.2, the multiplier bootstrap construction: The bootstrap is asserted to remove the parameter-estimation effect, yet the proof sketch does not display the uniform stochastic equicontinuity step that would confirm the bootstrap process converges to the same Gaussian limit as the original statistic; this step is central to justifying the critical-value procedure.

    Authors: We agree that the proof sketch in §5.2 would be strengthened by an explicit uniform stochastic equicontinuity argument. In the revision we will expand the proof to verify that the multiplier bootstrap process is asymptotically equicontinuous uniformly over the index set, thereby establishing that it converges weakly to the same centered Gaussian process as the original statistic after the parameter-estimation effect has been removed. This addition will rigorously justify the critical-value procedure. revision: yes

Circularity Check

0 steps flagged

No significant circularity

full rationale

The paper constructs test statistics from a deconvolved residual-marked empirical process and derives limiting distributions from first principles under ordinary-smooth and supersmooth measurement-error assumptions. The unknown-error case uses consistent estimation from repeated measurements, with multiplier bootstrap constructed to remove parameter-estimation effects; these steps are presented as standard extensions without reducing the null distribution or test statistic to a fit of the same data by construction. No self-definitional, fitted-input, or load-bearing self-citation reductions are exhibited in the stated derivation chain.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The procedure rests on standard econometric regularity conditions plus the domain assumption that the measurement-error characteristic function is either known or estimable from repeats; no free parameters or new entities are introduced in the abstract.

axioms (1)
  • domain assumption Measurement error distribution known or consistently estimable from repeated measurements
    Invoked to justify the deconvolution step and the bootstrap validity.

pith-pipeline@v0.9.0 · 5664 in / 1148 out tokens · 29678 ms · 2026-05-20T04:04:56.066438+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

61 extracted references · 61 canonical work pages

  1. [1]

    = 1 n nX i=1 Z (Yi −g(x;θ 0))2 −σ 2 0 Kb x−W i b eixξ dx, ˆSn(ξ, θ0, σ2

  2. [2]

    = 1 n nX i=1 Z (Yi −g(x;θ 0))2 −σ 2 0 ˆKb x−W i b eixξ dx, S∗ n(ξ, θ0, σ2

  3. [3]

    = 1 n nX i=1 Vi Z (Yi −g(x;θ 0))2 −σ 2 0 Kb x−W i b eixξ dx, ˆS∗ n(ξ, θ0, σ2

  4. [4]

    C Proofs of Theorems Proof of Theorem 3.1.We start be decomposing Sn(ξ, θn, σ2 n) =S n(ξ, θ0, σ2

    = 1 n nX i=1 Z (Yi −g(x;θ 0))2 −σ 2 0 ˆK∗ b x−W i b eixξ dx, 67 and Gn(ξ) = 1 n nX i=1 Z Kb x−W i b eixξ dx, ˆGn(ξ) = 1 n nX i=1 Z ˆKb x−W i b eixξ dx, G∗ n(ξ) = 1 n nX i=1 Vi Z Kb x−W i b eixξ dx, ˆG∗ n(ξ) = 1 n nX i=1 Z ˆK∗ b x−W i b eixξ dx for notational simplicity. C Proofs of Theorems Proof of Theorem 3.1.We start be decomposing Sn(ξ, θn, σ2 n) =S n...

  5. [5]

    parametric estimation effect

    +S n1(ξ, θn)−S n2(ξ, σ2 n),(C.1) where Sn1(ξ, θn) = 1 n nX i=1 Z (Yi −g(x;θ n))2 −(Y i −g(x;θ 0))2 Kb x−W i b eixξ dx, Sn2(ξ, σ2 n) = σ2 n −σ 2 0 " 1 n nX i=1 Z Kb x−W i b eixξ dx # = σ2 n −σ 2 0 Gn(ξ). For the main term, noting that the null hypothesis(2.3) impliesE[(U 2 −σ 2 0)eiXξ] = 0, along with the proof of Lemma D.3, we obtain sup ξ∈Π √nSn(ξ, θ0, σ...

  6. [6]

    Together with conclusions in Lemma D.5 and D.7, sup ξ∈Π √nSn1(ξ, θn) =o p (1) (C.4) and sup ξ∈Π √nSn2(ξ, σ2 n)− f ft X(ξ)√n nX i=1 {r1,∞(di; 0)−E[r 1,∞(di; 0)]} =o p (1) (C.5) hold

    +S n1(0, θn).(C.3) 68 Consequently, √n σ2 n −σ 2 0 − 1√n nX i=1 {r1,∞(di; 0)−E[r 1,∞(di; 0)]} =o p (1) is implied by null hypothesis(2.3), (C.2) and Lemma D.5. Together with conclusions in Lemma D.5 and D.7, sup ξ∈Π √nSn1(ξ, θn) =o p (1) (C.4) and sup ξ∈Π √nSn2(ξ, σ2 n)− f ft X(ξ)√n nX i=1 {r1,∞(di; 0)−E[r 1,∞(di; 0)]} =o p (1) (C.5) hold. Combining (C.1)...

  7. [7]

    which depends onr 2,∞(di;ξ), rather thanS os ∞(ξ, θ0, σ2

  8. [8]

    which depends onr 1,∞(di;ξ). We first decomposeS n(ξ, θn, σ2 n) as (C.1), then invoke the null hypothesis to establish the negligibility of the bias term, and finally results from Lemma D.3, D.5 and D.7 are applied, as their conclusions are stated to be valid in the supersmooth case as well. Proof of Theorem 3.3.It is important to note that, in the proof ...

  9. [9]

    The bias of main term ˆSn(ξ, θ0, σ2

    + ˆSn1(ξ, ˆθn)− ˆSn2(ξ,ˆσ2 n),(C.11) 70 where ˆSn1(ξ, ˆθn) = 1 n nX i=1 Z Yi −g(x; ˆθn) 2 −(Y i −g(x;θ 0))2 ˆKb x−W i b eixξ dx, ˆSn2(ξ,ˆσ2 n) = ˆσ2 n −σ 2 0 " 1 n nX i=1 Z ˆKb x−W i b eixξ dx # = ˆσ2 n −σ 2 0 ˆGn(ξ). The bias of main term ˆSn(ξ, θ0, σ2

  10. [10]

    vanishes relying on the implication of the null hy- pothesis, and the convergence result is then derived by following arguments in the proof of Lemma D.4, leading to the following conclusion, sup ξ∈Π √n ˆSn(ξ, θ0, σ2 0)− 1√n nX i=1 {ˆr1,∞(Di;ξ)−E[ˆr 1,∞(Di;ξ)]} =o p (1),(C.12) where ˆr1,∞(Di;ξ) representsr ϵ,os ∞ (Y, W, W r;ξ, θ 0, σ2

  11. [11]

    parametric estimation effect

    mentioned in Theorem 4.1. For the term representing the “parametric estimation effect”, ˆσ2 n −σ 2 0 = ˆSn(0, θ0, σ2

  12. [12]

    + ˆSn1(0, ˆθn) still holds. Given that sup ξ∈Π √n ˆSn1(ξ, ˆθn) =o p (1),(C.13) follows from Lemma D.6, sup ξ∈Π √n ˆσ2 n −σ 2 0 − 1√n nX i=1 {ˆr1,∞(Di; 0)−E[ˆr1,∞(Di; 0)]} =o p (1) (C.14) can be derived by (C.12) and (C.13), thereby establishing sup ξ∈Π √n ˆSn2(ξ, σ2 n)− f ft X(ξ)√n nX i=1 {ˆr1,∞(Di; 0)−E[ˆr1,∞(Di; 0)]} =o p (1) (C.15) by (C.14) and Lemma ...

  13. [13]

    which depends on ˆr1,∞(Di;ξ) is replaced by ˆSss ∞(ξ, θ0, σ2

  14. [14]

    Therefore, we omit the detailed proof

    which depends on ˆr2,∞(Di;ξ). Therefore, we omit the detailed proof. Proof of Theorem 4.3.Lemma D.4, D.6 and D.8 show that the estimation of deconvolution kernel does not influence the bias of the proposed statistics. In addition, Theorem 4.1 and 4.2 demonstrate that the convergence rate of the main term is unaffected 71 by the presence of an estimated me...

  15. [15]

    +S ∗ n1(ξ, θn)−S ∗ n2(ξ, σ2 n) =S∗ n(ξ, θ0, σ2

  16. [16]

    +S ∗ n1(ξ, θn)− Sn(0, θ0, σ2

  17. [17]

    +S n1(0, θn) G∗ n(ξ).(C.17) where the last equation follows by (C.3) and S∗ n1(ξ, θn) = 1 n nX i=1 Vi Z (Yi −g(x;θ n))2 −(Y i −g(x;θ 0))2 Kb x−W i b eixξ dx. (C.16) and (C.17) provide the decomposition of the bootstrap version of the test statistic after introducing the projection structure, Spro,∗ n (ξ, θn, σ2 n) =S∗ n(ξ, θ0, σ2 0)−S ∗ n(0, θ0, σ2 0)Gn(ξ...

  18. [18]

    +S n1(0, θn) G∗ n(ξ) + Sn(0, θ0, σ2

  19. [19]

    +S n1(0, θn) Gn(ξ)G∗ n(0).(C.18) For the main term, Lemma D.3 shows that the convergence order ofS ∗ n(ξ, θ0, σ2

  20. [20]

    is the same as that ofS n(ξ, θ0, σ2 0). Moreover, as stated in Lemma D.3, the unit variance of the multipliers ensures that the form of the main term is preserved, that is, sup ξ∈Π √nS∗ n(ξ, θ0, σ2 0)− 1√n nX i=1 Vi {r1,∞(di;ξ)−E[r 1,∞(di;ξ)]} =o p (1) (C.19) 72 for the ordinary smooth case and sup ξ∈Π √nS∗ n(ξ, θ0, σ2 0)− 1√n nX i=1 Vi {r2,∞(di;ξ)−E[r 2,...

  21. [21]

    Subsequently, conclusions aboutS n1(ξ, θn),S ∗ n1(ξ, θn),G n(ξ),G ∗ n(ξ) and (C.18) still hold under the local alternative hypothesis

    +S ∗ n(0, θ0, σ2 0)Gn(ξ) =o p n− 1 2 .(C.21) Combining (C.19), (C.20), (C.21) and the unit variance property of multipliers, (5.3) and (5.4) hold under the null hypothesis. Subsequently, conclusions aboutS n1(ξ, θn),S ∗ n1(ξ, θn),G n(ξ),G ∗ n(ξ) and (C.18) still hold under the local alternative hypothesis. The only difference from the proof of Theorem 3.3...

  22. [22]

    + ˆS∗ n1(ξ, ˆθn)− ˆS∗ n2(ξ,ˆσ2 n) and subsequently ˆS∗ n(ξ, θ0, σ2

  23. [23]

    +S ∗ n,2(ξ, θ0, σ2 0), where ˆS∗ n1(ξ, ˆθn), ˆS∗ n2(ξ,ˆσ2 n),S ∗ n,1(ξ, θ0, σ2

  24. [24]

    andS ∗ n,2(ξ, θ0, σ2

  25. [25]

    denote the counterparts of ˆSn1(ξ, ˆθn) and ˆSn2(ξ,ˆσ2 n) as mentioned in proof of Theorem 4.1,S n,1(ξ, θ0, σ2

  26. [26]

    in Lemma D.4, respectively, with ˆf ft ϵ (·) replaced by the multiplier-perturbed estimator ˆf ft,∗ ϵ (·). The proof follows the same line as that of Theorem 4.1, with the only difference being that ˆr1,∞(Di;ξ) is replaced by ˆr∗ 1,∞(D;ξ) =e iW ξ αX l=0 cos l (ξ) (Y−g(W;θ 0))2 −σ 2 0 (l) 73 + 2V ∗ g2fX ft (ξ)Πϵ(ξ) + V ∗ π Z f ft X(t)(g2)ft(ξ−t)Π ϵ(t)dt − ...

  27. [27]

    1 n nX i=1 V ∗ i Πϵ,i(ξ) # − g2fX ft (ξ)

    is obtained by an argument similar to that leading to (D.31), sup ξ∈Π S∗ n,1(ξ, θ0, σ2 0)−E U 2 −σ 2 0 eiXξ Kft(bξ) " 1 n nX i=1 V ∗ i Πϵ,i(ξ) # − g2fX ft (ξ) " 1 n nX i=1 V ∗ i Πϵ,i(ξ) # − 1 2πn nX i=1 Z f ft X(t)(g2)ft(ξ−t)V ∗ i Πϵ,i(t)dt + 1 πn nX i=1 Z (gfX)ft(t)gft(ξ−t)V ∗ i Πϵ,i(t)dt =o p(n− 1 2 ). Note that the local alternative impliesE[(U 2 −σ 2 ...

  28. [28]

    Combining this with the analysis ofS n(ξ, θ0, σ2

    does not affect the deterministic shift function described in Section 3. Combining this with the analysis ofS n(ξ, θ0, σ2

  29. [29]

    under local alternatives in Theorem 3.4, we conclude thatµ(·) andE[ √n ˆS∗ n(ξ, ˆθn,ˆσ2 n)] are asymptotically equivalent, thereby establishing the stated result of the theorem. 74 D Lemmas and Proofs Lemma D.1Under Assumptions 3.1 and 4.1, with addition of Assumption 3.2 or As- sumption 3.3, we have, Z blxlKϵ,2(x)dx= 0, l < p.(D.1) Meanwhile, for the ord...

  30. [30]

    For the ordinary smooth, Assumption 3.2 implies that functiongisp-times continuously differentiable, which allows us to expandH(Y i, b˜x+Wi) aroundW i to thep-th order, wherex=b˜x+W i, Z (Yi −g(x;θ 0))2 −σ 2 0 Kb x−W i b eixξ dx=r 1,n(di;ξ) +t 1,n(di;ξ),(D.10) where r1,n(di;ξ) = e iWiξ p−1X l=0 H(l)(Yi, Wi) bl l! Z ˜xlKϵ(˜x)eib˜xξd˜x, t1,n(di;ξ) = e iWiξ ...

  31. [31]

    + 80 Sn,2(ξ, θ0, σ2 0), where Sn,1(ξ, θ0, σ2

  32. [32]

    = 1 n nX i=1 Z (Yi −g(x;θ 0))2 −σ 2 0 Kb,1 x−W i b eixξ dx, Sn,2(ξ, θ0, σ2

  33. [33]

    = 1 n nX i=1 Z (Yi −g(x;θ 0))2 −σ 2 0 Kb,2 x−W i b eixξ dx. For the first termS n,1, we definep(D i, Dj;ξ) as p(Di, Dj;ξ) = 1 2πb ZZ (Yi −g(x;θ 0))2 −σ 2 0 e−i x−Wi b t Kft(t) f ft ϵ ( t b) Πϵ,j( t b)eixξ dx dt = 1 2π ZZ (Yi −g(x;θ 0))2 −σ 2 0 e−i(x−Wi)t Kft(bt) f ft ϵ (t) Πϵ,j(t)eixξ dx dt, where Πϵ,j(·) is denoted by Πϵ,j (t) = f ft ϵ (t) 2 −ζ j(t) 2 [f...

  34. [34]

    +nS n,12(ξ, θ0, σ2 0)/(n−1), Sn,11(ξ, θ0, σ2

  35. [35]

    = 1 n2 nX i=1 p(Di, Di;ξ), S n,12(ξ, θ0, σ2

  36. [36]

    = 1 n(n−1) nX i̸=j p(Di, Dj;ξ). By similar arguments to the proof of Lemma D.1, we obtain E 1 2π Z Z xle−ixt Kft(bt) f ft ϵ (t) Πϵ,j(t)dt dx =o p n 1 2 forl≤p+ 1 whenb −3α−2 =o(n 1/2) which holds under Assumption 4.2. Still noteH(Y, x) = (Y− g(x;θ 0))2 −σ 2 0, along with Assumption 3.2 and 4.2, we obtain E sup ξ∈Π 1 2π ZZ H(Y i, x+W i)e−ixt Kft(bt) f ft ϵ...

  37. [37]

    =E[p 1(Di, ξ)] + 2 n nX i=1 {p1(Di, ξ)−E[p 1(Di, ξ)]}.(D.26) ProvidedE(U|X) = 0, we note that 2p1(Di, ξ) =E[2q(D i, Dj, ξ)|Di] = 1 2π ZZ E (Y−g(X;θ 0))2 −σ 2 0 eiW t Kft(bt) f ft ϵ (t) Πϵ,i(t)eix(ξ−t) dx dt + 1 2π ZZ E (g(X;θ 0)−g(x;θ 0))2 eiXt Kft(bt)Πϵ,i(t)eix(ξ−t) dx dt, =E (Y−g(X;θ 0))2 −σ 2 0 eiW ξ Kft(bξ)Πϵ,i(ξ) + g2fX ft (ξ)K ft(bξ)Πϵ,i(ξ) + 1 2π Z...

  38. [38]

    Along withS n,11(ξ, θ0, σ2

    =O p(n−1/2) by combining (D.26), (D.28), (D.30) 82 andE[p 1(D, ξ)] = 0. Along withS n,11(ξ, θ0, σ2

  39. [39]

    1 n nX i=1 Πϵ,i(ξ) # − g2fX ft (ξ)

    =o p(n−1/2), we obtain sup ξ∈Π Sn,1(ξ, θ0, σ2 0)−E U 2 −σ 2 0 eiXξ Kft(bξ) " 1 n nX i=1 Πϵ,i(ξ) # − g2fX ft (ξ) " 1 n nX i=1 Πϵ,i(ξ) # − 1 2πn nX i=1 Z f ft X(t)(g2)ft(ξ−t)Π ϵ,i(t)dt + 1 πn nX i=1 Z (gfX)ft(t)gft(ξ−t)Π ϵ,i(t)dt =o p(n− 1 2 ).(D.31) For the second term, Sn,2(ξ, θ0, σ2

  40. [40]

    = 1 n nX i=1 eiWiξ Z H(Y i, bx+W i)Kϵ,2(x)eibxξ dx =Ros n,1(ξ, θ0, σ2

  41. [41]

    +R os n,3(ξ, θ0, σ2 0),(D.32) by Taylor expansion, where Ros n,1(ξ, θ0, σ2

  42. [42]

    = p−1X l=0 bl l!    1 n nX i=1 eiWiξH(l)(Yi, Wi) −E eiW ξH(l)(Y, W)    Z xlKϵ,2(x)eibxξ dx, Ros n,2(ξ, θ0, σ2

  43. [43]

    = bp p! 1 n nX i=1 Z    eiWiξH(p)(Yi, ¯Wi) −E H(p)(Y, ¯W)e iW ξ    xpKϵ,2(x)eibxξ dx, Ros n,3(ξ, θ0, σ2

  44. [44]

    = Z E H(Y, x)K b,2 x−W b eixξ dx. Notice that Assumption 3.2 impliesE|H (l)(Y, W)| 2 <∞, sup ξ∈Π 1 n nX i=1 eiWiξH(l)(Yi, Wi)−E eiW ξH(l)(Y, W) =O p n− 1 2 forl < p(D.33) follows by standard entropy-based criteria for Donsker classes (see, e.g., Example 2.10.10 and Theorem 2.5.2 in van der Vaart and Wellner (1996) which implies that the exponential weight...

  45. [45]

    1 n nX i=1 H(p)(Yi, Wi) +E H(p)(Y, W) #Z |(bx)pKϵ,2(x)|dx + 1 p!

    in decomposition (D.32), using the Lipschitz continuity mentioned in Assumption 3.2, sup ξ∈Π |Ros n,1(ξ, θ0, σ2 0)| ≤ 1 p! " 1 n nX i=1 H(p)(Yi, Wi) +E H(p)(Y, W) #Z |(bx)pKϵ,2(x)|dx + 1 p! " 1 n nX i=1 |m(Yi, Wi)|+E|m(Y, W)| #Z (bx)p+1Kϵ,2(x) dx, wherem(Y, W) =|Y L g(p)(W)|+|L [g2](p)(W)|. Notice that Assumption 3.2 impliesE|H (p)(Y, W)|< ∞andE|m(Y, W)|<...

  46. [46]

    as Ros n,3(ξ, θ0, σ2

  47. [47]

    Let Ros n,31(ξ, θ0, σ2

    = Z E H(Y, X+bx)e i(X+bx)ξ E[K ϵ,2(x)]dx = Z E (Y−g(X;θ 0))2 −σ 2 0 ei(X+bx)ξ E[K ϵ,2(x)]dx + Z E (g(X;θ 0)−g(X+bx;θ 0))2 ei(X+bx)ξ E[K ϵ,2(x)]dx = Z E (Y−g(X;θ 0))2 −σ 2 0 Kb,2 x−X b eixξ dx + ZZ (g(y;θ 0)−g(x;θ 0))2 eixξE Kb,2 x−y b fX(y)dx dy. Let Ros n,31(ξ, θ0, σ2

  48. [48]

    = 1 2πb Z eixξ Z g2(y;θ 0)Kϵ,2 x−y b fX(y)dy dx, Ros n,32(ξ, θ0, σ2

  49. [49]

    = 1 2πb Z g2(x;θ 0)eixξ Z Kϵ,2 x−y b fX(y)dy dx, Ros n,33(ξ, θ0, σ2

  50. [50]

    = 1 2πb Z g(x;θ 0)eixξ Z g(y;θ 0)Kϵ,2 x−y b fX(y)dy dx. 84 By similar arguments to the proof of Lemma D.3, Z g2(y;θ 0)Kϵ,2 x−y b fX(y)dy = p−1X l=0 bl l! (g2fX)(l)(x;θ 0) Z ylKϵ,2(y)dy+ bp p! Z (g2fX)(p)(˜x;θ0)ypKϵ,2(y)dy = bp p! Z (g2fX)(p)(˜x;θ0)ypKϵ,2(y)dy holds using Taylor expansion, where ˜xis betweenx−byandx. We note that the second equation follow...

  51. [51]

    By a similar arguments,R os n,32(ξ, θ0, σ2

    =o p(n−1/2) which follows by conclusions in Lemma D.1. By a similar arguments,R os n,32(ξ, θ0, σ2

  52. [52]

    =o p(n−1/2) andR os n,33(ξ, θ0, σ2

  53. [53]

    =o p(n−1/2). Consequently, sup ξ∈Π Ros n,3(ξ, θ0, σ2 0)− Z E U 2 −σ 2 0 Kb,2 x−X b eixξ dx =o p n− 1 2 .(D.36) Thus, (D.23) holds for the ordinary smooth case by combining the definition ofψ 2, (D.8), (D.31), (D.32), (D.34), (D.35) and (D.36). For the supersmooth case, (D.8) and decomposition of ˆSn(ξ, θ0, σ2 0)−S n(ξ, θ0, σ2

  54. [54]

    still hold. For the first term of decomposition, under Assumption 4.3, following the proof of Lemma D.1, E sup ξ∈Π 1 2π ZZ H(Y i, x+W i)e−ixt Kft(bt) f ft ϵ (t) Πϵ,j(t)eixξ dx dt ≤ 1 2π ∞X l=0 1 l! E H(l)(Yi, Wi) Esup ξ∈Π ZZ xle−ixt Kft(bt) f ft ϵ (t) Πϵ,j(t)eixξ dx dt =Op e−3µ(1+b−1)2 =o p n 1 2 , where the last equation follows by Assumption 4.3. Thus, ...

  55. [55]

    85 Subsequently, (D.31) follows by (D.26)–(D.30)

    =o p(n−1/2) andE[sup ξ∈Π p2(Di, Di;ξ)] =o p(n) hold. 85 Subsequently, (D.31) follows by (D.26)–(D.30). For the second term of decomposition, we decompose Sn,2(ξ, θ0, σ2

  56. [56]

    +R ss n,2(ξ, θ0, σ2 0),(D.37) where Rss n,1(ξ, θ0, σ2

  57. [57]

    = ∞X l=0 bl l!    1 n nX i=1 eiWiξH(l)(Yi, Wi) −E eiW ξH(l)(Y, W)    Z xlKϵ,2(x)eibxξ dx, Rss n,2(ξ, θ0, σ2

  58. [58]

    Under Assumption 4.3, (D.33) still holds and therefore, sup ξ∈Π Rss n,1(ξ, θ0, σ2 0) =o p n− 1 2 (D.38) follows by conclusion (D.7) mentioned in Lemma D.2

    = Z E H(Y, x)K b,2 x−W b eixξ dx. Under Assumption 4.3, (D.33) still holds and therefore, sup ξ∈Π Rss n,1(ξ, θ0, σ2 0) =o p n− 1 2 (D.38) follows by conclusion (D.7) mentioned in Lemma D.2. We notice thatR ss n,2(ξ, θ0, σ2

  59. [59]

    = Ros n,31(ξ, θ0, σ2

  60. [60]

    +R os n,32(ξ, θ0, σ2

  61. [61]

    parameter estimation uncertainty

    still holds, together with Z g2(y;θ 0)Kϵ,2 x−y b fX(y)dy= ∞X l=0 bl l! (g2fX)(l)(x;θ 0) Z ylKϵ,2(y)dy= 0 and by similar arguments, Z Kϵ,2 x−y b fX(y)dy= 0, Z g(y;θ 0)Kϵ,2 x−y b fX(y)dy= 0, we claim sup ξ∈Π Sn,2(ξ, θ0, σ2 0)− Z E U 2 −σ 2 0 Kb,2 x−X b eixξ dx =o p n− 1 2 .(D.39) Consequently, (D.23) follows by (D.31) and (D.39). Owing to the fact that the ...