Kernel Density Estimation under $C^{1,1}$ Regularity: AMISE, Weak Curvature, and Plug-in Bandwidths

Alireza Kabgani; Elaheh Lotfian

arxiv: 2605.20550 · v1 · pith:D7LPYG6Qnew · submitted 2026-05-19 · 🧮 math.ST · stat.TH


Pith reviewed 2026-05-21 06:07 UTC · model grok-4.3


keywords: kernel density estimation · AMISE · weak derivative · C^{1,1} regularity · plug-in bandwidth · Epanechnikov kernel · nonparametric statistics · weak curvature


The pith

The classical AMISE formula and optimal bandwidth for kernel density estimation hold under the weaker condition that the density has a Lipschitz first derivative.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper shows that the usual asymptotic mean integrated squared error expression for kernel density estimators continues to apply when the density belongs only to C^{1,1} rather than requiring a continuous second derivative. The authors replace the pointwise Taylor expansion with an integral representation that uses the weak second derivative, allowing the roughness term to be defined even when curvature is kinked or discontinuous. Under the additional condition that this weak second derivative lies in L^2, the familiar n^{-1/5} optimal bandwidth and Epanechnikov kernel optimality are recovered exactly as in the classical setting. The work further develops a plug-in bandwidth selector based on a generalized curvature estimate and proves its first-order equivalence to the oracle bandwidth, along with consistency of a related U-statistic estimator. A multivariate version recovers the expected scalar-bandwidth convergence rate.

Core claim

Under f in C^{1,1}(R) with weak second derivative in L^2(R), the asymptotic mean integrated squared error of a kernel density estimator equals the classical expression involving the kernel's second moment and the roughness R(f''), where the latter is interpreted through the weak-curvature functional. This yields the standard optimal bandwidth of order n^{-1/5} and confirms that the Epanechnikov kernel minimizes the leading AMISE term without any need for pointwise twice differentiability.
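For orientation, the classical expression referred to here can be written out in standard textbook notation. The symbols are an assumption, since the page itself does not display the formula: $R(g)=\int g^2$, $\mu_2(K)=\int u^2 K(u)\,du$, and $h_{\mathrm{AMISE}}$ denotes the minimizer. Under the paper's hypotheses, $f''$ would be read as the weak second derivative.

```latex
\[
\mathrm{AMISE}(h) = \frac{R(K)}{nh} + \frac{h^{4}}{4}\,\mu_2(K)^{2}\,R(f''),
\qquad
h_{\mathrm{AMISE}} = \left[\frac{R(K)}{\mu_2(K)^{2}\,R(f'')\,n}\right]^{1/5},
\]
\[
\mathrm{AMISE}(h_{\mathrm{AMISE}})
  = \frac{5}{4}\,\bigl[\mu_2(K)^{2}\,R(K)^{4}\,R(f'')\bigr]^{1/5}\,n^{-4/5}.
\]
```

The kernel enters the minimal value only through $R(K)^{4/5}\mu_2(K)^{2/5}$, which is the quantity the Epanechnikov kernel minimizes among nonnegative second-order kernels.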

What carries the argument

Integral Taylor representation based on the weak second derivative, which defines the weak-curvature functional R(f'').
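A sketch of how this representation can produce the leading bias term, assuming a symmetric kernel with $\int K = 1$ and $\int u^2 |K(u)|\,du < \infty$. This is one standard route via Minkowski's integral inequality and continuity of translations in $L^2$; it is not claimed to be the paper's own proof.

```latex
% Integral Taylor formula; f' is Lipschitz, hence absolutely continuous,
% and f'' denotes the weak second derivative. Substitute t = s h u.
\[
f(x+hu) = f(x) + hu\,f'(x) + \int_0^{hu} (hu-t)\,f''(x+t)\,dt
        = f(x) + hu\,f'(x) + h^{2}u^{2}\int_0^{1}(1-s)\,f''(x+shu)\,ds .
\]
% Integrate against K; the linear term vanishes by symmetry.
\[
\mathbb{E}\hat f_h(x) - f(x)
  = h^{2}\int K(u)\,u^{2}\int_0^{1}(1-s)\,f''(x+shu)\,ds\,du .
\]
% Since \int_0^1 (1-s)\,ds = 1/2, the candidate leading term is (h^2/2)\mu_2(K) f''(x).
\[
\Bigl\| \mathbb{E}\hat f_h - f - \tfrac{h^{2}}{2}\,\mu_2(K)\,f'' \Bigr\|_{2}
  \le h^{2}\int |K(u)|\,u^{2}\int_0^{1}(1-s)\,
      \bigl\| f''(\cdot+shu) - f'' \bigr\|_{2}\,ds\,du
  = o(h^{2}).
\]
```

The last step uses that $\| f''(\cdot+\tau) - f'' \|_2 \to 0$ as $\tau \to 0$ for any $f'' \in L^2(\mathbb{R})$, together with dominated convergence (the inner norm is bounded by $2\|f''\|_2$). Squaring then gives an integrated squared bias of $\tfrac{h^4}{4}\mu_2(K)^2 R(f'') + o(h^4)$.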

If this is right

The AMISE formula remains identical to the classical case without assuming a continuous second derivative.
The optimal bandwidth retains the rate n^{-1/5}.
The Epanechnikov kernel remains asymptotically optimal among kernels of fixed order.
A plug-in selector based on estimated weak curvature is first-order AMISE equivalent under ratio-consistent estimation.
A leave-one-out U-statistic estimator of the weak curvature is consistent.
In the multivariate setting the scalar bandwidth achieves the rate n^{-4/(d+4)} via weak Hessian regularity.
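The bandwidth and rate statements in the list above can be made concrete with a small numerical sketch. The test density is an assumption chosen for illustration and does not come from the paper: the quadratic B-spline on [0, 3] (the law of the sum of three independent Uniform(0, 1) variables), whose first derivative is Lipschitz while its second derivative jumps between 1, -2 and 1. The function names `amise` and `h_amise` are hypothetical.

```python
# Epanechnikov kernel K(u) = 0.75 * (1 - u^2) on [-1, 1]: closed-form constants.
R_K = 3.0 / 5.0     # roughness, integral of K^2
MU2_K = 1.0 / 5.0   # second moment, integral of u^2 K(u)

# Illustrative C^{1,1} density (assumption): quadratic B-spline on [0, 3].
# Its weak second derivative equals 1, -2, 1 on (0,1), (1,2), (2,3),
# so the weak-curvature functional is 1 + 4 + 1 = 6.
R_F2 = 6.0


def amise(h, n):
    """Leading-order AMISE: variance term plus integrated squared bias term."""
    return R_K / (n * h) + 0.25 * h**4 * MU2_K**2 * R_F2


def h_amise(n):
    """Minimizer of amise(h, n) over h > 0; it scales as n^(-1/5)."""
    return (R_K / (MU2_K**2 * R_F2 * n)) ** 0.2


if __name__ == "__main__":
    for n in (100, 1000, 10000):
        h = h_amise(n)
        print(n, h, amise(h, n))
```

Because the minimal AMISE scales as n^(-4/5), multiplying the sample size by 32 should divide it by 16 in this leading-order approximation.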

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Densities arising from threshold or regime-change models can now use standard KDE bandwidth rules with the same theoretical guarantees.
The weak-curvature approach may extend naturally to other bias calculations in nonparametric smoothing under minimal smoothness.
Numerical checks on piecewise-quadratic densities would directly test whether the predicted AMISE rates materialize in finite samples.

Load-bearing premise

The integral Taylor representation using the weak second derivative accurately captures the leading bias term in the mean integrated squared error expansion.

What would settle it

Compute the finite-sample integrated squared error of a kernel estimator on a density whose first derivative is Lipschitz but whose second derivative has a jump discontinuity, using the n^{-1/5} bandwidth, and check whether the error scales exactly as predicted by the classical AMISE expression.
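A minimal Monte Carlo sketch of such a check, under stated assumptions: the test density is the quadratic B-spline on [0, 3] (sum of three uniforms), picked here because its second derivative has jumps; the kernel is Epanechnikov; the bandwidth is the AMISE-optimal one with R(f'') = 6. None of these choices, nor the function names, are taken from the paper. The fitted log-log slope of mean ISE against n would be expected to sit near -4/5 if the classical rate applies.

```python
import numpy as np


def irwin_hall3_pdf(x):
    """Density of U1 + U2 + U3 with Ui ~ Uniform(0, 1): C^{1,1} but not C^2."""
    x = np.asarray(x, dtype=float)
    left = 0.5 * x**2
    mid = 0.5 * (-2.0 * x**2 + 6.0 * x - 3.0)
    right = 0.5 * (3.0 - x) ** 2
    return np.where((x >= 0) & (x < 1), left,
           np.where((x >= 1) & (x < 2), mid,
           np.where((x >= 2) & (x <= 3), right, 0.0)))


def epanechnikov_kde(grid, data, h):
    """Kernel density estimate on `grid` with K(u) = 0.75 * (1 - u^2) on [-1, 1]."""
    u = (grid[:, None] - data[None, :]) / h
    k = 0.75 * np.clip(1.0 - u**2, 0.0, None)
    return k.mean(axis=1) / h


def h_amise(n):
    """AMISE-optimal bandwidth for R(K) = 3/5, mu_2(K) = 1/5, R(f'') = 6."""
    return (2.5 / n) ** 0.2


def mean_ise(n, reps, rng, grid):
    """Monte Carlo average of the integrated squared error (Riemann sum on grid)."""
    dx = grid[1] - grid[0]
    truth = irwin_hall3_pdf(grid)
    total = 0.0
    for _ in range(reps):
        data = rng.random((n, 3)).sum(axis=1)
        est = epanechnikov_kde(grid, data, h_amise(n))
        total += np.sum((est - truth) ** 2) * dx
    return total / reps


def mise_slope(sizes=(250, 1000, 4000), reps=20, seed=0):
    """Least-squares slope of log(mean ISE) against log(n)."""
    rng = np.random.default_rng(seed)
    grid = np.linspace(-0.5, 3.5, 401)
    mises = [mean_ise(n, reps, rng, grid) for n in sizes]
    slope = np.polyfit(np.log(sizes), np.log(mises), 1)[0]
    return slope, mises


if __name__ == "__main__":
    slope, mises = mise_slope()
    print("fitted slope:", slope)
    print("mean ISE per sample size:", mises)
```

With only a few sample sizes and replications the slope is noisy, so this kind of run can indicate consistency with the n^(-4/5) rate but cannot confirm the exact AMISE constant.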

Figures

Figures reproduced from arXiv: 2605.20550 by Alireza Kabgani, Elaheh Lotfian.

**Figure 2.** Log–log plot of Monte Carlo mean integrated squared error against sample size for the …

**Figure 3.** Epanechnikov kernel density estimates for the Old Faithful eruption-duration data

Original abstract

Classical kernel density estimation usually derives the AMISE and optimal bandwidth from a pointwise Taylor expansion, which requires twice continuous differentiability. This assumption is stronger than necessary and excludes natural densities arising from threshold models, regime changes, and robust mixture models, where the first derivative may be Lipschitz while the curvature is kinked, discontinuous, or only weakly defined. We show that the classical AMISE theory remains valid under the weaker condition $f\in C^{1,1}(\mathbb{R})$. The pointwise $C^2$ Taylor expansion is replaced by an integral Taylor representation based on the weak second derivative, so that $R(f'')$ is interpreted as a weak-curvature functional. Under $f\in C^{1,1}(\mathbb{R})$ and $f''\in L^2(\mathbb{R})$, we recover the classical AMISE formula, the $n^{-1/5}$ optimal bandwidth, and Epanechnikov kernel optimality without assuming a continuous classical second derivative. We also propose a generalized-curvature plug-in bandwidth selector, prove its first-order AMISE equivalence under ratio-consistent curvature estimation, and establish consistency of a leave-one-out U-statistic curvature estimator. A multivariate extension using weak Hessians recovers the scalar-bandwidth rate $n^{-4/(d+4)}$.
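The multivariate rate quoted at the end of the abstract matches the textbook scalar-bandwidth calculation, sketched below. The notation is an assumption: $K$ is a $d$-variate kernel with $\int u u^{\top} K(u)\,du = \mu_2(K) I_d$, and $\Delta f = \operatorname{tr}(\nabla^2 f)$ would be read through the weak Hessian in the paper's setting.

```latex
\[
\mathrm{AMISE}_d(h) = \frac{R(K)}{n h^{d}}
  + \frac{h^{4}}{4}\,\mu_2(K)^{2}\int_{\mathbb{R}^d}\bigl(\Delta f(x)\bigr)^{2}\,dx ,
\]
\[
h_{\mathrm{AMISE}} = \left[\frac{d\,R(K)}{\mu_2(K)^{2}\,\int_{\mathbb{R}^d}(\Delta f)^{2}\;n}\right]^{1/(d+4)},
\qquad
\mathrm{AMISE}_d(h_{\mathrm{AMISE}}) \asymp n^{-4/(d+4)} .
\]
```

Setting $d = 1$ recovers the univariate $n^{-1/5}$ bandwidth and $n^{-4/5}$ error rate.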

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper recovers the classical AMISE under C^{1,1} via weak second derivatives. This is a clean technical step provided the remainder can be controlled, though whether the L^2 assumption on the curvature suffices for the bias expansion remains open to question.

The main new piece is replacing the usual pointwise C^2 Taylor step with an integral representation that uses the weak second derivative. Under f in C^{1,1} and f'' in L^2 they recover the standard AMISE formula, the n^{-1/5} bandwidth rate, and Epanechnikov optimality. That move directly addresses densities from threshold or mixture models where curvature can kink or jump. They also build a generalized-curvature plug-in selector, show first-order AMISE equivalence when the curvature estimate is ratio-consistent, and prove consistency for a leave-one-out U-statistic estimator of the curvature functional. The multivariate extension with weak Hessians follows the same pattern and keeps the scalar-bandwidth rate n^{-4/(d+4)}.

Referee Report

2 major / 2 minor

Summary. The paper claims that under f ∈ C^{1,1}(ℝ) with f'' ∈ L²(ℝ), kernel density estimation recovers the classical AMISE formula (including the leading bias term involving ∫ [f''(x)]² dx), the n^{-1/5} optimal bandwidth rate, and Epanechnikov kernel optimality. This is achieved by replacing the pointwise C² Taylor expansion with an integral Taylor representation based on the weak second derivative. The work also introduces a generalized-curvature plug-in bandwidth selector shown to be first-order AMISE equivalent under ratio-consistent curvature estimation, proves consistency of a leave-one-out U-statistic estimator for the curvature functional, and provides a multivariate extension recovering the scalar-bandwidth rate n^{-4/(d+4)}.
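The Epanechnikov optimality mentioned in this summary can be illustrated numerically. In the classical theory the kernel affects the minimal AMISE only through R(K)^{4/5} μ₂(K)^{2/5}, and the usual efficiency of a kernel is the 5/4 power of the ratio of that constant for the Epanechnikov kernel to that for the kernel in question. The sketch below uses well-known closed-form kernel constants; the dictionary and function names are hypothetical.

```python
import math

# (R(K), mu_2(K)) for some common second-order kernels, in closed form.
KERNELS = {
    "epanechnikov": (3.0 / 5.0, 1.0 / 5.0),
    "biweight": (5.0 / 7.0, 1.0 / 7.0),
    "triangular": (2.0 / 3.0, 1.0 / 6.0),
    "gaussian": (1.0 / (2.0 * math.sqrt(math.pi)), 1.0),
    "uniform": (1.0 / 2.0, 1.0 / 3.0),
}


def kernel_constant(r_k, mu2):
    """Kernel-dependent factor of the minimal AMISE: R(K)^(4/5) * mu_2(K)^(2/5)."""
    return r_k**0.8 * mu2**0.4


def efficiency(name):
    """Sample-size efficiency relative to the Epanechnikov kernel (at most 1)."""
    c_epan = kernel_constant(*KERNELS["epanechnikov"])
    return (c_epan / kernel_constant(*KERNELS[name])) ** 1.25


if __name__ == "__main__":
    for name in KERNELS:
        print(f"{name:13s} efficiency = {efficiency(name):.4f}")
```

The constant is invariant under rescaling of the kernel, which is why the comparison does not depend on how each kernel's support or variance is normalized.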

Significance. If the central claims hold, the result would extend classical KDE asymptotics to a wider class of densities with kinked or only weakly defined curvature (e.g., from threshold or regime-switching models), which is a meaningful theoretical advance. The explicit use of weak derivatives to interpret R(f'') and the construction of a consistent U-statistic curvature estimator are notable strengths that could support practical plug-in methods under reduced smoothness.

major comments (2)

§3.1 (Bias expansion via integral Taylor representation): This step is load-bearing for recovering the exact classical AMISE. The argument claims that the remainder arising from the representation f(x + hu) = f(x) + hu f'(x) + ∫_0^{hu} (hu - t) f''(x + t) dt is o(h²) in the L² sense, after convolution with K and integration of the squared bias, under only f'' ∈ L²(ℝ). With f'' merely square-integrable, the resulting double-integral term against K(u) need not vanish faster than the leading (h²/2) μ₂(K) f''(x) term, either uniformly in x or after squaring and integrating; an explicit bound or an additional local integrability condition is required to confirm the o(h²) rate.
Theorem 4 (AMISE equivalence of the generalized-curvature plug-in selector): The first-order equivalence to the AMISE-optimal bandwidth is stated to hold under ratio-consistent estimation of the weak-curvature functional, but the proof sketch does not explicitly address whether the leave-one-out U-statistic estimator remains ratio-consistent when the pilot bandwidth used in curvature estimation is itself of order n^{-1/5}; any implicit dependence could affect the claimed equivalence.
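Why ratio consistency is the natural hypothesis in Theorem 4 can be seen from a short textbook calculation. The notation is assumed: $\hat R$ stands for the estimated weak-curvature functional and $\hat h$ for the resulting plug-in bandwidth; the paper's own proof may proceed differently.

```latex
\[
\hat h = \left[\frac{R(K)}{\mu_2(K)^{2}\,\hat R\,n}\right]^{1/5}
\quad\Longrightarrow\quad
\frac{\hat h}{h_{\mathrm{AMISE}}}
  = \left(\frac{\hat R}{R(f'')}\right)^{-1/5} \xrightarrow{\;p\;} 1
\quad\text{whenever}\quad
\frac{\hat R}{R(f'')} \xrightarrow{\;p\;} 1 ,
\]
\[
\frac{\mathrm{AMISE}(c\,h_{\mathrm{AMISE}})}{\mathrm{AMISE}(h_{\mathrm{AMISE}})}
  = \frac{4}{5c} + \frac{c^{4}}{5},
\qquad\text{continuous in } c>0 \text{ and equal to } 1 \text{ at } c = 1 .
\]
```

The second identity follows from writing $\mathrm{AMISE}(h) = A/h + Bh^{4}$ and using $B h_{\mathrm{AMISE}}^{5} = A/4$. It shows that a bandwidth ratio tending to one leaves the AMISE ratio tending to one as well.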

minor comments (2)

[§2] The notation for the weak-curvature functional R(f'') is introduced in the abstract but would benefit from an explicit integral definition in §2 before the main results.
Figure 1 (if present) comparing classical and weak-curvature AMISE curves should include the precise regularity conditions used for each curve in the caption.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading of the manuscript and for the positive assessment of its potential significance in extending classical KDE asymptotics to the C^{1,1} class. The two major comments raise valid points about technical details in the bias expansion and the proof of the plug-in selector. We address each below and will incorporate clarifications and expanded arguments in the revised version.


Referee: §3.1 (Bias expansion via integral Taylor representation): This step is load-bearing for recovering the exact classical AMISE. The argument claims that the remainder arising from the representation f(x + hu) = f(x) + hu f'(x) + ∫_0^{hu} (hu - t) f''(x + t) dt is o(h²) in the L² sense, after convolution with K and integration of the squared bias, under only f'' ∈ L²(ℝ). With f'' merely square-integrable, the resulting double-integral term against K(u) need not vanish faster than the leading (h²/2) μ₂(K) f''(x) term, either uniformly in x or after squaring and integrating; an explicit bound or an additional local integrability condition is required to confirm the o(h²) rate.

Authors: We agree that an explicit bound on the remainder would strengthen the argument. In the current proof we apply the integral Taylor formula, convolve with K, square the bias, and integrate. Because f'' ∈ L²(ℝ) and K has compact support and finite second moment, Fubini's theorem together with Cauchy–Schwarz bounds the L²-norm of the remainder convolution by C h² ‖f''‖₂ times a factor that tends to zero with h (the location variable having been integrated out). The integrated squared remainder is therefore o(h⁴), which is negligible relative to the leading AMISE terms. We will add a short lemma in §3.1 that records this bound explicitly, confirming the classical AMISE expansion under the stated assumptions without requiring extra local integrability. revision: yes
Referee: Theorem 4 (AMISE equivalence of the generalized-curvature plug-in selector): The first-order equivalence to the AMISE-optimal bandwidth is stated to hold under ratio-consistent estimation of the weak-curvature functional, but the proof sketch does not explicitly address whether the leave-one-out U-statistic estimator remains ratio-consistent when the pilot bandwidth used in curvature estimation is itself of order n^{-1/5}; any implicit dependence could affect the claimed equivalence.

Authors: The concern is well taken. The manuscript establishes consistency of the leave-one-out U-statistic for the curvature functional R(f'') whenever the pilot bandwidth h_p satisfies h_p → 0 and n h_p → ∞, a regime that includes the order n^{-1/5} pilot. To obtain ratio-consistency (estimated curvature / true curvature → 1 in probability) we further need the estimation error to be o_p(1) uniformly over pilots of that order. We will expand the proof of Theorem 4 by conditioning on the pilot bandwidth and invoking the uniform convergence rate of the U-statistic (which is faster than the variation induced by an n^{-1/5} pilot). This explicit argument removes any ambiguity about dependence and preserves the first-order AMISE equivalence of the resulting bandwidth selector. revision: yes
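For readers who want to see what a leave-one-out U-statistic for the curvature functional can look like, here is a minimal sketch in the style of classical kernel estimators of integrated squared density derivatives. It is an assumption that the paper's estimator has this form: the Gaussian pilot kernel, the pilot bandwidth `g`, and the function names are all illustrative. The statistic averages the fourth derivative of the pilot kernel over distinct pairs, so its expectation is ∫ (L_g * f)'' f'', which approaches R(f'') as g shrinks when f'' ∈ L².

```python
import numpy as np

R_K = 3.0 / 5.0     # Epanechnikov roughness
MU2_K = 1.0 / 5.0   # Epanechnikov second moment


def phi4(u):
    """Fourth derivative of the standard normal density."""
    u = np.asarray(u, dtype=float)
    return (u**4 - 6.0 * u**2 + 3.0) * np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)


def curvature_ustat(x, g):
    """Leave-one-out U-statistic: average of phi4((Xi - Xj) / g) / g^5 over i != j."""
    x = np.asarray(x, dtype=float)
    n = x.size
    vals = phi4((x[:, None] - x[None, :]) / g)
    off_diagonal_sum = vals.sum() - np.trace(vals)   # drop the i == j terms
    return off_diagonal_sum / (n * (n - 1) * g**5)


def plugin_bandwidth(x, g):
    """Plug the curvature estimate into the AMISE-optimal bandwidth formula."""
    r_hat = curvature_ustat(x, g)
    if r_hat <= 0.0:
        raise ValueError("curvature estimate is not positive; try a larger pilot g")
    return (R_K / (MU2_K**2 * r_hat * len(x))) ** 0.2


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sample = rng.standard_normal(2000)
    print(curvature_ustat(sample, 0.5), plugin_bandwidth(sample, 0.5))
```

The estimate can be negative for small samples or very small pilots, which is why the bandwidth helper guards against it. How the pilot should scale with n for ratio consistency is exactly the point under discussion in the exchange above, and this sketch takes no position on it.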

Circularity Check

0 steps flagged

No circularity: derivation uses independent integral representation and consistency proof


The paper replaces the classical pointwise C^2 Taylor expansion with an integral Taylor representation based on the weak second derivative under the stated C^{1,1} and L^2 assumptions. It then derives the AMISE formula, n^{-1/5} rate, and Epanechnikov optimality directly from this representation. The generalized-curvature plug-in selector is shown first-order equivalent under a ratio-consistent estimator whose consistency is separately established via a leave-one-out U-statistic with its own proof; neither step reduces to a fitted input renamed as prediction nor to a self-citation chain. The multivariate extension follows the same pattern. All load-bearing steps are self-contained against external analytic benchmarks and do not rely on the target result by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests primarily on the domain assumption of C^{1,1} regularity together with square-integrability of the weak second derivative; no free parameters or new postulated entities are introduced in the abstract.

axioms (1)

Domain assumption: f belongs to C^{1,1}(R), with weak second derivative in L^2(R).
This assumption replaces classical twice continuous differentiability and enables the integral Taylor representation used in the AMISE derivation.


