arxiv: 2606.24117 · v1 · pith:O3SFTZ3Gnew · submitted 2026-06-23 · 🌌 astro-ph.IM

Improving Radio Source Count Estimation Using Kernel Density Estimation

Pith reviewed 2026-06-25 23:00 UTC · model grok-4.3

classification 🌌 astro-ph.IM
keywords radio source counts · kernel density estimation · LOFAR survey · source count estimation · nonparametric density estimation · survey incompleteness · flux-limited samples

The pith

Kernel density estimation produces more accurate radio source counts than traditional binning, especially at high fluxes.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper demonstrates that kernel density estimation serves as a nonparametric alternative to histogram binning for estimating differential radio source counts. Conventional binning suffers from arbitrary bin choices, boundary effects, and poor handling of sparse data, while KDE smooths the distribution directly from the observations. Simulations show KDE, including adaptive variants, yields estimates closer to the true underlying distribution and more stable in regions with few sources. The method extends naturally to weighted estimation that accounts for survey incompleteness at the level of individual sources rather than averaged bins. Application to LOFAR Deep Fields data confirms a known drop-and-bump feature at sub-mJy levels but indicates that a secondary bump near 10 mJy is likely produced by binning artifacts.

Core claim

KDE-based approaches yield more accurate and stable estimates of differential radio source counts than binned methods, particularly in the high-flux regime, and can incorporate continuous weights to address observational incompleteness.

What carries the argument

Kernel density estimation (KDE), which places a kernel function at each observed flux and sums the contributions to produce a continuous density estimate.
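
The construction described above can be sketched in a few lines. This is a minimal illustration, assuming a Gaussian kernel, a hand-picked bandwidth `h`, and made-up log-flux values; none of these choices is taken from the paper, and the function names are hypothetical rather than part of AstroKDE. Turning the normalized density into differential counts would additionally require scaling by the number of sources and the survey area, which is omitted here.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, used here as the kernel (an assumption)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kde(x_eval, samples, h):
    """Fixed-bandwidth KDE: one kernel per sample, summed and normalized."""
    n = len(samples)
    return sum(gaussian_kernel((x_eval - xi) / h) for xi in samples) / (n * h)


# Hypothetical log10(flux / mJy) values, for illustration only.
log_flux = [-0.9, -0.7, -0.65, -0.4, -0.1, 0.3, 0.8, 1.5]
h = 0.3  # bandwidth picked by hand, not by any selection rule

# Evaluate on a regular grid from -3 to 5 in steps of 0.01.
dx = 0.01
grid = [-3.0 + dx * i for i in range(801)]
density = [kde(x, log_flux, h) for x in grid]

# A valid density estimate should integrate to approximately one.
integral = sum(density) * dx
print(f"integral of the KDE over the grid: {integral:.4f}")
```

Unlike a histogram, the estimate is defined at every grid point and does not depend on where bin edges fall; the only tuning choice left is the bandwidth.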

If this is right

  • KDE removes the need to choose bin widths and reduces boundary effects at the bright end.
  • Adaptive KDE automatically adjusts smoothing where source density changes rapidly.
  • Weighted KDE applies incompleteness corrections continuously rather than in discrete bins.
  • Some secondary features seen only in binned analyses of real data disappear under KDE, indicating they are artifacts.
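
The adaptive and weighted variants in the list above can be illustrated together. The sketch below uses the classical adaptive construction described by Silverman (1986), in which a fixed-bandwidth pilot estimate sets a per-source bandwidth `h0 * (pilot / g) ** (-beta)` with `g` the geometric mean of the pilot values; `beta = 0.5` corresponds to Abramson's square-root law. Whether AstroKDE parameterizes its adaptive bandwidth in exactly this way is an assumption, and the flux values, completeness values, and function names are hypothetical.

```python
import math


def gauss(u):
    """Standard normal density used as the kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def fixed_kde(x, samples, h):
    """Fixed-bandwidth pilot estimate."""
    return sum(gauss((x - xi) / h) for xi in samples) / (len(samples) * h)


def local_bandwidths(samples, h0, beta):
    """Per-source bandwidths: broader kernels where the pilot density is low."""
    pilot = [fixed_kde(xi, samples, h0) for xi in samples]
    g = math.exp(sum(math.log(p) for p in pilot) / len(pilot))  # geometric mean
    return [h0 * (p / g) ** (-beta) for p in pilot]


def weighted_adaptive_counts(x, samples, bandwidths, weights):
    """Sum of weighted kernels; integrates to sum(weights), not to one."""
    return sum(w * gauss((x - xi) / hi) / hi
               for xi, hi, w in zip(samples, bandwidths, weights))


# Hypothetical log10(flux) values and per-source completeness (illustration only).
log_flux = [-0.9, -0.8, -0.75, -0.7, -0.6, -0.3, 0.2, 1.4]
completeness = [0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.0, 1.0]
weights = [1.0 / c for c in completeness]  # each source stands in for 1/c sources

bandwidths = local_bandwidths(log_flux, h0=0.3, beta=0.5)

dx = 0.01
grid = [-4.0 + dx * i for i in range(1001)]
total = sum(weighted_adaptive_counts(x, log_flux, bandwidths, weights)
            for x in grid) * dx
print(f"corrected source total: {total:.3f} (sum of weights: {sum(weights):.3f})")
```

Because each weight multiplies its own kernel, the incompleteness correction follows the completeness value of the individual source instead of a bin-averaged value.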

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same KDE machinery could be tested on source-count problems at other wavelengths where flux-limited samples are also incomplete.
  • If adopted, the approach would allow consistent re-analysis of legacy survey data without re-binning decisions.
  • The stability at high fluxes suggests KDE estimates could tighten constraints on the bright-end slope of luminosity functions.

Load-bearing premise

The simulated flux-limited samples derived from an input luminosity function model faithfully represent the statistical properties and incompleteness of real observational data.

What would settle it

Comparison of KDE-derived counts against source counts measured in a deeper, more complete survey whose completeness function is independently known.

Figures

Figures reproduced from arXiv: 2606.24117 by Chuanqi Li, Luozhenhan Liu, Wenjie Wang, Zunli Yuan.

Figure 1: Mean differential source counts derived from 200 simulated realizations, estimated using three methods: the traditional binned approach (orange), standard KDE (blue), and adaptive KDE (red). In each panel, the green dashed line shows the true source counts computed from Equation (14), and the shaded regions indicate the 1σ dispersion across the 200 samples. The flux limit adopted in each case is marked by …
Figure 2: Similar to …
Figure 3: Distributions of the discrepancy metric dn for different source count estimators, based on 200 simulated samples at each flux limit. Each panel corresponds to a different flux threshold.
In this section, we apply the KDE-based estimators to real observational data from the LOFAR Two-Metre Sky Survey (LoTSS) Deep Fields presented by Mandal et al. (2021). The sample consists of radio sources detected at 150 M…
Figure 4: Differential source counts for the three LoTSS Deep Fields (Lockman Hole, Boötes, and ELAIS-N1) obtained with the traditional binned estimator. Black filled squares show the results of Mandal et al. (2021), and red filled circles show our own estimates. The binning scheme follows Mandal et al. (2021) with slight shifts in several bin edges, illustrating the sensitivity of histogram-based estimators to bin…
Figure 5: Similar to …
Figure 6: Differential source counts for the three LoTSS Deep Fields, estimated using the adaptive KDE method (solid red lines). The pale-red shaded region indicates the 3σ uncertainty band for the KDE estimate. For comparison, the binned results from Mandal et al. (2021) are shown as black squares
Figure 7: Posterior distributions of the adaptive KDE parameters h0 and β, derived with the routine AstroKDE. Panels from left to right show the three LoTSS Deep Fields: Lockman Hole, Boötes, and ELAIS-N1.
4. DISCUSSION. The results of our analysis demonstrate that KDE offers a flexible and statistically robust alternative to the traditional binned approach for estimating radio source counts. This advantage becomes …
Original abstract

Radio source counts provide a fundamental census of cosmic radio emission, yet their estimation is usually based on coarse histograms that suffer from bin-choice bias, boundary effects, and survey incompleteness. We apply and rigorously evaluate kernel density estimation (KDE) as a anonparametric alternative to the conventional binned method for estimating differential radio source counts. Using simulated flux-limited samples derived from an input luminosity function model, we compare the performance of standard KDE, adaptive KDE, and traditional binning methods. Our results show that KDE-based approaches yield more accurate and stable estimates, particularly in the high-flux regime where data are sparse and conventional methods struggle. We also apply the adaptive KDE method to real observational data from the LOFAR Two-Metre Sky Survey Deep Fields. Our analysis robustly confirms the pronounced ``drop and bump'' feature at sub-mJy flux densities, but also reveals that a secondary, modest bump seen in the binned data at $\sim 10$ mJy is likely a binning artifact. We also demonstrate the flexibility of KDE in addressing observational incompleteness through weighted estimation, which applies weights continuously at the level of individual sources rather than averaging them in discrete bins. These strengths make KDE a powerful tool for source-count analyses in current and future radio surveys and, more broadly, in analogous studies at other wavelengths. All computations in this study are implemented with \texttt{AstroKDE}, a Python package we have developed for astronomical applications.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper claims that kernel density estimation (KDE), including adaptive and weighted variants, yields more accurate and stable estimates of differential radio source counts than conventional binned histograms, especially in the sparse high-flux regime. This is demonstrated by recovering known counts from flux-limited samples simulated from an input luminosity function model, with application to LOFAR Two-Metre Sky Survey Deep Fields data confirming the sub-mJy drop-and-bump feature while attributing a ~10 mJy secondary bump to binning artifacts. The work also shows KDE's flexibility for handling incompleteness via per-source weights and provides the AstroKDE Python package for implementation.

Significance. If the performance gains hold under broader validation, the nonparametric KDE approach could meaningfully reduce bin-choice bias and boundary effects in radio source count analyses, benefiting studies of cosmic radio emission and analogous problems at other wavelengths. The provision of the AstroKDE package and emphasis on reproducible computations are strengths that support wider adoption.

major comments (3)
  1. [Simulation setup and results] Simulation and validation section: The accuracy claim rests on recovering the input luminosity function from samples generated under that same model (including its selection and incompleteness prescription). This tests internal consistency but does not directly establish superiority for real data whose true distribution may deviate; the manuscript should either test multiple independent input models or provide a quantitative sensitivity analysis to model choice.
  2. [Application to real data] LOFAR application and results: The assertion that the ~10 mJy bump is a binning artifact requires explicit uncertainty quantification (e.g., bootstrap or analytic KDE variance) on the adaptive KDE estimate to demonstrate that the feature is statistically insignificant, rather than relying on visual comparison alone.
  3. [Comparison methodology] Methods: The paper states KDE approaches are 'more accurate' but does not specify the exact error metric(s) (e.g., integrated squared error, Kolmogorov-Smirnov statistic, or binned χ2) or report numerical values/tables comparing KDE versus binning across flux regimes; without these, the quantitative support for the central performance claim cannot be evaluated.
minor comments (2)
  1. [Abstract] Abstract: 'anonparametric' is a typographical error and should read 'a nonparametric'.
  2. [Methods] The manuscript should include a brief description of the kernel function and bandwidth selection method (e.g., cross-validation or rule-of-thumb) used in the AstroKDE implementation for reproducibility.
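
The bootstrap check requested in major comment 2 can be sketched as follows. This is a minimal illustration under stated assumptions: a fixed-bandwidth Gaussian KDE stands in for the adaptive estimator for brevity, the log-flux values are made up, and the band is read off as the 16th to 84th percentile range of the resampled curves. The function names are hypothetical and are not AstroKDE calls.

```python
import math
import random


def kde_on_grid(samples, grid, h):
    """Fixed-bandwidth Gaussian KDE evaluated at every grid point."""
    norm = len(samples) * h * math.sqrt(2.0 * math.pi)
    return [sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in samples) / norm
            for x in grid]


def bootstrap_band(samples, grid, h, n_boot=200, seed=0):
    """Pointwise 16th and 84th percentile curves from resampling with replacement."""
    rng = random.Random(seed)
    n = len(samples)
    curves = []
    for _ in range(n_boot):
        resample = [samples[rng.randrange(n)] for _ in range(n)]
        curves.append(kde_on_grid(resample, grid, h))
    lower, upper = [], []
    for j in range(len(grid)):
        column = sorted(curve[j] for curve in curves)
        lower.append(column[int(0.16 * (n_boot - 1))])
        upper.append(column[int(0.84 * (n_boot - 1))])
    return lower, upper


# Hypothetical log10(flux) values, for illustration only.
log_flux = [-0.9, -0.8, -0.75, -0.7, -0.6, -0.3, 0.2, 0.9, 1.0, 1.4]
grid = [-2.0 + 0.1 * i for i in range(51)]  # -2 to 3

estimate = kde_on_grid(log_flux, grid, h=0.3)
lower, upper = bootstrap_band(log_flux, grid, h=0.3)

for x, lo, est, hi in list(zip(grid, lower, estimate, upper))[::10]:
    print(f"x = {x:+.1f}   band = [{lo:.3f}, {hi:.3f}]   estimate = {est:.3f}")
```

A feature such as the ~10 mJy bump would be judged against the width of this band at the relevant flux: if the band comfortably contains a smooth curve without the bump, the feature is not statistically required by the data.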

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their constructive and detailed review, which has identified several areas where the manuscript can be strengthened. We address each major comment below and will incorporate the suggested revisions to improve the clarity and robustness of our claims.

Point-by-point responses
  1. Referee: Simulation and validation section: The accuracy claim rests on recovering the input luminosity function from samples generated under that same model (including its selection and incompleteness prescription). This tests internal consistency but does not directly establish superiority for real data whose true distribution may deviate; the manuscript should either test multiple independent input models or provide a quantitative sensitivity analysis to model choice.

    Authors: We agree that the current simulation setup demonstrates internal consistency for a single realistic input model but does not fully address sensitivity to model assumptions. In the revised manuscript, we will add a quantitative sensitivity analysis by perturbing key parameters of the luminosity function (e.g., faint-end slope and normalization) across a range of plausible values, recomputing the recovered source counts with both KDE and binned methods, and reporting the resulting variations in accuracy metrics. This will be presented in an expanded simulation section with additional figures and tables. revision: yes

  2. Referee: LOFAR application and results: The assertion that the ~10 mJy bump is a binning artifact requires explicit uncertainty quantification (e.g., bootstrap or analytic KDE variance) on the adaptive KDE estimate to demonstrate that the feature is statistically insignificant, rather than relying on visual comparison alone.

    Authors: We concur that visual comparison is insufficient to establish statistical insignificance. We will revise the LOFAR results section to include bootstrap resampling (with 1000 resamples) of the adaptive KDE estimates, providing uncertainty bands on the differential source counts. This will demonstrate quantitatively that the ~10 mJy feature lies within the 1-sigma uncertainty envelope while the sub-mJy drop-and-bump remains significant, with the new analysis added to the text and figures. revision: yes

  3. Referee: Methods: The paper states KDE approaches are 'more accurate' but does not specify the exact error metric(s) (e.g., integrated squared error, Kolmogorov-Smirnov statistic, or binned χ2) or report numerical values/tables comparing KDE versus binning across flux regimes; without these, the quantitative support for the central performance claim cannot be evaluated.

    Authors: The referee correctly notes the lack of explicit quantitative metrics. We will update the methods and results sections to define the error metrics used (integrated squared error for differential counts and Kolmogorov-Smirnov statistic for cumulative distributions), and add a new table summarizing numerical values of these metrics for KDE variants versus binning across low-, mid-, and high-flux regimes. This will provide the requested quantitative support for the performance claims. revision: yes
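
The two metrics named in the third response, integrated squared error for the differential estimate and the Kolmogorov-Smirnov statistic for the cumulative distribution, can be sketched on a toy problem. The example below is an assumption-laden illustration: the "truth" is a standard normal density rather than a source-count model, and the sample size, bandwidth, and bin layout are arbitrary choices, not values from the paper.

```python
import math
import random


def normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def kde(x, samples, h):
    """Fixed-bandwidth Gaussian KDE at a single point."""
    return sum(normal_pdf((x - xi) / h) for xi in samples) / (len(samples) * h)


def histogram_density(x, samples, lo, width, n_bins):
    """Histogram density estimate at x for equal-width bins starting at lo."""
    k = int((x - lo) // width)
    if k < 0 or k >= n_bins:
        return 0.0
    left, right = lo + k * width, lo + (k + 1) * width
    count = sum(1 for xi in samples if left <= xi < right)
    return count / (len(samples) * width)


def integrated_squared_error(estimate, truth, dx):
    """Riemann-sum approximation of the integral of (estimate - truth)^2."""
    return sum((e - t) ** 2 for e, t in zip(estimate, truth)) * dx


def ks_statistic(samples, cdf):
    """Largest gap between the empirical CDF and a reference CDF."""
    xs = sorted(samples)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


rng = random.Random(42)
samples = [rng.gauss(0.0, 1.0) for _ in range(200)]  # toy data, known truth

dx = 0.05
grid = [-5.0 + dx * i for i in range(201)]
truth = [normal_pdf(x) for x in grid]
kde_est = [kde(x, samples, 0.4) for x in grid]
hist_est = [histogram_density(x, samples, -5.0, 1.0, 10) for x in grid]

ise_kde = integrated_squared_error(kde_est, truth, dx)
ise_hist = integrated_squared_error(hist_est, truth, dx)
d_ks = ks_statistic(samples, normal_cdf)
print(f"ISE (KDE) = {ise_kde:.5f}, ISE (histogram) = {ise_hist:.5f}, KS D = {d_ks:.4f}")
```

Reporting such numbers separately for low-, mid-, and high-flux ranges would only require restricting the grid (for the ISE) or the sample (for the KS statistic) to the range of interest.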

Circularity Check

0 steps flagged

No circularity: validation against known input model is independent benchmark

Full rationale

The paper generates flux-limited samples from an explicit input luminosity function model, then measures how well KDE recovers the known differential counts versus binning. This is a standard external-truth test (the model supplies the ground truth), not a fit-then-predict or self-definition loop. Real-data application to LOFAR is presented separately without claiming quantitative accuracy from the simulations. No self-citations, uniqueness theorems, or ansatzes are invoked to force the result. The assumption that the simulation faithfully captures real incompleteness is a modeling limitation, not a circular reduction of the claimed derivation.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The paper's claims depend on the standard properties of KDE and the assumption that the simulation model captures real-world survey characteristics.

axioms (1)
  • [standard math] Kernel density estimation accurately recovers the underlying density from samples under appropriate kernel and bandwidth choices.
    This is the foundational assumption of the KDE method used throughout the paper.

pith-pipeline@v0.9.1-grok · 5796 in / 1295 out tokens · 32687 ms · 2026-06-25T23:00:28.750013+00:00 · methodology

Reference graph

Works this paper leans on

53 extracted references · 42 canonical work pages

  1. Abramson, I. 1982, Annals of Statistics, 10, 1217
     Biggs, A. D., & Ivison, R. J. 2006, MNRAS, 371, 963, doi: 10.1111/j.1365-2966.2006.10730.x
  2. Biroli, G., & Mézard, M. 2024, Kernel Density Estimators in Large Dimensions. https://arxiv.org/abs/2408.05807
  3. Bonzini, M., Mainieri, V., Padovani, P., et al. 2012, ApJS, 203, 15, doi: 10.1088/0067-0049/203/1/15
  4. Borys, C., Scott, D., Chapman, S., et al. 2004, MNRAS, 355, 485, doi: 10.1111/j.1365-2966.2004.08335.x
  5. Botev, Z. I., Grotowski, J. F., & Kroese, D. P. 2010, The Annals of Statistics, 38, doi: 10.1214/10-aos799
  6. Breiman, L., Meisel, W. S., & Purcell, E. A. 1977, Technometrics, 19, 135, doi: 10.2307/1267744
  7. Cochrane, R. K., Kondapally, R., Best, P. N., et al. 2023, MNRAS, 523, 6082, doi: 10.1093/mnras/stad1602
  8. Condon, J. J. 1974, ApJ, 188, 279, doi: 10.1086/152714
  9. Condon, J. J., Cotton, W. D., Fomalont, E. B., et al. 2012, ApJ, 758, 23, doi: 10.1088/0004-637X/758/1/23
  10. Davidson, W. 1967, Nature, 216, 1076, doi: 10.1038/2161076a0
  11. Davies, T. M., Marshall, J. C., & Hazelton, M. L. 2018, Statistics in Medicine, 37, 1191
  12. Eddington, Sir, A. S. 1940, MNRAS, 100, 354, doi: 10.1093/mnras/100.5.354
  13. Fadda, D., Slezak, E., & Bijaoui, A. 1998, A&AS, 127, 335, doi: 10.1051/aas:1998355
  14. Ferdosi, B. J., Buddelmeijer, H., Trager, S. C., Wilkinson, M. H. F., & Roerdink, J. B. T. M. 2011, A&A, 531, A114, doi: 10.1051/0004-6361/201116878
  15. Franzen, T. M. O., Jackson, C. A., Offringa, A. R., et al. 2016, Monthly Notices of the Royal Astronomical Society, 459, 3314–3325, doi: 10.1093/mnras/stw823
  16. Garn, T., Green, D. A., Riley, J. M., & Alexander, P. 2008, MNRAS, 383, 75, doi: 10.1111/j.1365-2966.2007.12562.x
  17. Gasser, T., & Müller, H. G. 1979, in Lectures Notes in
  18. Gramacki, A. 2018, Studies in Big Data, Vol. 37, Nonparametric Kernel Density Estimation and Its Computational Aspects (Springer), doi: 10.1007/978-3-319-71688-6
  19. Gully, H., Hatch, N., Ahad, S. L., et al. 2025, MNRAS, 539, 3058, doi: 10.1093/mnras/staf635
  20. Hall, P., & Park, B. U. 2002, Annals of Statistics, 30, 1460
  21. Hatfield, P. W., Lindsay, S. N., Jarvis, M. J., et al. 2016, MNRAS, 459, 2618, doi: 10.1093/mnras/stw769
  22. Huynh, M. T., Jackson, C. A., Norris, R. P., & Prandoni, I. 2005, AJ, 130, 1373, doi: 10.1086/432873
  23. Jarvis, M., Taylor, R., Agudo, I., et al. 2016, in MeerKAT Science: On the Pathway to the SKA, 6, doi: 10.22323/1.277.0006
  24. Jones, M. C. 1993, Statistics and Computing, 3, 135, doi: 10.1007/BF00147776
  25. Kapinska, A. D. 2020, in American Astronomical Society Meeting Abstracts, Vol. 236, American Astronomical Society Meeting Abstracts #236, 322.06
  26. Kellermann, K. I., Condon, J. J., Kimball, A. E., Perley, R. A., & Ivezić, Ž. 2016, ApJ, 831, 168, doi: 10.3847/0004-637X/831/2/168
  27. Longair, M. S. 1966, MNRAS, 133, 421, doi: 10.1093/mnras/133.4.421
     Longair, M. S. 2011, High Energy Astrophysics
  28. Maleca, P., & Schienle, M. 2014, Computational Statistics & Data Analysis, 72, 57
  29. Mandal, S., Prandoni, I., Hardcastle, M. J., et al. 2021, A&A, 648, A5, doi: 10.1051/0004-6361/202039998
  30. Marron, J. S., & Ruppert, D. 1994, Journal of the Royal Statistical Society Series B, 56, 653
  31. Marshall, J. C., & Hazelton, M. L. 2010, Journal of Multivariate Analysis, 101, 949
     Martín-Navarro, I., & Mezcua, M. 2018, The Astrophysical Journal Letters, 855, L20, doi: 10.3847/2041-8213/aab103
  32. Massardi, M., Bonaldi, A., Negrello, M., et al. 2010, MNRAS, 404, 532, doi: 10.1111/j.1365-2966.2010.16305.x
  33. Mauch, T., & Sadler, E. M. 2007, MNRAS, 375, 931, doi: 10.1111/j.1365-2966.2006.11353.x
  34. Mokkadem, A., Pelletier, M., & Thiam, B. 2006, Large and moderate deviations principles for recursive kernel estimators of a multivariate density and its partial derivatives. https://arxiv.org/abs/math/0601429
  35. Padovani, P. 2016, A&A Rv, 24, 13, doi: 10.1007/s00159-016-0098-6
  36. Padovani, P., Bonzini, M., Kellermann, K. I., et al. 2015, MNRAS, 452, 1263, doi: 10.1093/mnras/stv1375
  37. Prandoni, I., & Seymour, N. 2015, in Advancing Astrophysics with the Square Kilometre Array (AASKA14), 67, doi: 10.22323/1.215.0067
  38. Sain, S. R. 2002, Computational Statistics & Data Analysis, 39, 165
  39. Scargle, J. D., Norris, J. J., Jackson, B., & Chiang, J. 2013, The Astrophysical Journal, 764, 167, doi: 10.1088/0004-637X/764/2/167
  40. Scheuer, P. A. G. 1957, Proceedings of the Cambridge Philosophical Society, 53, 764, doi: 10.1017/S0305004100032825
  41. Scott, D. W. 1979, Biometrika, 66, 605
  42. Scott, D. W. 2015, Multivariate Density Estimation:
  43. Shimwell, T. W., Tasse, C., Hardcastle, M. J., et al. 2019, A&A, 622, A1, doi: 10.1051/0004-6361/201833559
  44. Shimwell, T. W., Hardcastle, M. J., Tasse, C., et al. 2022, A&A, 659, A1, doi: 10.1051/0004-6361/202142484
  45. Silverman, B. W. 1986, Density estimation for statistics and data analysis
     Smolčić, V., Zamorani, G., Schinnerer, E., Vla-Cosmos, & Cosmos Collaborations. 2009a, in Astronomical Society of the Pacific Conference Series, Vol. 408, The Starburst-AGN Connection, ed. W. Wang, Z. Yang, Z. Luo, & Z. Chen, 116
     Smolčić, V., Schinnerer, E., Zamorani, G., e...
  46. Vernstrom, T., Scott, D., Wall, J. V., et al. 2014, MNRAS, 440, 2791, doi: 10.1093/mnras/stu470
  47. Wang, W., Yuan, Z., Yu, H., & Mao, J. 2024, A&A, 683, A174, doi: 10.1051/0004-6361/202347746
  48. Eales, S. A. 2001, MNRAS, 322, 536, doi: 10.1046/j.1365-8711.2001.04101.x
  49. Yi, G. Y., Delaigle, A., & Gustafson, P., eds. 2021, Handbook of Measurement Error Models, 1st edn. (Chapman and Hall/CRC), doi: 10.1201/9781315101279
  50. Yuan, Z., Jarvis, M. J., & Wang, J. 2020, ApJS, 248, 1, doi: 10.3847/1538-4365/ab855b
  51. Yuan, Z., & Wang, J. 2013, Ap&SS, 345, 305, doi: 10.1007/s10509-013-1402-9
  52. Yuan, Z., Wang, J., Zhou, M., Qin, L., & Mao, J. 2017, ApJ, 846, 78, doi: 10.3847/1538-4357/aa8463
  53. Yuan, Z., Zhang, X., Wang, J., Cheng, X., & Wang, W. 2022, ApJS, 260, 10, doi: 10.3847/1538-4365/ac596a