pith. machine review for the scientific record.

arxiv: 2604.27721 · v1 · submitted 2026-04-30 · ⚛️ physics.ao-ph · cs.CV · physics.data-an · physics.space-ph

Recognition: unknown

Physically-Informed Fuzzy Clustering of Vertical Sounding Ionograms


Pith reviewed 2026-05-07 04:52 UTC · model grok-4.3

classification ⚛️ physics.ao-ph · cs.CV · physics.data-an · physics.space-ph
keywords ionogram clustering · fuzzy clustering · expectation-maximization · parabolic ionospheric layer · Bayesian information criterion · vertical sounding · automatic track separation · ionosphere

The pith

A physically-informed fuzzy clustering method fits six-parameter parabolic ionospheric layer models to automatically separate and count tracks on vertical sounding ionograms even when their number is unknown.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops an automatic method for separating vertical sounding ionograms into tracks and determining their optimal number, including in disturbed ionospheric conditions where the count is not known in advance. It models each track as a parametric curve with six parameters: three standard ones for the parabolic layer (critical frequency, lower boundary, half-width) and three additional ones for underlying layer effects. An expectation-maximization algorithm clusters points by their distances to these curves. Parameters for each track are optimized with the Sequential Least Squares Quadratic Programming algorithm, and the optimal number of tracks is selected by sequentially adding tracks and minimizing a modified Bayesian information criterion. Preprocessing includes automatic adaptive noise filtering via a combination of DBSCAN and Gaussian Mixture algorithms, plus approximate removal of extraordinary-mode points to support ionosondes without hardware separation.

Core claim

The central claim is that representing ionogram tracks as six-parameter curves close to the parabolic ionospheric layer model, with parametrically specified distributions of point-to-curve distances, combined with expectation-maximization clustering and modified Bayesian information criterion minimization, allows reliable automatic separation into tracks and determination of their unknown number. The width of each track is treated as an unknown constant determined during fitting.
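
For orientation, the three standard parameters enter through the textbook virtual-height expression for a single parabolic layer at vertical incidence, with the magnetic field and collisions neglected. The sketch below covers only that three-parameter core. The paper's three additional parameters are not specified on this page and are deliberately left out, and the function name and units are illustrative assumptions.

```python
import math


def virtual_height(f, fc, h0, ym):
    """Virtual height (km) of a reflection from a single parabolic layer.

    f  : sounding frequency (MHz), must satisfy 0 < f < fc
    fc : critical frequency of the layer (MHz)
    h0 : lower boundary of the layer (km)
    ym : half-width (semi-thickness) of the layer (km)

    Textbook three-parameter form only; the paper's extra three
    parameters are NOT modelled here.
    """
    if not 0.0 < f < fc:
        raise ValueError("frequency must lie strictly between 0 and fc")
    x = f / fc
    return h0 + 0.5 * ym * x * math.log((1.0 + x) / (1.0 - x))


# The trace rises slowly at low frequency and steeply as f approaches fc.
for freq in (1.0, 3.0, 5.0, 5.9):
    print(freq, round(virtual_height(freq, fc=6.0, h0=200.0, ym=80.0), 2))
```

The logarithmic term diverges as f approaches fc, which is what produces the characteristic cusp of a track near its critical frequency.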

What carries the argument

Expectation-maximization fuzzy clustering that softly assigns points to tracks according to parametric distributions of their distances to six-parameter curves modeled on the parabolic ionospheric layer, with model selection performed by minimizing a modified Bayesian information criterion after sequential addition of tracks.
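
The soft assignment at the heart of such a scheme can be illustrated with a minimal E-step. This is a sketch under stated assumptions, not the paper's implementation: the distance density is taken to be a zero-mean Gaussian with one constant width per track, and the names `e_step`, `residuals`, `sigmas` and `weights` are hypothetical.

```python
import math


def e_step(residuals, sigmas, weights):
    """Compute fuzzy memberships of points in tracks.

    residuals[i][k] : distance from point i to the curve of track k
    sigmas[k]       : constant width of track k (assumed Gaussian spread)
    weights[k]      : mixing proportion of track k
    Returns resp[i][k], each row summing to 1.
    """
    norm = math.sqrt(2.0 * math.pi)
    resp = []
    for row in residuals:
        dens = [w * math.exp(-0.5 * (d / s) ** 2) / (s * norm)
                for d, s, w in zip(row, sigmas, weights)]
        total = sum(dens)
        if total == 0.0:
            # Point is far from every curve: fall back to uniform membership.
            resp.append([1.0 / len(row)] * len(row))
        else:
            resp.append([p / total for p in dens])
    return resp


# Three points, two tracks: near track 0, near track 1, and equidistant.
memberships = e_step([[1.0, 40.0], [35.0, 2.0], [10.0, 10.0]],
                     sigmas=[5.0, 5.0], weights=[0.5, 0.5])
for row in memberships:
    print([round(r, 3) for r in row])
```

In a full EM loop these memberships would weight each point's contribution when the curve parameters and widths of every track are refitted in the M-step.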

If this is right

  • The approach extends automatic ionogram processing to cases where the number of tracks cannot be assumed known beforehand.
  • Each identified track receives six physically motivated parameters that can be used directly for further ionospheric interpretation.
  • Adaptive noise filtering based on DBSCAN combined with Gaussian Mixture models precedes clustering to reduce interference from non-track points.
  • Approximate removal of extraordinary-mode points enables application to data from ionosondes lacking hardware separation of ordinary and extraordinary components.
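
The page does not say how DBSCAN and the Gaussian mixture are combined, so the following is only one plausible arrangement, offered as a sketch. It assumes that a two-component Gaussian mixture on log k-th-neighbour distances is used to choose the DBSCAN radius adaptively; the function name `adaptive_noise_filter`, the choice `k=5` and the synthetic data are all hypothetical.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.mixture import GaussianMixture
from sklearn.neighbors import NearestNeighbors


def adaptive_noise_filter(points, k=5):
    """Return a boolean mask that is True for points kept as signal.

    Hypothetical recipe (not taken from the paper): a two-component
    Gaussian mixture on log k-th-neighbour distances picks the DBSCAN
    radius, and DBSCAN then drops points it labels as noise.
    """
    # Column 0 of the distance matrix is the point itself (distance 0).
    dists, _ = NearestNeighbors(n_neighbors=k + 1).fit(points).kneighbors(points)
    log_d = np.log(np.maximum(dists[:, k], 1e-12)).reshape(-1, 1)

    # Split log-distances into a dense (signal) and a sparse (noise) component.
    gmm = GaussianMixture(n_components=2, random_state=0).fit(log_d)
    dense = int(np.argmin(gmm.means_.ravel()))
    in_dense = gmm.predict(log_d) == dense

    # Radius: largest k-th-neighbour distance among the dense points.
    eps = float(np.exp(log_d[in_dense].max()))
    labels = DBSCAN(eps=eps, min_samples=k).fit_predict(points)
    return labels != -1


# Synthetic check: a dense straight "track" plus sparse uniform background.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 200)
track = np.column_stack([x, 0.5 * x])
noise = np.column_stack([rng.uniform(0.0, 10.0, 60),
                         rng.uniform(-20.0, 20.0, 60)])
points = np.vstack([track, noise])
keep = adaptive_noise_filter(points)
print("track kept:", keep[:200].mean(), "noise kept:", keep[200:].mean())
```

The point of letting the mixture set the radius is that no density threshold has to be tuned by hand for each ionogram, which is one way to read "adaptive" in the description above.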

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The parametric physical-model approach could be extended to cluster other geophysical signals that follow known layered structures, such as seismic or atmospheric profiles.
  • Real-time implementation on networks of ionosondes might support continuous automated monitoring of ionospheric layers without requiring prior knowledge of track counts.
  • Validation against ground-truth datasets from co-located instruments like incoherent scatter radars would test how well the three additional parameters capture underlying layer effects.
  • The modified Bayesian information criterion selection procedure could serve as a template for determining the number of components in other expectation-maximization applications involving physically constrained curves.

Load-bearing premise

That tracks in real ionograms remain sufficiently close to the six-parameter parametric curves based on the parabolic ionospheric layer model for the expectation-maximization clustering and modified Bayesian information criterion to separate tracks and determine their number reliably in disturbed conditions.

What would settle it

On a collection of disturbed ionograms, if the number of tracks and their fitted parameters produced by the method deviate substantially from counts and traces obtained by expert manual interpretation or by independent simultaneous measurements, the claim of reliable automatic determination in unknown-count cases would be refuted.

Figures

Figures reproduced from arXiv: 2604.27721 by Oleg I. Berngardt, Sergey N. Ponomarchuk.

Figure 1: Examples of ionograms obtained with the ISTP SB RAS vertical …
Figure 2: Filtering algorithm. A) the original thresholded ionogram; B) the …
Figure 3: Examples of tracks with different values of parameters A, B, C. The …
Figure 4: Comparison of various numerical methods for searching the model …
Figure 5: An example of searching for optimal clustering for a filtered iono…
Figure 6: The examples of clustered ionograms. A-L) source ionograms, the …
Figure 7: A) Distribution of algorithm execution times over different ionograms; …
Figure 8: Estimation of relative quantity and types of different layers on the …
Original abstract

This paper presents a physically-informed fuzzy clustering of vertical sounding ionograms for automatically separating the ionogram into tracks suitable for further interpretation and determining their optimal number. The model is designed for use not only in conditions where the number of tracks is known, but also in disturbed ionospheric conditions where the number of tracks is not known in advance. The method is based on an expectation-maximization algorithm, used for clustering, and on parametrically specified distributions of distances from points to parametrically specified curves. The curves used as track models are close to model tracks in the parabolic ionospheric layer model. The resulting model of each track has six parameters: three standard ones (the critical frequency, the lower boundary of the layer, and its half-width), and three additional ones to take into account possible underlying layer effects. By sequentially increasing the number of tracks and optimizing their parameters, the model finds the optimal number of tracks on the ionogram by minimizing the modified Bayesian information criterion. The Sequential Least Squares Quadratic Programming algorithm is used to find the parameters of a single track. The width of each single track is assumed to be an unknown constant found during the fitting process. To improve the quality of ionogram clustering, automatic adaptive noise filtering is performed before clustering. This filtering is based on a combination of the DBSCAN and Gaussian Mixture algorithms. Also, to improve clustering quality on an ionosonde without hardware separation of the ordinary and extraordinary components, a preliminary approximate removal of points belonging to the extraordinary mode is performed.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. This paper introduces a physically-informed fuzzy clustering method for vertical sounding ionograms. The approach uses an expectation-maximization (EM) algorithm to assign ionogram points to tracks, where each track is modeled by a six-parameter curve derived from the parabolic ionospheric layer model (critical frequency, lower boundary, half-width, and three additional parameters for underlying layer effects). Track width is treated as an unknown constant. The optimal number of tracks is selected by incrementally increasing the number of tracks, optimizing parameters via Sequential Least Squares Quadratic Programming, and minimizing a modified Bayesian information criterion. Preprocessing involves adaptive noise filtering with DBSCAN and Gaussian Mixture models, and approximate removal of extraordinary mode points to improve clustering, especially for ionosondes without hardware separation of modes. The method aims to handle both known and unknown numbers of tracks, including in disturbed ionospheric conditions.

Significance. The integration of domain-specific parametric models from ionospheric physics into the clustering procedure is a clear strength, offering a principled alternative to purely statistical methods. If the six-parameter curves prove sufficiently accurate and the modified BIC selection robust, the technique could enable reliable automated track separation for ionospheric monitoring and research, particularly under variable conditions. The preprocessing steps address real-world data challenges. However, without any reported quantitative results, the practical significance remains difficult to assess.

major comments (3)
  1. [Results] The manuscript supplies no quantitative validation, such as accuracy against manual expert labeling, error rates on test ionograms, or comparisons to existing methods or benchmarks. This is a load-bearing issue for the central claim that the method can automatically separate tracks and determine their optimal number in disturbed conditions where the number is unknown.
  2. [§2.1 (Track Model)] The assumption that ionogram tracks are sufficiently close to the six-parameter parabolic curves (plus constant width) for the distance-based likelihood in EM to be well-specified is not tested. In disturbed conditions, tracks may exhibit spread, tilts, or multi-valued segments, potentially leading to multimodal likelihoods and unreliable BIC-based model selection. No approximation error bounds or ablation studies isolating model mismatch are provided.
  3. [§2.3 (Model Selection)] The modification to the Bayesian information criterion is introduced without a detailed derivation or justification showing how it adapts the standard BIC for this parametric curve-fitting and sequential selection procedure. It is unclear whether the penalty term properly accounts for the six parameters per track and the optimization via SQP.
minor comments (2)
  1. [Abstract] The abstract refers to 'fuzzy clustering' but the description is of EM-based assignment; clarify whether soft assignments are used or whether the procedure is effectively hard clustering.
  2. [Notation] Define the distance distribution parameters and the modified BIC formula explicitly with equations for reproducibility.
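
For reference, the classical forms that such explicit equations would presumably start from can be written down. This is a sketch under assumptions: the Gaussian form of the distance density, the symbols $d_{ik}$, $\pi_k$, $\sigma_k$ and $m_K$, and the count of seven free parameters per track (six curve parameters plus the width) are illustrative choices. The paper's actual modification of the BIC is not stated on this page and is not reproduced here.

```latex
% Mixture likelihood for N points and K tracks.
% d_{ik}: distance from point i to the curve of track k (Gaussian density assumed).
\mathcal{L}_K = \prod_{i=1}^{N} \sum_{k=1}^{K} \pi_k \,
  \frac{1}{\sqrt{2\pi}\,\sigma_k}
  \exp\!\left(-\frac{d_{ik}^{2}}{2\sigma_k^{2}}\right),
\qquad \sum_{k=1}^{K} \pi_k = 1 .

% Classical (unmodified) BIC, with m_K free parameters;
% m_K = 7K if only the curve and width parameters are counted.
\mathrm{BIC}(K) = m_K \ln N - 2 \ln \hat{\mathcal{L}}_K .
```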

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive review and for recognizing the value of integrating physical ionospheric models into the clustering framework. We agree that the points raised identify areas where the manuscript can be strengthened, particularly regarding empirical validation and technical clarifications. Below we respond to each major comment and indicate the revisions we will make.

Point-by-point responses
  1. Referee: [Results] The manuscript supplies no quantitative validation, such as accuracy against manual expert labeling, error rates on test ionograms, or comparisons to existing methods or benchmarks. This is a load-bearing issue for the central claim that the method can automatically separate tracks and determine their optimal number in disturbed conditions where the number is unknown.

    Authors: We acknowledge that the current manuscript is primarily methodological and does not report quantitative performance metrics. This limits the ability to evaluate the practical utility of the approach, especially under disturbed conditions. In the revised version we will add a dedicated validation section. This will include accuracy measures obtained by comparing the automatically extracted tracks against manual expert tracings on a collection of ionograms (both quiet and disturbed), reported error statistics, and direct comparisons against standard non-physically-informed baselines such as fuzzy c-means and DBSCAN. These additions will directly support the central claims.
    revision: yes

  2. Referee: [§2.1 (Track Model)] The assumption that ionogram tracks are sufficiently close to the six-parameter parabolic curves (plus constant width) for the distance-based likelihood in EM to be well-specified is not tested. In disturbed conditions, tracks may exhibit spread, tilts, or multi-valued segments, potentially leading to multimodal likelihoods and unreliable BIC-based model selection. No approximation error bounds or ablation studies isolating model mismatch are provided.

    Authors: The six-parameter model is an extension of the classical parabolic layer model with three additional parameters to capture underlying-layer effects, a formulation commonly used in ionospheric physics. We agree that the closeness assumption requires explicit examination, particularly when spread-F, tilts, or multi-valued segments appear. In the revision we will expand the discussion of model assumptions and limitations, supply representative examples that quantify the residual distances between observed points and fitted curves under both quiet and disturbed conditions, and include an ablation comparison between the full six-parameter model and its three-parameter parabolic core to isolate the contribution of the extra parameters.
    revision: partial

  3. Referee: [§2.3 (Model Selection)] The modification to the Bayesian information criterion is introduced without a detailed derivation or justification showing how it adapts the standard BIC for this parametric curve-fitting and sequential selection procedure. It is unclear whether the penalty term properly accounts for the six parameters per track and the optimization via SQP.

    Authors: We will revise §2.3 to provide a full derivation of the modified BIC. The penalty is constructed to reflect the sequential addition of tracks, each contributing six parameters plus the constant width, while recognizing that SQP optimization is used to fit each track. The revised text will derive the effective degrees of freedom, explain the adjustment relative to the classical BIC, and demonstrate that the penalty scales appropriately with the number of tracks and the optimization procedure.
    revision: yes
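
The sequential selection procedure discussed in this exchange can be sketched in a few lines. The sketch uses the classical BIC, since the paper's modification is not given here, and it assumes seven free parameters per track (six curve parameters plus the width, as the rebuttal describes) and a stop-at-first-increase rule. The function names and the example log-likelihood values are hypothetical.

```python
import math


def bic(log_likelihood, n_params, n_points):
    """Classical BIC; the paper's modified criterion is not reproduced."""
    return n_params * math.log(n_points) - 2.0 * log_likelihood


def select_track_count(log_likelihoods, n_points, params_per_track=7):
    """Pick the number of tracks by adding tracks until BIC stops improving.

    log_likelihoods[k-1] is the maximised log-likelihood of the k-track fit.
    Assumption: seven parameters per track (six curve parameters + width).
    """
    best_k, best_score = None, math.inf
    for k, ll in enumerate(log_likelihoods, start=1):
        score = bic(ll, k * params_per_track, n_points)
        if score >= best_score:
            break  # the extra track no longer pays for its parameters
        best_k, best_score = k, score
    return best_k, best_score


# Illustrative numbers only: the third track barely improves the fit.
k_opt, score = select_track_count([-1500.0, -1100.0, -1090.0, -1085.0],
                                  n_points=500)
print(k_opt, round(score, 1))
```

Stopping at the first increase keeps the search cheap but can miss a later minimum, which is one reason the referee's request for a derivation of the penalty matters.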

Circularity Check

0 steps flagged

No significant circularity in EM-based parametric clustering with BIC model selection

full rationale

The derivation applies standard expectation-maximization to assign points to tracks whose shapes are defined by a six-parameter family drawn from the established parabolic ionospheric layer model (critical frequency, lower boundary, half-width plus three underlying-layer terms). Track width is treated as a fitted constant, parameters are optimized by SLSQP, and the number of tracks is chosen by sequential minimization of a modified BIC. This is ordinary model-based clustering and information-criterion order selection; the functional form of the curves is imported from prior ionospheric literature rather than defined in terms of the fitted outputs or the BIC value itself. Preprocessing (DBSCAN/GMM noise filter and extraordinary-mode removal) is independent of the clustering result. No equation or claim reduces a derived quantity to a quantity defined by its own fit, and no self-citation is load-bearing for the central procedure.
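
A single-track fit of the kind described can be illustrated with SciPy's SLSQP solver. This is a minimal sketch, not the paper's fitting code: it fits only the three standard parabolic-layer parameters to noise-free synthetic points with a mean-squared vertical residual, and the bounds, starting point and function names are assumptions chosen for the example.

```python
import numpy as np
from scipy.optimize import minimize


def virtual_height(f, fc, h0, ym):
    """Textbook parabolic-layer virtual height (three-parameter core only)."""
    x = f / fc
    return h0 + 0.5 * ym * x * np.log((1.0 + x) / (1.0 - x))


def fit_single_track(f, h, x0, bounds):
    """Fit (fc, h0, ym) by minimising the mean squared height residual."""
    def loss(p):
        return float(np.mean((virtual_height(f, *p) - h) ** 2))

    res = minimize(loss, x0, method="SLSQP", bounds=bounds)
    return res, loss


# Synthetic track generated from known parameters fc=6 MHz, h0=200 km, ym=80 km.
f = np.linspace(2.0, 5.5, 40)
h = virtual_height(f, 6.0, 200.0, 80.0)

x0 = np.array([7.0, 180.0, 100.0])
# The lower bound on fc stays above max(f) so the logarithm remains defined.
bounds = [(5.6, 12.0), (80.0, 400.0), (10.0, 300.0)]

result, loss = fit_single_track(f, h, x0, bounds)
print("start loss:", loss(x0))
print("fitted (fc, h0, ym):", result.x, "loss:", result.fun)
```

Bounds are the natural place to encode physical constraints, for example that the critical frequency must exceed the highest frequency assigned to the track.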

Axiom & Free-Parameter Ledger

3 free parameters · 2 axioms · 0 invented entities

The approach rests on the standard parabolic layer approximation from ionospheric physics and introduces several fitted parameters per track plus an unspecified modification to the BIC; no new physical entities are postulated.

free parameters (3)
  • critical frequency, lower boundary, half-width (three standard parameters per track)
    Core parameters of the parabolic layer model that are optimized for each track during fitting.
  • three additional parameters per track for underlying layer effects
    Extra degrees of freedom introduced to account for possible deviations from ideal parabolic shape.
  • track width (unknown constant)
    Assumed constant width of each track that is determined as part of the fitting process.
axioms (2)
  • domain assumption: Ionospheric reflection tracks can be adequately represented by curves close to the parabolic layer model
    Invoked when defining the parametrically specified distributions of distances from points to curves.
  • domain assumption: A modified Bayesian information criterion can correctly identify the true number of tracks when the number is unknown
    Used to stop the sequential addition of tracks after optimization.


