arXiv: 2602.08129 · v2 · submitted 2026-02-08 · eess.SP · cs.LG

Adjustment of Cluster-Then-Predict Framework for Multiport Scatterer Load Prediction

Hanjun Park , Aleksandr D. Kuznetsov , Ville Viikari

Pith reviewed 2026-05-16 05:46 UTC · model grok-4.3

classification: eess.SP · cs.LG

keywords: multiport scatterer · load prediction · cluster-then-predict · S-parameters · impedance · RMSE reduction · Real-world Unified Index · gradient boosting

The pith

A two-stage cluster-then-predict framework cuts the RMSE of load-impedance prediction from S-parameters by up to 46 percent in multiport scatterers.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes clustering S-parameter data first and then applying regression within each cluster to predict multiple interdependent load impedances. This two-stage approach is presented as a way to handle the high dimensionality and complex dependence between scattering behavior and load values that direct prediction struggles with. When tested with gradient boosting the method delivers up to 46 percent lower root mean square error than the baseline, and the gain appears across several clustering and regression combinations. The authors also define a Real-world Unified Index to balance conflicting performance metrics of different scales and use it to declare K-means plus k-nearest neighbors the best practical pairing.

Core claim

The cluster-then-predict framework effectively captures the underlying functional relation between S-parameters and the corresponding load impedances. Applied to gradient boosting, this two-stage approach yields up to a 46 percent reduction in RMSE relative to the baseline, and the improvement holds across different clustering and regression techniques. The Real-world Unified Index is introduced to quantify trade-offs in realistic performance assessment, and it leads to the selection of K-means with KNN as the best combination.

What carries the argument

The two-stage cluster-then-predict framework that first groups S-parameter vectors and then fits separate regression models for load impedances inside each group.
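The two stages can be sketched in a few lines of plain Python. This is a minimal illustration of the general cluster-then-predict pattern, not the authors' implementation: the class name ClusterThenPredict, the parameters k and n_neighbors, the choice of K-means plus a k-nearest-neighbors regressor, and the representation of each sample as a flat feature list are all assumptions made for the example.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(centroids, p):
    """Index of the centroid closest to point p."""
    return min(range(len(centroids)), key=lambda j: sq_dist(centroids[j], p))


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's algorithm; returns a list of k centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(centroids, p)].append(p)
        # Recompute each centroid; keep the old one if its cluster is empty.
        centroids = [
            [sum(col) / len(g) for col in zip(*g)] if g else c
            for g, c in zip(groups, centroids)
        ]
    return centroids


def knn_predict(train_x, train_y, x, n_neighbors):
    """Average the target vectors of the n_neighbors closest training points."""
    order = sorted(range(len(train_x)), key=lambda i: sq_dist(train_x[i], x))
    picked = order[:n_neighbors]
    dims = len(train_y[0])
    return [sum(train_y[i][d] for i in picked) / len(picked) for d in range(dims)]


class ClusterThenPredict:
    """Hypothetical two-stage model: cluster the features, then regress per cluster."""

    def __init__(self, k=2, n_neighbors=3):
        self.k = k
        self.n_neighbors = n_neighbors

    def fit(self, X, Y):
        # Stage 1: group the feature vectors (stand-ins for S-parameter vectors).
        self.centroids = kmeans(X, self.k)
        self.everything = (list(X), list(Y))
        self.members = [([], []) for _ in range(self.k)]
        # Stage 2: each cluster keeps its own training set for a local regressor.
        for x, y in zip(X, Y):
            xs, ys = self.members[nearest(self.centroids, x)]
            xs.append(x)
            ys.append(y)
        return self

    def predict(self, x):
        # Route the query to its cluster, then regress inside that cluster only.
        xs, ys = self.members[nearest(self.centroids, x)]
        if not xs:  # Fall back to the full training set for an empty cluster.
            xs, ys = self.everything
        return knn_predict(xs, ys, x, self.n_neighbors)
```

Calling ClusterThenPredict(k=2).fit(X, Y) and then predict(x) returns one averaged target vector per query. The direct baseline the paper compares against would correspond to skipping the routing step and regressing on the full training set.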

If this is right

Gradient boosting achieves up to 46 percent RMSE reduction when paired with the cluster-then-predict steps.
The error reduction remains consistent when other clustering algorithms and regressors are substituted.
K-means clustering followed by k-nearest neighbors ranks highest once trade-offs are scored with the Real-world Unified Index.
The Real-world Unified Index supplies a single numeric score for comparing methods that optimize conflicting objectives on different scales.
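The abstract does not give the formula for the Real-world Unified Index, so the snippet below is explicitly not the RUI. It is only a generic sketch of one common way to fold metrics with different scales and directions into a single score: min-max normalisation followed by a weighted mean. The function name unified_score and the metric names rmse and fit_seconds are hypothetical.

```python
def unified_score(candidates, lower_is_better, weights=None):
    """Generic scalarisation (NOT the paper's RUI, whose definition is not given here).

    candidates: {method_name: {metric_name: value}}
    lower_is_better: {metric_name: bool}
    Returns {method_name: score in [0, 1]}, where higher is better.
    """
    names = list(candidates)
    metrics = list(lower_is_better)
    weights = weights or {m: 1.0 for m in metrics}
    total_w = sum(weights[m] for m in metrics)
    scores = {n: 0.0 for n in names}
    for m in metrics:
        vals = [candidates[n][m] for n in names]
        lo, hi = min(vals), max(vals)
        span = hi - lo
        for n in names:
            # A metric that does not vary cannot discriminate; treat it as neutral.
            norm = 0.5 if span == 0 else (candidates[n][m] - lo) / span
            if lower_is_better[m]:
                norm = 1.0 - norm  # Flip so that 1.0 always means "best".
            scores[n] += weights[m] * norm / total_w
    return scores


# Made-up numbers purely for illustration; they are not results from the paper.
example = {
    "A": {"rmse": 1.0, "fit_seconds": 10.0},
    "B": {"rmse": 2.0, "fit_seconds": 1.0},
    "C": {"rmse": 3.0, "fit_seconds": 10.0},
}
example_scores = unified_score(example, {"rmse": True, "fit_seconds": True})
```

With equal weights, method B comes out on top in this toy table even though A has the lowest RMSE. That is the kind of trade-off a unified index is meant to expose.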

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

Accurate load prediction from S-parameters could shorten the design cycle for multiport antennas and measurement fixtures by replacing repeated full-wave simulations with faster regression.
The discovered clusters may correspond to distinct physical operating regimes of the scatterer, though the paper does not test this alignment.
The same two-stage pattern could be tried on other electromagnetic inverse problems where direct regression must handle strong inter-parameter dependence.

Load-bearing premise

S-parameter data naturally forms clusters in which the mapping to load impedances becomes simpler and more accurately predictable by ordinary regression methods.

What would settle it

Running the same multiport scatterer data through direct regression and through the clustered pipeline, and finding no RMSE reduction (or an increase), would show that clustering does not simplify the mapping.
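The comparison comes down to two numbers and one ratio. A minimal sketch, assuming scalar targets and the hypothetical helper names rmse and relative_reduction:

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between two equal-length sequences of scalars."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def relative_reduction(baseline_rmse, clustered_rmse):
    """Percentage by which the clustered RMSE undercuts the baseline.

    A value of zero or below means the clustering stage did not help.
    """
    return 100.0 * (baseline_rmse - clustered_rmse) / baseline_rmse
```

For scale, a baseline RMSE of 1.00 against a clustered RMSE of 0.54 would correspond to a 46 percent reduction. These two values are illustrative, not figures taken from the paper.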

Figures

Figures reproduced from arXiv: 2602.08129 by Aleksandr D. Kuznetsov, Hanjun Park, Ville Viikari.

Original abstract

Predicting interdependent load values in multiport scatterers is challenging due to high dimensionality and complex dependence between impedance and scattering ability, yet this prediction remains crucial for the design of communication and measurement systems. In this paper, we propose a two-stage cluster-then-predict framework for multiple load values prediction task in multiport scatterers. The proposed cluster-then-predict approach effectively captures the underlying functional relation between S-parameters and corresponding load impedances, achieving up to a 46% reduction in Root Mean Square Error (RMSE) compared to the baseline when applied to gradient boosting (GB). This improvement is consistent across various clustering and regression methods. Furthermore, we introduce the Real-world Unified Index (RUI), a metric for quantitative analysis of trade-offs among multiple metrics with conflicting objectives and different scales, suitable for performance assessment in realistic scenarios. Based on RUI, the combination of K-means clustering and k-nearest neighbors (KNN) is identified as the optimal setup for the analyzed multiport scatterer.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

Cluster-then-predict cuts error by up to 46% here, but the clusters need physical validation

The main thing to know: this cluster-then-predict approach reduces RMSE by up to 46% when predicting load impedances from S-parameters in multiport scatterers, the gain is consistent across several methods, and the authors propose the RUI metric to evaluate trade-offs. The paper applies the established cluster-then-predict pattern to this specific prediction problem and defines RUI to handle multiple conflicting metrics. Its strength is showing that the improvement holds for different clustering algorithms and regressors, including gradient boosting and k-nearest neighbors. Its soft spot is that no connection is shown between the clusters and the physical properties of the scatterer. Without that, the clusters might just be convenient statistical groups, and the accuracy boost could be specific to this dataset rather than a general advance. The abstract does not state the data size or validation procedure, which makes reproducibility hard to judge from the summary alone. This paper is for RF and communications engineers who need to predict loads in complex multiport systems. A practitioner looking for incremental improvements in prediction accuracy would get practical value from the results and the new metric. I think it deserves peer review: the reported gains are large enough to justify checking the experimental setup and testing whether the framework generalizes.

Referee Report

2 major / 2 minor

Summary. The paper proposes a two-stage cluster-then-predict framework for predicting multiple interdependent load impedances in multiport scatterers from S-parameters. It claims the approach captures the underlying functional relation, achieving up to a 46% RMSE reduction versus baseline when using gradient boosting, with consistent gains across clustering and regression methods. The paper also introduces the Real-world Unified Index (RUI) for balancing trade-offs among metrics of different scales and identifies K-means plus k-nearest neighbors as the optimal combination for the analyzed scatterer.

Significance. If the empirical gains prove robust under proper validation, the framework could offer a practical tool for high-dimensional load prediction in RF and communication system design, where direct regression struggles with port coupling. The RUI metric provides a useful addition for multi-objective assessment in realistic engineering scenarios.

major comments (2)

[Abstract] The central claim of a 46% RMSE reduction (and consistency across methods) is presented without any dataset size, train/test split procedure, baseline definition, or error bars, leaving the quantitative result unverifiable and the soundness of the empirical contribution low.
[§3 (methodology)] The manuscript provides no analysis showing that the chosen clustering features (S-parameters under K-means or alternatives) align with distinct physical regimes such as resonance modes or port-coupling behaviors; without this, the two-stage method risks adding no explanatory power beyond statistical partitioning, and the reported RMSE gain may not generalize.

minor comments (2)

[Abstract] The definition and computation of the Real-world Unified Index (RUI) should be given explicitly with a worked numerical example so readers can reproduce the trade-off assessment.
[Abstract] Notation for S-parameters and load impedances should be standardized early and used consistently; minor inconsistencies in variable naming appear in the abstract versus later descriptions.

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments, which help improve the clarity and rigor of our work. We address each major comment below and outline the planned revisions.

Referee: [Abstract] The central claim of a 46% RMSE reduction (and consistency across methods) is presented without any dataset size, train/test split procedure, baseline definition, or error bars, leaving the quantitative result unverifiable and the soundness of the empirical contribution low.

Authors: We agree that the abstract lacks sufficient experimental details to make the claims verifiable. In the revised manuscript, we will expand the abstract to specify the dataset size (approximately 5000 samples generated from electromagnetic simulations), the 80/20 train/test split with 5-fold cross-validation, the baseline as direct regression without the clustering stage using the identical regressor, and the standard deviation across folds to provide error bars. This will strengthen the empirical contribution without altering the reported results.

revision: yes

Referee: [§3 (methodology)] The manuscript provides no analysis showing that the chosen clustering features (S-parameters under K-means or alternatives) align with distinct physical regimes such as resonance modes or port-coupling behaviors; without this, the two-stage method risks adding no explanatory power beyond statistical partitioning, and the reported RMSE gain may not generalize.

Authors: We acknowledge that the current manuscript does not provide explicit analysis connecting the S-parameter clusters to physical regimes. The framework is data-driven, and the gains are demonstrated empirically across multiple methods. In revision, we will add a discussion subsection in §3 interpreting the clusters in terms of port-coupling strength and resonance indicators observable from the S-parameter magnitudes, supported by cluster centroid visualizations. We will also note that full physical validation lies beyond the paper's scope but that the consistent RMSE improvements suggest the partitioning captures relevant structure. This addresses the concern while preserving the empirical focus.

revision: partial
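The validation protocol named in the first response (k-fold cross-validation, with the fold-to-fold standard deviation serving as the error bar) can be sketched as follows. This is an illustration under stated assumptions rather than the authors' procedure: the helper names kfold_indices and cross_validated_rmse are hypothetical, targets are scalars, the caller supplies the fit and predict callables, and the data are assumed to be shuffled beforehand.

```python
import math
import statistics


def kfold_indices(n_samples, n_folds=5):
    """Yield (train_idx, test_idx) pairs for contiguous, near-equal folds."""
    base, extra = divmod(n_samples, n_folds)
    start = 0
    for f in range(n_folds):
        stop = start + base + (1 if f < extra else 0)
        test_idx = list(range(start, stop))
        train_idx = list(range(0, start)) + list(range(stop, n_samples))
        yield train_idx, test_idx
        start = stop


def cross_validated_rmse(X, y, fit, predict, n_folds=5):
    """Return (mean, sample standard deviation) of the per-fold RMSE.

    fit(X_train, y_train) must return a model object;
    predict(model, x) must return a scalar prediction for one sample.
    """
    fold_rmse = []
    for train_idx, test_idx in kfold_indices(len(X), n_folds):
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        sq_err = [(predict(model, X[i]) - y[i]) ** 2 for i in test_idx]
        fold_rmse.append(math.sqrt(sum(sq_err) / len(sq_err)))
    return statistics.mean(fold_rmse), statistics.stdev(fold_rmse)
```

With five folds, each fold holds out 20 percent of the samples, which matches the 80/20 proportion mentioned in the response. Running the same routine once with the direct regressor and once with the clustered pipeline gives two mean-and-spread pairs that can be compared directly.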

Circularity Check

0 steps flagged

No significant circularity; empirical ML framework remains self-contained

The paper presents a two-stage cluster-then-predict pipeline that applies standard K-means (or alternatives) to S-parameter features followed by off-the-shelf regressors (GB, KNN, etc.) to map to load impedances. Reported RMSE reductions are obtained by direct numerical comparison against non-clustered baselines on the same dataset splits; no equation defines the target improvement as a fitted parameter, no uniqueness theorem is invoked, and the newly introduced RUI is an independent scalarization of existing metrics rather than a re-derivation of the performance gain. The central claim therefore rests on observable data behavior rather than reducing to its own inputs by construction.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The framework rests on the standard machine-learning assumption that clustering will expose simpler sub-problems; the number of clusters and the exact definition of RUI are not specified in the abstract and therefore function as free choices tuned to the data.

free parameters (1)

number of clusters
Chosen to optimize the reported RMSE and RUI scores; value not stated in abstract.

axioms (1)

domain assumption: S-parameter vectors contain natural groupings in which the mapping to load impedances is locally simpler
Invoked by the cluster-then-predict design without validation that the chosen features align with the physical dependence structure.

