Resource Allocation and Beamforming in FIM-Assisted BS and STAR-BD-RIS-Aided NOMA: An AIW-Meta-Learning Approach
Pith reviewed 2026-05-18 18:02 UTC · model grok-4.3
The pith
An adaptive inverse-weighted meta-reinforcement learning algorithm maximizes energy efficiency in FIM-assisted base stations with STAR-BD-RIS and NOMA.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that the AIW-Meta-SAC algorithm, which uses an adaptive weighting mechanism to fold constraints into the reward function of a meta-RL setup, solves the joint optimization of BS beamforming, STAR-BD-RIS parameters, NOMA variables, and FIM surface shape, and in doing so achieves higher energy efficiency than the Meta-DDPG baseline.
What carries the argument
The argument rests on the AIW-Meta-SAC algorithm, which applies adaptive inverse weighting in the reward function to handle constraints during meta-learning on the non-convex joint optimization problem.
If this is right
- The proposed AIW-Meta-SAC significantly outperforms the Meta-DDPG baseline.
- The FIM-assisted STAR-BD-RIS architecture achieves notable energy efficiency gains compared to conventional benchmark schemes.
- Joint optimization including dynamic FIM surface shape improves overall system energy efficiency in NOMA networks.
- Better learning efficiency and convergence behavior result from the adaptive weighting mechanism.
Where Pith is reading between the lines
- This method could be applied to optimize other wireless systems with dynamic surfaces and complex multiple access schemes.
- The adaptive weighting technique may simplify constraint handling in reinforcement learning applications for resource allocation.
- Real-world validation with hardware constraints could further demonstrate the practical benefits of the FIM and STAR-BD-RIS combination.
Load-bearing premise
The highly non-convex joint optimization problem can be solved effectively by incorporating system constraints via an adaptive weighting mechanism in the reward function of a meta-RL algorithm.
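To make the premise concrete, here is a minimal sketch of what an adaptive inverse-weighted penalty could look like. It is not the paper's formula, which the abstract does not give. The class name `InverseWeightedReward`, the slack-based weighting rule, and the `decay` and `eps` parameters are all assumptions chosen for illustration: each constraint is weighted inversely to its running average slack, so constraints that are tight or violated dominate the penalty.

```python
from dataclasses import dataclass, field


@dataclass
class InverseWeightedReward:
    """Hypothetical reward shaper (illustrative only, not the paper's method)."""

    num_constraints: int
    decay: float = 0.9   # smoothing factor for the running slack (assumed value)
    eps: float = 1e-3    # keeps the inverse finite when the slack is zero
    running_slack: list = field(init=False)

    def __post_init__(self):
        self.running_slack = [0.0] * self.num_constraints

    def weights(self):
        # Inverse weighting: less running slack means a larger weight.
        inverse = [1.0 / (self.eps + s) for s in self.running_slack]
        total = sum(inverse)
        return [v / total for v in inverse]

    def __call__(self, ee, margins):
        # margins[i] = limit_i - value_i; negative means constraint i is violated.
        slack = [max(m, 0.0) for m in margins]
        violation = [max(-m, 0.0) for m in margins]
        # Update the running slack before computing this step's weights.
        self.running_slack = [
            self.decay * old + (1.0 - self.decay) * new
            for old, new in zip(self.running_slack, slack)
        ]
        penalty = sum(w * v for w, v in zip(self.weights(), violation))
        return ee - penalty


shaper = InverseWeightedReward(num_constraints=2)
feasible_reward = shaper(10.0, [0.5, 0.3])    # both constraints satisfied
violated_reward = shaper(10.0, [0.5, -0.2])   # second constraint violated
```

Under this sketch a feasible step returns the raw energy efficiency unchanged, while a violating step is penalized. Whether a penalty of this kind actually keeps the learned policy feasible is the open question the premise depends on.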
What would settle it
Simulations or experiments in which AIW-Meta-SAC fails to significantly outperform Meta-DDPG, or in which the FIM-assisted architecture does not yield higher energy efficiency than the benchmarks, would falsify the main claims.
Original abstract
This paper investigates a flexible intelligent metasurface (FIM)-enabled wireless communication system that integrates simultaneously transmitting and reflecting beyond diagonal reconfigurable intelligent surfaces (STAR-BD-RIS) with non-orthogonal multiple access (NOMA). The considered system consists of a multi-antenna FIM-assisted base station (BS) supported by dual-sector BD-RIS. The FIM is composed of low-cost radiating elements capable of independent signal transmission and dynamic vertical reconfiguration (morphing). The objective is to maximize energy efficiency (EE) by jointly optimizing the BS beamforming, STAR-BD-RIS configuration, NOMA-related variables, and the FIM surface shape under practical power constraints. Due to the highly non-convex nature of the problem, an adaptive inverse-weighted Meta-Soft Actor-Critic (AIW-Meta-SAC) algorithm is proposed. Unlike conventional Meta-SAC approaches, the proposed method employs an adaptive weighting mechanism to effectively incorporate system constraints into the reward function, thereby improving learning efficiency and convergence behavior. Simulation results demonstrate that the proposed AIW-Meta-SAC significantly outperforms the Meta-DDPG baseline. Furthermore, the FIM-assisted STAR-BD-RIS architecture achieves notable energy efficiency gains compared to conventional benchmark schemes.
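For orientation, an energy-efficiency objective of this kind is commonly written as the ratio of sum rate to total consumed power. The form below is a generic sketch, not the paper's formulation. All symbols are assumed here for illustration: $\mathbf{W}$ for BS beamforming, $\boldsymbol{\Theta}$ for the STAR-BD-RIS configuration, $\boldsymbol{\alpha}$ for the NOMA variables, $\mathbf{z}$ for the FIM surface shape, $R_k$ and $\gamma_k$ for the rate and SINR of user $k$, $P_{\mathrm{tx}}$ and $P_{\mathrm{c}}$ for transmit and circuit power, and $P_{\max}$ for the power budget. The abstract does not give the exact rate expression, power model, or full constraint set.

```latex
\[
\begin{aligned}
\max_{\mathbf{W},\,\boldsymbol{\Theta},\,\boldsymbol{\alpha},\,\mathbf{z}}\quad
& \mathrm{EE} = \frac{\sum_{k=1}^{K} R_k}{P_{\mathrm{tx}} + P_{\mathrm{c}}},
\qquad R_k = \log_2\!\left(1 + \gamma_k\right) \\
\text{s.t.}\quad & P_{\mathrm{tx}} \le P_{\max}
\end{aligned}
\]
```

Because the objective is a ratio and the optimization variables are coupled through $\gamma_k$, a problem of this shape is non-convex in general, which is consistent with the abstract's motivation for a learning-based solver.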
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper considers a FIM-assisted multi-antenna BS aided by dual-sector STAR-BD-RIS in a NOMA downlink. The goal is to maximize energy efficiency by jointly optimizing BS beamforming vectors, STAR-BD-RIS phase shifts and amplitudes, NOMA power coefficients, and the dynamic shape (morphing) of the FIM surface, subject to transmit-power and other practical constraints. Because the resulting problem is highly non-convex, the authors introduce an adaptive inverse-weighted Meta-Soft Actor-Critic (AIW-Meta-SAC) algorithm that folds the constraints into the reward via an adaptive weighting mechanism. Simulation results are reported to show that AIW-Meta-SAC outperforms a Meta-DDPG baseline and that the FIM-assisted STAR-BD-RIS architecture yields notable EE improvements over conventional benchmarks.
Significance. If the learned policies remain feasible under the stated power limits and the reported EE gains are reproducible, the work would provide a concrete demonstration that meta-RL with adaptive reward shaping can handle the joint beamforming-plus-RIS-plus-NOMA-plus-morphing problem. The architectural combination of FIM morphing with STAR-BD-RIS is a timely extension of current RIS literature and could inform practical deployments once constraint satisfaction is rigorously verified.
major comments (2)
- [§IV] §IV (AIW-Meta-SAC algorithm description): the claim that the adaptive inverse-weighting mechanism 'effectively incorporates system constraints into the reward function' is load-bearing for all performance claims. Without explicit reporting of per-epoch or per-test power-constraint violation rates (or a comparison against a constrained-RL baseline such as Lagrangian relaxation or safe RL), it remains unclear whether the high-reward trajectories produced by AIW-Meta-SAC are feasible or merely penalized. This directly affects the validity of the outperformance and EE-gain statements.
- [§V] §V (Simulation results): the reported EE gains and superiority over Meta-DDPG are presented without accompanying details on the underlying channel models (e.g., Rician factors, path-loss exponents), exact power-budget values, number of Monte-Carlo realizations, or statistical significance tests. These omissions make it impossible to judge whether the numerical improvements are robust or sensitive to modeling assumptions.
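The feasibility check requested in the first major comment can be sketched in a few lines. This is an illustrative helper, not code from the paper: the function name `violation_rate`, the tolerance `tol`, and the sample power values are assumptions, and the paper may impose constraints beyond a single transmit-power budget.

```python
def violation_rate(tx_powers, p_max, tol=1e-9):
    """Return (fraction of episodes over budget, worst-case excess power).

    tx_powers: transmit power used in each evaluated episode (hypothetical log).
    p_max: the power budget; tol absorbs floating-point noise.
    """
    if not tx_powers:
        raise ValueError("tx_powers must contain at least one episode")
    excess = [max(p - p_max, 0.0) for p in tx_powers]
    violated = sum(1 for e in excess if e > tol)
    return violated / len(tx_powers), max(excess)


# Made-up example values: two of four episodes exceed a 1.0 W budget.
rate, worst = violation_rate([0.9, 1.0, 1.2, 1.5], p_max=1.0)
```

Reporting both numbers per training epoch and per test task would show whether high-reward trajectories are feasible or only penalized.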
minor comments (2)
- [System Model] The notation for the FIM surface-shape variables should be introduced once in the system model and then used consistently in the optimization formulation and algorithm sections.
- [Figures] Convergence and EE-versus-SNR curves would benefit from shaded confidence intervals or error bars to convey variability across random seeds.
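The confidence intervals suggested for the figures could be computed as follows. This is a minimal sketch using a percentile bootstrap over per-seed results. The function name `bootstrap_ci`, the resample count, and the sample EE values are assumptions for illustration, and other interval constructions (for example a t-interval) would be equally reasonable.

```python
import random
import statistics


def bootstrap_ci(values, level=0.95, n_resamples=2000, seed=0):
    """Percentile-bootstrap confidence interval for the mean of per-seed results."""
    if not values:
        raise ValueError("values must contain at least one result")
    rng = random.Random(seed)  # fixed seed so the interval is reproducible
    n = len(values)
    # Resample the seeds with replacement and record each resample's mean.
    means = sorted(
        statistics.fmean(rng.choices(values, k=n)) for _ in range(n_resamples)
    )
    lo_idx = int((1.0 - level) / 2.0 * n_resamples)
    hi_idx = int((1.0 + level) / 2.0 * n_resamples) - 1
    return statistics.fmean(values), means[lo_idx], means[hi_idx]


# Made-up final EE values from five random seeds.
mean, lower, upper = bootstrap_ci([4.1, 4.3, 3.9, 4.2, 4.0])
```

Plotting `lower` and `upper` as a shaded band around `mean` at each SNR point or training step gives the variability view the comment asks for.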
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed comments, which help improve the rigor and clarity of our manuscript. We address each major comment point by point below, indicating the revisions we will implement.
- Referee: [§IV] §IV (AIW-Meta-SAC algorithm description): the claim that the adaptive inverse-weighting mechanism 'effectively incorporates system constraints into the reward function' is load-bearing for all performance claims. Without explicit reporting of per-epoch or per-test power-constraint violation rates (or a comparison against a constrained-RL baseline such as Lagrangian relaxation or safe RL), it remains unclear whether the high-reward trajectories produced by AIW-Meta-SAC are feasible or merely penalized. This directly affects the validity of the outperformance and EE-gain statements.
  Authors: We agree that explicit verification of constraint satisfaction is essential to support the performance claims. In the revised manuscript, we will add results in Section V reporting per-epoch and per-test power-constraint violation rates for AIW-Meta-SAC. We will also include a direct comparison of feasibility against the baseline Meta-SAC without adaptive inverse weighting. While we do not currently benchmark against Lagrangian relaxation or safe RL methods, we will discuss this limitation and note it as future work. These additions will demonstrate that the learned policies respect the power limits.
  Revision: yes
- Referee: [§V] §V (Simulation results): the reported EE gains and superiority over Meta-DDPG are presented without accompanying details on the underlying channel models (e.g., Rician factors, path-loss exponents), exact power-budget values, number of Monte-Carlo realizations, or statistical significance tests. These omissions make it impossible to judge whether the numerical improvements are robust or sensitive to modeling assumptions.
  Authors: We acknowledge that more detailed simulation parameters are needed for reproducibility. In the revised manuscript, we will expand Section V with a dedicated parameter table specifying the channel models (Rician factors and path-loss exponents), exact power-budget values, the number of Monte-Carlo realizations, and statistical significance measures such as confidence intervals or hypothesis tests on the EE gains. This will allow readers to assess robustness under the stated assumptions.
  Revision: yes
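One form the promised hypothesis test could take is a two-sided permutation test on the per-run EE values of the two algorithms. The sketch below is illustrative only: the function name `permutation_p_value`, the permutation count, and the sample values are assumptions, and the authors may choose a different test.

```python
import random
import statistics


def permutation_p_value(a, b, n_permutations=5000, seed=0):
    """Two-sided permutation test for a difference in mean EE between two methods."""
    rng = random.Random(seed)  # fixed seed for reproducibility
    observed = abs(statistics.fmean(a) - statistics.fmean(b))
    pooled = list(a) + list(b)
    count = 0
    for _ in range(n_permutations):
        # Randomly reassign runs to the two groups under the null hypothesis.
        rng.shuffle(pooled)
        diff = abs(
            statistics.fmean(pooled[: len(a)]) - statistics.fmean(pooled[len(a):])
        )
        if diff >= observed - 1e-12:
            count += 1
    # The +1 terms count the observed split itself and keep the p-value above zero.
    return (count + 1) / (n_permutations + 1)


# Made-up per-run EE values for two methods.
p = permutation_p_value(
    [5.0, 5.1, 5.2, 4.9, 5.0, 5.1],
    [4.0, 4.1, 3.9, 4.2, 4.0, 4.1],
)
```

A permutation test makes no normality assumption, which suits a small number of Monte-Carlo runs or random seeds.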
Circularity Check
No circularity: algorithmic proposal evaluated via independent simulation benchmarks
The paper proposes the AIW-Meta-SAC algorithm as a solution to a joint non-convex optimization problem and validates performance through simulation results comparing against Meta-DDPG and other benchmarks. No equations, fitted parameters, or self-citations are presented that reduce the claimed EE gains or outperformance to quantities defined by the inputs themselves. The adaptive weighting mechanism is introduced as a novel component within the reward function, but its effectiveness is assessed externally via simulation rather than by construction or renaming of prior results. This is a standard self-contained algorithmic contribution with no load-bearing self-referential steps.
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: "The goal is to maximize energy efficiency (EE) by jointly optimizing the BS beamforming, STAR-BD-RIS configuration, NOMA-related variables, and the FIM surface shape under practical power constraints... adaptive inverse-weighted Meta-Soft Actor-Critic (AIW-Meta-SAC)"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Forward citations
Cited by 1 Pith paper
- Comprehensive Review of Advances and Challenges in Next Generation Wireless Networks: From Novel Hardware Technologies to Learning Based Resource Allocation in 6G
  The paper surveys novel hardware technologies including RIS and ISAC along with learning-based resource allocation for 6G, then analyzes challenges and open questions.