
arxiv: 2509.05692 · v2 · submitted 2025-09-06 · 📡 eess.SP

Resource Allocation and Beamforming in FIM-Assisted BS and STAR-BD-RIS-Aided NOMA: An AIW-Meta-Learning Approach

Pith reviewed 2026-05-18 18:02 UTC · model grok-4.3

classification 📡 eess.SP
keywords FIM · STAR-BD-RIS · NOMA · meta-learning · energy efficiency · beamforming · resource allocation · reconfigurable intelligent surface

The pith

An adaptive inverse-weighted meta-reinforcement learning algorithm maximizes energy efficiency in FIM-assisted base stations with STAR-BD-RIS and NOMA.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper investigates a communication system that combines a flexible intelligent metasurface (FIM) at a multi-antenna base station with simultaneously transmitting and reflecting beyond-diagonal reconfigurable intelligent surfaces (STAR-BD-RIS) and non-orthogonal multiple access (NOMA). The objective is to maximize energy efficiency by jointly optimizing the base-station beamforming, the STAR-BD-RIS configuration, the NOMA power allocation, and the shape of the FIM surface, subject to power limits. Because the problem is highly non-convex, the authors introduce an adaptive inverse-weighted Meta-Soft Actor-Critic (AIW-Meta-SAC) algorithm that folds the system constraints into the learning reward through dynamic weighting, which they report improves learning efficiency and convergence over conventional Meta-SAC. Simulations indicate that it outperforms a Meta-DDPG baseline and that the FIM-assisted architecture gains energy efficiency over conventional benchmark schemes.
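To make the objective concrete, here is a minimal sketch of an energy-efficiency calculation for a single two-user downlink NOMA pair. It is not the paper's system model: it assumes scalar effective channel gains (beamforming, RIS, and FIM effects already folded in), perfect successive interference cancellation, and a hypothetical static circuit power `p_circuit`. The function name and all parameters are illustrative.

```python
import math


def noma_energy_efficiency(g_strong, g_weak, p_total, alpha_weak,
                           noise=1.0, p_circuit=1.0):
    """Energy efficiency (bit/s/Hz per unit power) of a two-user NOMA pair.

    g_strong, g_weak: effective channel power gains, with g_strong >= g_weak.
    alpha_weak: fraction of p_total allocated to the weak user (0 < alpha_weak < 1).
    p_circuit: hypothetical static circuit power, same unit as p_total.
    """
    p_weak = alpha_weak * p_total
    p_strong = (1.0 - alpha_weak) * p_total
    # The weak user decodes its own signal and treats the strong user's as noise.
    rate_weak = math.log2(1.0 + g_weak * p_weak / (g_weak * p_strong + noise))
    # The strong user cancels the weak user's signal (SIC), then decodes its own.
    rate_strong = math.log2(1.0 + g_strong * p_strong / noise)
    # Energy efficiency: achievable sum rate divided by total consumed power.
    return (rate_weak + rate_strong) / (p_total + p_circuit)
```

Under these assumptions the sum rate grows only logarithmically with transmit power while the denominator grows linearly, so spending more power eventually lowers energy efficiency. That trade-off is one reason the power allocation has to be optimized rather than simply set to the budget.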

Core claim

The central claim is that AIW-Meta-SAC, which uses an adaptive weighting mechanism to fold constraints into the reward function of a meta-RL setup, effectively solves the joint optimization of BS beamforming, STAR-BD-RIS parameters, NOMA variables, and FIM surface shape, achieving higher energy efficiency than the Meta-DDPG baseline.

What carries the argument

The AIW-Meta-SAC algorithm: adaptive inverse weighting in the reward function handles the constraints, while meta-learning tackles the non-convex joint optimization.
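The abstract does not state the weighting rule, so the following is only a hypothetical sketch of what an inverse-weighted penalty reward could look like, not the authors' algorithm. It assumes each constraint reports a slack value (positive when satisfied, negative when violated) and gives larger penalty weights to constraints whose recent average slack is small. The class name, `beta`, and `eps` are all illustrative assumptions.

```python
import numpy as np


class InverseWeightedReward:
    """Hypothetical reward shaper: weights are inversely proportional to slack."""

    def __init__(self, num_constraints, beta=0.9, eps=1e-3):
        self.beta = beta  # smoothing factor of the running slack average
        self.eps = eps    # keeps the inverse finite when the slack is zero
        self.avg_slack = np.ones(num_constraints)

    def weights(self):
        # Constraints with little recent slack receive larger normalized weights.
        inv = 1.0 / (self.avg_slack + self.eps)
        return inv / inv.sum()

    def __call__(self, energy_efficiency, slack):
        slack = np.asarray(slack, dtype=float)
        # Track only the satisfied margin; a violated constraint contributes zero.
        self.avg_slack = (self.beta * self.avg_slack
                          + (1.0 - self.beta) * np.maximum(slack, 0.0))
        violation = np.maximum(-slack, 0.0)
        # Reward is the objective minus the weighted constraint violations.
        return energy_efficiency - float(self.weights() @ violation)
```

In this sketch a feasible step returns the energy efficiency unchanged, and a constraint that keeps being violated gradually attracts a larger share of the penalty. Whether the paper's mechanism behaves this way cannot be confirmed from the abstract.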

If this is right

  • The proposed AIW-Meta-SAC significantly outperforms the Meta-DDPG baseline.
  • The FIM-assisted STAR-BD-RIS architecture achieves notable energy efficiency gains compared to conventional benchmark schemes.
  • Joint optimization including dynamic FIM surface shape improves overall system energy efficiency in NOMA networks.
  • Better learning efficiency and convergence behavior result from the adaptive weighting mechanism.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • This method could be applied to optimize other wireless systems with dynamic surfaces and complex multiple access schemes.
  • The adaptive weighting technique may simplify constraint handling in reinforcement learning applications for resource allocation.
  • Real-world validation with hardware constraints could further demonstrate the practical benefits of the FIM and STAR-BD-RIS combination.

Load-bearing premise

The highly non-convex joint optimization problem can be solved effectively by incorporating system constraints via an adaptive weighting mechanism in the reward function of a meta-RL algorithm.

What would settle it

Simulations or experiments in which AIW-Meta-SAC shows no significant advantage over Meta-DDPG, or in which the FIM-assisted architecture does not achieve higher energy efficiency than the benchmarks, would falsify the main claims.

Figures

Figures reproduced from arXiv: 2509.05692 by Armin Farhadi, Eduard Jorswieck, Maryam Cheraghy.

Figure 1. Communication system model with BS equipped with …
Figure 5. PT versus episodes with different actor learning rates.
Original abstract

This paper investigates a flexible intelligent metasurface (FIM)-enabled wireless communication system that integrates simultaneously transmitting and reflecting beyond diagonal reconfigurable intelligent surfaces (STAR-BD-RIS) with non-orthogonal multiple access (NOMA). The considered system consists of a multi-antenna FIM-assisted base station (BS) supported by dual-sector BD-RIS. The FIM is composed of low-cost radiating elements capable of independent signal transmission and dynamic vertical reconfiguration (morphing). The objective is to maximize energy efficiency (EE) by jointly optimizing the BS beamforming, STAR-BD-RIS configuration, NOMA-related variables, and the FIM surface shape under practical power constraints. Due to the highly non-convex nature of the problem, an adaptive inverse-weighted Meta-Soft Actor-Critic (AIW-Meta-SAC) algorithm is proposed. Unlike conventional Meta-SAC approaches, the proposed method employs an adaptive weighting mechanism to effectively incorporate system constraints into the reward function, thereby improving learning efficiency and convergence behavior. Simulation results demonstrate that the proposed AIW-Meta-SAC significantly outperforms the Meta-DDPG baseline. Furthermore, the FIM-assisted STAR-BD-RIS architecture achieves notable energy efficiency gains compared to conventional benchmark schemes.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper considers a FIM-assisted multi-antenna BS aided by dual-sector STAR-BD-RIS in a NOMA downlink. The goal is to maximize energy efficiency by jointly optimizing BS beamforming vectors, STAR-BD-RIS phase shifts and amplitudes, NOMA power coefficients, and the dynamic shape (morphing) of the FIM surface, subject to transmit-power and other practical constraints. Because the resulting problem is highly non-convex, the authors introduce an adaptive inverse-weighted Meta-Soft Actor-Critic (AIW-Meta-SAC) algorithm that folds the constraints into the reward via an adaptive weighting mechanism. Simulation results are reported to show that AIW-Meta-SAC outperforms a Meta-DDPG baseline and that the FIM-assisted STAR-BD-RIS architecture yields notable EE improvements over conventional benchmarks.

Significance. If the learned policies remain feasible under the stated power limits and the reported EE gains are reproducible, the work would provide a concrete demonstration that meta-RL with adaptive reward shaping can handle the joint beamforming-plus-RIS-plus-NOMA-plus-morphing problem. The architectural combination of FIM morphing with STAR-BD-RIS is a timely extension of current RIS literature and could inform practical deployments once constraint satisfaction is rigorously verified.

major comments (2)
  1. [§IV] §IV (AIW-Meta-SAC algorithm description): the claim that the adaptive inverse-weighting mechanism 'effectively incorporates system constraints into the reward function' is load-bearing for all performance claims. Without explicit reporting of per-epoch or per-test power-constraint violation rates (or a comparison against a constrained-RL baseline such as Lagrangian relaxation or safe RL), it remains unclear whether the high-reward trajectories produced by AIW-Meta-SAC are feasible or merely penalized. This directly affects the validity of the outperformance and EE-gain statements.
  2. [§V] §V (Simulation results): the reported EE gains and superiority over Meta-DDPG are presented without accompanying details on the underlying channel models (e.g., Rician factors, path-loss exponents), exact power-budget values, number of Monte-Carlo realizations, or statistical significance tests. These omissions make it impossible to judge whether the numerical improvements are robust or sensitive to modeling assumptions.
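The feasibility check requested in major comment 1 can be computed directly from logged transmit powers. The sketch below is an assumption about how such a log might be organized (one row per epoch, one column per evaluation step); the function name and the tolerance `tol` are hypothetical, not taken from the paper.

```python
import numpy as np


def power_violation_rate(transmit_power, p_max, tol=1e-9):
    """Fraction of steps whose transmit power exceeds the budget p_max.

    transmit_power: array of shape (num_steps,) or (num_epochs, num_steps).
    Returns one rate per epoch for 2-D input, a single rate for 1-D input.
    """
    power = np.asarray(transmit_power, dtype=float)
    # A step counts as a violation only if it exceeds the budget beyond tol.
    return np.mean(power > p_max + tol, axis=-1)


# Synthetic example: two epochs of four steps each, with a budget of 1.0.
rates = power_violation_rate([[0.9, 1.1, 1.0, 1.2],
                              [0.8, 0.9, 1.0, 0.7]], p_max=1.0)
```

A rate that stays near zero late in training would indicate that high-reward trajectories are feasible rather than merely penalized.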
minor comments (2)
  1. [System Model] The notation for the FIM surface-shape variables should be introduced once in the system model and then used consistently in the optimization formulation and algorithm sections.
  2. [Figures] Convergence and EE-versus-SNR curves would benefit from shaded confidence intervals or error bars to convey variability across random seeds.
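For the confidence bands suggested in minor comment 2, a minimal sketch follows. It assumes the per-seed curves are stacked in one array and uses a normal approximation (`z = 1.96`) for a 95% interval; with only a handful of seeds a Student-t quantile would be more appropriate. The function name is illustrative.

```python
import numpy as np


def confidence_band(curves, z=1.96):
    """Mean curve with lower and upper band across random seeds.

    curves: array of shape (num_seeds, num_points), one row per seed.
    z: normal quantile; 1.96 gives an approximate 95% interval.
    """
    curves = np.asarray(curves, dtype=float)
    mean = curves.mean(axis=0)
    # Standard error of the mean, using the unbiased sample standard deviation.
    sem = curves.std(axis=0, ddof=1) / np.sqrt(curves.shape[0])
    return mean, mean - z * sem, mean + z * sem
```

The three returned arrays can be passed to a plotting library as the line and the shaded region between the lower and upper bounds.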

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed comments, which help improve the rigor and clarity of our manuscript. We address each major comment point by point below, indicating the revisions we will implement.

  1. Referee: [§IV] §IV (AIW-Meta-SAC algorithm description): the claim that the adaptive inverse-weighting mechanism 'effectively incorporates system constraints into the reward function' is load-bearing for all performance claims. Without explicit reporting of per-epoch or per-test power-constraint violation rates (or a comparison against a constrained-RL baseline such as Lagrangian relaxation or safe RL), it remains unclear whether the high-reward trajectories produced by AIW-Meta-SAC are feasible or merely penalized. This directly affects the validity of the outperformance and EE-gain statements.

    Authors: We agree that explicit verification of constraint satisfaction is essential to support the performance claims. In the revised manuscript, we will add results in Section V reporting per-epoch and per-test power-constraint violation rates for AIW-Meta-SAC. We will also include a direct comparison of feasibility against the baseline Meta-SAC without adaptive inverse weighting. While we do not currently benchmark against Lagrangian relaxation or safe RL methods, we will discuss this limitation and note it as future work. These additions will demonstrate that the learned policies respect the power limits. revision: yes

  2. Referee: [§V] §V (Simulation results): the reported EE gains and superiority over Meta-DDPG are presented without accompanying details on the underlying channel models (e.g., Rician factors, path-loss exponents), exact power-budget values, number of Monte-Carlo realizations, or statistical significance tests. These omissions make it impossible to judge whether the numerical improvements are robust or sensitive to modeling assumptions.

    Authors: We acknowledge that more detailed simulation parameters are needed for reproducibility. In the revised manuscript, we will expand Section V with a dedicated parameter table specifying the channel models (Rician factors and path-loss exponents), exact power-budget values, the number of Monte-Carlo realizations, and statistical significance measures such as confidence intervals or hypothesis tests on the EE gains. This will allow readers to assess robustness under the stated assumptions. revision: yes

Circularity Check

0 steps flagged

No circularity: algorithmic proposal evaluated via independent simulation benchmarks


The paper proposes the AIW-Meta-SAC algorithm as a solution to a joint non-convex optimization problem and validates performance through simulation results comparing against Meta-DDPG and other benchmarks. No equations, fitted parameters, or self-citations are presented that reduce the claimed EE gains or outperformance to quantities defined by the inputs themselves. The adaptive weighting mechanism is introduced as a novel component within the reward function, but its effectiveness is assessed externally via simulation rather than by construction or renaming of prior results. This is a standard self-contained algorithmic contribution with no load-bearing self-referential steps.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review yields no explicit free parameters, axioms, or invented entities; the method relies on standard meta-RL machinery adapted to the described wireless constraints.

pith-pipeline@v0.9.0 · 5771 in / 1156 out tokens · 56471 ms · 2026-05-18T18:02:43.888428+00:00 · methodology


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

  • IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

    Relation between the paper passage and the cited Recognition theorem.

    The goal is to maximize energy efficiency (EE) by jointly optimizing the BS beamforming, STAR-BD-RIS configuration, NOMA-related variables, and the FIM surface shape under practical power constraints... adaptive inverse-weighted Meta-Soft Actor-Critic (AIW-Meta-SAC)

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Comprehensive Review of Advances and Challenges in Next Generation Wireless Networks: From Novel Hardware Technologies to Learning Based Resource Allocation in 6G

    eess.SP 2026-05 unverdicted novelty 1.0

    The paper surveys novel hardware technologies including RIS and ISAC along with learning-based resource allocation for 6G, then analyzes challenges and open questions.
