Reinforcement Learning-Enabled Agent for Transmitter Optimization in Digital-Analog Radio-over-Fiber Fronthaul
Pith reviewed 2026-06-28 04:44 UTC · model grok-4.3
The pith
A reinforcement learning agent optimizes transmitter parameters in digital-analog radio-over-fiber systems from end-to-end SNR feedback alone.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
An RL-enabled agent architecture learns optimal values for rounding factor, scaling factor, geometric shaping factor, and pre-equalization tap coefficients directly from end-to-end SNR feedback, steadily improves SNR through sequential decisions, and outperforms baseline optimization by approximately 2.7 dB, reaching final SNRs of 35.8 dB, 42.9 dB, 53.8 dB, and 63.2 dB that support 1024-, 4096-, 16384-, and 65536-QAM, respectively, in 1- to 4-order DA-RoF experimental transmissions.
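As a quick arithmetic check on these figures, the sketch below converts the reported ~2.7 dB gain into a linear power ratio and lists the bits per symbol for each supported QAM order. The "implied baseline" column assumes the ~2.7 dB gain applies uniformly to every DA-RoF order, which the summary suggests but does not state per order; the variable names are illustrative only.

```python
import math

# Reported values from the summary: DA-RoF order -> final SNR (dB) and supported QAM.
final_snr_db = {1: 35.8, 2: 42.9, 3: 53.8, 4: 63.2}
qam_order = {1: 1024, 2: 4096, 3: 16384, 4: 65536}
gain_db = 2.7  # approximate reported gain over the baseline

# Convert the dB gain to a linear power ratio.
linear_gain = 10 ** (gain_db / 10)

def bits_per_symbol(m):
    """Bits carried by one symbol of a square M-QAM constellation."""
    return int(math.log2(m))

rows = []
for order in sorted(final_snr_db):
    # Assumption: the same ~2.7 dB gain applies to each order.
    implied_baseline_db = round(final_snr_db[order] - gain_db, 1)
    rows.append((order, qam_order[order],
                 bits_per_symbol(qam_order[order]), implied_baseline_db))

for row in rows:
    print(row)
print(f"linear power gain: {linear_gain:.3f}")
```

Each step up in DA-RoF order here corresponds to two more bits per symbol (10, 12, 14, 16), and a 2.7 dB gain is a little under a doubling of signal-to-noise power.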
What carries the argument
The RL agent that selects transmitter parameter adjustments at each step based solely on observed SNR reward.
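A minimal sketch of such a feedback loop follows. The page does not say which RL algorithm the paper uses, so tabular Q-learning with ε-greedy exploration is swapped in purely for illustration. The two-parameter grid, the toy `measure_snr` surface, and all hyperparameters are hypothetical stand-ins for the real hardware link.

```python
import random

# Hypothetical toy stand-in for the hardware link: SNR (dB) peaks at grid point (6, 3).
def measure_snr(params):
    rf, sf = params
    return 35.0 - 0.3 * (rf - 6) ** 2 - 0.5 * (sf - 3) ** 2

GRID = 10  # each parameter takes an index in 0..GRID-1
ACTIONS = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]  # no-op or a bounded +/-1 adjustment

def epsilon(step, total, start=1.0, end=0.05):
    """Linearly decay the exploration rate from `start` to `end` over `total` steps."""
    frac = min(step / total, 1.0)
    return start + frac * (end - start)

def apply(params, action):
    """Apply an adjustment and clip each parameter to the allowed grid."""
    return tuple(min(max(p + d, 0), GRID - 1) for p, d in zip(params, action))

def sign(x):
    return (x > 0) - (x < 0)

def train(episodes=300, horizon=20, alpha=0.2, gamma=0.9, seed=0):
    rng = random.Random(seed)
    q = {}  # (state, action index) -> estimated value
    total, step = episodes * horizon, 0
    for _ in range(episodes):
        params = (rng.randrange(GRID), rng.randrange(GRID))
        state = (params, 0)  # (parameter values, sign of the last SNR change)
        snr = measure_snr(params)
        for _ in range(horizon):
            if rng.random() < epsilon(step, total):
                a = rng.randrange(len(ACTIONS))  # explore
            else:
                a = max(range(len(ACTIONS)), key=lambda i: q.get((state, i), 0.0))
            new_params = apply(params, ACTIONS[a])
            new_snr = measure_snr(new_params)
            reward = new_snr - snr  # the only feedback is the scalar SNR
            new_state = (new_params, sign(reward))
            best_next = max(q.get((new_state, i), 0.0) for i in range(len(ACTIONS)))
            old = q.get((state, a), 0.0)
            q[(state, a)] = old + alpha * (reward + gamma * best_next - old)
            params, state, snr = new_params, new_state, new_snr
            step += 1
    return q

def greedy_rollout(q, start, horizon=30):
    """Follow the learned greedy policy from `start`; return final params and SNR."""
    params, state = start, (start, 0)
    snr = measure_snr(params)
    for _ in range(horizon):
        a = max(range(len(ACTIONS)), key=lambda i: q.get((state, i), 0.0))
        new_params = apply(params, ACTIONS[a])
        new_snr = measure_snr(new_params)
        state = (new_params, sign(new_snr - snr))
        params, snr = new_params, new_snr
    return params, snr

if __name__ == "__main__":
    q_table = train()
    print(greedy_rollout(q_table, start=(0, 0)))
```

Note that the agent never inspects `measure_snr` internally; it only calls it and observes the returned number, which is what "without a differentiable channel model" amounts to in this toy setting. A real system with four parameter groups, including multiple pre-equalization taps, would likely need function approximation rather than a table.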
If this is right
- Higher-order QAM formats become feasible in DA-RoF fronthaul without manual parameter tuning.
- Optimization can run online during operation instead of requiring offline grid searches.
- The same agent structure works across different transmission orders without redesign.
- Hardware efficiency improves because no differentiable channel model is needed.
Where Pith is reading between the lines
- The method could be tested on other optical links where multiple analog and digital parameters interact.
- If SNR feedback remains reliable under field conditions, the approach might reduce the need for periodic manual recalibration in deployed networks.
- Extending the reward signal to include latency or power metrics would be a direct next measurement.
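One simple way the last extension might look is a weighted reward that trades SNR against latency and power. This is a sketch under assumptions: the function name, the units, and both weights are hypothetical and would have to be chosen for a specific deployment.

```python
def composite_reward(snr_db, latency_us, power_mw, w_latency=0.01, w_power=0.001):
    """Reward that favors high SNR and penalizes latency and power.

    All weights and units are illustrative assumptions, not values from the paper.
    """
    return snr_db - w_latency * latency_us - w_power * power_mw

# Example: same SNR, but the second setting costs more latency and so scores lower.
print(composite_reward(40.0, 50.0, 200.0))
print(composite_reward(40.0, 100.0, 200.0))
```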
Load-bearing premise
That end-to-end SNR feedback by itself is enough for the agent to discover good parameter settings, and that the laboratory setup matches real-world fronthaul conditions.
What would settle it
Running the trained agent on a different fiber length, or with added impairments not present in the original experiments, and observing either no SNR gain over the baseline or a failure to support the claimed QAM orders.
Original abstract
Digital-analog radio-over-fiber (DA-RoF) has emerged as a promising fronthaul solution that combines the high spectral efficiency of analog transmission with the robustness of digital transmission. However, the performance of DA-RoF critically depends on several tightly coupled parameters, including the rounding factor (RF), scaling factor (SF), geometric shaping (GS) factor, and pre-equalization taps coefficients, which jointly affect quantization noise, nonlinear distortion, and bandwidth-induced inter-symbol interference (ISI). Conventional grid search-based optimization is computationally prohibitive and impractical for optical communication. In this work, we propose a reinforcement-learning (RL)-enabled DA-RoF fronthaul agent architecture, capable of autonomously learning optimal transmitter parameters from end-to-end signal-to-noise ratio (SNR) feedback without a differentiable channel model. Experimental results demonstrate that the trained agent steadily improves SNR through sequential decision making and outperforms baseline, achieving ~2.7-dB SNR improvement for 1- to 4-order DA-RoF transmission, reaching final SNR of 35.8 dB, 42.9 dB, 53.8 dB, and 63.2 dB and supporting 1024-, 4096-, 16384-, 65536-quadrature amplitude modulation (QAM) format, respectively. These results validate that the proposed RL-enabled framework provides online, scalable, and hardware-efficient parameter optimization for DA-RoF fronthaul systems, paving the way toward high-order modulation format and intelligent next-generation radio access networks.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes a reinforcement-learning (RL) agent for autonomous optimization of transmitter parameters (rounding factor RF, scaling factor SF, geometric shaping GS factor, and pre-equalization taps) in digital-analog radio-over-fiber (DA-RoF) fronthaul. The agent uses only scalar end-to-end SNR feedback without a differentiable channel model. The experiments report that the trained agent steadily improves SNR and outperforms the baseline by ~2.7 dB for 1- to 4-order DA-RoF transmission, reaching final SNRs of 35.8 dB, 42.9 dB, 53.8 dB, and 63.2 dB and supporting 1024-, 4096-, 16384-, and 65536-QAM, respectively.
Significance. If the experimental results hold under scrutiny, the work provides a practical demonstration of model-free RL for hardware-in-the-loop optimization of tightly coupled parameters affecting quantization noise, nonlinearity, and ISI in DA-RoF systems. This could enable scalable, online tuning for high-order modulation formats in next-generation fronthaul without exhaustive grid search or channel models, addressing a real deployment bottleneck.
major comments (2)
- [Abstract and results section] The central claim of reliable ~2.7 dB gain via sequential decision-making rests on the RL agent converging in the joint (RF, SF, GS, pre-eq) space using only scalar SNR. No details are supplied on state representation, action discretization/continuity, exploration schedule (e.g., ε-greedy or entropy), number of independent runs averaged, or learning curves, leaving open whether reported final SNRs reflect robust policy learning or favorable initialization.
- [Method and experimental setup sections] The assumption that end-to-end SNR feedback alone suffices to escape local maxima caused by the interplay of quantization noise, nonlinearity, and bandwidth-induced ISI is load-bearing but unsupported by any ablation, sensitivity analysis, or comparison against alternative optimizers (e.g., Bayesian optimization) that would confirm global optimality in the reported operating regime.
minor comments (1)
- [Abstract] The phrasing '1- to 4-order DA-RoF transmission' is ambiguous; clarify whether this refers to modulation order or transmission order.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on our manuscript. We address each major comment below and will revise the manuscript to incorporate additional details and analyses where appropriate.
- Referee: [Abstract and results section] The central claim of reliable ~2.7 dB gain via sequential decision-making rests on the RL agent converging in the joint (RF, SF, GS, pre-eq) space using only scalar SNR. No details are supplied on state representation, action discretization/continuity, exploration schedule (e.g., ε-greedy or entropy), number of independent runs averaged, or learning curves, leaving open whether reported final SNRs reflect robust policy learning or favorable initialization.
Authors: We agree that the current manuscript does not provide these implementation details. In the revised version, we will expand the Methods section to specify the state (a tuple of current parameter values plus recent SNR history), the action space (discrete, with bounded adjustments per parameter), the exploration schedule (ε-greedy with linear decay), and the averaging of results over 5 independent training runs with reported standard deviation. We will also include learning curves in a new figure demonstrating consistent convergence behavior across runs. These additions will substantiate that the reported gains arise from learned policies rather than initialization.
Revision: yes
- Referee: [Method and experimental setup sections] The assumption that end-to-end SNR feedback alone suffices to escape local maxima caused by the interplay of quantization noise, nonlinearity, and bandwidth-induced ISI is load-bearing but unsupported by any ablation, sensitivity analysis, or comparison against alternative optimizers (e.g., Bayesian optimization) that would confirm global optimality in the reported operating regime.
Authors: The manuscript presents empirical evidence of steady SNR improvement through sequential decisions but does not include ablations or optimizer comparisons. While the multi-order modulation results indicate effective navigation of the coupled parameter space, we will add a dedicated subsection with (i) sensitivity analysis varying one parameter at a time while holding others fixed and (ii) a direct comparison of the RL agent against Bayesian optimization under identical hardware-in-the-loop conditions. This will provide quantitative support for the sufficiency of scalar SNR feedback.
Revision: yes
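The one-parameter-at-a-time sensitivity analysis mentioned in the second response could be sketched roughly as follows. The `toy_snr` surface, the parameter names, and the sweep values are hypothetical placeholders for real hardware-in-the-loop measurements.

```python
def one_at_a_time(measure, operating_point, sweeps):
    """For each parameter, sweep its values while holding all others fixed.

    Returns {parameter name: [(value, measured SNR), ...]}.
    """
    results = {}
    for name, values in sweeps.items():
        curve = []
        for v in values:
            trial = dict(operating_point)  # copy so the operating point is not mutated
            trial[name] = v
            curve.append((v, measure(trial)))
        results[name] = curve
    return results

# Hypothetical toy SNR surface (dB) standing in for a real measurement.
def toy_snr(p):
    return 40.0 - 2.0 * (p["rf"] - 4) ** 2 - 0.5 * (p["sf"] - 1.0) ** 2

point = {"rf": 4, "sf": 1.0}
sweeps = {"rf": [2, 3, 4, 5, 6], "sf": [0.0, 0.5, 1.0, 1.5, 2.0]}
curves = one_at_a_time(toy_snr, point, sweeps)

# SNR spread over each sweep: a crude indicator of how sensitive SNR is to that parameter.
spread = {name: max(s for _, s in c) - min(s for _, s in c) for name, c in curves.items()}
print(spread)
```

A sweep of this kind only probes the SNR surface along axes through one operating point, so on its own it cannot reveal interactions between tightly coupled parameters.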
Circularity Check
No circularity; results are empirical RL training outcomes on hardware.
The paper reports experimental SNR measurements from an RL agent trained on end-to-end feedback in a DA-RoF setup. No derivation chain, equations, or predictions are presented that reduce to inputs by construction. Claims rest on observed performance gains (e.g., ~2.7 dB improvement) rather than any self-definitional, fitted-input, or self-citation load-bearing steps. This is a standard experimental validation with no mathematical reduction to inspect.