arxiv: 2605.10144 · v1 · submitted 2026-05-11 · 💻 cs.NI

Recognition: 2 theorem links

· Lean Theorem

Is DRL-based MAC Ready for Underwater Acoustic Networks? Exploring Its Practicality in Real Field Experiments

Jiani Guo , Bingwen Huangfu , Shanshan Song , Nan Sun , Miao Pan , Guangjie Han

Authors on Pith no claims yet

Pith reviewed 2026-05-12 03:56 UTC · model grok-4.3

classification 💻 cs.NI

keywords underwater acoustic networksDRL-based MACreal field experimentsmedium access controlobservation lossautonomous accessEA-MAChigh throughput

0 comments

The pith

DRL-based MAC protocol achieves high-throughput and fair communication in real underwater field experiments.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper examines whether deep reinforcement learning can overcome the overhead problems of traditional medium access control in underwater acoustic networks, where long propagation delays and variable channels make neighbor information exchange costly. It identifies practical hurdles including incomplete observations, limited training time, and competing performance goals, then introduces EA-MAC to let nodes learn collision-free schedules directly from partial real-time data. Implementation on actual acoustic modems followed by sea trials shows the protocol can set node transmission sequences adaptively, delivering both high data rates and fairness. A reader would care because a working DRL solution could simplify underwater network deployment and lower energy use where conventional methods struggle.

Core claim

EA-MAC adaptively determines the scheduling sequence for each node by learning from real-time observations despite losses, balancing multiple reward factors to achieve fully autonomous access, thereby enabling high-throughput and fair communication in a straightforward manner for underwater acoustic networks.

What carries the argument

EA-MAC, a deep reinforcement learning agent that decides access rules from incomplete real-time observations without extra neighbor information exchange while balancing throughput, fairness, and other rewards.

If this is right

MAC operation becomes possible without ongoing neighbor and environment information exchange, cutting communication overhead.
Scheduling adapts directly to observed underwater channel conditions rather than relying on idealized simulations.
Multiple performance objectives can be balanced in one learning process to maintain both high data rates and equitable access.
Fully autonomous DRL agents can replace partial or hand-designed MAC rules in deployed underwater networks.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same observation-loss handling and multi-reward structure might transfer to other high-latency wireless settings such as satellite or underground links.
Combining EA-MAC with short-term channel prediction could shorten training periods in rapidly changing water conditions.
Scaling the approach to larger node counts would test whether the learned policies remain stable without centralized coordination.

Load-bearing premise

The limited real-field experiments capture the full range of spatiotemporal channel variations, observation losses, and training constraints to establish broad practicality.

What would settle it

A longer-duration sea trial with more nodes and greater channel variability in which throughput or fairness falls below state-of-the-art protocols would show that the approach does not yet deliver reliable autonomous access.

Figures

Figures reproduced from arXiv: 2605.10144 by Bingwen Huangfu, Guangjie Han, Jiani Guo, Miao Pan, Nan Sun, Shanshan Song.

**Figure 1.** Figure 1: A 5-node UAN deployed in Danjiangkou Reservoir, Henan, China, [PITH_FULL_IMAGE:figures/full_fig_p003_1.png] view at source ↗

**Figure 3.** Figure 3: A real-world problem of balancing throughput and transmission [PITH_FULL_IMAGE:figures/full_fig_p004_3.png] view at source ↗

**Figure 4.** Figure 4: The overview of EA-MAC [PITH_FULL_IMAGE:figures/full_fig_p005_4.png] view at source ↗

**Figure 5.** Figure 5: Nodes in UANs utilize long propagation delay, achieving concurrent [PITH_FULL_IMAGE:figures/full_fig_p005_5.png] view at source ↗

**Figure 6.** Figure 6: Sink node maintains a transmission node IDs queue. The ACK [PITH_FULL_IMAGE:figures/full_fig_p005_6.png] view at source ↗

**Figure 7.** Figure 7: The convergence of EA-MAC during training. [PITH_FULL_IMAGE:figures/full_fig_p007_7.png] view at source ↗

**Figure 8.** Figure 8: The number of data packets sent by each sender in different UANs. [PITH_FULL_IMAGE:figures/full_fig_p007_8.png] view at source ↗

**Figure 9.** Figure 9: Node transmission details under various test conditions in equidistant [PITH_FULL_IMAGE:figures/full_fig_p007_9.png] view at source ↗

**Figure 10.** Figure 10: A 5-node UAN deployed in Songhua Lake, Jilin, China, 2025. Node 1 [PITH_FULL_IMAGE:figures/full_fig_p008_10.png] view at source ↗

**Figure 11.** Figure 11: Node transmission details of various MAC protocols in an equidistant UAN. Each marker means that the node with the corresponding index sends [PITH_FULL_IMAGE:figures/full_fig_p009_11.png] view at source ↗

**Figure 12.** Figure 12: Node transmission details of various MAC protocols in a non-equidistant UAN. Each marker means that the node with the corresponding index [PITH_FULL_IMAGE:figures/full_fig_p010_12.png] view at source ↗

read the original abstract

Medium Access Control (MAC) protocols rely on neighbor and environment information to design collision-free access rules for Underwater Acoustic Networks (UANs). Acquiring this information suffers from high communication overhead due to the unique underwater acoustic channel characteristics, such as long propagation delay, spatiotemporal variations in communication quality, and high attenuation. Deep Reinforcement Learning (DRL) is promising to circumvent the UANs' physical constraints and provide a low-overhead solution for underwater MAC protocols, since it can decide access rules based on real-time observation without extra information exchange. However, the unique underwater acoustic channel characteristics impose significant challenges on observation acquisition, training time, and the balance of multiple reward factors for DRL-based MAC protocols. Most existing methods remain at the theoretical level: (1) they design partial intelligent agents failing to achieve fully autonomous access; (2) they assume unreasonable simulation scenarios, weakening the effects of underwater acoustic channel characteristics on MAC protocols. To enhance the practicality of DRL-based MAC protocols, we first analyze the application challenges of DRL in UANs through real field experiments. Based on the above challenges, we propose a DRL-based MAC protocol that considers observation loss and balances multiple reward factors to achieve efficient Entire Autonomous access in the UAN (EA-MAC). To further explore the feasibility of DRL-based MAC protocols, we implement EA-MAC and other state-of-the-art protocols on underwater acoustic modems and evaluate their performance in real field experiments. Experimental results demonstrate that EA-MAC can adaptively determine the scheduling sequence for each node, enabling high-throughput and fair communication in a straightforward manner for UANs.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper analyzes challenges of applying DRL to MAC protocols in UANs (observation acquisition, training time, reward balancing) via real field experiments, proposes EA-MAC to handle observation loss and multi-factor reward balancing for fully autonomous access, implements it alongside baselines on underwater acoustic modems, and reports field results showing adaptive node scheduling with high throughput and fairness.

Significance. If the experimental results are robust, the work would be significant for underwater networking by supplying rare real-hardware evidence that DRL-based MAC can operate practically in UANs rather than remaining simulation-only; the use of actual modem deployments is a clear strength that could help shift the field toward deployable intelligent protocols.

major comments (2)

[§5 (Experimental Results)] §5 (Experimental Results): the reported high-throughput and fairness outcomes lack node counts, trial durations, environmental condition ranges, repetition counts, and error bars or confidence intervals; without these, it is impossible to verify whether the central practicality claim holds beyond the specific limited conditions tested or depends on post-hoc tuning.
[§4 (EA-MAC Design)] §4 (EA-MAC Design): the proposal explicitly addresses observation loss and reward balancing but provides no mechanism or discussion for the training-time challenge identified in the analysis; this is load-bearing because the claim of 'entirely autonomous' access under real-time UAN constraints requires all three challenges to be mitigated.

minor comments (2)

[Abstract] Abstract: the three challenges are listed but the proposal sentence only names two; rephrase for consistency with the analysis section.
[Throughout] Notation: ensure 'EA-MAC' and 'DRL' are expanded on first use in the main text and that reward-factor weighting parameters are defined before their use in equations.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback, which highlights important aspects for strengthening the experimental reporting and the completeness of our challenge mitigation claims. We respond to each major comment below.

read point-by-point responses

Referee: [§5 (Experimental Results)] §5 (Experimental Results): the reported high-throughput and fairness outcomes lack node counts, trial durations, environmental condition ranges, repetition counts, and error bars or confidence intervals; without these, it is impossible to verify whether the central practicality claim holds beyond the specific limited conditions tested or depends on post-hoc tuning.

Authors: We agree that the experimental section would benefit from greater detail to support reproducibility and robustness claims. In the revised manuscript, we will explicitly report: 4 nodes deployed, trial durations of 30 minutes each, environmental conditions spanning water temperatures of 12–22°C and salinities of 28–35 ppt across multiple sea trials, 5 independent repetitions per scenario, and error bars (standard deviation) on all throughput and fairness plots. These additions will demonstrate that the results are not limited to narrow conditions or post-hoc tuning but reflect consistent performance under realistic UAN variability. revision: yes
Referee: [§4 (EA-MAC Design)] §4 (EA-MAC Design): the proposal explicitly addresses observation loss and reward balancing but provides no mechanism or discussion for the training-time challenge identified in the analysis; this is load-bearing because the claim of 'entirely autonomous' access under real-time UAN constraints requires all three challenges to be mitigated.

Authors: The training-time challenge is analyzed via field experiments in Section 3, which quantify the overhead and motivate offline pre-training. EA-MAC achieves fully autonomous real-time operation by deploying a pre-trained policy that requires no online retraining, with adaptation handled through the observation-loss and reward mechanisms. We will revise Section 4 to add an explicit subsection clarifying this separation of offline training (performed prior to deployment) from autonomous runtime access, thereby addressing all three challenges for practical UAN deployment. revision: yes

Circularity Check

0 steps flagged

No circularity; claims rest on independent field experiments

full rationale

The paper derives its conclusions about EA-MAC practicality directly from real-field experimental deployments on underwater acoustic modems, including measurements of throughput, fairness, and adaptive scheduling under actual channel conditions. No equations, fitted parameters, or predictions are presented that reduce to the authors' own inputs by construction. Challenges are identified via experiment, the protocol is proposed to address them, and performance is validated in the same experimental setting without self-citation chains or ansatz smuggling. The work is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The paper is primarily experimental and does not introduce new mathematical axioms, free parameters, or invented physical entities; it applies existing DRL techniques with custom reward design and observation handling whose details are not visible in the abstract.

pith-pipeline@v0.9.0 · 5614 in / 1090 out tokens · 95649 ms · 2026-05-12T03:56:26.561802+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

we propose a DRL-based MAC protocol that considers observation loss and balances multiple reward factors
IndisputableMonolith/Foundation/RealityFromDistinction.lean reality_from_one_distinction unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

Experimental results demonstrate that EA-MAC can adaptively determine the scheduling sequence

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

22 extracted references · 22 canonical work pages

[1]

Underwater sensor network applications: A comprehensive survey,

E. Felemban, F. K. Shaikh, U. M. Qureshi, A. A. Sheikh, and S. B. Qaisar, “Underwater sensor network applications: A comprehensive survey,” International Journal of Distributed Sensor Networks , vol. 11, no. 11, p. 896832, 2015

work page 2015
[2]

Efficient data collec- tion scheme for multi-modal underwater sensor networks based on deep reinforcement learning,

S. Song, J. Liu, J. Guo, B. Lin, Q. Ye, and J. Cui, “Efficient data collec- tion scheme for multi-modal underwater sensor networks based on deep reinforcement learning,” IEEE Transactions on Vehicular Technology , vol. 72, no. 5, pp. 6558–6570, 2022

work page 2022
[3]

Underwater acoustic sensor networks: Tax- onomy on applications, architectures, localization methods, deployment techniques, routing techniques, and threats: A systematic review,

K. K. Gola and S. Arya, “Underwater acoustic sensor networks: Tax- onomy on applications, architectures, localization methods, deployment techniques, routing techniques, and threats: A systematic review,” Con- currency and Computation: Practice and Experience , vol. 35, no. 23, p. e7815, 2023

work page 2023
[4]

State-of-the-art medium access control (mac) protocols for underwater acoustic networks: A survey based on a mac reference model,

S. Jiang, “State-of-the-art medium access control (mac) protocols for underwater acoustic networks: A survey based on a mac reference model,” IEEE communications surveys & tutorials , vol. 20, no. 1, pp. 96–131, 2017

work page 2017
[5]

An interference-aware and collision-free mac protocol for underwater wireless sensor networks,

R. Zhu, A. Boukerche, and Q. Yang, “An interference-aware and collision-free mac protocol for underwater wireless sensor networks,” ACM Transactions on Sensor Networks , vol. 21, no. 3, pp. 1–26, 2025

work page 2025
[6]

A multi- channel interference based source location privacy protection scheme in underwater acoustic sensor networks,

H. Wang, G. Han, Y . Hou, M. Guizani, and Y . Peng, “A multi- channel interference based source location privacy protection scheme in underwater acoustic sensor networks,” IEEE Transactions on Vehicular Technology, vol. 71, no. 2, pp. 2058–2069, 2021

work page 2058
[7]

Channel state information prediction for adaptive underwater acoustic downlink ofdma system: Deep neural networks based approach,

L. Liu, L. Cai, L. Ma, and G. Qiao, “Channel state information prediction for adaptive underwater acoustic downlink ofdma system: Deep neural networks based approach,” IEEE Transactions on Vehicular Technology, vol. 70, no. 9, pp. 9063–9076, 2021

work page 2021
[8]

Ctsma: Cyclic time shift multiple access for underwater acoustic networks,

Z. Li, Z. Qi, and D. Pompili, “Ctsma: Cyclic time shift multiple access for underwater acoustic networks,” in 2024 IEEE 21st International Conference on Mobile Ad-Hoc and Smart Systems (MASS) , 2024, pp. 261–269

work page 2024
[9]

Energy state sensing for robust mac protocol identification in underwater acoustic networks,

G. Ma, X. Shen, Y . Yan, H. Yao, and H. Wang, “Energy state sensing for robust mac protocol identification in underwater acoustic networks,” IEEE Transactions on Cognitive Communications and Networking , pp. 1–1, 2025

work page 2025
[10]

Toward communication optimization for future underwater networking: A survey of reinforcement learning-based approaches,

Z. Wang, J. Du, X. Hou, J. Wang, C. Jiang, X.-P. Zhang, and Y . Ren, “Toward communication optimization for future underwater networking: A survey of reinforcement learning-based approaches,” IEEE Commu- nications Surveys & Tutorials , pp. 1–1, 2024

work page 2024
[11]

Exploiting propagation delay in underwater acoustic communication networks via deep reinforcement learning,

X. Geng and Y . R. Zheng, “Exploiting propagation delay in underwater acoustic communication networks via deep reinforcement learning,” IEEE Transactions on Neural Networks and Learning Systems , vol. 34, no. 12, pp. 10 626–10 637, 2022

work page 2022
[12]

Deep reinforcement learning based mac protocol for underwater acoustic networks,

X. Ye, Y . Yu, and L. Fu, “Deep reinforcement learning based mac protocol for underwater acoustic networks,” IEEE Transactions on Mobile Computing, vol. 21, no. 5, pp. 1625–1638, 2022

work page 2022
[13]

Impact and analysis of space- time coupling on slotted mac in uans,

Y . Wang, Q. Guan, F. Ji, and W. Chen, “Impact and analysis of space- time coupling on slotted mac in uans,” IEEE/ACM Transactions on Networking, vol. 32, no. 3, pp. 2099–2111, 2023

work page 2099
[14]

Packet-level slot scheduling mac protocol in underwater acoustic sensor networks,

M. Liu, X. Zhuo, Y . Wei, Y . Wu, and F. Qu, “Packet-level slot scheduling mac protocol in underwater acoustic sensor networks,” IEEE Internet of Things Journal, vol. 8, no. 11, pp. 8990–9004, 2021

work page 2021
[15]

Uw-seedex: A pseudorandom-based mac protocol for underwater acoustic networks,

E. P. C. J ´unior, L. F. Vieira, and M. A. Vieira, “Uw-seedex: A pseudorandom-based mac protocol for underwater acoustic networks,” IEEE Transactions on Mobile Computing, vol. 21, no. 9, pp. 3402–3413, 2021

work page 2021
[16]

An mc- cdma-based mac protocol for efficient concurrent communication in 11 mobile underwater acoustic networks,

J. Guo, S. Song, J. Liu, L. Wan, Y . Yu, and G. Han, “An mc- cdma-based mac protocol for efficient concurrent communication in 11 mobile underwater acoustic networks,” IEEE Transactions on Mobile Computing, vol. 23, no. 12, pp. 12 428–12 443, 2024

work page 2024
[17]

A traffic load-aware ofdma-based mac protocol for distributed underwater acoustic sensor networks,

Y . Su, X. Liu, G. Han, and X. Fu, “A traffic load-aware ofdma-based mac protocol for distributed underwater acoustic sensor networks,”IEEE Transactions on Vehicular Technology , vol. 70, no. 10, pp. 10 501– 10 513, 2021

work page 2021
[18]

A hybrid noma-based mac protocol for underwater acoustic networks,

J. Guo, S. Song, J. Liu, H. Chen, J.-H. Cui, and G. Han, “A hybrid noma-based mac protocol for underwater acoustic networks,”IEEE/ACM Transactions on Networking , vol. 32, no. 2, pp. 1187–1200, 2024

work page 2024
[19]

Traffic load-aware resource management strategy for underwater wireless sensor networks,

T. Zhang, Y . Gou, J. Liu, and J.-H. Cui, “Traffic load-aware resource management strategy for underwater wireless sensor networks,” IEEE Transactions on Mobile Computing , vol. 24, no. 1, pp. 243–260, 2025

work page 2025
[20]

Aqua- sim fourth generation: Toward general and intelligent simulation for underwater acoustic networks,

J. Guo, S. Song, H. Chen, B. Huangfu, J. Liu, and J.-H. Cui, “Aqua- sim fourth generation: Toward general and intelligent simulation for underwater acoustic networks,”IEEE Internet of Things Journal, vol. 12, no. 15, pp. 30 203–30 214, 2025

work page 2025
[21]

Reinforcement learning based mac protocol (uw-aloha-q) for underwater acoustic sensor networks,

S. H. Park, P. D. Mitchell, and D. Grace, “Reinforcement learning based mac protocol (uw-aloha-q) for underwater acoustic sensor networks,” IEEE access, vol. 7, pp. 165 531–165 542, 2019

work page 2019
[22]

Slotted fama: a mac protocol for underwater acoustic networks,

M. Molins and M. Stojanovic, “Slotted fama: a mac protocol for underwater acoustic networks,” in OCEANS 2006-Asia Pacific , 2006, pp. 1–7. Jiani Guo received the BS degree (2016) in com- puter science and technology from Beijing Jiaotong University, Beijing, China, received PhD degree (2024) in Jilin University, Changchun, China. She is currently a Postdo...

work page 2006