6G Communication Networks Enabling Embodied Agents: Architecture and Prototype

Kun Yang; Lipeng Dai; Luping Xiang

arXiv: 2605.23263 · v1 · pith:7D232ROTnew · submitted 2026-05-22 · cs.RO · cs.AI · cs.SY · eess.SP · eess.SY

Pith reviewed 2026-05-25 04:27 UTC · model grok-4.3

classification cs.RO · cs.AI · cs.SY · eess.SP · eess.SY

keywords embodied agents · 6G networks · human-robot interaction · O-RAN · remote interaction · network architecture · prototype · latency

The pith

A four-layer architecture lets 6G networks support remote human-robot interaction with millisecond latency.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper sets out to show that embodied agents, which link decision-making to physical movement, need communication systems far stricter than those for software agents alone. It argues that 6G features such as low latency and integrated sensing can meet these needs through a specific hierarchical setup that separates human intent sensing, network transport, intelligent processing, and physical embodiment. The authors test this by building a working prototype that connects a haptic device to an industrial robot arm over a 5G O-RAN testbed and records stable closed-loop performance at millisecond delays. A sympathetic reader would care because successful remote control of physical robots could open practical uses in industry, medicine, and exploration where direct human presence is impossible or unsafe.

Core claim

Embodied agents impose heterogeneous and stringent communication demands that 6G networks can address through a hierarchical architecture of four layers: a human-intent perception layer, an O-RAN-based transport layer, an intelligent intermediary layer, and an embodiment layer. The authors implement an end-to-end prototype using a haptic device, industrial robotic arm, intermediary platform, and 5G O-RAN testbed, which achieves millisecond-level latency and stable closed-loop operation, thereby confirming the architecture's practicality for future 6G deployments.

What carries the argument

The four-layer hierarchical communication architecture that separates human-intent perception, O-RAN transport, intelligent intermediary processing, and physical embodiment to meet embodied-agent requirements.
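The layered separation can be sketched as a chain of small stages, one per layer. This is only an illustrative sketch, not the authors' implementation: the `Message` type, the function names, the pass-through transport stage, and the workspace clamp in the intermediary stage are all hypothetical stand-ins chosen to show how a command might cross the four layers in order.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class Message:
    """A command travelling through the layers (hypothetical format)."""
    position: Vec3
    trace: List[str] = field(default_factory=list)


def human_intent_perception(msg: Message) -> Message:
    # Assumption: the raw haptic sample is taken as the operator's intent.
    msg.trace.append("human-intent perception")
    return msg


def oran_transport(msg: Message) -> Message:
    # Assumption: stands in for the network hop; the payload is unchanged.
    msg.trace.append("O-RAN transport")
    return msg


def intelligent_intermediary(msg: Message, limit: float = 0.5) -> Message:
    # Assumption: the intermediary clamps each coordinate to a workspace
    # bound. The source does not specify what processing this layer does.
    msg.position = tuple(max(-limit, min(limit, c)) for c in msg.position)
    msg.trace.append("intelligent intermediary")
    return msg


def embodiment(msg: Message) -> Message:
    # Assumption: the robot would consume msg.position here.
    msg.trace.append("embodiment")
    return msg


PIPELINE: List[Callable[[Message], Message]] = [
    human_intent_perception,
    oran_transport,
    intelligent_intermediary,
    embodiment,
]


def run_pipeline(position: Vec3) -> Message:
    """Pass one command through the four layers in order."""
    msg = Message(position=position)
    for layer in PIPELINE:
        msg = layer(msg)
    return msg
```

Calling `run_pipeline((0.1, 0.9, -2.0))` returns a message whose `trace` lists the four layers in order and whose out-of-range coordinates have been clamped by the hypothetical intermediary stage.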

If this is right

6G enablers such as sub-millisecond latency and native intelligence can directly satisfy the communication needs of human-robot remote interaction.
Embodied agents can in turn improve network performance by extending coverage, providing environmental sensing, and adding physical-world understanding.
The demonstrated prototype supplies a concrete reference design for industrial and research deployments of 6G-enabled embodied systems.
The architecture supports stable operation even when intent must cross from human perception to robotic actuation in real time.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same layered separation could be adapted to other physical agents such as autonomous vehicles or wearable exoskeletons that also require tight human-machine loops.
Once real 6G hardware exists, repeating the prototype test would show whether the added sensing and intelligence layers produce measurable gains over the 5G baseline.
If the intermediary layer can be made fully distributed, the architecture might reduce single points of failure in large-scale robot fleets.

Load-bearing premise

Results from a current 5G O-RAN testbed can stand in for the performance and sensing capabilities that future 6G networks will actually deliver.

What would settle it

A measurement on the prototype or a follow-on test that shows latency or reliability falling outside the range needed for stable closed-loop robot control when 6G-specific sensing or intelligence features are added or removed.
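One way to frame such a test is as a deadline-miss check over recorded loop latencies. The sketch below assumes a per-sample latency log in milliseconds and two thresholds, `deadline_ms` and `max_miss_ratio`. Both thresholds are hypothetical parameters, since the source does not state the range required for stable closed-loop control.

```python
from typing import Sequence


def deadline_miss_ratio(samples_ms: Sequence[float], deadline_ms: float) -> float:
    """Fraction of latency samples that exceed the control-loop deadline."""
    if not samples_ms:
        raise ValueError("need at least one latency sample")
    misses = sum(1 for s in samples_ms if s > deadline_ms)
    return misses / len(samples_ms)


def within_budget(
    samples_ms: Sequence[float],
    deadline_ms: float,
    max_miss_ratio: float,
) -> bool:
    """True if the share of late samples stays at or below the allowed ratio."""
    return deadline_miss_ratio(samples_ms, deadline_ms) <= max_miss_ratio
```

Running the same check with a 6G-specific feature enabled and disabled, on the same deadline, would give a like-for-like comparison of the kind described above.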

Figures

Figures reproduced from arXiv: 2605.23263 by Kun Yang, Lipeng Dai, Luping Xiang.

**Figure 2.** Hierarchical architecture for 6G-enabled human-robot remote in…

**Figure 3.** Prototype for human-robotic arm remote interaction. A user at area 1 wants to control a robotic arm at area 2 via the…

**Figure 4.** Experimental comparison of network latency and jitter. All network…

Original abstract

Embodied agents, which couple intelligent decision-making with physical actuation in the real world, impose far more stringent and heterogeneous communication requirements than purely software-based agents. While 6G promises sub-millisecond latency, ultra-high reliability, native intelligence, and integrated sensing, systematic studies on how to exploit these capabilities for embodied agent communication remain limited. This article investigates 6G-enabled communication systems for embodied agents from both conceptual and engineering perspectives. First, we review the concept, embodiment value of embodied agents, and clarify their distinctions from disembodied agents. Then, we analyse the symbiotic relationship between embodied agents and 6G networks. We highlight how key 6G enablers can support the stringent requirements of human-robot interaction. Furthermore, we demonstrate the proactive role of embodied agents in bolstering communication networks through coverage extension, environmental sensing, and physical world understanding. Building on these insights, we propose a hierarchical communication architecture for human-robot remote interaction, comprising a human-intent perception layer, an open radio access network (O-RAN)-based transport layer, an intelligent intermediary layer, and an embodiment layer. To validate its feasibility, we implement an end-to-end prototype that integrates a haptic device, an industrial robotic arm, an intermediary platform, and a 5G O-RAN testbed. Experimental results demonstrate millisecond-level latency and stable closed-loop operation, confirming the practicality of the proposed architecture and providing a reference for future 6G-embodied agent research and industrial deployments.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The 5G O-RAN prototype demonstrates workable low-latency control but does not test the 6G-specific features the architecture is built around.

The paper reviews how embodied agents differ from software ones, maps 6G enablers like sub-ms latency and integrated sensing onto human-robot needs, and proposes a four-layer stack: human-intent perception, O-RAN transport, intelligent intermediary, and embodiment. It then shows an end-to-end prototype with a haptic device, industrial arm, and 5G O-RAN testbed that runs closed-loop at millisecond latency with stable performance. That prototype is the concrete piece; the rest is synthesis of existing concepts applied to this setting. The architecture itself is a reasonable engineering breakdown and the reported latency numbers give a practical data point for remote manipulation work.

The main limitation is that the validation stays on 5G hardware. The abstract and claim treat the 5G results as confirmation for the 6G architecture, yet the testbed does not include or emulate the native intelligence or sensing integration that the paper itself flags as 6G differentiators. So the results speak more to current 5G feasibility than to the future 6G case. No equations or fitted models appear, and the citation pattern is not visible here, but the logic holds together without obvious circularity.

This is useful reading for people already working on robotics communication stacks who want a concrete reference architecture and latency numbers. It is not a foundational shift. A serious editor should send it to review; the prototype supplies something referees can evaluate even if the 6G extrapolation needs tightening.

Referee Report

2 major / 1 minor

Summary. The paper reviews embodied agents and their distinctions from disembodied agents, analyzes the symbiotic relationship between embodied agents and 6G networks (including how 6G enablers like sub-ms latency, native intelligence, and integrated sensing support human-robot interaction requirements, and how agents can enhance networks via sensing and coverage), proposes a four-layer hierarchical architecture (human-intent perception layer, O-RAN-based transport layer, intelligent intermediary layer, embodiment layer) for remote interaction, and validates feasibility with an end-to-end 5G O-RAN prototype (haptic device + industrial robotic arm + intermediary + 5G testbed) that achieves millisecond-level latency and stable closed-loop operation.

Significance. If the central validation claim holds after addressing the 5G-to-6G gap, the work would offer a useful conceptual framework linking 6G capabilities to embodied systems and an initial engineering reference for remote human-robot setups, with potential to guide future deployments in robotics and wireless networks.

major comments (2)

[Abstract] The claim that the 5G O-RAN prototype 'confirm[s] the practicality of the proposed architecture' (a 6G architecture relying on sub-millisecond latency, integrated sensing and communication (ISAC), and native intelligence) is not supported by the reported results, which only demonstrate current 5G feasibility; the testbed description provides no evidence of emulating or incorporating 6G-specific features, making the inference from 5G metrics to 6G practicality a load-bearing gap.
[Prototype validation section] Near the end of the manuscript, the experimental setup integrates a 5G testbed, but the architecture section emphasizes 6G enablers; without additional analysis or emulation showing how the four-layer design would exploit or require those 6G features (e.g., integrated sensing for the embodiment layer), the stable closed-loop results do not confirm the 6G-specific practicality asserted in the introduction and conclusion.

minor comments (1)

[Abstract] The abstract and introduction use 'millisecond-level latency' without specifying measured values, variance, or comparison to baselines; adding quantitative details (e.g., mean latency and jitter from the haptic-arm loop) would improve clarity.
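The kind of quantitative detail requested here can be derived from a log of per-sample loop latencies. The following is a minimal sketch under stated assumptions: latencies are given in milliseconds, and jitter is taken as the mean absolute difference between consecutive samples, which is one common definition and not necessarily the one used in the paper. The function name and output keys are hypothetical.

```python
import statistics
from typing import Dict, Sequence


def latency_summary(samples_ms: Sequence[float]) -> Dict[str, float]:
    """Summarize a sequence of latency samples given in milliseconds."""
    if len(samples_ms) < 2:
        raise ValueError("need at least two samples")
    # Jitter (assumed definition): mean absolute change between
    # consecutive latency samples.
    diffs = [abs(b - a) for a, b in zip(samples_ms, samples_ms[1:])]
    return {
        "mean_ms": statistics.fmean(samples_ms),
        "stdev_ms": statistics.stdev(samples_ms),  # sample standard deviation
        "jitter_ms": statistics.fmean(diffs),
        "max_ms": max(samples_ms),
    }
```

For the made-up input `[4.0, 6.0, 5.0, 5.0]` (not data from the paper), the summary gives a mean of 5.0 ms, a jitter of 1.0 ms, and a maximum of 6.0 ms.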

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. The comments correctly identify that the prototype validation uses a 5G testbed and does not directly demonstrate 6G-specific capabilities. We address each point below and will revise the manuscript accordingly.

Referee: [Abstract] The claim that the 5G O-RAN prototype 'confirm[s] the practicality of the proposed architecture' (a 6G architecture relying on sub-millisecond latency, integrated sensing and communication (ISAC), and native intelligence) is not supported by the reported results, which only demonstrate current 5G feasibility; the testbed description provides no evidence of emulating or incorporating 6G-specific features, making the inference from 5G metrics to 6G practicality a load-bearing gap.

Authors: We agree that the abstract overstates the direct confirmation of 6G practicality. The prototype demonstrates millisecond-level latency and stable closed-loop control on a 5G O-RAN testbed, validating the core four-layer hierarchy (human-intent perception, O-RAN transport, intelligent intermediary, embodiment) under current network conditions. We will revise the abstract to state that the results confirm feasibility of the architecture using 5G technology as a practical baseline and reference for future 6G implementations, rather than claiming direct confirmation of 6G-specific features. We will also add a brief forward-looking sentence on how the layers map to 6G enablers.
revision: yes

Referee: [Prototype validation section] Near the end of the manuscript, the experimental setup integrates a 5G testbed, but the architecture section emphasizes 6G enablers; without additional analysis or emulation showing how the four-layer design would exploit or require those 6G features (e.g., integrated sensing for the embodiment layer), the stable closed-loop results do not confirm the 6G-specific practicality asserted in the introduction and conclusion.

Authors: This observation is accurate. The reported experiments use 5G hardware and do not emulate sub-ms latency, ISAC, or native intelligence. We will revise the prototype section to explicitly qualify the 5G implementation, add a short analysis paragraph mapping each architecture layer to anticipated 6G capabilities (e.g., how integrated sensing could enhance the embodiment layer), and adjust the introduction and conclusion to frame the prototype as an initial engineering reference rather than a direct 6G validation. These changes will be made without altering the experimental data.
revision: yes

Circularity Check

0 steps flagged

No circularity: conceptual proposal validated by independent prototype

The paper contains no equations, fitted parameters, or derivations. It proposes a four-layer architecture conceptually and reports empirical results from a 5G O-RAN prototype (haptic device + robotic arm). No step reduces a claimed prediction or result to its own inputs by construction, self-definition, or self-citation chain. The 5G-to-6G extrapolation is an external modeling assumption, not a circular reduction within the paper's logic. The derivation chain is self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The abstract introduces no mathematical derivations, fitted parameters, or new postulated entities; all content is descriptive or engineering-oriented.

pith-pipeline@v0.9.0 · 5813 in / 1233 out tokens · 32495 ms · 2026-05-25T04:27:20.484740+00:00 · methodology

Reference graph

Works this paper leans on

15 extracted references · 15 canonical work pages

[1] F. Jiang, C. Pan, K. Wang, P. Michiardi, O. A. Dobre, and M. Debbah, “From Large AI Models to Agentic AI: A Tutorial on Future Intelligent Communications,” IEEE Journal on Selected Areas in Communications, vol. 44, pp. 3507–3540, 2026.

[2] F. Jiang, C. Pan, L. Dong, K. Wang, M. Debbah, D. Niyato, and Z. Han, “A Comprehensive Survey of Large AI Models for Future Communications: Foundations, Applications, and Challenges,” IEEE Communications Surveys & Tutorials, vol. 28, pp. 4731–4764, 2026.

[3] S. T. Arzo, D. Scotece, R. Bassoli, F. Granelli, L. Foschini, and F. H. Fitzek, “A New Agent-Based Intelligent Network Architecture,” IEEE Communications Standards Magazine, vol. 6, no. 4, pp. 74–79, 2022.

[4] F. Jiang, Y. Peng, L. Dong, K. Wang, K. Yang, C. Pan, D. Niyato, and O. A. Dobre, “Large Language Model Enhanced Multi-Agent Systems for 6G Communications,” IEEE Wireless Communications, vol. 31, no. 6, pp. 48–55, 2024.

[5] M. Xu, D. Niyato, J. Kang, Z. Xiong, S. Mao, Z. Han, D. I. Kim, and K. B. Letaief, “When Large Language Model Agents Meet 6G Networks: Perception, Grounding, and Alignment,” IEEE Wireless Communications, vol. 31, no. 6, pp. 63–71, 2024.

[6] Y. Xiao, G. Shi, and P. Zhang, “Toward Agentic AI Networking in 6G: A Generative Foundation Model-as-Agent Approach,” IEEE Communications Magazine, vol. 63, no. 9, pp. 68–74, 2025.

[7] L. Bariah and M. Debbah, “AI Embodiment Through 6G: Shaping the Future of AGI,” IEEE Wireless Communications, vol. 31, no. 5, pp. 174–181, 2024.

[8] P. Fung, Y. Bachrach, A. Celikyilmaz, K. Chaudhuri, D. Chen, W. Chung, E. Dupoux, H. Gong, H. Jégou, A. Lazaric, A. Majumdar, A. Madotto, F. Meier, F. Metze, L.-P. Morency, T. Moutakanni, J. Pino, B. Terver, J. Tighe, P. Tomasello, and J. Malik, “Embodied AI Agents: Modeling the World,” arXiv e-prints, p. arXiv:2506.22355, Jun. 2025.

[9] J. Hale, L. Schweitzer, and J. Gratch, “Pitfalls of Embodiment in Human-Agent Experiment Design,” in Proceedings of the 24th ACM International Conference on Intelligent Virtual Agents, ser. IVA ’24. New York, NY, USA: Association for Computing Machinery, 2024. [Online]. Available: https://doi.org/10.1145/3652988.3673958

[10] B. C. Jung, “Toward Artificial Intelligence-Native 6G Services [Mobile Radio],” IEEE Vehicular Technology Magazine, vol. 19, no. 4, pp. 9–14, 2024.

[11] A. M. Ibrahim, R. Nordin, Y. S. M. Khamayseh, A. Amphawan, and M. Basheer Jasser, “URLLC for 6G Enabled Industry 5.0: A Taxonomy of Architectures, Cross Layer Techniques, and Time Critical Applications,” arXiv e-prints, p. arXiv:2510.08080, Oct. 2025.

[12] C.-X. Wang, X. You, X. Gao, X. Zhu, Z. Li, C. Zhang, H. Wang, Y. Huang, Y. Chen, H. Haas, J. S. Thompson, E. G. Larsson, M. D. Renzo, W. Tong, P. Zhu, X. Shen, H. V. Poor, and L. Hanzo, “On the Road to 6G: Visions, Requirements, Key Technologies, and Testbeds,” IEEE Communications Surveys & Tutorials, vol. 25, no. 2, pp. 905–974, 2023.

[13] K. Zanbouri, M. Noor-A-Rahim, J. John, C. J. Sreenan, H. Vincent Poor, and D. Pesch, “A Comprehensive Survey of Wireless Time-Sensitive Networking (TSN): Architecture, Technologies, Applications, and Open Issues,” IEEE Communications Surveys & Tutorials, vol. 27, no. 4, pp. 2129–2155, 2025.

[14] W. Saad, O. Hashash, C. K. Thomas, C. Chaccour, M. Debbah, N. Mandayam, and Z. Han, “Artificial General Intelligence (AGI)-Native Wireless Systems: A Journey Beyond 6G,” Proceedings of the IEEE, vol. 113, no. 9, pp. 849–887, 2025.

[15] A. S. Abdalla, P. S. Upadhyaya, V. K. Shah, and V. Marojevic, “Toward Next Generation Open Radio Access Networks: What O-RAN Can and Cannot Do!” IEEE Network, vol. 36, no. 6, pp. 206–213, 2022.
