The Multipath Reliable Connection (MRC) Transport
Pith reviewed 2026-06-26 21:49 UTC · model grok-4.3
The pith
MRC extends RoCEv2 with explicit primitives for per-packet multipath and sender-based congestion control to deliver resilience over best-effort Ethernet.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
MRC extends RoCEv2 with explicit, composable primitives for per-packet multipath and sender-based congestion control, decouples packet delivery from semantic processing, adds multiple new capabilities for accelerated packet-loss recovery, and adds resilience against port and path failures.
What carries the argument
The MRC transport layer, which supplies the explicit multipath and congestion-control primitives on top of RoCEv2 while decoupling delivery from semantic processing.
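One way to picture the decoupling of delivery from semantic processing is a receiver that places each packet at its final offset as soon as it arrives, in any order, and only reports the message as usable once every packet is present. The sketch below is a toy illustration under that assumption, not MRC's actual mechanism; the class `ToyReceiver`, its fields, and the fixed-size packet layout are all hypothetical.

```python
class ToyReceiver:
    """Hypothetical receiver: places packets on arrival, completes once all are present."""

    def __init__(self, total_packets: int, packet_size: int) -> None:
        self.total_packets = total_packets
        self.packet_size = packet_size
        self.buffer = bytearray(total_packets * packet_size)
        self.received = set()  # sequence numbers already placed

    def deliver(self, seq: int, payload: bytes) -> None:
        """Delivery step: write the payload at its offset, whatever the arrival order."""
        if not 0 <= seq < self.total_packets:
            raise ValueError("sequence number out of range")
        if len(payload) != self.packet_size:
            raise ValueError("toy model assumes fixed-size packets")
        if seq in self.received:
            return  # duplicate, for example a retransmission: nothing to do
        start = seq * self.packet_size
        self.buffer[start:start + self.packet_size] = payload
        self.received.add(seq)

    def is_complete(self) -> bool:
        """Semantic step: the message is usable only once every packet has landed."""
        return len(self.received) == self.total_packets


if __name__ == "__main__":
    rx = ToyReceiver(total_packets=3, packet_size=2)
    for seq, payload in [(2, b"cc"), (0, b"aa"), (1, b"bb")]:  # out-of-order arrival
        rx.deliver(seq, payload)
    print(rx.is_complete(), bytes(rx.buffer))
```

Because placement does not depend on arrival order, packets that took different paths never have to be reordered before they are written; only the completion check looks at the message as a whole.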
If this is right
- Training jobs can continue without interruption when individual ports or paths fail.
- Loss recovery occurs faster than in standard RoCEv2 because of the added recovery primitives.
- Congestion control decisions are made at the sender rather than relying solely on receiver feedback, while path selection happens on a per-packet basis.
- The same connection can route packets across multiple paths without changing higher-level application semantics.
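The multipath and failure-resilience points above can be made concrete with a toy path selector. This is a minimal sketch that assumes simple round-robin spraying over the paths currently marked healthy; the source does not say how MRC picks paths or detects failures, and `ToySprayer`, `mark_failed`, and `pick_path` are hypothetical names.

```python
class ToySprayer:
    """Hypothetical per-packet path selector: round-robin over healthy paths."""

    def __init__(self, num_paths: int) -> None:
        if num_paths <= 0:
            raise ValueError("need at least one path")
        self.healthy = [True] * num_paths
        self.next_path = 0  # index tried first for the next packet

    def mark_failed(self, path: int) -> None:
        """Record that a path (or the port behind it) is no longer usable."""
        self.healthy[path] = False

    def pick_path(self) -> int:
        """Choose a path for one packet, skipping any path marked as failed."""
        n = len(self.healthy)
        for _ in range(n):
            candidate = self.next_path
            self.next_path = (self.next_path + 1) % n
            if self.healthy[candidate]:
                return candidate
        raise RuntimeError("no healthy paths left")


if __name__ == "__main__":
    sprayer = ToySprayer(num_paths=4)
    before = [sprayer.pick_path() for _ in range(4)]
    sprayer.mark_failed(1)  # simulate a path failure mid-connection
    after = [sprayer.pick_path() for _ in range(4)]
    print(before, after)
```

The connection keeps sending after `mark_failed`; the failed path simply drops out of the rotation, which is the property the bullets describe at the application level.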
Where Pith is reading between the lines
- MRC could allow AI clusters to use cheaper, non-lossless switches while still meeting training reliability targets.
- The decoupling of delivery from semantics may simplify future extensions such as in-network aggregation.
- Sender-based congestion control opens the possibility of tighter integration with application-level scheduling of collective operations.
- The approach could be tested by replaying failure traces from existing large Ethernet fabrics and measuring job completion times.
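The trace-replay idea in the last bullet can be sketched as a small tick-based simulation. It assumes each usable path carries one packet per tick and that an outage is a `(down_from, up_again)` tick interval per path; the function name, the trace format, and all numbers are made up for illustration and say nothing about MRC's real performance.

```python
def completion_ticks(total_packets, paths, outages, max_ticks=10_000):
    """Ticks needed to send total_packets over the given paths.

    paths:   path ids the connection may use (a single id models a single-path connection)
    outages: {path_id: (down_from_tick, up_again_tick)}, a hypothetical failure trace
    """
    sent = 0
    for tick in range(max_ticks):
        for path in paths:
            down_from, up_again = outages.get(path, (0, 0))
            if not (down_from <= tick < up_again):
                sent += 1  # this path is up, so it carries one packet this tick
        if sent >= total_packets:
            return tick + 1
    raise RuntimeError("transfer did not finish within max_ticks")


if __name__ == "__main__":
    trace = {0: (5, 25)}  # path 0 is down from tick 5 up to (not including) tick 25
    single = completion_ticks(100, paths=[0], outages=trace)
    multi = completion_ticks(100, paths=[0, 1, 2, 3], outages=trace)
    print(f"single-path: {single} ticks, multipath: {multi} ticks")
```

A real study would replace the synthetic `trace` with recorded failure events and the per-tick model with the actual transport, then compare job completion times against a RoCEv2 baseline.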
Load-bearing premise
The design can be realized as an open, production-grade implementation that delivers the claimed resilience and recovery benefits when deployed over real best-effort Ethernet fabrics in large AI clusters.
What would settle it
Measurements from a production-scale AI cluster deployment showing either no measurable improvement in packet-loss recovery time over standard RoCEv2 or continued outages under port and path failures would falsify the central claims.
Original abstract
MRC is an open, production-grade transport designed for large-scale AI/ML training over best-effort Ethernet. It extends RoCEv2 with explicit, composable primitives for per-packet multipath and sender-based congestion control, decouples packet delivery from semantic processing, adds multiple new capabilities for accelerated packet-loss recovery and adds resilience against port and path failures. This paper presents MRC and details its core capabilities and mechanisms.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces the Multipath Reliable Connection (MRC) transport as an open, production-grade protocol extending RoCEv2 for large-scale AI/ML training over best-effort Ethernet. It claims to add explicit, composable primitives for per-packet multipath and sender-based congestion control, to decouple packet delivery from semantic processing, to provide accelerated packet-loss recovery capabilities, and to deliver resilience against port and path failures. The paper presents MRC and details its core capabilities and mechanisms.
Significance. If the mechanisms can be shown to deliver the claimed resilience and recovery benefits at cluster scale, MRC would address important limitations of RoCEv2 in AI training fabrics and could influence future transport designs for high-performance Ethernet environments. The emphasis on composable primitives and decoupling is a potentially useful design direction.
major comments (1)
- [Abstract and overall manuscript (no evaluation section present)] The central claim that MRC is 'production-grade' and 'adds resilience against port and path failures' when deployed over real best-effort Ethernet fabrics in large AI clusters is unsupported. The manuscript details mechanisms but contains no implementation description, no open-source code or artifacts, no cluster-scale traces, no failure-injection results, and no quantitative comparisons against RoCEv2 baselines under the stated conditions. This absence directly undermines the production-grade and resilience assertions.
Simulated Author's Rebuttal
We thank the referee for the detailed feedback. We agree that the manuscript, as a design-focused paper, does not contain implementation details or empirical evaluations to substantiate the production-grade and resilience claims.
Point-by-point responses
- Referee: [Abstract and overall manuscript (no evaluation section present)] The central claim that MRC is 'production-grade' and 'adds resilience against port and path failures' when deployed over real best-effort Ethernet fabrics in large AI clusters is unsupported. The manuscript details mechanisms but contains no implementation description, no open-source code or artifacts, no cluster-scale traces, no failure-injection results, and no quantitative comparisons against RoCEv2 baselines under the stated conditions. This absence directly undermines the production-grade and resilience assertions.
  Authors: We agree with this assessment. The manuscript presents the protocol design, composable primitives, and mechanisms for multipath, congestion control, decoupled delivery, loss recovery, and path resilience, but provides no implementation description, artifacts, traces, or quantitative results. We will revise the abstract, introduction, and conclusion to remove the 'production-grade' descriptor and rephrase the resilience claims as design objectives and intended capabilities of the mechanisms rather than demonstrated outcomes in deployed clusters. The paper's scope is limited to specifying the transport extensions to RoCEv2.
  Revision: yes
Circularity Check
No circularity; design paper with no derivations or fitted predictions
Full rationale
The manuscript is a systems design paper that describes MRC as an extension to RoCEv2, listing explicit primitives for multipath, congestion control, decoupling, and recovery. No equations, first-principles derivations, parameter fitting, or 'predictions' appear in the provided abstract or text. The central claims concern protocol mechanisms and capabilities rather than any result that reduces to its own inputs by construction. No self-citation chains or uniqueness theorems are invoked as load-bearing steps. The paper is therefore self-contained as a descriptive presentation of a design; external validation of production-grade performance is a separate empirical question outside the scope of circularity analysis.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Best-effort Ethernet is the underlying network fabric for large-scale AI/ML training clusters
invented entities (1)
- MRC transport: no independent evidence