pith. machine review for the scientific record.

arxiv: 2604.15261 · v2 · submitted 2026-04-16 · 💻 cs.NI

Recognition: 2 theorem links


RNG: Flat Datacenter Networks at Scale

Authors on Pith: no claims yet

Pith reviewed 2026-05-11 00:55 UTC · model grok-4.3

classification 💻 cs.NI
keywords datacenter networks · random graphs · network topology · distributed routing · fat trees · flat networks · optical cabling · scalability

The pith

RNG builds flat datacenter networks from quasi-random graphs that match fat-tree performance at up to 45 percent lower cost.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces RNG as a practical flat network architecture for large datacenters. It relies on quasi-random graphs instead of hierarchical fat trees, paired with a distributed routing protocol that locates many edge-disjoint paths and a passive optical device that internally shuffles cables. The design targets equivalent or superior throughput and fault tolerance across common traffic patterns while reducing hardware expense and cabling effort. Adoption as the default network for most workloads at Amazon indicates the approach has been validated in production at scale.

Core claim

RNG realizes the first flat datacenter network at production scale by using a quasi-random graph topology. A new distributed routing protocol exploits the expansion properties of random graphs to discover a large number of edge-disjoint paths between any pair of endpoints. A passive optical shuffling device reduces cabling complexity to levels comparable with fat trees. Measurements show that RNG matches or exceeds fat-tree performance for a range of traffic patterns while costing up to 45 percent less, and the topology is now the default choice for most workloads at Amazon.

What carries the argument

Quasi-random graph topology with a distributed routing protocol that finds edge-disjoint paths and a passive optical cable-shuffling device.
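The path-diversity property this machinery leans on can be sketched numerically. The snippet below is an editorial illustration, not the paper's Spraypoint protocol: it builds a small random d-regular graph (configuration model) and counts edge-disjoint paths between two endpoints as a unit-capacity max flow, which equals the edge-disjoint path count by Menger's theorem. For random regular graphs the count is typically the maximum possible, d.

```python
import random

def random_regular_graph(n, d, seed=0):
    """Configuration-model pairing; retry until the graph is simple."""
    rng = random.Random(seed)
    while True:
        stubs = [v for v in range(n) for _ in range(d)]
        rng.shuffle(stubs)
        edges, simple = set(), True
        for i in range(0, len(stubs), 2):
            a, b = stubs[i], stubs[i + 1]
            if a == b or (a, b) in edges or (b, a) in edges:
                simple = False
                break
            edges.add((a, b))
        if simple:
            adj = {v: set() for v in range(n)}
            for a, b in edges:
                adj[a].add(b)
                adj[b].add(a)
            return adj

def edge_disjoint_paths(adj, s, t):
    """Menger: edge-disjoint path count = unit-capacity max flow (BFS augmentation)."""
    cap = {(v, w): 1 for v in adj for w in adj[v]}  # each undirected edge -> two arcs
    count = 0
    while True:
        parent, queue = {s: None}, [s]
        while queue and t not in parent:        # BFS for an augmenting path
            v = queue.pop(0)
            for w in adj[v]:
                if w not in parent and cap[(v, w)] > 0:
                    parent[w] = v
                    queue.append(w)
        if t not in parent:
            return count
        v = t
        while parent[v] is not None:            # push one unit of flow back along the path
            u = parent[v]
            cap[(u, v)] -= 1
            cap[(v, u)] += 1
            v = u
        count += 1

adj = random_regular_graph(64, 4, seed=1)
print(edge_disjoint_paths(adj, 0, 32))  # capped at d = 4; typically exactly d
```

The degree d bounds the count (the cut around either endpoint has d edges); the paper's observation is that random graphs nearly always achieve that bound.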

If this is right

  • Flat topologies become deployable at hyperscale with cabling effort similar to fat trees.
  • Network hardware costs drop substantially while preserving or improving performance for typical datacenter traffic.
  • Routing decisions can remain fully distributed yet still exploit the path diversity inherent in random graphs.
  • Production operators can shift away from hierarchical designs without sacrificing fault tolerance.
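The third point can be made concrete with standard ECMP-style hashing, which is stateless and fully distributed. This sketch shows generic ECMP behavior, not the paper's load-balancing mechanism: hashing synthetic flow keys onto k candidate paths spreads load evenly with no coordination between switches.

```python
import hashlib
from collections import Counter

def ecmp_pick(flow_key: str, num_paths: int) -> int:
    """Deterministically hash a flow identifier onto one of num_paths next hops."""
    digest = hashlib.sha256(flow_key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_paths

# 10,000 synthetic flows hashed onto 8 edge-disjoint paths
loads = Counter(ecmp_pick(f"flow-{i}", 8) for i in range(10_000))
print(sorted(loads.values()))  # eight per-path counts, each near 10000/8 = 1250
```

Because the choice depends only on the flow key, every switch makes the same decision independently; path diversity turns into throughput without any central scheduler.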

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same quasi-random construction could be tested in other large shared infrastructures such as wide-area backbones or cloud interconnects.
  • Further gains might appear if the routing protocol were combined with traffic-aware path selection not explored in the original deployment.
  • Cabling simplification opens the possibility of more frequent topology changes or incremental expansion without major rewiring.
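A passive optical shuffle can be pictured as a fixed permutation applied inside a patch-panel-like box. The sketch below is our own illustration of that idea, not the paper's ShuffleBox design: technicians plug in tidy, ordered per-rack bundles, and the fixed internal permutation delivers quasi-random endpoint wiring.

```python
import random

def shufflebox(in_ports, seed=7):
    """Map ordered input fibers to a fixed pseudorandom output order.

    The seed is fixed: the permutation is 'manufactured' once and never
    changes, so the device stays passive and cabling stays bundle-to-bundle.
    """
    rng = random.Random(seed)
    perm = list(range(len(in_ports)))
    rng.shuffle(perm)
    return [in_ports[i] for i in perm]

# 4 racks x 8 uplinks each, cabled as tidy per-rack bundles
inputs = [f"rack{r}/port{p}" for r in range(4) for p in range(8)]
outputs = shufflebox(inputs)
print(outputs[:4])  # first few outputs, in the box's fixed pseudorandom order
```

Incremental expansion then means adding bundles, not rewiring: the permutation is data, so a larger box (or another box) extends the random topology without touching existing cables.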

Load-bearing premise

The quasi-random graph and its routing protocol can consistently locate enough edge-disjoint paths at full production scale without the optical shuffling device adding reliability or performance problems.

What would settle it

A large-scale deployment in which the routing protocol routinely fails to identify sufficient disjoint paths for high-bandwidth flows, producing measurable congestion, or in which the optical shuffler causes increased latency or failure rates.

Figures

Figures reproduced from arXiv: 2604.15261 by Chinchu Merine Joseph, C. Seshadhri, Elizabeth Tennent, Enrico Carlesso, Giacomo Bernardi, Luiza Popa, Pavan Manikonda, Randy Ram, Ratul Mahajan, Saurabh Kumar, Steven Robinson.

Figure 1. Example fat tree (a) and expander topologies. view at source ↗
Figure 2. Key design parameters of RNG. view at source ↗
Figure 4. A ShuffleBox. view at source ↗
Figure 5. Cabling in a datacenter with three rooms. view at source ↗
Figure 7. Path length distribution. view at source ↗
Figure 8. (a) One of the racks hosting the emulated … view at source ↗
Figure 9. Per-flow throughput of multipath transport … view at source ↗
Figure 10. PPS of 64B packets (left) and IOPS of storage … view at source ↗
Figure 11. Oversubscription for different topology parameters. view at source ↗
Figure 12. Edge disjoint path count. view at source ↗
Figure 13. Oversubscription ratio (lower is better) for different traffic matrices. view at source ↗
Figure 14. Reduction in number of switches in RNG relative to a 3-tier fat tree for two different port counts. view at source ↗
Figure 15. Matched uplinks as the datacenter grows. view at source ↗
Figure 17. Comparison of fraction of uplinks that find … view at source ↗
Figure 18. Optimal β versus α. view at source ↗
Figure 19. Comparison between estimated latency based on datacenter layout and measurements on the production fabric. Units omitted for confidentiality. view at source ↗
Figure 20. Latency distribution of ToR-to-ToR paths. view at source ↗
read the original abstract

We design and deploy in production the first flat datacenter networks. Our design, called RNG, is based on quasi-random graphs. While the cost and fault-tolerance benefits of such topologies have been long known, their practical realization has been hampered by a lack of scalable routing and cabling approaches. RNG has a new distributed routing protocol that exploits the properties of random graphs to find a large number of edge disjoint paths between pairs of endpoints. It uses a novel passive optical device that internally shuffles cables, which makes its cabling complexity similar to that of fat trees. We show that RNG matches or exceeds the performance of fat trees for a range of traffic patterns, despite being up to 45% cheaper. RNG is now the default datacenter network for most workloads at Amazon.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, and this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript presents RNG, a flat datacenter network based on quasi-random graphs. It introduces a distributed routing protocol that exploits random-graph properties to discover large numbers of edge-disjoint paths, together with a passive optical cable-shuffling device that reduces cabling complexity to fat-tree levels. The authors claim that RNG matches or exceeds fat-tree performance across a range of traffic patterns while cutting costs by up to 45 percent, and report that it has become the default datacenter network for most workloads at Amazon.

Significance. If the performance and cost claims are substantiated with detailed evidence, the work would be highly significant for datacenter networking. Random-graph topologies have long been known to offer attractive cost and fault-tolerance properties, yet practical deployment has been blocked by routing and cabling difficulties. A production-scale solution that overcomes these barriers and demonstrates real-world parity with fat trees could shift industry practice toward flatter, cheaper fabrics.

major comments (2)
  1. [Distributed routing protocol] The central claim that RNG matches or exceeds fat-tree performance rests on the protocol's ability to locate a large number of edge-disjoint paths at production scale. The manuscript provides no quantitative measurements of path diversity, convergence time, control overhead, or success rate under link failures, all of which are required to substantiate that the protocol delivers the assumed path multiplicity without excessive cost.
  2. [Performance evaluation] The assertion of performance parity or superiority across a range of traffic patterns is not accompanied by traffic matrices, throughput or latency figures with error bars, or a reproducible comparison methodology. Without these data the strength of evidence for the performance and cost claims cannot be assessed.
minor comments (1)
  1. [Abstract] The abstract would benefit from a brief statement of deployment scale (number of racks or endpoints) to contextualize the production claims.

Simulated Author's Rebuttal

2 responses · 1 unresolved

We thank the referee for the constructive comments on our manuscript describing RNG. We address the major concerns point by point below, offering clarifications based on the production deployment while indicating revisions to improve the evidence presented.

read point-by-point responses
  1. Referee: Distributed routing protocol section: the central claim that RNG matches or exceeds fat-tree performance rests on the protocol's ability to locate a large number of edge-disjoint paths at production scale. The manuscript provides no quantitative measurements of path diversity, convergence time, control overhead, or success rate under link failures, which are required to substantiate that the protocol delivers the assumed path multiplicity without excessive cost.

    Authors: We agree that the manuscript would benefit from additional quantitative support for the routing protocol. The design exploits random-graph properties for path discovery, and its effectiveness is evidenced by successful large-scale production operation at Amazon. However, detailed internal metrics remain proprietary. We will revise the section to include theoretical bounds on path diversity from random-graph analysis, simulation results for convergence time and failure recovery, and high-level control overhead estimates. This will strengthen substantiation without disclosing sensitive production data. revision: partial

  2. Referee: Performance evaluation: the assertion of performance parity or superiority for a range of traffic patterns is not accompanied by traffic matrices, throughput/latency figures with error bars, or a reproducible comparison methodology. Without these data the strength of evidence for the performance and cost claims cannot be assessed.

    Authors: The performance claims draw from extensive internal validation during production deployment, where RNG has become the default for most workloads. Exact production traffic matrices cannot be released due to confidentiality. We will revise the evaluation section to add synthetic traffic matrices modeled on observed datacenter patterns, throughput and latency results with error bars from repeated simulations and controlled tests, and a detailed reproducible methodology description. This addresses the need for transparent evidence while preserving proprietary details. revision: partial

standing simulated objections not resolved
  • Detailed proprietary performance metrics, traffic matrices, and internal deployment data from Amazon production environments cannot be disclosed in the manuscript.

Circularity Check

0 steps flagged

No circularity: claims rest on deployment measurements and cost analysis, not self-referential derivation.

full rationale

The paper is a systems design and production deployment report. Its core claims (performance parity or superiority to fat trees at lower cost, enabled by the routing protocol and optical shuffler) are justified by real-world measurements, traffic pattern evaluations, and direct cost comparisons rather than any closed mathematical chain. No equations define a quantity in terms of itself, no fitted parameters are relabeled as predictions, and no load-bearing premise reduces to a self-citation or prior ansatz by the same authors. The work is therefore grounded in external benchmarks rather than in itself.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 2 invented entities

The central claims rest on the effectiveness of the new routing protocol on quasi-random graphs, unproven in prior work, and on the reliability of the novel optical device; both are introduced without external benchmarks or independent evidence cited in the abstract.

axioms (2)
  • domain assumption: quasi-random graphs possess sufficient expansion and connectivity properties to support a large number of edge-disjoint paths between arbitrary endpoint pairs under typical datacenter traffic.
    The routing protocol is stated to exploit these properties.
  • ad hoc to paper: a passive optical shuffling device can be built and deployed at scale without introducing unacceptable latency, loss, or failure rates.
    This device is presented as the key enabler for practical cabling.
invented entities (2)
  • RNG distributed routing protocol (no independent evidence)
    purpose: locate large numbers of edge-disjoint paths in the quasi-random topology.
    New protocol designed specifically for this setting.
  • Passive optical cable-shuffling device (no independent evidence)
    purpose: reduce cabling complexity to fat-tree levels while preserving connectivity.
    Novel hardware component introduced to solve the cabling barrier.
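The expansion axiom can be probed numerically. The sketch below is an editorial check, not from the paper: it builds a random 6-regular graph as a union of three random Hamiltonian cycles (a standard expander construction) and estimates the second adjacency eigenvalue by power iteration; a value well below d, near the Ramanujan bound 2√(d−1) ≈ 4.47 for d = 6, indicates strong expansion.

```python
import math
import random

def regular_multigraph(n, d, seed=0):
    """Union of d/2 random Hamiltonian cycles -> d-regular (multi)graph."""
    rng = random.Random(seed)
    adj = [[0] * n for _ in range(n)]
    for _ in range(d // 2):
        perm = list(range(n))
        rng.shuffle(perm)
        for i in range(n):
            a, b = perm[i], perm[(i + 1) % n]
            adj[a][b] += 1
            adj[b][a] += 1
    return adj

def second_eigenvalue(adj, d, iters=300, seed=1):
    """Power iteration on (A + dI), projected off the all-ones eigenvector.

    Shifting by dI makes the target eigenvalue (d + lambda_2) dominate the
    most negative eigenvalue, so the iteration converges cleanly.
    """
    n = len(adj)
    rng = random.Random(seed)
    x = [rng.random() for _ in range(n)]
    for _ in range(iters):
        mean = sum(x) / n
        x = [v - mean for v in x]                  # stay orthogonal to the all-ones vector
        y = [sum(adj[i][j] * x[j] for j in range(n)) + d * x[i] for i in range(n)]
        norm = math.sqrt(sum(v * v for v in y)) or 1.0
        x = [v / norm for v in y]
    # Rayleigh quotient of the shifted matrix, then undo the shift
    y = [sum(adj[i][j] * x[j] for j in range(n)) + d * x[i] for i in range(n)]
    return sum(a * b for a, b in zip(x, y)) - d

adj = regular_multigraph(100, 6, seed=2)
lam2 = second_eigenvalue(adj, 6)
print(round(lam2, 2))  # well below d = 6 for a good expander
```

A spectral gap of this size is what makes short paths and many disjoint routes abundant; if measured lambda_2 crept toward d, the load-bearing premise above would be in trouble.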

pith-pipeline@v0.9.0 · 5462 in / 1653 out tokens · 64479 ms · 2026-05-11T00:55:43.218946+00:00 · methodology


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches: the paper's claim is directly supported by a theorem in the formal canon.
supports: the theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: the paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: the paper appears to rely on the theorem as machinery.
contradicts: the paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
