pith. machine review for the scientific record.

arxiv: 2604.15261 · v2 · submitted 2026-04-16 · 💻 cs.NI

Recognition: 2 theorem links


RNG: Flat Datacenter Networks at Scale

Authors on Pith: no claims yet

Pith reviewed 2026-05-11 00:55 UTC · model grok-4.3

classification 💻 cs.NI
keywords datacenter networks · random graphs · network topology · distributed routing · fat trees · flat networks · optical cabling · scalability

The pith

RNG builds flat datacenter networks from quasi-random graphs that match fat-tree performance at up to 45 percent lower cost.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces RNG as a practical flat network architecture for large datacenters. It relies on quasi-random graphs instead of hierarchical fat trees, paired with a distributed routing protocol that locates many edge-disjoint paths and a passive optical device that internally shuffles cables. The design targets equivalent or superior throughput and fault tolerance across common traffic patterns while reducing hardware expense and cabling effort. Adoption as the default network for most workloads at Amazon indicates the approach has been validated in production at scale.

Core claim

RNG realizes the first flat datacenter network at production scale by using a quasi-random graph topology. A new distributed routing protocol exploits the expansion properties of random graphs to discover a large number of edge-disjoint paths between any pair of endpoints. A passive optical shuffling device reduces cabling complexity to levels comparable with fat trees. Measurements show that RNG matches or exceeds fat-tree performance for a range of traffic patterns while costing up to 45 percent less, and the topology is now the default choice for most workloads at Amazon.

What carries the argument

Quasi-random graph topology with a distributed routing protocol that finds edge-disjoint paths and a passive optical cable-shuffling device.
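The path-diversity property this machinery leans on can be sketched numerically. The snippet below is an editorial illustration, not the paper's Spraypoint protocol: it builds a small random d-regular graph (configuration model) and counts edge-disjoint paths between two endpoints as a unit-capacity max flow, which equals the edge-disjoint path count by Menger's theorem. For random regular graphs the count is typically the maximum possible, d.

```python
import random

def random_regular_graph(n, d, seed=0):
    """Configuration-model pairing; retry until the graph is simple."""
    rng = random.Random(seed)
    while True:
        stubs = [v for v in range(n) for _ in range(d)]
        rng.shuffle(stubs)
        edges, simple = set(), True
        for i in range(0, len(stubs), 2):
            a, b = stubs[i], stubs[i + 1]
            if a == b or (a, b) in edges or (b, a) in edges:
                simple = False
                break
            edges.add((a, b))
        if simple:
            adj = {v: set() for v in range(n)}
            for a, b in edges:
                adj[a].add(b)
                adj[b].add(a)
            return adj

def edge_disjoint_paths(adj, s, t):
    """Menger: edge-disjoint path count = unit-capacity max flow (BFS augmentation)."""
    cap = {(v, w): 1 for v in adj for w in adj[v]}  # each undirected edge -> two arcs
    count = 0
    while True:
        parent, queue = {s: None}, [s]
        while queue and t not in parent:        # BFS for an augmenting path
            v = queue.pop(0)
            for w in adj[v]:
                if w not in parent and cap[(v, w)] > 0:
                    parent[w] = v
                    queue.append(w)
        if t not in parent:
            return count
        v = t
        while parent[v] is not None:            # push one unit of flow back along the path
            u = parent[v]
            cap[(u, v)] -= 1
            cap[(v, u)] += 1
            v = u
        count += 1

adj = random_regular_graph(64, 4, seed=1)
print(edge_disjoint_paths(adj, 0, 32))  # capped at d = 4; typically exactly d
```

The degree d bounds the count (the cut around either endpoint has d edges); the paper's observation is that random graphs nearly always achieve that bound.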

If this is right

  • Flat topologies become deployable at hyperscale with cabling effort similar to fat trees.
  • Network hardware costs drop substantially while preserving or improving performance for typical datacenter traffic.
  • Routing decisions can remain fully distributed yet still exploit the path diversity inherent in random graphs.
  • Production operators can shift away from hierarchical designs without sacrificing fault tolerance.
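The third point can be made concrete with standard ECMP-style hashing, which is stateless and fully distributed. This sketch shows generic ECMP behavior, not the paper's load-balancing mechanism: hashing synthetic flow keys onto k candidate paths spreads load evenly with no coordination between switches.

```python
import hashlib
from collections import Counter

def ecmp_pick(flow_key: str, num_paths: int) -> int:
    """Deterministically hash a flow identifier onto one of num_paths next hops."""
    digest = hashlib.sha256(flow_key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_paths

# 10,000 synthetic flows hashed onto 8 edge-disjoint paths
loads = Counter(ecmp_pick(f"flow-{i}", 8) for i in range(10_000))
print(sorted(loads.values()))  # eight per-path counts, each near 10000/8 = 1250
```

Because the choice depends only on the flow key, every switch makes the same decision independently; path diversity turns into throughput without any central scheduler.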

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same quasi-random construction could be tested in other large shared infrastructures such as wide-area backbones or cloud interconnects.
  • Further gains might appear if the routing protocol were combined with traffic-aware path selection not explored in the original deployment.
  • Cabling simplification opens the possibility of more frequent topology changes or incremental expansion without major rewiring.
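A passive optical shuffle can be pictured as a fixed permutation applied inside a patch-panel-like box. The sketch below is our own illustration of that idea, not the paper's ShuffleBox design: technicians plug in tidy, ordered per-rack bundles, and the fixed internal permutation delivers quasi-random endpoint wiring.

```python
import random

def shufflebox(in_ports, seed=7):
    """Map ordered input fibers to a fixed pseudorandom output order.

    The seed is fixed: the permutation is 'manufactured' once and never
    changes, so the device stays passive and cabling stays bundle-to-bundle.
    """
    rng = random.Random(seed)
    perm = list(range(len(in_ports)))
    rng.shuffle(perm)
    return [in_ports[i] for i in perm]

# 4 racks x 8 uplinks each, cabled as tidy per-rack bundles
inputs = [f"rack{r}/port{p}" for r in range(4) for p in range(8)]
outputs = shufflebox(inputs)
print(outputs[:4])  # first few outputs, in the box's fixed pseudorandom order
```

Incremental expansion then means adding bundles, not rewiring: the permutation is data, so a larger box (or another box) extends the random topology without touching existing cables.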

Load-bearing premise

The quasi-random graph and its routing protocol can consistently locate enough edge-disjoint paths at full production scale without the optical shuffling device adding reliability or performance problems.

What would settle it

A large-scale deployment in which the routing protocol routinely fails to identify sufficient disjoint paths for high-bandwidth flows, producing measurable congestion, or in which the optical shuffler causes increased latency or failure rates.

Figures

Figures reproduced from arXiv: 2604.15261 by Chinchu Merine Joseph, C. Seshadhri, Elizabeth Tennent, Enrico Carlesso, Giacomo Bernardi, Luiza Popa, Pavan Manikonda, Randy Ram, Ratul Mahajan, Saurabh Kumar, Steven Robinson.

Figure 1. Example fat tree (a) and expander topologies. view at source ↗
Figure 2. Key design parameters of RNG. view at source ↗
Figure 4. A ShuffleBox. view at source ↗
Figure 5. Cabling in a datacenter with three rooms. view at source ↗
Figure 7. Path length distribution. view at source ↗
Figure 8. (a) One of the racks hosting the emulated … view at source ↗
Figure 9. Per-flow throughput of multipath transport … view at source ↗
Figure 10. PPS of 64B packets (left) and IOPS of storage … view at source ↗
Figure 11. Oversubscription for different topology parameters. view at source ↗
Figure 12. Edge disjoint path count. view at source ↗
Figure 13. Oversubscription ratio (lower is better) for different traffic matrices. view at source ↗
Figure 14. Reduction in number of switches in RNG relative to a 3-tier fat tree for two different port counts. view at source ↗
Figure 15. Matched uplinks as the datacenter grows. view at source ↗
Figure 17. Comparison of fraction of uplinks that find … view at source ↗
Figure 18. Optimal β versus α. view at source ↗
Figure 19. Comparison between estimated latency based on datacenter layout and measurements on the production fabric. Units omitted for confidentiality. view at source ↗
Figure 20. Latency distribution of ToR-to-ToR paths. view at source ↗
read the original abstract

We design and deploy in production the first flat datacenter networks. Our design, called RNG, is based on quasi-random graphs. While the cost and fault-tolerance benefits of such topologies have been long known, their practical realization has been hampered by a lack of scalable routing and cabling approaches. RNG has a new distributed routing protocol that exploits the properties of random graphs to find a large number of edge disjoint paths between pairs of endpoints. It uses a novel passive optical device that internally shuffles cables, which makes its cabling complexity similar to that of fat trees. We show that RNG matches or exceeds the performance of fat trees for a range of traffic patterns, despite being up to 45% cheaper. RNG is now the default datacenter network for most workloads at Amazon.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, and this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript presents RNG, a flat datacenter network based on quasi-random graphs. It introduces a distributed routing protocol that exploits random-graph properties to discover large numbers of edge-disjoint paths, together with a passive optical cable-shuffling device that reduces cabling complexity to fat-tree levels. The authors claim that RNG matches or exceeds fat-tree performance across a range of traffic patterns while cutting costs by up to 45 percent, and report that it has become the default datacenter network for most workloads at Amazon.

Significance. If the performance and cost claims are substantiated with detailed evidence, the work would be highly significant for datacenter networking. Random-graph topologies have long been known to offer attractive cost and fault-tolerance properties, yet practical deployment has been blocked by routing and cabling difficulties. A production-scale solution that overcomes these barriers and demonstrates real-world parity with fat trees could shift industry practice toward flatter, cheaper fabrics.

major comments (2)
  1. [Distributed routing protocol] The central claim that RNG matches or exceeds fat-tree performance rests on the protocol's ability to locate a large number of edge-disjoint paths at production scale. The manuscript provides no quantitative measurements of path diversity, convergence time, control overhead, or success rate under link failures, all of which are required to substantiate that the protocol delivers the assumed path multiplicity without excessive cost.
  2. [Performance evaluation] The assertion of performance parity or superiority across a range of traffic patterns is not accompanied by traffic matrices, throughput or latency figures with error bars, or a reproducible comparison methodology. Without these data the strength of evidence for the performance and cost claims cannot be assessed.
minor comments (1)
  1. [Abstract] The abstract would benefit from a brief statement of deployment scale (number of racks or endpoints) to contextualize the production claims.

Simulated Author's Rebuttal

2 responses · 1 unresolved

We thank the referee for the constructive comments on our manuscript describing RNG. We address the major concerns point by point below, offering clarifications based on the production deployment while indicating revisions to improve the evidence presented.

read point-by-point responses
  1. Referee: Distributed routing protocol section: the central claim that RNG matches or exceeds fat-tree performance rests on the protocol's ability to locate a large number of edge-disjoint paths at production scale. The manuscript provides no quantitative measurements of path diversity, convergence time, control overhead, or success rate under link failures, which are required to substantiate that the protocol delivers the assumed path multiplicity without excessive cost.

    Authors: We agree that the manuscript would benefit from additional quantitative support for the routing protocol. The design exploits random-graph properties for path discovery, and its effectiveness is evidenced by successful large-scale production operation at Amazon. However, detailed internal metrics remain proprietary. We will revise the section to include theoretical bounds on path diversity from random-graph analysis, simulation results for convergence time and failure recovery, and high-level control overhead estimates. This will strengthen substantiation without disclosing sensitive production data. revision: partial

  2. Referee: Performance evaluation: the assertion of performance parity or superiority for a range of traffic patterns is not accompanied by traffic matrices, throughput/latency figures with error bars, or a reproducible comparison methodology. Without these data the strength of evidence for the performance and cost claims cannot be assessed.

    Authors: The performance claims draw from extensive internal validation during production deployment, where RNG has become the default for most workloads. Exact production traffic matrices cannot be released due to confidentiality. We will revise the evaluation section to add synthetic traffic matrices modeled on observed datacenter patterns, throughput and latency results with error bars from repeated simulations and controlled tests, and a detailed reproducible methodology description. This addresses the need for transparent evidence while preserving proprietary details. revision: partial

standing simulated objections not resolved
  • Detailed proprietary performance metrics, traffic matrices, and internal deployment data from Amazon production environments cannot be disclosed in the manuscript.

Circularity Check

0 steps flagged

No circularity: claims rest on deployment measurements and cost analysis, not self-referential derivation.

full rationale

The paper is a systems design and production deployment report. Its core claims (performance parity or superiority to fat trees at lower cost, enabled by the routing protocol and optical shuffler) are justified by real-world measurements, traffic pattern evaluations, and direct cost comparisons rather than any closed mathematical chain. No equations define a quantity in terms of itself, no fitted parameters are relabeled as predictions, and no load-bearing premise reduces to a self-citation or prior ansatz by the same authors. The work is therefore grounded in external benchmarks rather than in itself.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 2 invented entities

The central claims rest on the effectiveness of the new routing protocol on quasi-random graphs, unproven in prior work, and on the reliability of the novel optical device; both are introduced without external benchmarks or independent evidence cited in the abstract.

axioms (2)
  • domain assumption: quasi-random graphs possess sufficient expansion and connectivity properties to support a large number of edge-disjoint paths between arbitrary endpoint pairs under typical datacenter traffic.
    The routing protocol is stated to exploit these properties.
  • ad hoc to paper: a passive optical shuffling device can be built and deployed at scale without introducing unacceptable latency, loss, or failure rates.
    This device is presented as the key enabler for practical cabling.
invented entities (2)
  • RNG distributed routing protocol (no independent evidence)
    purpose: locate large numbers of edge-disjoint paths in the quasi-random topology.
    New protocol designed specifically for this setting.
  • Passive optical cable-shuffling device (no independent evidence)
    purpose: reduce cabling complexity to fat-tree levels while preserving connectivity.
    Novel hardware component introduced to solve the cabling barrier.
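The expansion axiom can be probed numerically. The sketch below is an editorial check, not from the paper: it builds a random 6-regular graph as a union of three random Hamiltonian cycles (a standard expander construction) and estimates the second adjacency eigenvalue by power iteration; a value well below d, near the Ramanujan bound 2√(d−1) ≈ 4.47 for d = 6, indicates strong expansion.

```python
import math
import random

def regular_multigraph(n, d, seed=0):
    """Union of d/2 random Hamiltonian cycles -> d-regular (multi)graph."""
    rng = random.Random(seed)
    adj = [[0] * n for _ in range(n)]
    for _ in range(d // 2):
        perm = list(range(n))
        rng.shuffle(perm)
        for i in range(n):
            a, b = perm[i], perm[(i + 1) % n]
            adj[a][b] += 1
            adj[b][a] += 1
    return adj

def second_eigenvalue(adj, d, iters=300, seed=1):
    """Power iteration on (A + dI), projected off the all-ones eigenvector.

    Shifting by dI makes the target eigenvalue (d + lambda_2) dominate the
    most negative eigenvalue, so the iteration converges cleanly.
    """
    n = len(adj)
    rng = random.Random(seed)
    x = [rng.random() for _ in range(n)]
    for _ in range(iters):
        mean = sum(x) / n
        x = [v - mean for v in x]                  # stay orthogonal to the all-ones vector
        y = [sum(adj[i][j] * x[j] for j in range(n)) + d * x[i] for i in range(n)]
        norm = math.sqrt(sum(v * v for v in y)) or 1.0
        x = [v / norm for v in y]
    # Rayleigh quotient of the shifted matrix, then undo the shift
    y = [sum(adj[i][j] * x[j] for j in range(n)) + d * x[i] for i in range(n)]
    return sum(a * b for a, b in zip(x, y)) - d

adj = regular_multigraph(100, 6, seed=2)
lam2 = second_eigenvalue(adj, 6)
print(round(lam2, 2))  # well below d = 6 for a good expander
```

A spectral gap of this size is what makes short paths and many disjoint routes abundant; if measured lambda_2 crept toward d, the load-bearing premise above would be in trouble.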

pith-pipeline@v0.9.0 · 5462 in / 1653 out tokens · 64479 ms · 2026-05-11T00:55:43.218946+00:00 · methodology


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches: the paper's claim is directly supported by a theorem in the formal canon.
supports: the theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: the paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: the paper appears to rely on the theorem as machinery.
contradicts: the paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
