NODE: Network Wide Top-K Flows in the Data Plane
Pith reviewed 2026-05-08 05:13 UTC · model grok-4.3
The pith
Switches can detect the global top-k heaviest flows by exchanging measurements directly in the data plane until all agree on the same list.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
NODE is a network-wide top-k detection algorithm that operates exclusively in the data plane. It allows the switches to aggregate information from all other switches in the network and ensures that eventually all switches hold an identical global top-k table. The work demonstrates that this process detects global top-k flows on both synthetic and real traces with a recall rate of over 95 percent while using less than 300KB per switch.
What carries the argument
The in-data-plane aggregation protocol that lets switches propagate and merge their local top-k flow records until every switch converges on the same global top-k table.
Load-bearing premise
Switches can reliably exchange and aggregate partial top-k information in the data plane such that all switches converge to an identical global table without excessive packet loss or memory overflow.
What would settle it
A multi-switch testbed running real traffic where switches fail to reach agreement on identical top-k tables or where measured recall falls below 95 percent on the same traces.
Figures
read the original abstract
Monitoring network traffic is crucial for most network tasks, such as, identifying and blocking attacks, pinpointing failures and engineering and rerouting heavy traffic to maintain high throughput. One important metric when monitoring the traffic is finding the top-k heavy flows, that is the k heaviest flows in the traffic. Programmable networks allow performing advanced network analysis right in the data plane. In recent years, various solutions have been proposed for efficiently finding the top-k heavy flows within a single switch. However, at times we may need to find the global top-k flows. Existing solutions for global top-k detection use a centralized controller that collects and aggregates the measurements performed in each of the switches. Yet, the process of sending information to the control plane and then having the controller send back the information to the switches can be very lengthy. In order to be able to detect and mitigate short-lived events, solutions that work completely within the data plane are needed. In this paper we present NODE, a network-wide top-k detection algorithm that operates exclusively in the data plane. NODE allows the switches to aggregate information from all other switches in the network, and ensures that eventually all switches hold an identical global top-k table. We show that NODE manages to detect global top-k flows on both synthetic and real traces, with a recall rate of over 95\% while using less than 300KB per switch.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces NODE, a data-plane-only algorithm for network-wide top-k heavy flow detection. Switches maintain local top-k tables and exchange partial summaries to aggregate information, converging so that every switch holds an identical global top-k table. Evaluation on synthetic and real traces reports >95% recall while consuming <300 KB of memory per switch.
Significance. A working data-plane solution for consistent global top-k detection would be significant for low-latency applications such as attack mitigation and traffic engineering, as it removes controller round-trip delays. The memory bound and recall numbers, if reproducible and robust, would be a concrete contribution to programmable-network monitoring.
major comments (1)
- [Evaluation / Results] The central claim that all switches converge to an identical global top-k table depends on reliable exchange and aggregation of partial summaries. The evaluation (described in the abstract and presumably the results section) uses trace-driven simulations that assume perfect delivery of update packets; no experiments or analysis address packet loss, variable delays, or high flow churn, which the skeptic note correctly identifies as the weakest assumption and could cause divergence or recall below 95%.
minor comments (1)
- [Abstract] The abstract states the 95% recall and 300 KB memory figures but provides no details on network topology, number of switches, trace durations, or how the global ground truth was obtained for recall computation.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on our work. We address the major comment below and commit to revisions that strengthen the evaluation.
read point-by-point responses
-
Referee: [Evaluation / Results] The central claim that all switches converge to an identical global top-k table depends on reliable exchange and aggregation of partial summaries. The evaluation (described in the abstract and presumably the results section) uses trace-driven simulations that assume perfect delivery of update packets; no experiments or analysis address packet loss, variable delays, or high flow churn, which the skeptic note correctly identifies as the weakest assumption and could cause divergence or recall below 95%.
Authors: We agree that the current evaluation assumes perfect packet delivery and does not model packet loss, variable delays, or high flow churn. This is a genuine limitation when claiming convergence and recall in realistic settings. In the revised manuscript we will add a dedicated robustness subsection containing trace-driven simulations that inject packet loss (0.1–2%), variable delays, and synthetic flow churn. We will report resulting convergence times, recall degradation, and any divergence cases, together with a brief discussion of how periodic summary exchanges mitigate occasional losses. revision: yes
Circularity Check
No significant circularity; claims rest on empirical evaluation
full rationale
The paper introduces NODE as a data-plane algorithm for aggregating partial top-k flow information across switches to converge on a global table, with the central claim supported solely by trace-driven experiments reporting >95% recall using <300KB per switch. No equations, derivations, parameter fittings, or self-citations appear in the abstract or described content that would reduce any result to its inputs by construction. The evaluation uses independent synthetic and real traces without any self-referential normalization or uniqueness theorems imported from prior author work, making the derivation chain self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
Reference graph
Works this paper leans on
-
[1]
https://www.intel.com/content/www/us/en/products/ network-io/programmable-ethernet-switch.html/
Intel Tofino. https://www.intel.com/content/www/us/en/products/ network-io/programmable-ethernet-switch.html/
-
[2]
https://www.ietf.org/rfc/rfc3954.txt
Netflow. https://www.ietf.org/rfc/rfc3954.txt. (a) Synth Zipf 0.6 (b) Synth Zipf 0.8 (c) Synth Zipf 1.0 (d) CAIDA (e) Message Counts Fig. 4: NODE’s recall with different datasets when using clustering
-
[3]
The caida anonymized internet traces dataset. https://www. caida.org/catalog/datasets/passive dataset, 2018, 2019. [Online; accessed June 2023]
work page 2018
-
[4]
Traffic engineering in software defined networks
Sugam Agarwal, Murali Kodialam, and TV Lakshman. Traffic engineering in software defined networks. InIEEE INFOCOM, pages 2211–2219, 2013
work page 2013
-
[5]
Routing oblivious measurement analytics
Ran Ben Basat, Xiaoqi Chen, Gil Einziger, Shir Landau Feibish, Danny Raz, and Minlan Yu. Routing oblivious measurement analytics. InIFIP Networking, pages 449–457, 2020
work page 2020
-
[6]
Ran Ben Basat, Xiaoqi Chen, Gil Einziger, and Ori Rottenstreich. Designing heavy-hitter detection algorithms for programmable switches.IEEE/ACM Transactions on Networking, 28(3):1172– 1185, 2020
work page 2020
-
[7]
Network-wide routing-oblivious heavy hitters
Ran Ben Basat, Gil Einziger, Shir Landau Feibish, Jalil Moraney, and Danny Raz. Network-wide routing-oblivious heavy hitters. InANCS, pages 66–73, 2018
work page 2018
-
[8]
Efficient measurement on programmable switches using probabilistic recirculation
Ran Ben-Basat, Xiaoqi Chen, Gil Einziger, and Ori Rottenstre- ich. Efficient measurement on programmable switches using probabilistic recirculation. In2018 IEEE 26th International Conference on Network Protocols (ICNP), pages 313–323. IEEE, 2018
work page 2018
-
[9]
Randomized admission policy for efficient top-k and frequency estimation
Ran Ben-Basat, Gil Einziger, Roy Friedman, and Yaron Kassner. Randomized admission policy for efficient top-k and frequency estimation. InIEEE INFOCOM, pages 1–9, 2017
work page 2017
-
[10]
Microte: Fine grained traffic engineering for data centers
Theophilus Benson, Ashok Anand, Aditya Akella, and Ming Zhang. Microte: Fine grained traffic engineering for data centers. InCoNEXT, pages 1–12, 2011
work page 2011
-
[11]
Sounding the bell for improving internet (of things) security
Theophilus Benson and Balakrishnan Chandrasekaran. Sounding the bell for improving internet (of things) security. InWorkshop on Internet of Things Security and Privacy, pages 77–82, 2017
work page 2017
-
[12]
Burton H Bloom. Space/time trade-offs in hash coding with allowable errors.Communications of the ACM, 13(7):422–426, 1970
work page 1970
-
[13]
Pat Bosshart, Glen Gibb, Hun-Seok Kim, George Varghese, Nick McKeown, Martin Izzard, Fernando Mujica, and Mark Horowitz. Forwarding metamorphosis: Fast programmable match-action processing in hardware for sdn.ACM SIGCOMM Computer Communication Review, 43(4):99–110, 2013
work page 2013
-
[14]
Yanpei Chen, Rean Griffit, David Zats, and Randy H Katz. Understanding tcp incast and its implications for big data work- loads.University of California at Berkeley, Tech. Rep, 2012
work page 2012
-
[15]
An improved data stream summary: the count-min sketch and its applications
Graham Cormode and Shan Muthukrishnan. An improved data stream summary: the count-min sketch and its applications. Journal of Algorithms, 55(1):58–75, 2005
work page 2005
-
[16]
Robust distributed monitoring of traffic flows
Vitalii Demianiuk, Sergey Gorinsky, Sergey I Nikolenko, and Kirill Kogan. Robust distributed monitoring of traffic flows. IEEE/ACM Transactions on Networking, 29(1):275–288, 2020
work page 2020
-
[17]
Damu Ding, Marco Savi, Gianni Antichi, and Domenico Sira- cusa. An incrementally-deployable p4-enabled architecture for network-wide heavy-hitter detection.IEEE Transactions on Network and Service Management, 17(1):75–88, 2020
work page 2020
-
[18]
Network-wide heavy hitter detection with commodity switches
Rob Harrison, Qizhe Cai, Arpit Gupta, and Jennifer Rexford. Network-wide heavy hitter detection with commodity switches. InSOSR, pages 1–7, 2018
work page 2018
-
[19]
Carpe elephants: Seize the global heavy hitters
Rob Harrison, Shir Landau Feibish, Arpit Gupta, Ross Teixeira, S Muthukrishnan, and Jennifer Rexford. Carpe elephants: Seize the global heavy hitters. InSPIN@SIGCOMM, pages 15–21, 2020
work page 2020
-
[20]
{NetChain}:{Scale-Free}{Sub-RTT}coordination
Xin Jin, Xiaozhou Li, Haoyu Zhang, Nate Foster, Jeongkeun Lee, Robert Soul ´e, Changhoon Kim, and Ion Stoica. {NetChain}:{Scale-Free}{Sub-RTT}coordination. InUSENIX NSDI, pages 35–49, 2018
work page 2018
-
[21]
Flowradar: A better netflow for data centers
Yuliang Li, Rui Miao, Changhoon Kim, and Minlan Yu. Flowradar: A better netflow for data centers. InU SEN IX N SDI), pages 311–324, 2016
work page 2016
-
[22]
Zaoxing Liu, Hun Namkung, Georgios Nikolaidis, Jeongkeun Lee, Changhoon Kim, Xin Jin, Vladimir Braverman, Minlan Yu, and Vyas Sekar. Jaqen: A high-performance switch-native approach for detecting and mitigating volumetric ddos attacks with programmable switches. InUSENIX Security, pages 3829– 3846, 2021
work page 2021
-
[23]
Efficient computation of frequent and top-k elements in data streams
Ahmed Metwally, Divyakant Agrawal, and Amr El Abbadi. Efficient computation of frequent and top-k elements in data streams. InDatabase Theory-ICDT 2005, pages 398–412, 2005
work page 2005
-
[24]
Arjun Roy, Hongyi Zeng, Jasmeet Bagga, George Porter, and Alex C. Snoeren. Inside the social network’s (datacenter) network. InProceedings of the 2015 ACM Conference on Special Interest Group on Data Communication, 2015
work page 2015
-
[25]
Anees Shaikh, Jennifer Rexford, and Kang G Shin. Load- sensitive routing of long-lived ip flows.ACM SIGCOMM Computer Communication Review, 29(4):215–226, 1999
work page 1999
-
[26]
Heavy-hitter detec- tion entirely in the data plane
Vibhaalakshmi Sivaraman, Srinivas Narayana, Ori Rottenstreich, Shan Muthukrishnan, and Jennifer Rexford. Heavy-hitter detec- tion entirely in the data plane. InSoSR, pages 164–176, 2017
work page 2017
-
[27]
Lu Tang, Qun Huang, and Patrick P. C. Lee. A fast and compact invertible sketch for network-wide heavy flow detec- tion.IEEE/ACM Transactions on Networking, 28(5):2350–2363, 2020
work page 2020
-
[28]
sflow: Towards resource-efficient and agile service federation in service overlay networks
Mea Wang, Baochun Li, and Zongpeng Li. sflow: Towards resource-efficient and agile service federation in service overlay networks. InICDCS, pages 628–635, 2004
work page 2004
-
[29]
Lior Zeno, Dan RK Ports, Jacob Nelson, Daehyeok Kim, Shir Landau-Feibish, Idit Keidar, Arik Rinberg, Alon Rashelbach, Igor De-Paula, and Mark Silberstein.{SwiSh}: Distributed shared state abstractions for programmable switches. InNSDI, pages 171–191, 2022
work page 2022
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.