Recognition: unknown
Programmable Packet Scheduling with Dynamic Reordering at Line Rate
Pith reviewed 2026-05-10 15:11 UTC · model grok-4.3
The pith
UIFO introduces a two-level scheduling model that supports dynamic reordering of buffered packets while generalizing PIFO and PIEO.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
UIFO (Update-In-First-Out) is a programmable scheduling model that introduces a two-level abstraction over classes and packets. It enables dynamic updates to the scheduling order at the class level while preserving in-order packet scheduling within each class, thereby supporting dynamic reordering of already-buffered packets. UIFO remains fully compatible with and generalizes existing PIFO and PIEO models. A hardware prototype based on priority-queue designs sustains 100 Gbps line-rate throughput on an FPGA platform and in a 28 nm ASIC process.
What carries the argument
UIFO's two-level class-and-packet abstraction, implemented with priority-queue hardware, that separates class-level ordering updates from intra-class packet order preservation.
If this is right
- Dynamic-ordering algorithms such as pFabric become expressible in programmable line-rate hardware.
- Existing PIFO and PIEO schedulers can be realized as special cases within the UIFO framework.
- Scheduling expressiveness increases while line-rate performance and hardware scalability are preserved.
- Buffered packets can have their relative priorities adjusted after arrival without violating per-class order.
Where Pith is reading between the lines
- Network devices could respond to instantaneous congestion or flow priorities by reordering classes of traffic that are already queued.
- The same two-level separation might be applied to other ordered systems where both static sequence and dynamic group priority are needed.
- Software-defined networking controllers could issue class-order updates at runtime to implement policies that react to measured conditions.
Load-bearing premise
A priority-queue-based hardware implementation of the two-level class-and-packet abstraction can be realized at line rate with acceptable area and power cost on real silicon.
What would settle it
A fabricated UIFO hardware design that either cannot dynamically reorder already-buffered packets at the class level or cannot sustain 100 Gbps throughput on realistic traffic would show the central claim is false.
Figures
read the original abstract
High-speed switch packet scheduling demands both line-rate performance and programmability. Existing programmable hardware scheduling models, such as PIFO and PIEO, can express a broad range of scheduling algorithms; however, their semantics are restricted to packet-level ordering and cannot dynamically reorder buffered packets, which limits the support for dynamic-ordering algorithms such as pFabric. To overcome this limitation, we propose UIFO (Update-In-First-Out), a new programmable scheduling model that introduces a two-level abstraction over classes and packets. UIFO enables dynamic updates to the scheduling order at the class level while preserving in-order packet scheduling within each class, thereby supporting dynamic reordering of already-buffered packets. Furthermore, UIFO remains fully compatible with and generalizes existing PIFO and PIEO models. We implement a hardware prototype of UIFO based on priority-queue designs and evaluate it on an FPGA platform and in a 28 nm ASIC process. Overall, UIFO significantly enhances scheduling expressiveness and maintains favorable scalability while sustaining 100 Gbps line-rate throughput.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes UIFO (Update-In-First-Out), a new programmable packet scheduling model introducing a two-level class-and-packet abstraction. UIFO allows dynamic updates to the scheduling order at the class level while preserving intra-class FIFO ordering, thereby supporting reordering of already-buffered packets. The authors claim that UIFO is fully compatible with and generalizes the existing PIFO and PIEO models. They present a priority-queue-based hardware prototype evaluated on an FPGA platform and synthesized in a 28 nm ASIC process, reporting sustained 100 Gbps line-rate throughput.
Significance. If the hardware implementation achieves line-rate performance with acceptable area and power overhead and the generalization claim holds without hidden restrictions, the work would meaningfully extend programmable scheduling expressiveness to algorithms requiring dynamic reordering of buffered packets (e.g., pFabric). The provision of both FPGA prototyping and ASIC synthesis results is a positive aspect, as is the focus on a concrete hardware realization rather than purely theoretical modeling.
major comments (2)
- [Model Definition] The central claim that UIFO 'remains fully compatible with and generalizes existing PIFO and PIEO models' is load-bearing but unsupported by an explicit mapping, formal proof, or enumeration showing that every PIFO/PIEO schedule can be expressed in UIFO without additional constraints on class granularity, update timing, or supported operations. This appears in the model definition and compatibility discussion.
- [Hardware Implementation and Evaluation] The hardware evaluation asserts 100 Gbps line-rate throughput on FPGA and 28 nm ASIC synthesis with 'favorable scalability,' yet provides no quantitative area, power, or latency figures, no direct PIFO/PIEO baseline comparisons, and no error bars or variability data. Because the two-level priority-queue implementation is the only mechanism that could deliver the claimed dynamic reordering at line rate, the absence of these metrics prevents verification of acceptable silicon cost. This appears in the implementation and evaluation sections.
minor comments (2)
- The abstract would be strengthened by including one or two key quantitative results (e.g., area overhead or latency) from the FPGA and ASIC evaluations to substantiate the performance claims.
- Notation for the two-level abstraction (class priority vs. per-class packet order) should be introduced with a small example or diagram early in the model section to improve readability for readers familiar with PIFO/PIEO.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed review. We agree that the generalization claim and hardware metrics require strengthening and will revise the manuscript accordingly. We respond to each major comment below.
read point-by-point responses
-
Referee: [Model Definition] The central claim that UIFO 'remains fully compatible with and generalizes existing PIFO and PIEO models' is load-bearing but unsupported by an explicit mapping, formal proof, or enumeration showing that every PIFO/PIEO schedule can be expressed in UIFO without additional constraints on class granularity, update timing, or supported operations. This appears in the model definition and compatibility discussion.
Authors: We agree that an explicit mapping is needed to fully substantiate the claim. In the revised manuscript we will add a new subsection under the model definition that provides a constructive mapping: any PIFO schedule is realized in UIFO by placing each packet in its own singleton class and performing class-level priority updates that replicate the original packet ordering; intra-class FIFO is preserved by definition. The same construction applies to PIEO by treating its push-in operations as class updates. We will enumerate the supported operations and show that no additional constraints on granularity or timing are imposed beyond the two-level abstraction already stated. This will make the generalization rigorous and verifiable. revision: yes
-
Referee: [Hardware Implementation and Evaluation] The hardware evaluation asserts 100 Gbps line-rate throughput on FPGA and 28 nm ASIC synthesis with 'favorable scalability,' yet provides no quantitative area, power, or latency figures, no direct PIFO/PIEO baseline comparisons, and no error bars or variability data. Because the two-level priority-queue implementation is the only mechanism that could deliver the claimed dynamic reordering at line rate, the absence of these metrics prevents verification of acceptable silicon cost. This appears in the implementation and evaluation sections.
Authors: We acknowledge the absence of detailed quantitative metrics in the current draft. The revised version will include new tables and figures reporting: (1) FPGA resource utilization (LUTs, FFs, BRAMs) and measured throughput with variability across multiple place-and-route runs; (2) 28 nm ASIC synthesis results for area (gate count and cell area), power (leakage and dynamic at 100 Gbps), and critical-path latency; (3) direct side-by-side comparisons against PIFO and PIEO baselines synthesized in the identical flow, including percentage overheads. These additions will allow independent verification of line-rate performance and silicon cost. revision: yes
Circularity Check
UIFO is a new model definition with hardware prototype; no derivation reduces to inputs by construction
full rationale
The paper introduces UIFO as a novel two-level scheduling abstraction that supports dynamic class-level reordering while preserving intra-class FIFO order. It states compatibility with PIFO/PIEO as a direct consequence of the model semantics rather than a fitted or self-referential derivation. The hardware prototype and 100 Gbps evaluation are presented as implementation results, not as predictions derived from prior parameters. No self-citation chains, ansatzes smuggled via citation, or renamings of known results appear as load-bearing steps. The central claims rest on the explicit definition of the abstraction and its hardware realization, which are independent of the target results.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption Priority queues can be implemented in hardware to support both class-level updates and packet-level ordering at line rate.
- ad hoc to paper The two-level abstraction is sufficient to express all algorithms previously limited by PIFO/PIEO.
invented entities (1)
-
UIFO scheduling model
no independent evidence
Reference graph
Works this paper leans on
-
[1]
Albert Gran Alcoz, Alexander Dietmüller, and Laurent Vanbever
-
[2]
InProceedings of the 17th Usenix Conference on Net- worked Systems Design and Implementation (NSDI’20)
SP-PIFO: approximating push-in first-out behaviors using strict- priority queues. InProceedings of the 17th Usenix Conference on Net- worked Systems Design and Implementation (NSDI’20). USENIX Associ- ation, USA, 59–76
-
[3]
Mohammad Alizadeh, Shuang Yang, Milad Sharif, Sachin Katti, Nick McKeown, Balaji Prabhakar, and Scott Shenker. 2013. pFabric: min- imal near-optimal datacenter transport. InProceedings of the ACM SIGCOMM 2013 Conference on SIGCOMM (SIGCOMM ’13). Associa- tion for Computing Machinery, New York, NY, USA, 435–446. https: //doi.org/10.1145/2486001.2486031
-
[4]
Nirav Atre, Hugo Sadok, and Justine Sherry. 2024. BBQ: a fast and scalable integer priority queue for hardware packet scheduling. In Proceedings of the 21st USENIX Symposium on Networked Systems Design and Implementation (NSDI’24). USENIX Association, USA, Article 26, 21 pages
2024
-
[5]
Imad Benacer, François-Raymond Boyer, and Yvon Savaria. 2018. A Fast, Single-Instruction–Multiple-Data, Scalable Priority Queue.IEEE Transactions on Very Large Scale Integration (VLSI) Systems26, 10 (2018), 1939–1952. https://doi.org/10.1109/TVLSI.2018.2838044
-
[6]
Jon C. R. Bennett and Hui Zhang. 1996. Hierarchical packet fair queue- ing algorithms. InConference Proceedings on Applications, Technolo- gies, Architectures, and Protocols for Computer Communications (SIG- COMM ’96). Association for Computing Machinery, New York, NY, USA, 143–156. https://doi.org/10.1145/248156.248170
-
[7]
Pat Bosshart, Glen Gibb, Hun-Seok Kim, George Varghese, Nick McK- eown, Martin Izzard, Fernando Mujica, and Mark Horowitz. 2013. For- warding Metamorphosis: Fast Programmable Match-Action Processing in Hardware for SDN. InProceedings of the ACM SIGCOMM 2013 Con- ference on Applications, Technologies, Architectures, and Protocols for Computer Communication...
-
[8]
A. Demers, S. Keshav, and S. Shenker. 1989. Analysis and simula- tion of a fair queueing algorithm. InSymposium Proceedings on Com- munications Architectures & Protocols (SIGCOMM ’89). Associ- ation for Computing Machinery, New York, NY, USA, 1–12. https: //doi.org/10.1145/75246.75248
-
[9]
Mostafa Elbediwy, Bill Pontikakis, Alireza Ghaffari, Jean-Pierre David, and Yvon Savaria. 2024. DR-PIFO: A Dynamic Ranking Packet Sched- uler Using a Push-In-First-Out Queue.IEEE Transactions on Network and Service Management21, 1 (2024), 355–371. https://doi.org/10.1109/ TNSM.2023.3304894
-
[10]
Jonathan Chao
Peixuan Gao, Anthony Dalleggio, Jiajin Liu, Chen Peng, Yang Xu, and H. Jonathan Chao. 2024. Sifter: an inversion-free and large-capacity programmable packet scheduler. InProceedings of the 21st USENIX Symposium on Networked Systems Design and Implementation (NSDI’24). USENIX Association, USA, Article 5, 20 pages
2024
-
[11]
Jonathan Chao
Peixuan Gao, Anthony Dalleggio, Yang Xu, and H. Jonathan Chao
-
[12]
In19th USENIX Symposium on Networked Systems Design and Implementation (NSDI 22)
Gearbox: A Hierarchical Packet Scheduler for Approximate Weighted Fair Queuing. In19th USENIX Symposium on Networked Systems Design and Implementation (NSDI 22). USENIX Association, Renton, WA, 551–565. https://www.usenix.org/conference/nsdi22/ presentation/gao-peixuan
-
[13]
Glen Gibb, George Varghese, Mark Horowitz, and Nick McKeown
-
[14]
InProceedings of the Ninth ACM/IEEE Symposium on Architectures for Networking and Communi- cations Systems (ANCS ’13)
Design principles for packet parsers. InProceedings of the Ninth ACM/IEEE Symposium on Architectures for Networking and Communi- cations Systems (ANCS ’13). IEEE Press, 13–24
-
[15]
Pawan Goyal, Harrick M. Vin, and Haichen Chen. 1996. Start-time fair queueing: a scheduling algorithm for integrated services packet switching networks. InConference Proceedings on Applications, Tech- nologies, Architectures, and Protocols for Computer Communications (SIGCOMM ’96). Association for Computing Machinery, New York, NY, USA, 157–168. https://d...
-
[16]
Feng Guo, Shidong Sun, Junjie Hu, Ning Zhang, and Zhiqiang Lv
-
[17]
CIPO: Efficient, lightweight and programmable packet schedul- ing.Computer Networks245 (2024), 110355. https://doi.org/10.1016/j. comnet.2024.110355
work page doi:10.1016/j 2024
-
[18]
IEEE. 2011. 802.1Qbb Priority-based Flow Control (PFC). Online. (2011). Available: https://1.ieee802.org/dcb/802-1qbb/
2011
-
[19]
Radhika Mittal, Rachit Agarwal, Sylvia Ratnasamy, and Scott Shenker
-
[20]
InProceedings of the 13th Usenix Conference on Networked Systems Design and Implementation (NSDI’16)
Universal packet scheduling. InProceedings of the 13th Usenix Conference on Networked Systems Design and Implementation (NSDI’16). USENIX Association, USA, 501–521
-
[21]
Sung-Whan Moon, Kang G. Shin, and Jennifer Rexford. 2000. Scal- able Hardware Priority Queue Architectures for High-Speed Packet Switches.IEEE Trans. Comput.49, 11 (Nov. 2000), 1215–1227. https: //doi.org/10.1109/12.895938
-
[22]
Antti Nurmi, Per Lindgren, Tom Szymkowiak, and {Timo D.} Hämäläi- nen. 2023. AnTiQ: A Hardware-Accelerated Priority Queue Design with Constant Time Arbitrary Element Removal. InProceedings - 2023 26th Euromicro Conference on Digital System Design, DSD 2023 (Proceedings : Euromicro Conference on Digital System Design), Smail Niar, Hamza Ouarnoughi, and Amu...
-
[23]
Sharma, Chenxingyu Zhao, Ming Liu, Pravein G Kannan, Changhoon Kim, Arvind Krishnamurthy, and Anirudh Sivaraman
Naveen Kr. Sharma, Chenxingyu Zhao, Ming Liu, Pravein G Kannan, Changhoon Kim, Arvind Krishnamurthy, and Anirudh Sivaraman
-
[24]
InProceedings of the 17th Usenix Conference on Networked Sys- tems Design and Implementation (NSDI’20)
Programmable calendar queues for high-speed packet sched- uling. InProceedings of the 17th Usenix Conference on Networked Sys- tems Design and Implementation (NSDI’20). USENIX Association, USA, 685–700
-
[25]
M. Shreedhar and George Varghese. 1995. Efficient fair queueing using deficit round robin. InProceedings of the Conference on Applications, Technologies, Architectures, and Protocols for Computer Communication (SIGCOMM ’95). Association for Computing Machinery, New York, NY, USA, 231–242. https://doi.org/10.1145/217382.217453
- [26]
-
[27]
Anirudh Sivaraman, Suvinay Subramanian, Mohammad Alizadeh, Sharad Chole, Shang-Tse Chuang, Anurag Agrawal, Hari Balakrishnan, Tom Edsall, Sachin Katti, and Nick McKeown. 2016. Programmable Packet Scheduling at Line Rate. InProceedings of the 2016 ACM SIG- COMM Conference (SIGCOMM ’16). Association for Computing Ma- chinery, New York, NY, USA, 44–57. https...
-
[28]
Anirudh Sivaraman, Keith Winstein, Suvinay Subramanian, and Hari Balakrishnan. 2013. No silver bullet: extending SDN to the data plane. InProceedings of the Twelfth ACM Workshop on Hot Topics in Networks (HotNets-XII). Association for Computing Machinery, New York, NY, USA, Article 19, 7 pages. https://doi.org/10.1145/2535771.2535796
-
[29]
Flow problems in multi- interface networks.IEEE Transactions on Computers, 63:361–374, 2014
Yi Tang and Neil W. Bergmann. 2015. A Hardware Scheduler Based on Task Queues for FPGA-Based Embedded Real-Time Systems.IEEE Trans. Comput.64, 5 (2015), 1254–1267. https://doi.org/10.1109/TC. 2014.2315637
work page doi:10.1109/tc 2015
-
[30]
Zekun Wang, Binghao Yue, Weitao Pan, Jianyi Shi, and Yue Hao
-
[31]
A Grouped Sorting Queue Supporting Dynamic Updates for Timer Management in High-Speed Network Interface Cards. (2026). 13 SIGCOMM’26, August 17–21, 2026, Denver, Colorado, USA Zekun Wang.et al. arXiv:cs.DS/2601.09081 https://arxiv.org/abs/2601.09081
work page internal anchor Pith review Pith/arXiv arXiv 2026
-
[32]
Wikipedia. 2026. Abstract Dictionary Data Type. Online. (2026). Available: https://en.wikipedia.org/wiki/Associative_array
2026
-
[33]
Wikipedia. 2026. Earliest Deadline First. Online. (2026). Available: https://en.wikipedia.org/wiki/Earliest_deadline_first_scheduling
2026
-
[34]
Wikipedia. 2026. Least slack time scheduling. Online. (2026). Available: https://en.wikipedia.org/wiki/Least_slack_time_scheduling
2026
-
[35]
Wikipedia. 2026. Shortest Job First. Online. (2026). Available: https: //en.wikipedia.org/wiki/Shortest_job_first
2026
-
[36]
Wikipedia. 2026. Shortest Remaining Time First. Online. (2026). Avail- able: https://en.wikipedia.org/wiki/Shortest_remaining_time
2026
-
[37]
Wikipedia. 2026. Token Bucket. Online. (2026). Available: https: //en.wikipedia.org/wiki/Token_bucket
2026
-
[38]
Christo Wilson, Hitesh Ballani, Thomas Karagiannis, and Ant Rowtron
-
[39]
InProceedings of the ACM SIGCOMM 2011 Conference (SIGCOMM ’11)
Better never than late: meeting deadlines in datacenter networks. InProceedings of the ACM SIGCOMM 2011 Conference (SIGCOMM ’11). Association for Computing Machinery, New York, NY, USA, 50–61. https://doi.org/10.1145/2018436.2018443
-
[40]
K. Yamakoshi, E. Oki, and N. Yamanaka. 2002. Dynamic deficit round- robin scheduler for 5-Tb/s switch using wavelength routing. InWork- shop on High Performance Switching and Routing, Merging Optical and IP Technologie. 204–208. https://doi.org/10.1109/HPSR.2002.1024236
-
[41]
Ruyi Yao, Zhiyu Zhang, Gaojian Fang, Peixuan Gao, Sen Liu, Yibo Fan, Yang Xu, and H. Jonathan Chao. 2023. BMW Tree: Large-scale, High- throughput and Modular PIFO Implementation using Balanced Multi- Way Sorting Tree. InProceedings of the ACM SIGCOMM 2023 Conference (ACM SIGCOMM ’23). Association for Computing Machinery, New York, NY, USA, 208–219. https:...
-
[42]
Zhuolong Yu, Chuheng Hu, Jingfeng Wu, Xiao Sun, Vladimir Braver- man, Mosharaf Chowdhury, Zhenhua Liu, and Xin Jin. 2021. Pro- grammable packet scheduling with a single queue. InProceedings of the 2021 ACM SIGCOMM 2021 Conference (SIGCOMM ’21). As- sociation for Computing Machinery, New York, NY, USA, 179–193. https://doi.org/10.1145/3452296.3472887
- [43]
-
[44]
Hui Zhao, Wenjia Niu, Yifang Qin, Song Ci, Hui Tang, and Tao Lin
-
[45]
Traffic Load-Based Dynamic Bandwidth Allocation for Balancing the Packet Loss in DiffServ Network. (2012), 99–104. https://doi.org/ 10.1109/ICIS.2012.113 14
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.