
arxiv: 2504.04586 · v2 · submitted 2025-04-06 · 💻 cs.NI

Joint Optimization of Handoff and Video Rate in LEO Satellite Networks

Pith reviewed 2026-05-22 21:13 UTC · model grok-4.3

classification 💻 cs.NI
keywords LEO satellite networks · video streaming · handoff optimization · quality of experience · reinforcement learning · model predictive control · mobility management

The pith

Joint satellite handoff and video bitrate selection optimizes quality of experience in LEO networks.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper establishes that low Earth orbit satellite networks require coordinated decisions on which satellite to connect to and what video bitrate to use, because frequent handoffs from moving satellites disrupt streaming. Simulations and real datasets reveal that separate handling of mobility and rate adaptation leads to poor performance for both single and multiple users sharing links. The authors develop model predictive control and reinforcement learning algorithms that make these choices together, first for one user then extended to groups via centralized training and distributed inference. Validation comes from trace-driven simulations and testbed experiments, showing gains in quality of experience for video traffic that is expected to dominate LEO usage in remote areas.

Core claim

The paper introduces a video-aware mobility management framework for LEO satellite networks that jointly optimizes satellite handoff and video bitrate selection. Model predictive control and reinforcement learning algorithms are proposed for single-user cases, with an extension to multiple competing users that employs centralized training and distributed inference to inform local policies from a global perspective. Effectiveness is demonstrated through trace-driven simulation and testbed experiments.

What carries the argument

Joint handoff and bitrate optimization using model predictive control for single users and reinforcement learning with centralized training and distributed inference for multiple users.
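
To make the joint decision concrete, the following is a minimal Python sketch of a lookahead search over (satellite, bitrate) pairs. It is not the paper's algorithm. The QoE weights, the handoff delay, the chunk length, and the names `simulate_chunk` and `joint_mpc` are all assumptions chosen for illustration; only the three QoE terms (quality, rebuffering, smoothness) come from the Figure 7 caption.

```python
from itertools import product

CHUNK_SEC = 4.0        # assumed chunk duration (seconds)
HANDOFF_DELAY = 0.5    # assumed stall added when switching satellites (seconds)
REBUF_WEIGHT = 4.3     # assumed rebuffer penalty per second of stall
SMOOTH_WEIGHT = 1.0    # assumed smoothness penalty per Mbps of bitrate change


def simulate_chunk(buffer_sec, bitrate, throughput, switched):
    """Download one chunk; return (new_buffer_sec, rebuffer_sec)."""
    download = bitrate * CHUNK_SEC / throughput
    if switched:
        download += HANDOFF_DELAY
    rebuffer = max(0.0, download - buffer_sec)
    new_buffer = max(0.0, buffer_sec - download) + CHUNK_SEC
    return new_buffer, rebuffer


def joint_mpc(buffer_sec, last_bitrate, cur_sat, predicted, bitrates, horizon):
    """Return (first action of the best plan, its predicted QoE).

    predicted[sat][k] is the predicted throughput (Mbps) of `sat` at step k.
    Every sequence of (satellite, bitrate) choices over the horizon is scored.
    """
    options = list(product(predicted.keys(), bitrates))
    best_plan, best_qoe = None, float("-inf")
    for plan in product(options, repeat=horizon):
        buf, prev_rate, sat, qoe = buffer_sec, last_bitrate, cur_sat, 0.0
        for k, (next_sat, rate) in enumerate(plan):
            buf, rebuf = simulate_chunk(
                buf, rate, predicted[next_sat][k], next_sat != sat
            )
            qoe += rate - REBUF_WEIGHT * rebuf - SMOOTH_WEIGHT * abs(rate - prev_rate)
            prev_rate, sat = rate, next_sat
        if qoe > best_qoe:
            best_plan, best_qoe = plan, qoe
    return best_plan[0], best_qoe
```

Exhaustive enumeration costs (satellites × bitrates)^horizon evaluations, so this sketch is only practical for small horizons; a real controller would prune or approximate the search.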

Load-bearing premise

The simulation models and throughput prediction algorithms accurately capture real LEO satellite channel dynamics and user behavior.
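
As an illustration of what a throughput predictor can look like, the sketch below implements a harmonic-mean predictor, the kind of baseline the Figure 8 caption labels "HM". The window size and the function names are assumptions, not values taken from the paper.

```python
def harmonic_mean_predict(history, window=5):
    """Predict the next throughput as the harmonic mean of recent samples.

    `window` (assumed default: 5) is how many past samples are used.
    """
    recent = history[-window:]
    if not recent or any(x <= 0 for x in recent):
        raise ValueError("need positive throughput samples")
    return len(recent) / sum(1.0 / x for x in recent)


def relative_error(predicted, actual):
    """Absolute prediction error as a fraction of the actual throughput."""
    return abs(predicted - actual) / actual
```

The harmonic mean is dominated by the smallest samples, so a single throughput dip pulls the prediction down more than an arithmetic mean would, which makes it a conservative estimate.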

What would settle it

A controlled comparison in a live LEO satellite testbed measuring rebuffering time, average bitrate, and quality switches when running the joint algorithms versus independent handoff and rate control.
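
The three measurements named above can be computed from a per-chunk session log. The sketch below assumes a hypothetical log format (a list of dicts with `bitrate_kbps` and `rebuffer_sec` keys); the paper's own logging format is not described here.

```python
def session_metrics(chunks):
    """Summarize one streaming session.

    chunks: list of dicts with 'bitrate_kbps' and 'rebuffer_sec' per chunk
    (a hypothetical log format used only for this example).
    """
    if not chunks:
        raise ValueError("need at least one chunk")
    rates = [c["bitrate_kbps"] for c in chunks]
    return {
        "rebuffer_sec": sum(c["rebuffer_sec"] for c in chunks),
        "avg_bitrate_kbps": sum(rates) / len(rates),
        # A switch is counted whenever consecutive chunks differ in bitrate.
        "quality_switches": sum(1 for a, b in zip(rates, rates[1:]) if a != b),
    }
```

Running the same function over sessions from the joint algorithms and from independent handoff and rate control would give directly comparable numbers.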

Figures

Figures reproduced from arXiv: 2504.04586 by Changhan Ge, Cheng Luo, Kyoungjun Park, Lili Qiu, Muhammad Muaz, Yi Xu, Zhiyuan He.

Figure 1
Figure 1: (a) The relationship between the SNR, altitude, and azimuth angles measured from NOAA satellites. (b) A cumulative distribution function (CDF) plot of the SNR prediction based on the ML model.
Adjacent paper text: … that utilizes both rate-based throughput estimates and buffer-based occupancy information to maximize the QoE. Pensieve [18] uses an RL-based algorithm [20] to decide the bitrates of future video chunks. Pe…
Figure 2
Figure 2: The video QoE in the Starlink network at two locations: one free from obstructions and the other with several obstructions around it. The figures show the visibility map generated by the Starlink app; the red portion is the detected obstruction.
Panel labels following the caption: Single Dish, Shared Dish, Split Dish.
Figure 3
Figure 3: Throughput measurement for the Starlink network in different settings: (i) a single dish, (ii) two dishes facing the same direction, and (iii) two dishes facing different directions.
Adjacent paper text: … used to estimate video bitrates based on the number of users and protocol overhead. 3.2 Video Performance in LEO Network. To study the real video-watching experience of end users in the LEO satellite network, we measure the QoE of …
Figure 4
Figure 4: The PPO-based RL algorithm generates joint optimization policies.
Adjacent paper text: State: For each video chunk, the state input to the actor network and the critic network can be represented as $S_t = (b_t, \gamma_t, l_t, n_m, P_t, B_t, v_t)$. $b_t$ is the current buffer level; $\gamma_t$ is the number of chunks remaining in the video; $l_t$ is the bitrate at which the last chunk was downloaded; $n_m$ is a vector of m available sizes for the next vid…
Figure 5
Figure 5: The PPO-based RL algorithm generates joint optimization policies with centralized training and distributed inference.
Adjacent paper text: … reward value. The decision-making process of a single-user RL can be directly applied to multi-user optimization. This RL uses only the current user's state, so we call it the local state. It uses the throughput history of the current user's satellite to implicitly learn the network conditi…
Figure 6
Figure 6: QoE results of the nine models (i.e., MPC, RL). The results are from simulated, NOAA, and Starlink datasets. All models use three bitrate levels for fair comparison except Joint RL (L+, 6B). The number of users refers to how many users are streaming video out of 20 users.
Plot labels following the caption: Quality, Rebuffer, Smoothne… (panels: 1 User, 5 Users, 20 Users).
Figure 7
Figure 7: QoE breakdown into three terms: (1) quality reward, (2) rebuffer penalty, and (3) smoothness penalty. The results are from the simulated dataset. All models use 3 bitrate levels for fair comparison except Joint RL (L+, 6B). The number of users refers to how many users are streaming video out of 20 users.
Adjacent paper text: … can improve QoE by 18%, 68%, and 57% for 1 user, 5 users, and 20 users, respectively, in the Starlink data…
Figure 8
Figure 8: QoE results of MPC methods on the NOAA dataset with and without the model prediction. The Y-axis represents QoE. "HM" indicates the harmonic mean, while "Model" uses ML for prediction.
Panel titles following the caption: MPC: Satellite Throughput; MPC: Bitrate Selection; Joint RL: Satellite Throughput; Joint RL: Bitrate Selection. Legend: SAT_ID=1 [ch…
Figure 9
Figure 9: An obstruction scenario, with the X-axis denoting time (sec). Initially, both Joint MPC (L) and Joint RL (L+) connect with satellite 2. However, satellite 2 experiences an obstruction period from the 15th to the 40th second. Joint MPC (L) switches to a different satellite around the 30th second with significant bitrate fluctuation. Conversely, Joint RL (L+) transitions to an alternative satellite when disturbed, effectively maintaini…
Figure 10
Figure 10: QoE results of Joint RL (L+) on the NOAA dataset varying satellite selection coverage.
Plot labels following the caption: MPC (MB), Joint MPC (Central), Joint RL (L); axes: Models, QoE.
Figure 12
Figure 12: Comparison between testbed results and simulation results. Testbed results are marked with dots.
Adjacent paper text: 6.4 Testbed Results. To validate the effectiveness of our algorithms in practical scenarios, we implement them in a testbed system as described in Section 5. As shown in …
Original abstract

Low Earth Orbit (LEO) satellite communication is a promising approach to providing Internet connectivity to users in many remote areas. As videos are likely to account for most traffic in the LEO satellite network, as in the rest of the Internet, this work introduces a novel video-aware mobility management framework tailored for LEO satellite networks. Utilizing simulation models alongside real-world datasets, we show the importance of handoff strategy and throughput prediction algorithms in single-user and multi-user video streaming scenarios. Motivated by these observations, we propose a set of novel algorithms that can jointly choose the satellite and video bitrate to optimize the Quality of Experience (QoE). We first develop Model Predictive Control (MPC) and Reinforcement Learning (RL) based algorithms for a single user, and then extend them to accommodate multiple competing users that may share the same satellite. We introduce centralized training and distributed inference for our RL design, enabling a distributed policy informed by a global perspective. We demonstrate the effectiveness of our proposed models using trace-driven simulation and testbed experiments. We share our code and data with the research community.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The paper proposes a video-aware mobility management framework for LEO satellite networks that jointly optimizes satellite handoff decisions and video bitrate selection. Using simulation models and real-world datasets, it motivates the need for joint optimization in single-user and multi-user video streaming scenarios. It develops MPC and RL algorithms for single users, extends them to multi-user cases with centralized training and distributed inference for the RL policy, and evaluates the approaches via trace-driven simulations and testbed experiments, claiming QoE improvements over non-joint baselines while sharing code and data publicly.

Significance. If the simulation models and throughput predictors accurately reflect real LEO channel dynamics and user behavior, the work provides practical joint-optimization algorithms that address an emerging challenge in LEO networks where video traffic will dominate. The centralized-training/distributed-inference RL design and the public release of code and data are explicit strengths that support reproducibility and potential follow-on work.

major comments (1)
  1. [Abstract and Evaluation sections] The central claim that the proposed MPC and RL algorithms deliver reliable QoE gains rests on the fidelity of the simulation models and throughput prediction algorithms to real LEO satellite channel dynamics and contention behavior. No sensitivity analysis to prediction error or cross-validation against held-out real traces is described as part of the core argument, which is load-bearing for interpreting the reported improvements versus baselines.
minor comments (2)
  1. Clarify the exact definition of the QoE metric and how it incorporates handoff latency in the single-user and multi-user formulations.
  2. Ensure all figures include error bars or confidence intervals consistent with the number of runs or traces used.
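
One simple way to produce the error bars requested in the second minor comment is a mean with a confidence-interval half-width per configuration. The sketch below uses a normal approximation (z = 1.96 for roughly 95% coverage); this is an assumption for illustration, and with only a handful of runs a Student-t critical value would be the more careful choice.

```python
import math
import statistics


def mean_with_ci(samples, z=1.96):
    """Return (mean, half_width) of a normal-approximation confidence interval.

    samples: per-run or per-trace QoE values for one configuration.
    z: critical value; 1.96 corresponds to roughly 95% under normality.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean = statistics.fmean(samples)
    sem = statistics.stdev(samples) / math.sqrt(n)  # standard error of the mean
    return mean, z * sem
```

The returned half-width can be passed directly as the error-bar length in most plotting libraries.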

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive feedback on the evaluation methodology. We address the single major comment below and will revise the manuscript to incorporate the suggested analyses.

Point-by-point responses
  1. Referee: [Abstract and Evaluation sections] The central claim that the proposed MPC and RL algorithms deliver reliable QoE gains rests on the fidelity of the simulation models and throughput prediction algorithms to real LEO satellite channel dynamics and contention behavior. No sensitivity analysis to prediction error or cross-validation against held-out real traces is described as part of the core argument, which is load-bearing for interpreting the reported improvements versus baselines.

    Authors: We agree that sensitivity analysis to prediction error and cross-validation on held-out traces are important to substantiate the robustness of the reported QoE gains. The manuscript already uses real-world datasets for trace-driven simulations and includes testbed experiments, but does not explicitly include these analyses in the core evaluation. In the revised version, we will add a dedicated subsection to the Evaluation section performing (1) sensitivity analysis by injecting controlled prediction errors (e.g., additive Gaussian noise at varying standard deviations) into the throughput predictor and quantifying impact on QoE for both MPC and RL policies, and (2) cross-validation by partitioning the real traces into training and held-out test sets, retraining predictors on the former and reporting QoE improvements on the latter. These results will also be referenced in the Abstract to support the central claims. revision: yes
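
The two analyses described in the response can be sketched as follows. This is an illustrative outline, not the authors' code: the function names, the clipping floor, and the `evaluate` callback (which in practice would run an MPC or RL policy on the noisy prediction and return its QoE) are all assumptions.

```python
import random


def add_gaussian_noise(throughput, sigma, rng, floor=0.01):
    """Return a noisy copy of a throughput series (Mbps), clipped at `floor`."""
    return [max(floor, x + rng.gauss(0.0, sigma)) for x in throughput]


def split_traces(trace_ids, held_out_fraction, rng):
    """Shuffle trace ids and split them into (train, held_out)."""
    ids = list(trace_ids)
    rng.shuffle(ids)
    n_held = max(1, round(len(ids) * held_out_fraction))
    return ids[n_held:], ids[:n_held]


def sweep(throughput, sigmas, evaluate, seed=0):
    """Map each noise level to evaluate(noisy_prediction, true_throughput)."""
    results = {}
    for sigma in sigmas:
        rng = random.Random(seed)  # same seed per level for comparability
        noisy = add_gaussian_noise(throughput, sigma, rng)
        results[sigma] = evaluate(noisy, throughput)
    return results
```

For example, passing a mean-absolute-error function as `evaluate` shows how prediction error grows with the noise standard deviation; swapping in a QoE evaluation of each policy would give the sensitivity curve the response describes.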

Circularity Check

0 steps flagged

No load-bearing circularity; algorithms motivated by simulations but evaluated on external traces and testbeds without reduction to fitted inputs.

full rationale

The paper uses simulation models and real-world datasets to motivate observations about handoff and throughput prediction, then proposes MPC and RL algorithms for joint optimization. These are evaluated via trace-driven simulation and testbed experiments. No equations, self-citations, or derivations are presented that reduce the claimed QoE gains to quantities defined by the paper's own fitted parameters or inputs by construction. The central claims remain independent of the motivating models, yielding only a minor score for the general reliance on simulation fidelity.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The abstract provides no explicit free parameters, axioms, or invented entities; the work applies standard MPC and RL methods to a new domain using simulation models and real-world datasets whose validity is assumed but not detailed here.

pith-pipeline@v0.9.0 · 5733 in / 1163 out tokens · 34868 ms · 2026-05-22T21:13:44.895544+00:00 · methodology


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Safety-Aware AoI Scheduling for LEO Satellite-Assisted Autonomous Driving

    cs.NI · 2026-04 · unverdicted · novelty 5.0

    SafeScale-MATD3 is a two-timescale AoI scheduler with drift-plus-penalty safety enforcement and proactive handover that meets a 1% collision-alert violation budget in LEO satellite scenarios while cutting critical AoI by 35%.

Reference graph

Works this paper leans on

33 extracted references · 33 canonical work pages · cited by 1 Pith paper · 2 internal anchors

  1. [1]

    Starlink Simulator

    2023. Starlink Simulator. https://starlink.sx

  2. [2]

    Handover Management in Low Earth Orbit (LEO) Satellite Networks

    Ian F. Akyildiz, Huseyin Uzunalioğlu, and Michael D. Bender. 1999. Handover Management in Low Earth Orbit (LEO) Satellite Networks. In Mobile Networks and Applications

  3. [3]

    Prakash Chitre and Ferit Yegenoglu. 1999. Next-generation satellite networks: architectures and implementations. In IEEE Communications Magazine

  4. [4]

    Joseph Coffey. 2023. Latency in Optical Fiber Systems. https://www.commscope.com/globalassets/digizuite/2799-latency-in-optical-fiber-systems-wp-111432-en.pdf

  5. [5]

    Florin Dobrian, Vyas Sekar, Asad Awan, Ion Stoica, Dilip Joseph, Aditya Ganjam, Jibin Zhan, and Hui Zhang. 2011. Understanding the impact of video quality on user engagement. In ACM SIGCOMM

  6. [6]

    Mark Handley. 2018. Delay is not an option: Low latency routing in space. In ACM HotNets

  7. [7]

    Jian He, Mubashir Adnan Qureshi, Lili Qiu, Jin Li, Feng Li, and Lei Han. 2018. Favor: Fine-grained video rate adaptation. In ACM MMSys

  8. [8]

    Cuong Manh Ho, Anh Tien Tran, Chunghyun Lee, Duc Thien Hua, and Sungrae Cho. 2022. Handover in mobility-aware caching strategy for LEO satellite-based overlay system with content delivery network. In ACM MobiHoc

  9. [9]

    Te-Yuan Huang, Ramesh Johari, Nick McKeown, Matthew Trunnell, and Mark Watson. 2014. A buffer-based approach to rate adaptation: Evidence from a large video streaming service. In ACM SIGCOMM

  10. [10]

    Abbas Jamalipour and Tracy Tung. 2001. The role of satellites in global IT: trends and implications. In IEEE Personal Communications

  11. [11]

    Junchen Jiang, Vyas Sekar, and Hui Zhang. 2012. Improving fairness, efficiency, and stability in HTTP-based adaptive video streaming with FESTIVE. In ACM CoNEXT

  12. [12]

    Enric Juan, Mads Lauridsen, Jeroen Wigard, and Preben Mogensen. 2022. Handover solutions for 5G low-earth orbit satellite networks. In IEEE Access

  13. [13]

    Zeqi Lai, Hewu Li, Qi Zhang, Qian Wu, and Jianping Wu. 2021. Cooperatively constructing cost-effective content distribution networks upon emerging low earth orbit satellites and clouds. In IEEE ICNP

  14. [14]

    Zeqi Lai, Qian Wu, Hewu Li, Mingyang Lv, and Jianping Wu. 2021. Orbitcast: Exploiting mega-constellations for low-latency earth observation. In IEEE ICNP

  15. [15]

    Xu Li, Feilong Tang, Long Chen, and Jie Li. 2017. A state-aware and load-balanced routing model for LEO satellite networks. In IEEE GLOBECOM

  16. [16]

    Po-Hsun Lin and Wanjiun Liao. 2023. Space-Centric Adaptive Video Streaming with Quality of Experience Optimization in Low Earth Orbit Satellite Networks. In IEEE ICC

  17. [17]

    Vikalp Mandawaria, Neha Sharma, Diwakar Sharma, Chitradeep Majumdar, Anshuman Nigam, Seungil Park, and Jungsoo Jung. 2022. Uplink zone-based scheduling for LEO satellite based Non-Terrestrial Networks. In IEEE WCNC

  18. [18]

    Hongzi Mao, Ravi Netravali, and Mohammad Alizadeh. 2017. Neural adaptive video streaming with Pensieve. In ACM SIGCOMM

  19. [19]

    Jonathan C McDowell. 2020. The low earth orbit satellite population and impacts of the SpaceX Starlink constellation. In The Astrophysical Journal Letters

  20. [20]

    Volodymyr Mnih, Adria Puigdomenech Badia, Mehdi Mirza, Alex Graves, Timothy Lillicrap, Tim Harley, David Silver, and Koray Kavukcuoglu. 2016. Asynchronous methods for deep reinforcement learning. In PMLR ICML

  21. [21]

    Hoang Nam Nguyen, Salem Lepaja, Jon Schuringa, and Harmen R van As. 2001. Handover management in low earth orbit satellite IP networks. In IEEE GLOBECOM

  22. [22]

    Kyoungjun Park, Myungchul Kim, and Laihyuk Park. 2022. NeuSaver: Neural Adaptive Power Consumption Optimization for Mobile Video Streaming. In IEEE Transactions on Mobile Computing

  23. [23]

    Lerrel Pinto, Marcin Andrychowicz, Peter Welinder, Wojciech Zaremba, and Pieter Abbeel. 2017. Asymmetric actor critic for image-based robot learning. arXiv:1710.06542 (2017)

  24. [24]

    Stefan Schneider, Holger Karl, Ramin Khalili, and Artur Hecker. 2022. DeepCoMP: Coordinated Multipoint Using Multi-Agent Deep Reinforcement Learning

  25. [25]

    Proximal Policy Optimization Algorithms

    John Schulman, Filip Wolski, Prafulla Dhariwal, Alec Radford, and Oleg Klimov. 2017. Proximal policy optimization algorithms. arXiv:1707.06347 (2017)

  27. [27]

    Kevin Spiteri, Rahul Urgaonkar, and Ramesh K Sitaraman. 2020. BOLA: Near-optimal bitrate adaptation for online videos. In IEEE/ACM Transactions on Networking

  28. [28]

    Yi Sun, Xiaoqi Yin, Junchen Jiang, Vyas Sekar, Fuyuan Lin, Nanshu Wang, Tao Liu, and Bruno Sinopoli. 2016. CS2P: Improving video bitrate selection and adaptation with data-driven throughput prediction. In ACM SIGCOMM

  29. [29]

    Deepak Vasisht, Jayanth Shenoy, and Ranveer Chandra. 2021. L2D2: Low latency distributed downlink for LEO satellites. In ACM SIGCOMM

  30. [30]

    Bowei Yang, Yue Wu, Xiaoli Chu, and Guanghua Song. 2016. Seamless handover in software-defined satellite networking. In IEEE Communications Letters

  31. [31]

    Xiaoqi Yin, Abhishek Jindal, Vyas Sekar, and Bruno Sinopoli. 2015. A control-theoretic approach for dynamic adaptive video streaming over HTTP. In ACM SIGCOMM

  32. [32]

    Chao Yu, Akash Velu, Eugene Vinitsky, Jiaxuan Gao, Yu Wang, Alexandre Bayen, and Yi Wu. 2022. The surprising effectiveness of PPO in cooperative multi-agent games. In Advances in Neural Information Processing Systems

  33. [33]

    Haoyuan Zhao, Hao Fang, Feng Wang, and Jiangchuan Liu. 2023. Realtime Multimedia Services over Starlink: A Reality Check. In NOSSDAV