arxiv: 2605.05188 · v1 · submitted 2026-05-06 · 💻 cs.NI


SILC: Lookahead Caching for Short-form Video Delivery Systems

Maleeha Masood , Shreya Kannan , Om Chabra , Deepak Vasisht , Indranil Gupta


Pith reviewed 2026-05-08 15:57 UTC · model grok-4.3

classification 💻 cs.NI

keywords: short video caching · CDN optimization · lookahead caching · cache eviction · midgress bandwidth · recommendation systems · TikTok delivery · video delivery


The pith

SILC uses lookahead from push-based recommendations to cut CDN midgress costs for short videos by 11.1% to 111%.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Short video platforms rely on push-based recommendation systems that reveal sequences of upcoming videos to users rather than requiring explicit selection. These platforms also exhibit highly skewed popularity distributions that create geographic and temporal overlaps in requests. SILC incorporates this lookahead information directly into cache eviction and placement decisions at the CDN. The system thereby reduces both cache misses and the volume of traffic fetched from origin servers. Simulations driven by real user traces demonstrate consistent savings against a range of standard and learning-based eviction policies.

Core claim

SILC is a lookahead-aware caching system that exploits visibility into upcoming requests provided by push-based recommendation engines along with Pareto-distributed popularity overlaps to improve eviction and prefetching for short-form video content. In traces collected from real users and scaled to 10,000 simultaneous viewers, the approach lowers CDN midgress bandwidth costs by 11.1% to 111% relative to ten heuristic and learning-based baselines.

What carries the argument

SILC, a lookahead-aware caching policy that folds recommendation sequences and popularity skew into cache eviction decisions to lower miss rates and origin fetches.
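The lookahead idea can be illustrated with a minimal sketch. This is not SILC's actual policy, which this summary does not specify. The function name `simulate`, the item-count (rather than byte) capacity, and the assumption of perfectly accurate lookahead are all hypothetical simplifications. With `lookahead=0` the sketch behaves as plain LRU; with `lookahead=k` it evicts the cached item whose next request within the next `k` requests is farthest away, in the spirit of Belady's classic replacement rule.

```python
from collections import OrderedDict


def simulate(requests, capacity, lookahead=0):
    """Replay `requests` through a cache of `capacity` items; return the miss count.

    lookahead=0 gives plain LRU. lookahead=k evicts the cached item whose next
    request inside the next k requests is farthest away (items absent from the
    window go first); ties fall back to LRU order.
    """
    cache = OrderedDict()  # keys ordered from least to most recently used
    misses = 0
    for i, video in enumerate(requests):
        if video in cache:
            cache.move_to_end(video)  # refresh recency on a hit
            continue
        misses += 1
        if len(cache) >= capacity:
            window = requests[i + 1 : i + 1 + lookahead]

            def next_use(v):
                # Position of the next request for v, or "beyond the window".
                return window.index(v) if v in window else len(window)

            # max() returns the first maximal key, so ties go to the LRU item.
            victim = max(cache, key=next_use)
            del cache[victim]
        cache[video] = True
    return misses


trace = list("abcdabcd")  # toy cyclic trace, a worst case for LRU
print(simulate(trace, capacity=3))               # 8 misses (LRU)
print(simulate(trace, capacity=3, lookahead=3))  # 5 misses
```

On this toy trace LRU misses every request, while three requests of lookahead are enough to keep the soon-to-be-reused items resident. Real manifests, byte-sized objects, and imperfect lookahead would all change the numbers.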

If this is right

CDNs serving short videos can reduce origin-server bandwidth without hardware changes by using existing recommendation data.
Cache hit rates improve specifically for sequences of consecutively recommended videos.
The same lookahead mechanism can lower midgress costs on any platform where recommendations dictate the next request order.
Geographic and temporal popularity overlaps become exploitable assets rather than noise in cache management.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Direct integration of recommendation engines with CDN control planes may become necessary for efficient short-video delivery at scale.
Similar lookahead techniques could extend to other push-driven services such as music playlists or news feeds where next-item visibility exists.
Reducing midgress traffic through better caching may slow the growth of backbone bandwidth demand driven by short-video platforms.

Load-bearing premise

The collected user traces and 10,000-user simulation accurately reflect real geographic and temporal request patterns, and production CDNs can obtain sufficiently accurate lookahead data from recommendation systems.

What would settle it

Deploying SILC in a production CDN serving short videos and comparing measured midgress bandwidth usage against the same ten baseline policies on live traffic would confirm or refute the reported savings.

Figures

Figures reproduced from arXiv: 2605.05188 by Deepak Vasisht, Indranil Gupta, Maleeha Masood, Om Chabra, Shreya Kannan.

**Figure 1.** TikTok’s FYP (For You Page). Users are shown a sequence of videos tailored to their personal interests as learned by the recommendation algorithm.

**Figure 2.** System Overview. SILC reduces midgress traffic and improves CDN hit rates through new lookahead eviction and online reordering policies, while preserving user engagement.

**Figure 3.** Example Manifest File. A manifest file contains around 30 videos in the itemList and decides the sequence in which videos appear on a user’s FYP.

**Figure 6.** Distribution of Time between Successive Views. 23% of videos in our dataset were watched by more than 1 person, of which 54% were watched within 24 hours. As discussed before, video popularity in TikTok follows a skewed Pareto distribution. The Pareto distribution is defined as follows: $f(x) = \begin{cases} \dfrac{\alpha x_m^{\alpha}}{x^{\alpha+1}}, & x \ge x_m, \\ 0, & x < x_m, \end{cases}$ (1) where $\alpha > 0$ is the shap…

**Figure 5.** Popularity Distribution in Short Videos. TikTok videos follow a highly skewed popularity distribution. We use our data donation exercise (§4) to collect and analyze the associated metadata of 2.65M unique videos.

**Figure 7.** Example of SILC’s Components.

**Figure 8.** Metrics Collected in the User Study.

**Figure 9.** Byte Miss Rate. Byte miss rate of SILC compared to the best learning based (LRB), recency (LRU), frequency (LFUDA) and frequency + size (GDSF) heuristic eviction policies. SILC outperforms all baselines by at least 11.1%. The dotted line refers to the best rate possible with an infinite cache size.

**Figure 10.** Impact of Cache Size, Caching Strategy, Reordering Policy, and Length of Manifest Files on SILC.
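The Pareto density quoted alongside Figure 6 can be evaluated directly. The sketch below is illustrative only: the function names are hypothetical, `x_m` is taken to be the scale (minimum value) parameter as in the standard Pareto definition, and the parameter values are made up because the paper's fitted values do not appear in this summary.

```python
def pareto_pdf(x, alpha, x_m):
    """Density f(x) = alpha * x_m**alpha / x**(alpha + 1) for x >= x_m, else 0."""
    if alpha <= 0 or x_m <= 0:
        raise ValueError("alpha and x_m must be positive")
    return alpha * x_m**alpha / x ** (alpha + 1) if x >= x_m else 0.0


def pareto_tail(x, alpha, x_m):
    """P(X > x): the fraction of items whose popularity exceeds x."""
    if alpha <= 0 or x_m <= 0:
        raise ValueError("alpha and x_m must be positive")
    return (x_m / x) ** alpha if x >= x_m else 1.0


# Illustrative parameters only (assumed, not taken from the paper).
alpha, x_m = 1.0, 1.0
print(pareto_pdf(2.0, alpha, x_m))    # 0.25
print(pareto_tail(10.0, alpha, x_m))  # 0.1
```

A small `alpha` gives a heavy tail: a small fraction of videos accounts for views far above the minimum, which is the skew that makes shared caching across users worthwhile.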


Short video platforms like TikTok, Instagram Reels, and YouTube Shorts have gained immense popularity in the last few years and are responsible for a large and growing fraction of Internet traffic. We identify two unique opportunities for improving short video delivery using their existing interactions with content delivery networks (CDNs). First, short videos use a push-based recommendation system, where the user is presented a sequence of videos recommended by the algorithm rather than user explicitly picking content to watch (e.g., in YouTube). Such push-based short video systems offer a unique opportunity for system design by providing visibility into upcoming requests. Second, the popularity of these videos follows a highly skewed Pareto distribution, leading to geographical and temporal overlap amongst videos being served. We leverage these opportunities to build SILC - a lookahead-aware caching system, aimed at (i) reducing CDN cache miss rates, as well as (ii) reducing midgress bandwidth between the CDN and the origin server. Our evaluation of SILC uses traces that we collect from real users, through (i) an in-person user study, and (ii) a data donation program involving 100 TikTok users across the world. Using a combination of these traces, we simulate traffic from 10,000 simultaneous users. Our evaluation shows that, compared to 10 state-of-the-art heuristic and learning-based cache eviction policies, SILC reduces a CDN's midgress costs by 11.1% to 111%.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

The 111% midgress reduction claim is arithmetically impossible and needs fixing before the numbers can be trusted, though the targeted use of lookahead from push recommendations is a reasonable practical step.


The main thing to know is that the paper's headline result, a reduction in midgress costs of up to 111%, cannot be correct as stated. A percentage reduction cannot exceed 100% if the cost stays non-negative, so either the metric is defined differently or there is a calculation mistake that the authors need to address right away. That single inconsistency undercuts the evaluation before anything else gets discussed.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes SILC, a lookahead-aware caching system for short-form video delivery over CDNs. It exploits push-based recommendation systems (providing visibility into upcoming video requests) and the highly skewed Pareto popularity distribution of short videos to reduce cache miss rates and midgress bandwidth costs between CDN and origin. Traces are collected via an in-person user study plus a data-donation program from 100 TikTok users worldwide; these are used to drive a simulation of 10,000 simultaneous users. The central quantitative claim is that SILC outperforms 10 state-of-the-art heuristic and learning-based eviction policies, reducing CDN midgress costs by 11.1% to 111%.

Significance. If the performance numbers are shown to be correctly computed and robust, the work identifies a practically relevant opportunity: using readily available recommendation lookahead to improve caching for a traffic class that already dominates large portions of Internet bandwidth. The approach is conceptually simple yet directly applicable to production CDNs serving TikTok, Reels, or Shorts, and could translate into measurable reductions in origin-fetch traffic and associated costs.

major comments (2)

[Abstract] Abstract: the headline result states that SILC 'reduces a CDN's midgress costs by 11.1% to 111%'. Under the conventional definition of relative reduction ((cost_baseline - cost_SILC)/cost_baseline), any figure exceeding 100% requires cost_SILC < 0. Midgress bandwidth and origin-fetch costs cannot be negative; the upper bound is therefore arithmetically impossible and indicates either a calculation error, an unreported non-standard normalization, or a reporting typo. Because this single quantitative claim is the only concrete performance number supplied, the inconsistency is load-bearing for the evaluation.
[Evaluation] Evaluation section (presumably §4–§5): the abstract reports concrete percentage reductions from trace-driven simulation but supplies no error bars, no description of the exact midgress measurement procedure, and no sensitivity analysis with respect to trace selection, user count, or simulation parameters. Without these, it is impossible to judge whether the 11.1%–111% range is statistically reliable or an artifact of the particular 10,000-user workload.
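The arithmetic behind the first major comment can be checked with a short sketch. The byte counts are made up, and the second function is only one hypothetical example of a non-standard normalization that can exceed 100%; nothing in the abstract indicates which formula the authors actually used.

```python
def relative_reduction(cost_baseline, cost_new):
    """Standard relative reduction: (baseline - new) / baseline. At most 1.0 when new >= 0."""
    return (cost_baseline - cost_new) / cost_baseline


def relative_to_new(cost_baseline, cost_new):
    """Hypothetical alternative: normalize by the new cost instead. Can exceed 1.0."""
    return (cost_baseline - cost_new) / cost_new


baseline, new = 211.0, 100.0  # made-up midgress volumes, arbitrary units
print(f"{relative_reduction(baseline, new):.1%}")  # 52.6%
print(f"{relative_to_new(baseline, new):.1%}")     # 111.0%
```

Under the standard formula even a cost of zero yields exactly 100%, so a figure of 111% is only reachable with a different denominator or an error.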

minor comments (2)

[Abstract] The abstract refers to '10 state-of-the-art heuristic and learning-based cache eviction policies' without naming them or citing their sources; the full manuscript must list the baselines and provide references so readers can reproduce the comparison.
[System Design] Clarify the precise form and accuracy assumptions of the 'lookahead information from the recommendation system' that SILC is assumed to receive in production; the current text leaves open whether this information is perfect, delayed, or partially available.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful and constructive review. The comments highlight important issues with the presentation of results and the robustness of the evaluation. We address each point below and will revise the manuscript to correct errors and add the requested details.


Referee: [Abstract] Abstract: the headline result states that SILC 'reduces a CDN's midgress costs by 11.1% to 111%'. Under the conventional definition of relative reduction ((cost_baseline - cost_SILC)/cost_baseline), any figure exceeding 100% requires cost_SILC < 0. Midgress bandwidth and origin-fetch costs cannot be negative; the upper bound is therefore arithmetically impossible and indicates either a calculation error, an unreported non-standard normalization, or a reporting typo.

Authors: We agree that a relative reduction percentage cannot exceed 100% under the standard formula, as costs cannot be negative. The reported upper bound of 111% is a reporting error in the abstract. We will correct the abstract to state the accurate range of midgress cost reductions observed across the ten policies (all values will be strictly below 100%) and will add a brief explanation of the relative reduction formula used. This change ensures the claim is arithmetically valid while preserving the substance of the performance comparison.
Revision: yes
Referee: [Evaluation] Evaluation section (presumably §4–§5): the abstract reports concrete percentage reductions from trace-driven simulation but supplies no error bars, no description of the exact midgress measurement procedure, and no sensitivity analysis with respect to trace selection, user count, or simulation parameters. Without these, it is impossible to judge whether the 11.1%–111% range is statistically reliable or an artifact of the particular 10,000-user workload.

Authors: We will strengthen the evaluation section as follows: add error bars (standard deviation across runs or 95% confidence intervals) to all reported figures; provide a detailed description of the midgress bandwidth measurement procedure, including how origin fetches and inter-CDN transfers are accounted for in the simulator; and include sensitivity analyses varying user count, trace subsets, and key parameters (e.g., cache size, lookahead window). These additions will demonstrate that the performance gains are robust rather than artifacts of the specific workload.
Revision: yes
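A measurement procedure of the kind requested could look like the following minimal sketch. It is an assumption, not the paper's simulator: the function name `miss_metrics` and the log format are hypothetical, every miss is assumed to fetch the whole object from the origin exactly once, and inter-CDN transfers are ignored.

```python
def miss_metrics(log):
    """Summarize a replay log given as (size_bytes, was_hit) pairs, one per request."""
    total_requests = len(log)
    total_bytes = sum(size for size, _ in log)
    missed_sizes = [size for size, hit in log if not hit]
    return {
        "miss_rate": len(missed_sizes) / total_requests,
        "byte_miss_rate": sum(missed_sizes) / total_bytes,
        # Assumption: each miss pulls the full object from the origin once.
        "midgress_bytes": sum(missed_sizes),
    }


# Made-up log: one miss out of four requests, but on the largest video.
log = [(1_000_000, True), (4_000_000, False), (1_000_000, True), (2_000_000, True)]
print(miss_metrics(log))
# {'miss_rate': 0.25, 'byte_miss_rate': 0.5, 'midgress_bytes': 4000000}
```

The example shows why request miss rate and midgress can diverge: a single miss on a large video accounts for a quarter of the requests but half of the bytes.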

Circularity Check

0 steps flagged

No circularity: performance claims are direct empirical outputs from trace-driven simulation


The paper's central claims consist of empirical performance numbers obtained by running SILC and 10 baseline policies on a simulation driven by independently collected user traces (in-person study plus 100 TikTok users) scaled to 10,000 simultaneous users. No mathematical derivation, parameter fitting, or predictive model is described whose outputs are then fed back as inputs. No self-citations, uniqueness theorems, or ansatzes are invoked in a load-bearing manner for the reported reductions. The 11.1%-111% range is presented as a direct comparison result rather than a constructed or renamed quantity. This is a standard trace-driven evaluation with no reduction of claims to their own inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central performance claim rests on the assumption that the collected traces are representative and that the simulation faithfully models CDN midgress traffic. No free parameters or invented entities are mentioned in the abstract.

axioms (2)

domain assumption: Push-based recommendation systems provide reliable advance visibility into the sequence of videos a user will request.
Rationale: This is the key premise that enables lookahead caching.
domain assumption: The popularity distribution of short videos exhibits sufficient geographical and temporal overlap to make prefetching beneficial.
Rationale: Stated as the second opportunity leveraged by SILC.



Reference graph

Works this paper leans on

67 extracted references

[1] AKHTAR, Z., LI, Y., GOVINDAN, R., HALEPOVIC, E., HAO, S., LIU, Y., AND SEN, S. Avic: a cache for adaptive bitrate video. In 15th International Conference on Emerging Networking Experiments And Technologies (2019), CoNEXT ’19, Association for Computing Machinery.

[2] ARLITT, M., CHERKASOVA, L., DILLEY, J., FRIEDRICH, R., AND JIN, T. Evaluating content management techniques for web proxy caches. SIGMETRICS Perform. Eval. Rev. 27, 4 (2000).

[3] ATIKOGLU, B., XU, Y., FRACHTENBERG, E., JIANG, S., AND PALECZNY, M. Workload analysis of a large-scale key-value store. In 12th ACM SIGMETRICS/PERFORMANCE Joint International Conference on Measurement and Modeling of Computer Systems (2012), SIGMETRICS ’12, Association for Computing Machinery.

[4] BARNUM, C. Planning for usability testing. In Usability Testing Essentials, C. Barnum, Ed., Morgan Kaufmann Series in Interactive Technologies. Elsevier, 2011, pp. 105–155. Chapter 5.

[5] BECKMANN, N., CHEN, H., AND CIDON, A. LHD: Improving cache hit rate by maximizing hit density. In 15th USENIX Symposium on Networked Systems Design and Implementation (NSDI 18) (2018), USENIX Association.

[6] BELADY, L. A study of replacement algorithms for a virtual-storage computer. IBM Systems Journal 5, 2 (1966).

[7] BELL, A. How to man in the middle https using mitmproxy, 2023.

[8] BERGER, D. S. Towards lightweight and robust machine learning for cdn caching. In 17th ACM Workshop on Hot Topics in Networks (2018), HotNets ’18, Association for Computing Machinery.

[9] BERGER, D. S., SITARAMAN, R. K., AND HARCHOL-BALTER, M. AdaptSize: Orchestrating the hot object memory cache in a content delivery network. In 14th USENIX Symposium on Networked Systems Design and Implementation (NSDI 17) (2017), USENIX Association.

[10] BRESLAU, L., CAO, P., FAN, L., PHILLIPS, G., AND SHENKER, S. Web caching and zipf-like distributions: evidence and implications. In IEEE INFOCOM ’99. Conference on Computer Communications. 18th Annual Joint Conference of the IEEE Computer and Communications Societies. The Future is Now (1999), vol. 1.
[11] CAO, P., AND IRANI, S. Cost-Aware WWW proxy caching algorithms. In USENIX Symposium on Internet Technologies and Systems (USITS 97) (1997), USENIX Association.

[12] CASTILLO, E., AND HADI, A. S. Fitting the generalized pareto distribution to data. Journal of the American Statistical Association 92, 440 (1997).

[13] CH, D. Instagram reels: Statistics & user-growth (2025), 2025.

[14] CHA, M., KWAK, H., RODRIGUEZ, P., AHN, Y.-Y., AND MOON, S. I tube, you tube, everybody tubes: analyzing the world’s largest user generated content video system. In 7th ACM SIGCOMM Conference on Internet Measurement (2007), IMC ’07, Association for Computing Machinery.

[15] CHARNESS, G., GNEEZY, U., AND KUHN, M. A. Experimental methods: Between-subject and within-subject design. Journal of Economic Behavior & Organization 81, 1 (2012), 1–8.

[16] CHEN, J., SHARMA, N., KHAN, T., LIU, S., CHANG, B., AKELLA, A., SHAKKOTTAI, S., AND SITARAMAN, R. K. Darwin: Flexible learning-based cdn caching. In ACM SIGCOMM 2023 Conference (2023), ACM SIGCOMM ’23, Association for Computing Machinery.

[17] CHEN, Z., HE, Q., MAO, Z., CHUNG, H.-M., AND MAHARJAN, S. A study on the characteristics of douyin short videos and implications for edge caching. In ACM Turing Celebration Conference - China (2019), ACM TURC ’19, Association for Computing Machinery.

[18] CORBATO, F. J. A paging experiment with the multics system. Massachusetts Institute of Technology, 1968.

[19] DAN, A., AND TOWSLEY, D. An approximate analysis of the lru and fifo buffer replacement schemes. In 1990 ACM SIGMETRICS Conference on Measurement and Modeling of Computer Systems (1990), SIGMETRICS ’90, Association for Computing Machinery.

[20] DOBRUSHIN, R. Prescribing a system of random variables by conditional distributions. Theory of Probability & Its Applications 15, 3 (1970).
[21] DUPLYAKIN, D., RICCI, R., MARICQ, A., WONG, G., DUERIG, J., EIDE, E., STOLLER, L., HIBLER, M., JOHNSON, D., WEBB, K., AKELLA, A., WANG, K., RICART, G., LANDWEBER, L., ELLIOTT, C., ZINK, M., CECCHET, E., KAR, S., AND MISHRA, P. The design and operation of CloudLab. In 2019 USENIX Annual Technical Conference (USENIX ATC 19) (2019), USENIX Association.

[22] E, J., XU, W., BI, J., HE, L., LI, H., GU, A., YANG, D., AND CHAI, Y. Wisely optimizing short video streaming for a user-vendor win-win outcome. In Proceedings of the ACM SIGCOMM 2025 Conference (New York, NY, USA, 2025), SIGCOMM ’25, Association for Computing Machinery, p. 1272–1274.

[23] EINZIGER, G., FRIEDMAN, R., AND MANES, B. Tinylfu: A highly efficient cache admission policy. ACM Trans. Storage 13, 4 (2017).

[24] GHAZNAVI, M., JALALPOUR, E., SALAHUDDIN, M. A., BOUTABA, R., MIGAULT, D., AND PREDA, S. Content delivery network security: A survey. IEEE Communications Surveys & Tutorials 23, 4 (2021).

[25] GIORDANO, D., TRAVERSO, S., GRIMAUDO, L., MELLIA, M., BARALIS, E., TONGAONKAR, A., AND SAHA, S. Youlighter: A cognitive approach to unveil youtube cdn and changes. IEEE Transactions on Cognitive Communications and Networking 1, 2 (2015).

[26] GUAN, Y., ZHANG, X., AND GUO, Z. Caca: Learning-based content-aware cache admission for video content in edge caching. In 27th ACM International Conference on Multimedia (2019), MM ’19, Association for Computing Machinery.

[27] GUMMADI, K. P., DUNN, R. J., SAROIU, S., GRIBBLE, S. D., LEVY, H. M., AND ZAHORJAN, J. Measurement, modeling, and analysis of a peer-to-peer file-sharing workload. In 19th ACM Symposium on Operating Systems Principles (2003), SOSP ’03, Association for Computing Machinery.

[28] GUO, J., AND ZHANG, G. A video-quality driven strategy in short video streaming. In 24th International ACM Conference on Modeling, Analysis and Simulation of Wireless and Mobile Systems (2021), MSWiM ’21, Association for Computing Machinery.

[29] HE, J., HU, M., ZHOU, Y., AND WU, D. Liveclip: towards intelligent mobile short-form video streaming with deep reinforcement learning. In 30th ACM Workshop on Network and Operating Systems Support for Digital Audio and Video (2020), NOSSDAV ’20, Association for Computing Machinery.

[30] HUANG, T., ZHOU, C., JIA, L., ZHANG, R.-X., AND SUN, L. Learned internet congestion control for short video uploading. In 30th ACM International Conference on Multimedia (2022), MM ’22, Association for Computing Machinery.
[31] INTELLIGENCE, N. N. Tiktok cdn, 2024.

[32] IQBAL, H., AHMAD, R., AHMED, W., QAZI, S., AND ALAM, M. M. Analyzing and investigating encrypted traffic for social media application instagram. In 2022 18th Biennial Baltic Electronics Conference (BEC) (2022).

[33] JIA, L., ZHOU, C., LI, C., LIU, J., AND SUN, L. Towards user-level qoe: Large-scale practice in personalized optimization of adaptive video streaming. In Proceedings of the ACM SIGCOMM 2025 Conference (New York, NY, USA, 2025), SIGCOMM ’25, Association for Computing Machinery, p. 1154–1166.

[34] JIANG, C., QIU, Y., SHI, W., GE, Z., WANG, J., CHEN, S., CÉRIN, C., REN, Z., XU, G., AND LIN, J. Characterizing co-located workloads in alibaba cloud datacenters. IEEE Transactions on Cloud Computing 10, 4 (2022).

[35] JIANG, Q.-Y., HE, Y., LI, G., LIN, J., LI, L., AND LI, W.-J. Svd: A large-scale short video dataset for near-duplicate video retrieval. In IEEE/CVF International Conference on Computer Vision (ICCV) (October 2019).

[36] JOHNSON, T., SHASHA, D., ET AL. 2q: a low overhead high performance buffer management replacement algorithm. In 20th International Conference on Very Large Data Bases (1994).

[37] KAREDLA, R., LOVE, J., AND WHERRY, B. Caching strategies to improve disk system performance. Computer 27, 3 (1994).

[38] KHLAIF, Z., AND SALHA, S. Using tiktok in education: A form of micro-learning or nano-learning? Interdisciplinary Journal of Virtual Learning in Medical Sciences 12, 3 (2021).

[39] KIRILIN, V., SUNDARRAJAN, A., GORINSKY, S., AND SITARAMAN, R. K. Rl-cache: Learning-based cache admission for content delivery. In 2019 Workshop on Network Meets AI & ML (2019), NetAI’19, Association for Computing Machinery.

[40] KRISHNAPPA, D. K., ZINK, M., AND SITARAMAN, R. K. Optimizing the video transcoding workflow in content delivery networks. In 6th ACM Multimedia Systems Conference (2015), MMSys ’15, Association for Computing Machinery.
[41] KRUSE, R. L. Data structures and program design (2nd ed.). Prentice-Hall, Inc., USA, 1986.

[42] KUMAR, N. Youtube shorts statistics 2025 — active users & demographics, 2024.

[43] LI, Z., LIU, H., HUANG, S., GENG, B., CHEN, J., CHEN, J., SUN, L., MA, Q., LIU, P., ZHAO, J., LIAO, Y., CHEN, J., MA, Q., MA, Q., AND QIAN, F. Tladder: Qoe-centric video ladder optimization with playback feedback at billion scale. In Proceedings of the ACM SIGCOMM 2025 Conference (New York, NY, USA, 2025), SIGCOMM ’25, Association for Computing Machinery, ...

[44] LI, Z., XIE, Y., NETRAVALI, R., AND JAMIESON, K. Dashlet: Taming swipe uncertainty for robust short video streaming. In 20th USENIX Symposium on Networked Systems Design and Implementation (NSDI 23) (2023), USENIX Association.

[45] LIANG, T. The method of the money making mechanism of tiktok. In 2021 3rd International Conference on Economic Management and Cultural Industry (ICEMCI 2021) (2021), Atlantis Press.

[46] MEGIDDO, N., AND MODHA, D. Outperforming lru with an adaptive replacement cache algorithm. Computer 37, 4 (2004).

[47] PARETO, V. Cours d’économie politique. F. Rouge, Lausanne, 1896.

[48] RAYBURN, D. An update on tiktok’s diy cdn strategy and the impact on third-party cdns, 2021.

[49] RODRIGUEZ, L. V., YUSUF, F., LYONS, S., PAZ, E., RANGASWAMI, R., LIU, J., ZHAO, M., AND NARASIMHAN, G. Learning cache replacement with CACHEUS. In 19th USENIX Conference on File and Storage Technologies (FAST 21) (2021), USENIX Association.

[50] SHAH, R., VARMAN, P. J., AND VITTER, J. S. Online algorithms for prefetching and caching on parallel disks. In 16th Annual ACM Symposium on Parallelism in Algorithms and Architectures (2004), SPAA ’04, Association for Computing Machinery.
[51] SONG, Z., BERGER, D. S., LI, K., AND LLOYD, W. Learning relaxed belady for content distribution network caching. In 17th USENIX Symposium on Networked Systems Design and Implementation (NSDI 20) (2020), USENIX Association.

[52] SONG, Z., CHEN, K., SARDA, N., ALTINBÜKEN, D., BREVDO, E., COLEMAN, J., JU, X., JURCZYK, P., SCHOOLER, R., AND GUMMADI, R. HALP: Heuristic aided learned preference eviction policy for YouTube content delivery network. In 20th USENIX Symposium on Networked Systems Design and Implementation (NSDI 23) (2023), USENIX Association.

[53] STOCKER, V., SMARAGDAKIS, G., LEHR, W., AND BAUER, S. The growing complexity of content delivery networks: Challenges and implications for the internet ecosystem. Telecommunications Policy 41, 10 (2017).

[54] SUNDARRAJAN, A., KASBEKAR, M., SITARAMAN, R. K., AND SHUKLA, S. Midgress-aware traffic provisioning for content delivery. In 2020 USENIX Annual Technical Conference (USENIX ATC 20) (2020), USENIX Association.

[55] TEAM, B. Tiktok statistics you need to know, 2025.

[56] VIETRI, G., RODRIGUEZ, L. V., MARTINEZ, W. A., LYONS, S., LIU, J., RANGASWAMI, R., ZHAO, M., AND NARASIMHAN, G. Driving cache replacement with ML-based LeCaR. In 10th USENIX Workshop on Hot Topics in Storage and File Systems (HotStorage 18) (2018), USENIX Association.

[57] WANG, C., JAYASEELAN, A., AND KIM, H. Comparing cloud content delivery networks for adaptive video streaming. In 2018 IEEE 11th International Conference on Cloud Computing (CLOUD) (2018).

[58] WEI, D., ZHANG, J., LI, H., XUE, Z., PENG, Y., AND HAN, R. Multipath smart preloading algorithms in short video peer-to-peer cdn transmission architecture. IEEE Network 38, 3 (2024).

[59] WEI, D., ZHANG, J., LIU, X., LI, H., XUE, Z., HUANG, T., JIANG, L., AND LI, J. Qoe-optimized multipath scheduling for video services in large-scale peer-to-peer cdns. IEEE Journal on Selected Areas in Communications (2025).

[60] WILCOXON, F. Individual comparisons by ranking methods. Biometrics Bulletin 1, 6 (1945), 80–83.
[61] WU, J. Study of a Video-sharing Platform: The Global Rise of TikTok. PhD thesis, Massachusetts Institute of Technology, 2021.

[62] WU, Z., DENG, Y., FENG, H., ZHOU, Y., MIN, G., AND ZHANG, Z. Blender: A container placement strategy by leveraging zipf-like distribution within containerized data centers. IEEE Transactions on Network and Service Management 19, 2 (2022).

[63] XU, Z., LI, Q., SHI, W., JIANG, Y., YUAN, Z., ZHANG, P., AND MUNTEAN, G.-M. Nctm: A novel coded transmission mechanism for short video deliveries. In Proceedings of the ACM Web Conference 2024 (New York, NY, USA, 2024), WWW ’24, Association for Computing Machinery, p. 2847–2858.

[64] ZANNETTOU, S., NEMES-NEMETH, O., AYALON, O., GOETZEN, A., GUMMADI, K. P., REDMILES, E. M., AND ROESNER, F. Analyzing user engagement with tiktok’s short format video recommendations using data donations. In Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems (New York, NY, USA, 2024), CHI ’24, Association for Computing Machinery.

[65] ZHANG, L., WANG, F., AND LIU, J. Mobile instant video clip sharing with screen scrolling: Measurement and enhancement. IEEE Transactions on Multimedia 20, 8 (2018).

[66] ZHANG, Y., YANG, J., YUE, Y., VIGFUSSON, Y., AND RASHMI, K. SIEVE is simpler than LRU: an efficient Turn-Key eviction algorithm for web caches. In 21st USENIX Symposium on Networked Systems Design and Implementation (NSDI 24) (2024), USENIX Association.

[67] ZIVKOV, B., AND SMITH, A. Disk caching in large database and timeshared systems. In 5th International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (1997).

Appendix: Additional User Study Details. In Table 3, we list the different questions that we asked participants of our user study at the end of a watch session a...