TrimCaching: Parameter-sharing Edge Caching for AI Model Downloading
Pith reviewed 2026-05-24 02:31 UTC · model grok-4.3
The pith
Sharing parameter blocks across AI models lets edge networks cache more models and raise hit ratios.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
TrimCaching exploits the fact that many AI models share parameter blocks containing reusable knowledge; modeling this overlap turns model placement into a submodular maximization problem whose solution, via a special-case polynomial algorithm or a greedy method, improves cache hit ratio over non-sharing baselines.
What carries the argument
The parameter-sharing model placement formulation that treats shared blocks as common storage items to maximize hit ratio under latency and storage constraints.
If this is right
- Edge servers can store a larger effective catalog of models within the same memory budget.
- Download latency for users requesting models with shared blocks drops because fewer unique blocks need transmission.
- The approximation algorithms give network operators a concrete way to compute placements without solving the NP-hard general problem.
- The framework extends to any set of models whose parameter overlap can be quantified in advance.
Where Pith is reading between the lines
- If parameter sharing turns out to be dynamic rather than fixed, the placement decisions would need periodic recomputation as new models arrive.
- The same sharing idea could apply to caching of other composite objects such as container images or dataset shards that contain overlapping files.
- Operators might combine TrimCaching with popularity prediction to decide which shared blocks to pre-position on which edges.
Load-bearing premise
A wide range of AI models share a significant proportion of parameter blocks that can be treated as reusable across models.
What would settle it
Running the placement algorithm on a trace of real AI models where the measured shared-block overlap is below the level assumed in the special-case analysis and observing no hit-ratio gain over independent-model caching.
Figures
read the original abstract
Next-generation mobile networks are expected to facilitate fast AI model downloading to end users. By caching models on edge servers, mobile networks can deliver models to end users with low latency, resulting in a paradigm of edge model caching. In this paper, we develop a novel model placement framework, called parameter-sharing model caching (TrimCaching). TrimCaching exploits the key observation that a wide range of AI models, such as convolutional neural networks or large language models, can share a significant proportion of parameter blocks containing reusable knowledge, thereby improving storage efficiency. To this end, we formulate a parameter-sharing model placement problem to maximize the cache hit ratio in multi-edge wireless networks by balancing the fundamental tradeoff between storage efficiency and service latency. We show that the formulated problem is a submodular maximization problem with submodular constraints, for which no polynomial-time approximation algorithm exists. To tackle this challenge, we study an important special case, where a small fixed number of parameter blocks are shared across models, which often holds in practice. In such a case, a polynomial-time algorithm with a $\left(1-\epsilon\right)/2$-approximation guarantee is developed. Subsequently, we address the original problem for the general case by developing a greedy algorithm. Simulation results demonstrate that the proposed TrimCaching framework significantly improves the cache hit ratio compared with state-of-the-art content caching without exploiting shared parameters in AI models.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes TrimCaching, a parameter-sharing edge caching framework for AI model downloading in multi-edge wireless networks. It formulates the model placement problem as maximizing cache hit ratio by exploiting shared parameter blocks across models (e.g., CNNs or LLMs), shows the problem is submodular maximization under submodular constraints (no poly-time approximation exists in general), develops a polynomial-time (1-ε)/2-approximation algorithm for the special case of a small fixed number of shared blocks, proposes a greedy algorithm for the general case, and reports via simulations that TrimCaching significantly improves cache hit ratio over state-of-the-art content caching methods that ignore parameter sharing.
Significance. If the algorithmic guarantees and simulation improvements hold, the work addresses a timely problem in edge computing for large AI models by trading off storage efficiency against latency through parameter reuse. The submodular formulation, the special-case approximation guarantee, and the practical observation about fixed shared blocks are strengths that could influence caching designs in 5G/6G networks.
major comments (3)
- [Abstract] Abstract: The central simulation claim (significant cache hit ratio improvement) is load-bearing for the paper's contribution, yet the abstract provides no details on simulation setup, number of trials, error bars, specific network parameters, or how submodularity was verified; this prevents assessment of whether the reported gains are robust or reproducible.
- [Abstract] Abstract: The key modeling assumption that 'a wide range of AI models... share a significant proportion of parameter blocks... often holds with a small fixed number of shared blocks in practice' is stated without quantification or reference to concrete model families (e.g., specific CNN or LLM parameter overlap statistics); this assumption directly enables the special-case algorithm and must be supported for the (1-ε)/2 guarantee to be practically relevant.
- [Abstract] Abstract: The claim that the formulated problem is 'a submodular maximization problem with submodular constraints, for which no polynomial-time approximation algorithm exists' is presented without a proof sketch or reference to the specific submodular constraint functions; verification of both submodularity and the inapproximability result is required to justify moving to the special-case and greedy algorithms.
minor comments (1)
- [Abstract] Abstract: The approximation ratio is written as $(1-ε)/2$; clarify whether ε is a user-specified parameter or derived from the number of shared blocks.
Simulated Author's Rebuttal
We thank the referee for the constructive comments. We address each major comment below and will revise the abstract to better support the claims while respecting length constraints.
read point-by-point responses
-
Referee: [Abstract] Abstract: The central simulation claim (significant cache hit ratio improvement) is load-bearing for the paper's contribution, yet the abstract provides no details on simulation setup, number of trials, error bars, specific network parameters, or how submodularity was verified; this prevents assessment of whether the reported gains are robust or reproducible.
Authors: We agree that the abstract would benefit from additional details on the simulation results. In the revision, we will incorporate key elements such as the number of Monte Carlo trials, mention of error bars, and primary network parameters (e.g., number of edge servers and model sizes). Submodularity is established theoretically (see Section III); we will add a reference to this section. The complete simulation methodology remains in Section V. revision: yes
-
Referee: [Abstract] Abstract: The key modeling assumption that 'a wide range of AI models... share a significant proportion of parameter blocks... often holds with a small fixed number of shared blocks in practice' is stated without quantification or reference to concrete model families (e.g., specific CNN or LLM parameter overlap statistics); this assumption directly enables the special-case algorithm and must be supported for the (1-ε)/2 guarantee to be practically relevant.
Authors: We acknowledge that the assumption requires stronger support. We will revise the abstract (or move supporting text to the introduction) to include quantitative examples drawn from the literature on CNNs (e.g., ResNet variants) and LLMs (e.g., BERT/GPT fine-tuning), citing typical shared-block overlap statistics of 30-60%. This will directly justify the practical relevance of the special-case algorithm. revision: yes
-
Referee: [Abstract] Abstract: The claim that the formulated problem is 'a submodular maximization problem with submodular constraints, for which no polynomial-time approximation algorithm exists' is presented without a proof sketch or reference to the specific submodular constraint functions; verification of both submodularity and the inapproximability result is required to justify moving to the special-case and greedy algorithms.
Authors: The abstract summarizes results whose full proofs appear in Section III, where we prove submodularity of both the objective (cache-hit ratio) and the per-edge storage constraints, and invoke the known inapproximability of submodular maximization under submodular knapsack constraints. We will revise the abstract to add an explicit reference to Section III and briefly identify the constraint functions. revision: partial
Circularity Check
No significant circularity in derivation chain
full rationale
The paper formulates a parameter-sharing model placement problem as submodular maximization with submodular constraints, develops a polynomial-time approximation for the special case of fixed shared blocks and a greedy algorithm for the general case, then validates via simulation. These steps rest on standard submodular optimization properties and the independent observation about shared AI model parameters; no equation or result reduces by construction to a fitted input, self-citation, or renamed known pattern. Simulations provide external empirical evidence rather than a forced prediction. The derivation is therefore self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption A wide range of AI models share a significant proportion of parameter blocks containing reusable knowledge, often with a small fixed number shared across models.
Reference graph
Works this paper leans on
-
[1]
TrimCaching: Parameter- sharing AI model caching in wireless edge networks,
G. Qu, Z. Lin, F. Liu, X. Chen, and K. Huang, “TrimCaching: Parameter- sharing AI model caching in wireless edge networks,” in Proc. IEEE Int. Conf. Distrib. Comput. Syst. (ICDCS) , Jersey City, NJ, USA, Jul. 2024, pp. 36–46
work page 2024
-
[2]
D²-JSCC: Digital deep joint source-channel coding for semantic communications,
J. Huang, K. Yuan, C. Huang, and K. Huang, “D²-JSCC: Digital deep joint source-channel coding for semantic communications,” IEEE J. Sel. Areas Commun., vol. 43, no. 4, pp. 1246–1261, Apr. 2025
work page 2025
-
[3]
Semantic sleuth: Identifying ponzi contracts via large language models,
C. Wu, J. Chen, Z. Wang, R. Liang, and R. Du, “Semantic sleuth: Identifying ponzi contracts via large language models,” in Proc. 39th IEEE/ACM Int. Conf. Autom. Softw. Eng. , ser. ASE ’24, Oct. 2024, p. 582–593
work page 2024
-
[4]
S. Hu, Z. Fang, Y . Deng, X. Chen, Y . Fang, and S. Kwong, “Toward full-scene domain generalization in multi-agent collaborative bird’s eye view segmentation for connected and autonomous driving,” IEEE Trans. Intell. Transp. Syst. , vol. 26, no. 2, pp. 1783–1796, Feb. 2025
work page 2025
-
[5]
Fedhome: Cloud-edge based personalized federated learning for in-home health monitoring,
Q. Wu, X. Chen, Z. Zhou, and J. Zhang, “Fedhome: Cloud-edge based personalized federated learning for in-home health monitoring,” IEEE Trans. Mobile Comput. , vol. 21, no. 8, pp. 2818–2832, Aug. 2022
work page 2022
-
[6]
Federated learning for smart healthcare: A survey,
D. C. Nguyen, Q.-V . Pham, P. N. Pathirana, M. Ding, A. Seneviratne, Z. Lin, O. Dobre, and W.-J. Hwang, “Federated learning for smart healthcare: A survey,” ACM Comput. Surv. , vol. 55, no. 3, pp. 1–37, Feb. 2022
work page 2022
-
[7]
Privacy risks in reinforcement learning for household robots,
M. Li, W. Ding, and D. Zhao, “Privacy risks in reinforcement learning for household robots,” in 2024 IEEE International Conference on Robotics and Automation (ICRA) . IEEE, 2024, pp. 5148–5154
work page 2024
-
[8]
FedMeld: A model-dispersal feder- ated learning framework for space-ground integrated networks,
Q. Chen, X. Chen, and K. Huang, “FedMeld: A model-dispersal feder- ated learning framework for space-ground integrated networks,” arXiv preprint arXiv:2412.17231, 2024
-
[9]
C. Wu, J. Chen, K. He, Z. Zhao, R. Du, and C. Zhang, “EchoHand: High accuracy and presentation attack resistant hand authentication on commodity mobile devices,” in Proc. 2022 ACM SIGSAC Conf. Comput. Commun. Secur., ser. CCS ’22, Nov. 2022, p. 2931–2945
work page 2022
-
[10]
3GPP, “3rd generation partnership project; Technical specification group services and system aspects; Study on traffic characteristics and perfor- mance requirements for AI/ML model transfer in 5GS; (Release 18),” 3rd Generation Partnership Project (3GPP), Technical Specification (TS) 22.874, Dec. 2021, version 18.2.0
work page 2021
-
[11]
Notable site recognition using deep learning on mobile and crowd-sourced imagery,
J. Tan, A. Noulas, D. S ´aez, and R. Schifanella, “Notable site recognition using deep learning on mobile and crowd-sourced imagery,” in Proc. 2020 21st IEEE Int. Conf. Mobile Data Manage. (MDM) , Versailles, France, Aug. 2020, pp. 137–147
work page 2020
-
[12]
Sense4FL: Vehicular crowdsensing enhanced federated learning for autonomous driving,
Y . Ma, S. Hu, Z. Fang, Y . Ji, Y . Deng, and Y . Fang, “Sense4FL: Vehicular crowdsensing enhanced federated learning for autonomous driving,” arXiv preprint arXiv:2503.17697 , 2025
-
[13]
Space-ground fluid AI for 6G edge intelligence,
Q. Chen, Z. Wang, X. Chen, J. Wen, D. Zhou, S. Ji, M. Sheng, and K. Huang, “Space-ground fluid AI for 6G edge intelligence,” arXiv preprint arXiv:2411.15845, 2024
-
[14]
A joint learning and communications framework for federated learning over wireless networks,
M. Chen, Z. Yang, W. Saad, C. Yin, H. V . Poor, and S. Cui, “A joint learning and communications framework for federated learning over wireless networks,” IEEE Trans. Wireless Commun. , vol. 20, no. 1, pp. 269–283, Jan. 2021
work page 2021
-
[15]
Byzantine-robust federated learning via cosine similarity aggregation,
T. Zhu, Z. Guo, C. Yao, J. Tan, S. Dou, W. Wang, and Z. Han, “Byzantine-robust federated learning via cosine similarity aggregation,” Comput. Netw., vol. 254, p. 110730, Dec. 2024
work page 2024
-
[16]
Ultra- low-latency edge inference for distributed sensing,
Z. Wang, A. E. Kalør, Y . Zhou, P. Popovski, and K. Huang, “Ultra- low-latency edge inference for distributed sensing,” arXiv preprint arXiv:2407.13360, 2024
-
[17]
Z. Fang, S. Hu, J. Wang, Y . Deng, X. Chen, and Y . Fang, “Priori- tized information bottleneck theoretic framework with distributed online learning for edge video analytics,” IEEE/ACM Trans. Netw., pp. 1–17, early access 2025
work page 2025
-
[18]
Gemel: Model merging for memory-efficient, real-time video analytics at the edge,
A. Padmanabhan, N. Agarwal, A. Iyer, G. Ananthanarayanan, Y . Shu, N. Karianakis, G. H. Xu, and R. Netravali, “Gemel: Model merging for memory-efficient, real-time video analytics at the edge,” in Proc. USENIX Symp. Netw. Syst. Des. Implement. (NSDI) , Boston, MA, USA, Apr. 2023, pp. 973–994
work page 2023
-
[19]
Pushing large language models to the 6G edge: Vision, challenges, and opportunities,
Z. Lin, G. Qu, Q. Chen, X. Chen, Z. Chen, and K. Huang, “Pushing large language models to the 6G edge: Vision, challenges, and opportunities,” arXiv preprint arXiv:2309.16739 , 2023
-
[20]
Transfer learning & fine-tuning,
TenserFlow, “Transfer learning & fine-tuning,” 2023. [Online]. Available: https://www.tensorflow.org/guide/keras/transfer learning# introduction
work page 2023
-
[21]
A comprehensive survey on transfer learning,
F. Zhuang, Z. Qi, K. Duan, D. Xi, Y . Zhu, H. Zhu, H. Xiong, and Q. He, “A comprehensive survey on transfer learning,” Proc. IEEE, vol. 109, no. 1, pp. 43–76, Jan. 2020
work page 2020
-
[22]
Spottune: Transfer learning through adaptive fine-tuning,
Y . Guo, H. Shi, A. Kumar, K. Grauman, T. Rosing, and R. Feris, “Spottune: Transfer learning through adaptive fine-tuning,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR) , Long Beach, CA, USA, Jun. 2019
work page 2019
-
[23]
PACP: Priority-aware collaborative perception for connected and autonomous vehicles,
Z. Fang, S. Hu, H. An, Y . Zhang, J. Wang, H. Cao, X. Chen, and Y . Fang, “PACP: Priority-aware collaborative perception for connected and autonomous vehicles,” IEEE Trans. Mobile Comput. , pp. 15 003– 15 018, Dec. 2024
work page 2024
-
[24]
LoRA: Low-rank adaptation of large language models,
E. J. Hu, yelong shen, P. Wallis, Z. Allen-Zhu, Y . Li, S. Wang, L. Wang, and W. Chen, “LoRA: Low-rank adaptation of large language models,” in Proc. Int. Conf. Learn. Represent. (ICLR) , Apr. 2022
work page 2022
-
[25]
Deep residual learning for image recognition,
K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Las Vegas, NV , USA, Jun. 2016, pp. 770–778
work page 2016
-
[26]
Learning multiple layers of features from tiny images,
A. Krizhevsky, G. Hinton et al. , “Learning multiple layers of features from tiny images,” Apr. 2009
work page 2009
-
[27]
The E2E dataset: New challenges for end-to-end generation,
J. Novikova, O. Du ˇsek, and V . Rieser, “The E2E dataset: New challenges for end-to-end generation,” in Proc. Annu. SIGdial Meeting Disc. Dialogue, Saarbr ¨ucken, Germany, Aug. 2017, pp. 201–206
work page 2017
-
[28]
Cre- ating training corpora for NLG micro-planners,
C. Gardent, A. Shimorina, S. Narayan, and L. Perez-Beltrachini, “Cre- ating training corpora for NLG micro-planners,” in Proc. Annu. Meet. Assoc. Comput. Linguist. (ACL), Vancouver, Canada, Jul. 2017, pp. 179– 188
work page 2017
-
[29]
Dart: Open-domain structured data record to text generation,
L. Nan, D. Radev, R. Zhang, A. Rau, A. Sivaprasad, C. Hsieh, X. Tang, A. Vyas, N. Verma, P. Krishna, Y . Liu, N. Irwanto, J. Pan, F. Rahman, A. Zaidi, M. Mutuma, Y . Tarabar, A. Gupta, T. Yu, Y . C. Tan, X. V . Lin, C. Xiong, R. Socher, and N. F. Rajani, “Dart: Open-domain structured data record to text generation,” arXiv preprint arXiv:2007.02871, 2021
-
[30]
Femtocaching: Wireless content delivery through distributed caching helpers,
K. Shanmugam, N. Golrezaei, A. G. Dimakis, A. F. Molisch, and G. Caire, “Femtocaching: Wireless content delivery through distributed caching helpers,” IEEE Trans. Inf. Theory , vol. 59, no. 12, pp. 8402– 8413, Dec. 2013
work page 2013
-
[31]
On the complexity of optimal content placement in hierarchical caching networks,
K. Poularakis and L. Tassiulas, “On the complexity of optimal content placement in hierarchical caching networks,” IEEE Trans. Commun. , vol. 64, no. 5, pp. 2092–2103, Mar. 2016
work page 2092
-
[32]
On the complexity of optimal re- quest routing and content caching in heterogeneous cache networks,
M. Dehghan, B. Jiang, A. Seetharam, T. He, T. Salonidis, J. Kurose, D. Towsley, and R. Sitaraman, “On the complexity of optimal re- quest routing and content caching in heterogeneous cache networks,” IEEE/ACM Trans. Netw., vol. 25, no. 3, pp. 1635–1648, Jun. 2017. 15
work page 2017
-
[33]
Edge-caching wireless networks: Performance analysis and optimization,
T. X. Vu, S. Chatzinotas, and B. Ottersten, “Edge-caching wireless networks: Performance analysis and optimization,” IEEE Trans. Wireless Commun., vol. 17, no. 4, pp. 2827–2839, Apr. 2018
work page 2018
-
[34]
Caching at the wireless edge: Design aspects, challenges, and future directions,
D. Liu, B. Chen, C. Yang, and A. F. Molisch, “Caching at the wireless edge: Design aspects, challenges, and future directions,” IEEE Commun. Mag., vol. 54, no. 9, pp. 22–28, Sep. 2016
work page 2016
-
[35]
On energy-efficient edge caching in heterogeneous networks,
F. Gabry, V . Bioglio, and I. Land, “On energy-efficient edge caching in heterogeneous networks,” IEEE J. Sel. Areas Commun. , vol. 34, no. 12, pp. 3288–3298, Dec. 2016
work page 2016
-
[36]
Delay-minimized edge caching in heterogeneous vehicular networks: A matching-based approach,
H. Wu, J. Chen, W. Xu, N. Cheng, W. Shi, L. Wang, and X. Shen, “Delay-minimized edge caching in heterogeneous vehicular networks: A matching-based approach,” IEEE Trans. Wireless Commun. , vol. 19, no. 10, pp. 6409–6424, Oct. 2020
work page 2020
-
[37]
Latency minimization for content delivery networks with wireless edge caching,
T. X. Vu, L. Lei, S. Vuppala, A. Kalantari, S. Chatzinotas, and B. Otter- sten, “Latency minimization for content delivery networks with wireless edge caching,” in Proc. IEEE Int. Conf.Commun. (ICC) , Kansas City, MO, USA, May 2018, pp. 1–6
work page 2018
-
[38]
PartialLoading: User scheduling and bandwidth allocation for parameter-sharing edge inference,
G. Qu, Q. Chen, X. Chen, K. Huang, and Y . Fang, “PartialLoading: User scheduling and bandwidth allocation for parameter-sharing edge inference,” arXiv preprint arXiv:2503.22982 , 2025
-
[39]
Edge intel- ligence: Architectures, challenges, and applications,
D. Xu, T. Li, Y . Li, X. Su, S. Tarkoma, and P. Hui, “Edge intel- ligence: Architectures, challenges, and applications,” arXiv preprint arXiv:2003.12172, 2020
-
[40]
M. Xu, D. Niyato, H. Zhang, J. Kang, Z. Xiong, S. Mao, and Z. Han, “Cached model-as-a-resource: Provisioning large language model agents for edge intelligence in space-air-ground integrated networks,” arXiv preprint arXiv:2403.05826, 2024
-
[41]
Pipelining split learning in multi-hop edge networks,
W. Wei, Z. Lin, T. Li, X. Li, and X. Chen, “Pipelining split learning in multi-hop edge networks,” arXiv preprint arXiv:2505.04368 , 2025
-
[42]
Split learning in 6G edge networks,
Z. Lin, G. Qu, X. Chen, and K. Huang, “Split learning in 6G edge networks,” IEEE Wireless Commun., vol. 31, no. 4, pp. 170–176, Aug. 2024
work page 2024
-
[43]
QoS-aware placement of deep learning services on the edge with multiple service implemen- tations,
N. Hudson, H. Khamfroush, and D. E. Lucani, “QoS-aware placement of deep learning services on the edge with multiple service implemen- tations,” in Proc. Int. Conf. on Comput. Commun. and Netw. (ICCCN) , Athens, Greece, Jul. 2021, pp. 1–8
work page 2021
-
[44]
In-situ model downloading to realize versatile edge AI in 6G mobile networks,
K. Huang, H. Wu, Z. Liu, and X. Qi, “In-situ model downloading to realize versatile edge AI in 6G mobile networks,” IEEE Commun. Mag., vol. 30, no. 3, pp. 96–102, 2023
work page 2023
-
[45]
Mobile edge intelligence for large language models: A contemporary survey,
G. Qu, Q. Chen, W. Wei, Z. Lin, X. Chen, and K. Huang, “Mobile edge intelligence for large language models: A contemporary survey,” IEEE Commun. Surveys Tuts., pp. 1–42, early access 2025
work page 2025
-
[46]
Efficient multiuser AI downloading via reusable knowledge broadcasting,
H. Wu, Q. Zeng, and K. Huang, “Efficient multiuser AI downloading via reusable knowledge broadcasting,” IEEE Trans. Wireless Commun., vol. 23, no. 8, pp. 10 459–10 472, Aug. 2024
work page 2024
-
[47]
Collaborative edge caching in context-aware device-to-device networks,
X. Zhao, P. Yuan, H. li, and S. Tang, “Collaborative edge caching in context-aware device-to-device networks,” IEEE Trans. Veh. Technol. , vol. 67, no. 10, pp. 9583–9596, Oct. 2018
work page 2018
-
[48]
Soft cache hits: Improving performance through recommendation and delivery of related content,
P. Sermpezis, T. Giannakas, T. Spyropoulos, and L. Vigneri, “Soft cache hits: Improving performance through recommendation and delivery of related content,” IEEE J. Sel. Areas Commun. , vol. 36, no. 6, pp. 1300– 1313, Jun. 2018
work page 2018
-
[49]
Multimedia caching strategies for hetero- geneous application and server environments,
A. Dan and D. Sitaram, “Multimedia caching strategies for hetero- geneous application and server environments,” Multimed. Tools Appl. , vol. 4, pp. 279–312, May 1997
work page 1997
-
[50]
Jointly optimizing content caching and recommendations in small cell net- works,
L. E. Chatzieleftheriou, M. Karaliopoulos, and I. Koutsopoulos, “Jointly optimizing content caching and recommendations in small cell net- works,” IEEE Trans. Mobile Comput. , vol. 18, no. 1, pp. 125–138, Jan. 2019
work page 2019
-
[51]
A survey of caching mechanisms in information-centric networking,
M. Zhang, H. Luo, and H. Zhang, “A survey of caching mechanisms in information-centric networking,” IEEE Commun. Surveys Tuts., vol. 17, no. 3, pp. 1473–1499, 3rd Quart. 2015
work page 2015
-
[52]
Fujishige, Submodular functions and optimization
S. Fujishige, Submodular functions and optimization . New York, NY , USA: Elsevier, 2005
work page 2005
-
[53]
Lov ´asz, Mathematical Programming The State of the Art: Bonn
L. Lov ´asz, Mathematical Programming The State of the Art: Bonn
-
[54]
Submodular functions and convexity, pp
Berlin, Heidelberg: Springer, 1983, ch. Submodular functions and convexity, pp. 235–257
work page 1983
-
[55]
Autotune: Automatically tuning convolutional neural networks for improved transfer learning,
S. S. Basha, S. K. Vinakota, V . Pulabaigari, S. Mukherjee, and S. R. Dubey, “Autotune: Automatically tuning convolutional neural networks for improved transfer learning,” Neural Netw., vol. 133, pp. 112–122, Jan. 2021
work page 2021
-
[56]
A comprehensive survey on transfer learning,
F. Zhuang, Z. Qi, K. Duan, D. Xi, Y . Zhu, H. Zhu, H. Xiong, and Q. He, “A comprehensive survey on transfer learning,” Proc. IEEE, vol. 109, no. 1, pp. 43–76, Jan. 2021
work page 2021
-
[57]
Models and pre-trained weights
PyTorch, “Models and pre-trained weights.” [Online]. Available: https://docs.pytorch.org/vision/main/models.html
-
[58]
Introducing Apple’s on-device and server foundation models,
Apple, “Introducing Apple’s on-device and server foundation models,”
-
[59]
Available: https://machinelearning.apple.com/research/ introducing-apple-foundation-models
[Online]. Available: https://machinelearning.apple.com/research/ introducing-apple-foundation-models
-
[60]
Serving customized AI models at scale with LoRA,
IBM, “Serving customized AI models at scale with LoRA,” 2024. [Online]. Available: https://research.ibm.com/blog/LoRAs-explained
work page 2024
-
[61]
Submodular optimization with submodular cover and submodular knapsack constraints,
R. K. Iyer and J. A. Bilmes, “Submodular optimization with submodular cover and submodular knapsack constraints,” in Proc. Adv. Neural Inform. Process. Syst. (NeurIPS) , Stateline, NV , USA, Dec. 2013, pp. 1–9
work page 2013
-
[62]
Fast semidifferential-based submod- ular function optimization,
R. Iyer, S. Jegelka, and J. Bilmes, “Fast semidifferential-based submod- ular function optimization,” in Proc. Int. Conf. Mach. Learn. (ICML) , Atlanta, USA, Jun. 2013, pp. 855–863
work page 2013
-
[63]
3GPP, “3rd generation partnership project; Technical specification group radio access network; NR; Base station (BS) radio transmission and reception; (Release 18),” 3rd Generation Partnership Project (3GPP), Technical Specification (TS) 38.104, Dec. 2024, version 18.8.0
work page 2024
-
[64]
GSMA, “Mobile backhaul: An overview,” 2019. [Online]. Available: https://www.gsma.com/futurenetworks/wiki/ mobile-backhaul-an-overview/
work page 2019
-
[65]
Ultra- dense 5G small cell deployment for fiber and wireless backhaul-aware infrastructures,
A. L. Rezaabad, H. Beyranvand, J. A. Salehi, and M. Maier, “Ultra- dense 5G small cell deployment for fiber and wireless backhaul-aware infrastructures,” IEEE Trans. Veh. Technol., vol. 67, no. 12, pp. 12 231– 12 243, Dec. 2018
work page 2018
-
[66]
Spectral efficiency analysis of cell-free massive MIMO systems with zero-forcing detector,
P. Liu, K. Luo, D. Chen, and T. Jiang, “Spectral efficiency analysis of cell-free massive MIMO systems with zero-forcing detector,” IEEE Trans. Wireless Commun., vol. 19, no. 2, pp. 795–807, Feb. 2020
work page 2020
-
[67]
Relative frequency as a determinant of phonetic change,
G. K. Zipf, “Relative frequency as a determinant of phonetic change,” Harvard Studies in Classical Philology , vol. 40, pp. 1–95, 1929
work page 1929
-
[68]
Social and spatial proactive caching for mobile data offloading,
E. Bas ¸tu˘g, M. Bennis, and M. Debbah, “Social and spatial proactive caching for mobile data offloading,” in Proc. IEEE Int. Conf.Commun. Workshops (ICC Wkshps), Sydney, NSW, Australia, Jun. 2014, pp. 581– 586
work page 2014
-
[69]
Learning-aided content placement in caching-enabled fog computing systems using thompson sampling,
J. Zhu, X. Huang, and Z. Shao, “Learning-aided content placement in caching-enabled fog computing systems using thompson sampling,” in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP) , Barcelona, Spain, May 2020, pp. 5060–5064
work page 2020
-
[70]
A survey of ICN content naming and in-network caching in 5G and beyond networks,
O. Serhane, K. Yahyaoui, B. Nour, and H. Moungla, “A survey of ICN content naming and in-network caching in 5G and beyond networks,” IEEE Internet Things J. , vol. 8, no. 6, pp. 4081–4104, Mar. 2021
work page 2021
-
[71]
On caching and routing in information-centric net- works,
A. Seetharam, “On caching and routing in information-centric net- works,” IEEE Commun. Mag. , vol. 56, no. 3, pp. 204–209, Mar. 2018
work page 2018
-
[72]
H. Kellerer, R. Sarto Basso, and V . A. Strusevich, “Approximability issues for unconstrained and constrained maximization of half-product related functions,” Theor. Comput. Sci., vol. 659, pp. 64–71, Jan. 2017
work page 2017
-
[73]
Approximation algorithms for the multiple knapsack problem with assignment restrictions,
M. Dawande, J. Kalagnanam, P. Keskinocak, F. S. Salman, and R. Ravi, “Approximation algorithms for the multiple knapsack problem with assignment restrictions,” J. Comb. Optim. , vol. 4, pp. 171–186, 2000
work page 2000
-
[74]
A polynomial time approximation scheme for the multiple knapsack problem,
C. Chekuri and S. Khanna, “A polynomial time approximation scheme for the multiple knapsack problem,” SIAM J. Comput. , vol. 35, no. 3, pp. 713–728, 2005. 1 APPENDIX A PROOF OF PROPOSITION 1 We begin by introducing a few statements. For any fea- sible X, let η (X) = {xm,i | xm,i = 1, xm,i ∈ X} denote the set of model caching decisions with xm,i = 1 , IX =...
work page 2005
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.