HASA: Subnet Allocation for Compute-Constrained Model-Heterogeneous Federated Learning

Ahmed M. Abdelmoniem; Amir Hossein Shahdadian; Christian Herglotz; Mahdi Taheri; Samira Nazari

arXiv: 2606.07621 · v1 · pith:EENY2EUJnew · submitted 2026-05-30 · 💻 cs.LG · cs.AI · cs.DC

Pith reviewed 2026-06-28 19:28 UTC · model grok-4.3

classification 💻 cs.LG · cs.AI · cs.DC

keywords: federated learning · model heterogeneity · subnet allocation · heterogeneity-aware · compute-constrained · client personalization · edge devices · next-word prediction

The pith

Allocating wider subnets to clients with higher data heterogeneity raises mean accuracy from 13.82% to 14.32% and improves tail performance under fixed compute in federated learning.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces HASA, which allocates differently sized subnets of a shared model to federated-learning clients according to how heterogeneous their local data is. Total compute cost is held equal across allocation policies so that comparisons are fair. Experiments on a next-word prediction task show higher average client accuracy and better results for the worst-performing clients than uniform allocation or other baselines. An ablation confirms that the direction of allocation matters: reversing it hurts performance. A second study on image classification indicates that success hinges on the score correctly identifying the clients that benefit from wider models.

Core claim

HASA is a train-only rule that assigns subnet widths based on client heterogeneity scores computed from local training data while enforcing a fixed size-weighted compute budget. On an article-title next-word prediction benchmark with seven clients, HASA improves unweighted mean client test accuracy over uniform allocation across 10 matched seeds, increasing mean client test accuracy from 13.82 percent to 14.32 percent, and improves worst-client accuracy on average. In a matched-budget comparison with representative partial-training baselines, HASA achieves the strongest worst-client and tail-client accuracy on this benchmark.
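The three summary statistics named in this claim can be computed from per-client accuracies roughly as follows. This is a minimal sketch: the function name, the dict layout, and in particular the tail definition (mean of the `tail_k` lowest clients) are assumptions, since the abstract does not define "tail-client" accuracy.

```python
from statistics import mean


def client_metrics(acc_by_client, tail_k=2):
    """Summarize per-client test accuracy.

    tail_k is an assumed tail definition (mean of the k lowest clients);
    the paper's own definition is not given in the abstract.
    """
    if not 1 <= tail_k <= len(acc_by_client):
        raise ValueError("tail_k must be between 1 and the number of clients")
    accs = sorted(acc_by_client.values())
    return {
        "mean": mean(accs),           # unweighted: every client counts equally
        "worst": accs[0],             # lowest single-client accuracy
        "tail": mean(accs[:tail_k]),  # assumed tail: mean of the k lowest
    }


# Illustrative values only; these are not numbers from the paper.
example = {f"client_{i}": 0.10 + 0.02 * i for i in range(7)}
metrics = client_metrics(example)
```

Because the mean is unweighted, a client with little data counts as much as a large one, which is why mean and worst-client accuracy can move independently.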

What carries the argument

The HASA allocation rule, which computes a heterogeneity score from each client's local data to decide its subnet width while holding total compute fixed.
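To make the idea of a budgeted score-to-width mapping concrete, here is one possible greedy sketch. It is not the paper's mapping, which the abstract does not specify. The candidate widths, the quadratic cost model, the reading of "size-weighted" as weighting each client's cost by its local dataset size, and the name `allocate_widths` are all assumptions.

```python
def allocate_widths(scores, sizes, budget,
                    widths=(0.25, 0.5, 0.75, 1.0),
                    cost=lambda w: w * w,
                    reverse=False):
    """Greedy sketch: give wider subnets to higher-scoring clients.

    scores: client -> heterogeneity score (assumed: higher = more heterogeneous)
    sizes:  client -> local dataset size, used to weight compute cost (assumed)
    budget: upper bound on sum(size * cost(width)) over all clients
    reverse=True flips the direction, as in a directionality ablation.
    """
    order = sorted(scores, key=scores.get, reverse=not reverse)
    level = {c: 0 for c in scores}  # every client starts at the narrowest width

    def total():
        return sum(sizes[c] * cost(widths[level[c]]) for c in scores)

    if total() > budget:
        raise ValueError("budget too small even for the narrowest subnets")

    for c in order:
        # Widen this client step by step while the budget still holds.
        while level[c] + 1 < len(widths):
            level[c] += 1
            if total() > budget:
                level[c] -= 1  # undo the step that broke the budget
                break
    return {c: widths[level[c]] for c in scores}


# Toy inputs, not from the paper. The budget equals width 0.5 for everyone.
scores = {"a": 0.9, "b": 0.5, "c": 0.1}
sizes = {"a": 100, "b": 100, "c": 100}
budget = sum(n * 0.5 * 0.5 for n in sizes.values())
widths_hasa = allocate_widths(scores, sizes, budget)
widths_reversed = allocate_widths(scores, sizes, budget, reverse=True)
```

This greedy version only guarantees that the budget is not exceeded. Matching the budget exactly across policies, as the paper's comparisons require, would need the authors' actual mapping.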

If this is right

Mean client test accuracy rises by half a percentage point over uniform allocation.
Worst-client and tail-client accuracies become the highest among matched-budget partial-training methods.
Reversing the allocation direction, so heterogeneous clients receive smaller subnets, reduces both mean and tail performance.
The gains hold in a cross-domain image-classification setting only when the score aligns with the need for extra width.
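As a quick check of the headline numbers, the absolute and relative gains implied by the reported means work out as follows. The relative figure is derived here and is not reported in the abstract.

```latex
\Delta_{\text{abs}} = 14.32\% - 13.82\% = 0.50 \text{ percentage points},
\qquad
\Delta_{\text{rel}} = \frac{0.50}{13.82} \approx 3.6\%.
```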

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The approach could extend to dynamic client pools if heterogeneity scores are recomputed periodically from recent local batches.
Combining the allocation rule with client-specific fine-tuning after subnet training might further close the gap to full-model personalization.
The method's reliance on local data only suggests it could apply directly to privacy-sensitive domains where central data inspection is impossible.

Load-bearing premise

The heterogeneity score must accurately reflect each client's need for additional model width.

What would settle it

An experiment on the same benchmark in which replacing the heterogeneity score with random values still yields the reported accuracy gains would show the score is not driving the improvement.
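One variant of this control keeps the set of score values fixed and only breaks the link between score and client, so that any width mapping sees the same distribution of scores. A minimal sketch, assuming scores live in a plain dict keyed by client id; the function name and data layout are illustrative, not from the paper.

```python
import random


def shuffled_scores(scores, seed):
    """Return a copy of `scores` with the values permuted across clients.

    The multiset of score values is unchanged; only the assignment of
    values to clients is randomized (deterministically, given the seed).
    """
    rng = random.Random(seed)  # local RNG so global random state is untouched
    clients = list(scores)
    values = [scores[c] for c in clients]
    rng.shuffle(values)
    return dict(zip(clients, values))


# Toy scores, not from the paper.
original = {"a": 0.9, "b": 0.5, "c": 0.1, "d": 0.3}
control = shuffled_scores(original, seed=0)
```

Running the same allocation and training pipeline on `control` over the same seeds would show whether the gains survive when the score no longer carries client-specific information.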

Figures

Figures reproduced from arXiv: 2606.07621 by Ahmed M. Abdelmoniem, Amir Hossein Shahdadian, Christian Herglotz, Mahdi Taheri, Samira Nazari.

**Figure 2.** Overview of HASA: train-only heterogeneity scoring, budgeted width mapping, and federated training with client-specific subnets.

**Figure 3.** Client-level test accuracy on the article-title benchmark over 10

Original abstract

Edge services increasingly use federated learning to personalize on-device models while keeping sensitive data local. In practice, deployments must handle heterogeneity in both client resources and local data distributions. Model-heterogeneous federated learning lowers client cost by allowing each client to train a subnet of a shared supernet, but most subnet-allocation policies are driven by device constraints and do not explicitly account for statistical heterogeneity. This paper proposes Heterogeneity-Aware Subnet Allocation (HASA), a train-only rule that assigns subnet widths based on client heterogeneity scores computed from local training data while enforcing a fixed size-weighted compute budget. This design enables budget-matched comparisons with alternative allocation policies. On an article-title next-word prediction benchmark with seven clients, HASA improves unweighted mean client test accuracy over uniform allocation across 10 matched seeds, increasing mean client test accuracy from 13.82 percent to 14.32 percent, and improves worst-client accuracy on average. In a matched-budget comparison with representative partial-training baselines, HASA achieves the strongest worst-client and tail-client accuracy on this benchmark. A directionality ablation shows that assigning smaller subnets to more heterogeneous clients degrades both mean and tail performance. A cross-domain image-classification study further shows that the effectiveness of heterogeneity-aware allocation depends on how well the heterogeneity score reflects clients' need for additional model width.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

HASA adds a data-driven subnet allocation rule using local heterogeneity scores, but the 0.5-point gain rests on a seven-client benchmark and an untested assumption about score quality.

The main thing to know is that this paper defines a concrete train-only allocation policy: subnet width is set by a heterogeneity score computed from each client's local data, subject to a fixed size-weighted compute budget. That specific rule is not in the prior work they cite, and they set up budget-matched comparisons plus a directionality ablation showing that reversing the assignment (smaller subnets to more heterogeneous clients) hurts mean and tail accuracy.

They do a clean job on the controlled comparisons and the ablation, which at least tests one direction of the mechanism. The cross-domain image-classification note is also honest in flagging that results depend on the score actually tracking clients' need for extra width.

The soft spot is the evidence base. Everything rests on seven clients and ten seeds in one next-word task, with no error bars or statistical tests reported in the abstract. The 0.5-point mean lift (13.82 to 14.32) and the worst-client and tail-client claims are therefore hard to separate from sampling noise. The load-bearing assumption, that the heterogeneity score reflects actual capacity demand, is stated explicitly but only partially probed: the ablation checks sign, not fidelity, and the small client count makes it difficult to isolate the intended effect.
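One way to ask whether a seed-matched difference exceeds sampling noise is an exact paired sign-flip test, which is cheap at ten seeds (1,024 sign patterns). The sketch below is an illustration of that kind of check, not an analysis the paper reports. The function name is hypothetical and the inputs are per-seed summary values for two policies, paired by seed.

```python
from itertools import product


def paired_sign_flip_pvalue(a, b):
    """Two-sided exact sign-flip test on paired differences a[i] - b[i].

    Under the null hypothesis that the two policies are exchangeable within
    each seed, every sign pattern of the differences is equally likely.
    """
    if len(a) != len(b) or not a:
        raise ValueError("a and b must be non-empty and of equal length")
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    observed = abs(sum(diffs)) / n
    hits = 0
    patterns = 0
    for signs in product((1, -1), repeat=n):
        patterns += 1
        flipped = abs(sum(s * d for s, d in zip(signs, diffs))) / n
        if flipped >= observed - 1e-12:  # tolerance for float rounding
            hits += 1
    return hits / patterns


# Toy paired values, not from the paper: every pair differs by exactly 1.
p_value = paired_sign_flip_pvalue([2, 3, 4, 5], [1, 2, 3, 4])
```

With only four toy pairs the smallest attainable two-sided p-value is 2/16. With ten seeds it is 2/1024, so a consistent per-seed advantage could in principle be distinguished from noise.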

This is for people already working inside model-heterogeneous federated learning who care about allocation policies. A specialist might pick up the rule and try it, but the current results are too narrow to change practice. It deserves peer review because the allocation idea is distinct and they attempted matched-budget controls, though any referee would need to see larger-scale experiments and direct validation of the score before taking the gains seriously.

Referee Report

2 major / 1 minor

Summary. The paper proposes Heterogeneity-Aware Subnet Allocation (HASA), a train-only rule that assigns subnet widths in model-heterogeneous federated learning according to client heterogeneity scores computed from local data while enforcing a fixed size-weighted compute budget. On a seven-client article-title next-word prediction benchmark it reports an increase in unweighted mean client test accuracy from 13.82% to 14.32% versus uniform allocation across ten matched seeds, together with improved worst-client and tail-client accuracy relative to representative partial-training baselines. A directionality ablation and a cross-domain image-classification study are included; the latter is presented as evidence that effectiveness depends on how well the heterogeneity score tracks clients' need for additional model width.

Significance. If the heterogeneity score reliably tracks clients' capacity demand, the budget-matched design would allow principled allocation of limited compute in statistically heterogeneous FL deployments and could improve tail performance without raising total cost. The explicit statement that effectiveness hinges on score quality and the use of matched-budget comparisons are strengths that facilitate future verification.

major comments (2)

[Abstract] The central performance claims (mean accuracy lift of 0.5 pp, strongest worst- and tail-client accuracy) rest on a single benchmark with only seven clients and ten seeds; no error bars, statistical tests, or description of how the heterogeneity score is computed or how data/seed selection was performed are supplied, which is load-bearing for attributing gains to the allocation rule rather than sampling variance.
[Abstract] The directionality ablation tests only the sign of the allocation (smaller subnets to more heterogeneous clients) but supplies no direct evidence that the heterogeneity score correlates with clients' actual need for model width (e.g., via correlation with data-complexity or optimization-difficulty metrics), even though the paper itself states that effectiveness depends on this correlation.
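The kind of direct evidence requested here could take the form of a rank correlation between each client's heterogeneity score and a measured "width benefit". A minimal sketch using only the standard library follows. The width-benefit quantity (for example, accuracy with a wide subnet minus accuracy with a narrow one) is a hypothetical measurement, not something the abstract reports.

```python
from statistics import mean


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of positions i..j, shifted to 1-based
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with >= 2 items")
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (vx * vy) ** 0.5


# Toy inputs, not from the paper: score and benefit increase together.
rho_same = spearman([0.1, 0.5, 0.9], [1.0, 2.0, 3.0])
rho_opposite = spearman([0.1, 0.5, 0.9], [3.0, 2.0, 1.0])
```

With seven clients such a coefficient would be noisy, so it would complement rather than replace the larger-scale validation the report asks for.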

minor comments (1)

[Abstract] The abstract does not provide the formula or precise definition of the heterogeneity score, which would improve reproducibility of the allocation rule.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on the abstract and the strength of supporting evidence. We will revise the abstract to incorporate additional details on the experimental setup and clarify the role of the ablations and cross-domain study. The responses below address each major comment.

Referee: [Abstract] The central performance claims (mean accuracy lift of 0.5 pp, strongest worst- and tail-client accuracy) rest on a single benchmark with only seven clients and ten seeds; no error bars, statistical tests, or description of how the heterogeneity score is computed or how data/seed selection was performed are supplied, which is load-bearing for attributing gains to the allocation rule rather than sampling variance.

Authors: We agree the abstract is concise and will expand it in revision to briefly describe the heterogeneity score computation from local data and to note that all results are averaged over 10 matched seeds with the same data partitioning. Error bars will be referenced (they appear in the main figures) and we will add a short statement on the benchmark scale as a limitation. The seven-client setting was selected to permit fine-grained per-client analysis under controlled conditions; we do not claim broad generalizability from this benchmark alone.

revision: yes

Referee: [Abstract] The directionality ablation tests only the sign of the allocation (smaller subnets to more heterogeneous clients) but supplies no direct evidence that the heterogeneity score correlates with clients' actual need for model width (e.g., via correlation with data-complexity or optimization-difficulty metrics), even though the paper itself states that effectiveness depends on this correlation.

Authors: The directionality ablation shows that reversing the allocation rule harms both mean and tail performance, which supports the chosen sign. We acknowledge it does not include explicit correlation coefficients with data-complexity metrics. The cross-domain image-classification experiment is presented precisely to illustrate that gains appear only when the heterogeneity score tracks clients' need for width; we will revise the abstract to make this linkage more explicit and to note that the score's validity is evidenced by the conditional effectiveness across domains rather than by the ablation alone.

revision: partial

Circularity Check

0 steps flagged

No significant circularity in derivation or claims

The paper defines HASA as an allocation rule that computes heterogeneity scores directly from local training data and assigns subnet widths under a fixed compute budget. Reported gains (13.82% to 14.32% mean accuracy) are empirical measurements on held-out test data across seeds, with directionality ablation and cross-domain checks. No equations, predictions, or first-principles results reduce to fitted inputs by construction; the score-to-width mapping is a deterministic function of externally computed inputs, not self-referential. No self-citation chains or uniqueness theorems are invoked to justify the central mechanism. The load-bearing assumption (score quality) is stated explicitly as an empirical precondition rather than derived internally.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Only the abstract is available; no explicit free parameters, axioms, or invented entities are stated. The heterogeneity score itself is an implicit modeling choice whose definition and validation are not supplied.

Reference graph

Works this paper leans on

29 extracted references · 9 canonical work pages · 3 internal anchors

