arxiv: 2604.12429 · v1 · submitted 2026-04-14 · 💻 cs.IT · math.IT

Recognition: unknown

On the Optimality of Hierarchical Secure Aggregation with Arbitrary Heterogeneous Data Assignment

Chenyi Sun , Ziting Zhang , Kai Wan , Xiang Zhang

Authors on Pith no claims yet

Pith reviewed 2026-05-10 15:13 UTC · model grok-4.3

classification 💻 cs.IT math.IT

keywords secure aggregationhierarchical networkinformation theoretic securityheterogeneous data assignmentcommunication loadoptimalityuser dropoutscollusion

0 comments

The pith

A scheme for secure aggregation in three-layer hierarchical networks achieves optimal communication loads while guaranteeing information-theoretic security with arbitrary heterogeneous data assignments.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper considers a network where users holding different numbers of datasets send masked partial aggregates through relays to a central server. The setting includes possible user dropouts and limited collusion between the server or relays and some users. The authors construct an explicit scheme that lets the server recover only the total sum of gradients while ensuring relays learn nothing about individual inputs. They prove that the scheme meets the lowest achievable communication loads on both the user-to-relay and relay-to-server links under the stated security requirements. A reader would care because the result shows how to perform private distributed summation efficiently over realistic multi-hop networks with unequal data holdings.

Core claim

The paper claims that, for a three-layer hierarchical network with arbitrary heterogeneous data assignment across users, unpredictable dropouts, and possible collusion of the server or any relay with a subset of users, there exists an aggregation scheme that satisfies information-theoretic server security (revealing only the sum) and relay security (revealing nothing) while attaining the minimal possible communication loads on the two layers.

What carries the argument

The hierarchical secure aggregation scheme that performs local masking of partially aggregated gradients at each user followed by aggregated forwarding at the relays.

If this is right

The server recovers exactly the sum of surviving users' gradients and no other information.
No relay obtains any information about the inputs of its associated users.
The scheme works for any predetermined heterogeneous data assignment and any set of surviving users.
Both layers operate at the lowest communication cost permitted by the security constraints.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same masking approach might be adapted to networks with more than three layers if the collusion model is extended.
If data assignments become dynamic, the scheme would need a way to update masks without extra communication.
The optimality result suggests similar tight bounds could be derived for secure aggregation over wireless or time-varying hierarchical topologies.

Load-bearing premise

The assignment of datasets to users is fixed in advance and known to all parties, and the network follows a strict three-layer structure in which relays may collude only with limited subsets of users.

What would settle it

A concrete scheme that meets the same security constraints yet uses strictly fewer bits on either the user-relay or relay-server link, or a proof that the paper's lower bounds on those loads cannot be achieved.

Figures

Figures reproduced from arXiv: 2604.12429 by Chenyi Sun, Kai Wan, Xiang Zhang, Ziting Zhang.

**Figure 1.** Figure 1: Hierarchical secure aggregation problem (U,V1, V2) = (2, 2, 4) with user (2, 4) as a straggler, relay 1 colludes with user (1, 2) and the server colludes with user (2, 1). In contrast to [19], the recent work [22] introduces the hierarchical gradient coding problem with arbitrary data assignment and considers an asymmetric architecture, where relay u serves a heterogeneous cluster of Vu users. The communi… view at source ↗

read the original abstract

This paper studies the information theoretic secure aggregation problem in a three-layer hierarchical network with arbitrary heterogeneous data assignment, where clustered users communicate with an aggregation server through an intermediate layer of relays. We consider a more general setting with arbitrary heterogeneous data assignment across users, where `arbitrary' means that the data assignment is given in advance and `heterogeneous' means that the users may hold different numbers of datasets. Each user locally computes the partially aggregated gradients as its input based on the assigned datasets and transmits masked input to its associated relay. The relays then forward the aggregated messages to the server, which aims to recover the sum of the gradients. In this process, while some users may drop out unpredictably, the server needs to correctly recover the desired aggregation from the surviving users. Moreover, the server or any relay may collude with a subset of users. We impose the following security constraints: (i) server security, requiring the server to learn only the sum of gradients without gaining any additional information about individual inputs; and (ii) relay security, ensuring that each relay learns nothing about users' inputs. Under these constraints, we propose an aggregation scheme that guarantees information theoretic security and achieves the optimal two-layer communication loads.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

This paper gives an explicit masking scheme for three-layer hierarchical secure aggregation that matches the lower bounds on communication loads for arbitrary heterogeneous data assignments.

read the letter

The main thing to know is that the authors build a concrete scheme using linear masking over finite fields that lets the server recover only the sum of gradients while each relay learns nothing about individual inputs, and the two-layer communication loads match the derived lower bounds for the worst-case set of surviving users after arbitrary dropouts. The setting includes collusion by the server or any single relay with some users, and data sizes can differ arbitrarily across users as long as the assignment is fixed and known ahead of time. This is a genuine extension beyond uniform or two-layer cases in prior work. The construction is explicit: users mask their partial aggregates locally, relays sum the masked messages, and the server decodes using the known masks. The optimality argument relies on standard linear-algebraic counting of independent equations, which appears clean and free of circularity or hidden parameters. The stress-test confirms the bounds hold without extra assumptions that would break the claim. One minor limitation is the static assignment requirement; if data sizes changed dynamically the masks would need recomputation, but the paper states this upfront and does not claim broader applicability. The worst-case dropout bounds are conservative yet correctly tight for the model. No other gaps in the argument are visible. This work is aimed at researchers in information-theoretic secure computation and hierarchical federated learning who care about communication optimality under strong privacy constraints. A reader already working on relay-assisted protocols would get direct value from the matching bounds as a benchmark. The paper shows clear, honest engagement with the model and literature through the generalization and explicit proof. It deserves serious peer review because the construction and lower-bound match are substantive enough to warrant referee time even if some details need polishing.

Referee Report

0 major / 2 minor

Summary. The manuscript studies the information theoretic secure aggregation problem in a three-layer hierarchical network with arbitrary heterogeneous data assignment across users. It proposes an aggregation scheme using local masking that guarantees information theoretic security against collusions by the server or relays with subsets of users, while handling unpredictable user dropouts. The scheme achieves the optimal two-layer communication loads by matching derived lower bounds for the worst-case surviving set of users.

Significance. If the results hold, the paper makes a significant contribution to the field of secure aggregation by extending it to hierarchical networks with heterogeneous data assignments and providing matching upper and lower bounds on communication loads. The explicit construction based on linear-algebraic arguments over finite fields is a strength, offering reproducible and verifiable optimality for the stated model. This could have implications for privacy-preserving distributed computing in federated learning scenarios.

minor comments (2)

The abstract states that the scheme 'achieves the optimal two-layer communication loads' without quoting the explicit expressions for those loads; adding them would improve immediate readability.
Notation for the heterogeneous data assignment (e.g., how the number of datasets per user is denoted) could be introduced more explicitly in the model section to aid readers unfamiliar with the prior literature.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive and accurate summary of our manuscript, which correctly identifies the key contributions: an information-theoretically secure hierarchical aggregation scheme for arbitrary heterogeneous data assignments that achieves optimal two-layer communication loads under collusions and dropouts. We appreciate the recognition of the linear-algebraic construction and its potential implications for federated learning. Since no specific major comments were provided, we have no points to address point-by-point.

Circularity Check

0 steps flagged

No significant circularity; derivation self-contained

full rationale

The manuscript derives lower bounds on two-layer communication loads via standard linear-algebraic arguments over finite fields applied to the security constraints and dropout model, then constructs an explicit masking scheme that meets those bounds for arbitrary heterogeneous assignments. No step reduces a claimed result to a fitted parameter, self-definition, or unverified self-citation; the optimality statement is the direct equality between the independently derived lower bound and the achievable load of the constructed scheme. The provided abstract and skeptic summary confirm the chain relies on external model assumptions rather than internal re-labeling of inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

Relies on standard information-theoretic security definitions and a fixed hierarchical network model with possible user dropouts; no free parameters or invented entities are evident from the abstract.

axioms (2)

standard math Standard information-theoretic definitions of security against server and relay collusion
Invoked to define server security (learn only the sum) and relay security (learn nothing).
domain assumption Three-layer hierarchical network model with arbitrary heterogeneous data assignment given in advance and possible unpredictable dropouts
Core setup assumed throughout the problem statement.

pith-pipeline@v0.9.0 · 5516 in / 1172 out tokens · 62678 ms · 2026-05-10T15:13:07.229515+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

25 extracted references · 5 canonical work pages · 1 internal anchor

[1]

Large scale distributed deep networks,

J. Dean, G. Corrado, R. Monga, K. Chen, M. Devin, M. Mao, M. Ran- zato, A. Senior, P. Tucker, K. Yanget al., “Large scale distributed deep networks,”Advances in neural information processing systems, vol. 25, 2012

2012
[2]

Mapreduce: simplified data processing on large clusters,

J. Dean and S. Ghemawat, “Mapreduce: simplified data processing on large clusters,”Communications of the ACM, vol. 51, no. 1, pp. 107–113, 2008

2008
[3]

Spark: Cluster computing with working sets,

M. Zaharia, M. Chowdhury, M. J. Franklin, S. Shenker, and I. Stoica, “Spark: Cluster computing with working sets,” in2nd USENIX workshop on hot topics in cloud computing (HotCloud 10), 2010

2010
[4]

Gradient coding: Avoiding stragglers in distributed learning,

R. Tandon, Q. Lei, A. G. Dimakis, and N. Karampatziakis, “Gradient coding: Avoiding stragglers in distributed learning,” inInternational Conference on Machine Learning. PMLR, 2017, pp. 3368–3376

2017
[5]

Speeding up distributed machine learning using codes,

K. Lee, M. Lam, R. Pedarsani, D. Papailiopoulos, and K. Ramchandran, “Speeding up distributed machine learning using codes,”IEEE Trans- actions on Information Theory, vol. 64, no. 3, pp. 1514–1529, 2017

2017
[6]

Stochastic gradient cod- ing for straggler mitigation in distributed learning,

R. Bitar, M. Wootters, and S. El Rouayheb, “Stochastic gradient cod- ing for straggler mitigation in distributed learning,”IEEE Journal on Selected Areas in Information Theory, vol. 1, no. 1, pp. 277–291, 2020

2020
[7]

Distributed linearly separable computation,

K. Wan, H. Sun, M. Ji, and G. Caire, “Distributed linearly separable computation,”IEEE Transactions on Information Theory, vol. 68, no. 2, pp. 1259–1278, 2021

2021
[8]

On the tradeoff between computation and communication costs for distributed linearly separable computation,

——, “On the tradeoff between computation and communication costs for distributed linearly separable computation,”IEEE Transactions on Communications, vol. 69, no. 11, pp. 7390–7405, 2021

2021
[9]

MoE-LLaVA: Mixture of Experts for Large Vision-Language Models

B. Lin, Z. Tang, Y . Ye, J. Cui, B. Zhu, P. Jin, J. Huang, J. Zhang, Y . Pang, M. Ninget al., “Moe-llava: Mixture of experts for large vision-language models,”arXiv preprint arXiv:2401.15947, 2024

work page internal anchor Pith review arXiv 2024
[10]

Uni-moe: Scaling unified multimodal llms with mixture of experts,

Y . Li, S. Jiang, B. Hu, L. Wang, W. Zhong, W. Luo, L. Ma, and M. Zhang, “Uni-moe: Scaling unified multimodal llms with mixture of experts,”IEEE Transactions on Pattern Analysis and Machine Intelli- gence, 2025

2025
[11]

Optimal communication- computation trade-off in heterogeneous gradient coding,

T. Jahani-Nezhad and M. A. Maddah-Ali, “Optimal communication- computation trade-off in heterogeneous gradient coding,”IEEE Journal on Selected Areas in Information Theory, vol. 2, no. 3, pp. 1002–1011, 2021

2021
[12]

Information theoretic secure aggregation with user dropouts,

Y . Zhao and H. Sun, “Information theoretic secure aggregation with user dropouts,”IEEE Transactions on Information Theory, vol. 68, no. 11, pp. 7471–7484, 2022

2022
[13]

Lightsecagg: a lightweight and versatile design for secure aggregation in federated learning,

J. So, C. He, C.-S. Yang, S. Li, Q. Yu, R. E Ali, B. Guler, and S. Avestimehr, “Lightsecagg: a lightweight and versatile design for secure aggregation in federated learning,”Proceedings of Machine Learning and Systems, vol. 4, pp. 694–720, 2022

2022
[14]

On secure aggregation with uncoded groupwise keys against user dropouts and user collusion,

Z. Zhang, J. Liu, K. Wan, H. Sun, M. Ji, and G. Caire, “On secure aggregation with uncoded groupwise keys against user dropouts and user collusion,”IEEE Transactions on Information Theory, 2025

2025
[15]

The capacity region of information theoretic secure aggregation with uncoded groupwise keys,

K. Wan, H. Sun, M. Ji, T. Mi, and G. Caire, “The capacity region of information theoretic secure aggregation with uncoded groupwise keys,” IEEE Transactions on Information Theory, vol. 70, no. 10, pp. 6932– 6949, 2024

2024
[16]

On the information theoretic secure aggregation with uncoded groupwise keys,

K. Wan, X. Yao, H. Sun, M. Ji, and G. Caire, “On the information theoretic secure aggregation with uncoded groupwise keys,”IEEE Trans- actions on Information Theory, vol. 70, no. 9, pp. 6596–6619, 2024

2024
[17]

Tree gradient coding,

A. Reisizadeh, S. Prakash, R. Pedarsani, and A. S. Avestimehr, “Tree gradient coding,” in2019 IEEE International Symposium on Information Theory (ISIT). IEEE, 2019, pp. 2808–2812

2019
[18]

Hier- archical coded gradient aggregation for learning at the edge,

S. Prakash, A. Reisizadeh, R. Pedarsani, and A. S. Avestimehr, “Hier- archical coded gradient aggregation for learning at the edge,” in2020 IEEE International Symposium on Information Theory (ISIT). IEEE, 2020, pp. 2616–2621

2020
[19]

Optimal rate region for key efficient hierarchical secure aggregation with user collusion,

X. Zhang, K. Wan, H. Sun, S. Wang, M. Ji, and G. Caire, “Optimal rate region for key efficient hierarchical secure aggregation with user collusion,” in2024 IEEE Information Theory Workshop (ITW). IEEE, 2024, pp. 573–578

2024
[20]

2025 , journal =

X. Zhang, Z. Li, K. Wan, H. Sun, M. Ji, and G. Caire, “Fundamental limits of hierarchical secure aggregation with cyclic user association,” arXiv preprint arXiv:2503.04564, 2025

work page arXiv 2025
[21]

On hierarchical secure aggregation against relay and user collusion,

M. Xu, X. Han, K. Wan, and G. Ge, “On hierarchical secure aggregation against relay and user collusion,”arXiv preprint arXiv:2511.20117, 2025

work page arXiv 2025
[22]

Hierarchical gradient coding: From optimal design to privacy at intermediate nodes,

A. Gholami, T. Jahani-Nezhad, K. Wan, and G. Caire, “Hierarchical gradient coding: From optimal design to privacy at intermediate nodes,”
[23]

Available: https://arxiv.org/abs/2502.18251

[Online]. Available: https://arxiv.org/abs/2502.18251

work page arXiv
[24]

Hierarchical secure aggregation with heterogeneous security constraints and arbitrary user collusion,

Z. Li, X. Zhang, J. Lv, J. Fan, H. Chen, and G. Caire, “Collusion-resilient hierarchical secure aggregation with heterogeneous security constraints,” arXiv preprint arXiv:2507.14768, 2025

work page arXiv 2025
[25]

Communication theory of secrecy systems,

C. E. Shannon, “Communication theory of secrecy systems,”in The Bell System Technical Journal, vol. 28, no. 4, pp. 656–715, Oct. 1949. APPENDIXA PROOF OFCONSTRAINT2AND3 Proof of Constraint 2.We aim to prove that the matrix h B(u) 2 ;S T (Uu(1)) 2 B(Uu(1)) 2 ;· · ·;S T (Uu(U−1)) 2 B(Uu(U−1)) 2 i is full rank. To this end, we first denote the key matrix in ...

1949