A scalar-projection federated zeroth-order method for model-free LQR policy learning that reduces per-agent communication from O(d) to O(1) with convergence rate improving in the number of agents.
arXiv preprint arXiv:2108.11887 (2021).
4 Pith papers cite this work, all from 2026. Polarity classification is still indexing, so the four representative citing papers listed below are currently unverdicted.
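The communication trick named in the summary above can be illustrated with a toy sketch: if server and agents draw the zeroth-order perturbation direction from a shared seed, each agent only needs to upload scalar cost evaluations rather than a d-dimensional gradient. Everything below (the 2-state system, step sizes, and function names such as `zo_scalar_round`) is an illustrative assumption, not the paper's actual algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy system (not from the paper): 2-state, 1-input LQR.
A = np.array([[0.9, 0.2], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.eye(1)

def cost(K, x0, T=50):
    """Finite-horizon LQR cost under feedback u = -K x. The model is used
    only as a black-box simulator, mimicking model-free rollouts."""
    x, c = x0.copy(), 0.0
    for _ in range(T):
        u = -K @ x
        c += float(x @ Q @ x + u @ R @ u)
        x = A @ x + B @ u
    return c

def zo_scalar_round(K, agents_x0, r=0.05, seed=0):
    """One federated zeroth-order round with scalar uploads: the
    perturbation U is drawn from a seed shared by server and agents,
    so each agent transmits only two scalar costs (O(1) per agent)."""
    g = np.random.default_rng(seed)
    U = g.standard_normal(K.shape)
    U /= np.linalg.norm(U)
    d = K.size
    # Each agent evaluates the two perturbed policies locally and reports
    # the scalar difference; here all agents share (A, B) for simplicity.
    scalars = [cost(K + r * U, x0) - cost(K - r * U, x0) for x0 in agents_x0]
    # The server averages the scalars and reconstructs the update direction.
    return d * np.mean(scalars) / (2 * r) * U

K = np.zeros((1, 2))                      # initial stabilizing gain
agents_x0 = [rng.standard_normal(2) for _ in range(8)]
c0 = np.mean([cost(K, x0) for x0 in agents_x0])
for t in range(300):
    K = K - 1e-3 * zo_scalar_round(K, agents_x0, seed=t)
c1 = np.mean([cost(K, x0) for x0 in agents_x0])
```

The design point is that the O(d) object (the perturbation matrix U) never crosses the network: it is re-derived on both sides from the shared round seed, so the uplink payload is independent of the policy dimension.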
Citing papers
- Scalar Federated Learning for Linear Quadratic Regulator
  A scalar-projection federated zeroth-order method for model-free LQR policy learning that reduces per-agent communication from O(d) to O(1) with convergence rate improving in the number of agents.
- Experience Constrained Hierarchical Federated Reinforcement Learning for Large-scale UAV Teams in Hazardous Environments
  In experience-constrained federated RL for UAVs, learning performance depends primarily on experience reuse and minibatch size rather than the number of participating learners.
- Insider Attacks in Multi-Agent LLM Consensus Systems
  A malicious agent in multi-agent LLM consensus systems can be trained via a surrogate world model and RL to reduce consensus rates and prolong disagreement more effectively than direct prompt attacks.
- Reinforcement Learning for Scalable and Trustworthy Intelligent Systems
  Reinforcement learning is advanced for communication-efficient federated optimization and for preference-aligned, contextually safe policies in large language models.