A scalar-projection federated zeroth-order method for model-free LQR policy learning that reduces per-agent communication from O(d) to O(1) with convergence rate improving in the number of agents.
arXiv preprint arXiv:2108.11887 (2021).
4 Pith papers cite this work, all from 2026. Polarity classification is still indexing, so the four representative citing papers listed below are currently unverdicted.
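The communication trick named in the summary above can be illustrated with a toy sketch: if server and agents draw the zeroth-order perturbation direction from a shared seed, each agent only needs to upload scalar cost evaluations rather than a d-dimensional gradient. Everything below (the 2-state system, step sizes, and function names such as `zo_scalar_round`) is an illustrative assumption, not the paper's actual algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy system (not from the paper): 2-state, 1-input LQR.
A = np.array([[0.9, 0.2], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.eye(1)

def cost(K, x0, T=50):
    """Finite-horizon LQR cost under feedback u = -K x. The model is used
    only as a black-box simulator, mimicking model-free rollouts."""
    x, c = x0.copy(), 0.0
    for _ in range(T):
        u = -K @ x
        c += float(x @ Q @ x + u @ R @ u)
        x = A @ x + B @ u
    return c

def zo_scalar_round(K, agents_x0, r=0.05, seed=0):
    """One federated zeroth-order round with scalar uploads: the
    perturbation U is drawn from a seed shared by server and agents,
    so each agent transmits only two scalar costs (O(1) per agent)."""
    g = np.random.default_rng(seed)
    U = g.standard_normal(K.shape)
    U /= np.linalg.norm(U)
    d = K.size
    # Each agent evaluates the two perturbed policies locally and reports
    # the scalar difference; here all agents share (A, B) for simplicity.
    scalars = [cost(K + r * U, x0) - cost(K - r * U, x0) for x0 in agents_x0]
    # The server averages the scalars and reconstructs the update direction.
    return d * np.mean(scalars) / (2 * r) * U

K = np.zeros((1, 2))                      # initial stabilizing gain
agents_x0 = [rng.standard_normal(2) for _ in range(8)]
c0 = np.mean([cost(K, x0) for x0 in agents_x0])
for t in range(300):
    K = K - 1e-3 * zo_scalar_round(K, agents_x0, seed=t)
c1 = np.mean([cost(K, x0) for x0 in agents_x0])
```

The design point is that the O(d) object (the perturbation matrix U) never crosses the network: it is re-derived on both sides from the shared round seed, so the uplink payload is independent of the policy dimension.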
Citing papers
- Scalar Federated Learning for Linear Quadratic Regulator
  A scalar-projection federated zeroth-order method for model-free LQR policy learning that reduces per-agent communication from O(d) to O(1) with convergence rate improving in the number of agents.
- Experience Constrained Hierarchical Federated Reinforcement Learning for Large-scale UAV Teams in Hazardous Environments
  In experience-constrained federated RL for UAVs, learning performance depends primarily on experience reuse and minibatch size rather than the number of participating learners.
- Insider Attacks in Multi-Agent LLM Consensus Systems
  A malicious agent in multi-agent LLM consensus systems can be trained via a surrogate world model and RL to reduce consensus rates and prolong disagreement more effectively than direct prompt attacks.
- Reinforcement Learning for Scalable and Trustworthy Intelligent Systems
  Reinforcement learning is advanced for communication-efficient federated optimization and for preference-aligned, contextually safe policies in large language models.