arXiv preprint arXiv:2108.11887
5 Pith papers cite this work. Polarity classification is still indexing.
years: 2026 (5)
verdicts: UNVERDICTED (5)
roles: background (1)
polarities: background (1)
citing papers explorer
- FedQHD: Closed-Form Function-Space Federated Reinforcement Learning
  FedQHD achieves closed-form federated Q-learning via hyperdimensional encoders with linear readouts, formalizes the federation gap under heterogeneous encoders, and reports competitive performance on continuous-state benchmarks with reduced computation.
- Scalar Federated Learning for Linear Quadratic Regulator
  A scalar-projection federated zeroth-order method for model-free LQR policy learning that reduces per-agent communication from O(d) to O(1) with convergence rate improving in the number of agents.
- Experience Constrained Hierarchical Federated Reinforcement Learning for Large-scale UAV Teams in Hazardous Environments
  In experience-constrained federated RL for UAVs, learning performance depends primarily on experience reuse and minibatch size rather than the number of participating learners.
- Insider Attacks in Multi-Agent LLM Consensus Systems
  A malicious agent in multi-agent LLM consensus systems can be trained via a surrogate world model and RL to reduce consensus rates and prolong disagreement more effectively than direct prompt attacks.
- Reinforcement Learning for Scalable and Trustworthy Intelligent Systems
  Reinforcement learning is advanced for communication-efficient federated optimization and for preference-aligned, contextually safe policies in large language models.