Advances in Neural Information Processing Systems
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Citing papers

- Submodular Multi-Agent Policy Learning for Online Distributed Task Allocation in Open Multi-Agent Systems
  SubMAPG uses a new Partition Multilinear Extension to derive unbiased policy gradients from submodular difference rewards, delivering a 1/2-approximation and sublinear dynamic regret for online distributed task allocation in open multi-agent systems.
- Response Time Enhances Alignment with Heterogeneous Preferences
  Response times modeled as drift-diffusion processes enable consistent estimation of population-average preferences from heterogeneous anonymous binary choices.