A3M integrates adaptive DRL, adversarial opponent modeling, and multi-objective rewards to cut regret by 30-40% versus baselines while remaining robust to strategy shifts in repeated auctions.
arXiv preprint arXiv:2512.14321
3 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (3)
Verdicts: UNVERDICTED (3)
Representative citing papers:
D2MDT uses department-aware multi-agent consultation with residual deliberation to improve EHR-based mortality prediction and efficiency.
EVLA combines a Unified Co-State Encoder and Electro-aware Structured Reasoning Chain with physics-guided training to produce energy-optimal driving decisions, reporting +5.6% accuracy gains over fine-tuned VLM baselines on a driving QA benchmark.
citing papers explorer
- A3M: Adaptive, Adversarial and Multi-Objective Learning for Strategic Bidding in Repeated Auctions
- D2MDT: Department-aware Multidisciplinary Team Consultation with Deliberation for Efficient Clinical Prediction
- EVLA: An Electro-Aware Multimodal Assistant for Physically-Grounded Driving Reasoning and Control