3 Pith papers cite this work.
Years: 2026 (3)
Verdicts: UNVERDICTED (3)
citing papers explorer
- MARLaaS: Multi-Tenant Asynchronous Reinforcement Learning as a Service
  MARLaaS enables concurrent RL fine-tuning across up to 32 tasks using LoRA adapters and a disaggregated asynchronous architecture, matching single-task performance while improving accelerator utilization by 4.3x and cutting end-to-end time by 85%.
- ExecuTorch -- A Unified PyTorch Solution to Run AI Models On-Device
  ExecuTorch is a unified PyTorch-native deployment framework that enables seamless on-device execution of AI models across heterogeneous hardware while preserving original PyTorch semantics.
- OGLS-SD: On-Policy Self-Distillation with Outcome-Guided Logit Steering for LLM Reasoning
  OGLS-SD improves LLM reasoning by using verifiable outcome rewards to guide logit steering that calibrates teacher distributions in on-policy self-distillation, addressing reflection-induced mismatches.