pith. sign in

Lightningrl: Breaking the accuracy- parallelism trade-off of block-wise dllms via reinforcement learning

3 Pith papers cite this work. Polarity classification is still indexing.

3 Pith papers citing it

citation-role summary

background 2

citation-polarity summary

fields

cs.LG 2 cs.CL 1

years

2026 3

roles

background 2

polarities

background 2

representative citing papers

DMax: Aggressive Parallel Decoding for dLLMs

cs.LG · 2026-04-09 · conditional · novelty 7.0 · 2 refs

DMax uses On-Policy Uniform Training and Soft Parallel Decoding to enable aggressive parallelism in dLLMs, raising TPF on GSM8K from 2.04 to 5.47 and on MBPP from 2.71 to 5.86 while preserving accuracy.

Multi-Block Diffusion Language Models

cs.LG · 2026-06-28 · unverdicted · novelty 6.0 · 2 refs

MBD-LMs raise average tokens per forward pass from 3.47 to 6.19 (and to 9.34 with DMax) via multi-block teacher forcing and optimized parallel decoding while holding or slightly improving accuracy on math and code tasks.

citing papers explorer

Showing 3 of 3 citing papers.

  • TAD: Temporal-Aware Trajectory Self-Distillation for Fast and Accurate Diffusion LLM cs.CL · 2026-05-10 · unverdicted · none · ref 53

    TAD improves the accuracy-parallelism trade-off in diffusion LLMs via temporal-aware self-distillation that applies hard labels to soon-to-be-decoded tokens and soft supervision to future tokens.

  • DMax: Aggressive Parallel Decoding for dLLMs cs.LG · 2026-04-09 · conditional · none · ref 33 · 2 links

    DMax uses On-Policy Uniform Training and Soft Parallel Decoding to enable aggressive parallelism in dLLMs, raising TPF on GSM8K from 2.04 to 5.47 and on MBPP from 2.71 to 5.86 while preserving accuracy.

  • Multi-Block Diffusion Language Models cs.LG · 2026-06-28 · unverdicted · none · ref 35 · 2 links

    MBD-LMs raise average tokens per forward pass from 3.47 to 6.19 (and to 9.34 with DMax) via multi-block teacher forcing and optimized parallel decoding while holding or slightly improving accuracy on math and code tasks.