Decoupled weight decay regularization
5 Pith papers cite this work.
Citing papers
-
Inverse Design for Conditional Distribution Matching
Defines Conditional Distribution Matching (CDM) as finding inputs whose induced conditional distributions match a target distribution and proposes the MLGD-F inference-time algorithm using pretrained diffusion models to solve it without retraining.
-
LaTER: Efficient Test-Time Reasoning via Latent Exploration and Explicit Verification
LaTER reduces LLM token usage by 16-33% on reasoning benchmarks by exploring in latent space before switching to explicit CoT verification; the training-free variant improves AIME 2025 accuracy from 70% to 73.3%.
-
MetaColloc: Optimization-Free PDE Solving via Meta-Learned Basis Functions
MetaColloc meta-learns a universal set of neural basis functions offline so that new PDEs can be solved at test time with a single linear solve instead of per-equation neural-network optimization.
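The test-time linear solve described above can be sketched with a fixed radial-basis stand-in for the meta-learned bases. Everything here (Gaussian basis, width `eps`, the 1D Poisson test problem) is an illustrative assumption, not a detail from the paper; the point is only that, once the basis is fixed, fitting a new equation reduces to one least-squares solve.

```python
import numpy as np

# Solve u''(x) = f(x) on [0,1] with u(0) = u(1) = 0 by collocation in a
# fixed basis (a stand-in for meta-learned basis functions).
# Check against f(x) = -pi^2 sin(pi x), whose solution is u(x) = sin(pi x).

centers = np.linspace(0, 1, 40)
eps = 10.0  # illustrative RBF width

def phi(x):
    # Basis values at points x, shape (len(x), len(centers)).
    r = x[:, None] - centers[None, :]
    return np.exp(-(eps * r) ** 2)

def phi_xx(x):
    # Analytic second derivative of each Gaussian basis function.
    r = x[:, None] - centers[None, :]
    return (4 * eps**4 * r**2 - 2 * eps**2) * np.exp(-(eps * r) ** 2)

x_int = np.linspace(0, 1, 80)          # interior collocation points
x_bc = np.array([0.0, 1.0])            # boundary points
f = -np.pi**2 * np.sin(np.pi * x_int)

# Single linear least-squares solve: PDE rows stacked on boundary rows.
A = np.vstack([phi_xx(x_int), phi(x_bc)])
b = np.concatenate([f, np.zeros(2)])
c, *_ = np.linalg.lstsq(A, b, rcond=None)

x_test = np.linspace(0, 1, 50)
u = phi(x_test) @ c
err = np.max(np.abs(u - np.sin(np.pi * x_test)))
```

A new right-hand side or boundary condition only changes `b` (and possibly which rows enter `A`), so adapting to a new problem instance costs one `lstsq` call rather than a fresh network optimization.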
-
DeltaRubric: Generative Multimodal Reward Modeling via Joint Planning and Verification
DeltaRubric decomposes multimodal preference evaluation into self-generated planning and verification steps within a single model trained via multi-role reinforcement learning, yielding large accuracy gains on VL-RewardBench.
-
Signal Reshaping for GRPO in Weak-Feedback Agentic Code Repair
Reshaping outcome rewards, process signals, and rollout comparability in GRPO raises strict compile-and-semantic accuracy in agentic code repair from 0.385 to 0.535 under weak feedback.
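The reward-reshaping idea can be sketched as a GRPO-style group-relative advantage in which a weak binary outcome signal is blended with a dense process signal before per-group normalization. The names `outcome`, `process`, and the blend weight `alpha` are hypothetical placeholders, not the paper's actual signals.

```python
import numpy as np

def group_advantages(outcome, process, alpha=0.5, eps=1e-8):
    """Group-relative advantages for G rollouts of one prompt.

    outcome: weak sparse reward per rollout, e.g. 0/1 "compiles and passes".
    process: dense shaped signal per rollout (illustrative assumption).
    """
    r = (1 - alpha) * np.asarray(outcome, float) + alpha * np.asarray(process, float)
    # Normalizing within the group keeps rollouts comparable even when the
    # raw reward scale drifts across prompts -- the "rollout comparability"
    # ingredient in the summary above.
    return (r - r.mean()) / (r.std() + eps)

# Example: 4 rollouts, only one succeeds outright; process scores break ties.
adv = group_advantages(outcome=[0, 0, 1, 0], process=[0.2, 0.1, 0.9, 0.4])
```

By construction the advantages are zero-mean within the group, so the successful rollout is pushed up exactly as hard as the failed ones are pushed down.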