GRPO, Dr. GRPO, and DAPO are three settings of one dial on the group standard deviation of binary rewards, unified by the group-standard-deviation identity where disagreement equals update magnitude.
4 Pith papers cite this work.
Years: 2026 (4)
Verdicts: UNVERDICTED (4)
Representative citing papers
Test-time sampling improves coverage but stalls at modal and correlation ceilings for answer selection, with the effective number of samples as the practical limit.
Kolmogorov n-width theory plus PRESS statistics yield closed-form optimal spline resolution; KORE estimates bias/noise scales from two pilots and matches CV performance with far fewer fits.
Cartesian 3D PDE operators factor exactly into 1D line kernels via Kronecker algebra, yielding O(N) cost and O(Nx+Ny+Nz) storage for any fixed stencil or polynomial degree.
citing papers explorer
- GRPO, Dr. GRPO, and DAPO Are Three Operations on One Number: The Group-Standard-Deviation Identity
  GRPO, Dr. GRPO, and DAPO are three settings of one dial on the group standard deviation of binary rewards, unified by the group-standard-deviation identity where disagreement equals update magnitude.
- When More Sampling Hurts: The Modal Ceiling and Correlation Ceiling of Test-Time Scaling
  Test-time sampling improves coverage but stalls at modal and correlation ceilings for answer selection, with the effective number of samples as the practical limit.
- Solve for the Hyperparameter, Skip the Search: Kolmogorov-Optimal Scaling Laws for Spline Regression
  Kolmogorov n-width theory plus PRESS statistics yield closed-form optimal spline resolution; KORE estimates bias/noise scales from two pilots and matches CV performance with far fewer fits.
- No 3D Matrices: A Unified Tensor-Product View of Matrix-Free Cartesian PDE Solvers
  Cartesian 3D PDE operators factor exactly into 1D line kernels via Kronecker algebra, yielding O(N) cost and O(Nx+Ny+Nz) storage for any fixed stencil or polynomial degree.