A new learning paradigm: Learning using privileged information.Neural networks, 22(5-6):544–557

Vladimir Vapnik, Akshay Vashist · 2009

2 Pith papers cite this work. Polarity classification is still indexing.

2 Pith papers citing it

browse 2 citing papers

representative citing papers

From Generic Correlation to Input-Specific Credit in On-Policy Self Distillation

cs.LG · 2026-05-12 · conditional · novelty 7.0

Self-distillation token rewards measure input-response-feedback pointwise mutual information, and CREDIT extracts the input-specific component with contrastive baselines to improve LLM reasoning performance.

Anti-Self-Distillation for Reasoning RL via Pointwise Mutual Information

cs.LG · 2026-05-12 · unverdicted · novelty 6.0

Anti-Self-Distillation reverses self-distillation signals via PMI to fix overconfidence on structural tokens, matching GRPO baseline accuracy 2-10x faster with up to 11.5 point gains across 4B-30B models.

citing papers explorer

Showing 1 of 1 citing paper after filters.

Anti-Self-Distillation for Reasoning RL via Pointwise Mutual Information cs.LG · 2026-05-12 · unverdicted · none · ref 25
Anti-Self-Distillation reverses self-distillation signals via PMI to fix overconfidence on structural tokens, matching GRPO baseline accuracy 2-10x faster with up to 11.5 point gains across 4B-30B models.

A new learning paradigm: Learning using privileged information.Neural networks, 22(5-6):544–557

fields

years

verdicts

representative citing papers

citing papers explorer