pith. sign in

Language models are hid- 41 den reasoners: Unlocking latent reasoning capabilities via self-rewarding

5 Pith papers cite this work. Polarity classification is still indexing.

5 Pith papers citing it

citation-role summary

background 2 other 1

citation-polarity summary

years

2026 3 2025 2

verdicts

UNVERDICTED 5

polarities

background 2 unclear 1

representative citing papers

LASAR: Latent Adaptive Semantic Aligned Reasoning for Generative Recommendation

cs.IR · 2026-05-11 · unverdicted · novelty 6.0

LASAR uses two-stage supervised training plus reinforcement learning to ground semantic IDs, align latent reasoning trajectories to CoT hidden states via KL divergence, and adaptively choose reasoning depth, halving average steps while improving quality on three datasets.

Self-Supervised Bootstrapping of Action-Predictive Embodied Reasoning

cs.RO · 2026-02-09 · unverdicted · novelty 6.0

R&B-EnCoRe uses self-supervised importance-weighted variational inference to distill action-predictive reasoning datasets that improve VLA performance on manipulation, navigation, and driving tasks without external verifiers.

citing papers explorer

Showing 5 of 5 citing papers.