arXiv preprint arXiv:2508.15096 , year=

Rabeeh Karimi Mahabadi, Sanjeev Satheesh, Shrimai Prabhumoye, Mostofa Patwary, Mohammad Shoeybi, Bryan Catanzaro · 2025 · arXiv 2508.15096

3 Pith papers cite this work. Polarity classification is still indexing.

3 Pith papers citing it

read on arXiv browse 3 citing papers

representative citing papers

OProver: A Unified Framework for Agentic Formal Theorem Proving

cs.CL · 2026-05-17 · unverdicted · novelty 6.0

OProver-32B achieves top Pass@32 scores on MiniF2F, ProverBench, and PutnamBench by combining continued pretraining with iterative agentic proving, retrieval, SFT on repairs, and RL on unresolved cases using a 6.86M-proof dataset.

Forgetting in Language Models: Capacity, Optimization, and Self-Generated Replay

cs.LG · 2026-05-25 · unverdicted · novelty 5.0

Self-generated replay from language models nearly eliminates catastrophic forgetting during finetuning except when models are pretrained close to saturation.

NVIDIA Nemotron 3: Efficient and Open Intelligence

cs.CL · 2025-12-24 · unverdicted · novelty 5.0

NVIDIA releases the Nemotron 3 model family with hybrid Mamba-Transformer architecture, LatentMoE, NVFP4 training, MTP layers, and multi-environment RL post-training for reasoning and agentic tasks.

citing papers explorer

Showing 3 of 3 citing papers.

OProver: A Unified Framework for Agentic Formal Theorem Proving cs.CL · 2026-05-17 · unverdicted · none · ref 167
OProver-32B achieves top Pass@32 scores on MiniF2F, ProverBench, and PutnamBench by combining continued pretraining with iterative agentic proving, retrieval, SFT on repairs, and RL on unresolved cases using a 6.86M-proof dataset.
Forgetting in Language Models: Capacity, Optimization, and Self-Generated Replay cs.LG · 2026-05-25 · unverdicted · none · ref 14
Self-generated replay from language models nearly eliminates catastrophic forgetting during finetuning except when models are pretrained close to saturation.
NVIDIA Nemotron 3: Efficient and Open Intelligence cs.CL · 2025-12-24 · unverdicted · none · ref 167
NVIDIA releases the Nemotron 3 model family with hybrid Mamba-Transformer architecture, LatentMoE, NVFP4 training, MTP layers, and multi-environment RL post-training for reasoning and agentic tasks.

arXiv preprint arXiv:2508.15096 , year=

fields

years

verdicts

representative citing papers

citing papers explorer