Efficient exploration for LLMs. arXiv preprint arXiv:2402.00396

Vikranth Dwaracherla, Seyed Mohammad Asghari, Botao Hao, Benjamin Van Roy · arXiv 2402.00396

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Optimality of Sub-network Laplace Approximations: New Results and Methods

stat.ML · 2026-05-09 · conditional · novelty 7.0

Sub-network Laplace approximations always underestimate full-model predictive variance, and two new gradient-based and greedy selection rules provide theoretically grounded improvements.

Epistemic Uncertainty for Test-Time Discovery

cs.LG · 2026-05-11 · unverdicted · novelty 6.0

UG-TTT adds epistemic uncertainty measured by adapter disagreement as an exploration bonus in RL for LLMs, raising maximum reward and diversity on scientific discovery benchmarks.

citing papers explorer

Showing 2 of 2 citing papers.

Optimality of Sub-network Laplace Approximations: New Results and Methods

stat.ML · 2026-05-09 · conditional · none · ref 9

Sub-network Laplace approximations always underestimate full-model predictive variance, and two new gradient-based and greedy selection rules provide theoretically grounded improvements.

Epistemic Uncertainty for Test-Time Discovery

cs.LG · 2026-05-11 · unverdicted · none · ref 7

UG-TTT adds epistemic uncertainty measured by adapter disagreement as an exploration bonus in RL for LLMs, raising maximum reward and diversity on scientific discovery benchmarks.
