Gerald Tesauro

15 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background: 2 · method: 1

representative citing papers

On-line Learning in Tree MDPs by Treating Policies as Bandit Arms

cs.AI · 2026-05-06 · unverdicted · novelty 7.0

Bandit algorithms can be adapted to Tree MDPs by treating policies as arms with shared-data confidence bounds, achieving polynomial memory and instance-dependent bounds on sample complexity and regret that depend on terminal-state gaps rather than all policies.

Towards AI-assisted Neutrino Flavor Theory Design

hep-ph · 2025-06-09 · unverdicted · novelty 7.0

AMBer applies reinforcement learning with physics feedback to automate construction of neutrino flavor models that minimize free parameters, validated on known cases and extended to a new symmetry group.

ARC-RL: A Reinforcement Learning Playground Inspired by ARC Raiders

cs.RO · 2026-05-19 · accept · novelty 6.0 · 2 refs

ARC-RL is a new suite of four MuJoCo continuous-control environments featuring game-inspired hexapod and quadruped morphologies, a single closed-form multi-component reward function, CPG demonstrators, and empirical comparisons of online and offline-to-online RL algorithms.

Evaluation-driven Scaling for Scientific Discovery

cs.LG · 2026-04-21 · unverdicted · novelty 6.0

SimpleTES scales test-time evaluation in LLMs to discover state-of-the-art solutions on 21 scientific problems across six domains, outperforming frontier models and optimization pipelines, with examples such as a 2x faster LASSO and new Erdős constructions.

Language Models (Mostly) Know What They Know

cs.CL · 2022-07-11 · unverdicted · novelty 6.0

Language models show good calibration when asked to estimate the probability that their own answers are correct, with performance improving as models get larger.

Scaling Laws for Transfer

cs.LG · 2021-02-02 · unverdicted · novelty 6.0

Effective data transferred from pre-training to fine-tuning is described by a power law in model parameter count and fine-tuning dataset size, acting like a multiplier on the fine-tuning data.

Scheduling Discovery in the 2020s

astro-ph.IM · 2019-07-17 · unverdicted · novelty 2.0

Advocates developing high-quality open-source scheduling software and linking observation planning with data analysis for future astronomical surveys.
