arXiv preprint arXiv:2009.10897

Chloe Ching-Yun Hsu, Celestine Mendler-Dünner, Moritz Hardt · 2020 · arXiv 2009.10897

3 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

KLip-PPO: A per-sample KL perspective on PPO-Clip

cs.LG · 2026-06-22 · unverdicted · novelty 7.0

The PPO-Clip gradient equals that of a per-sample KL surrogate with a closed-form coefficient depending on the importance ratio and the advantage, yielding identical training curves on five MuJoCo tasks.

Does Synthetic Data Help? Empirical Evidence from Deep Learning Time Series Forecasters

cs.LG · 2026-05-07 · accept · novelty 7.0

Synthetic data augmentation helps channel-mixing time series models but degrades channel-independent ones, with reliable gains only from seasonal-trend generators and gradual schedules in low-resource settings.

LogNEO: A GPT-Neo Reinforcement Learning Framework for Accurate Real-Time Log Anomaly Detection

cs.LG · 2026-06-06 · unverdicted · novelty 4.0

LogNEO applies PPO to GPT-Neo with a partial-credit exponentially decaying position-aware reward to reach F1 scores of 0.927/0.913/0.984 on HDFS/BGL/Thunderbird while running at production speeds.

citing papers explorer

Showing 3 of 3 citing papers.

KLip-PPO: A per-sample KL perspective on PPO-Clip · cs.LG · 2026-06-22 · unverdicted · none · ref 6
Does Synthetic Data Help? Empirical Evidence from Deep Learning Time Series Forecasters · cs.LG · 2026-05-07 · accept · none · ref 262
LogNEO: A GPT-Neo Reinforcement Learning Framework for Accurate Real-Time Log Anomaly Detection · cs.LG · 2026-06-06 · unverdicted · none · ref 23
