A multi-dimensional behavioral scoring system using LLM judges evaluates agentic stock predictors and feeds scores into closed-loop RL to improve one-day MAPE by 11.5% on held-out data.
Reflexion: Language agents with verbal reinforcement learning
2 Pith papers cite this work. Polarity classification is still indexing.
fields
cs.LG 2years
2026 2verdicts
UNVERDICTED 2representative citing papers
CastFlow introduces a role-specialized agentic workflow with memory retrieval and multi-view toolkit for iterative ensemble time series forecasting, using two-stage SFT+RLVR training on a domain-specific LLM to outperform static baselines.
citing papers explorer
-
Multi-Dimensional Behavioral Evaluation of Agentic Stock Prediction Systems Using Large Language Model Judges with Closed-Loop Reinforcement Learning Feedback
A multi-dimensional behavioral scoring system using LLM judges evaluates agentic stock predictors and feeds scores into closed-loop RL to improve one-day MAPE by 11.5% on held-out data.
-
CastFlow: Learning Role-Specialized Agentic Workflows for Time Series Forecasting
CastFlow introduces a role-specialized agentic workflow with memory retrieval and multi-view toolkit for iterative ensemble time series forecasting, using two-stage SFT+RLVR training on a domain-specific LLM to outperform static baselines.