QuantEvolver applies reinforcement fine-tuning to evolve an LLM policy for generating executable alpha factor expressions, yielding higher-quality and more complementary factors than prompt-based baselines on market benchmarks.
Mixed citations
Adaptive root cause localization for microservice systems with multi-agent recursion-of-thought
Mixed citation behavior. Most common role is background (40%).
years: 2026 (8)
verdicts: UNVERDICTED (8)
representative citing papers
SREGym is a modular, open-source live benchmark with 90 high-fidelity SRE failure scenarios built on real cloud stacks for evaluating AI agents on diagnosis and mitigation tasks.
Introduces the first benchmark for fine-grained failures in reinforcement fine-tuning of LLMs and an automatic management framework that detects, diagnoses, and remediates them.
E2E-REME outperforms nine LLMs in accuracy and efficiency for end-to-end microservice remediation by using experience-simulation reinforcement fine-tuning on a new benchmark called MicroRemed.
TopoEvo is a topology-aware self-evolving multi-agent framework for root cause analysis in microservices that uses multimodal alignment, vector-quantized symptom tokens, and a hypothesis-evidence-test workflow to separate root causes from cascading symptoms.
RCLAgent uses multi-agent recursion-of-thought with parallel reasoning on trace graphs to outperform prior methods in root cause localization accuracy and efficiency for microservice systems.
Thesis proposes BARO for metrics, EventADL for events, and TORAI for multimodal RCA without call graphs, plus the RCAEval benchmark with a systematic evaluation of causal methods.
Systematic review of 145 papers on LLM-based log analysis, providing a unified taxonomy, common design patterns, evaluation practices, and challenges for deployment under drift and limited labels.
citing papers explorer
- SREGym: A Live Benchmark for AI SRE Agents with High-Fidelity Failure Scenarios
  SREGym is a modular, open-source live benchmark with 90 high-fidelity SRE failure scenarios built on real cloud stacks for evaluating AI agents on diagnosis and mitigation tasks.
- Towards Robust LLM Post-Training: Automatic Failure Management for Reinforcement Fine-Tuning
  Introduces the first benchmark for fine-grained failures in reinforcement fine-tuning of LLMs and an automatic management framework that detects, diagnoses, and remediates them.
- E2E-REME: Towards End-to-End Microservices Auto-Remediation via Experience-Simulation Reinforcement Fine-Tuning
  E2E-REME outperforms nine LLMs in accuracy and efficiency for end-to-end microservice remediation by using experience-simulation reinforcement fine-tuning on a new benchmark called MicroRemed.