Title resolution pending

Wannita Takerngsaksiri, Jirat Pasuksmit, Patanamon Thongtanunam, Chakkrit Tantithamthavorn, Ruixiong Zhang, Fan Jiang, Jing Li, Evan Cook, Kun Chen, Ming Wu · 2024 · arXiv 2411.12924

3 Pith papers cite this work. Polarity classification is still indexing.

3 Pith papers citing it

read on arXiv browse 3 citing papers

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

citation-role summary

background 2

citation-polarity summary

background 2

representative citing papers

Guidelines for Empirical Studies in Software Engineering involving Large Language Models

cs.SE · 2025-08-21 · accept · novelty 7.0 · 2 refs

The paper delivers a taxonomy of seven LLM study types in software engineering along with eight guidelines that separate mandatory requirements from recommended practices to address reproducibility challenges.

ACE-Bench: A Lightweight Benchmark for Evaluating Azure SDK Usage Correctness

cs.DC · 2026-02-14 · unverdicted · novelty 6.0

ACE-Bench is an execution-free benchmark that scores LLM coding agents on correct Azure SDK usage via deterministic regex checks and reference-based LLM judges derived from official documentation.

Rethinking Agentic Reinforcement Learning In Large Language Models

cs.AI · 2026-04-30 · unverdicted · novelty 3.0 · 3 refs

The paper reviews conceptual foundations, methodological innovations, effective designs, critical challenges, and future directions for LLM-based Agentic Reinforcement Learning.

citing papers explorer

Showing 3 of 3 citing papers.

Guidelines for Empirical Studies in Software Engineering involving Large Language Models cs.SE · 2025-08-21 · accept · none · ref 131 · 2 links
The paper delivers a taxonomy of seven LLM study types in software engineering along with eight guidelines that separate mandatory requirements from recommended practices to address reproducibility challenges.
ACE-Bench: A Lightweight Benchmark for Evaluating Azure SDK Usage Correctness cs.DC · 2026-02-14 · unverdicted · none · ref 18
ACE-Bench is an execution-free benchmark that scores LLM coding agents on correct Azure SDK usage via deterministic regex checks and reference-based LLM judges derived from official documentation.
Rethinking Agentic Reinforcement Learning In Large Language Models cs.AI · 2026-04-30 · unverdicted · none · ref 83 · 3 links
The paper reviews conceptual foundations, methodological innovations, effective designs, critical challenges, and future directions for LLM-based Agentic Reinforcement Learning.

Title resolution pending

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer