Title resolution pending

Alfred Santa Molison, Marcia Moraes, Glaucia Melo, Fabio Santos, Wesley K · 2025 · arXiv 4174.2025

5 Pith papers cite this work. Polarity classification is still indexing.

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

citation-role summary

background 3

citation-polarity summary

background 3

representative citing papers

SmellBench: Evaluating LLM Agents on Architectural Code Smell Repair

cs.SE · 2026-05-07 · unverdicted · novelty 7.0 · 2 refs

SmellBench is the first benchmark showing LLM agents resolve 47.7% of architectural code smells while accurately spotting false positives, but aggressive repairs often introduce new smells and degrade overall quality.

A Large-Scale Empirical Study of AI-Generated Code in Real-World Repositories

cs.SE · 2026-03-28 · unverdicted · novelty 7.0

A large-scale study of real-world repositories finds that AI-generated code differs from human-written code in complexity, structural traits, defect indicators, and commit-level activity patterns.

Measuring and Exploiting Contextual Bias in LLM-Assisted Security Code Review

cs.SE · 2026-03-19 · accept · novelty 7.0

LLM-based security code review is vulnerable to framing bias, with a novel iterative refinement attack achieving 100% success in reintroducing vulnerabilities across real projects.

ToxiShield: Promoting Inclusive Developer Communication through Real-Time Toxicity Filtering

cs.SE · 2026-04-15 · unverdicted · novelty 5.0 · 2 refs

ToxiShield delivers a real-time GitHub extension with a BERT toxicity detector at 98% accuracy, a Claude-based coach, and a fine-tuned Llama reframer at 95% style transfer accuracy, validated by a 10-person TAM study.

Quality and Security Signals in AI-Generated Python Refactoring Pull Requests

cs.SE · 2026-05-20 · unverdicted · novelty 4.0

Empirical analysis of AI refactoring PRs shows quality attribute improvements in 22.5% of cases with new Pylint issues in 24.17% and Bandit findings in 4.7%, yet 73.5% developer acceptance.

citing papers explorer

Showing 5 of 5 citing papers.

SmellBench: Evaluating LLM Agents on Architectural Code Smell Repair · cs.SE · 2026-05-07 · unverdicted · none · ref 17 · 2 links
A Large-Scale Empirical Study of AI-Generated Code in Real-World Repositories · cs.SE · 2026-03-28 · unverdicted · none · ref 26
Measuring and Exploiting Contextual Bias in LLM-Assisted Security Code Review · cs.SE · 2026-03-19 · accept · none · ref 4
ToxiShield: Promoting Inclusive Developer Communication through Real-Time Toxicity Filtering · cs.SE · 2026-04-15 · unverdicted · none · ref 2 · 2 links
Quality and Security Signals in AI-Generated Python Refactoring Pull Requests · cs.SE · 2026-05-20 · unverdicted · none · ref 22
