Angelix: Scalable multiline program patch synthesis via symbolic analysis
6 Pith papers cite this work. Polarity classification is still indexing.
Verdicts: 6 unverdicted
citing papers explorer
- Social Bias in LLM-Generated Code: Benchmark and Mitigation
  LLMs show up to 60.58% social bias in generated code; a new Fairness Monitor Agent cuts bias by 65.1% and raises functional correctness from 75.80% to 83.97%.
- ContractSkill: Repairable Contract-Based Skills for Multimodal Web Agents
  ContractSkill converts draft web agent skills into explicit executable contracts that enable deterministic verification, fault localization, and minimal local repair, improving stability on benchmarks like VisualWebArena.
- Ranking Plausible Patches by Historic Feature Frequencies
  PrevaRank ranks plausible patches from APR tools using similarity to historic fix features, improving correct fix placement in top ranks on Defects4J bugs.
- CodeXGLUE: A Machine Learning Benchmark Dataset for Code Understanding and Generation
  CodeXGLUE supplies a standardized collection of 10 code-related tasks, 14 datasets, an evaluation platform, and BERT-, GPT-, and encoder-decoder-style baselines.
- Shared-Context Batched Satisfiability
  Formalizes shared-context batched satisfiability and evaluates predicate-by-predicate, disjunctive over-approximation, and a new Core-Literal Filter on symbolic abstraction and active property checking tasks.
- An Empirical Study of API Misuses of Data-Centric Libraries
  API misuses in data-centric libraries share key characteristics with deep learning misuses and occur regardless of whether documentation directives are present.