Large language models for software engineering: Survey and open problems

Angela Fan, Beliz Gokkaya, Mark Harman, Mitya Lyubarskiy, Shubho Sengupta, Shin Yoo · 2023 · arXiv 2310.03533

6 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

On the Effectiveness of Context Compression for Repository-Level Tasks: An Empirical Investigation

cs.SE · 2026-04-15 · unverdicted · novelty 6.0

Continuous latent-vector compression improves BLEU scores on repository-level code tasks by up to 28.3% at 4x compression while cutting inference latency.

A Taxonomy of Programming Languages for Code Generation

cs.CL · 2026-03-31 · accept · novelty 6.0

The researchers provide a systematic 4-tier classification of 646 programming languages, quantifying the extreme data scarcity facing over 70% of the world's programming languages in the age of LLMs.

LiveCodeBench: Holistic and Contamination Free Evaluation of Large Language Models for Code

cs.SE · 2024-03-12 · unverdicted · novelty 6.0

LiveCodeBench collects 400 recent contest problems to create a contamination-free benchmark evaluating LLMs on code generation and related capabilities like self-repair and execution.

StarCoder 2 and The Stack v2: The Next Generation

cs.SE · 2024-02-29 · accept · novelty 6.0

StarCoder2-15B matches or beats CodeLlama-34B on code tasks despite being smaller, and StarCoder2-3B outperforms prior 15B models, with open weights and exact training data identifiers released.

Bias in the Loop: Auditing LLM-as-a-Judge for Software Engineering

cs.SE · 2026-04-18 · unverdicted · novelty 5.0

LLM judges for code tasks show high sensitivity to prompt biases that systematically favor certain options, changing accuracy and model rankings even when code is unchanged.

From Theory to Practice: Code Generation Using LLMs for CAPEC and CWE Frameworks

cs.CR · 2026-04-02 · unverdicted · novelty 5.0

LLMs generated 615 vulnerable code snippets aligned with CAPEC and CWE frameworks across three languages, with 0.98 cosine similarity between model outputs.

citing papers explorer

Showing 6 of 6 citing papers.

On the Effectiveness of Context Compression for Repository-Level Tasks: An Empirical Investigation
cs.SE · 2026-04-15 · unverdicted · none · ref 6
Continuous latent-vector compression improves BLEU scores on repository-level code tasks by up to 28.3% at 4x compression while cutting inference latency.

A Taxonomy of Programming Languages for Code Generation
cs.CL · 2026-03-31 · accept · none · ref 4
The researchers provide a systematic 4-tier classification of 646 programming languages, quantifying the extreme data scarcity facing over 70% of the world's programming languages in the age of LLMs.

LiveCodeBench: Holistic and Contamination Free Evaluation of Large Language Models for Code
cs.SE · 2024-03-12 · unverdicted · none · ref 46
LiveCodeBench collects 400 recent contest problems to create a contamination-free benchmark evaluating LLMs on code generation and related capabilities like self-repair and execution.

StarCoder 2 and The Stack v2: The Next Generation
cs.SE · 2024-02-29 · accept · none · ref 198
StarCoder2-15B matches or beats CodeLlama-34B on code tasks despite being smaller, and StarCoder2-3B outperforms prior 15B models, with open weights and exact training data identifiers released.

Bias in the Loop: Auditing LLM-as-a-Judge for Software Engineering
cs.SE · 2026-04-18 · unverdicted · none · ref 7
LLM judges for code tasks show high sensitivity to prompt biases that systematically favor certain options, changing accuracy and model rankings even when code is unchanged.

From Theory to Practice: Code Generation Using LLMs for CAPEC and CWE Frameworks
cs.CR · 2026-04-02 · unverdicted · none · ref 18
LLMs generated 615 vulnerable code snippets aligned with CAPEC and CWE frameworks across three languages, with 0.98 cosine similarity between model outputs.
