arXiv preprint arXiv:2009.08366

Guo, D · 2020 · arXiv 2009.08366

18 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Evaluating Tool Cloning in Agentic-AI Ecosystems

cs.SE · 2026-05-10 · unverdicted · novelty 7.0

Tool cloning is pervasive in agentic AI ecosystems, with 60% of high-Jaccard and 85% of high-ssdeep similar pairs verified as true clones in a study of over 8,800 repositories.

RepoDoc: A Knowledge Graph-Based Framework to Automatic Documentation Generation and Incremental Updates

cs.SE · 2026-04-29 · unverdicted · novelty 7.0

RepoDoc uses a repository knowledge graph with module clustering and semantic impact propagation to generate more complete documentation 3x faster with 85% fewer tokens and handle incremental updates 73% faster than prior LLM-based tools.

Structural Anchors and Reasoning Fragility: Understanding CoT Robustness in LLM4Code

cs.SE · 2026-04-14 · unverdicted · novelty 7.0

CoT prompting in LLM4Code shows mixed robustness that depends on model family, task structure, and perturbations destabilizing structural anchors, leading to trajectory deformations like lengthening, branching, and simplification.

CodeBLEU: a Method for Automatic Evaluation of Code Synthesis

cs.SE · 2020-09-22 · conditional · novelty 7.0

CodeBLEU improves correlation with human programmer scores on code synthesis tasks by adding syntactic AST matching and semantic data-flow matching to the standard BLEU n-gram approach.

NeuroFlake: A Neuro-Symbolic LLM Framework for Flaky Test Classification

cs.SE · 2026-05-12 · unverdicted · novelty 6.0

NeuroFlake integrates discriminative token mining into LLMs to classify flaky tests, raising F1-score to 69.34% on FlakeBench while showing greater robustness to semantic-preserving perturbations than prior methods.

VulStyle: A Multi-Modal Pre-Training for Code Stylometry-Augmented Vulnerability Detection

cs.CR · 2026-04-29 · unverdicted · novelty 6.0

VulStyle pre-trains on 4.9M functions using code, non-terminal ASTs, and stylometry features, then fine-tunes to achieve SOTA F1 gains of 4-48% on BigVul and VulDeePecker.

Residual Risk Analysis in Benign Code: How Far Are We? A Multi-Model Semantic and Structural Similarity Approach

cs.SE · 2026-04-22 · unverdicted · novelty 6.0

Patched functions often remain similar to vulnerable ones, and a new multi-model similarity scoring system identifies residual issues like null pointer dereferences in 61% of high-risk cases from the PrimeVul dataset.

On the Effectiveness of Context Compression for Repository-Level Tasks: An Empirical Investigation

cs.SE · 2026-04-15 · unverdicted · novelty 6.0

Continuous latent-vector compression improves BLEU scores on repository-level code tasks by up to 28.3% at 4x compression while cutting inference latency.

Verify Before You Fix: Agentic Execution Grounding for Trustworthy Cross-Language Code Analysis

cs.SE · 2026-04-12 · unverdicted · novelty 6.0

A framework combining universal AST normalization, hybrid graph-LLM embeddings, and strict execution-grounded validation achieves 89-92% intra-language accuracy and 74-80% cross-language F1 while resolving 70% of vulnerabilities at a 12% failure rate.

DiffHLS: Differential Learning for High-Level Synthesis QoR Prediction with GNNs and LLM Code Embeddings

cs.LG · 2026-04-10 · unverdicted · novelty 6.0

DiffHLS predicts HLS QoR via differential learning: separate GNN+LLM models for kernel baseline and design delta are composed to yield the final estimate, showing lower MAPE than GNN baselines on PolyBench.

Static Program Slicing Using Language Models With Dataflow-Aware Pretraining and Constrained Decoding

cs.SE · 2026-04-09 · unverdicted · novelty 6.0 · 2 refs

Sliceformer reformulates static program slicing as seq2seq using CodeT5+ with dataflow-aware pretraining via DFG permutation and span corruption plus constrained decoding, yielding up to 22% ExactMatch gains on Java and Python benchmarks.

AFGNN: API Misuse Detection using Graph Neural Networks and Clustering

cs.SE · 2026-04-09 · unverdicted · novelty 6.0

AFGNN detects API misuses in Java code more effectively than prior methods by representing usage as graphs and clustering learned embeddings from self-supervised training.

On the Role of Fault Localization Context for LLM-Based Program Repair

cs.SE · 2026-04-07 · unverdicted · novelty 6.0

More fault localization context does not consistently improve LLM-based program repair; file-level context gives 15-17x gains, optimal around 6-10 files, while line-level context often degrades performance from noise.

DCVD: Dual-Channel Cross-Modal Fusion for Joint Vulnerability Detection and Localization

cs.CR · 2026-05-10 · unverdicted · novelty 5.0

DCVD performs joint function-level vulnerability detection and statement-level localization by extracting control-dependency and semantic features in parallel branches, fusing them with contrastive alignment and bidirectional cross-attention, and applying explicit supervision at both granularities.

Learning Generalizable Multimodal Representations for Software Vulnerability Detection

cs.SE · 2026-04-28 · unverdicted · novelty 5.0

MultiVul uses multimodal contrastive learning to align code and comment representations, yielding up to 27% F1 gains on vulnerability detection benchmarks over prompting and code-only baselines.

PLMGH: What Matters in PLM-GNN Hybrids for Code Classification and Vulnerability Detection

cs.SE · 2026-04-28 · unverdicted · novelty 5.0

Controlled experiments show PLM-GNN hybrids improve code tasks over GNN-only baselines, with the PLM source having a larger impact than the GNN backbone.

Prompt-Driven Code Summarization: A Systematic Literature Review

cs.SE · 2026-04-16 · unverdicted · novelty 4.0

A systematic review that categorizes prompting strategies for LLM-based code summarization, assesses their effectiveness, and identifies gaps in research and evaluation practices.

A Survey on Large Language Models for Code Generation

cs.CL · 2024-06-01 · unverdicted · novelty 3.0

A systematic literature review that organizes recent work on LLMs for code generation into a taxonomy covering data curation, model advances, evaluations, ethics, environmental impact, and applications, with benchmark comparisons.

citing papers explorer

Evaluating Tool Cloning in Agentic-AI Ecosystems

cs.SE · 2026-05-10 · unverdicted · none · ref 15

RepoDoc: A Knowledge Graph-Based Framework to Automatic Documentation Generation and Incremental Updates

cs.SE · 2026-04-29 · unverdicted · none · ref 13

Structural Anchors and Reasoning Fragility: Understanding CoT Robustness in LLM4Code

cs.SE · 2026-04-14 · unverdicted · none · ref 35

CodeBLEU: a Method for Automatic Evaluation of Code Synthesis

cs.SE · 2020-09-22 · conditional · none · ref 83

NeuroFlake: A Neuro-Symbolic LLM Framework for Flaky Test Classification

cs.SE · 2026-05-12 · unverdicted · none · ref 14

VulStyle: A Multi-Modal Pre-Training for Code Stylometry-Augmented Vulnerability Detection

cs.CR · 2026-04-29 · unverdicted · none · ref 21

Residual Risk Analysis in Benign Code: How Far Are We? A Multi-Model Semantic and Structural Similarity Approach

cs.SE · 2026-04-22 · unverdicted · none · ref 17

On the Effectiveness of Context Compression for Repository-Level Tasks: An Empirical Investigation

cs.SE · 2026-04-15 · unverdicted · none · ref 13

Verify Before You Fix: Agentic Execution Grounding for Trustworthy Cross-Language Code Analysis

cs.SE · 2026-04-12 · unverdicted · none · ref 22

DiffHLS: Differential Learning for High-Level Synthesis QoR Prediction with GNNs and LLM Code Embeddings

cs.LG · 2026-04-10 · unverdicted · none · ref 23

Static Program Slicing Using Language Models With Dataflow-Aware Pretraining and Constrained Decoding

cs.SE · 2026-04-09 · unverdicted · none · ref 2 · 2 links

AFGNN: API Misuse Detection using Graph Neural Networks and Clustering

cs.SE · 2026-04-09 · unverdicted · none · ref 26

On the Role of Fault Localization Context for LLM-Based Program Repair

cs.SE · 2026-04-07 · unverdicted · none · ref 11

DCVD: Dual-Channel Cross-Modal Fusion for Joint Vulnerability Detection and Localization

cs.CR · 2026-05-10 · unverdicted · none · ref 26

Learning Generalizable Multimodal Representations for Software Vulnerability Detection

cs.SE · 2026-04-28 · unverdicted · none · ref 25

PLMGH: What Matters in PLM-GNN Hybrids for Code Classification and Vulnerability Detection

cs.SE · 2026-04-28 · unverdicted · none · ref 13

Prompt-Driven Code Summarization: A Systematic Literature Review

cs.SE · 2026-04-16 · unverdicted · none · ref 31

A Survey on Large Language Models for Code Generation

cs.CL · 2024-06-01 · unverdicted · none · ref 86
