UniXcoder: Unified Cross-Modal Pre-training for Code Representation

Daya Guo, Shuai Lu, Nan Duan, Yanlin Wang, Ming Zhou, Jian Yin · 2022 · cs.CL · arXiv 2203.03850

Canonical reference. 83% of citing Pith papers cite this work as background.

28 Pith papers citing it

abstract

Pre-trained models for programming languages have recently demonstrated great success on code intelligence. To support both code-related understanding and generation tasks, recent works attempt to pre-train unified encoder-decoder models. However, such encoder-decoder framework is sub-optimal for auto-regressive tasks, especially code completion that requires a decoder-only manner for efficient inference. In this paper, we present UniXcoder, a unified cross-modal pre-trained model for programming language. The model utilizes mask attention matrices with prefix adapters to control the behavior of the model and leverages cross-modal contents like AST and code comment to enhance code representation. To encode AST that is represented as a tree in parallel, we propose a one-to-one mapping method to transform AST in a sequence structure that retains all structural information from the tree. Furthermore, we propose to utilize multi-modal contents to learn representation of code fragment with contrastive learning, and then align representations among programming languages using a cross-modal generation task. We evaluate UniXcoder on five code-related tasks over nine datasets. To further evaluate the performance of code fragment representation, we also construct a dataset for a new task, called zero-shot code-to-code search. Results show that our model achieves state-of-the-art performance on most tasks and analysis reveals that comment and AST can both enhance UniXcoder.
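The abstract describes a one-to-one mapping that turns an AST into a sequence while retaining all structural information. A minimal sketch of that idea follows, assuming each non-leaf node is wrapped in a pair of bracket tokens and each leaf is emitted as its own name. The `Node` class, the `<name,left>` / `<name,right>` token spelling, and the `unflatten` helper are illustrative assumptions, not details confirmed by the abstract. The sketch also assumes leaf names never collide with the bracket tokens.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    """Hypothetical AST node: a name plus an ordered list of children."""
    name: str
    children: List["Node"] = field(default_factory=list)


def flatten(node: Node) -> List[str]:
    """Map a tree to a token sequence (assumed bracket-token format)."""
    if not node.children:
        # Leaves are emitted as-is.
        return [node.name]
    # Non-leaf nodes are wrapped in an opening and a closing marker.
    tokens = [f"<{node.name},left>"]
    for child in node.children:
        tokens.extend(flatten(child))
    tokens.append(f"<{node.name},right>")
    return tokens


def _parse(tokens: List[str], pos: int) -> Tuple[Node, int]:
    """Rebuild one subtree starting at tokens[pos]; return it and the next position."""
    tok = tokens[pos]
    if tok.startswith("<") and tok.endswith(",left>"):
        name = tok[1:-len(",left>")]
        closing = f"<{name},right>"
        pos += 1
        children = []
        while tokens[pos] != closing:
            child, pos = _parse(tokens, pos)
            children.append(child)
        return Node(name, children), pos + 1
    return Node(tok), pos + 1


def unflatten(tokens: List[str]) -> Node:
    """Inverse of flatten, showing that the sequence keeps the tree structure."""
    root, pos = _parse(tokens, 0)
    if pos != len(tokens):
        raise ValueError("trailing tokens after the root subtree")
    return root


# Toy tree for the statement `x = 1` (node names are made up for illustration).
tree = Node("assignment", [Node("x"), Node("="), Node("1")])
tokens = flatten(tree)
print(tokens)
print(unflatten(tokens) == tree)
```

Because the sequence can be parsed back into the same tree, no structural information is lost in the flattening, which is the property the abstract attributes to its mapping.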

citation-role summary

background 5 · method 1

citation-polarity summary

background 5 · use method 1

representative citing papers

R2Code: A Self-Reflective LLM Framework for Requirements-to-Code Traceability

cs.SE · 2026-04-24 · unverdicted · novelty 7.0

R2Code improves requirement-to-code traceability with a bidirectional alignment network, self-reflective consistency verification, and dynamic context-adaptive retrieval, yielding 7.4% average F1 gain and up to 41.7% lower token use on five datasets.

TypePro: Boosting LLM-Based Type Inference via Inter-Procedural Slicing

cs.SE · 2026-04-03 · unverdicted · novelty 7.0

TypePro reaches 88.9% and 86.6% Top-1 exact match on Python and TypeScript type-inference datasets by feeding LLMs inter-procedural slices plus structurally derived candidate types.

CodeOCR: On the Effectiveness of Vision Language Models in Code Understanding

cs.CL · 2026-02-02 · unverdicted · novelty 7.0

Multimodal LLMs process code as images to achieve up to 8x token compression, with visual cues like syntax highlighting aiding tasks and clone detection remaining resilient or even improving under compression.

ReDef: Do Code Language Models Truly Understand Code Changes for Just-in-Time Software Defect Prediction?

cs.SE · 2025-09-11 · unverdicted · novelty 7.0

ReDef creates a revert-anchored dataset of 3,164 defective and 10,268 clean code modifications and shows that code language models perform better with diff encodings but maintain stable performance under counterfactual perturbations, indicating reliance on superficial cues.

OpenClassGen: A Large-Scale Corpus of Real-World Python Classes for LLM Research

cs.SE · 2025-04-22 · accept · novelty 7.0

OpenClassGen supplies 324,843 real-world Python classes with self-contained skeletons and static metrics to support LLM class generation research and evaluation.

Large Language Models for Multi-Lingual Equivalent Mutant Detection: An Extended Empirical Study

cs.SE · 2026-07-01 · unverdicted · novelty 6.0

LLM-based methods achieve higher F1-scores than traditional approaches for equivalent mutant detection in Java and C, with fine-tuned code embeddings performing best and showing cross-lingual generalization.

The Decomposition Is the Fingerprint: Per-Component Identity for Agent Skills

cs.CR · 2026-06-30 · unverdicted · novelty 6.0

A per-component SimHash fingerprint supplies structural identity for AI agent skills, recovering family membership under paraphrase and refactoring with AUC 0.974 while localizing changes.

Test Case Selection for Deep Neural Networks: A Replication Study on LLMs for Code

cs.SE · 2026-06-25 · unverdicted · novelty 6.0

Replication of TCS strategies on 17 LLM instances across three code tasks shows only partial generalization from vision DNN results, with uncertainty features aiding early failure discovery and representation features aiding accuracy estimation.

Lost in the Flow with Code Talkers: Unveiling the Instruction-Tuning Tax of Large Language Models in Code Tasks

cs.SE · 2026-06-07 · unverdicted · novelty 6.0

Empirical study finds instruction tuning on CodeLLMs improves instruction following at the expense of infilling performance, termed the Instruction-Tuning Tax.

XSearch: Explainable Code Search via Concept-to-Code Alignment

cs.SE · 2026-05-15 · unverdicted · novelty 6.0 · 2 refs

XSearch achieves 15x gains on out-of-distribution code search benchmarks by replacing global embedding similarity with explicit concept-to-statement alignment.

Similar Pattern Annotation via Retrieval Knowledge for LLM-Based Test Code Fault Localization

cs.SE · 2026-05-08 · unverdicted · novelty 6.0

SPARK improves LLM-based test code fault localization by retrieving similar past faults and selectively annotating suspicious lines in new failing tests.

Tail-aware N-version Machine Learning Models for Reliable API Recommendation

cs.SE · 2026-04-30 · unverdicted · novelty 6.0

NvRec profiles multiple API recommendation models on tail-API performance and applies majority voting with reliability filters to raise true accept rates while controlling rejection of uncertain outputs.

VulStyle: A Multi-Modal Pre-Training for Code Stylometry-Augmented Vulnerability Detection

cs.CR · 2026-04-29 · unverdicted · novelty 6.0

VulStyle pre-trains on 4.9M functions using code, non-terminal ASTs, and stylometry features, then fine-tunes to achieve SOTA F1 gains of 4-48% on BigVul and VulDeePecker.

Residual Risk Analysis in Benign Code: How Far Are We? A Multi-Model Semantic and Structural Similarity Approach

cs.SE · 2026-04-22 · unverdicted · novelty 6.0

Patched functions often remain similar to vulnerable ones, and a new multi-model similarity scoring system identifies residual issues like null pointer dereferences in 61% of high-risk cases from the PrimeVul dataset.

On the Effectiveness of Context Compression for Repository-Level Tasks: An Empirical Investigation

cs.SE · 2026-04-15 · unverdicted · novelty 6.0

Continuous latent-vector compression improves BLEU scores on repository-level code tasks by up to 28.3% at 4x compression while cutting inference latency.

TurboEvolve: Towards Fast and Robust LLM-Driven Program Evolution

cs.NE · 2026-04-12 · unverdicted · novelty 6.0

TurboEvolve improves LLM program evolution by running parallel islands with LLM-generated diverse candidates that carry self-assigned weights, an adaptive scheduler, and clustered seed injection to reach stronger solutions at lower evaluation budgets.

AFGNN: API Misuse Detection using Graph Neural Networks and Clustering

cs.SE · 2026-04-09 · unverdicted · novelty 6.0

AFGNN detects API misuses in Java code more effectively than prior methods by representing usage as graphs and clustering learned embeddings from self-supervised training.

GoCoMA: Hyperbolic Multimodal Representation Fusion for Large Language Model-Generated Code Attribution

cs.CL · 2026-03-24 · unverdicted · novelty 6.0

GoCoMA fuses code stylometry and binary artifact images via hyperbolic Poincaré ball projection and geodesic-cosine attention to attribute LLM-generated code, outperforming baselines on CoDET-M4 and LLMAuthorBench.

Multi Language Models for On-the-Fly Syntax Highlighting

cs.SE · 2025-10-05 · unverdicted · novelty 6.0

Unified multi-language deep learning model for on-the-fly syntax highlighting using normalization and few-shot learning to support six languages with lower deployment cost.

PseudoBridge: Pseudo Code as the Bridge for Better Semantic and Logic Alignment in Code Retrieval

cs.SE · 2025-09-25 · unverdicted · novelty 6.0

PseudoBridge uses LLM-synthesized pseudo-code to bridge NL semantics and PL logic plus logic-invariant style augmentation to boost robustness and generalization in code retrieval.

Fine-Tuning Code Language Models to Detect Cross-Language Bugs

cs.SE · 2025-07-29 · conditional · novelty 6.0

Fine-tuning 13 CodeLMs on a constructed CLB dataset with nine interaction types improves detection, with UniXcoder-base reaching F1 0.7407 and small models outperforming large ones.

RoboTwin 2.0: A Scalable Data Generator and Benchmark with Strong Domain Randomization for Robust Bimanual Robotic Manipulation

cs.RO · 2025-06-22 · unverdicted · novelty 6.0

RoboTwin 2.0 automates diverse synthetic data creation for dual-arm robots via MLLMs and five-axis domain randomization, leading to 228-367% gains in manipulation success.

Context Pruning for Coding Agents via Multi-Rubric Latent Reasoning

cs.AI · 2026-05-14 · unverdicted · novelty 5.0

LaMR decomposes code context pruning into two rubrics using dedicated CRFs, a mixture-of-experts gate, and AST-derived labels to filter noise and often match or beat full-context baselines on coding benchmarks.

PLMGH: What Matters in PLM-GNN Hybrids for Code Classification and Vulnerability Detection

cs.SE · 2026-04-28 · unverdicted · novelty 5.0

Controlled experiments show PLM-GNN hybrids improve code tasks over GNN-only baselines, with PLM source having larger impact than GNN backbone.

citing papers explorer

Showing 28 of 28 citing papers.

R2Code: A Self-Reflective LLM Framework for Requirements-to-Code Traceability
cs.SE · 2026-04-24 · unverdicted · none · ref 38 · internal anchor
R2Code improves requirement-to-code traceability with a bidirectional alignment network, self-reflective consistency verification, and dynamic context-adaptive retrieval, yielding 7.4% average F1 gain and up to 41.7% lower token use on five datasets.

TypePro: Boosting LLM-Based Type Inference via Inter-Procedural Slicing
cs.SE · 2026-04-03 · unverdicted · none · ref 12 · internal anchor
TypePro reaches 88.9% and 86.6% Top-1 exact match on Python and TypeScript type-inference datasets by feeding LLMs inter-procedural slices plus structurally derived candidate types.

CodeOCR: On the Effectiveness of Vision Language Models in Code Understanding
cs.CL · 2026-02-02 · unverdicted · none · ref 39 · internal anchor
Multimodal LLMs process code as images to achieve up to 8x token compression, with visual cues like syntax highlighting aiding tasks and clone detection remaining resilient or even improving under compression.

ReDef: Do Code Language Models Truly Understand Code Changes for Just-in-Time Software Defect Prediction?
cs.SE · 2025-09-11 · unverdicted · none · ref 15 · internal anchor
ReDef creates a revert-anchored dataset of 3,164 defective and 10,268 clean code modifications and shows that code language models perform better with diff encodings but maintain stable performance under counterfactual perturbations, indicating reliance on superficial cues.

OpenClassGen: A Large-Scale Corpus of Real-World Python Classes for LLM Research
cs.SE · 2025-04-22 · accept · none · ref 20 · internal anchor
OpenClassGen supplies 324,843 real-world Python classes with self-contained skeletons and static metrics to support LLM class generation research and evaluation.

Large Language Models for Multi-Lingual Equivalent Mutant Detection: An Extended Empirical Study
cs.SE · 2026-07-01 · unverdicted · none · ref 30 · internal anchor
LLM-based methods achieve higher F1-scores than traditional approaches for equivalent mutant detection in Java and C, with fine-tuned code embeddings performing best and showing cross-lingual generalization.

The Decomposition Is the Fingerprint: Per-Component Identity for Agent Skills
cs.CR · 2026-06-30 · unverdicted · none · ref 14 · internal anchor
A per-component SimHash fingerprint supplies structural identity for AI agent skills, recovering family membership under paraphrase and refactoring with AUC 0.974 while localizing changes.

Test Case Selection for Deep Neural Networks: A Replication Study on LLMs for Code
cs.SE · 2026-06-25 · unverdicted · none · ref 26 · internal anchor
Replication of TCS strategies on 17 LLM instances across three code tasks shows only partial generalization from vision DNN results, with uncertainty features aiding early failure discovery and representation features aiding accuracy estimation.

Lost in the Flow with Code Talkers: Unveiling the Instruction-Tuning Tax of Large Language Models in Code Tasks
cs.SE · 2026-06-07 · unverdicted · none · ref 19 · internal anchor
Empirical study finds instruction tuning on CodeLLMs improves instruction following at the expense of infilling performance, termed the Instruction-Tuning Tax.

XSearch: Explainable Code Search via Concept-to-Code Alignment
cs.SE · 2026-05-15 · unverdicted · none · ref 22 · 2 links · internal anchor
XSearch achieves 15x gains on out-of-distribution code search benchmarks by replacing global embedding similarity with explicit concept-to-statement alignment.

Similar Pattern Annotation via Retrieval Knowledge for LLM-Based Test Code Fault Localization
cs.SE · 2026-05-08 · unverdicted · none · ref 17 · internal anchor
SPARK improves LLM-based test code fault localization by retrieving similar past faults and selectively annotating suspicious lines in new failing tests.

Tail-aware N-version Machine Learning Models for Reliable API Recommendation
cs.SE · 2026-04-30 · unverdicted · none · ref 9 · internal anchor
NvRec profiles multiple API recommendation models on tail-API performance and applies majority voting with reliability filters to raise true accept rates while controlling rejection of uncertain outputs.

VulStyle: A Multi-Modal Pre-Training for Code Stylometry-Augmented Vulnerability Detection
cs.CR · 2026-04-29 · unverdicted · none · ref 16 · internal anchor
VulStyle pre-trains on 4.9M functions using code, non-terminal ASTs, and stylometry features, then fine-tunes to achieve SOTA F1 gains of 4-48% on BigVul and VulDeePecker.

Residual Risk Analysis in Benign Code: How Far Are We? A Multi-Model Semantic and Structural Similarity Approach
cs.SE · 2026-04-22 · unverdicted · none · ref 16 · internal anchor
Patched functions often remain similar to vulnerable ones, and a new multi-model similarity scoring system identifies residual issues like null pointer dereferences in 61% of high-risk cases from the PrimeVul dataset.

On the Effectiveness of Context Compression for Repository-Level Tasks: An Empirical Investigation
cs.SE · 2026-04-15 · unverdicted · none · ref 12 · internal anchor
Continuous latent-vector compression improves BLEU scores on repository-level code tasks by up to 28.3% at 4x compression while cutting inference latency.

TurboEvolve: Towards Fast and Robust LLM-Driven Program Evolution
cs.NE · 2026-04-12 · unverdicted · none · ref 6 · internal anchor
TurboEvolve improves LLM program evolution by running parallel islands with LLM-generated diverse candidates that carry self-assigned weights, an adaptive scheduler, and clustered seed injection to reach stronger solutions at lower evaluation budgets.

AFGNN: API Misuse Detection using Graph Neural Networks and Clustering
cs.SE · 2026-04-09 · unverdicted · none · ref 25 · internal anchor
AFGNN detects API misuses in Java code more effectively than prior methods by representing usage as graphs and clustering learned embeddings from self-supervised training.

GoCoMA: Hyperbolic Multimodal Representation Fusion for Large Language Model-Generated Code Attribution
cs.CL · 2026-03-24 · unverdicted · none · ref 31 · internal anchor
GoCoMA fuses code stylometry and binary artifact images via hyperbolic Poincaré ball projection and geodesic-cosine attention to attribute LLM-generated code, outperforming baselines on CoDET-M4 and LLMAuthorBench.

Multi Language Models for On-the-Fly Syntax Highlighting
cs.SE · 2025-10-05 · unverdicted · none · ref 12 · internal anchor
Unified multi-language deep learning model for on-the-fly syntax highlighting using normalization and few-shot learning to support six languages with lower deployment cost.

PseudoBridge: Pseudo Code as the Bridge for Better Semantic and Logic Alignment in Code Retrieval
cs.SE · 2025-09-25 · unverdicted · none · ref 17 · internal anchor
PseudoBridge uses LLM-synthesized pseudo-code to bridge NL semantics and PL logic plus logic-invariant style augmentation to boost robustness and generalization in code retrieval.

Fine-Tuning Code Language Models to Detect Cross-Language Bugs
cs.SE · 2025-07-29 · conditional · none · ref 21 · internal anchor
Fine-tuning 13 CodeLMs on a constructed CLB dataset with nine interaction types improves detection, with UniXcoder-base reaching F1 0.7407 and small models outperforming large ones.

RoboTwin 2.0: A Scalable Data Generator and Benchmark with Strong Domain Randomization for Robust Bimanual Robotic Manipulation
cs.RO · 2025-06-22 · unverdicted · none · ref 17 · internal anchor
RoboTwin 2.0 automates diverse synthetic data creation for dual-arm robots via MLLMs and five-axis domain randomization, leading to 228-367% gains in manipulation success.

Context Pruning for Coding Agents via Multi-Rubric Latent Reasoning
cs.AI · 2026-05-14 · unverdicted · none · ref 11 · internal anchor
LaMR decomposes code context pruning into two rubrics using dedicated CRFs, a mixture-of-experts gate, and AST-derived labels to filter noise and often match or beat full-context baselines on coding benchmarks.

PLMGH: What Matters in PLM-GNN Hybrids for Code Classification and Vulnerability Detection
cs.SE · 2026-04-28 · unverdicted · none · ref 12 · internal anchor
Controlled experiments show PLM-GNN hybrids improve code tasks over GNN-only baselines, with PLM source having larger impact than GNN backbone.

VerilogCL: A Contrastive Learning Framework for Robust LLM-Based Verilog Generation
cs.AR · 2026-04-20 · unverdicted · none · ref 28 · internal anchor
VerilogCL applies contrastive learning with minimal-error data pairs and a proactive screening module to improve compilation success and functional correctness of 7B LLM-generated Verilog over open-source and commercial baselines on VerilogEval and RTLLM benchmarks.

Separating Secrets from Placeholders: A Hybrid CNN-CodeBERT Framework for Three-Class Credential Leakage Detection
cs.SE · 2026-05-29 · unverdicted · none · ref 34 · internal anchor
Hybrid CNN-CodeBERT framework for three-class credential leakage detection reports MCC of 0.86 and macro F1 of 0.90 on a new 9,426-sample dataset across 10 languages, improving placeholder detection and cutting high-severity alerts by 33%.

Are Decoder-Only Large Language Models the Silver Bullet for Code Search?
cs.SE · 2024-10-29 · unverdicted · none · ref 10 · internal anchor
Fine-tuned decoder-only LLMs achieve up to 40.4% higher MAP than UniXcoder on CoSQA+ for code search, with non-monotonic size scaling and data composition sensitivity.

CodePori: Large-Scale System for Autonomous Software Development Using Multi-Agent Technology
cs.SE · 2024-02-02 · unverdicted · none · ref 45 · internal anchor
CodePori is a multi-agent LLM system for code generation whose participant evaluation identifies practical challenges like memory limits and hallucinations missed by binary benchmarks.
