MARGIN: Margin-Aware Regularized Geometry for Imbalanced Vulnerability Detection
Pith reviewed 2026-05-12 05:22 UTC · model grok-4.3
The pith
MARGIN corrects geometric distortions in hyperspherical embeddings to improve vulnerability detection on imbalanced datasets.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By dynamically adjusting geometric regularization according to the distribution structure estimated by the von Mises-Fisher concentration, MARGIN aligns the probability mass of embedding distributions with their corresponding Voronoi cells in hyperspherical space, thereby reducing geometric distortion induced by frequency and difficulty imbalances and yielding more stable decision boundaries for vulnerability classification and detection.
What carries the argument
The MARGIN framework performs adaptive margin metric learning and hyperspherical prototype modeling, using von Mises-Fisher concentration to align embedding probability mass with Voronoi cells and reduce hyperspherical distortion.
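The paper's exact loss is not given here, but the kind of machinery described, an adaptive angular margin applied against unit-norm class prototypes, can be sketched in a few lines. The function name `margin_logits`, the `scale` parameter, and the per-class `margins` schedule are illustrative assumptions, not the authors' formulation:

```python
import numpy as np

def margin_logits(z, prototypes, labels, margins, scale=16.0):
    """ArcFace-style adaptive angular margin against hyperspherical prototypes.

    z:          (n, d) unit-norm embeddings
    prototypes: (c, d) unit-norm class prototypes
    labels:     (n,) ground-truth class indices
    margins:    (c,) per-class angular margins in radians, e.g. set larger
                for diffuse (low vMF concentration) classes
    """
    cos = np.clip(z @ prototypes.T, -1.0, 1.0)   # cosine to each prototype
    theta = np.arccos(cos)                        # angular distances
    # Penalize only the true-class angle, as in additive angular margins.
    theta[np.arange(len(z)), labels] += margins[labels]
    return scale * np.cos(theta)
```

In this sketch the imbalance-awareness lives entirely in `margins`: a rare or diffuse class gets a larger angular penalty, which pushes its embeddings further inside its prototype's cell during training.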
If this is right
- Consistent outperformance of strong baselines in classification accuracy and detection tasks on imbalanced vulnerability datasets
- More structured embedding geometries that enhance robustness, interpretability, and generalization
- Reduced geometric distortion leading to more stable decision boundaries for downstream vulnerability models
- Improved performance specifically on challenging subsets where frequency or difficulty imbalance is pronounced
Where Pith is reading between the lines
- The geometric regularization could transfer to other security classification tasks such as malware or intrusion detection where class imbalance distorts embeddings
- Integrating MARGIN with graph neural networks for vulnerability analysis might further stabilize representations on code graphs
- The approach implies that many deep learning failures on imbalanced security data originate from unregularized hyperspherical geometry rather than model capacity alone
- Testing the method on real-world streaming vulnerability reports could reveal whether the Voronoi alignment remains effective under distribution shift
Load-bearing premise
Frequency and difficulty imbalances manifest primarily as geometric distortions in hyperspherical representation space, and aligning embedding probability mass with Voronoi cells via the von Mises-Fisher concentration will reliably improve detection without introducing new failure modes.
What would settle it
An experiment on the same public vulnerability datasets in which MARGIN produces no measurable reduction in embedding distortion (as quantified by distance to Voronoi cell boundaries) yet still fails to improve detection metrics over baselines.
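A falsification experiment along these lines needs an operational distortion measure. One simple candidate, an assumption here rather than the paper's metric, is the fraction of a class's embedding mass that falls outside its prototype's hyperspherical Voronoi cell, i.e. embeddings whose nearest prototype by angle belongs to another class:

```python
import numpy as np

def voronoi_distortion(z, labels, prototypes):
    """Fraction of unit-norm embeddings z (n, d) lying outside the
    hyperspherical Voronoi cell of their class prototype (c, d).
    On the unit sphere, the nearest prototype by angle is the one with
    the largest cosine similarity, so the Voronoi assignment reduces
    to an argmax over inner products."""
    nearest = np.argmax(z @ prototypes.T, axis=1)
    return float(np.mean(nearest != labels))
```

Under this measure, the premise predicts that MARGIN lowers `voronoi_distortion` on imbalanced classes and that detection metrics improve in step with it; the settling experiment is the case where the first happens (or fails to happen) without the second.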
Original abstract
Software vulnerability detection is critical for ensuring software security and reliability. Despite recent advances in deep learning, real-world vulnerability datasets suffer from two severe challenges: frequency imbalance and difficulty imbalance. We reinterpret these challenges from an embedding geometry perspective, observing that such imbalances induce geometric distortions in hyperspherical representation space. To address this issue, we propose MARGIN, a metric-based framework that learns discriminative vulnerability representations through adaptive margin metric learning and hyperspherical prototype modeling. MARGIN dynamically adjusts geometric regularization according to the distribution structure estimated by the von Mises-Fisher concentration, aligning the probability mass of embedding distributions with their corresponding Voronoi cells, thereby reducing geometric distortion and yielding more stable decision boundaries. Extensive experiments on public vulnerability datasets show that MARGIN consistently outperforms strong baselines, achieving notable improvements in classification and detection, especially on challenging, imbalanced datasets. Further analysis demonstrates that MARGIN produces more structured embedding geometries, improving robustness, interpretability, and generalization.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes MARGIN, a metric-based framework for software vulnerability detection that reinterprets frequency and difficulty imbalances as geometric distortions in hyperspherical embedding space. It introduces adaptive margin metric learning combined with hyperspherical prototype modeling, where von Mises-Fisher concentration parameters are used to dynamically adjust margins and align embedding distributions with Voronoi cells, thereby reducing distortions and producing more stable decision boundaries. Experiments on public vulnerability datasets demonstrate that MARGIN outperforms strong baselines in classification and detection tasks, particularly on imbalanced data, with additional analysis showing improved embedding structure, robustness, and generalization.
Significance. If the geometric alignment mechanism holds under validation, the work would offer a principled, geometry-aware regularization strategy for handling imbalances in security-critical ML tasks. The hyperspherical prototype approach and explicit analysis of embedding geometries represent a strength, as does the use of public datasets that supports potential reproducibility. This could contribute to more interpretable and robust vulnerability detectors beyond standard class-reweighting techniques.
Major comments (2)
- [Abstract] The central claim, that vMF concentration estimation produces reliable Voronoi alignment (and thereby stable boundaries) without new failure modes, is load-bearing but unsupported by any derivation, ablation, or statistical test in the provided description; the skeptic's concern about sensitivity to small samples and label noise in vulnerability data therefore remains unaddressed.
- [Abstract] The experimental outcomes asserting 'notable improvements' and 'more structured embedding geometries' rest entirely on unverified results; no equations, tables, or ablation studies are referenced to confirm that the observed gains arise from the claimed geometric correction rather than other factors.
Minor comments (1)
- Clarify the exact procedure for estimating the von Mises-Fisher concentration parameter κ from finite, noisy samples and how it is used to set per-class margins.
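For reference, the standard moment estimator from Banerjee et al. [13] is a one-liner; whether the authors use it is not stated here, but its small-sample behavior illustrates why the question matters:

```python
import numpy as np

def estimate_kappa(x):
    """Moment estimator for the vMF concentration (Banerjee et al., 2005):
    kappa_hat ~ r_bar * (d - r_bar**2) / (1 - r_bar**2), where r_bar is
    the mean resultant length of unit-norm samples x of shape (n, d).
    For small n, r_bar is biased upward even for uniformly scattered
    data, so per-class margins derived from kappa_hat on rare classes
    inherit that noise."""
    n, d = x.shape
    r_bar = np.linalg.norm(x.sum(axis=0)) / n
    return r_bar * (d - r_bar**2) / (1.0 - r_bar**2)
```

A tightly clustered class yields a large `kappa_hat` and a diffuse one a small value; the open question raised above is how stable that ordering remains for classes with only a handful of noisy examples.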
Simulated Author's Rebuttal
We thank the referee for the thoughtful and detailed comments. We agree that the abstract would benefit from clearer references to the supporting derivations, ablations, and analyses in the full manuscript. We address each major comment below and will revise the abstract accordingly.
Point-by-point responses
-
Referee: [Abstract] The central claim, that vMF concentration estimation produces reliable Voronoi alignment (and thereby stable boundaries) without new failure modes, is load-bearing but unsupported by any derivation, ablation, or statistical test in the provided description; the skeptic's concern about sensitivity to small samples and label noise in vulnerability data therefore remains unaddressed.
Authors: The abstract is necessarily concise and does not contain the full technical details. The manuscript derives the vMF-based adaptive margin in Section 3, shows how concentration parameters align embedding distributions with Voronoi cells, and includes ablations plus boundary-stability metrics in Section 4. We will revise the abstract to explicitly reference these sections. On sensitivity, the paper reports experiments with reduced sample sizes and synthetic label noise; the adaptive regularization does not introduce new instabilities relative to baselines. If the referee finds the current robustness analysis insufficient, we can expand it in a revision.
Revision: partial
-
Referee: [Abstract] The experimental outcomes asserting 'notable improvements' and 'more structured embedding geometries' rest entirely on unverified results; no equations, tables, or ablation studies are referenced to confirm that the observed gains arise from the claimed geometric correction rather than other factors.
Authors: We accept that the abstract should make the source of the reported gains verifiable. The full manuscript contains the relevant performance tables, embedding-geometry visualizations with quantitative metrics, and ablation studies that isolate the contribution of the geometric regularization. We will update the abstract to cite these specific results and sections so that the claims are directly traceable.
Revision: yes
Circularity Check
No circularity in MARGIN derivation chain
Full rationale
The paper proposes MARGIN as a new metric-based framework that reinterprets frequency and difficulty imbalances as hyperspherical geometric distortions, then applies adaptive margin learning and vMF-based concentration to align embeddings with Voronoi cells. No equations, steps, or claims reduce the outputs to inputs by construction, fitted parameters renamed as predictions, or load-bearing self-citations. The method is presented as an independent contribution with external experimental validation on public datasets, so the derivation remains self-contained against benchmarks.
Reference graph
Works this paper leans on
-
[1]
A comprehensive survey of loss functions and metrics in deep learning,
J. Terven, D.-M. Cordova-Esparza, J.-A. Romero-González, A. Ramírez-Pedraza, and E. Chávez Urbiola, "A comprehensive survey of loss functions and metrics in deep learning," Artificial Intelligence Review, vol. 58, Apr. 2025.
-
[2]
An empirical study of the imbalance issue in software vulnerability detection,
Y. Guo, Q. Hu, Q. Tang, and Y. L. Traon, "An empirical study of the imbalance issue in software vulnerability detection," in Computer Security – ESORICS 2023, G. Tsudik, M. Conti, K. Liang, and G. Smaragdakis, Eds. Cham: Springer Nature Switzerland, 2024, pp. 371–390.
-
[3]
Neural collapse to multiple centers for imbalanced data,
F. Li, J. Luo, F. Peng, Y. Qian, H. Yan, and Z. Zhu, "Neural collapse to multiple centers for imbalanced data," in Advances in Neural Information Processing Systems 37, ser. NeurIPS 2024. Neural Information Processing Systems Foundation, Inc. (NeurIPS), 2024, pp. 65583–65617. [Online]. Available: https://dx.doi.org/10.52202/079017-2095
-
[4]
Class-balanced loss based on effective number of samples,
Y. Cui, M. Jia, T.-Y. Lin, Y. Song, and S. Belongie, "Class-balanced loss based on effective number of samples," in 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, Jun. 2019, pp. 9260–9269. [Online]. Available: https://dx.doi.org/10.1109/cvpr.2019.00949
-
[5]
An investigation of quality issues in vulnerability detection datasets,
Y. Guo and S. Bettaieb, "An investigation of quality issues in vulnerability detection datasets," in 2023 IEEE European Symposium on Security and Privacy Workshops (EuroS&PW). IEEE, Jul. 2023, pp. 29–33. [Online]. Available: https://dx.doi.org/10.1109/eurospw59978.2023.00008
-
[6]
Code2vec: Learning distributed representations of code,
U. Alon, M. Zilberstein, O. Levy, and E. Yahav, "Code2vec: Learning distributed representations of code," Proc. ACM Program. Lang., vol. 3, no. POPL, pp. 40:1–40:29, Jan. 2019. [Online]. Available: https://doi.acm.org/10.1145/3290353
-
[7]
Neural networks and the bias/variance dilemma,
S. Geman, E. Bienenstock, and R. Doursat, "Neural networks and the bias/variance dilemma," Neural Computation, vol. 4, no. 1, pp. 1–58, Jan. 1992. [Online]. Available: https://dx.doi.org/10.1162/neco.1992.4.1.1
-
[9]
Hyperspherical prototype networks,
P. Mettes, E. van der Pol, and C. G. M. Snoek, "Hyperspherical prototype networks," in Advances in Neural Information Processing Systems, 2019.
-
[10]
L2-constrained softmax loss for discriminative face verification,
R. Ranjan, C. D. Castillo, and R. Chellappa, "L2-constrained softmax loss for discriminative face verification," ArXiv, vol. abs/1703.09507, 2017. [Online]. Available: https://doi.org/10.48550/arXiv.1703.09507
-
[12]
Guiding neural collapse: Optimising towards the nearest simplex equiangular tight frame,
E. Markou, T. Ajanthan, and S. Gould, "Guiding neural collapse: Optimising towards the nearest simplex equiangular tight frame," in NeurIPS, 2024.
-
[13]
Clustering on the unit hypersphere using von mises–fisher distributions,
A. Banerjee, I. S. Dhillon, J. Ghosh, and S. Sra, "Clustering on the unit hypersphere using von Mises-Fisher distributions," Journal of Machine Learning Research, vol. 6, pp. 1345–1382, 2005.
-
[14]
Deep neural collapse is provably optimal for the deep unconstrained features model,
C. Lampert, M. Mondelli, and P. Súkeník, "Deep neural collapse is provably optimal for the deep unconstrained features model," in Advances in Neural Information Processing Systems 36, ser. NeurIPS 2023. Neural Information Processing Systems Foundation, Inc. (NeurIPS), 2023, pp. 52991–53024. [Online]. Available: https://dx.doi.org/10.52202/075280-2306
-
[15]
Von Mises-Fisher clustering models,
S. Gopal and Y. Yang, "Von Mises-Fisher clustering models," in Proceedings of the 31st International Conference on Machine Learning, ser. Proceedings of Machine Learning Research, E. P. Xing and T. Jebara, Eds., vol. 32, no. 1. Beijing, China: PMLR, 22–24 Jun 2014, pp. 154–162. [Online]. Available: https://proceedings.mlr.press/v32/gopal14.html
-
[16]
A C/C++ code vulnerability dataset with code changes and CVE summaries,
J. Fan, Y. Li, S. Wang, and T. N. Nguyen, "A C/C++ code vulnerability dataset with code changes and CVE summaries," in 2020 IEEE/ACM 17th International Conference on Mining Software Repositories (MSR), 2020, pp. 508–512.
-
[17]
MegaVul: A C/C++ vulnerability dataset with comprehensive code representations,
C. Ni, L. Shen, X. Yang, Y. Zhu, and S. Wang, "MegaVul: A C/C++ vulnerability dataset with comprehensive code representations," in Proceedings of the 21st International Conference on Mining Software Repositories, ser. MSR '24. New York, NY, USA: Association for Computing Machinery, 2024, pp. 738–742. [Online]. Available: https://doi.org/10.1145/3643991.3644886
-
[18]
ReposVul: A repository-level high-quality vulnerability dataset,
X. Wang, R. Hu, C. Gao, X.-C. Wen, Y. Chen, and Q. Liao, "ReposVul: A repository-level high-quality vulnerability dataset," in Proceedings of the 2024 IEEE/ACM 46th International Conference on Software Engineering: Companion Proceedings, ser. ICSE-Companion '24. New York, NY, USA: Association for Computing Machinery, 2024, pp. 472–483. [Online]. Available...
-
[19]
Deep learning based vulnerability detection: Are we there yet?
S. Chakraborty, R. Krishna, Y. Ding, and B. Ray, "Deep learning based vulnerability detection: Are we there yet?" IEEE Transactions on Software Engineering, vol. 48, no. 9, pp. 3280–3296, Sep. 2022. [Online]. Available: https://dx.doi.org/10.1109/tse.2021.3087402
-
[20]
LIVABLE: Exploring long-tailed classification of software vulnerability types,
X.-C. Wen, C. Gao, F. Luo, H. Wang, G. Li, and Q. Liao, "LIVABLE: Exploring long-tailed classification of software vulnerability types," IEEE Transactions on Software Engineering, vol. 50, no. 6, pp. 1325–1339, Jun. 2024. [Online]. Available: https://dx.doi.org/10.1109/tse.2024.3382361
-
[21]
µVulDeePecker: A deep learning-based system for multiclass vulnerability detection,
D. Zou, S. Wang, S. Xu, Z. Li, and H. Jin, "µVulDeePecker: A deep learning-based system for multiclass vulnerability detection," IEEE Transactions on Dependable and Secure Computing, vol. 18, no. 5, pp. 2224–2236, 2021.
-
[22]
SySeVR: A framework for using deep learning to detect software vulnerabilities,
Z. Li, D. Zou, S. Xu, H. Jin, Y. Zhu, and Z. Chen, "SySeVR: A framework for using deep learning to detect software vulnerabilities," IEEE Transactions on Dependable and Secure Computing, vol. 19, no. 4, pp. 2244–2258, Jul. 2022. [Online]. Available: https://dx.doi.org/10.1109/tdsc.2021.3051525
-
[23]
Applying contrastive learning to code vulnerability type classification,
C. Ji, S. Yang, H. Sun, and Y. Zhang, "Applying contrastive learning to code vulnerability type classification," in Conference on Empirical Methods in Natural Language Processing, 2024. [Online]. Available: https://api.semanticscholar.org/CorpusID:273901675
-
[24]
One-for-all does not work! Enhancing vulnerability detection by mixture-of-experts (MoE),
X. Yang, S. Wang, J. Zhou, and W. Zhu, "One-for-all does not work! Enhancing vulnerability detection by mixture-of-experts (MoE)," Proceedings of the ACM on Software Engineering, vol. 2, no. FSE, pp. 446–464, Jun. 2025. [Online]. Available: https://dx.doi.org/10.1145/3715736
-
[25]
Clever: Multi-modal contrastive learning for vulnerability code representation,
J. Li, L. Cui, S. Zhao, Y. Yang, L. Li, and H. Zhu, "Clever: Multi-modal contrastive learning for vulnerability code representation," in Findings of the Association for Computational Linguistics: ACL 2025. Association for Computational Linguistics, 2025, pp. 7940–7951. [Online]. Available: https://dx.doi.org/10.18653/v1/2025.findings-acl.414
-
[26]
Vulnerability detection via multiple-graph-based code representation,
F. Qiu, Z. Liu, X. Hu, X. Xia, G. Chen, and X. Wang, "Vulnerability detection via multiple-graph-based code representation," IEEE Transactions on Software Engineering, vol. 50, no. 8, pp. 2178–2199, Aug. 2024. [Online]. Available: https://dx.doi.org/10.1109/tse.2024.3427815
-
[27]
Y. Zhou, S. Liu, J. Siow, X. Du, and Y. Liu, Devign: Effective vulnerability identification by learning comprehensive program semantics via graph neural networks. Red Hook, NY, USA: Curran Associates Inc., 2019.
-
[28]
Z. Feng, D. Guo, D. Tang, N. Duan, X. Feng, M. Gong, and M. Zhou, "CodeBERT: A pre-trained model for programming and natural languages," in Findings of the Association for Computational Linguistics: EMNLP 2020, 2020, pp. 1536–1547. [Online]. Available: https://dx.doi.org/10.18653/v1/2020.findings-emnlp.139
-
[29]
GraphCodeBERT: Pre-training code representations with data flow
D. Guo, S. Ren, S. Lu, Z. Feng, D. Tang, S. Liu, L. Zhou, N. Duan, A. Svyatkovskiy, S. Fu, M. Tufano, S. K. Deng, C. B. Clement, D. Drain, N. Sundaresan, J. Yin, D. Jiang, and M. Zhou, "GraphCodeBERT: Pre-training code representations with data flow," in ICLR. OpenReview.net, 2021. [Online]. Available: https://dblp.uni-trier.de/db/conf/iclr/iclr2021.html#...
-
[30]
Y. Wang, W. Wang, S. Joty, and S. C. Hoi, "CodeT5: Identifier-aware unified pre-trained encoder-decoder models for code understanding and generation," in Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 2021, pp. 8696–8708. [Online]. Available: https://dx.doi.org/10.18653/v1/...
-
[31]
UniXcoder: Unified cross-modal pre-training for code representation,
D. Guo, S. Lu, N. Duan, Y. Wang, M. Zhou, and J. Yin, "UniXcoder: Unified cross-modal pre-training for code representation," in Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Association for Computational Linguistics, 2022, pp. 7212–7225. [Online]. Available: https://dx.doi.org/10.18653/v1/2...
-
[32]
Does data sampling improve deep learning-based vulnerability detection? Yeas! and nays!
X. Yang, S. Wang, Y. Li, and S. Wang, "Does data sampling improve deep learning-based vulnerability detection? Yeas! and nays!" in Proceedings of the 45th International Conference on Software Engineering, ser. ICSE '23. IEEE Press, 2023, pp. 2287–2298. [Online]. Available: https://doi.org/10.1109/ICSE48619.2023.00192
-
[33]
ArcFace: Additive angular margin loss for deep face recognition,
J. Deng, J. Guo, N. Xue, and S. Zafeiriou, "ArcFace: Additive angular margin loss for deep face recognition," in 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2019, pp. 4685–4694.
-
[34]
Y. Meng, J. Huang, G. Wang, C. Zhang, H. Zhuang, L. Kaplan, and J. Han, Spherical text embedding. Red Hook, NY, USA: Curran Associates Inc., 2019.
-
[35]
Y. He, G. Lin, X. Ma, J. W. Keung, C. Tan, W. Hu, and F. Li, "Enhancing deep learning vulnerability detection through imbalance loss functions: An empirical study," Proceedings of the 15th Asia-Pacific Symposium on Internetware, 2024. [Online]. Available: https://api.semanticscholar.org/CorpusID:271293860