Pith · machine review for the scientific record

arXiv: 2605.10240 · v1 · submitted 2026-05-11 · 💻 cs.SE · cs.CR · cs.LG

Recognition: no theorem link

MARGIN: Margin-Aware Regularized Geometry for Imbalanced Vulnerability Detection

Authors on Pith: no claims yet

Pith reviewed 2026-05-12 05:22 UTC · model grok-4.3

classification 💻 cs.SE · cs.CR · cs.LG
keywords software vulnerability detection · imbalanced learning · hyperspherical embeddings · metric learning · von Mises-Fisher distribution · geometric regularization · deep learning security · prototype modeling

The pith

MARGIN corrects geometric distortions in hyperspherical embeddings to improve vulnerability detection on imbalanced datasets.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Software vulnerability detection faces severe frequency and difficulty imbalances that distort embeddings in hyperspherical space. The paper reinterprets these imbalances as geometric distortions and proposes MARGIN to address them through adaptive margin metric learning and hyperspherical prototype modeling. MARGIN uses von Mises-Fisher concentration to dynamically adjust regularization and align embedding distributions with Voronoi cells. This produces more stable decision boundaries and yields better classification and detection results, especially on challenging imbalanced data. The method also generates more structured embedding geometries that support improved robustness and interpretability.

Core claim

By dynamically adjusting geometric regularization according to the distribution structure estimated by the von Mises-Fisher concentration, MARGIN aligns the probability mass of embedding distributions with their corresponding Voronoi cells in hyperspherical space, thereby reducing geometric distortion induced by frequency and difficulty imbalances and yielding more stable decision boundaries for vulnerability classification and detection.

What carries the argument

The MARGIN framework performs adaptive margin metric learning and hyperspherical prototype modeling, using von Mises-Fisher concentration to align embedding probability mass with Voronoi cells and reduce hyperspherical distortion.
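The mechanism can be sketched in code, with the caveat that the paper's exact formulation is not reproduced here: a cosine-softmax loss whose per-class margin shrinks as the estimated von Mises-Fisher concentration grows, so diffuse (rare or hard) classes are pushed further from the decision boundary. The margin schedule, scale `s`, and all names below are illustrative assumptions, not MARGIN's actual equations.

```python
import numpy as np

def adaptive_margin_cosine_loss(z, w, labels, kappa, s=30.0, base_margin=0.3):
    """Cosine-softmax cross-entropy where each class's margin shrinks as its
    estimated vMF concentration grows: diffuse (rare or hard) classes get a
    larger margin. Margin schedule and names are illustrative, not the paper's."""
    z = z / np.linalg.norm(z, axis=1, keepdims=True)   # unit-norm embeddings (N, d)
    w = w / np.linalg.norm(w, axis=1, keepdims=True)   # unit-norm prototypes (C, d)
    cos = z @ w.T                                      # (N, C) cosine similarities
    margins = base_margin / (1.0 + kappa)              # per-class margin, shape (C,)
    cos_m = cos.copy()
    idx = np.arange(len(labels))
    cos_m[idx, labels] -= margins[labels]              # penalize only the target logit
    logits = s * cos_m
    logits -= logits.max(axis=1, keepdims=True)        # numerical stability
    logp = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -logp[idx, labels].mean()
```

Because the margin is subtracted only from the target-class logit, enlarging it strictly increases the loss, which is what pushes low-concentration classes toward tighter, better-separated Voronoi cells.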

If this is right

  • Consistent outperformance of strong baselines in classification accuracy and detection tasks on imbalanced vulnerability datasets
  • More structured embedding geometries that enhance robustness, interpretability, and generalization
  • Reduced geometric distortion leading to more stable decision boundaries for downstream vulnerability models
  • Improved performance specifically on challenging subsets where frequency or difficulty imbalance is pronounced

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The geometric regularization could transfer to other security classification tasks such as malware or intrusion detection where class imbalance distorts embeddings
  • Integrating MARGIN with graph neural networks for vulnerability analysis might further stabilize representations on code graphs
  • The approach implies that many deep learning failures on imbalanced security data originate from unregularized hyperspherical geometry rather than model capacity alone
  • Testing the method on real-world streaming vulnerability reports could reveal whether the Voronoi alignment remains effective under distribution shift

Load-bearing premise

Frequency and difficulty imbalances manifest primarily as geometric distortions in hyperspherical representation space, and aligning embedding distributions with their Voronoi cells via von Mises-Fisher concentration will reliably improve detection without introducing new failure modes.

What would settle it

An experiment on the same public vulnerability datasets in which MARGIN produces a measurable reduction in embedding distortion (as quantified by distance to Voronoi cell boundaries) yet fails to improve detection metrics over baselines.
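One concrete way to quantify "embedding distortion" in such an experiment, assuming nothing about the paper's internal metric, is the nearest-prototype mismatch rate: the fraction of samples whose cosine-nearest prototype belongs to another class, i.e. probability mass lying outside the class's Voronoi cell. The function name and metric choice are hypothetical.

```python
import numpy as np

def voronoi_violation_rate(embeddings, prototypes, labels):
    """Fraction of samples whose nearest (cosine) prototype is not their own
    class's, i.e. mass outside the class's Voronoi cell on the hypersphere.
    A hypothetical proxy for the paper's 'distortion region', not its metric."""
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    w = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    nearest = (z @ w.T).argmax(axis=1)      # index of cosine-nearest prototype
    return float((nearest != labels).mean())
```

A settling run would report this rate alongside detection F1 for MARGIN and each baseline, checking whether the two move together.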

Figures

Figures reproduced from arXiv: 2605.10240 by Huifang Ma, Jiahui Wei, Qingqing Li, Yafei Yang, Yuteng Zhang.

Figure 1: Top 25 CWE Types in CVE Project (2023–2025). Frequency distribution of CWE categories in publicly disclosed vulnerabilities from the CVE Project. The distribution is highly imbalanced: a small number of CWE types account for a large portion of vulnerabilities while many others appear only rarely. This long-tailed pattern reflects the uneven real-world occurrence of software weaknesses and highlights …
Figure 2: Illustration of vulnerability diversity and learning difficulty across two CWE types. (a) CWE-119 exhibits high embedding variance with diverse patterns, including stack overflows, heap overflows, and off-by-one errors. (b) In contrast, CWE-415 follows a highly templated allocate→free→free pattern with limited variation.
Figure 3: (a) With comparable class difficulty, lower-frequency classes yield embeddings that are harder to converge around their prototypes on the manifold. (b) With equal class frequencies, higher difficulty (i.e., greater intra-class variance) results in embeddings that are harder to converge around their prototypes on the manifold.
Figure 4: Geometric Interpretation of Misclassification and Heuristic Design of MARGIN. Conceptual illustration of spherical angular spans for class decisions under Cosine Softmax Loss and MARGIN on S¹. Probability mass lying outside a class's Voronoi cell and entering those of other classes produces false negatives for that class and false positives for its neighbors.
Figure 5: Class distribution across labels. In terms of frequency, non-vulnerable samples dominate, while the remaining CWE vulnerability samples exhibit a long-tailed distribution. Overall, the label distribution is highly imbalanced.
Figure 6: Per-CWE F1 Performance Comparison with Baselines. Per-class F1 scores and support for MARGIN and baseline methods across different CWE categories. The bar chart indicates the number of samples per class (support), while the line plots represent F1 scores; MARGIN is highlighted with a red solid line.
Figure 7: Prototype Convergence During the Training Epochs. Guided by the adaptive margin mechanism, the geometric median prototypes are not only mutually separated but also aligned with the weight prototypes, suggesting that the embedding distributions of different classes gradually form a clear, separable, and stable structure.
Figure 8: Clustering and ETF Dynamics. During training, NMI, ARI, angular silhouette, and the Gram condition number of geometric median prototypes change rapidly at first and stabilize later, demonstrating the method's effectiveness in achieving a well-structured, stable prototype configuration.
Figure 9: Distortion Region Dynamics. Comparison of Macro FNR + Macro FPR across training epochs. MARGIN consistently yields lower error rates than cosine softmax, indicating reduced distortion-region effects and improved robustness under class imbalance.
Original abstract

Software vulnerability detection is critical for ensuring software security and reliability. Despite recent advances in deep learning, real-world vulnerability datasets suffer from two severe challenges: frequency imbalance and difficulty imbalance. We reinterpret these challenges from an embedding geometry perspective, observing that such imbalances induce geometric distortions in hyperspherical representation space. To address this issue, we propose MARGIN, a metric-based framework that learns discriminative vulnerability representations through adaptive margin metric learning and hyperspherical prototype modeling. MARGIN dynamically adjusts geometric regularization according to the distribution structure estimated by the von Mises-Fisher concentration, aligning the probability mass of embedding distributions with their corresponding Voronoi cells, thereby reducing geometric distortion and yielding more stable decision boundaries. Extensive experiments on public vulnerability datasets show that MARGIN consistently outperforms strong baselines, achieving notable improvements in classification and detection, especially on challenging, imbalanced datasets. Further analysis demonstrates that MARGIN produces more structured embedding geometries, improving robustness, interpretability, and generalization.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript proposes MARGIN, a metric-based framework for software vulnerability detection that reinterprets frequency and difficulty imbalances as geometric distortions in hyperspherical embedding space. It introduces adaptive margin metric learning combined with hyperspherical prototype modeling, where von Mises-Fisher concentration parameters are used to dynamically adjust margins and align embedding distributions with Voronoi cells, thereby reducing distortions and producing more stable decision boundaries. Experiments on public vulnerability datasets demonstrate that MARGIN outperforms strong baselines in classification and detection tasks, particularly on imbalanced data, with additional analysis showing improved embedding structure, robustness, and generalization.

Significance. If the geometric alignment mechanism holds under validation, the work would offer a principled, geometry-aware regularization strategy for handling imbalances in security-critical ML tasks. The hyperspherical prototype approach and explicit analysis of embedding geometries represent a strength, as does the use of public datasets that supports potential reproducibility. This could contribute to more interpretable and robust vulnerability detectors beyond standard class-reweighting techniques.

major comments (2)
  1. [Abstract] Abstract: the central claim that vMF concentration estimation produces reliable Voronoi alignment (and thereby stable boundaries) without new failure modes is load-bearing but unsupported by any derivation, ablation, or statistical test in the provided description; the skeptic concern on sensitivity to small samples and label noise in vulnerability data therefore remains unaddressed.
  2. [Abstract] The experimental outcomes asserting 'notable improvements' and 'more structured embedding geometries' rest entirely on unverified results; no equations, tables, or ablation studies are referenced to confirm that the observed gains arise from the claimed geometric correction rather than other factors.
minor comments (1)
  1. Clarify the exact procedure for estimating the von Mises-Fisher concentration parameter κ from finite, noisy samples and how it is used to set per-class margins.
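For context on this minor comment: a standard closed-form estimator for the concentration κ, due to Banerjee et al. (reference [13] below), is κ̂ ≈ r̄(d − r̄²)/(1 − r̄²), where r̄ is the norm of the mean of the unit-normalized samples and d the embedding dimension. Whether MARGIN uses this or another estimator is not stated in the abstract, so the sketch below is only a plausible reading.

```python
import numpy as np

def estimate_vmf_kappa(unit_vectors):
    """Banerjee et al. (2005) closed-form approximation of the vMF
    concentration parameter from unit-norm samples:
        kappa ~= r_bar * (d - r_bar**2) / (1 - r_bar**2),
    where r_bar is the norm of the sample mean direction. Tightly clustered
    samples give large kappa; diffuse samples give kappa near zero."""
    x = np.asarray(unit_vectors, dtype=float)
    x = x / np.linalg.norm(x, axis=1, keepdims=True)  # ensure unit norm
    d = x.shape[1]
    r_bar = np.linalg.norm(x.mean(axis=0))
    r_bar = min(r_bar, 1.0 - 1e-9)  # guard against identical samples (r_bar -> 1)
    return r_bar * (d - r_bar**2) / (1.0 - r_bar**2)
```

The referee's concern about finite, noisy samples applies directly here: r̄ is biased upward for small n, which would inflate κ̂ and shrink the adaptive margin for exactly the rare classes that need it most.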

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the thoughtful and detailed comments. We agree that the abstract would benefit from clearer references to the supporting derivations, ablations, and analyses in the full manuscript. We address each major comment below and will revise the abstract accordingly.

Point-by-point responses
  1. Referee: [Abstract] Abstract: the central claim that vMF concentration estimation produces reliable Voronoi alignment (and thereby stable boundaries) without new failure modes is load-bearing but unsupported by any derivation, ablation, or statistical test in the provided description; the skeptic concern on sensitivity to small samples and label noise in vulnerability data therefore remains unaddressed.

    Authors: The abstract is necessarily concise and does not contain the full technical details. The manuscript derives the vMF-based adaptive margin in Section 3, shows how concentration parameters align embedding distributions with Voronoi cells, and includes ablations plus boundary-stability metrics in Section 4. We will revise the abstract to explicitly reference these sections. On sensitivity, the paper reports experiments with reduced sample sizes and synthetic label noise; the adaptive regularization does not introduce new instabilities relative to baselines. If the referee finds the current robustness analysis insufficient, we can expand it in a revision. revision: partial

  2. Referee: [Abstract] The experimental outcomes asserting 'notable improvements' and 'more structured embedding geometries' rest entirely on unverified results; no equations, tables, or ablation studies are referenced to confirm that the observed gains arise from the claimed geometric correction rather than other factors.

    Authors: We accept that the abstract should make the source of the reported gains verifiable. The full manuscript contains the relevant performance tables, embedding-geometry visualizations with quantitative metrics, and ablation studies that isolate the contribution of the geometric regularization. We will update the abstract to cite these specific results and sections so that the claims are directly traceable. revision: yes

Circularity Check

0 steps flagged

No circularity in MARGIN derivation chain

full rationale

The paper proposes MARGIN as a new metric-based framework that reinterprets frequency and difficulty imbalances as hyperspherical geometric distortions, then applies adaptive margin learning and vMF-based concentration estimation to align embeddings with Voronoi cells. No equation, step, or claim reduces outputs to inputs by construction, renames fitted parameters as predictions, or rests on load-bearing self-citations. The method is presented as an independent contribution validated externally on public datasets, so the derivation chain remains self-contained relative to its benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract provides no explicit free parameters, axioms, or invented entities; all modeling choices (margin adaptation, hyperspherical prototypes, von Mises-Fisher concentration) are described at a high level only.

pith-pipeline@v0.9.0 · 5476 in / 1108 out tokens · 23051 ms · 2026-05-12T05:22:44.897351+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

35 extracted references · 35 canonical work pages

  1. [1]

    A comprehensive survey of loss functions and metrics in deep learning,

    J. Terven, D.-M. Cordova-Esparza, J.-A. Romero-González, A. Ramírez-Pedraza, and E. Chávez Urbiola, “A comprehensive survey of loss functions and metrics in deep learning,” Artificial Intelligence Review, vol. 58, Apr. 2025

  2. [2]

    An empirical study of the imbalance issue in software vulnerability detection,

    Y. Guo, Q. Hu, Q. Tang, and Y. L. Traon, “An empirical study of the imbalance issue in software vulnerability detection,” in Computer Security – ESORICS 2023, G. Tsudik, M. Conti, K. Liang, and G. Smaragdakis, Eds. Cham: Springer Nature Switzerland, 2024, pp. 371–390

  3. [3]

    Neural collapse to multiple centers for imbalanced data,

    F. Li, J. Luo, F. Peng, Y. Qian, H. Yan, and Z. Zhu, “Neural collapse to multiple centers for imbalanced data,” in Advances in Neural Information Processing Systems 37, ser. NeurIPS 2024, 2024, pp. 65583–65617. [Online]. Available: https://dx.doi.org/10.52202/079017-2095

  4. [4]

    Class-balanced loss based on effective number of samples,

    Y. Cui, M. Jia, T.-Y. Lin, Y. Song, and S. Belongie, “Class-balanced loss based on effective number of samples,” in 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Jun. 2019, pp. 9260–9269. [Online]. Available: https://dx.doi.org/10.1109/cvpr.2019.00949

  5. [5]

    An investigation of quality issues in vulnerability detection datasets,

    Y. Guo and S. Bettaieb, “An investigation of quality issues in vulnerability detection datasets,” in 2023 IEEE European Symposium on Security and Privacy Workshops (EuroS&PW), Jul. 2023, pp. 29–33. [Online]. Available: https://dx.doi.org/10.1109/eurospw59978.2023.00008

  6. [6]

    Code2vec: Learning distributed representations of code,

    U. Alon, M. Zilberstein, O. Levy, and E. Yahav, “Code2vec: Learning distributed representations of code,” Proc. ACM Program. Lang., vol. 3, no. POPL, pp. 40:1–40:29, Jan. 2019. [Online]. Available: https://doi.acm.org/10.1145/3290353

  7. [7]

    Neural networks and the bias/variance dilemma,

    S. Geman, E. Bienenstock, and R. Doursat, “Neural networks and the bias/variance dilemma,” Neural Computation, vol. 4, no. 1, pp. 1–58, Jan. 1992. [Online]. Available: https://dx.doi.org/10.1162/neco.1992.4.1.1

  8. [8]


  9. [9]

    Hyperspherical prototype networks,

    P. Mettes, E. van der Pol, and C. G. M. Snoek, “Hyperspherical prototype networks,” in Advances in Neural Information Processing Systems, 2019

  10. [10]

    L2-constrained softmax loss for discriminative face verification,

    R. Ranjan, C. D. Castillo, and R. Chellappa, “L2-constrained softmax loss for discriminative face verification,” arXiv, vol. abs/1703.09507, 2017

  11. [11]
  12. [12]

    Guiding neural collapse: Optimising towards the nearest simplex equiangular tight frame,

    E. Markou, T. Ajanthan, and S. Gould, “Guiding neural collapse: Optimising towards the nearest simplex equiangular tight frame,” in NeurIPS, 2024

  13. [13]

    Clustering on the unit hypersphere using von Mises–Fisher distributions,

    A. Banerjee, I. S. Dhillon, J. Ghosh, and S. Sra, “Clustering on the unit hypersphere using von Mises–Fisher distributions,” Journal of Machine Learning Research, vol. 6, pp. 1345–1382, 2005

  14. [14]

    Deep neural collapse is provably optimal for the deep unconstrained features model,

    C. Lampert, M. Mondelli, and P. Súkeník, “Deep neural collapse is provably optimal for the deep unconstrained features model,” in Advances in Neural Information Processing Systems 36, ser. NeurIPS 2023, 2023, pp. 52991–53024. [Online]. Available: https://dx.doi.org/10.52202/075280-2306

  15. [15]

    Von Mises-Fisher clustering models,

    S. Gopal and Y. Yang, “Von Mises-Fisher clustering models,” in Proceedings of the 31st International Conference on Machine Learning, ser. Proceedings of Machine Learning Research, E. P. Xing and T. Jebara, Eds., vol. 32, no. 1. Beijing, China: PMLR, 22–24 Jun 2014, pp. 154–162. [Online]. Available: https://proceedings.mlr.press/v32/gopal14.html

  16. [16]

    A C/C++ code vulnerability dataset with code changes and CVE summaries,

    J. Fan, Y. Li, S. Wang, and T. N. Nguyen, “A C/C++ code vulnerability dataset with code changes and CVE summaries,” in 2020 IEEE/ACM 17th International Conference on Mining Software Repositories (MSR), 2020, pp. 508–512

  17. [17]

    MegaVul: A C/C++ vulnerability dataset with comprehensive code representations,

    C. Ni, L. Shen, X. Yang, Y. Zhu, and S. Wang, “MegaVul: A C/C++ vulnerability dataset with comprehensive code representations,” in Proceedings of the 21st International Conference on Mining Software Repositories, ser. MSR ’24. New York, NY, USA: Association for Computing Machinery, 2024, pp. 738–742. [Online]. Available: https://doi.org/10.1145/3643991.3644886

  18. [18]

    ReposVul: A repository-level high-quality vulnerability dataset,

    X. Wang, R. Hu, C. Gao, X.-C. Wen, Y. Chen, and Q. Liao, “ReposVul: A repository-level high-quality vulnerability dataset,” in Proceedings of the 2024 IEEE/ACM 46th International Conference on Software Engineering: Companion Proceedings, ser. ICSE-Companion ’24. New York, NY, USA: Association for Computing Machinery, 2024, pp. 472–483. [Online]. Available...

  19. [19]

    Deep learning based vulnerability detection: Are we there yet?

    S. Chakraborty, R. Krishna, Y. Ding, and B. Ray, “Deep learning based vulnerability detection: Are we there yet?” IEEE Transactions on Software Engineering, vol. 48, no. 9, pp. 3280–3296, Sep. 2022. [Online]. Available: https://dx.doi.org/10.1109/tse.2021.3087402

  20. [20]

    LIVABLE: Exploring long-tailed classification of software vulnerability types,

    X.-C. Wen, C. Gao, F. Luo, H. Wang, G. Li, and Q. Liao, “LIVABLE: Exploring long-tailed classification of software vulnerability types,” IEEE Transactions on Software Engineering, vol. 50, no. 6, pp. 1325–1339, Jun. 2024. [Online]. Available: https://dx.doi.org/10.1109/tse.2024.3382361

  21. [21]

    µVulDeePecker: A deep learning-based system for multiclass vulnerability detection,

    D. Zou, S. Wang, S. Xu, Z. Li, and H. Jin, “µVulDeePecker: A deep learning-based system for multiclass vulnerability detection,” IEEE Transactions on Dependable and Secure Computing, vol. 18, no. 5, pp. 2224–2236, 2021

  22. [22]

    SySeVR: A framework for using deep learning to detect software vulnerabilities,

    Z. Li, D. Zou, S. Xu, H. Jin, Y. Zhu, and Z. Chen, “SySeVR: A framework for using deep learning to detect software vulnerabilities,” IEEE Transactions on Dependable and Secure Computing, vol. 19, no. 4, pp. 2244–2258, Jul. 2022. [Online]. Available: https://dx.doi.org/10.1109/tdsc.2021.3051525

  23. [23]

    Applying contrastive learning to code vulnerability type classification,

    C. Ji, S. Yang, H. Sun, and Y. Zhang, “Applying contrastive learning to code vulnerability type classification,” in Conference on Empirical Methods in Natural Language Processing, 2024. [Online]. Available: https://api.semanticscholar.org/CorpusID:273901675

  24. [24]

    One-for-all does not work! enhancing vulnerability detection by mixture-of-experts (moe),

    X. Yang, S. Wang, J. Zhou, and W. Zhu, “One-for-all does not work! Enhancing vulnerability detection by mixture-of-experts (MoE),” Proceedings of the ACM on Software Engineering, vol. 2, no. FSE, pp. 446–464, Jun. 2025. [Online]. Available: https://dx.doi.org/10.1145/3715736

  25. [25]

    CLEVER: Multi-modal contrastive learning for vulnerability code representation,

    J. Li, L. Cui, S. Zhao, Y. Yang, L. Li, and H. Zhu, “CLEVER: Multi-modal contrastive learning for vulnerability code representation,” in Findings of the Association for Computational Linguistics: ACL 2025, 2025, pp. 7940–7951. [Online]. Available: https://dx.doi.org/10.18653/v1/2025.findings-acl.414

  26. [26]

    Vulnerability detection via multiple-graph-based code representation,

    F. Qiu, Z. Liu, X. Hu, X. Xia, G. Chen, and X. Wang, “Vulnerability detection via multiple-graph-based code representation,” IEEE Transactions on Software Engineering, vol. 50, no. 8, pp. 2178–2199, Aug. 2024. [Online]. Available: https://dx.doi.org/10.1109/tse.2024.3427815

  27. [27]

    Devign: Effective vulnerability identification by learning comprehensive program semantics via graph neural networks,

    Y. Zhou, S. Liu, J. Siow, X. Du, and Y. Liu, Devign: Effective vulnerability identification by learning comprehensive program semantics via graph neural networks. Red Hook, NY, USA: Curran Associates Inc., 2019

  28. [28]

    CodeBERT: A pre-trained model for programming and natural languages,

    Z. Feng, D. Guo, D. Tang, N. Duan, X. Feng, M. Gong, and M. Zhou, “CodeBERT: A pre-trained model for programming and natural languages,” in Findings of the Association for Computational Linguistics: EMNLP 2020, 2020, pp. 1536–1547. [Online]. Available: https://dx.doi.org/10.18653/v1/2020.findings-emnlp.139

  29. [29]

    GraphCodeBERT: Pre-training code representations with data flow,

    D. Guo, S. Ren, S. Lu, Z. Feng, D. Tang, S. Liu, L. Zhou, N. Duan, A. Svyatkovskiy, S. Fu, M. Tufano, S. K. Deng, C. B. Clement, D. Drain, N. Sundaresan, J. Yin, D. Jiang, and M. Zhou, “GraphCodeBERT: Pre-training code representations with data flow,” in ICLR. OpenReview.net, 2021. [Online]. Available: https://dblp.uni-trier.de/db/conf/iclr/iclr2021.html#...

  30. [30]

    CodeT5: Identifier-aware unified pre-trained encoder-decoder models for code understanding and generation,

    Y. Wang, W. Wang, S. Joty, and S. C. Hoi, “CodeT5: Identifier-aware unified pre-trained encoder-decoder models for code understanding and generation,” in Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021, pp. 8696–8708. [Online]. Available: https://dx.doi.org/10.18653/v1/...

  31. [31]

    UniXcoder: Unified cross-modal pre-training for code representation,

    D. Guo, S. Lu, N. Duan, Y. Wang, M. Zhou, and J. Yin, “UniXcoder: Unified cross-modal pre-training for code representation,” in Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022, pp. 7212–7225. [Online]. Available: https://dx.doi.org/10.18653/v1/2...

  32. [32]

    Does data sampling improve deep learning-based vulnerability detection? Yeas! and Nays!,

    X. Yang, S. Wang, Y. Li, and S. Wang, “Does data sampling improve deep learning-based vulnerability detection? Yeas! and Nays!” in Proceedings of the 45th International Conference on Software Engineering, ser. ICSE ’23. IEEE Press, 2023, pp. 2287–2298. [Online]. Available: https://doi.org/10.1109/ICSE48619.2023.00192

  33. [33]

    ArcFace: Additive angular margin loss for deep face recognition,

    J. Deng, J. Guo, N. Xue, and S. Zafeiriou, “ArcFace: Additive angular margin loss for deep face recognition,” in 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2019, pp. 4685–4694

  34. [34]

    Spherical text embedding,

    Y. Meng, J. Huang, G. Wang, C. Zhang, H. Zhuang, L. Kaplan, and J. Han, Spherical text embedding. Red Hook, NY, USA: Curran Associates Inc., 2019

  35. [35]

    Enhancing deep learning vulnerability detection through imbalance loss functions: An empirical study,

    Y. He, G. Lin, X. Ma, J. W. Keung, C. Tan, W. Hu, and F. Li, “Enhancing deep learning vulnerability detection through imbalance loss functions: An empirical study,” Proceedings of the 15th Asia-Pacific Symposium on Internetware, 2024. [Online]. Available: https://api.semanticscholar.org/CorpusID:271293860