ViPER: Vision-based Packing-Aware Encoder for Robust Malware Detection

Bisma Tahir; Fatima Qaiser; Muhammad Abid Mughal; Nauman Shamim

arxiv: 2606.12949 · v1 · pith:XNXCNZRLnew · submitted 2026-06-11 · 💻 cs.CR · cs.CV


Fatima Qaiser, Bisma Tahir, Muhammad Abid Mughal, Nauman Shamim

Pith reviewed 2026-06-27 06:30 UTC · model grok-4.3


keywords malware detection · byteplot images · packing detection · vision transformer · Windows PE · dual-head architecture · gating mechanism


The pith

ViPER conditions malware predictions on inferred packing state via a dual-head vision model to handle packed executables.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents ViPER as a way to make vision-based malware detection work when executables are packed, which creates high-entropy byteplot images that hide the patterns these models normally use. Because packing appears in both benign and malicious software, simply flagging packed files does not solve the problem. ViPER adds a second head to detect packing and then routes the malware decision through a gating step that applies different boundaries depending on the detected packing state. It also uses weighted losses and stratified sampling to manage the uneven distribution of packing labels. A reader would care because this offers a single supervised pipeline that keeps visual detection viable without disassembly or separate packing filters.

Core claim

ViPER builds on a LoRA-adapted ViT-B/14 backbone with a dual-head architecture that jointly learns malware classification and packing detection. A packing-aware gating mechanism conditions malware predictions on the inferred packing state, enabling distinct decision boundaries for packed and unpacked inputs. To address packing label skew during training, it employs frequency-weighted losses with stratified sampling over joint class-packing strata.

What carries the argument

The packing-aware gating mechanism that routes malware classification through the output of a parallel packing-detection head.
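
A minimal sketch of what such a gate could look like, assuming a soft mixture of two malware sub-heads weighted by the packing head's probability. The abstract does not specify the gate's functional form, so `gated_malware_probability`, its arguments, and the mixture rule are illustrative assumptions rather than the authors' implementation.

```python
import math


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def gated_malware_probability(pack_logit: float,
                              logit_if_packed: float,
                              logit_if_unpacked: float):
    """Assumed soft gate: blend two malware sub-heads by the inferred packing probability.

    pack_logit        -- output of the packing-detection head
    logit_if_packed   -- malware logit from a sub-head meant for packed inputs
    logit_if_unpacked -- malware logit from a sub-head meant for unpacked inputs
    """
    p_pack = sigmoid(pack_logit)
    p_mal = (p_pack * sigmoid(logit_if_packed)
             + (1.0 - p_pack) * sigmoid(logit_if_unpacked))
    return p_pack, p_mal


# A confident "packed" prediction routes almost entirely through the packed sub-head.
p_pack, p_mal = gated_malware_probability(50.0, 2.0, -2.0)
```

Under this form, a packing misprediction pulls the malware score toward the wrong sub-head, which is the failure mode the referee report asks the authors to measure.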

If this is right

The model reaches a balanced accuracy of 0.8521, ROC-AUC of 0.9260, and AUPR of 0.9279 on 200,000 Windows PE byteplot images.
It outperforms representative state-of-the-art baselines on all primary malware-detection metrics.
Packing detection reaches an AUC of 0.9949.
Frequency-weighted losses combined with stratified sampling over joint class-packing strata mitigate training skew.
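
One way to read the last point, sketched under assumptions: treat each (malware, packed) pair as a stratum, weight each stratum inversely to its frequency, and draw batches with an equal share per stratum. The paper's exact weighting formula and sampling scheme are not given in the abstract; the helper names, the `N / (K * count)` weighting, and the toy labels below are hypothetical.

```python
import random
from collections import Counter


def stratum(is_malware, is_packed):
    """Joint class-packing stratum, e.g. (1, 0) = unpacked malware."""
    return (int(is_malware), int(is_packed))


def inverse_frequency_weights(labels):
    """Weight each stratum by N / (K * count), so rarer strata weigh more."""
    counts = Counter(stratum(m, p) for m, p in labels)
    n, k = len(labels), len(counts)
    return {s: n / (k * c) for s, c in counts.items()}


def stratified_batch(labels, batch_size, rng):
    """Draw indices so every observed stratum fills an equal share of the batch."""
    by_stratum = {}
    for i, (m, p) in enumerate(labels):
        by_stratum.setdefault(stratum(m, p), []).append(i)
    per_stratum = max(1, batch_size // len(by_stratum))
    batch = []
    for indices in by_stratum.values():
        # Sample with replacement so small strata can still fill their share.
        batch.extend(rng.choices(indices, k=per_stratum))
    rng.shuffle(batch)
    return batch


# Toy labels (is_malware, is_packed): packed samples dominate both classes.
toy_labels = [(1, 1)] * 6 + [(1, 0)] * 2 + [(0, 1)] * 6 + [(0, 0)] * 2
weights = inverse_frequency_weights(toy_labels)
batch = stratified_batch(toy_labels, batch_size=8, rng=random.Random(0))
```

In a training loop the weight of a sample's stratum would multiply its per-sample loss, while the sampler controls how often each stratum is seen.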

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same dual-head plus gating pattern could be tested on other binary obfuscation techniques that produce high-entropy images.
Extending the approach to additional file formats would require only retraining the heads on new byteplot distributions.
Combining the packing signal with lightweight static features might further stabilize performance on edge cases.

Load-bearing premise

Packing state can be accurately inferred from byteplot images at inference time and used to condition malware predictions without introducing systematic errors on real-world packed samples whose packing labels were not seen during training.

What would settle it

A test set containing packed malware and benign samples that use packers absent from the training distribution shows malware-detection metrics falling below those of non-packing-aware baselines.
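
The split behind such a test can be sketched as a leave-packers-out partition. This assumes each sample carries a packer-family label, which the abstract does not confirm; the `packer` field, the sample schema, and the packer names in the toy data are illustrative only.

```python
def leave_packers_out(samples, held_out_packers):
    """Partition samples so that no held-out packer family appears in training.

    Each sample is a dict with a hypothetical "packer" key
    (None for unpacked files) and an "is_malware" key.
    """
    held = set(held_out_packers)
    train, test = [], []
    for sample in samples:
        (test if sample["packer"] in held else train).append(sample)
    return train, test


# Toy data; packer names are placeholders, not the paper's label set.
toy_samples = [
    {"packer": "UPX", "is_malware": 1},
    {"packer": "UPX", "is_malware": 0},
    {"packer": "Themida", "is_malware": 1},
    {"packer": "Themida", "is_malware": 0},
    {"packer": None, "is_malware": 1},
    {"packer": None, "is_malware": 0},
]
train_set, test_set = leave_packers_out(toy_samples, {"Themida"})
```

Comparing ViPER and a non-packing-aware baseline on `test_set` alone would show whether the gate helps or hurts on unseen packers.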

Figures

Figures reproduced from arXiv: 2606.12949 by Bisma Tahir, Fatima Qaiser, Muhammad Abid Mughal, Nauman Shamim.

**Figure 2.** Packing label distribution in the dataset. Left: Malware samples (76.20% packed). Right: Benign samples (73.08% packed). The near-symmetric packing prevalence across both classes confirms that packing state alone is an insufficient discriminator [PITH_FULL_IMAGE:figures/full_fig_p007_2.png]
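
The caption's conclusion can be checked with a few lines of arithmetic. The sketch below takes the two packed fractions from Figure 2 and scores a hypothetical rule that flags every packed file as malware; the rule itself is an illustration, not a baseline from the paper.

```python
def balanced_accuracy(tpr: float, tnr: float) -> float:
    """Mean of the true-positive and true-negative rates."""
    return 0.5 * (tpr + tnr)


packed_fraction_malware = 0.7620  # from Figure 2
packed_fraction_benign = 0.7308   # from Figure 2

# Hypothetical rule: predict "malware" if and only if the file is packed.
tpr = packed_fraction_malware        # malware caught = packed malware
tnr = 1.0 - packed_fraction_benign   # benign cleared = unpacked benign
packing_only = balanced_accuracy(tpr, tnr)
print(round(packing_only, 4))  # 0.5156
```

A balanced accuracy of about 0.52 is barely above chance, which is why the packing signal has to condition the malware decision rather than replace it.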

**Figure 3.** ROC curves for all four ablation configurations on the test set. The area under each curve corresponds to the AUC values reported in [PITH_FULL_IMAGE:figures/full_fig_p010_3.png]

**Figure 4.** Training curves for ViPER (configuration 4). Loss decreases smoothly for both heads across 20 epochs. Validation AUC and balanced accuracy improve consistently through epoch 18 (best checkpoint) before plateauing. [PITH_FULL_IMAGE:figures/full_fig_p010_4.png]
Qaiser et al.: Preprint submitted to Elsevier

Original abstract

Visualization-based malware detection maps raw binary bytes to grayscale images and applies learned visual classifiers, providing an evasion-resistant and disassembly-free alternative to conventional analysis pipelines. However, executable packing remains a critical failure mode: packed binaries produce high-entropy images that obscure the structural patterns these models rely on. Because packing is also prevalent in benign software (e.g., for compression or copy protection), packing state alone is not a reliable indicator of maliciousness, and existing approaches do not address this challenge within a unified supervised framework. We present ViPER, a Vision-based Packing-Aware Encoder for Robust malware detection. ViPER builds on a LoRA-adapted ViT-B/14 backbone with a dual-head architecture that jointly learns malware classification and packing detection. A packing-aware gating mechanism conditions malware predictions on the inferred packing state, enabling distinct decision boundaries for packed and unpacked inputs. To address packing label skew during training, we employ frequency-weighted losses with stratified sampling over joint class-packing strata. Evaluated on 200,000 Windows PE byteplot images, ViPER achieves a balanced accuracy of 0.8521, ROC-AUC of 0.9260, and AUPR of 0.9279, outperforming representative state-of-the-art baselines across all primary metrics, while attaining a packing detection AUC of 0.9949.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

ViPER adds a packing detection head and gating to a ViT for byteplot malware classification, with solid reported numbers, but leaves generalization to unseen packers untested.


ViPER's core idea is to jointly train malware classification and packing detection on byteplot images, then use the packing head output to gate which malware decision boundary applies. This directly targets the high-entropy problem that breaks standard visualization detectors when samples are packed.

The paper does a reasonable job on the engineering side. It uses a LoRA-adapted ViT-B/14, frequency-weighted losses, and stratified sampling over the joint malware-packing strata. On 200k Windows PE images it reports balanced accuracy 0.8521, ROC-AUC 0.9260, AUPR 0.9279, and packing detection AUC 0.9949, claiming gains over the baselines it compares against.

The soft spots are exactly where the stress-test note flags them. The abstract gives no evidence that the packing head was checked on packers absent from the training strata, and no ablation shows what happens to malware metrics when the gate receives an erroneous packing state rather than the oracle one. Without those checks, the conditioning mechanism could introduce the systematic bias the authors are trying to avoid. Baseline implementation details and statistical significance are also missing from the abstract.

This work is for people already working on visualization-based malware detection who need to handle packed executables. A reader focused on practical security ML would find the architecture and scale of the evaluation useful. It deserves a serious referee because it tackles a documented failure mode with a clear, trainable approach and reports concrete metrics on a large set.

I would send it to peer review, but the reviewers should require out-of-distribution packing tests and ablations on gate errors.

Referee Report

2 major / 2 minor

Summary. The manuscript presents ViPER, a LoRA-adapted ViT-B/14 model with a dual-head architecture for joint malware classification and packing detection on Windows PE byteplot images. It introduces a packing-aware gating mechanism that conditions the malware head on the inferred packing state, employs frequency-weighted losses and stratified sampling over joint malware/packing strata to handle skew, and reports balanced accuracy of 0.8521, ROC-AUC of 0.9260, AUPR of 0.9279, and packing detection AUC of 0.9949 on a 200,000-image dataset, outperforming representative baselines.

Significance. If the packing head generalizes and the gating mechanism operates without systematic bias on real-world inputs, the approach would address a key limitation of visualization-based malware detectors by explicitly modeling packing state rather than treating it as noise. The dual-head design with explicit conditioning is a targeted contribution to handling a prevalent failure mode in the domain.

major comments (2)

[Abstract] Abstract and evaluation description: the central robustness claim rests on the packing-aware gating mechanism, yet no results are provided for the packing head or gated malware metrics on packers absent from the training strata; the use of stratified sampling over joint class-packing strata presupposes that test-time packing label distributions match training, but no ablation comparing gated predictions against oracle packing state or on held-out packers is reported, leaving open the risk that packing mispredictions route the malware head to an incorrect boundary.
[Abstract] Abstract: the reported outperformance (balanced accuracy 0.8521, ROC-AUC 0.9260) is presented without any information on baseline implementations, hyperparameter matching, or statistical significance testing, which is load-bearing for the claim that the dual-head and gating design drives the gains rather than implementation differences.

minor comments (2)

The abstract refers to 'representative state-of-the-art baselines' without naming the methods or citing their sources; the main text should explicitly list and reference them.
Details on the exact LoRA rank, scaling factor, and loss weighting coefficients are mentioned as free parameters but not reported numerically; these should be included for reproducibility.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. We address each major comment below with clarifications on the evaluation setup and indicate planned revisions where appropriate.


Referee: [Abstract] Abstract and evaluation description: the central robustness claim rests on the packing-aware gating mechanism, yet no results are provided for the packing head or gated malware metrics on packers absent from the training strata; the use of stratified sampling over joint class-packing strata presupposes that test-time packing label distributions match training, but no ablation comparing gated predictions against oracle packing state or on held-out packers is reported, leaving open the risk that packing mispredictions route the malware head to an incorrect boundary.

Authors: We agree that the reported results rely on test data drawn from the same joint malware-packing distribution as training via stratified sampling, and we do not provide separate metrics for the packing head or gated malware classification on packers entirely absent from the training strata. The packing head reaches 0.9949 AUC within the observed packers, and the gating is trained end-to-end to condition the malware head. No oracle-gating or held-out-packer ablations appear in the manuscript because the primary focus is the joint supervised framework under realistic skew rather than explicit OOD packer evaluation. This is a genuine limitation for claims of robustness to novel packers. We will add an explicit discussion of this scope limitation in the revised manuscript and, resources permitting, include a small held-out packer experiment.
revision: partial

Referee: [Abstract] Abstract: the reported outperformance (balanced accuracy 0.8521, ROC-AUC 0.9260) is presented without any information on baseline implementations, hyperparameter matching, or statistical significance testing, which is load-bearing for the claim that the dual-head and gating design drives the gains rather than implementation differences.

Authors: The abstract is a concise summary; the full manuscript (Section 4 and Appendix) specifies that baselines were reimplemented from official repositories or papers, hyperparameters were matched to the originals where possible, and all metrics are reported as means over five random seeds with standard deviations. Statistical significance between ViPER and baselines was evaluated with paired t-tests on the per-seed scores. We will revise the abstract to include a brief parenthetical reference to these evaluation details or move the key implementation notes earlier in the evaluation section for clarity.
revision: yes
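
For readers unfamiliar with the test named in this response, a paired t-statistic over per-seed scores can be computed as sketched below. The function name and the five scores per model are placeholders chosen for illustration; they are not results from the paper.

```python
import math
import statistics


def paired_t_statistic(scores_a, scores_b):
    """Return (t, degrees_of_freedom) for paired per-seed scores.

    t = mean(d) / (stdev(d) / sqrt(n)), where d are the per-seed differences.
    Raises ZeroDivisionError if all differences are identical.
    """
    if len(scores_a) != len(scores_b) or len(scores_a) < 2:
        raise ValueError("need two equal-length lists with at least two seeds")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    mean_diff = statistics.mean(diffs)
    std_err = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return mean_diff / std_err, len(diffs) - 1


# Placeholder scores for two models over five seeds (illustrative only).
model_a = [0.80, 0.82, 0.81, 0.83, 0.79]
model_b = [0.78, 0.79, 0.80, 0.80, 0.77]
t_stat, dof = paired_t_statistic(model_a, model_b)
```

The statistic is then compared against a Student t distribution with `dof` degrees of freedom; with only five seeds the test has limited power, so reporting the per-seed scores alongside it is the more informative choice.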

Circularity Check

0 steps flagged

No circularity: empirical ML results with no self-referential derivations


The paper describes a standard dual-head ViT model with gating and weighted losses, then reports held-out test metrics (balanced accuracy 0.8521, AUCs 0.9260/0.9279/0.9949) on 200k byteplot images. No equations, predictions, or first-principles claims reduce by construction to fitted parameters on the same data; the evaluation is a conventional train/test split with no load-bearing self-citation or ansatz smuggling. The skeptic concern about unseen packers is an external generalization issue, not a circularity in the reported derivation.

Axiom & Free-Parameter Ledger

2 free parameters · 2 axioms · 0 invented entities

The central claim rests on the assumption that byteplot images retain sufficient structural signal for both malware and packing classification, that packing labels exist or can be obtained for training, and that standard supervised learning assumptions hold for the joint task.

free parameters (2)

LoRA rank and scaling
Hyperparameters controlling the low-rank adaptation of the ViT backbone; chosen to enable efficient fine-tuning.
loss weighting coefficients
Scalars in the frequency-weighted losses that balance the joint malware and packing objectives.
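
For orientation, the two free parameters enter the model roughly as follows. The first line is the standard LoRA reparameterization from Hu et al.; the second is an assumed weighted-sum form of the joint objective, with the symbols \lambda_{\mathrm{mal}} and \lambda_{\mathrm{pack}} introduced here as hypothetical names because the review does not give the paper's notation.

```latex
% LoRA: frozen pretrained weight W_0 plus a low-rank update scaled by alpha / r
W = W_0 + \frac{\alpha}{r} B A, \qquad
B \in \mathbb{R}^{d \times r},\quad A \in \mathbb{R}^{r \times k},\quad r \ll \min(d, k)

% Assumed joint objective over the malware and packing heads
\mathcal{L} = \lambda_{\mathrm{mal}}\, \mathcal{L}_{\mathrm{mal}}
            + \lambda_{\mathrm{pack}}\, \mathcal{L}_{\mathrm{pack}}
```

Here r is the LoRA rank and \alpha the scaling factor; the numeric values of all four quantities are what the referee's second minor comment asks the authors to report.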

axioms (2)

domain assumption: Byteplot visualization preserves discriminative structural patterns for both malware classification and packing detection
Invoked by the entire visualization-based pipeline and required for the dual-head approach to succeed.
domain assumption: Packing state labels are available or accurately obtainable for the training distribution
Required for the stratified sampling and frequency-weighted losses described in the abstract.


