Gray-Box Poisoning of Continuous Malware Ingestion Pipelines
Pith reviewed 2026-05-08 18:08 UTC · model grok-4.3
The pith
Subtle IAT modifications to malware binaries enable poisoning of continuous detection pipelines, lowering model recall.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Using functionality-preserving IAT and section injections generated via the secml_malware framework, the authors produce adversarial binaries that, once ingested into the training set, cause a LightGBM-based malware detector to miss a significantly higher fraction of threats. The same samples remain compact and retain malicious behavior. The paper further shows that a homogeneous ensemble defense identifies and removes up to 95.6 percent of the poisoning attempts while preserving a high retention rate for clean data, thereby demonstrating both the vulnerability and a practical mitigation for continuous ingestion pipelines.
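To make the claimed mechanism concrete, here is a minimal sketch of the ingest-then-retrain measurement, assuming pre-extracted static feature vectors (e.g., EMBER-style) held as NumPy arrays and a mislabeled-benign poisoning formulation; the budget, split, and labeling are illustrative stand-ins, not the paper's reported setup.

```python
# Minimal sketch: measure recall degradation after injecting poisoned
# samples into a LightGBM training set. Assumes X_clean/y_clean and
# X_poison (feature vectors of the adversarial binaries) are NumPy
# arrays of already-extracted static features. Budget and split are
# illustrative, not the paper's.
import numpy as np
import lightgbm as lgb
from sklearn.model_selection import train_test_split
from sklearn.metrics import recall_score

def recall_under_poisoning(X_clean, y_clean, X_poison, budget=0.01, seed=0):
    """Train with and without a poisoning budget; return both recalls."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        X_clean, y_clean, test_size=0.3, random_state=seed, stratify=y_clean)

    # Baseline model trained on clean data only.
    base = lgb.LGBMClassifier(n_estimators=400, random_state=seed)
    base.fit(X_tr, y_tr)
    r_clean = recall_score(y_te, base.predict(X_te))

    # Inject poisons: adversarial malware mislabeled as benign (label 0),
    # capped at `budget` as a fraction of the clean training set. (One
    # common formulation; the paper's exact labeling may differ.)
    n_poison = min(len(X_poison), int(budget * len(X_tr)))
    X_p = np.vstack([X_tr, X_poison[:n_poison]])
    y_p = np.concatenate([y_tr, np.zeros(n_poison, dtype=int)])

    poisoned = lgb.LGBMClassifier(n_estimators=400, random_state=seed)
    poisoned.fit(X_p, y_p)
    r_poisoned = recall_score(y_te, poisoned.predict(X_te))
    return r_clean, r_poisoned
```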
What carries the argument
IAT-based functionality-preserving perturbations that generate compact poisoning samples for gray-box attacks on continuous malware ingestion pipelines.
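As an illustration of why an IAT injection can be functionality-preserving, the sketch below appends an import the program never calls, using LIEF rather than the paper's secml_malware tooling; the file paths and the injected library/function are hypothetical, and the Builder calls follow the LIEF 0.12-era Python bindings (newer releases changed the Builder interface).

```python
# Illustrative IAT injection with LIEF (not the paper's secml_malware
# code): appending imports the program never calls perturbs IAT-derived
# features while leaving runtime behavior unchanged, since the loader
# resolves the entries but no code path invokes them.
import lief

binary = lief.parse("sample.exe")   # input PE (hypothetical path)

# Add a new import descriptor plus one function the code never calls.
lib = binary.add_library("user32.dll")
lib.add_entry("MessageBoxA")

builder = lief.PE.Builder(binary)
builder.build_imports(True)         # rebuild the import table
builder.build()
builder.write("sample_iat.exe")     # perturbed, behavior-preserving
```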
If this is right
- Continuous ingestion pipelines for malware detection remain open to gray-box poisoning via small binary manipulations.
- IAT perturbations produce compact samples that can measurably reduce recall in models such as LightGBM.
- A homogeneous ensemble defense can filter up to 95.6 percent of poisoning attempts while keeping most legitimate samples.
- Production systems require robust pre-ingestion validation to limit the impact of such attacks.
Where Pith is reading between the lines
- The same ingestion vulnerability could appear in any continuous-learning security system that retrains on incoming files without strong checks.
- Defenders may need pipeline-level filters that combine multiple signals beyond ensemble disagreement.
- Testing the attack against models other than LightGBM would clarify how broadly the IAT method applies.
Load-bearing premise
The generated adversarial binaries will be ingested into the defender's training set without prior filtering or validation in a realistic continuous pipeline.
What would settle it
Running the same poisoned binaries through an actual production malware ingestion pipeline that applies standard validation or sandbox checks and measuring whether recall still drops or whether the ensemble filter removes them before training.
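A minimal sketch of the kind of pre-ingestion gate such an experiment would exercise, assuming simple size, entropy, and PE-header checks; the thresholds are illustrative placeholders rather than values from the paper.

```python
# Sketch of a pre-ingestion gate: reject candidates that fail PE
# parsing or fall outside size/entropy thresholds before they can
# reach the training set. Thresholds are illustrative placeholders.
import math
import os
import pefile

MAX_SIZE = 32 * 1024 * 1024   # hypothetical 32 MiB cap
ENTROPY_RANGE = (1.0, 7.5)    # hypothetical bounds; packed files skew high

def shannon_entropy(data: bytes) -> float:
    """Byte-level Shannon entropy in bits (0.0 to 8.0)."""
    if not data:
        return 0.0
    counts = [0] * 256
    for b in data:
        counts[b] += 1
    n = len(data)
    return -sum(c / n * math.log2(c / n) for c in counts if c)

def passes_pre_ingestion_checks(path: str) -> bool:
    if os.path.getsize(path) > MAX_SIZE:
        return False
    with open(path, "rb") as f:
        data = f.read()
    lo, hi = ENTROPY_RANGE
    if not lo <= shannon_entropy(data) <= hi:
        return False
    try:
        pefile.PE(data=data, fast_load=True)  # header sanity check
    except pefile.PEFormatError:
        return False
    return True
```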
Original abstract
Modern malware detection pipelines rely on continuous data ingestion and machine learning to counter the high volume of novel threats. This work investigates a realistic gray-box poisoning threat model targeting these pipelines. Using the secml_malware framework, we generate problem-space adversarial binaries through functionality-preserving manipulations, specifically Import Address Table (IAT) and section injections. We evaluate the impact of these poisoned samples when ingested into a defender's training set for a LightGBM malware detection model. Our empirical results demonstrate that subtle IAT-based perturbations enable compact poisoning samples that significantly degrade detection recall. These findings illustrate the inherent challenge of developing low-visibility adversarial perturbations that maintain high poisoning efficacy within continuous learning systems. We further evaluate a defense mechanism based on a homogeneous ensemble, which successfully identifies and filters up to 95.6% of poisoning attempts while maintaining a high retention rate for legitimate data. These findings emphasize the necessity of robust pre-ingestion validation in production pipelines.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. This manuscript explores a gray-box poisoning attack against continuous malware ingestion pipelines. The authors employ the secml_malware framework to craft functionality-preserving adversarial malware binaries using Import Address Table (IAT) and section injections. These samples are evaluated for their ability to degrade the performance of a LightGBM malware classifier when incorporated into its training data. The work also proposes and tests a homogeneous ensemble-based defense that filters a high percentage (95.6%) of poisoning attempts while preserving legitimate samples. The central finding is that subtle IAT perturbations can produce effective, low-visibility poisons in such dynamic systems.
Significance. If the empirical results hold under more complete experimental conditions, the work would be significant for demonstrating practical poisoning risks in continuously updating malware detection pipelines, which are common in production. It earns credit for using an established problem-space attack framework (secml_malware) that preserves binary functionality and for evaluating a concrete ensemble defense. The focus on low-visibility IAT perturbations and continuous ingestion is timely. However, the absence of quantitative recall metrics, baselines, and pipeline realism currently limits the strength of the contribution.
Major comments (3)
- [Section 5, Experimental Evaluation] The central claim that IAT-based perturbations significantly degrade LightGBM recall requires that the generated adversarial binaries actually enter the training set. The evaluation measures impact only 'when ingested' and does not simulate or test realistic pre-ingestion pipeline steps such as PE header validation, size/entropy thresholds, preliminary static checks, or rate limiting. This assumption is load-bearing for whether the reported recall degradation materializes in practice.
- [Section 5] The abstract states that the perturbations 'significantly degrade detection recall,' yet the manuscript provides no quantitative recall-drop values, unpoisoned baseline comparisons, poisoning budget (e.g., fraction of the training set), data splits, or error bars. Without these, the magnitude and statistical reliability of the attack cannot be assessed.
- [Section 6, Defense] The homogeneous ensemble is reported to filter up to 95.6% of poisoning attempts, but the paper supplies insufficient detail on ensemble construction, the features used for filtering, the retention rate on clean data, and false-positive rates on legitimate samples. This prevents evaluation of the defense's practical trade-offs and reproducibility.
Minor comments (2)
- [Abstract] The abstract claims a 'high retention rate for legitimate data' without providing the specific percentage or supporting table; adding this number would improve clarity.
- [Threat Model] Notation for 'gray-box' access (model vs. data knowledge) could be defined more explicitly in the threat model section to avoid ambiguity.
Simulated Author's Rebuttal
We thank the referee for the detailed and constructive review. We have revised the manuscript to address the concerns about experimental assumptions, missing quantitative details, and insufficient defense specifications. Our responses to each major comment are provided below.
Point-by-point responses
Referee: [Section 5, Experimental Evaluation] The central claim that IAT-based perturbations significantly degrade LightGBM recall requires that the generated adversarial binaries actually enter the training set. The evaluation measures impact only 'when ingested' and does not simulate or test realistic pre-ingestion pipeline steps such as PE header validation, size/entropy thresholds, preliminary static checks, or rate limiting. This assumption is load-bearing for whether the reported recall degradation materializes in practice.
Authors: We agree that the assumption of successful ingestion is central and that a full pipeline simulation would strengthen the claims. Our threat model focuses on the poisoning effect once samples reach the training set, as IAT injections preserve valid PE headers, functionality, and low-entropy profiles that commonly pass basic static filters. In the revision we have added a dedicated paragraph in Section 5 analyzing why these perturbations are likely to evade typical pre-ingestion checks (PE validation, entropy thresholds) and a limitations subsection noting that comprehensive end-to-end simulation of proprietary pipelines is beyond the current scope but would be valuable future work. revision: partial
Referee: [Section 5] The abstract states that the perturbations 'significantly degrade detection recall,' yet the manuscript provides no quantitative recall-drop values, unpoisoned baseline comparisons, poisoning budget (e.g., fraction of the training set), data splits, or error bars. Without these, the magnitude and statistical reliability of the attack cannot be assessed.
Authors: We acknowledge the omission of these metrics in the original submission. The revised Section 5 now includes explicit recall degradation figures relative to unpoisoned baselines, the poisoning budgets tested (as fractions of the training set), the train/test splits employed, and error bars computed over multiple independent runs. These additions allow direct assessment of the attack's magnitude and reliability. revision: yes
Referee: [Section 6, Defense] The homogeneous ensemble is reported to filter up to 95.6% of poisoning attempts, but the paper supplies insufficient detail on ensemble construction, the features used for filtering, the retention rate on clean data, and false-positive rates on legitimate samples. This prevents evaluation of the defense's practical trade-offs and reproducibility.
Authors: We thank the referee for this observation. The revised Section 6 now provides the requested details: ensemble construction (multiple LightGBM models with bootstrapped training subsets and varied random seeds), the specific features used for disagreement-based filtering, retention rates on clean legitimate samples, and false-positive rates on benign binaries. These additions enable evaluation of the defense's practical trade-offs and support reproducibility. revision: yes
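Taking the rebuttal's description at face value, a disagreement filter of this shape might look as follows; the ensemble size, probability-spread statistic, and threshold are assumptions for illustration, not the authors' tuned configuration.

```python
# Hedged sketch of the disagreement filter the rebuttal describes:
# several LightGBM models trained on bootstrapped subsets with varied
# seeds; incoming samples whose predicted-probability spread exceeds a
# threshold are quarantined before retraining. Values are illustrative.
import numpy as np
import lightgbm as lgb

def train_homogeneous_ensemble(X, y, n_models=5, seed=0):
    rng = np.random.default_rng(seed)
    models = []
    for i in range(n_models):
        idx = rng.choice(len(X), size=len(X), replace=True)  # bootstrap
        m = lgb.LGBMClassifier(n_estimators=300, random_state=seed + i)
        m.fit(X[idx], y[idx])
        models.append(m)
    return models

def filter_by_disagreement(models, X_incoming, threshold=0.25):
    """Return a boolean mask of samples to keep (low ensemble spread)."""
    probs = np.stack([m.predict_proba(X_incoming)[:, 1] for m in models])
    spread = probs.max(axis=0) - probs.min(axis=0)
    return spread <= threshold
```

Samples flagged by the mask would be held out of the next retraining batch, trading a small loss of legitimate data (the retention rate) against the reported 95.6% poison-filtering rate.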
Circularity Check
No circularity: purely empirical evaluation with no derivations or self-referential fits
Full rationale
The paper is an empirical study that generates adversarial binaries via an external framework (secml_malware), injects IAT/section perturbations, ingests them into a LightGBM training set, and measures recall degradation plus an ensemble defense. No equations, parameter fits, uniqueness theorems, or derivation chains are present. Claims rest on experimental outcomes rather than any reduction to author-defined inputs or self-citations. The load-bearing assumption about ingestion is a modeling choice, not a circular derivation.
Axiom & Free-Parameter Ledger
Reference graph
Works this paper leans on
- [1]
- [2] Aryal, K., Gupta, M., Abdelsalam, M.: Analysis of label-flip poisoning attack on machine learning based malware detector (2023), https://arxiv.org/abs/2301.01044
- [3]
- [4] Demetrio, L., Biggio, B.: secml-malware: A Python library for adversarial robustness evaluation of Windows malware classifiers (2021)
- [5] Demetrio, L., Biggio, B., Lagorio, G., Roli, F., Armando, A.: Functionality-preserving black-box optimization of adversarial Windows malware. IEEE Transactions on Information Forensics and Security (2021)
- [6] Demetrio, L., Coull, S.E., Biggio, B., Lagorio, G., Armando, A., Roli, F.: Adversarial EXEmples: A survey and experimental evaluation of practical attacks on machine learning for Windows malware detection. ACM Transactions on Privacy and Security (2021)
- [7] Huntress: Malware Statistics You Can't Ignore. Huntress Cybersecurity Guide (2025), https://www.huntress.com/malware-guide/malware-statistics, accessed: 2026-04-29
- [8] Joyce, R.J., Miller, G., Roth, P., Zak, R., Zaresky-Williams, E., Anderson, H., Raff, E., Holt, J.: Ember2024 - a benchmark dataset for holistic evaluation of malware classifiers. In: Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining V.2, pp. 5516–5526. KDD '25, ACM (Aug 2025). https://doi.org/10.1145/3711896.3737431, ...
- [9] Lester, M.: PE Malware Machine Learning Dataset. Practical Security Analytics (June 2021), https://practicalsecurityanalytics.com/pe-malware-machine-learning-dataset/, accessed: 2026-02-02
- [10] Oliver, J., Cheng, C., Chen, Y.: TLSH - a locality sensitive hash. In: 2013 Fourth Cybercrime and Trustworthy Computing Workshop, pp. 7–13. IEEE (2013)
- [11] Pierazzi, F., Pendlebury, F., Cortellazzi, J., Cavallaro, L.: Intriguing properties of adversarial ML attacks in the problem space. In: 2020 IEEE Symposium on Security and Privacy (SP), pp. 1332–1349. IEEE (2020)
- [12] Sasaki, S., Hidano, S., Uchibayashi, T., Suganuma, T., Hiji, M., Kiyomoto, S.: On embedding backdoor in malware detectors using machine learning. In: 2019 17th International Conference on Privacy, Security and Trust (PST), pp. 1–5 (2019). https://doi.org/10.1109/PST47121.2019.8949034
- [13] Severi, G., Meyer, J., Coull, S., Oprea, A.: Explanation-guided backdoor poisoning attacks against malware classifiers. In: 30th USENIX Security Symposium (USENIX Security 21), pp. 1487–1504. USENIX Association (Aug 2021), https://www.usenix.org/conference/usenixsecurity21/presentation/severi
- [14] Song, W., Li, X., Afroz, S., Garg, D., Kuznetsov, D., Yin, H.: Automatic generation of adversarial examples for interpreting malware classifiers. CoRR abs/2003.03100 (2020), https://arxiv.org/abs/2003.03100
- [15] Ucci, D., Aniello, L., Baldoni, R.: Survey of machine learning techniques for malware analysis. Computers & Security 81, 123–147 (2019)
- [16] Vassilev, A., Oprea, A., Fordyce, A., Anderson, H., Davies, X., Hamin, M.: Adversarial machine learning: A taxonomy and terminology of attacks and mitigations (2025). https://doi.org/10.6028/NIST.AI.100-2e2025
- [17] Vaya, C., Sen, B.: PEsidious: Malware Mutation Using Reinforcement Learning and Generative Adversarial Networks. https://github.com/CyberForce/Pesidious (2020), accessed: 2026-04-29