
arxiv: 2606.03714 · v1 · pith:DSK3MLNZnew · submitted 2026-06-02 · 💻 cs.CR

Don't Trust Us: A privacy-by-design android malware detection pipeline

Pith reviewed 2026-06-28 09:31 UTC · model grok-4.3

classification 💻 cs.CR
keywords android malware · privacy by design · static analysis · dynamic analysis · SVM classifier · sandbox · APK · malware detection

The pith

Android malware detection achieves strong performance without accessing any sensitive user data through a privacy-by-design pipeline of static analysis and conditional sandboxed dynamic checks.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper argues that malware detection should avoid sensitive data entirely rather than manage it after collection. It implements this by extracting features from APKs via static analysis in the Drebin style, classifying them with an SVM using dual-reject thresholds, and sending only uncertain cases to a sandbox where dynamic analysis extracts no user information. On a temporally split 2024-2025 dataset the static stage alone reaches an F1 score of 0.87 while deferring just 6.7 percent of samples. The approach demonstrates that privacy can be ensured by never collecting the data instead of anonymizing or encrypting it later. This removes the need for users to trust the system with privileged access to their devices.

Core claim

The pipeline performs static analysis on each APK to extract features, vectorizes them, and applies an SVM equipped with dual-reject thresholds that either makes a confident classification or defers the sample to sandboxed dynamic analysis. The dynamic stage operates without extracting sensitive data or device identifiers. On the test set this yields an F1 score of 0.87 with only 6.7% of samples deferred, confirming that effective detection does not require access to user information.
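
The commit-or-defer rule described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the weights, bias, threshold values, feature strings, and the function names `svm_score` and `dual_reject` are all hypothetical, and the convention that higher scores mean "more likely malicious" is an assumption.

```python
from typing import Dict, Set


def svm_score(features: Set[str], weights: Dict[str, float], bias: float) -> float:
    """Linear SVM decision value w.x + b for a binary feature vector given as a set."""
    return sum(weights.get(f, 0.0) for f in features) + bias


def dual_reject(score: float, t_low: float, t_high: float) -> str:
    """Commit when score <= t_low or score >= t_high; otherwise defer."""
    if t_low > t_high:
        raise ValueError("t_low must not exceed t_high")
    if score >= t_high:
        return "malicious"
    if score <= t_low:
        return "benign"
    return "defer"  # uncertain: route to sandboxed dynamic analysis


# Hypothetical model parameters and thresholds, for illustration only.
WEIGHTS = {
    "permission::SEND_SMS": 1.2,
    "api_call::getDeviceId": 0.8,
    "activity::MainActivity": -0.3,
}
BIAS = -0.5
T_LOW, T_HIGH = -0.5, 1.0

apk_a = {"permission::SEND_SMS", "api_call::getDeviceId"}
apk_b = {"activity::MainActivity"}
apk_c = {"permission::SEND_SMS", "activity::MainActivity"}

for name, feats in [("a", apk_a), ("b", apk_b), ("c", apk_c)]:
    print(name, dual_reject(svm_score(feats, WEIGHTS, BIAS), T_LOW, T_HIGH))
```

With these made-up numbers, sample `a` is committed as malicious, `b` as benign, and `c` falls between the thresholds and is deferred. Widening the gap between `T_LOW` and `T_HIGH` trades a higher deferral rate for fewer confident mistakes.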

What carries the argument

The dual-reject threshold rule applied to the SVM classifier after static feature extraction from APKs, which routes uncertain samples to a sandboxed dynamic analysis stage that collects no genuine user data.

If this is right

  • Over 93% of applications can be classified as malicious or benign using only static features extracted from the APK file itself.
  • Strong detection performance remains possible even when no device identifiers, network artifacts, or runtime traces are collected.
  • Sandboxed dynamic analysis can still provide high-confidence maliciousness recognition without compromising user privacy.
  • The requirement for user trust in data handling is eliminated because no sensitive data enters the pipeline at any stage.
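
The "over 93%" figure in the first bullet follows directly from the reported 6.7% deferral rate. A small sketch of the arithmetic is below; the test-set size of 10,000 is a hypothetical value chosen only to make the counts concrete, since the abstract does not report the dataset size.

```python
DEFERRAL_RATE = 0.067  # reported in the abstract


def static_coverage(deferral_rate: float) -> float:
    """Fraction of samples decided by the static stage alone."""
    return 1.0 - deferral_rate


def expected_deferred(n_samples: int, deferral_rate: float) -> int:
    """Expected number of samples sent on to dynamic analysis."""
    return round(n_samples * deferral_rate)


print(f"static coverage: {static_coverage(DEFERRAL_RATE):.1%}")
# Hypothetical test-set size, not taken from the paper.
print("deferred out of 10,000:", expected_deferred(10_000, DEFERRAL_RATE))
```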

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Similar pipelines could be adapted for other mobile operating systems whose application packages can be analyzed statically, as APKs can.
  • Improving the static classifier to reduce the deferral rate below 6.7% would further minimize any dynamic analysis overhead.
  • Because the dataset is split temporally, the reported scores are more likely to reflect performance on evolving malware than scores from a random split would be.

Load-bearing premise

The sandboxed dynamic analysis can be performed without ever extracting or requiring genuine user information or device identifiers.

What would settle it

Observing that the dynamic sandbox stage requires access to device identifiers or user data to reach the reported high-confidence detection would show the privacy claim does not hold.

Original abstract

Android malware detection increasingly relies on collecting and processing sensitive user data, including device identifiers, network artifacts, and runtime traces, while privacy is too often treated as a secondary concern. Existing privacy-aware approaches typically enforce privacy after data collection, for example, through anonymization, encryption, or federated learning, yet still require access to user information and therefore demand a high level of user trust in systems that already operate with privileged access to device activity. We argue that this requirement should be removed rather than managed. Android malware detection should be privacy-aware by design, so that effective analysis does not depend on sensitive data being accessed in the first place. To this end, we first formalize a set of design requirements for privacy-by-design detection and then implement each requirement in a comprehensive pipeline. First, static analysis is performed to extract relevant data from each APK, following the Drebin representation, which is then submitted to an SVM after vectorization. The model is equipped with a dual-reject threshold rule that either commits to a confident decision or defers uncertain samples to a dynamic analysis stage within a sandboxed environment, so that genuine user information never enters the analysis loop. Results confirm that, on a temporally split dataset spanning from 2024 to 2025, the pipeline achieves an F1 score of 0.87 with the first static analysis stage, deferring only 6.7% of test samples to secondary dynamic analysis. Additionally, dynamic sandboxing helps recognize applications' maliciousness with high confidence without extracting any sensitive data. These results demonstrate that strong detection performance is achievable without sacrificing user privacy.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The manuscript proposes a privacy-by-design Android malware detection pipeline that performs static analysis on APKs using the Drebin feature representation, vectorizes the features, and classifies them with an SVM equipped with dual-reject thresholds. Samples falling between the thresholds (reported at 6.7% on the test set) are deferred to a sandboxed dynamic analysis stage that the authors claim extracts no sensitive user data or device identifiers. On a temporally split dataset spanning 2024–2025 the pipeline reports an F1 score of 0.87 while asserting that genuine user information never enters the analysis loop.

Significance. If the privacy guarantee for the dynamic stage can be substantiated, the work would be significant because it attempts to remove the need for user trust in data collection rather than managing privacy after collection. The use of a temporal split and the dual-reject mechanism are positive design choices that align with realistic deployment constraints.

major comments (1)
  1. [Abstract] The central claim that the sandboxed dynamic analysis stage extracts no sensitive data or device identifiers is asserted without any enumeration of the runtime features collected, description of sandbox instrumentation, or argument showing why the chosen features cannot embed user information (e.g., network flows, file paths, or process lists). This assertion is load-bearing for the privacy-by-design guarantee.
minor comments (2)
  1. [Abstract] The reported F1 score of 0.87 and 6.7% deferral rate are given without dataset size, baseline comparisons, error bars, or implementation details, making it impossible to assess whether the numbers support the performance claim.
  2. The manuscript should provide the exact definition and selection procedure for the dual-reject thresholds, including any hyper-parameter values or cross-validation used.
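
To make the second minor comment concrete, one possible selection procedure for the two thresholds is sketched below. This is an assumption for illustration, not the authors' method, which is not visible from the abstract: it searches candidate thresholds on held-out validation scores and keeps the pair that defers the fewest samples while keeping the error rate on committed samples at or below a target. The function name `select_thresholds`, the label convention (1 = malicious, 0 = benign), and the toy scores are hypothetical.

```python
from itertools import combinations_with_replacement
from typing import List, Optional, Tuple


def select_thresholds(
    scores: List[float], labels: List[int], max_error: float
) -> Optional[Tuple[float, float]]:
    """Return (t_low, t_high) minimizing deferrals subject to an error budget.

    Exhaustive search over observed scores; cubic cost, so illustration only.
    """
    candidates = sorted(set(scores))
    best: Optional[Tuple[float, float]] = None
    fewest_deferred = len(scores) + 1
    for t_low, t_high in combinations_with_replacement(candidates, 2):
        committed = 0
        errors = 0
        for score, label in zip(scores, labels):
            if score >= t_high:  # committed as malicious
                committed += 1
                errors += int(label == 0)
            elif score <= t_low:  # committed as benign
                committed += 1
                errors += int(label == 1)
        if committed == 0 or errors / committed > max_error:
            continue
        deferred = len(scores) - committed
        if deferred < fewest_deferred:
            best, fewest_deferred = (t_low, t_high), deferred
    return best


# Toy validation scores and labels, invented for this example.
val_scores = [-2.0, -1.5, -0.4, -0.1, 0.2, 0.3, 1.1, 1.8]
val_labels = [0, 0, 0, 1, 0, 1, 1, 1]
print(select_thresholds(val_scores, val_labels, max_error=0.0))
```

On the toy data, a zero-error budget forces the two overlapping samples (scores -0.1 and 0.2) into the deferral band. Reporting the budget and the validation split used would address the referee's request.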

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the detailed and constructive review. The comment on substantiating the privacy claims for the dynamic analysis stage is well-taken and directly addresses a load-bearing aspect of the privacy-by-design argument. We respond point-by-point below.

  1. Referee: [Abstract] The central claim that the sandboxed dynamic analysis stage extracts no sensitive data or device identifiers is asserted without any enumeration of the runtime features collected, description of sandbox instrumentation, or argument showing why the chosen features cannot embed user information (e.g., network flows, file paths, or process lists). This assertion is load-bearing for the privacy-by-design guarantee.

    Authors: We agree that the abstract (and supporting sections) must provide explicit support for this claim rather than asserting it. In the revised manuscript we will: (1) enumerate the exact runtime features collected inside the sandbox (restricted to generic system-call sequences, memory-access patterns, and CPU usage signatures that contain no user identifiers, file paths tied to personal data, or network payloads); (2) describe the sandbox instrumentation (an isolated, non-rooted Android emulator with no access to device accounts, contacts, or external storage containing user files); and (3) add a short argument explaining why these features cannot embed user information (they are deliberately filtered at collection time to exclude any artifact that could be linked to a specific user or device). These additions will appear in both the abstract and the methods section describing the dynamic stage. We view this as a necessary clarification rather than a change in the underlying design.

    Revision: yes
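
The "filtered at collection time" idea in the simulated rebuttal can be illustrated with an allowlist applied to each runtime event before it is stored. This is a hedged sketch only: the field names, the `ALLOWED_FIELDS` set, and the event layout are hypothetical and do not describe the paper's actual sandbox instrumentation.

```python
from typing import Any, Dict, Iterable, List

# Hypothetical allowlist: only these fields ever leave the sandbox.
ALLOWED_FIELDS = frozenset({"syscall", "cpu_pct", "mem_pages"})


def sanitize_event(event: Dict[str, Any]) -> Dict[str, Any]:
    """Keep allowlisted fields; any other field is dropped by default."""
    return {k: v for k, v in event.items() if k in ALLOWED_FIELDS}


def sanitize_trace(events: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Apply the allowlist to every event in a runtime trace."""
    return [sanitize_event(e) for e in events]


# Invented raw event, as an instrumented emulator might emit it.
raw = {
    "syscall": "openat",
    "path": "/sdcard/DCIM/photo.jpg",  # could reveal user files: dropped
    "device_id": "emulator-5554",      # identifier: dropped
    "cpu_pct": 3.5,
}
print(sanitize_event(raw))
```

An allowlist is used here rather than a denylist so that fields nobody anticipated are excluded by default. It does not by itself show that the retained values carry no user information, which is the argument the referee asks the authors to supply.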

Circularity Check

0 steps flagged

No circularity; empirical pipeline evaluated on held-out temporal split


The manuscript presents a privacy-by-design pipeline consisting of static APK analysis (Drebin features), SVM classification with dual-reject thresholds, and optional sandboxed dynamic analysis. Reported performance (F1 0.87, 6.7% deferral) is measured directly on a temporally split 2024-2025 dataset rather than derived from any fitted parameter or self-referential equation. No equations, derivations, or load-bearing self-citations appear. The privacy assertion (sandbox extracts no sensitive data) is a design claim, not a mathematical reduction to inputs. The work is therefore self-contained against external benchmarks and receives the default non-circularity finding.
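
For readers unfamiliar with temporal evaluation, the held-out split mentioned above can be sketched as follows. The cutoff date, the `Sample` record, and its field names are assumptions made for this example; the abstract states only that the dataset spans 2024 to 2025.

```python
from datetime import date
from typing import List, NamedTuple, Tuple


class Sample(NamedTuple):
    sha256: str       # placeholder identifier for the APK
    first_seen: date  # timestamp used to order samples
    label: int        # 1 = malicious, 0 = benign


def temporal_split(
    samples: List[Sample], cutoff: date
) -> Tuple[List[Sample], List[Sample]]:
    """Train on samples strictly before the cutoff, test on the rest."""
    train = [s for s in samples if s.first_seen < cutoff]
    test = [s for s in samples if s.first_seen >= cutoff]
    return train, test


# Invented records; the cutoff of 2025-01-01 is hypothetical.
samples = [
    Sample("a", date(2024, 3, 10), 0),
    Sample("b", date(2024, 11, 2), 1),
    Sample("c", date(2025, 2, 14), 1),
]
train, test = temporal_split(samples, cutoff=date(2025, 1, 1))
print(len(train), len(test))
```

Unlike a random split, this guarantees that no test sample predates a training sample, so the classifier is never evaluated on malware older than what it was trained on.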

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

Review performed on abstract only; full methods, assumptions, and any fitted thresholds are not visible.

free parameters (1)
  • dual-reject thresholds
    The two decision thresholds that determine when to commit versus defer are almost certainly chosen or tuned on data.
axioms (1)
  • domain assumption: Drebin static features extracted from APKs are sufficient to produce high-confidence decisions for the majority of samples
    The pipeline's first stage and low deferral rate rest on this representation working well without runtime or user data.
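
The Drebin representation referred to in this axiom embeds each app as a sparse binary vector over string features such as requested permissions, API calls, and network addresses. The sketch below shows that embedding in plain Python under stated assumptions: the feature strings and their `category::value` naming are hypothetical, and a real pipeline would extract them from the APK manifest and bytecode rather than hard-code them.

```python
from typing import Dict, Iterable, List, Set


def build_vocabulary(apps: Iterable[Set[str]]) -> Dict[str, int]:
    """Map every feature string seen in training to a fixed vector index."""
    all_features = sorted(set().union(*apps))
    return {feature: index for index, feature in enumerate(all_features)}


def vectorize(features: Set[str], vocab: Dict[str, int]) -> List[int]:
    """Binary embedding: 1 if the app has the feature, else 0.

    Features absent from the training vocabulary are ignored.
    """
    vector = [0] * len(vocab)
    for feature in features:
        index = vocab.get(feature)
        if index is not None:
            vector[index] = 1
    return vector


# Invented training apps described by hypothetical feature strings.
train_apps = [
    {"permission::INTERNET", "api_call::sendTextMessage"},
    {"permission::INTERNET", "url::example.com"},
]
vocab = build_vocabulary(train_apps)
new_app = {"permission::INTERNET", "intent::BOOT_COMPLETED"}
print(vectorize(new_app, vocab))
```

Every input here comes from the package itself, which is why the static stage needs no device or user data. The flip side is that features first appearing after training are silently dropped, one reason a temporal split matters for evaluation.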

pith-pipeline@v0.9.1-grok · 5824 in / 1333 out tokens · 25717 ms · 2026-06-28T09:31:21.749368+00:00 · methodology

