
arxiv: 2605.29450 · v1 · pith:KTALFBEBnew · submitted 2026-05-28 · 💻 cs.CR

Protecting On-Device AI Inference: A Systematic Review of Attacks and Defence Mechanisms

Pith reviewed 2026-06-29 06:51 UTC · model grok-4.3

classification 💻 cs.CR
keywords on-device inference · AI security · model extraction · adversarial attacks · trusted execution environments · systematic review · edge AI privacy

The pith

A systematic review of on-device AI inference finds attacks outpace defenses, with no mitigations for roughly one third of the attack literature.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper presents what its authors describe as the first comprehensive survey of threats to on-device AI inference and the defenses proposed against them. It groups attacks into categories such as model extraction and adversarial examples, then matches them to defenses including trusted execution environments, homomorphic encryption, and differential privacy. The central finding is an imbalance: IP-related attacks receive disproportionate defense coverage, while some attack classes have no associated defense papers at all. This asymmetry points to concrete gaps where future work could narrow the distance between known risks and available protections. The gaps matter in practice because on-device models now run on phones and edge hardware, so they apply directly to deployed systems.

Core claim

To the best of our knowledge, this paper presents the first comprehensive review of threats and corresponding defence mechanisms targeting on-device inference. Our results show that the attack and defence literature are unbalanced: approximately one quarter of the surveyed attack papers focus on Intellectual Property (IP) attacks, whereas half of the defence solutions tackle the same issue. More importantly, some attack categories have no defence paper associated to them, such as adversarial attacks that account for roughly one third of the attack literature. This asymmetry between known attacks and available mitigations highlights clear opportunities for future research on securing on-device AI inference.
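The imbalance described in the core claim is plain arithmetic over per-category paper counts. The sketch below shows one way to compute it in Python. The counts are hypothetical placeholders, chosen only so that the shares match the approximate fractions quoted above (one quarter, one half, one third); they are not the survey's actual numbers, and the category names and helper functions are illustrative assumptions.

```python
from fractions import Fraction

# HYPOTHETICAL counts per category, NOT taken from the survey.
# They are picked so the shares reproduce the quoted approximate fractions.
attack_papers = {"ip": 6, "adversarial": 8, "other": 10}
defence_papers = {"ip": 10, "adversarial": 0, "other": 10}


def shares(counts):
    """Return each category's share of the total as an exact fraction."""
    total = sum(counts.values())
    return {name: Fraction(n, total) for name, n in counts.items()}


def coverage_gap(attacks, defences):
    """Defence share minus attack share per attack category.

    Positive values mean a category is over-represented among defences,
    negative values mean it is under-represented.
    """
    a, d = shares(attacks), shares(defences)
    return {name: d.get(name, Fraction(0)) - a[name] for name in a}


def undefended(attacks, defences):
    """Attack categories that have attack papers but no defence paper."""
    return sorted(
        name for name, n in attacks.items()
        if n > 0 and defences.get(name, 0) == 0
    )


if __name__ == "__main__":
    print("attack shares: ", shares(attack_papers))
    print("defence shares:", shares(defence_papers))
    print("coverage gap:  ", coverage_gap(attack_papers, defence_papers))
    print("undefended:    ", undefended(attack_papers, defence_papers))
```

With real counts from the survey's tables, the same three functions would make the reported fractions directly checkable.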

What carries the argument

Systematic literature review that categorizes attacks on on-device inference into IP theft, adversarial, and related classes, then maps each class to existing defense techniques.

If this is right

  • Defense research should prioritize categories that currently have zero associated papers, starting with adversarial attacks.
  • The observed skew toward IP defenses suggests that other threats such as data breaches during inference receive less attention than their prevalence in the attack literature warrants.
  • Future surveys can use the same categorization to track whether the imbalance narrows over time.
  • Practical deployments of on-device models inherit the documented gaps until new defenses are developed for uncovered attack types.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Mobile application developers may need to combine multiple existing defenses rather than rely on any single technique to cover the full range of documented threats.
  • Hardware vendors could evaluate whether their trusted execution environment offerings address the attack classes that currently lack software-level mitigations.
  • Testing frameworks for on-device inference could incorporate adversarial examples as a standard evaluation axis given the absence of dedicated defenses.
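As an illustration of the last point, the sketch below shows what an adversarial evaluation axis could look like in its simplest form: clean accuracy reported next to accuracy under a fast gradient sign method (FGSM) perturbation. Everything here is an assumption made for illustration. The model is a toy linear classifier with a logistic loss, the data points are invented, and none of it comes from the paper or from any on-device framework.

```python
import math


def score(w, b, x):
    """Linear score w.x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def predict(w, b, x):
    """Predict class 1 when the score is positive, else class 0."""
    return 1 if score(w, b, x) > 0 else 0


def fgsm(w, b, x, y, eps):
    """One FGSM step against a logistic loss.

    For logistic regression the gradient of the loss with respect to the
    input is (sigmoid(score) - y) * w, so the attack adds eps * sign(grad).
    """
    p = 1.0 / (1.0 + math.exp(-score(w, b, x)))
    grad = [(p - y) * wi for wi in w]
    return [xi + eps * ((g > 0) - (g < 0)) for xi, g in zip(x, grad)]


def accuracy(w, b, xs, ys):
    """Fraction of inputs classified correctly."""
    return sum(predict(w, b, x) == y for x, y in zip(xs, ys)) / len(xs)


def robust_accuracy(w, b, xs, ys, eps):
    """Accuracy after each input is perturbed by FGSM with budget eps."""
    adv = [fgsm(w, b, x, y, eps) for x, y in zip(xs, ys)]
    return accuracy(w, b, adv, ys)


# Toy model and hypothetical data: two points near the decision boundary,
# two far from it.
w, b = [1.0, -1.0], 0.0
xs = [[0.5, 0.1], [0.1, 0.5], [2.0, 0.0], [0.0, 2.0]]
ys = [1, 0, 1, 0]

if __name__ == "__main__":
    print("clean accuracy: ", accuracy(w, b, xs, ys))
    print("robust accuracy:", robust_accuracy(w, b, xs, ys, eps=0.3))
```

In this toy setup the two points close to the boundary flip under the perturbation while the two distant points do not, which is the kind of difference a robustness axis is meant to expose.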

Load-bearing premise

The search and categorization process captured a representative sample of all relevant attack and defense papers without significant omissions or bias.

What would settle it

Publication or discovery of one or more defense papers that address adversarial attacks on on-device inference models and meet the survey's inclusion criteria.

Figures

Figures reproduced from arXiv: 2605.29450 by Alexandros Fakis, Georgios Karopoulos, Marios Anagnostopoulos, Vasileios Kouliaridis, Zisis Tsiatsikas.

Figure 1: Attack categorisation.
Figure 2: Attack taxonomy.
Figure 3: Protection mechanisms’ taxonomy. Each shape represents the AI model that was …
Figure 4: Mapping of defences to attack categories.
Original abstract

The need for secure and private Artificial Intelligence (AI) and Machine Learning (ML) on edge and mobile devices has increased the necessity of protecting the architecture of these systems from threats to both security and privacy. With an ever-increasing number of pre-trained AI models being used on mobile platforms for client-side inference, there are rising concerns about the risks associated with the theft/extraction of AI models, adversarial attacks on AI models, and data breaches. As a result of this trend, a variety of defence mechanisms have been proposed to protect against these threats. These include Trusted Execution Environments (TEEs), homomorphic encryption, obfuscation, and differential privacy, among others. However, current surveys largely focus on edge intelligence, which includes distributed training, and thus overlook security and privacy issues that are specific to on-device AI inference. To the best of our knowledge, this paper presents the first comprehensive review of threats and corresponding defence mechanisms targeting on-device inference. Our results show that the attack and defence literature are unbalanced: approximately one quarter of the surveyed attack papers focus on Intellectual Property (IP) attacks, whereas half of the defence solutions tackle the same issue. More importantly, some attack categories have no defence paper associated to them, such as adversarial attacks that account for roughly one third of the attack literature. This asymmetry between known attacks and available mitigations highlights clear opportunities for future research on securing on-device AI inference.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 0 minor

Summary. The paper claims to provide the first comprehensive review of threats (IP theft, adversarial attacks, data breaches) and corresponding defenses (TEEs, homomorphic encryption, obfuscation, differential privacy) for on-device AI inference on edge/mobile devices. It reports an empirical imbalance from the surveyed literature: approximately one quarter of attack papers focus on IP attacks while half of defense papers address the same; adversarial attacks comprise roughly one third of the attack literature yet have zero associated defenses. The abstract positions this asymmetry as highlighting clear opportunities for future research.

Significance. If the underlying literature search and categorization are exhaustive and unbiased, the review would usefully map the attack-defense gap in on-device inference security and privacy, a topic distinct from distributed edge training. This could guide targeted research on currently undefended attack vectors.

major comments (1)
  1. [Abstract] (and, by extension, the methods description): the central claims, namely the 'first comprehensive review' assertion and the specific reported fractions (one quarter of attacks on IP, half of defenses on IP, one third of attacks being adversarial with zero defenses), rest on an exhaustive, unbiased search and accurate classification. No information is supplied on databases queried, search strings, date range, inclusion/exclusion criteria, number of papers screened, or inter-rater process for assigning papers to categories such as 'IP attack' or 'adversarial attack'. Without these details the imbalance findings cannot be verified or reproduced.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their constructive feedback on the need for methodological transparency. We address the single major comment below and will revise the manuscript to incorporate the requested details.

  1. Referee: [Abstract] (and, by extension, the methods description): the central claims, namely the 'first comprehensive review' assertion and the specific reported fractions (one quarter of attacks on IP, half of defenses on IP, one third of attacks being adversarial with zero defenses), rest on an exhaustive, unbiased search and accurate classification. No information is supplied on databases queried, search strings, date range, inclusion/exclusion criteria, number of papers screened, or inter-rater process for assigning papers to categories such as 'IP attack' or 'adversarial attack'. Without these details the imbalance findings cannot be verified or reproduced.

    Authors: We agree that the manuscript lacks a dedicated description of the literature search and categorization process, which is required to support the claims of comprehensiveness and the reported proportions. In the revised version we will add a Methods section that specifies: the databases queried (IEEE Xplore, ACM Digital Library, Google Scholar, arXiv, and ScienceDirect); the search strings (combinations of terms such as 'on-device inference security', 'model extraction attack', 'adversarial example on-device', 'trusted execution environment inference', 'homomorphic encryption mobile AI'); the date range (papers published 2015–2024); inclusion/exclusion criteria (focus on on-device inference rather than distributed training, English-language peer-reviewed or pre-print works); the number of papers initially retrieved, screened, and included; and the categorization procedure (initial independent assignment by two authors followed by discussion to resolve disagreements). These additions will enable verification and reproduction of the imbalance statistics without altering the core findings.

    Revision: yes
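The categorization procedure in the response (independent assignment by two authors, then discussion) leaves open how agreement between the two initial assignments would be reported. One common choice is Cohen's kappa, which corrects raw agreement for the agreement expected by chance. The response does not say that this statistic will be used, so the Python sketch below is an assumption, and the labels in it are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    and p_e is the agreement expected if both raters labelled at random
    with their own label frequencies.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty item list")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical label for every item.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# HYPOTHETICAL category assignments for six papers by two authors.
rater_a = ["ip", "ip", "adversarial", "privacy", "adversarial", "ip"]
rater_b = ["ip", "adversarial", "adversarial", "privacy", "adversarial", "ip"]

if __name__ == "__main__":
    print(round(cohens_kappa(rater_a, rater_b), 3))
```

Reporting a statistic of this kind alongside the number of disagreements resolved by discussion would let readers judge how stable the category boundaries are.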

Circularity Check

0 steps flagged

No significant circularity; paper is a literature survey with no derivations or self-referential reductions


This is a systematic review paper whose central claim is that it provides the first comprehensive coverage of on-device inference threats and defenses. No equations, derivations, fitted parameters, or mathematical steps exist in the provided text. Claims rest on external literature rather than any internal definitions, self-citations that bear the load of the argument, or reductions of predictions to inputs by construction. The lack of explicit search methodology affects verifiability of the 'first comprehensive' assertion but does not match any enumerated circularity pattern and leaves the derivation chain empty.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

No free parameters, axioms, or invented entities; the paper performs no derivations and introduces no new technical constructs.

pith-pipeline@v0.9.1-grok · 5804 in / 1064 out tokens · 27150 ms · 2026-06-29T06:51:34.005601+00:00 · methodology

