Secure eFPGA-Enabled Edge LLM Inference: Architectural and Hardware Countermeasures
Pith reviewed 2026-05-08 11:31 UTC · model grok-4.3
The pith
A hybrid ASIC+eFPGA architecture secures edge transformer inference against side-channel and supply-chain attacks while preserving ASIC efficiency.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper establishes that integrating an eFPGA with an ASIC accelerator enables security-oriented mechanisms such as adaptive runtime monitoring, side-channel mitigation, and post-deployment patching for transformer inference. These mechanisms enhance resilience against runtime attacks including power, electromagnetic, and timing analysis as well as fault injection, and against supply-chain attacks such as hardware Trojans and untrusted third-party IPs, while retaining the performance and energy benefits of the ASIC's optimized dataflows, specialized architectures, low-bitwidth computation, and efficient memory hierarchies.
What carries the argument
The hybrid ASIC+eFPGA architecture, in which the reconfigurable logic of the eFPGA implements security functions atop the efficient ASIC transformer accelerator.
If this is right
- Runtime monitoring and mitigation reduce successful side-channel extraction of model weights or inputs during inference.
- Post-deployment patching allows security fixes or responses to new threats without hardware redesign or replacement.
- Configurable defenses narrow the attack surface from supply-chain threats such as hardware Trojans or untrusted IPs.
- The energy and throughput advantages of specialized ASIC dataflows remain available for transformer inference.
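The first bullet's monitoring claim can be made concrete with a small sketch of the kind of detector the eFPGA fabric could host: a rolling z-score check over on-chip power-sensor samples that flags transients consistent with glitch or fault injection. The sensor interface, window size, and threshold below are illustrative assumptions, not details from the paper.

```python
# Hypothetical runtime power-anomaly monitor, as reconfigurable logic might
# implement it: flag readings that deviate sharply from recent history.
from collections import deque
import math

class PowerAnomalyMonitor:
    def __init__(self, window=64, z_threshold=4.0):
        self.samples = deque(maxlen=window)  # rolling history of sensor readings
        self.z_threshold = z_threshold

    def observe(self, reading):
        """Return True if `reading` is anomalous relative to recent samples."""
        if len(self.samples) >= 8:  # require some history before judging
            mean = sum(self.samples) / len(self.samples)
            var = sum((s - mean) ** 2 for s in self.samples) / len(self.samples)
            std = math.sqrt(var) or 1e-9  # guard against a flat trace
            if abs(reading - mean) / std > self.z_threshold:
                self.samples.append(reading)
                return True
        self.samples.append(reading)
        return False

monitor = PowerAnomalyMonitor()
baseline = [1.0 + 0.01 * (i % 5) for i in range(64)]   # quiet inference trace
flags = [monitor.observe(r) for r in baseline]          # no alarms expected
spike_flag = monitor.observe(5.0)                       # e.g., a glitch transient
```

In a real design this logic would run beside the ASIC datapath and gate a response (clock randomization, inference halt) rather than return a boolean; the sketch only shows the detection decision.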
Where Pith is reading between the lines
- Future edge AI hardware designs may routinely reserve small reconfigurable regions for security functions that can evolve after deployment.
- The same hybrid pattern could extend to other domain-specific accelerators where efficiency must coexist with updatability.
- Quantifying the approach requires concrete implementations that measure both attack resistance and any added overhead on real silicon.
Load-bearing premise
That adding the eFPGA can deliver effective runtime monitoring, side-channel mitigation, and patching without reducing the energy efficiency or speed of the underlying ASIC dataflow and memory hierarchy.
What would settle it
A hardware prototype measurement showing either that model extraction via side-channel analysis still succeeds or that the hybrid design increases power draw or inference latency relative to the pure ASIC baseline under identical workloads.
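The performance half of that test reduces to simple arithmetic once prototype numbers exist: measure latency and energy on the pure-ASIC baseline and on the hybrid design under identical workloads, and check the relative overheads against a preservation budget. The metrics, numbers, and 5% budget below are placeholder assumptions, not measured data.

```python
# Illustrative overhead comparison for a hybrid-vs-baseline prototype study.

def relative_overhead(baseline, hybrid):
    """Fractional increase of `hybrid` over `baseline` (0.05 means +5%)."""
    return (hybrid - baseline) / baseline

def preserves_performance(measurements, budget=0.05):
    """True if every metric's overhead stays within the chosen budget."""
    return all(relative_overhead(b, h) <= budget for b, h in measurements.values())

# Hypothetical per-metric measurements: (pure-ASIC baseline, hybrid design).
measurements = {
    "latency_ms": (12.0, 12.4),  # +3.3% under the same workload
    "energy_mJ":  (30.0, 31.2),  # +4.0%
}
ok = preserves_performance(measurements)
```

Either outcome settles the claim: overheads inside the budget support preservation, while a single metric outside it refutes it for that workload.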
Original abstract
Edge deployment of transformer-based models increasingly relies on ASIC accelerators due to their high performance and energy efficiency, achieved through optimized dataflows, specialized architectures, low-bitwidth computation, and efficient memory hierarchies. However, these advantages come with significant security vulnerabilities. ASIC-based DNN accelerators are susceptible to side-channel attacks (e.g., power, electromagnetic, and timing analysis) and fault injection attacks (e.g., voltage manipulation, clock glitches, and memory perturbations), which can lead to model extraction or compromised inference integrity. Furthermore, threats introduced during design and fabrication, such as hardware Trojans or untrusted third-party IPs, further expand the attack surface. To address these challenges, we explore a hybrid ASIC+eFPGA architecture that combines the efficiency of ASICs with the flexibility of reconfigurable logic. The integrated eFPGA enables security-oriented mechanisms such as adaptive runtime monitoring, side-channel mitigation and post-deployment patching. By leveraging these capabilities, the proposed approach enhances system resilience against both runtime and supply-chain attacks, while preserving the performance benefits of ASIC-based transformer inference.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes a hybrid ASIC+eFPGA architecture for secure edge inference of transformer-based LLMs. It argues that pure ASIC accelerators, while efficient due to optimized dataflows, low-bitwidth compute, and memory hierarchies, are vulnerable to side-channel attacks, fault injection, and supply-chain threats such as hardware Trojans. The eFPGA fabric is integrated to enable adaptive runtime monitoring, side-channel countermeasures, and post-deployment patching, with the claim that this combination improves resilience against runtime and supply-chain attacks while retaining the performance and energy advantages of the underlying ASIC design.
Significance. If the performance-preservation and resilience claims were quantitatively validated, the work would represent a meaningful architectural contribution to secure edge AI hardware by showing how reconfigurable logic can be added to ASIC dataflow engines without sacrificing their efficiency optimizations. The proposal correctly identifies a real tension between ASIC specialization and security needs, and the high-level integration concept is plausible.
major comments (2)
- Abstract and the high-level architecture description: the central claim that the hybrid design 'preserves the performance benefits of ASIC-based transformer inference' is unsupported. No area, power, latency, energy, or throughput estimates are provided, nor is there any comparison against a pure-ASIC baseline or any simulation of the integrated dataflow and memory hierarchy. This directly undermines the headline result, as eFPGA insertion could introduce routing, clock, or memory-access overheads that erode the very optimizations the paper seeks to retain.
- Abstract and security-mechanism section: the assertion that the eFPGA enables 'enhanced system resilience against both runtime and supply-chain attacks' is presented without any attack model, threat evaluation, or resilience metric. No concrete mechanisms (e.g., specific monitoring logic, side-channel sensor placement, or patching protocol) are analyzed for effectiveness or overhead, leaving the resilience improvement as an untested qualitative statement.
minor comments (2)
- The manuscript would benefit from a clearer block diagram that explicitly shows how the eFPGA interfaces with the ASIC dataflow engine, memory hierarchy, and compute units without disrupting the optimized low-bitwidth paths.
- A brief discussion of related work on eFPGA-based security monitors or hybrid ASIC-FPGA DNN accelerators would help situate the contribution.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on our manuscript. The work is an architectural proposal for integrating eFPGA fabric into ASIC-based LLM accelerators to enable security features. We address the two major comments below and indicate the revisions we will make.
Point-by-point responses
Referee: Abstract and the high-level architecture description: the central claim that the hybrid design 'preserves the performance benefits of ASIC-based transformer inference' is unsupported. No area, power, latency, energy, or throughput estimates are provided, nor is there any comparison against a pure-ASIC baseline or any simulation of the integrated dataflow and memory hierarchy. This directly undermines the headline result, as eFPGA insertion could introduce routing, clock, or memory-access overheads that erode the very optimizations the paper seeks to retain.
Authors: We agree that the performance-preservation claim in the abstract and architecture section lacks quantitative support in the current manuscript. The paper focuses on the high-level integration concept rather than detailed implementation metrics or simulations. In the revised version we will qualify the claim by noting that the eFPGA is intended to be placed in non-critical paths (e.g., for monitoring and patching logic) so that the core ASIC dataflow, memory hierarchy, and low-bitwidth compute units remain unchanged. We will also add a new subsection with overhead estimates drawn from published eFPGA-ASIC co-integration studies and a qualitative discussion of how the hybrid design avoids disrupting optimized transformer dataflows. revision: yes
Referee: Abstract and security-mechanism section: the assertion that the eFPGA enables 'enhanced system resilience against both runtime and supply-chain attacks' is presented without any attack model, threat evaluation, or resilience metric. No concrete mechanisms (e.g., specific monitoring logic, side-channel sensor placement, or patching protocol) are analyzed for effectiveness or overhead, leaving the resilience improvement as an untested qualitative statement.
Authors: We acknowledge that the security claims are currently stated at a high level without an explicit attack model or quantitative resilience analysis. The manuscript describes the eFPGA's role in enabling adaptive monitoring, side-channel countermeasures, and post-deployment patching, but does not evaluate specific implementations. In revision we will insert a dedicated threat-model subsection that enumerates the considered runtime attacks (power/EM side-channel, fault injection) and supply-chain threats (hardware Trojans), describe concrete example mechanisms (e.g., on-fabric power sensors for anomaly detection and partial reconfiguration for Trojan isolation), and provide a qualitative resilience argument together with estimated area and power overheads for those mechanisms. Full experimental validation of attack resistance will be noted as future work. revision: yes
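The post-deployment patching mechanism promised in this response implies a protocol step the paper does not spell out: a patch bitstream must be authenticated before partial reconfiguration applies it, or the patch channel itself becomes a supply-chain attack vector. The sketch below uses an HMAC over a device key as a stand-in for whatever authentication scheme a real design would use; the key, patch format, and apply step are all illustrative assumptions.

```python
# Hypothetical authenticated-patch check preceding eFPGA partial reconfiguration.
import hashlib
import hmac

DEVICE_KEY = b"example-device-key"  # hypothetical; provisioned at manufacture

def sign_patch(bitstream: bytes, key: bytes = DEVICE_KEY) -> bytes:
    """Authentication tag over the patch bitstream."""
    return hmac.new(key, bitstream, hashlib.sha256).digest()

def apply_patch(bitstream: bytes, tag: bytes, key: bytes = DEVICE_KEY) -> bool:
    """Apply the patch only if its authentication tag verifies."""
    if not hmac.compare_digest(sign_patch(bitstream, key), tag):
        return False  # reject a tampered or mis-signed patch
    # partial_reconfigure(bitstream)  # hardware-specific; out of scope here
    return True

patch = b"\x00\x01fabric-config-update"
good = apply_patch(patch, sign_patch(patch))          # authentic patch accepted
bad = apply_patch(patch + b"\xff", sign_patch(patch)) # tampered patch rejected
```

`compare_digest` is used instead of `==` so the verification itself does not leak a timing side channel, which matters in a paper whose threat model includes timing analysis.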
Circularity Check
No circularity: conceptual architectural proposal with no derivation chain
Full rationale
The manuscript is a high-level conceptual proposal for a hybrid ASIC+eFPGA architecture to address security vulnerabilities in edge transformer inference. It contains no equations, derivations, parameter fittings, or quantitative models that could reduce to inputs by construction. Claims about preserving ASIC performance benefits while adding runtime monitoring and patching are presented as qualitative assertions supported by block diagrams and descriptions, without any self-citation load-bearing steps, ansatz smuggling, or renaming of known results. The work is therefore self-contained as an architectural idea with no circular elements in its reasoning chain.
Axiom & Free-Parameter Ledger
axioms (2)
- Domain assumption: ASIC-based DNN accelerators are susceptible to side-channel attacks (power, electromagnetic, timing) and fault injection attacks.
- Domain assumption: eFPGA integration enables security mechanisms such as adaptive runtime monitoring, side-channel mitigation, and post-deployment patching.
Reference graph
Works this paper leans on
- [1] A. Mishra et al., Artificial Intelligence and Hardware Accelerators. Springer, 2023.
- [2] P. Dhilleswararao et al., "Efficient hardware architectures for accelerating deep neural networks: Survey," IEEE Access, vol. 10, pp. 131788–131828, 2022.
- [3] A. Parashar et al., "Timeloop: A systematic approach to DNN accelerator evaluation," in IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), 2019, pp. 304–315.
- [4] S. Potluri and F. Koushanfar, "SoK: Model reverse engineering threats for neural network hardware," Cryptology ePrint Archive, 2024.
- [5] X. Yan et al., "Defense against ML-based power side-channel attacks on DNN accelerators with adversarial attacks," arXiv preprint arXiv:2312.04035, 2023.
- [6] Y.-S. Won et al., "Time to leak: Cross-device timing attack on edge deep learning accelerator," in International Conference on Electronics, Information, and Communication (ICEIC), 2021, pp. 1–4.
- [7] X. Yan et al., "Mercury: An automated remote side-channel attack to NVIDIA deep learning accelerator," arXiv preprint, 2023.
- [8] W. Fedus et al., "Switch transformers: Scaling to trillion parameter models with simple and efficient sparsity," Journal of Machine Learning Research, vol. 23, no. 120, pp. 1–39, 2022.
- [9] C. Guo et al., "A survey: Collaborative hardware and software design in the era of large language models," IEEE Circuits and Systems Magazine, vol. 25, no. 1, pp. 35–57, 2025.
- [10] J. Li et al., "Large language model inference acceleration: A comprehensive hardware perspective," arXiv preprint arXiv:2410.04466, 2024.
- [11] C. He et al., "WaferLLM: Large language model inference at wafer scale," in 19th USENIX Symposium on Operating Systems Design and Implementation (OSDI 25), 2025, pp. 257–273.
- [12] M. Kim et al., "Oaken: Fast and efficient LLM serving with online-offline hybrid KV cache quantization," in Proceedings of the 52nd Annual International Symposium on Computer Architecture, 2025, pp. 482–497.
- [13] N. Koilia and C. Kachris, "Hardware acceleration of LLMs: A comprehensive survey and comparison," arXiv preprint arXiv:2409.03384, 2024.
- [14] A. S. Rakin et al., "Deep-Dup: An adversarial weight duplication attack framework to crush deep neural network in multi-tenant FPGA," in 30th USENIX Security Symposium (USENIX Security 21), 2021, pp. 1919–1936.
- [15] A. Boutros et al., "Neighbors from hell: Voltage attacks against deep learning accelerators on multi-tenant FPGAs," in International Conference on Field-Programmable Technology (ICFPT), 2020, pp. 103–111.
- [16] S. Tian et al., "A practical remote power attack on machine learning accelerators in cloud FPGAs," in Design, Automation & Test in Europe Conference & Exhibition (DATE), 2023, pp. 1–6.
- [17] A. Chaudhuri et al., "Energon: Unveiling transformers from GPU power and thermal side-channels," in IEEE/ACM International Conference on Computer-Aided Design (ICCAD), 2025, pp. 1–9.
- [18] Z. Gao et al., "I know what you said: Unveiling hardware cache side-channels in local large language model inference," in 34th USENIX Security Symposium (USENIX Security 25), 2025, pp. 1649–1668.
- [19] C. Gongye et al., "One flip away from chaos: Unraveling single points of failure in quantized DNNs," in IEEE International Symposium on Hardware Oriented Security and Trust (HOST), 2024, pp. 332–342.
- [20] R. Mukherjee et al., "Novel hardware trojan attack on activation parameters of FPGA-based DNN accelerators," IEEE Embedded Systems Letters, vol. 14, no. 3, pp. 131–134, 2022.
- [21] P. Mohan et al., "Hardware redaction via designer-directed fine-grained eFPGA insertion," in Design, Automation & Test in Europe Conference & Exhibition (DATE), 2021, pp. 1186–1191.
- [22] H. M. Kamali et al., "SheLL: Shrinking eFPGA fabrics for logic locking," in Design, Automation & Test in Europe Conference & Exhibition (DATE), 2023, pp. 1–6.
- [23] V. Das et al., "SOAR: Secure once, adapt at runtime with eFPGA-based redaction for IP protection," IEEE Access, 2025.
- [24] H. Zhu et al., "PowerScout: A security-oriented power delivery network modeling framework for cross-domain side-channel analysis," in Asian Hardware Oriented Security and Trust Symposium (AsianHOST), 2020, pp. 1–6.
- [25] H. M. Kamali et al., "Advances in logic locking: Past, present, and prospects," Cryptology ePrint Archive, 2022.
- [26] M. S. U. I. Sami et al., "Advancing trustworthiness in system-in-package: A novel root-of-trust hardware security module for heterogeneous integration," IEEE Access, vol. 12, pp. 48081–48107, 2024.
- [27] V. Das et al., "NuRedact: Non-uniform eFPGA architecture for low-overhead and secure IP redaction," arXiv preprint arXiv:2601.11770, 2026.
- [28] K. I. Gubbi et al., "Securing AI hardware: Challenges in detecting and mitigating hardware trojans in ML accelerators," in IEEE International Midwest Symposium on Circuits and Systems (MWSCAS), 2023, pp. 821–825.
- [29] Flex Logix Technologies and Intrinsic ID, "Taking eFPGA security to the next level," white paper, Jul. 2022. https://assets.flex-logix.com/resources/2022%2007%2011%20Taking%20eFPGA%20security%20to%20the%20Next%20Level.pdf
- [30] B. S. Latibari, N. Nazari, M. A. Chowdhury et al., "Transformers: A security perspective," IEEE Access, pp. 181071–181105, 2024.
- [31] S. Maji et al., "A threshold implementation-based neural network accelerator with power and electromagnetic side-channel countermeasures," IEEE Journal of Solid-State Circuits, vol. 58, no. 1, pp. 141–154, 2022.
- [32] K. Tiri et al., "A dynamic and differential CMOS logic with signal independent power consumption to withstand differential power analysis on smart cards," in IEEE European Solid-State Circuits Conference (ESSCIRC), 2002, pp. 403–406.
- [33] M. Doulcier-Verdier, J.-M. Dutertre, J. Fournier, J.-B. Rigaud, B. Robisson, and A. Tria, "A side-channel and fault-attack resistant AES circuit working on duplicated complemented values," in IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers, 2011, pp. 274–276.
- [34] A. Asghar et al., "The benefits and costs of netlist randomization based side-channel countermeasures: An in-depth evaluation," Journal of Low Power Electronics and Applications, vol. 12, no. 3, p. 42, 2022.
- [35] A. Asghar, B. Hettwer et al., "Increasing side-channel resistance by netlist randomization and FPGA-based reconfiguration," in International Symposium on Applied Reconfigurable Computing, 2021, pp. 173–187.
- [36] A. Dubey, R. Cammarota, and A. Aysu, "BoMaNet: Boolean masking of an entire neural network," in IEEE/ACM International Conference on Computer-Aided Design (ICCAD), 2020, pp. 51:1–51:9.
- [37] A. Dubey et al., "Guarding machine learning hardware against physical side-channel attacks," ACM Journal on Emerging Technologies in Computing Systems (JETC), vol. 18, no. 3, pp. 1–31, 2022.
- [38] A. Dubey, A. Ahmad et al., "ModuloNET: Neural networks meet modular arithmetic for efficient hardware masking," IACR Transactions on Cryptographic Hardware and Embedded Systems, pp. 506–556, 2022.
- [39] K. Ganesan, M. Fishkin, O. Lin, and N. E. Jerger, "Blackjack: Secure machine learning on IoT devices through hardware-based shuffling," arXiv preprint, 2023.
- [40] K. Lee, M. Yan, J. S. Emer, and A. P. Chandrakasan, "SecureLoop: Design space exploration of secure DNN accelerators," in International Symposium on Microarchitecture (MICRO), 2023, pp. 194–208.
- [41] J. Bai et al., "Phantom: Privacy-preserving deep neural network model obfuscation in heterogeneous TEE and GPU system," in USENIX Security Symposium, 2025, pp. 5565–5582.
- [42] T. Nayan, Z. Zhang, and R. Sun, "SecureInfer: Heterogeneous TEE-GPU architecture for privacy-critical tensors for large language model deployment," in International Conference on Intelligent Computing and Systems at the Edge (ICEdge), 2025, pp. 1–7.
- [43] S. Ahmed et al., "Lightweight AES design for IoT applications: Optimizations in FPGA and ASIC with DFA countermeasure strategies," IEEE Access, vol. 13, pp. 22489–22509, 2025.
- [44] M. M. M. Rahman et al., "Efficient SoC security monitoring: Quality attributes and potential solutions," IEEE Design & Test, vol. 41, no. 4, pp. 26–34, 2023.
- [45] M. M. M. Rahman, S. Tarek et al., "The road not taken: eFPGA accelerators utilized for SoC security auditing," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 43, no. 10, pp. 3068–3082, 2024.
- [46] W.-K. Liu et al., "Hardware-supported patching of security bugs in hardware IP blocks," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 42, no. 1, pp. 54–67, 2022.