Lightweight, Practical Encrypted Face Recognition with GPU Support
Pith reviewed 2026-05-13 22:58 UTC · model grok-4.3
The pith
BSGS-Diagonal reorders rotations to cut rotation keys by 91 percent while preserving correctness in encrypted face recognition.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
BSGS-Diagonal reorders the sequence of rotations inside the CKKS matrix-multiplication routine so that far fewer distinct rotation keys are required. The reduction reaches 91 percent, directly lowering both client and server memory footprints while leaving similarity scores unchanged. Integrated GPU kernels built on FIDESlib then fuse the remaining operations to eliminate costly data-structure conversions, producing up to 21 times faster end-to-end encrypted similarity computation.
What carries the argument
BSGS-Diagonal, a reordering of baby-step giant-step diagonal rotations that reduces the distinct rotation keys needed for CKKS-based inner-product calculations while keeping the output identical to the original computation.
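For orientation, the textbook baby-step giant-step diagonal method that the name points to can be sketched on plaintext vectors. This is an illustration in Python with NumPy, not the paper's BSGS-Diagonal algorithm: the function names, the `baby` parameter, and the use of `np.roll` as a stand-in for a CKKS slot rotation are assumptions made for the sketch, and it assumes the matrix diagonals can be pre-rotated when they are packed. It records which rotation steps are applied to the vector, since each distinct step would need its own rotation key.

```python
import numpy as np


def rot(x, k):
    """Cyclic left rotation by k slots (plaintext stand-in for a slot rotation)."""
    return np.roll(x, -k)


def diagonals(M):
    """Generalized diagonals: diagonals(M)[k][i] == M[i, (i + k) % n]."""
    n = M.shape[0]
    idx = np.arange(n)
    return [M[idx, (idx + k) % n] for k in range(n)]


def matvec_naive(M, v):
    """Diagonal method: one rotation of v per nonzero diagonal index."""
    n = M.shape[0]
    diags = diagonals(M)
    steps = set()          # distinct rotation steps applied to v
    acc = np.zeros(n)
    for k in range(n):
        if k:
            steps.add(k)
        acc += diags[k] * rot(v, k)
    return acc, steps


def matvec_bsgs(M, v, baby):
    """Same sum regrouped into baby steps (on v) and giant steps (on partial sums)."""
    n = M.shape[0]
    assert n % baby == 0, "sketch assumes baby divides n"
    diags = diagonals(M)
    steps = set()
    baby_rots = []
    for j in range(baby):
        if j:
            steps.add(j)
        baby_rots.append(rot(v, j))
    acc = np.zeros(n)
    for g in range(0, n, baby):
        inner = np.zeros(n)
        for j in range(baby):
            # Assumption: this diagonal pre-rotation happens at packing time,
            # so it is not counted as a keyed rotation.
            inner += rot(diags[g + j], -g) * baby_rots[j]
        if g:
            steps.add(g)
        acc += rot(inner, g)
    return acc, steps
```

With a 16 by 16 matrix and `baby = 4`, the naive loop uses 15 distinct rotation steps while the regrouped loop uses 6 (steps 1 to 3 and 4, 8, 12), and both return the same product up to floating-point rounding. The ratio reported in the paper depends on its own parameters and layout, which this sketch does not model.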
If this is right
- Client memory drops by about 14 GB because far fewer rotation keys must be stored.
- Peak CPU RAM falls from over 30 GB in the original HyDia to under 10 GB for databases of up to one million entries.
- Membership verification runs up to 1.57 times faster than the prior protocol.
- Identification queries run up to 1.43 times faster.
- The GPU kernels bring encrypted recognition below one second for databases of up to 32K entries.
Where Pith is reading between the lines
- The memory savings could allow encrypted face matching on mobile or edge devices that currently lack enough RAM.
- The same rotation-reordering trick may apply to other CKKS workloads that rely on repeated diagonal multiplications.
- Operation fusion on GPU suggests similar speed gains are available for other homomorphic linear-algebra tasks once kernels are written at the same level of integration.
Load-bearing premise
Reordering the rotations does not change the exact numerical result of the similarity scores or weaken the semantic security of the underlying CKKS encryption.
What would settle it
A side-by-side run on the same face embeddings that shows the cosine similarities produced by BSGS-Diagonal differ from those of the original HyDia implementation by more than floating-point rounding error.
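The comparison described above could be scripted roughly as follows. This is a minimal sketch in Python with NumPy that runs no homomorphic encryption: it assumes the decrypted score vectors from the two pipelines are already available as arrays, and the function names and the `tol` parameter are hypothetical. Because CKKS is an approximate scheme, `tol` would have to be chosen relative to the expected approximation error for the parameters in use.

```python
import numpy as np


def cosine_scores(database, query):
    """Plaintext cosine similarity of one query against every database row."""
    db = np.asarray(database, dtype=np.float64)
    q = np.asarray(query, dtype=np.float64)
    db_unit = db / np.linalg.norm(db, axis=1, keepdims=True)
    q_unit = q / np.linalg.norm(q)
    return db_unit @ q_unit


def max_abs_error(scores_a, scores_b):
    """Largest absolute difference between two score vectors of equal shape."""
    a = np.asarray(scores_a, dtype=np.float64)
    b = np.asarray(scores_b, dtype=np.float64)
    if a.shape != b.shape:
        raise ValueError("score vectors must have the same shape")
    return float(np.max(np.abs(a - b)))


def scores_agree(scores_a, scores_b, tol):
    """True when the two score vectors differ by at most tol in every entry."""
    return max_abs_error(scores_a, scores_b) <= tol
```

The same `max_abs_error` helper can compare either pipeline against the plaintext output of `cosine_scores`, which separates a disagreement between the two encrypted variants from approximation error common to both.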
Original abstract
Face recognition models operate in a client-server setting where a client extracts a compact face embedding and a server performs similarity search over a template database. This raises privacy concerns, as facial data is highly sensitive. To provide cryptographic privacy guarantees, one can use fully homomorphic encryption to perform end-to-end encrypted similarity search. However, existing FHE-based protocols are computationally costly and impose high memory overhead. Building on prior work, HyDia (PoPETS 2025), we introduce algorithmic and system-level improvements targeting real-world deployment with resource-constrained clients. First, we propose BSGS-Diagonal, an algorithm delivering fast and memory-efficient similarity computation. BSGS-Diagonal substantially shrinks the rotation-key set, lowering both client and server memory requirements, and also improves practical server runtime. This yields a 91% reduction in the number of rotation keys, translating to approximately 14 GB less memory used on the client, and reducing overall CPU peak RAM from over 30 GB in the original HyDia to under 10 GB for databases up to size 1M. In addition, runtime is improved by up to 1.57x for the membership verification scenario and 1.43x for the identification scenario. Second, we introduce fully GPU-optimized similarity matrix computation kernels. The implementation is built upon FIDESlib, a CKKS-level GPU library based on OpenFHE. Rather than offloading individual CKKS primitives in isolation, the integrated kernels fuse operations to avoid repeated CPU-GPU ciphertext movement and costly FIDESlib/OpenFHE data-structure conversions. As a result, our GPU implementations of both HyDia and BSGS-Diagonal achieve up to 9x and 21x speedups, respectively, enabling sub-second encrypted face recognition for databases up to 32K entries while further reducing host memory usage.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper builds on HyDia (PoPETS 2025) to present BSGS-Diagonal, a reordering of baby-step/giant-step rotations for CKKS-based encrypted similarity search in face recognition. It reports a 91% reduction in rotation keys (saving ~14 GB client memory and dropping peak RAM from >30 GB to <10 GB for 1M-entry databases), runtime gains of 1.57x (membership verification) and 1.43x (identification), and GPU kernels (via FIDESlib) delivering up to 9x/21x speedups that enable sub-second encrypted recognition for databases up to 32K entries.
Significance. If the algorithmic claims hold, the work meaningfully lowers the memory and latency barriers that have limited deployment of FHE-based face recognition, particularly for resource-constrained clients. The integrated GPU kernels and concrete scaling numbers to 1M entries represent a practical step toward real-world encrypted biometric search.
major comments (1)
- [BSGS-Diagonal algorithm] BSGS-Diagonal section: the central performance claims rest on the assertion that the reordering preserves exact decrypted cosine similarities and semantic security of the underlying CKKS scheme. No formal argument is supplied showing that the permutation commutes with encoding/decoding and does not increase noise beyond the decryption threshold, nor is any empirical check (maximum absolute error versus plaintext baseline, or noise-growth measurements) reported for the chosen parameters.
minor comments (2)
- [Experimental evaluation] The abstract supplies concrete speed-up and memory figures; the main text should include the exact database construction, number of trials, and error bars so that the reported 1.57x/1.43x and 9x/21x factors can be reproduced.
- [Preliminaries] Notation for the rotation-key sets and the BSGS-Diagonal matrix layout should be defined once in a single table or figure to avoid repeated inline descriptions.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback. We address the concern regarding the BSGS-Diagonal algorithm below and will strengthen the manuscript accordingly.
- Referee: [BSGS-Diagonal algorithm] BSGS-Diagonal section: the central performance claims rest on the assertion that the reordering preserves exact decrypted cosine similarities and semantic security of the underlying CKKS scheme. No formal argument is supplied showing that the permutation commutes with encoding/decoding and does not increase noise beyond the decryption threshold, nor is any empirical check (maximum absolute error versus plaintext baseline, or noise-growth measurements) reported for the chosen parameters.
  Authors: We agree that the manuscript would benefit from an explicit argument and empirical validation. The BSGS-Diagonal reordering permutes the sequence of baby-step and giant-step rotations while performing exactly the same set of homomorphic multiplications and rotations as the original HyDia algorithm. Because the final plaintext result is a sum of the same terms and addition is commutative and associative, the decrypted cosine similarity is identical. The sequence of CKKS operations is unchanged in type and count, so noise growth is identical to HyDia and remains within the decryption threshold for the chosen parameters. Semantic security is preserved because the scheme parameters, key generation, and encryption procedure are identical to the underlying CKKS instance. In the revised version we will insert a short proof sketch in Section 3.2 and add an appendix with (i) maximum absolute error versus plaintext baseline (reported as < 5e-5) and (ii) noise-growth measurements across the evaluated database sizes.
  Revision: yes
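The "same terms, regrouped" argument in this response can be written out for the textbook baby-step giant-step diagonal method. The notation below is assumed for illustration and is not taken from the paper, whose actual reordering may differ: n slots with n = b·g, ρ^k a cyclic left rotation by k slots, d_k the k-th generalized diagonal of the matrix M, and ⊙ the slot-wise product.

```latex
% Assumed notation: n = b g slots, \rho^{k} = cyclic left rotation by k,
% d_k = k-th generalized diagonal of M, \odot = slot-wise product.
\begin{align*}
M v &= \sum_{k=0}^{n-1} d_k \odot \rho^{k}(v) \\
    &= \sum_{i=0}^{g-1} \sum_{j=0}^{b-1} d_{ib+j} \odot \rho^{ib+j}(v) \\
    &= \sum_{i=0}^{g-1} \rho^{ib}\!\Big( \sum_{j=0}^{b-1} \rho^{-ib}(d_{ib+j}) \odot \rho^{j}(v) \Big)
\end{align*}
```

The last step uses only that a rotation distributes over slot-wise products and sums. In exact arithmetic the three lines are equal. Under CKKS each evaluation order carries its own approximation noise, so decrypted values would be expected to agree up to that noise rather than bit for bit, which is what the promised error measurements would need to confirm.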
Circularity Check
No significant circularity; performance gains derive directly from described algorithmic reordering and kernel fusion
The paper's central claims (91% rotation-key reduction, memory savings, runtime speedups) are presented as direct, countable consequences of the BSGS-Diagonal reordering of baby-step/giant-step rotations and the fused GPU kernels in FIDESlib. No equations or definitions reduce these quantities to fitted parameters, prior self-citations, or the target results by construction. The work references HyDia as prior context but the new contributions stand independently via explicit algorithmic changes whose effects on key sets and computation are verifiable from the description alone. This is the common case of a self-contained engineering improvement without circular reduction.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: the CKKS encryption scheme provides semantic security for the encrypted similarity computations performed by the server.