Fast Homomorphic Linear Algebra with BLAS

Damien Stehl\'e; Guillaume Hanrot; Jai Hyun Park; Jung Hee Cheon; Youngjin Bae

arxiv: 2503.16080 · v2 · submitted 2025-03-20 · 💻 cs.CR

Fast Homomorphic Linear Algebra with BLAS

Youngjin Bae , Jung Hee Cheon , Guillaume Hanrot , Jai Hyun Park , Damien Stehl\'e This is my paper

Pith reviewed 2026-05-22 23:21 UTC · model grok-4.3

classification 💻 cs.CR

keywords homomorphic encryptionCKKSlinear algebraBLASmatrix multiplicationprivacy-preserving computationencrypted AI

0 comments

The pith

CKKS homomorphic matrix operations reduce directly to plaintext BLAS routines with a 4-12x slowdown for square matrix multiplication.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows how to map matrix-vector products, vector-vector products, and matrix multiplications in the CKKS homomorphic encryption scheme onto equivalent plaintext floating-point operations. These reductions let the computation reuse highly optimized BLAS libraries that already exploit hardware parallelism and cache behavior. A reader would care because many privacy-preserving applications in AI and scientific computing rely on exactly these linear algebra kernels, and the claimed overhead is small enough to make encrypted versions practical for moderate to large matrices.

Core claim

We provide reductions for matrix-vector products, vector-vector products for moderate-sized to large matrices to their plaintext equivalents. Combined with BLAS, we demonstrate that the efficiency loss between CKKS-based encrypted square matrix multiplication and double-precision floating-point square matrix multiplication is a mere 4-12 factor, depending on the precise situation.

What carries the argument

The explicit reductions that translate each homomorphic matrix operation into one or more calls to standard plaintext linear-algebra routines while preserving correctness.

If this is right

Encrypted matrix multiplication becomes feasible at scales where plaintext BLAS already runs efficiently.
Existing high-performance linear-algebra code bases can be reused for homomorphic workloads with only a small constant-factor penalty.
Applications that need many matrix-vector or dot-product steps can switch to the encrypted setting without rewriting their core loops.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If the reductions compose cleanly, the same technique might apply to higher-level algorithms such as iterative solvers or neural-network layers that are built from these primitives.
Hardware vendors optimizing BLAS could indirectly accelerate encrypted computation by the same factor without any changes to the encryption library.
The approach separates the cryptographic noise management from the arithmetic scheduling, which may simplify porting to new hardware accelerators.

Load-bearing premise

The reductions from homomorphic CKKS operations to plaintext linear algebra preserve both correctness and the claimed efficiency without hidden overheads from encoding, noise management, or memory access patterns that are not captured by the high-level mapping.

What would settle it

A side-by-side timing measurement, on the same hardware, of a direct CKKS matrix-multiplication implementation versus the same computation after applying the paper's reductions and calling BLAS, for matrix sizes between 256 and 4096.

Figures

Figures reproduced from arXiv: 2503.16080 by Damien Stehl\'e, Guillaume Hanrot, Jai Hyun Park, Jung Hee Cheon, Youngjin Bae.

**Figure 1.** Figure 1: Visualization of a C-MT application to re-formatting client-wise ciphertexts into feature-wise ciphertexts. for 0 ≤ j < N, where m(X) = P 0≤i<N miXi is an element in Rq, and Gal(R/Z) is the group of automorphisms of Q[X]/(XN + 1) that fix Q. By abuse, we let the latter act on Rq; if m = P 0≤i<N miXi ∈ Rq, we define σ(m) = P 0≤i<N miσ(X) i ∈ Rq. Given N plaintexts {mi = P j Mi,jXj}0≤i<N in Rq that each enc… view at source ↗

read the original abstract

Homomorphic encryption is a cryptographic paradigm allowing to compute on encrypted data, opening a wide range of applications in privacy-preserving data manipulation, notably in AI. Many of those applications require significant linear algebra computations (matrix-vector products, and matrix-matrix products). This central role of linear algebra computations goes far beyond homomorphic algebra and applies to most areas of scientific computing. This high versatility led, over time, to the development of a set of highly optimized routines, specified in 1979 under the name BLAS (basic linear algebra subroutines). Motivated both by the applicative importance of homomorphic linear algebra and the access to highly efficient implementations of cleartext linear algebra able to draw the most out of available hardware, we explore the connections between CKKS-based homomorphic linear algebra and floating-point plaintext linear algebra. The CKKS homomorphic encryption system is the most natural choice in this setting, as it natively handles real numbers and offers a large SIMD parallelism. We provide reductions for matrix-vector products, vector-vector products for moderate-sized to large matrices to their plaintext equivalents. Combined with BLAS, we demonstrate that the efficiency loss between CKKS-based encrypted square matrix multiplication and double-precision floating-point square matrix multiplication is a mere 4-12 factor, depending on the precise situation.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper claims to establish reductions mapping CKKS homomorphic matrix-vector and vector-vector products (for moderate-to-large matrices) onto equivalent plaintext linear-algebra operations. When these reductions are combined with existing BLAS libraries, the authors report that the slowdown of CKKS-based encrypted square matrix multiplication relative to double-precision plaintext multiplication is only a factor of 4-12, depending on the precise setting.

Significance. If the reductions are shown to preserve both correctness and the claimed performance without unaccounted overheads from rotations, key-switching or encoding, the result would be significant: it would demonstrate that homomorphic linear algebra can directly exploit decades of hardware-specific optimization in plaintext BLAS, substantially lowering the barrier to practical privacy-preserving linear-algebra workloads in machine learning and scientific computing.

major comments (2)

[Abstract] Abstract: the central performance claim (4-12× slowdown) rests on the assertion that the listed homomorphic operations reduce to plaintext BLAS calls. The manuscript provides no explicit accounting of the cost of ciphertext rotations, key-switching, or noise-management steps that are required by CKKS but are not obviously expressible as BLAS primitives; if these steps dominate the runtime, the reported factor understates the true overhead.
[Experimental evaluation] The experimental section does not report matrix dimensions, number of repetitions, or whether timings include encoding/decoding and rotation-key application. Without these details it is impossible to verify that the measured 4-12× ratio isolates the linear-algebra kernel as claimed.

minor comments (2)

Notation for ciphertext packing and SIMD slot alignment is introduced without a dedicated preliminary section, making the reduction arguments harder to follow for readers outside the immediate subfield.
[Abstract] The abstract refers to “moderate-sized to large matrices” without quantitative bounds; a short table or sentence giving the range of n for which the reductions apply would improve clarity.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their careful reading and constructive comments. We address the major comments below and will revise the manuscript accordingly to improve clarity on performance accounting and experimental details.

read point-by-point responses

Referee: [Abstract] Abstract: the central performance claim (4-12× slowdown) rests on the assertion that the listed homomorphic operations reduce to plaintext BLAS calls. The manuscript provides no explicit accounting of the cost of ciphertext rotations, key-switching, or noise-management steps that are required by CKKS but are not obviously expressible as BLAS primitives; if these steps dominate the runtime, the reported factor understates the true overhead.

Authors: Our reductions are constructed to express the dominant costs of CKKS matrix-vector and matrix-matrix multiplication as sequences of plaintext BLAS operations. The auxiliary costs of rotations, key-switching, and noise management are either integrated into the reduction (via batched plaintext equivalents) or captured in the end-to-end timings that produce the reported 4-12× factor. To make this accounting explicit, we will add a dedicated subsection with a cost breakdown of each CKKS primitive relative to the BLAS calls. revision: yes
Referee: [Experimental evaluation] The experimental section does not report matrix dimensions, number of repetitions, or whether timings include encoding/decoding and rotation-key application. Without these details it is impossible to verify that the measured 4-12× ratio isolates the linear-algebra kernel as claimed.

Authors: We agree that these experimental parameters must be stated explicitly for reproducibility. The revised manuscript will report the exact matrix dimensions evaluated, the number of repetitions used to obtain stable timings, and confirm that all measurements include encoding, decoding, and rotation-key application steps. This will demonstrate that the 4-12× ratio reflects the complete homomorphic linear-algebra workload. revision: yes

Circularity Check

0 steps flagged

No circularity; reductions map to independent external BLAS implementations

full rationale

The paper's derivation consists of explicit algorithmic reductions that map CKKS matrix-vector and vector-vector products onto plaintext equivalents, then invokes independently developed and benchmarked BLAS libraries for the plaintext side. The reported 4-12x slowdown factors are obtained by direct timing comparisons against double-precision floating-point matrix multiplication rather than by any quantity defined in terms of the homomorphic result itself. No self-definitional steps, fitted inputs renamed as predictions, load-bearing self-citations, uniqueness theorems, or ansatzes smuggled via prior work appear in the abstract or described chain; the central efficiency claim therefore remains self-contained against external plaintext benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on the standard algebraic properties of the CKKS scheme and the existence of optimized BLAS implementations; no new free parameters, invented entities, or ad-hoc axioms are introduced in the abstract.

axioms (2)

domain assumption CKKS natively supports real-number arithmetic and SIMD-style packing of multiple values into ciphertexts
Invoked to justify why CKKS is the natural choice for linear algebra over encrypted data.
standard math BLAS libraries deliver near-optimal performance for plaintext floating-point matrix operations on available hardware
Used as the performance baseline for the claimed overhead factor.

pith-pipeline@v0.9.0 · 5767 in / 1380 out tokens · 91113 ms · 2026-05-22T23:21:37.876738+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

58 extracted references · 58 canonical work pages · 1 internal anchor

[1]

OpenBLAS: An optimized BLAS library – version 0.3.26, available at https://www.openblas.net/

work page
[2]

754-2019 - ieee standard for floating-point arithmetic (2019)

work page 2019
[3]

Albrecht, M.R., Player, R., Scott, S.: On the concrete hardness of learning with errors. J. Math. Cryptol. (2015), software available at https://github.com/malb/lattice-estimator

work page 2015
[4]

In: USENIX (2021)

Ali, A., Lepoint, T., Patel, S., Raykova, M., Schoppmann, P., Seth, K., Yeo, K.: Communication-computation trade-offs in PIR. In: USENIX (2021)

work page 2021
[5]

In: SODA (2025)

Alman, J., Duan, R., Williams, V.V., Xu, Y., Xu, Z., Zhou, R.: More asymmetry yields faster matrix multiplica- tion. In: SODA (2025)

work page 2025
[6]

In: IEEE S&P (2018)

Angel, S., Chen, H., Laine, K., Setty, S.T.V.: PIR with compressed queries and amortized query processing. In: IEEE S&P (2018)

work page 2018
[7]

In: CRYPTO (2024)

Bae, Y., Cheon, J.H., Hanrot, G., Park, J.H., Stehl´ e, D.: Plaintext-ciphertext matrix multiplication and FHE bootstrapping: Fast and fused. In: CRYPTO (2024)

work page 2024
[8]

In: CRYPTO (2023)

Bae, Y., Cheon, J.H., Kim, J., Park, J.H., Stehl´ e, D.: HERMES: efficient ring packing using MLWE ciphertexts and application to transciphering. In: CRYPTO (2023)

work page 2023
[9]

In: SAC (2016)

Bajard, J.C., Eynard, J., Hasan, M.A., Zucca, V.: A full RNS variant of FV like somewhat homomorphic encryp- tion schemes. In: SAC (2016)

work page 2016
[10]

In: TCC (2005)

Boneh, D., Goh, E.J., Nissim, K.: Evaluating 2-DNF formulas on ciphertexts. In: TCC (2005)

work page 2005
[11]

Boura, C., Gama, N., Georgieva, M., Jetchev, D.: CHIMERA: combining Ring-LWE-based fully homomorphic encryption schemes. J. Math. Cryptol. (2020)

work page 2020
[12]

In: CRYPTO (2012)

Brakerski, Z.: Fully homomorphic encryption without modulus switching from classical gapSVP. In: CRYPTO (2012)

work page 2012
[13]

ACM Trans

Brakerski, Z., Gentry, C., Vaikuntanathan, V.: (Leveled) fully homomorphic encryption without bootstrapping. ACM Trans. Comput. Theory (2014)

work page 2014
[14]

In: STOC (2013)

Brakerski, Z., Langlois, A., Peikert, C., Regev, O., Stehl´ e, D.: Classical hardness of learning with errors. In: STOC (2013)

work page 2013
[15]

In: CRYPTO (2016)

Brakerski, Z., Perlman, R.: Lattice-based fully dynamic multi-key FHE with short ciphertexts. In: CRYPTO (2016)

work page 2016
[16]

In: ACNS (2021)

Chen, H., Dai, W., Kim, M., Song, Y.: Homomorphic conversion between (ring) LWE ciphertexts. In: ACNS (2021)

work page 2021
[17]

IEEE Trans

Chen, J., Yang, L., Wu, W., Liu, Y., Feng, Y.: Homomorphic matrix operations under bicyclic encoding. IEEE Trans. Inf. Forensics Secur. (2024)

work page 2024
[18]

In: EUROCRYPT (2018)

Cheon, J.H., Han, K., Kim, A., Kim, M., Song, Y.: Bootstrapping for approximate homomorphic encryption. In: EUROCRYPT (2018)

work page 2018
[19]

In: ASIACRYPT (2017)

Cheon, J.H., Kim, A., Kim, M., Song, Y.: Homomorphic encryption for arithmetic of approximate numbers. In: ASIACRYPT (2017)

work page 2017
[20]

In: ASIACRYPT (2017)

Chillotti, I., Gama, N., Georgieva, M., Izabach` ene, M.: Faster packed homomorphic operations and efficient circuit bootstrapping for TFHE. In: ASIACRYPT (2017)

work page 2017
[21]

CryptoLab: HEaaN library (2022), available at https://www.cryptolab.co.kr/en/products-en/heaan-he/ 42

work page 2022
[22]

In: NAACL-HLT (2019)

Devlin, J., Chang, M.W., Lee, K., Toutanova, K.: BERT: Pre-training of deep bidirectional transformers for language understanding. In: NAACL-HLT (2019)

work page 2019
[23]

Ding, Y., Guo, H., Guan, Y., Liu, W., Huo, J., Guan, Z., Zhang, X.: East: Efficient and accurate secure transformer framework for inference (2023), available at https://arxiv.org/abs/2308.09923

work page arXiv 2023
[24]

ACM Trans

Dumas, J.G., Giorgi, P., Pernet, C.: Dense linear algebra over word-size prime fields: the FFLAS and FFPACK packages. ACM Trans. Math. Softw. (2008)

work page 2008
[25]

Available at http://eprint.iacr

Fan, J., Vercauteren, F.: Somewhat practical fully homomorphic encryption. Available at http://eprint.iacr. org/2012/144 (2012)

work page 2012
[26]

(2014), http://linalg.org/projects/fflas-ffpack

FFLAS14, T.F.F.G.: FFLAS-FFPACK: Finite Field Linear Algebra Subroutines / Package, v2.0.0 edn. (2014), http://linalg.org/projects/fflas-ffpack

work page 2014
[27]

In: IEEE S&P (2023)

Froelicher, D., Cho, H., Edupalli, M., Sousa, J.S., Bossuat, J.P., Pyrgelis, A., Troncoso-Pastoriza, J.R., Berger, B., Hubaux, J.P.: Scalable and privacy-preserving federated principal component analysis. In: IEEE S&P (2023)

work page 2023
[28]

The Journal of Supercomputing (2024)

Gao, Y., Quan, G., Homsi, S., Wen, W., Wang, L.: Secure and efficient general matrix multiplication on cloud using homomorphic encryption. The Journal of Supercomputing (2024)

work page 2024
[29]

In: CRYPTO (2012)

Gentry, C., Halevi, S., Smart, N.P.: Homomorphic evaluation of the aes circuit. In: CRYPTO (2012)

work page 2012
[30]

In: CRYPTO (2013)

Gentry, C., Sahai, A., Waters, B.: Homomorphic encryption from learning with errors: Conceptually-simpler, asymptotically-faster, attribute-based. In: CRYPTO (2013)

work page 2013
[31]

In: EUROCRYPT (2010)

Gentry, C., Halevi, S., Vaikuntanathan, V.: A simple BGN-type cryptosystem from LWE. In: EUROCRYPT (2010)

work page 2010
[32]

In: ICS (2010)

Goldwasser, S., Kalai, Y.T., Peikert, C., Vaikuntanathan, V.: Robustness of the learning with errors assumption. In: ICS (2010)

work page 2010
[33]

arXiv preprint arXiv:2312.14250 (2023)

G¨ unther, M., Sch¨ utze, L., Becher, K., Strufe, T., Castrillon, J.: Helium: A language and compiler for fully homomorphic encryption with support for proxy re-encryption. arXiv preprint arXiv:2312.14250 (2023)

work page arXiv 2023
[34]

In: CRYPTO (2014)

Halevi, S., Shoup, V.: Algorithms in HElib. In: CRYPTO (2014)

work page 2014
[35]

In: CRYPTO (2018)

Halevi, S., Shoup, V.: Faster homomorphic linear transformations in HElib. In: CRYPTO (2018)

work page 2018
[36]

In: NIPS (2022)

Hao, M., Li, H., Chen, H., Xing, P., Xu, G., Zhang, T.: Iron: Private inference on transformers. In: NIPS (2022)

work page 2022
[37]

In: CCS (2024)

He, J., Yang, K., Tang, G., Huang, Z., Lin, L., Wei, C., Yan, Y., Wang, W.: Rhombus: Fast homomorphic matrix-vector multiplication for secure two-party inference. In: CCS (2024)

work page 2024
[38]

In: SOSP (2023)

Henzinger, A., Dauterman, E., Corrigan-Gibbs, H., Zeldovich, N.: Private web search with tiptoe. In: SOSP (2023)

work page 2023
[39]

Hong, S., Choi, Y.A., Joo, D.S., G¨ ursoy, G.: Privacy-preserving model evaluation for logistic and linear regression using homomorphically encrypted genotype data. J. Biomed. Informs. (2024)

work page 2024
[40]

IEEE Trans

Huang, Z., Hong, C., Weng, C., Lu, W., Qu, H.: More efficient secure matrix multiplication for unbalanced recommender systems. IEEE Trans. Dependable Secure Comput. (2021)

work page 2021
[41]

In: CCS (2018)

Jiang, X., Kim, M., Lauter, K., Song, Y.: Secure outsourced matrix computation and application to neural networks. In: CCS (2018)

work page 2018
[42]

Langlois, A., Stehl´ e, D.: Worst-case to average-case reductions for module lattices. Des. Codes Cryptogr. (2015)

work page 2015
[43]

In: ASIACRYPT (2023)

Lee, J.W., Lee, E., Kim, Y.S., No, J.S.: Rotation key reduction for client-server systems of deep neural network on fully homomorphic encryption. In: ASIACRYPT (2023)

work page 2023
[44]

IEEE Trans

Liu, J., Zhang, L.F.: Privacy-preserving and publicly verifiable matrix multiplication. IEEE Trans. Services Comput. (2022)

work page 2022
[45]

In: EUROCRYPT (2010)

Lyubashevsky, V., Peikert, C., Regev, O.: On ideal lattices and learning with errors over rings. In: EUROCRYPT (2010)

work page 2010
[46]

Ma, X., Ma, C., Jiang, Y., Ge, C.: Improved privacy-preserving PCA using optimized homomorphic matrix multiplication. Comput. Secur. (2024)

work page 2024
[47]

In: PETS (2021)

Mouchet, C., Troncoso-Pastoriza, J., Bossuat, J.P., Hubaux, J.P.: Multiparty homomorphic encryption from ring-learning-with-errors. In: PETS (2021)

work page 2021
[48]

In: IEEE S&P (2024)

Pang, Q., Zhu, J., M¨ ollering, H., Zheng, W., Schneider, T.: BOLT: Privacy-preserving, accurate and efficient inference for transformers. In: IEEE S&P (2024)

work page 2024
[49]

In: EUROCRYPT (2025)

Park, J.H.: Ciphertext-ciphertext matrix multiplication: Fast for large matrices. In: EUROCRYPT (2025)

work page 2025
[50]

In: STOC (2008)

Peikert, C., Waters, B.: Lossy trapdoor functions and their applications. In: STOC (2008)

work page 2008
[51]

Radford, A., Narasimhan, K., Salimans, T., Sutskever, I.: Improving language understanding by generative pre- training (2018), available at https://openai.com/research/language-unsupervised

work page 2018
[52]

In: DAC (2023) 43

Ren, X., Chen, Z., Gu, Z., Lu, Y., Zhong, R., Lu, W., Zhang, J., Zhang, Y., Wu, H., Zheng, X., Liu, H., Chu, T., Hong, C., Wei, C., Niu, D., Xie, Y.: CHAM: A customized homomorphic encryption accelerator for fast matrix-vector product. In: DAC (2023) 43

work page 2023
[53]

In: ASIACRYPT (2009)

Stehl´ e, D., Steinfeld, R., Tanaka, K., Xagawa, K.: Efficient public key encryption based on ideal lattices. In: ASIACRYPT (2009)

work page 2009
[54]

The FLINT team: FLINT: Fast Library for Number Theory (2023), version 3.0.0, https://flintlib.org

work page 2023
[55]

Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.A., Lacroix, T., Rozi` ere, B., Goyal, N., Hambro, E., Azhar, F., Rodriguez, A., Joulin, A., Grave, E., Lample, G.: LLaMA: Open and efficient foundation language models (2023), available at https://arxiv.org/abs/2302.13971

work page internal anchor Pith review Pith/arXiv arXiv 2023
[56]

Zhang, J., Liu, J., Yang, X., Wang, Y., Chen, K., Hou, X., Ren, K., Yang, X.: Secure transformer inference made non-interactive (2023), available at https://eprint.iacr.org/2024/136

work page 2023
[57]

Zheng, X., Li, H., Wang, D.: A new framework for fast homomorphic matrix multiplication (2023), available at https://eprint.iacr.org/2023/1649

work page 2023
[58]

In: MSN (2023) 44

Zhu, W., Li, X.: Secure mutual learning with low interactions for deep model training. In: MSN (2023) 44

work page 2023

[1] [1]

OpenBLAS: An optimized BLAS library – version 0.3.26, available at https://www.openblas.net/

work page

[2] [2]

754-2019 - ieee standard for floating-point arithmetic (2019)

work page 2019

[3] [3]

Albrecht, M.R., Player, R., Scott, S.: On the concrete hardness of learning with errors. J. Math. Cryptol. (2015), software available at https://github.com/malb/lattice-estimator

work page 2015

[4] [4]

In: USENIX (2021)

Ali, A., Lepoint, T., Patel, S., Raykova, M., Schoppmann, P., Seth, K., Yeo, K.: Communication-computation trade-offs in PIR. In: USENIX (2021)

work page 2021

[5] [5]

In: SODA (2025)

Alman, J., Duan, R., Williams, V.V., Xu, Y., Xu, Z., Zhou, R.: More asymmetry yields faster matrix multiplica- tion. In: SODA (2025)

work page 2025

[6] [6]

In: IEEE S&P (2018)

Angel, S., Chen, H., Laine, K., Setty, S.T.V.: PIR with compressed queries and amortized query processing. In: IEEE S&P (2018)

work page 2018

[7] [7]

In: CRYPTO (2024)

Bae, Y., Cheon, J.H., Hanrot, G., Park, J.H., Stehl´ e, D.: Plaintext-ciphertext matrix multiplication and FHE bootstrapping: Fast and fused. In: CRYPTO (2024)

work page 2024

[8] [8]

In: CRYPTO (2023)

Bae, Y., Cheon, J.H., Kim, J., Park, J.H., Stehl´ e, D.: HERMES: efficient ring packing using MLWE ciphertexts and application to transciphering. In: CRYPTO (2023)

work page 2023

[9] [9]

In: SAC (2016)

Bajard, J.C., Eynard, J., Hasan, M.A., Zucca, V.: A full RNS variant of FV like somewhat homomorphic encryp- tion schemes. In: SAC (2016)

work page 2016

[10] [10]

In: TCC (2005)

Boneh, D., Goh, E.J., Nissim, K.: Evaluating 2-DNF formulas on ciphertexts. In: TCC (2005)

work page 2005

[11] [11]

Boura, C., Gama, N., Georgieva, M., Jetchev, D.: CHIMERA: combining Ring-LWE-based fully homomorphic encryption schemes. J. Math. Cryptol. (2020)

work page 2020

[12] [12]

In: CRYPTO (2012)

Brakerski, Z.: Fully homomorphic encryption without modulus switching from classical gapSVP. In: CRYPTO (2012)

work page 2012

[13] [13]

ACM Trans

Brakerski, Z., Gentry, C., Vaikuntanathan, V.: (Leveled) fully homomorphic encryption without bootstrapping. ACM Trans. Comput. Theory (2014)

work page 2014

[14] [14]

In: STOC (2013)

Brakerski, Z., Langlois, A., Peikert, C., Regev, O., Stehl´ e, D.: Classical hardness of learning with errors. In: STOC (2013)

work page 2013

[15] [15]

In: CRYPTO (2016)

Brakerski, Z., Perlman, R.: Lattice-based fully dynamic multi-key FHE with short ciphertexts. In: CRYPTO (2016)

work page 2016

[16] [16]

In: ACNS (2021)

Chen, H., Dai, W., Kim, M., Song, Y.: Homomorphic conversion between (ring) LWE ciphertexts. In: ACNS (2021)

work page 2021

[17] [17]

IEEE Trans

Chen, J., Yang, L., Wu, W., Liu, Y., Feng, Y.: Homomorphic matrix operations under bicyclic encoding. IEEE Trans. Inf. Forensics Secur. (2024)

work page 2024

[18] [18]

In: EUROCRYPT (2018)

Cheon, J.H., Han, K., Kim, A., Kim, M., Song, Y.: Bootstrapping for approximate homomorphic encryption. In: EUROCRYPT (2018)

work page 2018

[19] [19]

In: ASIACRYPT (2017)

Cheon, J.H., Kim, A., Kim, M., Song, Y.: Homomorphic encryption for arithmetic of approximate numbers. In: ASIACRYPT (2017)

work page 2017

[20] [20]

In: ASIACRYPT (2017)

Chillotti, I., Gama, N., Georgieva, M., Izabach` ene, M.: Faster packed homomorphic operations and efficient circuit bootstrapping for TFHE. In: ASIACRYPT (2017)

work page 2017

[21] [21]

CryptoLab: HEaaN library (2022), available at https://www.cryptolab.co.kr/en/products-en/heaan-he/ 42

work page 2022

[22] [22]

In: NAACL-HLT (2019)

Devlin, J., Chang, M.W., Lee, K., Toutanova, K.: BERT: Pre-training of deep bidirectional transformers for language understanding. In: NAACL-HLT (2019)

work page 2019

[23] [23]

Ding, Y., Guo, H., Guan, Y., Liu, W., Huo, J., Guan, Z., Zhang, X.: East: Efficient and accurate secure transformer framework for inference (2023), available at https://arxiv.org/abs/2308.09923

work page arXiv 2023

[24] [24]

ACM Trans

Dumas, J.G., Giorgi, P., Pernet, C.: Dense linear algebra over word-size prime fields: the FFLAS and FFPACK packages. ACM Trans. Math. Softw. (2008)

work page 2008

[25] [25]

Available at http://eprint.iacr

Fan, J., Vercauteren, F.: Somewhat practical fully homomorphic encryption. Available at http://eprint.iacr. org/2012/144 (2012)

work page 2012

[26] [26]

(2014), http://linalg.org/projects/fflas-ffpack

FFLAS14, T.F.F.G.: FFLAS-FFPACK: Finite Field Linear Algebra Subroutines / Package, v2.0.0 edn. (2014), http://linalg.org/projects/fflas-ffpack

work page 2014

[27] [27]

In: IEEE S&P (2023)

Froelicher, D., Cho, H., Edupalli, M., Sousa, J.S., Bossuat, J.P., Pyrgelis, A., Troncoso-Pastoriza, J.R., Berger, B., Hubaux, J.P.: Scalable and privacy-preserving federated principal component analysis. In: IEEE S&P (2023)

work page 2023

[28] [28]

The Journal of Supercomputing (2024)

Gao, Y., Quan, G., Homsi, S., Wen, W., Wang, L.: Secure and efficient general matrix multiplication on cloud using homomorphic encryption. The Journal of Supercomputing (2024)

work page 2024

[29] [29]

In: CRYPTO (2012)

Gentry, C., Halevi, S., Smart, N.P.: Homomorphic evaluation of the aes circuit. In: CRYPTO (2012)

work page 2012

[30] [30]

In: CRYPTO (2013)

Gentry, C., Sahai, A., Waters, B.: Homomorphic encryption from learning with errors: Conceptually-simpler, asymptotically-faster, attribute-based. In: CRYPTO (2013)

work page 2013

[31] [31]

In: EUROCRYPT (2010)

Gentry, C., Halevi, S., Vaikuntanathan, V.: A simple BGN-type cryptosystem from LWE. In: EUROCRYPT (2010)

work page 2010

[32] [32]

In: ICS (2010)

Goldwasser, S., Kalai, Y.T., Peikert, C., Vaikuntanathan, V.: Robustness of the learning with errors assumption. In: ICS (2010)

work page 2010

[33] [33]

arXiv preprint arXiv:2312.14250 (2023)

G¨ unther, M., Sch¨ utze, L., Becher, K., Strufe, T., Castrillon, J.: Helium: A language and compiler for fully homomorphic encryption with support for proxy re-encryption. arXiv preprint arXiv:2312.14250 (2023)

work page arXiv 2023

[34] [34]

In: CRYPTO (2014)

Halevi, S., Shoup, V.: Algorithms in HElib. In: CRYPTO (2014)

work page 2014

[35] [35]

In: CRYPTO (2018)

Halevi, S., Shoup, V.: Faster homomorphic linear transformations in HElib. In: CRYPTO (2018)

work page 2018

[36] [36]

In: NIPS (2022)

Hao, M., Li, H., Chen, H., Xing, P., Xu, G., Zhang, T.: Iron: Private inference on transformers. In: NIPS (2022)

work page 2022

[37] [37]

In: CCS (2024)

He, J., Yang, K., Tang, G., Huang, Z., Lin, L., Wei, C., Yan, Y., Wang, W.: Rhombus: Fast homomorphic matrix-vector multiplication for secure two-party inference. In: CCS (2024)

work page 2024

[38] [38]

In: SOSP (2023)

Henzinger, A., Dauterman, E., Corrigan-Gibbs, H., Zeldovich, N.: Private web search with tiptoe. In: SOSP (2023)

work page 2023

[39] [39]

Hong, S., Choi, Y.A., Joo, D.S., G¨ ursoy, G.: Privacy-preserving model evaluation for logistic and linear regression using homomorphically encrypted genotype data. J. Biomed. Informs. (2024)

work page 2024

[40] [40]

IEEE Trans

Huang, Z., Hong, C., Weng, C., Lu, W., Qu, H.: More efficient secure matrix multiplication for unbalanced recommender systems. IEEE Trans. Dependable Secure Comput. (2021)

work page 2021

[41] [41]

In: CCS (2018)

Jiang, X., Kim, M., Lauter, K., Song, Y.: Secure outsourced matrix computation and application to neural networks. In: CCS (2018)

work page 2018

[42] [42]

Langlois, A., Stehl´ e, D.: Worst-case to average-case reductions for module lattices. Des. Codes Cryptogr. (2015)

work page 2015

[43] [43]

In: ASIACRYPT (2023)

Lee, J.W., Lee, E., Kim, Y.S., No, J.S.: Rotation key reduction for client-server systems of deep neural network on fully homomorphic encryption. In: ASIACRYPT (2023)

work page 2023

[44] [44]

IEEE Trans

Liu, J., Zhang, L.F.: Privacy-preserving and publicly verifiable matrix multiplication. IEEE Trans. Services Comput. (2022)

work page 2022

[45] [45]

In: EUROCRYPT (2010)

Lyubashevsky, V., Peikert, C., Regev, O.: On ideal lattices and learning with errors over rings. In: EUROCRYPT (2010)

work page 2010

[46] [46]

Ma, X., Ma, C., Jiang, Y., Ge, C.: Improved privacy-preserving PCA using optimized homomorphic matrix multiplication. Comput. Secur. (2024)

work page 2024

[47] [47]

In: PETS (2021)

Mouchet, C., Troncoso-Pastoriza, J., Bossuat, J.P., Hubaux, J.P.: Multiparty homomorphic encryption from ring-learning-with-errors. In: PETS (2021)

work page 2021

[48] [48]

In: IEEE S&P (2024)

Pang, Q., Zhu, J., M¨ ollering, H., Zheng, W., Schneider, T.: BOLT: Privacy-preserving, accurate and efficient inference for transformers. In: IEEE S&P (2024)

work page 2024

[49] [49]

In: EUROCRYPT (2025)

Park, J.H.: Ciphertext-ciphertext matrix multiplication: Fast for large matrices. In: EUROCRYPT (2025)

work page 2025

[50] [50]

In: STOC (2008)

Peikert, C., Waters, B.: Lossy trapdoor functions and their applications. In: STOC (2008)

work page 2008

[51] [51]

Radford, A., Narasimhan, K., Salimans, T., Sutskever, I.: Improving language understanding by generative pre- training (2018), available at https://openai.com/research/language-unsupervised

work page 2018

[52] [52]

In: DAC (2023) 43

Ren, X., Chen, Z., Gu, Z., Lu, Y., Zhong, R., Lu, W., Zhang, J., Zhang, Y., Wu, H., Zheng, X., Liu, H., Chu, T., Hong, C., Wei, C., Niu, D., Xie, Y.: CHAM: A customized homomorphic encryption accelerator for fast matrix-vector product. In: DAC (2023) 43

work page 2023

[53] [53]

In: ASIACRYPT (2009)

Stehl´ e, D., Steinfeld, R., Tanaka, K., Xagawa, K.: Efficient public key encryption based on ideal lattices. In: ASIACRYPT (2009)

work page 2009

[54] [54]

The FLINT team: FLINT: Fast Library for Number Theory (2023), version 3.0.0, https://flintlib.org

work page 2023

[55] [55]

Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M.A., Lacroix, T., Rozi` ere, B., Goyal, N., Hambro, E., Azhar, F., Rodriguez, A., Joulin, A., Grave, E., Lample, G.: LLaMA: Open and efficient foundation language models (2023), available at https://arxiv.org/abs/2302.13971

work page internal anchor Pith review Pith/arXiv arXiv 2023

[56] [56]

Zhang, J., Liu, J., Yang, X., Wang, Y., Chen, K., Hou, X., Ren, K., Yang, X.: Secure transformer inference made non-interactive (2023), available at https://eprint.iacr.org/2024/136

work page 2023

[57] [57]

Zheng, X., Li, H., Wang, D.: A new framework for fast homomorphic matrix multiplication (2023), available at https://eprint.iacr.org/2023/1649

work page 2023

[58] [58]

In: MSN (2023) 44

Zhu, W., Li, X.: Secure mutual learning with low interactions for deep model training. In: MSN (2023) 44

work page 2023