Volley Revolver: A Novel Matrix-Encoding Method for Privacy-Preserving Neural Networks (Inference)

John Chiang

arxiv: 2201.12577 · v10 · submitted 2022-01-29 · 💻 cs.CR · cs.CV

Pith reviewed 2026-05-24 11:56 UTC · model grok-4.3

keywords: homomorphic encryption · privacy-preserving inference · matrix encoding · convolutional neural networks · encrypted convolution · MNIST classification · ciphertext matrix multiplication

The pith

A matrix-encoding method performs homomorphic matrix multiplication and convolution directly on ciphertexts for neural network inference.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents an encoding technique that encrypts one matrix and the transpose of the other, then applies additional operations to compute their product over encrypted data. For convolutions it first expands each kernel into a full-size matrix matching the input image dimensions, producing multiple ciphertexts that combine with the encrypted input to yield partial results. These are accumulated to finish the convolution step. The method is demonstrated by running a convolutional network on encrypted MNIST images, completing inference on 32 examples in roughly 287 seconds on a 40-vCPU cloud instance while requiring only one 19.8 MB ciphertext upload. A reader would care because the approach keeps all intermediate values encrypted, allowing a cloud service to classify private images without ever seeing them in plaintext.

Core claim

For two matrices A and B, encrypt A and the transpose of B into separate ciphertexts; homomorphic matrix multiplication then follows from a sequence of additional operations on those ciphertexts. For convolution, each kernel is first spanned into a matrix the same size as the input image, generating several ciphertexts; each such ciphertext is multiplied with the ciphertext of the input image to produce a portion of the convolution output, after which the portions are summed to recover the full result.
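
The matrix-multiplication half of this claim can be illustrated in plaintext. The sketch below is one plausible reading of "encrypt A and the transpose of B, then apply additional operations", not the paper's algorithm: it pairs row i of A with a cyclically shifted row of B-transpose, takes an elementwise product, and sums, which are the kinds of operations a SIMD-style homomorphic scheme typically offers. The function names and the rotation schedule are assumptions made for illustration, and nothing here is encrypted.

```python
def transpose(M):
    """Return the transpose of a list-of-lists matrix."""
    return [list(col) for col in zip(*M)]


def matmul_via_transpose(A, B):
    """Compute A @ B from rows of A and rows of B^T only.

    Plaintext stand-in (an assumption, not the paper's procedure): each pass r
    pairs row i of A with row (i + r) mod n of B^T, multiplies elementwise,
    and sums the products to fill one entry per row of the result.
    """
    m, k, n = len(A), len(B), len(B[0])
    assert all(len(row) == k for row in A), "inner dimensions must agree"
    Bt = transpose(B)  # n x k
    C = [[0] * n for _ in range(m)]
    for r in range(n):  # one cyclic shift of B^T's rows per pass
        for i in range(m):
            j = (i + r) % n  # row of B^T aligned with row i of A in this pass
            C[i][j] = sum(a * b for a, b in zip(A[i], Bt[j]))
    return C


def matmul_naive(A, B):
    """Textbook triple-loop product, used only as a reference."""
    return [[sum(A[i][t] * B[t][j] for t in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(matmul_via_transpose(A, B))  # [[19, 22], [43, 50]]
```

Each pass fills exactly one entry in every row of the result, so n passes cover the whole product. Whether the paper schedules its ciphertext operations this way cannot be determined from the text reviewed here.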

What carries the argument

The matrix-encoding method that converts matrix multiplication into ciphertext operations on A and B-transpose, together with the kernel-spanning procedure that turns each convolution into a set of matrix multiplications over ciphertexts.
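
The kernel-spanning idea can be sketched in plaintext as follows. This is a hedged reconstruction from the abstract's wording only: the kernel is tiled across an image-sized zero matrix at a given offset, the tiled matrix is multiplied elementwise with the image, and each tile is summed to give some of the output entries; repeating over all offsets and accumulating gives the full result. The tiling layout, the function names, and the choices of stride 1, no padding, and cross-correlation (the usual CNN convention) are all assumptions.

```python
def span_kernel(kernel, H, W, a, b):
    """Tile `kernel` over an H x W zero matrix, starting at offset (a, b).

    Returns the spanned matrix and the top-left corners of the tiles.
    Tiles never overlap because the step equals the kernel size.
    """
    kh, kw = len(kernel), len(kernel[0])
    spanned = [[0] * W for _ in range(H)]
    anchors = []
    for i in range(a, H - kh + 1, kh):
        for j in range(b, W - kw + 1, kw):
            anchors.append((i, j))
            for di in range(kh):
                for dj in range(kw):
                    spanned[i + di][j + dj] = kernel[di][dj]
    return spanned, anchors


def conv_via_spanning(image, kernel):
    """Accumulate partial results from every spanned copy of the kernel."""
    H, W = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = [[0] * (W - kw + 1) for _ in range(H - kh + 1)]
    for a in range(kh):
        for b in range(kw):
            spanned, anchors = span_kernel(kernel, H, W, a, b)
            # Elementwise product: the plaintext analogue of multiplying
            # one spanned-kernel ciphertext with the image ciphertext.
            prod = [[image[y][x] * spanned[y][x] for x in range(W)]
                    for y in range(H)]
            for i, j in anchors:
                out[i][j] += sum(prod[i + di][j + dj]
                                 for di in range(kh) for dj in range(kw))
    return out


def conv_direct(image, kernel):
    """Reference sliding-window cross-correlation (stride 1, no padding)."""
    kh, kw = len(kernel), len(kernel[0])
    return [[sum(image[i + di][j + dj] * kernel[di][dj]
                 for di in range(kh) for dj in range(kw))
             for j in range(len(image[0]) - kw + 1)]
            for i in range(len(image) - kh + 1)]


image = [[4 * r + c for c in range(4)] for r in range(4)]
kernel = [[1, 0], [0, -1]]
print(conv_via_spanning(image, kernel) == conv_direct(image, kernel))  # True
```

Under this reading, a kh-by-kw kernel produces kh·kw spanned matrices, which is consistent with "several ciphertexts" per kernel, though the paper's actual count and layout are not stated in the text reviewed here.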

If this is right

Convolutional layers can be evaluated entirely under encryption without decrypting intermediate feature maps.
A single ciphertext can hold 32 images at once, allowing batched inference at the reported cloud cost.
Only the final likelihood vector needs to be returned to the data owner, keeping the model weights hidden on the server.
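
To make the batching point concrete, the sketch below packs 32 images of size 28 × 28 into one flat slot vector, the plaintext shape that a single batched ciphertext would have to carry. The one-image-per-row layout and the helper names are assumptions for illustration; the paper's actual slot layout and encryption parameters are not given in the text reviewed here.

```python
def pack_batch(images):
    """Flatten each image into one row, then concatenate the rows.

    Returns the flat slot vector and the (rows, width) shape needed to
    locate each image again.
    """
    rows = [[px for line in img for px in line] for img in images]
    width = len(rows[0])
    assert all(len(r) == width for r in rows), "images must share one size"
    slots = [v for row in rows for v in row]
    return slots, (len(rows), width)


def unpack_row(slots, shape, index):
    """Recover the flattened image stored in row `index`."""
    _, width = shape
    return slots[index * width:(index + 1) * width]


# Placeholder images: image n is filled with the constant value n.
images = [[[float(n)] * 28 for _ in range(28)] for n in range(32)]
slots, shape = pack_batch(images)
print(shape, len(slots))  # (32, 784) 25088
```

The 25,088 values give a lower bound on the number of plaintext slots such a layout would need; any padding the scheme requires would add to it.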

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same encoding could be applied to fully-connected layers by treating weight matrices directly as the A or B inputs.
If the method scales without precision loss to deeper networks, it would support private inference on models larger than the MNIST example shown.
The ciphertext size and runtime figures imply that image dimensions and batch size remain practical constraints for real deployments.
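
The reported figures can be turned into amortized per-image costs with simple arithmetic. The inputs below are the numbers quoted from the abstract; the vCPU-seconds figure additionally assumes all 40 vCPUs were busy for the whole run, which the source does not state, so it should be read as an upper bound.

```python
BATCH_SIZE = 32        # images per ciphertext (from the abstract)
TOTAL_SECONDS = 287.0  # approximate wall-clock time for the batch
CIPHERTEXT_MB = 19.8   # approximate upload size for the batch
VCPUS = 40             # cloud instance size

seconds_per_image = TOTAL_SECONDS / BATCH_SIZE
mb_per_image = CIPHERTEXT_MB / BATCH_SIZE
# Assumption: every vCPU is fully used for the full run (upper bound).
vcpu_seconds_per_image = TOTAL_SECONDS * VCPUS / BATCH_SIZE

print(f"{seconds_per_image:.2f} s per image (amortized)")       # 8.97
print(f"{mb_per_image:.2f} MB uploaded per image")              # 0.62
print(f"{vcpu_seconds_per_image:.0f} vCPU-seconds per image")   # 359
```

These are amortized figures: a single image still waits for the whole batch, so latency remains about 287 seconds regardless of batch size.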

Load-bearing premise

The encoding steps and kernel expansion preserve exact numerical values so that decryption yields the identical results a plaintext network would produce.

What would settle it

Decrypt the outputs of the encrypted CNN on the MNIST test set and compare them element-wise to the outputs of the identical network run on the corresponding plaintext images; any systematic difference would show the encoding introduced error.
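
A minimal sketch of that check, assuming the decrypted and plaintext likelihoods are available as lists of rows (one row per image). The tolerance is a hypothetical value: if the underlying scheme is approximate, exact equality is unlikely and a bound has to be chosen. The numbers in the example are made-up placeholders, not results from the paper.

```python
def max_abs_diff(expected, actual):
    """Largest element-wise absolute difference between two score matrices."""
    assert len(expected) == len(actual), "row counts differ"
    worst = 0.0
    for row_e, row_a in zip(expected, actual):
        assert len(row_e) == len(row_a), "row lengths differ"
        for e, a in zip(row_e, row_a):
            worst = max(worst, abs(e - a))
    return worst


def same_predictions(expected, actual):
    """True if the arg-max class agrees for every image."""
    def argmax(row):
        return max(range(len(row)), key=row.__getitem__)
    return all(argmax(e) == argmax(a) for e, a in zip(expected, actual))


# Placeholder values for illustration only.
plain = [[0.1, 0.7, 0.2], [0.6, 0.3, 0.1]]
decrypted = [[0.1001, 0.6998, 0.2001], [0.5999, 0.3002, 0.0999]]
TOLERANCE = 1e-3  # hypothetical bound, to be set from the scheme's precision

print(max_abs_diff(plain, decrypted) <= TOLERANCE)  # True
print(same_predictions(plain, decrypted))           # True
```

Reporting both numbers separates two failure modes: small numerical drift that leaves predictions unchanged, and structural errors such as dropped terms that would usually move the arg-max as well.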

Figures

Figures reproduced from arXiv: 2201.12577 by John Chiang.

**Figure 2.** Our convolution operation algorithm with … [image: figures/full_fig_p007_2.png]

Original abstract

In this work, we present a novel matrix-encoding method that is particularly convenient for neural networks to make predictions in a privacy-preserving manner using homomorphic encryption. Based on this encoding method, we implement a convolutional neural network for handwritten image classification over encryption. For two matrices $A$ and $B$ to perform homomorphic multiplication, the main idea behind it, in a simple version, is to encrypt matrix $A$ and the transpose of matrix $B$ into two ciphertexts respectively. With additional operations, the homomorphic matrix multiplication can be calculated over encrypted matrices efficiently. For the convolution operation, we in advance span each convolution kernel to a matrix space of the same size as the input image so as to generate several ciphertexts, each of which is later used together with the ciphertext encrypting input images for calculating some of the final convolution results. We accumulate all these intermediate results and thus complete the convolution operation. In a public cloud with 40 vCPUs, our convolutional neural network implementation on the MNIST testing dataset takes $\sim$ 287 seconds to compute ten likelihoods of 32 encrypted images of size $28 \times 28$ simultaneously. The data owner only needs to upload one ciphertext ($\sim 19.8$ MB) encrypting these 32 images to the public cloud.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper gives a concrete matrix-encoding and kernel-spanning trick for homomorphic CNN inference plus a timing number, but supplies no derivation or check that the decrypted outputs are correct.

The key takeaway is that this paper describes a matrix-encoding approach for homomorphic encryption applied to neural network inference, specifically for matrix multiplication and convolution in CNNs, along with some timing results on MNIST. The method encrypts one matrix and the transpose of the other for multiplication, and for convolution it expands kernels into input-sized matrices to produce multiple ciphertexts whose results are accumulated.

What stands out as potentially useful is the kernel-spanning technique, which seems designed to fit convolution into the matrix multiplication primitive they define. This could be a practical way to handle the operation without custom packing for each layer. The implementation reports a runtime of roughly 287 seconds for processing 32 encrypted 28x28 images to get 10 likelihoods each on a 40 vCPU cloud setup, with a single ciphertext upload of about 19.8 MB. That gives a sense of the scale they achieved.

On the other hand, the description stops at the high level. There are no step-by-step derivations showing how the additional operations produce the correct product, no error bounds or noise analysis for the homomorphic scheme, and no small-scale test case where they show plaintext and decrypted results match. The MNIST experiment gives only timing, with no mention of accuracy or direct comparison to a non-encrypted run. This leaves open the possibility that the spanning or accumulation steps introduce indexing errors or miss some terms, as the stress-test note suggests.

Because the full manuscript was referenced but the core claims rest on unshown details, it's difficult to judge the soundness. The work appears to be an attempt at a concrete construction rather than a theoretical advance. This paper would mainly interest people already working on homomorphic encryption for machine learning who are looking for encoding variants. A general reader or someone outside the subfield would not get much from it without the missing verification steps.

I would not bring this to a reading group as is, because the gaps make discussion speculative. I would not cite it. And I do not think it deserves peer review until the authors add the algebraic checks and experimental validation that would let others reproduce and trust the results.

Referee Report

3 major / 1 minor

Summary. The paper introduces a novel matrix-encoding technique called 'Volley Revolver' for efficient homomorphic matrix multiplication and convolution operations in privacy-preserving neural network inference using homomorphic encryption. For matrix multiplication, it encrypts matrix A and the transpose of B into ciphertexts and applies additional operations; for convolution, each kernel is spanned to an input-sized matrix to produce multiple ciphertexts whose results are accumulated. The authors implement a CNN for MNIST handwritten digit classification and report that inference on 32 encrypted 28x28 images (producing 10 likelihoods each) takes approximately 287 seconds on a public cloud with 40 vCPUs, requiring the data owner to upload only one ~19.8 MB ciphertext.

Significance. If the encoding and spanning procedures are shown to preserve exact arithmetic equivalence to plaintext CNN operations (modulo HE noise), the approach could offer efficiency gains for HE-based private inference by reducing ciphertext counts for convolutions. The concrete runtime numbers and single-ciphertext upload detail constitute a practical contribution that would be citable if correctness is established.

major comments (3)

[Abstract] The description of spanning each convolution kernel to an input-sized matrix and accumulating partial results claims this completes the convolution operation, but supplies no algebraic verification, small-scale example, or handling of boundary/padding cases demonstrating that the decrypted output matches standard convolution without off-by-one indexing or dropped terms.
[Experimental Results] Experimental section (MNIST timing): the reported ~287-second runtime for 32 images provides no plaintext baseline computation or numerical comparison of decrypted outputs against unencrypted CNN results, leaving the correctness of the matrix-encoding and accumulation steps unverified.
[Method] Method description: the statement that 'with additional operations, the homomorphic matrix multiplication can be calculated over encrypted matrices efficiently' after encrypting A and B^T lacks any equations, pseudocode, or derivation showing the sequence of HE operations and the final decryption step that recovers the product.
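
For orientation, the elementary identity that any such derivation would have to start from is the rewriting of the matrix product in terms of rows of $A$ and rows of $B^{\top}$. This is a standard fact stated here as context, not the paper's derivation; the index names are chosen for illustration.

```latex
C_{ij} \;=\; \sum_{k} A_{ik}\, B_{kj}
       \;=\; \sum_{k} A_{ik}\, (B^{\top})_{jk}
       \;=\; \big\langle A_{i,:},\; (B^{\top})_{j,:} \big\rangle
```

Each entry of the product is thus an inner product of two rows, which is why storing $A$ and $B^{\top}$ row by row lines up the terms that need to be multiplied. The missing piece the comment asks for is the sequence of homomorphic operations that realizes these inner products for all $(i, j)$.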

minor comments (1)

[Abstract] The phrasing 'span each convolution kernel to a matrix space of the same size as the input image so as to generate several ciphertexts' is ambiguous and would benefit from explicit notation or a diagram clarifying the spanning and accumulation mechanics.

Simulated Author's Rebuttal

3 responses · 0 unresolved

Thank you for the constructive comments. We address each major point below and will revise the manuscript to incorporate the requested verifications and details.

Referee: [Abstract] The description of spanning each convolution kernel to an input-sized matrix and accumulating partial results claims this completes the convolution operation, but supplies no algebraic verification, small-scale example, or handling of boundary/padding cases demonstrating that the decrypted output matches standard convolution without off-by-one indexing or dropped terms.

Authors: We agree that the abstract and method description require additional verification. The revised manuscript will include an algebraic proof of equivalence (modulo HE noise), a small-scale worked example, and explicit treatment of padding and boundary conditions to confirm that decrypted results match standard convolution.

Revision: yes

Referee: [Experimental Results] Experimental section (MNIST timing): the reported ~287-second runtime for 32 images provides no plaintext baseline computation or numerical comparison of decrypted outputs against unencrypted CNN results, leaving the correctness of the matrix-encoding and accumulation steps unverified.

Authors: We will add a plaintext baseline runtime and direct numerical comparisons (showing decrypted outputs match unencrypted results within noise bounds) to the experimental section.

Revision: yes

Referee: [Method] Method description: the statement that 'with additional operations, the homomorphic matrix multiplication can be calculated over encrypted matrices efficiently' after encrypting A and B^T lacks any equations, pseudocode, or derivation showing the sequence of HE operations and the final decryption step that recovers the product.

Authors: We will expand the method section with the full sequence of HE operations, equations, pseudocode, and a derivation of the decryption step that recovers the matrix product.

Revision: yes

Circularity Check

0 steps flagged

No circularity: construction is self-contained algebraic description without fitted parameters or self-referential definitions

The paper presents a matrix-encoding scheme for homomorphic matrix multiplication and convolution by describing explicit steps (encrypt A and B^T, span kernels to input-sized matrices, accumulate intermediate ciphertexts). No equations, fitted parameters, or self-citations appear in the provided text that would reduce any claimed result to a quantity defined by the authors' own prior choices. The derivation chain consists of direct algorithmic descriptions rather than predictions or uniqueness theorems imported from the same authors, satisfying the criteria for a self-contained construction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The central claim rests on the assumption that standard homomorphic encryption schemes support the required ciphertext operations after the proposed encoding, plus the unstated premise that the encoding introduces no numerical error in the decrypted neural-network outputs.

axioms (1)

Domain assumption: Homomorphic encryption supports the additional operations needed after the matrix encoding to produce correct encrypted matrix products.
Invoked in the abstract when stating that homomorphic matrix multiplication can be calculated efficiently with additional operations.

invented entities (1)

Volley Revolver matrix-encoding method (no independent evidence)
Purpose: To allow efficient homomorphic multiplication of matrices for neural-network layers while data remains encrypted.
New encoding procedure introduced by the paper; no independent evidence outside the paper is provided.

pith-pipeline@v0.9.0 · 5760 in / 1377 out tokens · 51193 ms · 2026-05-24T11:56:19.556899+00:00 · methodology

Reference graph

Works this paper leans on

11 extracted references · 11 canonical work pages · 1 internal anchor

[1] Brutzkus, A., Gilad-Bachrach, R., and Elisha, O. (2019). Low latency privacy preserving inference. In International Conference on Machine Learning, pages 812–821. PMLR.

[2] Chabanne, H., de Wargny, A., Milgram, J., Morel, C., and Prouff, E. (2017). Privacy-preserving classification on deep neural network. IACR Cryptol. ePrint Arch., 2017:35.

[3] Cheon, J. H., Kim, A., Kim, M., and Song, Y. (2017). Homomorphic encryption for arithmetic of approximate numbers. In International Conference on the Theory and Application of Cryptology and Information Security, pages 409–437. Springer.

[4] Chou, E., Beal, J., Levy, D., Yeung, S., Haque, A., and Fei-Fei, L. (2018). Faster cryptonets: Leveraging sparsity for real-world encrypted inference. arXiv preprint arXiv:1811.09953.

[5] Gentry, C. (2009). Fully homomorphic encryption using ideal lattices. In Proceedings of the forty-first annual ACM symposium on Theory of computing, pages 169–178.

[6] Gilad-Bachrach, R., Dowlin, N., Laine, K., Lauter, K., Naehrig, M., and Wernsing, J. (2016). Cryptonets: Applying neural networks to encrypted data with high throughput and accuracy. In International conference on machine learning, pages 201–210. PMLR.

[7] Halevi, S. and Shoup, V. (2020). Helib design principles. Tech. Rep. https://github.com/homenc/HElib

[8] Han, K., Hong, S., Cheon, J. H., and Park, D. (2019). Logistic regression on homomorphic encrypted data at scale. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 33, pages 9466–9471.

[9] Jiang, X., Kim, M., Lauter, K., and Song, Y. (2018). Secure outsourced matrix computation and application to neural networks. In Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security, pages 1209–1222.

[10] Kim, A., Song, Y., Kim, M., Lee, K., and Cheon, J. H. (2018). Logistic regression model training based on the approximate homomorphic encryption. BMC medical genomics, 11(4):83.

[11] Smart, N. and Vercauteren, F. (2011). Fully homomorphic simd operations. Cryptology ePrint Archive, Report 2011/133. https://ia.cr/2011/133
