arxiv: 2201.12577 · v10 · submitted 2022-01-29 · 💻 cs.CR · cs.CV

Volley Revolver: A Novel Matrix-Encoding Method for Privacy-Preserving Neural Networks (Inference)

Pith reviewed 2026-05-24 11:56 UTC · model grok-4.3

classification 💻 cs.CR cs.CV
keywords homomorphic encryption · privacy-preserving inference · matrix encoding · convolutional neural networks · encrypted convolution · MNIST classification · ciphertext matrix multiplication

The pith

A matrix-encoding method performs homomorphic matrix multiplication and convolution directly on ciphertexts for neural network inference.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents an encoding technique that encrypts one matrix and the transpose of the other, then applies additional operations to compute their product over encrypted data. For convolutions it first expands each kernel into a full-size matrix matching the input image dimensions, producing multiple ciphertexts that combine with the encrypted input to yield partial results. These are accumulated to finish the convolution step. The method is demonstrated by running a convolutional network on encrypted MNIST images, completing inference on 32 examples in roughly 287 seconds on a 40-vCPU cloud instance while requiring only one 19.8 MB ciphertext upload. A reader would care because the approach keeps all intermediate values encrypted, allowing a cloud service to classify private images without ever seeing them in plaintext.
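The reported figures can be amortized per image. This is a back-of-the-envelope reading that uses only the numbers quoted above and assumes cost is spread evenly over the batch:

```latex
\frac{287\ \text{s}}{32\ \text{images}} \approx 8.97\ \text{s per image},
\qquad
\frac{19.8\ \text{MB}}{32\ \text{images}} \approx 0.62\ \text{MB per image},
\qquad
32 \times 28 \times 28 = 25088\ \text{pixel values per ciphertext}.
```

The per-image latency is not 8.97 s in practice: all 32 results arrive together after roughly 287 seconds, so the figure describes throughput, not response time.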

Core claim

For two matrices A and B, encrypt A and the transpose of B into separate ciphertexts; homomorphic matrix multiplication then follows from a sequence of additional operations on those ciphertexts. For convolution, each kernel is first spanned into a matrix the same size as the input image, generating several ciphertexts; each such ciphertext is multiplied with the ciphertext of the input image to produce a portion of the convolution output, after which the portions are summed to recover the full result.
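The text does not spell out the "additional operations", so the following is only a plaintext NumPy sketch of one way the idea could work. It assumes three primitives are available on ciphertexts: cyclic row rotation, element-wise product, and row-wise summation. The names `rotate_rows` and `matmul_via_transpose` are hypothetical and do not come from the paper.

```python
import numpy as np

def rotate_rows(m, k):
    """Cyclically shift rows upward by k (stand-in for a ciphertext rotation)."""
    return np.roll(m, -k, axis=0)

def matmul_via_transpose(a, b):
    """Multiply a (n x d) by b (d x n) using only element-wise products,
    row sums, and row rotations of the transpose of b."""
    n = a.shape[0]
    if b.shape[1] != n:
        raise ValueError("this sketch assumes the product is n x n")
    bt = b.T                      # the second operand is encoded as its transpose
    c = np.zeros((n, n))
    rows = np.arange(n)
    for k in range(n):
        shifted = rotate_rows(bt, k)          # row i now holds bt[(i + k) % n]
        partial = (a * shifted).sum(axis=1)   # partial[i] = <a[i], bt[(i + k) % n]>
        c[rows, (rows + k) % n] = partial     # fills one wrapped diagonal of the product
    return c
```

Each pass fills one wrapped diagonal of the product, so n passes recover all n × n entries. In an encrypted setting the final scatter into `c` would need masking and further rotations, which this sketch does not model.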

What carries the argument

The matrix-encoding method that converts matrix multiplication into ciphertext operations on A and B-transpose, together with the kernel-spanning procedure that turns each convolution into a set of matrix multiplications over ciphertexts.
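The kernel-spanning procedure can be illustrated in plaintext. The sketch below is one interpretation, not the paper's algorithm: it assumes a square kernel, stride 1, and no padding, and it tiles non-overlapping kernel copies at each of the k × k possible offsets. `span_kernel` and `conv_by_spanning` are hypothetical names.

```python
import numpy as np

def span_kernel(kernel, height, width, row_off, col_off):
    """Tile non-overlapping kernel copies into an image-sized zero matrix,
    starting at (row_off, col_off). Copies that would overhang are skipped."""
    k = kernel.shape[0]
    spanned = np.zeros((height, width))
    for r in range(row_off, height - k + 1, k):
        for c in range(col_off, width - k + 1, k):
            spanned[r:r + k, c:c + k] = kernel
    return spanned

def conv_by_spanning(image, kernel):
    """Valid, stride-1 convolution (CNN convention, no kernel flip)
    assembled from k*k spanned-kernel products."""
    height, width = image.shape
    k = kernel.shape[0]
    out = np.zeros((height - k + 1, width - k + 1))
    for row_off in range(k):
        for col_off in range(k):
            # One spanned matrix per offset; its product with the image
            # yields the outputs whose top-left corner sits on this grid.
            prod = image * span_kernel(kernel, height, width, row_off, col_off)
            for r in range(row_off, height - k + 1, k):
                for c in range(col_off, width - k + 1, k):
                    out[r, c] += prod[r:r + k, c:c + k].sum()
    return out
```

Under these assumptions each output position is produced by exactly one of the k² spanned matrices, so accumulating the partial results reproduces the ordinary convolution output.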

If this is right

  • Convolutional layers can be evaluated entirely under encryption without decrypting intermediate feature maps.
  • A single ciphertext can hold 32 images at once, allowing batched inference at the reported cloud cost.
  • Only the final likelihood vector needs to be returned to the data owner, keeping the model weights hidden on the server.
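The batching claim can be checked with a small packing sketch. It flattens each image into one row of a 32 × 784 matrix and lays the rows end to end in a slot vector. The slot count of 2^15 is an assumption made for illustration, since the text above does not state the encryption parameters; `pack_batch` and `unpack_image` are hypothetical helpers.

```python
import numpy as np

NUM_IMAGES, SIDE = 32, 28     # figures quoted in the abstract
SLOTS = 2 ** 15               # ASSUMED slot count, not stated in the source

def pack_batch(images):
    """Flatten each image into one row, then concatenate rows into a slot vector."""
    rows = images.reshape(images.shape[0], -1)   # (32, 784) matrix, one image per row
    flat = rows.reshape(-1)
    if flat.size > SLOTS:
        raise ValueError("batch does not fit in one ciphertext under the assumed slot count")
    packed = np.zeros(SLOTS)
    packed[:flat.size] = flat
    return packed

def unpack_image(packed, index):
    """Recover image number `index` from the packed slot vector."""
    size = SIDE * SIDE
    return packed[index * size:(index + 1) * size].reshape(SIDE, SIDE)
```

With these numbers the batch occupies 25088 of the assumed 32768 slots, which is consistent with a single ciphertext carrying all 32 images.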

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same encoding could be applied to fully-connected layers by treating weight matrices directly as the A or B inputs.
  • If the method scales without precision loss to deeper networks, it would support private inference on models larger than the MNIST example shown.
  • The ciphertext size and runtime figures imply that image dimensions and batch size remain practical constraints for real deployments.
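The fully-connected extension can be sketched as follows. A dense layer with weights stored as an (outputs × inputs) matrix computes every output as a dot product between an input row and a weight row, so the weight matrix already has the shape of a transposed second operand. This is an illustration of that observation in plaintext; `dense_layer` is a hypothetical name and nothing here is taken from the paper's implementation.

```python
import numpy as np

def dense_layer(x, w, bias):
    """x: (batch, inputs), w: (outputs, inputs), bias: (outputs,).
    Each output entry is <row of x, row of w>, so w acts as the
    transposed operand B^T in the product x @ B with B = w.T."""
    out = np.empty((x.shape[0], w.shape[0]))
    for i, x_row in enumerate(x):
        for j, w_row in enumerate(w):
            out[i, j] = np.dot(x_row, w_row) + bias[j]
    return out
```

The explicit loops are for clarity only; the result equals `x @ w.T + bias`.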

Load-bearing premise

The encoding steps and kernel expansion preserve numerical values, up to the noise of the encryption scheme, so that decryption yields the same results a plaintext network would produce.

What would settle it

Decrypt the outputs of the encrypted CNN on the MNIST test set and compare them element-wise to the outputs of the identical network run on the corresponding plaintext images; any systematic difference beyond encryption noise would show the encoding introduced error.
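A comparison of this kind might look like the sketch below. It assumes the decrypted and plaintext outputs are available as arrays of shape (images, 10); the function name `compare_outputs` and the tolerance `tol` are hypothetical choices, not values from the paper.

```python
import numpy as np

def compare_outputs(decrypted, plaintext, tol=1e-3):
    """Element-wise comparison of decrypted and plaintext likelihoods.
    `tol` is an arbitrary illustrative bound on acceptable noise."""
    decrypted = np.asarray(decrypted, dtype=float)
    plaintext = np.asarray(plaintext, dtype=float)
    diff = decrypted - plaintext
    same_label = decrypted.argmax(axis=1) == plaintext.argmax(axis=1)
    return {
        "max_abs_error": float(np.abs(diff).max()),
        "mean_signed_error": float(diff.mean()),    # far from zero hints at systematic bias
        "label_agreement": float(same_label.mean()),
        "within_tol": bool(np.abs(diff).max() <= tol),
    }
```

A mean signed error close to zero with a small maximum error would point to random noise; a consistent offset or disagreeing labels would point to an encoding problem.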

Figures

Figures reproduced from arXiv: 2201.12577 by John Chiang.

Figure 1: Our matrix multiplication algorithm with …
Figure 2: Our convolution operation algorithm with …
Original abstract

In this work, we present a novel matrix-encoding method that is particularly convenient for neural networks to make predictions in a privacy-preserving manner using homomorphic encryption. Based on this encoding method, we implement a convolutional neural network for handwritten image classification over encryption. For two matrices $A$ and $B$ to perform homomorphic multiplication, the main idea behind it, in a simple version, is to encrypt matrix $A$ and the transpose of matrix $B$ into two ciphertexts respectively. With additional operations, the homomorphic matrix multiplication can be calculated over encrypted matrices efficiently. For the convolution operation, we in advance span each convolution kernel to a matrix space of the same size as the input image so as to generate several ciphertexts, each of which is later used together with the ciphertext encrypting input images for calculating some of the final convolution results. We accumulate all these intermediate results and thus complete the convolution operation. In a public cloud with 40 vCPUs, our convolutional neural network implementation on the MNIST testing dataset takes $\sim$ 287 seconds to compute ten likelihoods of 32 encrypted images of size $28 \times 28$ simultaneously. The data owner only needs to upload one ciphertext ($\sim 19.8$ MB) encrypting these 32 images to the public cloud.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 1 minor

Summary. The paper introduces a novel matrix-encoding technique called 'Volley Revolver' for efficient homomorphic matrix multiplication and convolution in privacy-preserving neural network inference using homomorphic encryption. For matrix multiplication, it encrypts matrix A and the transpose of B into ciphertexts and applies additional operations; for convolution, each kernel is spanned to an input-sized matrix to produce multiple ciphertexts whose results are accumulated. The authors implement a CNN for MNIST handwritten digit classification and report that inference on 32 encrypted 28 × 28 images (producing 10 likelihoods each) takes approximately 287 seconds on a public cloud with 40 vCPUs, requiring the data owner to upload only one ~19.8 MB ciphertext.

Significance. If the encoding and spanning procedures are shown to preserve exact arithmetic equivalence to plaintext CNN operations (modulo HE noise), the approach could offer efficiency gains for HE-based private inference by reducing ciphertext counts for convolutions. The concrete runtime numbers and single-ciphertext upload detail constitute a practical contribution that would be citable if correctness is established.

major comments (3)
  1. [Abstract] Abstract: the description of spanning each convolution kernel to an input-sized matrix and accumulating partial results claims this completes the convolution operation, but supplies no algebraic verification, small-scale example, or handling of boundary/padding cases demonstrating that the decrypted output matches standard convolution without off-by-one indexing or dropped terms.
  2. [Experimental Results] Experimental section (MNIST timing): the reported ~287-second runtime for 32 images provides no plaintext baseline computation or numerical comparison of decrypted outputs against unencrypted CNN results, leaving the correctness of the matrix-encoding and accumulation steps unverified.
  3. [Method] Method description: the statement that 'with additional operations, the homomorphic matrix multiplication can be calculated over encrypted matrices efficiently' after encrypting A and B^T lacks any equations, pseudocode, or derivation showing the sequence of HE operations and the final decryption step that recovers the product.
minor comments (1)
  1. [Abstract] Abstract: the phrasing 'span each convolution kernel to a matrix space of the same size as the input image so as to generate several ciphertexts' is ambiguous and would benefit from explicit notation or a diagram clarifying the spanning and accumulation mechanics.

Simulated Author's Rebuttal

3 responses · 0 unresolved

Thank you for the constructive comments. We address each major point below and will revise the manuscript to incorporate the requested verifications and details.

  1. Referee: [Abstract] Abstract: the description of spanning each convolution kernel to an input-sized matrix and accumulating partial results claims this completes the convolution operation, but supplies no algebraic verification, small-scale example, or handling of boundary/padding cases demonstrating that the decrypted output matches standard convolution without off-by-one indexing or dropped terms.

    Authors: We agree that the abstract and method description require additional verification. The revised manuscript will include an algebraic proof of equivalence (modulo HE noise), a small-scale worked example, and explicit treatment of padding and boundary conditions to confirm that decrypted results match standard convolution. revision: yes

  2. Referee: [Experimental Results] Experimental section (MNIST timing): the reported ~287-second runtime for 32 images provides no plaintext baseline computation or numerical comparison of decrypted outputs against unencrypted CNN results, leaving the correctness of the matrix-encoding and accumulation steps unverified.

    Authors: We will add a plaintext baseline runtime and direct numerical comparisons (showing decrypted outputs match unencrypted results within noise bounds) to the experimental section. revision: yes

  3. Referee: [Method] Method description: the statement that 'with additional operations, the homomorphic matrix multiplication can be calculated over encrypted matrices efficiently' after encrypting A and B^T lacks any equations, pseudocode, or derivation showing the sequence of HE operations and the final decryption step that recovers the product.

    Authors: We will expand the method section with the full sequence of HE operations, equations, pseudocode, and a derivation of the decryption step that recovers the matrix product. revision: yes

Circularity Check

0 steps flagged

No circularity: the construction is a self-contained algorithmic description without fitted parameters or self-referential definitions.

The paper presents a matrix-encoding scheme for homomorphic matrix multiplication and convolution by describing explicit steps (encrypt A and B^T, span kernels to input-sized matrices, accumulate intermediate ciphertexts). No equations, fitted parameters, or self-citations appear in the provided text that would reduce any claimed result to a quantity defined by the authors' own prior choices. The derivation chain consists of direct algorithmic descriptions rather than predictions or uniqueness theorems imported from the same authors, satisfying the criteria for a self-contained construction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The central claim rests on the assumption that standard homomorphic encryption schemes support the required ciphertext operations after the proposed encoding, plus the unstated premise that the encoding introduces no numerical error in the decrypted neural-network outputs.

axioms (1)
  • domain assumption: Homomorphic encryption supports the additional operations needed after the matrix encoding to produce correct encrypted matrix products.
    Invoked in the abstract when stating that homomorphic matrix multiplication can be calculated efficiently with additional operations.
invented entities (1)
  • Volley Revolver matrix-encoding method (no independent evidence)
    purpose: To allow efficient homomorphic multiplication of matrices for neural-network layers while data remains encrypted.
    New encoding procedure introduced by the paper; no independent evidence outside the paper is provided.
