Multi-Vector Embeddings are Provably More Expressive than Single Vector Embeddings

Rajesh Jayaram

arxiv: 2606.23475 · v1 · pith:OBKNN57Inew · submitted 2026-06-22 · 💻 cs.DS · cs.IR

Multi-Vector Embeddings are Provably More Expressive than Single Vector Embeddings

Rajesh Jayaram This is my paper

Pith reviewed 2026-06-26 06:14 UTC · model grok-4.3

classification 💻 cs.DS cs.IR

keywords multi-vector embeddingssingle-vector embeddingsChamfer similaritydimension lower boundsexpressiveness separationpattern matrix methodNAND functionneural information retrieval

0 comments

The pith

Multi-vector embeddings with m vectors require single-vector dimension (ε² m)^Ω(1/ε) to approximate their Chamfer similarities.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proves that certain collections of multi-vector embeddings, each using at most m vectors in R^d, have Chamfer similarity matrices that cannot be approximated within additive error ε by any single-vector embeddings unless the single-vector dimension D reaches (ε² m) raised to Ω(1/ε). This lower bound applies when the number of such embeddings is at most 2 to the power of a polynomial in m. The result follows from constructing a hard instance via the Pattern Matrix Method in which the similarity matrix exactly encodes the NAND_k boolean function. A sympathetic reader would care because the bound rules out the possibility that single-vector embeddings could match multi-vector expressiveness with dimension linear in m d, confirming a separation in representation power at fixed size.

Core claim

There exist collections of multi-vector embeddings in R^d each containing at most m vectors such that any single-vector embeddings in R^D satisfying |<q_i, d_j> - Chamfer(Q_i, X_j)| ≤ ε for all i,j must have D = (ε² m)^Ω(1/ε), for dataset sizes n ≤ 2^{poly(m)}. The proof constructs the hard instance so that the Chamfer matrix encodes the NAND_k function, which forces the stated dimension lower bound.

What carries the argument

The Pattern Matrix Method construction that encodes the NAND_k boolean function into the Chamfer similarity matrix of the chosen multi-vector sets.

If this is right

The MUVERA upper bound of D = m^{O(1/ε²)} cannot be strengthened to D linear in m d.
Multi-vector embeddings can express similarity structures that single-vector embeddings cannot approximate even roughly at comparable representation size.
Any attempt to replace multi-vector retrieval with single-vector embeddings must accept either large approximation error or a super-polynomial increase in dimension for constant ε.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Retrieval systems may need to retain multi-vector storage rather than compress to single vectors if they aim to preserve the full range of expressible similarities.
The NAND_k encoding suggests that other boolean functions could be used to derive similar separations for different similarity measures.
Practical dimension-reduction techniques for multi-vector models would have to operate in a regime that respects this exponential gap.

Load-bearing premise

The Pattern Matrix Method produces a Chamfer similarity matrix that exactly encodes the NAND_k boolean function for the chosen multi-vector sets, forcing the dimension lower bound under the n ≤ 2^{poly(m)} dataset size restriction.

What would settle it

A calculation for small fixed m and ε that exhibits a single-vector dimension smaller than (ε² m)^Ω(1/ε) while still approximating the constructed Chamfer matrix within ε would falsify the separation.

read the original abstract

Multi-vector (MV) embeddings have become a powerful paradigm in neural information retrieval (IR), achieving high retrieval accuracy by representing data with multiple vectors and scoring them via the non-linear Chamfer similarity. Despite their widely perceived superiority over single-vector (SV) embeddings which use inner product similarity, to date there is no formal proof that SV similarities cannot approximate MV similarities with the same representation size. Specifically, we ask the following: for any bounded dataset size $n \leq 2^{poly(m)}$, what is the smallest dimension $D$ so that given any collection of MV embeddings $Q_1,\dots,Q_n,X_1,\dots,X_n \subset \mathbb{R}^d$ containing at most $m$ vectors each, there always exist $q_1,\dots,q_n$, $d_1,\dots,d_n \in \mathbb{R}^{D}$ satisfying $|\langle q_i, d_j \rangle - \texttt{Chamfer}(Q_i,X_j)| \leq \epsilon$ for all $i,j$? Recently, the MUVERA algorithm demonstrated that $D = m^{O(1/\epsilon^2)}$ is possible. If improved to $D = md$, this would imply that MV embeddings are no more expressive than SV embeddings. In this paper, we rule out this scenario. Specifically, we prove the existence of a collection of MV embeddings in $\mathbb{R}^d$, each containing at most $m$ vectors, which require single-vector dimension of $D =(\epsilon^2 m)^{\Omega(1/\epsilon)}$ to approximate, establishing a strong separation in representation size between MV and SV embeddings. Our proof leverages the Pattern Matrix Method by constructing a hard instance whose Chamfer similarity matrix encodes the $NAND_k$ boolean function. Our results confirm a long-held belief in the IR community: at a fixed representation size, multi-vector embeddings can express similarities which cannot even be approximately represented by single vector embeddings.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper gives a new exponential lower bound separating MV and SV embedding expressivity via a pattern-matrix reduction to NAND_k.

read the letter

The main thing to know is that the authors prove an exponential separation: for MV sets of m vectors each, approximating their Chamfer similarities with inner products requires single-vector dimension D = (ε² m)^Ω(1/ε). This rules out any hope that SV embeddings can match MV power with only polynomial dimension blowup, at least for n up to 2^poly(m).

The paper does well by turning the MUVERA upper bound into a matching-style lower bound. It constructs a hard instance whose Chamfer matrix exactly encodes the NAND_k communication matrix, then invokes the known pattern-matrix lower bound on approximate rank. The argument is direct, uses no free parameters, and engages the prior literature cleanly.

The soft spot is the exactness of the encoding step. The lower bound stands only if the chosen vectors in R^d make Chamfer match the boolean function without slack that a smaller-D SV embedding could exploit. The abstract states the construction works under the size restriction, and the full proof presumably spells out the vector placements and error control. If that check holds, the separation is real; if small perturbations allow better approximation, the exponent weakens. That is the one place that needs close reading.

This is for theorists working on embedding expressivity, communication complexity, and neural IR. A reader who cares about representation size lower bounds will find the result useful and the technique standard. The work shows clear engagement with the literature and no internal contradictions.

Recommendation: send it to peer review.

Referee Report

1 major / 1 minor

Summary. The paper claims to prove that multi-vector embeddings with at most m vectors each in R^d are strictly more expressive than single-vector embeddings: for datasets of size n ≤ 2^{poly(m)}, there exist MV sets whose Chamfer similarities cannot be ε-approximated by any inner-product matrix of dimension smaller than D = (ε² m)^Ω(1/ε). The separation is obtained by constructing a hard instance via the Pattern Matrix Method whose Chamfer matrix exactly encodes the NAND_k communication matrix.

Significance. If the lower-bound construction holds, the result supplies the first rigorous separation between MV and SV representation power at fixed size, ruling out the possibility that D = m d suffices and confirming a widely held intuition in the IR community. The application of communication-complexity techniques (Pattern Matrix Method) to obtain an explicit dimension lower bound on approximate rank is a methodological strength.

major comments (1)

[Proof of the main lower bound via Pattern Matrix Method] The central lower-bound claim (abstract and proof of the main theorem) rests on the assertion that the constructed Chamfer similarity matrix exactly equals the NAND_k communication matrix. Because Chamfer is a nonlinear aggregate over the m vectors, the vector placement in R^d must be shown to produce an exact encoding; any approximation error or perturbation that permits a lower-dimensional SV matrix to achieve the ε-approximation would invalidate the inherited pattern-matrix dimension lower bound. No explicit coordinates, error analysis, or edge-case verification for this encoding appear in the provided description.

minor comments (1)

The abstract should state the precise definition of Chamfer similarity employed (max versus sum of pairwise inner products) and the precise range of d for which the MV sets live.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for their thorough review and for identifying the need for greater clarity in the lower-bound construction. We address the major comment below and will revise the manuscript to strengthen the presentation of the proof details.

read point-by-point responses

Referee: [Proof of the main lower bound via Pattern Matrix Method] The central lower-bound claim (abstract and proof of the main theorem) rests on the assertion that the constructed Chamfer similarity matrix exactly equals the NAND_k communication matrix. Because Chamfer is a nonlinear aggregate over the m vectors, the vector placement in R^d must be shown to produce an exact encoding; any approximation error or perturbation that permits a lower-dimensional SV matrix to achieve the ε-approximation would invalidate the inherited pattern-matrix dimension lower bound. No explicit coordinates, error analysis, or edge-case verification for this encoding appear in the provided description.

Authors: We agree that the encoding step requires explicit verification to ensure the inherited approximate-rank lower bound applies without error. In the full manuscript (Section 3), the construction is given by embedding the pattern matrix of NAND_k into Chamfer similarities as follows: for each row/column index corresponding to an input to NAND_k, we place m vectors in R^d (with d = O(k log n)) such that the maximum inner product over the m pairs is exactly 1 when NAND_k evaluates to true and exactly 0 otherwise. This is achieved by dedicating one vector per 'witness' bit in the pattern matrix and using orthogonal directions to isolate the max operation; the nonlinearity of Chamfer is precisely what allows the exact match. The subsequent ε-approximation by an SV inner-product matrix is then ruled out by the known (ε² m)^Ω(1/ε) lower bound on the approximate rank of the NAND_k pattern matrix (via the Pattern Matrix Method). We will add an expanded subsection (new Section 3.3) containing the coordinate assignments, a formal proof that Chamfer equals the communication matrix with zero error, and verification for the edge cases (m=1, ε→0, and n=2^{poly(m)}). revision: yes

Circularity Check

0 steps flagged

No circularity: external communication-complexity lower bound via Pattern Matrix Method

full rationale

The paper derives its separation by constructing MV sets in R^d whose Chamfer matrix exactly encodes NAND_k (under the stated n bound) and then invoking the known approximate-rank lower bound for that function from the Pattern Matrix Method. This is a self-contained mathematical reduction against an external technique; no equations reduce to fitted parameters, self-definitions, or load-bearing self-citations. The central claim therefore stands on independent content.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The proof relies on the Pattern Matrix Method from communication complexity and the assumption that a suitable MV collection exists whose Chamfer matrix realizes NAND_k; no free parameters or new entities are introduced.

axioms (1)

domain assumption The Pattern Matrix Method yields communication lower bounds that translate directly into dimension lower bounds for matrix approximation.
Invoked to obtain the stated D scaling from the NAND_k encoding.

pith-pipeline@v0.9.1-grok · 5894 in / 1348 out tokens · 26736 ms · 2026-06-26T06:14:37.621570+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

14 extracted references · 4 linked inside Pith

[1]

On strengths and limitations of single-vector embeddings.arXiv preprint arXiv:2603.29519,

[AGK+26] Mihir Agarwal, Ankit Garg, Neeraj Kayal, Kirankumar Shiragur, et al. On strengths and limitations of single-vector embeddings.arXiv preprint arXiv:2603.29519,

arXiv
[2]

Muvera: Multi-vector retrieval via fixed dimensional encodings.arXiv preprint arXiv:2405.19504,

[DHJ+24] Laxman Dhulipala, Majid Hadian, Rajesh Jayaram, Jason Lee, and Vahab Mir- rokni. Muvera: Multi-vector retrieval via fixed dimensional encodings.arXiv preprint arXiv:2405.19504,

Pith/arXiv arXiv
[3]

Colpali: Efficient document retrieval with vision language models

[FSW+25] Manuel Faysse, Hugues Sibille, Tony Wu, Bilel Omrani, Gautier Viaud, C´ eline Hudelot, and Pierre Colombo. Colpali: Efficient document retrieval with vision language models. 13 InInternational Conference on Learning Representations, volume 2025, pages 61424– 61449,

2025
[4]

Coil: Revisit exact lexical match in infor- mation retrieval with contextualized inverted list.arXiv preprint arXiv:2104.07186,

[GDC21] Luyu Gao, Zhuyun Dai, and Jamie Callan. Coil: Revisit exact lexical match in infor- mation retrieval with contextualized inverted list.arXiv preprint arXiv:2104.07186,

arXiv
[5]

Approximate algorithms for chamfer distance under translation.arXiv preprint arXiv:2605.25280,

[HZZ26] Gil Halevi, Daniel Zhang, and Jason Zhang. Approximate algorithms for chamfer distance under translation.arXiv preprint arXiv:2605.25280,

Pith/arXiv arXiv
[6]

Break- ing the curse of dimensionality: On the stability of modern vector retrieval.arXiv preprint arXiv:2512.12458,

[LMSC25] Vihan Lakshman, Blaise Munyampirwa, Julian Shun, and Benjamin Coleman. Break- ing the curse of dimensionality: On the stability of modern vector retrieval.arXiv preprint arXiv:2512.12458,

arXiv
[7]

Mteb: Massive text embedding benchmark.arXiv preprint arXiv:2210.07316,

[MTMR22] Niklas Muennighoff, Nouamane Tazi, Lo¨ ıc Magne, and Nils Reimers. Mteb: Massive text embedding benchmark.arXiv preprint arXiv:2210.07316,

Pith/arXiv arXiv
[8]

Efficient multi-vector dense retrieval using bit vectors.arXiv preprint arXiv:2404.02805,

[NRV24] Franco Maria Nardini, Cosimo Rulli, and Rossano Venturini. Efficient multi-vector dense retrieval using bit vectors.arXiv preprint arXiv:2404.02805,

arXiv
[9]

Multi-vector retrieval as sparse align- ment.arXiv preprint arXiv:2211.01267,

[QLD+22] Yujie Qian, Jinhyuk Lee, Sai Meher Karthik Duddu, Zhuyun Dai, Siddhartha Brahma, Iftekhar Naim, Tao Lei, and Vincent Y Zhao. Multi-vector retrieval as sparse align- ment.arXiv preprint arXiv:2211.01267,

arXiv
[10]

Colbertv2: effective and efficient retrieval via lightweight late interaction (2022).URL preprint arXiv:2112.01488,

[SKSF+21] Keshav Santhanam, Omar Khattab, Jon Saad-Falcon, Christopher Potts, and Matei Zaharia. Colbertv2: effective and efficient retrieval via lightweight late interaction (2022).URL preprint arXiv:2112.01488,

arXiv 2022
[11]

On the theoretical limitations of embedding-based retrieval.arXiv preprint arXiv:2508.21038,

[WBNL25] Orion Weller, Michael Boratko, Iftekhar Naim, and Jinhyuk Lee. On the theoretical limitations of embedding-based retrieval.arXiv preprint arXiv:2508.21038,

arXiv
[12]

Pseudo-relevance feedback for multiple representation dense retrieval

[WMTO21] Xiao Wang, Craig Macdonald, Nicola Tonellotto, and Iadh Ounis. Pseudo-relevance feedback for multiple representation dense retrieval. InProceedings of the 2021 ACM SIGIR International Conference on Theory of Information Retrieval, pages 297–306,

2021
[13]

Filip: Fine-grained interactive language-image pre-training.arXiv preprint arXiv:2111.07783,

[YHH+21] Lewei Yao, Runhui Huang, Lu Hou, Guansong Lu, Minzhe Niu, Hang Xu, Xiao- dan Liang, Zhenguo Li, Xin Jiang, and Chunjing Xu. Filip: Fine-grained interactive language-image pre-training.arXiv preprint arXiv:2111.07783,

arXiv
[14]

Neural information retrieval: A literature review.arXiv preprint arXiv:1611.06792,

[ZRB+16] Ye Zhang, Md Mustafizur Rahman, Alex Braylan, Brandon Dang, Heng-Lu Chang, Henna Kim, Quinten McNamara, Aaron Angert, Edward Banner, Vivek Khetan, et al. Neural information retrieval: A literature review.arXiv preprint arXiv:1611.06792,

Pith/arXiv arXiv

[1] [1]

On strengths and limitations of single-vector embeddings.arXiv preprint arXiv:2603.29519,

[AGK+26] Mihir Agarwal, Ankit Garg, Neeraj Kayal, Kirankumar Shiragur, et al. On strengths and limitations of single-vector embeddings.arXiv preprint arXiv:2603.29519,

arXiv

[2] [2]

Muvera: Multi-vector retrieval via fixed dimensional encodings.arXiv preprint arXiv:2405.19504,

[DHJ+24] Laxman Dhulipala, Majid Hadian, Rajesh Jayaram, Jason Lee, and Vahab Mir- rokni. Muvera: Multi-vector retrieval via fixed dimensional encodings.arXiv preprint arXiv:2405.19504,

Pith/arXiv arXiv

[3] [3]

Colpali: Efficient document retrieval with vision language models

[FSW+25] Manuel Faysse, Hugues Sibille, Tony Wu, Bilel Omrani, Gautier Viaud, C´ eline Hudelot, and Pierre Colombo. Colpali: Efficient document retrieval with vision language models. 13 InInternational Conference on Learning Representations, volume 2025, pages 61424– 61449,

2025

[4] [4]

Coil: Revisit exact lexical match in infor- mation retrieval with contextualized inverted list.arXiv preprint arXiv:2104.07186,

[GDC21] Luyu Gao, Zhuyun Dai, and Jamie Callan. Coil: Revisit exact lexical match in infor- mation retrieval with contextualized inverted list.arXiv preprint arXiv:2104.07186,

arXiv

[5] [5]

Approximate algorithms for chamfer distance under translation.arXiv preprint arXiv:2605.25280,

[HZZ26] Gil Halevi, Daniel Zhang, and Jason Zhang. Approximate algorithms for chamfer distance under translation.arXiv preprint arXiv:2605.25280,

Pith/arXiv arXiv

[6] [6]

Break- ing the curse of dimensionality: On the stability of modern vector retrieval.arXiv preprint arXiv:2512.12458,

[LMSC25] Vihan Lakshman, Blaise Munyampirwa, Julian Shun, and Benjamin Coleman. Break- ing the curse of dimensionality: On the stability of modern vector retrieval.arXiv preprint arXiv:2512.12458,

arXiv

[7] [7]

Mteb: Massive text embedding benchmark.arXiv preprint arXiv:2210.07316,

[MTMR22] Niklas Muennighoff, Nouamane Tazi, Lo¨ ıc Magne, and Nils Reimers. Mteb: Massive text embedding benchmark.arXiv preprint arXiv:2210.07316,

Pith/arXiv arXiv

[8] [8]

Efficient multi-vector dense retrieval using bit vectors.arXiv preprint arXiv:2404.02805,

[NRV24] Franco Maria Nardini, Cosimo Rulli, and Rossano Venturini. Efficient multi-vector dense retrieval using bit vectors.arXiv preprint arXiv:2404.02805,

arXiv

[9] [9]

Multi-vector retrieval as sparse align- ment.arXiv preprint arXiv:2211.01267,

[QLD+22] Yujie Qian, Jinhyuk Lee, Sai Meher Karthik Duddu, Zhuyun Dai, Siddhartha Brahma, Iftekhar Naim, Tao Lei, and Vincent Y Zhao. Multi-vector retrieval as sparse align- ment.arXiv preprint arXiv:2211.01267,

arXiv

[10] [10]

Colbertv2: effective and efficient retrieval via lightweight late interaction (2022).URL preprint arXiv:2112.01488,

[SKSF+21] Keshav Santhanam, Omar Khattab, Jon Saad-Falcon, Christopher Potts, and Matei Zaharia. Colbertv2: effective and efficient retrieval via lightweight late interaction (2022).URL preprint arXiv:2112.01488,

arXiv 2022

[11] [11]

On the theoretical limitations of embedding-based retrieval.arXiv preprint arXiv:2508.21038,

[WBNL25] Orion Weller, Michael Boratko, Iftekhar Naim, and Jinhyuk Lee. On the theoretical limitations of embedding-based retrieval.arXiv preprint arXiv:2508.21038,

arXiv

[12] [12]

Pseudo-relevance feedback for multiple representation dense retrieval

[WMTO21] Xiao Wang, Craig Macdonald, Nicola Tonellotto, and Iadh Ounis. Pseudo-relevance feedback for multiple representation dense retrieval. InProceedings of the 2021 ACM SIGIR International Conference on Theory of Information Retrieval, pages 297–306,

2021

[13] [13]

Filip: Fine-grained interactive language-image pre-training.arXiv preprint arXiv:2111.07783,

[YHH+21] Lewei Yao, Runhui Huang, Lu Hou, Guansong Lu, Minzhe Niu, Hang Xu, Xiao- dan Liang, Zhenguo Li, Xin Jiang, and Chunjing Xu. Filip: Fine-grained interactive language-image pre-training.arXiv preprint arXiv:2111.07783,

arXiv

[14] [14]

Neural information retrieval: A literature review.arXiv preprint arXiv:1611.06792,

[ZRB+16] Ye Zhang, Md Mustafizur Rahman, Alex Braylan, Brandon Dang, Heng-Lu Chang, Henna Kim, Quinten McNamara, Aaron Angert, Edward Banner, Vivek Khetan, et al. Neural information retrieval: A literature review.arXiv preprint arXiv:1611.06792,

Pith/arXiv arXiv