Can You Trust the Vectors in Your Vector Database? Black-Hole Attack from Embedding Space Defects

2026 · cs.CR · arXiv 2604.05480

1 Pith paper cites this work.


abstract

Vector databases serve as the retrieval backbone of modern AI applications, yet their security remains largely unexplored. We propose the Black-Hole Attack, a poisoning attack that injects a small number of malicious vectors near the geometric center of the stored vectors. These injected vectors attract queries like a black hole and frequently appear in the top-k retrieval results for most queries. This attack is enabled by a phenomenon we term centrality-driven hubness: in high-dimensional embedding spaces, vectors near the centroid become nearest neighbors of a disproportionately large number of other vectors, while this centroid region is nearly empty in practice. The attack shows that vectors in a vector database cannot be blindly trusted: geometric defects in high-dimensional embeddings make retrieval inherently vulnerable. Based on this insight, we propose four attack paths tailored to different attacker capabilities. Our experiments show that up to 94.4% of queries are successfully attacked. Additionally, we study two directions of defense: hubness mitigation and detection-based filtering. Hubness mitigation either significantly reduces retrieval accuracy or provides only limited protection, while the detection-based defense is effective against some attack paths but fails against others. A robust and adaptive defense thus remains an open problem, and our findings indicate that vector databases require more careful treatment of security.
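The centrality-driven hubness effect described in the abstract can be illustrated with a small synthetic experiment. The sketch below is not the paper's method or any of its four attack paths: it assumes i.i.d. Gaussian vectors, Euclidean distance, and a brute-force top-k search, and the function name `centroid_hit_rate` and all parameter values are hypothetical choices made for illustration. It injects one vector at the centroid of the stored vectors and measures the fraction of queries whose top-k results contain it.

```python
import numpy as np


def centroid_hit_rate(dim, n_db=2000, n_query=500, k=10, seed=0):
    """Fraction of queries whose top-k neighbors include one injected centroid vector.

    Assumptions (illustrative only): synthetic Gaussian data, Euclidean
    distance, exact brute-force search.
    """
    rng = np.random.default_rng(seed)
    db = rng.standard_normal((n_db, dim))          # stored vectors
    queries = rng.standard_normal((n_query, dim))  # query vectors

    # Inject a single vector at the geometric center of the stored vectors.
    injected = db.mean(axis=0, keepdims=True)
    poisoned = np.vstack([db, injected])           # injected vector has index n_db

    # Squared Euclidean distance from every query to every stored vector.
    d2 = (
        (queries ** 2).sum(axis=1)[:, None]
        - 2.0 * queries @ poisoned.T
        + (poisoned ** 2).sum(axis=1)[None, :]
    )

    # Indices of the k nearest stored vectors per query (unordered within the top k).
    topk = np.argpartition(d2, k, axis=1)[:, :k]
    return float((topk == n_db).any(axis=1).mean())


if __name__ == "__main__":
    for dim in (2, 16, 128):
        print(dim, centroid_hit_rate(dim))
```

Under these assumptions the hit rate is small in two dimensions and approaches 1 as the dimension grows, which matches the qualitative claim that a vector near the centroid becomes a nearest neighbor of a disproportionate share of queries in high-dimensional spaces. Real embedding distributions and similarity metrics differ, so the numbers from this sketch should not be read as estimates of the attack success rates reported in the paper.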

representative citing papers

Data Agents Under Attack: Vulnerabilities in LLM-Driven Analytical Systems

cs.CR · 2026-06-07 · unverdicted · novelty 7.0 · ref 27

The paper introduces a layered vulnerability framework and attack taxonomy for LLM-driven data agents and demonstrates attacks on four open-source and two production systems.
