arxiv: 2501.15630 · v2 · pith:5IMV3CQ2new · submitted 2025-01-26 · 💻 cs.CL · quant-ph

Quantum-Enhanced Attention Mechanism in NLP: A Hybrid Classical-Quantum Approach

classification 💻 cs.CL quant-ph
keywords attention · hybrid · quantum · approach · classical · classical-quantum · mechanism · model

Recent advances in quantum computing have opened new pathways for enhancing deep learning architectures, particularly in domains characterized by high-dimensional and context-rich data such as natural language processing (NLP). In this work, we present a hybrid classical-quantum Transformer model that integrates a quantum-enhanced attention mechanism into the standard classical architecture. By embedding token representations into a quantum Hilbert space via parameterized variational circuits and exploiting entanglement-aware kernel similarities, the model captures complex semantic relationships beyond the reach of conventional dot-product attention. We demonstrate the effectiveness of this approach across diverse NLP benchmarks, showing improvements in both efficiency and representational capacity. The results reveal that the quantum attention layer yields globally coherent attention maps and more separable latent features, while requiring comparatively fewer parameters than classical counterparts. These findings highlight the potential of quantum-classical hybrid models to serve as a powerful and resource-efficient alternative to existing attention mechanisms in NLP.
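
The abstract does not specify the circuit, so the following is only a rough illustration of how a kernel-based quantum attention score could be computed, simulated classically with NumPy. Each token vector is encoded as a statevector, the query-key score is the state fidelity |⟨ψ(q)|ψ(k)⟩|², and the scores are normalized with a softmax in place of the dot-product scores of classical attention. The circuit layout (an RY encoding layer, a CNOT chain, and a second RY layer that re-uploads the data scaled by weights `w`), the weights themselves, and all function names are assumptions made for this sketch, not the paper's architecture.

```python
import numpy as np

def ry(theta):
    """Single-qubit RY rotation matrix (real-valued)."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])

def ry_layer(angles):
    """Tensor product of RY gates, one per qubit (qubit 0 is most significant)."""
    U = np.array([[1.0]])
    for a in angles:
        U = np.kron(U, ry(a))
    return U

def apply_cnot(state, control, target, n):
    """Apply a CNOT to an n-qubit statevector by flipping the target axis
    inside the control=1 slice."""
    psi = state.reshape([2] * n).copy()
    idx = [slice(None)] * n
    idx[control] = 1
    sub = psi[tuple(idx)].copy()  # control axis is dropped here
    t_axis = target if target < control else target - 1
    psi[tuple(idx)] = np.flip(sub, axis=t_axis)
    return psi.reshape(-1)

def encode(x, w):
    """Hypothetical feature map: RY(x), CNOT chain, then RY(w * x).
    The second data-dependent layer makes the entangler affect the kernel."""
    n = len(x)
    state = np.zeros(2 ** n)
    state[0] = 1.0  # start in |0...0>
    state = ry_layer(x) @ state
    for q in range(n - 1):
        state = apply_cnot(state, q, q + 1, n)
    return ry_layer(w * x) @ state

def quantum_kernel_attention(Q, K, V, w):
    """Attention with fidelity scores instead of scaled dot products."""
    q_states = np.stack([encode(q, w) for q in Q])
    k_states = np.stack([encode(k, w) for k in K])
    scores = (q_states @ k_states.T) ** 2  # fidelity; states are real
    exp_scores = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights = exp_scores / exp_scores.sum(axis=1, keepdims=True)
    return weights @ V, weights

# Usage with made-up data: 3 queries, 4 keys, 2 qubits (2 features per token).
rng = np.random.default_rng(0)
Q = rng.uniform(0, np.pi, (3, 2))
K = rng.uniform(0, np.pi, (4, 2))
V = rng.normal(size=(4, 5))
w = np.array([0.5, 1.5])  # hypothetical trainable re-uploading weights
out, weights = quantum_kernel_attention(Q, K, V, w)
print(out.shape, weights.shape)
```

The statevector has 2^n entries for n qubits, so this simulation is only practical for a handful of qubits; it is meant to show the shape of the computation rather than to reproduce any reported result.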

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Quantum Parameterized Self-Attention Network for Image Classification

    quant-ph 2026-05 unverdicted novelty 6.0

    QPSAN implements self-attention via PQCs with 5 parameters, establishes a theoretical framework for its scoring properties, and reports outperformance over ViT on four vision datasets that grows with data complexity.