Released May 2026

Quantum Hybrid Modules for AI

Attention, Optimization, and Verification on Near-Term Quantum Hardware

Vikram Lex

Quantum Computing · Quantum Machine Learning · QAOA · Grover Search · Hybrid Quantum-Classical · NISQ · Neurosymbolic AI
22
Experiments
Spanning simulation, cloud, QPU hardware, and GPU
100
Qubits
Maximum scale tested on Rigetti Cepheus QPU
20,862×
Query Speedup
Grover verification at 30 qubits on NVIDIA A100
5
Platforms
Statevector, Azure Quantum, AWS Braket, Rigetti Cepheus, IonQ Forte
Abstract

Paper Summary

We propose Quantum Hybrid Modules (QHM), a framework for inserting quantum subroutines into classical AI pipelines with explicit assumptions about data loading, noise, and measurement overhead. The framework comprises three components: (1) a Quantum Self-Attention Neural Network (QSANN) computing sparse attention via Hadamard-test inner products in O(n√(n/k) · log d/ε) time assuming qRAM; (2) a QAOA-based module for combinatorial optimization sub-problems; and (3) a Grover-accelerated neurosymbolic verification loop for quadratically faster constraint checking. We evaluate QHM through 22 experiments spanning local statevector simulation and NVIDIA A100 GPU scaling up to 30 qubits, Azure and Braket cloud simulators/emulators, and five real-QPU experiments on Rigetti Cepheus and IonQ Forte, including 8–100-qubit superconducting QAOA and 8–30-qubit trapped-ion QAOA. Simulations confirm the intended mechanisms: QAOA exceeds the Goemans–Williamson reference value on selected 3-regular instances, ideal Hadamard-test attention reproduces classical attention, and Grover verification reaches a 20,862× query reduction at 30 qubits. Classical, dequantized, and qRAM-free baselines expose current limits: small quantum kernels and retrieval experiments lag tuned RBF/cosine baselines, hybrid VQC layers add training cost without consistent accuracy gains, and shot-based NISQ attention is limited by systematic inner-product bias. Hardware results are depth- and platform-dependent: shallow kernel and attention circuits retain measurable signal, Rigetti QAOA depolarizes to the random baseline and is not recovered by post-hoc mitigation, and IonQ Forte with default compiler-level mitigation preserves QAOA signal but remains below noiseless performance. We do not claim near-term quantum advantage; we characterize the gap between asymptotic theory and present-day hardware.

Framework

Three Quantum Modules

Each module targets a specific computational pattern where quantum subroutines offer asymptotic advantages under stated assumptions.

QSANN

Quantum Self-Attention Neural Network

Computes attention via Hadamard-test inner products in O(n√(n/k) · log d/ε) time assuming qRAM. Statevector simulation achieves cosine similarity 1.000000 with classical attention at all scales (4–32 tokens).
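The Hadamard-test mechanism can be checked classically. A minimal NumPy sketch (illustrative function names; a statevector calculation, not the paper's qRAM-based circuit): build unitaries with V|0⟩ = |q⟩ and |k⟩, run the test with U = V_k V_q†, and read Re⟨q|k⟩ off P(ancilla=0) = (1 + Re⟨q|k⟩)/2.

```python
import numpy as np

def prep_unitary(v):
    """Return a unitary whose first column is the unit vector v (QR completion)."""
    d = len(v)
    M = np.eye(d, dtype=complex)
    M[:, 0] = v
    Q, _ = np.linalg.qr(M)
    Q[:, 0] *= np.vdot(Q[:, 0], v)   # undo any phase flip introduced by QR
    return Q

def hadamard_test_re(q, k):
    """Real part of <q|k> recovered from the Hadamard-test probability P(ancilla=0)."""
    U = prep_unitary(k) @ prep_unitary(q).conj().T   # U|q> = |k>
    branch0 = (q + U @ q) / 2        # ancilla-|0> amplitude after the final H
    p0 = np.vdot(branch0, branch0).real
    return 2 * p0 - 1                # since P(0) = (1 + Re<q|U|q>) / 2
```

Estimating P(0) from S shots adds binomial noise of order 1/(2√S) on top of any hardware bias, which is why the finite-shot and QPU runs below behave differently from this ideal calculation.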

QAOA Module

Quantum Approximate Optimization Algorithm

Solves combinatorial optimization sub-problems embedded within AI decision-making pipelines. At depth p=3 on 3-regular graphs, approximation ratios reach 0.889–0.966, exceeding the Goemans–Williamson guarantee (0.878).
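A depth-1 QAOA layer is small enough to simulate directly. The sketch below (illustrative names; noiseless statevector only) applies the cost phase e^{-iγC} and the X-mixer e^{-iβX} to |+⟩^n and returns the expected cut value:

```python
import numpy as np

def qaoa_p1_cut(n, edges, gamma, beta):
    """Expected cut value of the depth-1 QAOA state on a MAX-CUT instance."""
    dim = 2 ** n
    bits = (np.arange(dim)[:, None] >> np.arange(n)) & 1   # (dim, n) bit table
    cut = np.zeros(dim)
    for i, j in edges:
        cut += bits[:, i] != bits[:, j]
    psi = np.full(dim, dim ** -0.5, dtype=complex)         # |+>^n
    psi = psi * np.exp(-1j * gamma * cut)                  # cost layer e^{-i gamma C}
    c, s = np.cos(beta), -1j * np.sin(beta)
    for q in range(n):                                     # mixer e^{-i beta X_q}
        block = psi.reshape(2 ** (n - q - 1), 2, 2 ** q)
        a, b = block[:, 0, :].copy(), block[:, 1, :].copy()
        block[:, 0, :] = c * a + s * b
        block[:, 1, :] = s * a + c * b
        psi = block.reshape(dim)
    return float(np.abs(psi) ** 2 @ cut)
```

On a triangle graph, a coarse grid search over (γ, β) already lifts the expected cut above the uniform-state baseline of 1.5; the p=3 and p=5 ratios quoted above come from the same mechanism at greater depth.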

Grover Verification

Grover-Accelerated Constraint Checking

Quadratically faster neurosymbolic verification for AI safety and alignment. At 22 qubits (4.2M states), reduces violations from 55% to 0.04% in 40 iterations with 665,762× fewer queries than classical search.
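The quadratic query saving behind the verification loop follows from the exact two-dimensional rotation picture of Grover search. A sketch (illustrative function names), assuming k marked states among N and the usual π/4·√(N/k) stopping rule:

```python
import math

def grover_success(n_states, n_marked, iters):
    """Success probability after `iters` Grover iterations (exact 2D-rotation model)."""
    theta = math.asin(math.sqrt(n_marked / n_states))
    return math.sin((2 * iters + 1) * theta) ** 2

def grover_iters(n_states, n_marked):
    """Integer iteration count closest to the pi/4 * sqrt(N/k) optimum."""
    theta = math.asin(math.sqrt(n_marked / n_states))
    return max(1, round(math.pi / (4 * theta) - 0.5))
```

With N = 2^20 and a single marked state this gives 804 iterations against roughly N/2 expected classical queries, consistent with the ~652× figure reported at 20 qubits.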

Architecture

System Architecture

QHM architecture overview showing classical model delegating computation to quantum module via amplitude encoding

QHM architecture: a classical deep learning model (left) delegates computation to a quantum module (right) via amplitude encoding; a hybrid optimizer jointly updates both parameter sets.
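The amplitude-encoding step in the figure reduces to a few lines of linear algebra: a length-m feature vector is zero-padded to 2^n entries and L2-normalized so it forms a valid n-qubit statevector. An illustrative helper (the classical arithmetic only, not the state-preparation circuit):

```python
import numpy as np

def amplitude_encode(x, n_qubits):
    """Pad a classical feature vector to 2^n entries and L2-normalize it."""
    dim = 2 ** n_qubits
    if len(x) > dim:
        raise ValueError("feature vector does not fit in 2^n amplitudes")
    v = np.zeros(dim)
    v[: len(x)] = x                  # zero-pad the unused amplitudes
    norm = np.linalg.norm(v)
    if norm == 0:
        raise ValueError("cannot encode the all-zero vector")
    return v / norm
```

This hides the caveat the abstract flags: preparing an arbitrary normalized vector on hardware generally needs a circuit whose gate count grows exponentially in n unless qRAM-style loading assumptions hold.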

Experiments

Experiment Highlights

22 experiments across four tiers of quantum infrastructure, from local simulation to real QPU hardware.

Statevector Simulation

Local CPU simulation, 4–20 qubits

Quantum Kernel Classification (4–8q)

ZZ-feature map SVM: 0.617 accuracy (8q) vs Linear SVM 0.967. Honest negative result — small quantum kernels lack expressiveness.
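The kernel entries behind this experiment are state overlaps. A simplified two-qubit ZZ-style feature map (a sketch of the general construction, not the exact map used in the experiments) makes the computation concrete:

```python
import numpy as np

# Diagonal Z observables on 2 qubits (basis order |00>,|01>,|10>,|11>; qubit 0 = MSB)
Z0 = np.array([1, 1, -1, -1])
Z1 = np.array([1, -1, 1, -1])

def zz_feature_state(x):
    """Simplified 2-qubit ZZ-style feature map: H layer, RZ(x_i), entangling ZZ phase."""
    psi = np.full(4, 0.5, dtype=complex)             # (H|0>)^2
    psi = psi * np.exp(-0.5j * x[0] * Z0)            # RZ(x0) on qubit 0
    psi = psi * np.exp(-0.5j * x[1] * Z1)            # RZ(x1) on qubit 1
    psi = psi * np.exp(-1j * x[0] * x[1] * Z0 * Z1)  # entangling phase e^{-i x0 x1 Z0 Z1}
    return psi

def fidelity_kernel(x, y):
    """K(x, y) = |<phi(x)|phi(y)>|^2, the overlap hardware estimates from P(00...0)."""
    return abs(np.vdot(zz_feature_state(x), zz_feature_state(y))) ** 2
```

K(x, x) = 1 and the entries fall off as the feature-space angle grows; the 0.617 accuracy above reflects the limited expressiveness of such small feature maps, as noted.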

QAOA MAX-CUT (8–20q)

At p=3: ratios 0.889–0.966, exceeding GW guarantee (0.878). At p=5: 0.908–0.983.

Quantum Attention (QSANN, 3–6q)

Cosine similarity 1.000000 at all scales (4–32 tokens). 1,000 shots suffice for >0.9999 fidelity.

Grover Verification (4–20q)

O(√N) query scaling confirmed. 652× speedup at 20 qubits. Peak success 99.99% at optimal iterations.

Cloud Simulators & Emulators

Azure Quantum + Quantinuum H2-1E, 2–8 qubits

Core Circuit Validation (2–5q)

IonQ Simulator error <1%; Rigetti QVM 0.4–14%. Clear noise hierarchy across backends.

Scaled QAOA + Kernel (6–8q)

8q QAOA p=3 error 0.112 (IonQ), 0.010 (Rigetti). 6q kernel error 5×10⁻⁵.

Quantinuum H2-1E Emulator (2–8q)

QAOA 2–3% relative error; deeper circuits 4–15%, consistent with the backend noise hierarchy.

QPU Hardware

Rigetti Cepheus (superconducting) + IonQ Forte (trapped-ion)

QAOA on Rigetti Cepheus (8–100q)

43 QPU tasks. Edge-cut 0.489±0.009 — indistinguishable from random. Full depolarization at all scales.

Hadamard Test on Cepheus (3q)

10,240 shots. Systematic noise bias but reproducible (±0.04). Signal retained at 36 gates.

QAOA on IonQ Forte (8–30q)

Ratios 0.558–0.615, 12–23% above random. Peak 0.783 at p=3 (140 gates), collapses at p=4.

Kernel on IonQ Forte (4q)

Mean overlap 0.222 (3.6× random, 2.7× Cepheus). Within-class P(0)=0.89–0.93 vs cross-class 0.51.

GPU-Accelerated (NVIDIA A100)

Large-scale simulation, 10–30 qubits

Hybrid Classical-Quantum Pipeline (10q)

7 encoders × 3 tasks. Dequantized baseline matches hybrid accuracy at 200× lower cost.

GPU-Scaled QAOA (24q)

24-node graph, ratio 0.916 at p=4. Exceeds GW guarantee by +4.3%.

Neurosymbolic Verification Loop (10–30q)

22q: violations 55%→0.04% in 40 iterations, 665,762× query reduction. 20,862× at 30q.

Error Mitigation Analysis (8–100q)

ZNE + readout correction on Cepheus noise model. No mitigation recovers thermalized circuits. Positive control confirms method works at moderate noise.
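The extrapolation step of ZNE is plain curve fitting: measure the same observable at amplified noise scales λ ≥ 1, fit E(λ), and evaluate the fit at λ = 0. A minimal sketch with an illustrative toy decay model (not the Cepheus noise model):

```python
import numpy as np

def zne_extrapolate(noise_scales, exp_values, degree=1):
    """Polynomial zero-noise extrapolation: fit E(lambda), evaluate at lambda = 0."""
    coeffs = np.polyfit(noise_scales, exp_values, degree)
    return float(np.polyval(coeffs, 0.0))

# Toy example: an observable with true value 0.8 decaying under scaled noise.
true_val = 0.8
scales = np.array([1.0, 2.0, 3.0])
noisy = true_val * np.exp(-0.3 * scales)       # assumed exponential decay model
estimate = zne_extrapolate(scales, noisy, degree=1)
```

This also illustrates why no extrapolation recovers a thermalized circuit: once every measured value sits at the depolarized baseline, the data carry no slope, and fitting flat data returns the flat value.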

Results

Selected Figures

Key results from simulation, cloud, and hardware experiments.

QHM data flow pipeline from classical input through quantum circuit to post-processing

Data flow: classical input → amplitude encoding → quantum circuit (kernel / QAOA / Grover) → measurement → classical post-processing.

QSANN Hadamard test circuit for quantum attention inner products

Hadamard test circuit for QSANN attention. The ancilla qubit mediates the inner product between query and key states.

Quantum-neurosymbolic verification loop with Grover-accelerated constraint checking

Quantum-neurosymbolic verification loop: Grover-based oracle checks constraints in O(√(N/k)) queries per iteration.

QAOA on Rigetti Cepheus showing full depolarization at all qubit counts

QAOA p=1 on Rigetti Cepheus (8–100 qubits). All QPU runs cluster at ~0.49 (random), while noiseless simulator achieves 0.75–0.80.

IonQ Forte vs Rigetti Cepheus QAOA comparison across graph sizes

IonQ Forte vs Rigetti Cepheus across graph sizes. IonQ preserves 12–23% above random; Cepheus thermalizes at every size.

GPU Grover search scaling from 10 to 30 qubits on NVIDIA A100

GPU-accelerated Grover search scaling (10–30 qubits). Query speedup reaches 20,862× at 30 qubits (1.07B states, 17.2 GB VRAM).

Neurosymbolic verification loop showing 665,762x query reduction

Verification loop at 22 qubits (4.2M states). Violation rate drops from 55% to 0.04% with 665,762× fewer quantum queries than classical search.

Citation

Cite This Paper

@article{lex2025qhm,
  title={Quantum Hybrid Modules for {AI}: Attention, Optimization, and
         Verification on Near-Term Quantum Hardware},
  author={Lex, Vikram},
  year={2025},
  doi={10.5281/zenodo.20103457},
  note={Under review}
}