Bridging Quantum Computing Paradigms toward Semiconductor Yield: A Controlled CV-versus-DV Comparison on Wafer-Map Defect Classification

Jonghyeok Im; Kyoungsik Kim; Monu Nath Baitha; Yeonhong Kim

arXiv: 2607.00961 · v1 · pith:HHQLDLPKnew · submitted 2026-07-01 · quant-ph · cs.LG

Yeonhong Kim, Jonghyeok Im, Monu Nath Baitha, Kyoungsik Kim

Pith reviewed 2026-07-02 12:04 UTC · model grok-4.3

keywords: quantum neural networks; continuous-variable quantum computing; discrete-variable quantum computing; wafer defect classification; semiconductor yield; CV versus DV comparison; quantum machine learning

The pith

A continuous-variable quantum head reaches 79.7% accuracy on wafer defect classification while the matched discrete-variable head reaches only 61.6%.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper fixes a convolutional backbone of roughly 4.3 million parameters and swaps only the final head among classical dense, CV-QNN, and DV-QNN versions. At four qumodes or qubits the CV head leads by 18 points with non-overlapping error bars, and the largest lift is on the Edge-Loc defect class, which the DV head essentially cannot detect. Training curves indicate that the DV shortfall is a representational-capacity ceiling rather than a failure to optimize. Both quantum heads still trail the classical baseline of 85.0%, yet the controlled swap isolates where the structured CV layer already supplies measurable spatial discrimination.

Core claim

When a shared classical backbone feeds interchangeable heads, the continuous-variable quantum head consistently outperforms the discrete-variable head at every tested size; at four units the gap is 79.7 +/- 1.8% versus 61.6 +/- 1.4%, driven by markedly higher recall on spatially localized defects that the DV head fails to separate from similar-looking classes.
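The "non-overlapping" description of the gap can be checked with a few lines of arithmetic. The sketch below assumes each reported "+/-" value is a symmetric half-width around the mean; this page does not say whether it is a standard deviation or a confidence interval, and the helper names are illustrative.

```python
def interval(mean, half_width):
    """Return (low, high) for a value reported as mean +/- half_width."""
    return (mean - half_width, mean + half_width)


def intervals_overlap(a, b):
    """True if the closed intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


cv = interval(79.7, 1.8)   # CV head at four qumodes, accuracy in percent
dv = interval(61.6, 1.4)   # DV head at four qubits, accuracy in percent

gap = round(79.7 - 61.6, 1)          # difference of the reported means
clearance = round(cv[0] - dv[1], 1)  # space left between the two intervals

print(f"CV interval: {cv[0]:.1f} to {cv[1]:.1f}")
print(f"DV interval: {dv[0]:.1f} to {dv[1]:.1f}")
print(f"overlap: {intervals_overlap(cv, dv)}, gap: {gap}, clearance: {clearance}")
```

Under that reading the means differ by 18.1 points and the two intervals are separated by 14.9 points, so the gap is far larger than either reported spread.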

What carries the argument

Interchangeable CV-QNN and DV-QNN heads attached to a fixed convolutional backbone, which holds every other variable constant so that observed accuracy differences trace to the quantum paradigm alone.
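The head-swap design can be pictured as one feature extractor shared by several models. The following is a minimal structural sketch only: the backbone, the linear placeholder heads, their weights, and the toy wafer map are all hypothetical stand-ins, not the paper's convolutional backbone or quantum circuits.

```python
from typing import Callable, Dict, List, Sequence

WaferMap = Sequence[Sequence[int]]
Features = List[float]
Head = Callable[[Features], int]


def backbone(wafer_map: WaferMap) -> Features:
    """Hypothetical stand-in for the fixed backbone: defect density per row."""
    return [sum(row) / len(row) for row in wafer_map]


def make_linear_head(weights: List[List[float]]) -> Head:
    """Placeholder head: one score per class, predict the highest-scoring class."""
    def head(features: Features) -> int:
        scores = [sum(w * f for w, f in zip(row, features)) for row in weights]
        return max(range(len(scores)), key=scores.__getitem__)
    return head


def build_model(head: Head) -> Callable[[WaferMap], int]:
    """Compose the shared backbone with whichever head is plugged in."""
    return lambda wafer_map: head(backbone(wafer_map))


# Placeholder heads standing in for the dense, CV-QNN and DV-QNN heads.
# The weights are arbitrary illustrative numbers.
heads: Dict[str, Head] = {
    "dense": make_linear_head([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]),
    "cv_qnn": make_linear_head([[0.5, 0.5, 0.0], [0.0, 0.0, 1.0]]),
    "dv_qnn": make_linear_head([[0.0, 1.0, 1.0], [1.0, 0.0, 0.0]]),
}
models = {name: build_model(head) for name, head in heads.items()}

toy_map = [[0, 1, 1, 0], [0, 0, 0, 0], [1, 1, 1, 1]]  # synthetic 3x4 map
predictions = {name: model(toy_map) for name, model in models.items()}
print(predictions)
```

Because every model calls the same `backbone`, any difference among the entries of `predictions` can only come from the head, which is the property the controlled comparison relies on.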

If this is right

The DV limitation appears as a representational-capacity ceiling visible in the training curves rather than an optimization failure.
At the Fock cutoff used (d = 2), the CV advantage reflects its structured, neural-network-analogue layer and continuous phase-space encoding rather than Hilbert-space dimensionality.
On IBM hardware, DV accuracy holds at shallow depth and degrades only at the deepest circuit tested.
Both quantum heads stay below the classical 85% baseline, indicating that current noise and scale still limit practical use.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If the CV advantage survives when the Fock cutoff or circuit depth is increased, hardware roadmaps for quantum ML in manufacturing could prioritize continuous-variable platforms for spatially structured tasks.
The same controlled-head design could be applied to other image datasets that contain fine-grained localization cues to test whether the pattern generalizes beyond wafer maps.

Load-bearing premise

That any accuracy gap between the two quantum heads is caused by intrinsic properties of the CV versus DV layer rather than by implementation details, optimizer behavior, or the specific Fock cutoff chosen.

What would settle it

If the DV head, given an equivalent number of trainable parameters or a different encoding at the same cutoff, closes the 18-point gap on the Edge-Loc class while the backbone stays fixed, the claim that the advantage is intrinsic to the CV paradigm would be falsified.

Original abstract

Realizing quantum neural networks (QNNs) in industry requires knowing which quantum computing paradigm suits which task. Motivated by AI accelerators and high-bandwidth memory, where die stacking makes wafer-level defect screening central to yield, we study WM-811K wafer-map defect classification (eight classes), comparing the dominant paradigms, continuous-variable (CV) and discrete-variable (DV), under controlled conditions. To isolate the quantum circuit as the sole variable, a shared convolutional backbone (~4.3M parameters) feeds interchangeable heads (classical dense, CV-QNN, or DV-QNN) as the only structural difference; each quantum head is scaled over three sizes (3, 4, 8 qumodes/qubits). The CV head consistently outperforms the DV head: at four qumodes/qubits it reaches 79.7 +/- 1.8% accuracy versus 61.6 +/- 1.4%, a non-overlapping 18-point gap. The advantage is sharpest on the spatially localized Edge-Loc class, easily confused with Scratch, which CV recovers with recall 0.66 +/- 0.06 while DV fails at every size (<=0.05), showing the structured CV layer better captures fine spatial distinctions between defect types. Training curves show the DV limitation is a representational-capacity ceiling, not an optimization failure; at the Fock cutoff used here (d = 2) the CV advantage reflects two intrinsic properties, a structured, neural-network-analogue layer and continuous phase-space encoding, not Hilbert-space dimensionality. On IBM hardware, DV accuracy holds at shallow depth, degrading only at the deepest circuit. Both quantum heads remain below the classical baseline (85.0%), but the controlled setting isolates where a structured head already helps and, as noise and scale improve, which paradigm can deliver practical advantage.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

Controlled CV vs DV head swap on wafer defects shows a clear accuracy gap favoring CV, but the isolation of paradigm from implementation details looks incomplete.

The main takeaway is that swapping only the final head on a fixed 4.3M-parameter convolutional backbone produces a consistent edge for the CV version over DV on the WM-811K dataset. At four qumodes/qubits the numbers are 79.7 +/- 1.8% versus 61.6 +/- 1.4%, with the largest separation on the Edge-Loc class where CV recall reaches 0.66 while DV stays near zero across sizes.

They scale both heads over three sizes, report error bars, include training curves that point to a capacity limit rather than optimization failure for DV, and add a shallow-depth IBM hardware check for the DV case. The per-class recall tables and the explicit attempt to hold the backbone constant are the concrete contributions here.
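Per-class recall, the quantity behind the Edge-Loc comparison, is read off a confusion matrix as correct predictions divided by the number of true examples of that class. A minimal sketch follows; the three-class matrix is synthetic, invented purely to show a class being absorbed by a look-alike, and "Other" is a hypothetical catch-all label rather than a WM-811K class.

```python
from typing import Dict, List


def per_class_recall(confusion: List[List[int]], labels: List[str]) -> Dict[str, float]:
    """Rows are true classes, columns are predicted classes.

    Recall for class i is confusion[i][i] divided by the sum of row i.
    """
    recall = {}
    for i, name in enumerate(labels):
        total = sum(confusion[i])
        recall[name] = confusion[i][i] / total if total else 0.0
    return recall


labels = ["Edge-Loc", "Scratch", "Other"]

# Synthetic counts (not from the paper): most true Edge-Loc maps are
# predicted as Scratch, so Edge-Loc recall collapses while Scratch stays high.
toy_confusion = [
    [2, 17, 1],
    [1, 18, 1],
    [0, 2, 18],
]

recall = per_class_recall(toy_confusion, labels)
for name, value in recall.items():
    print(f"{name}: {value:.2f}")
```

Overall accuracy can look moderate in such a case even though one class is almost never recovered, which is why the class-resolved numbers carry more of the argument than the headline accuracy.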

The design tries to make the quantum circuit the only variable, which is a reasonable framing for comparing paradigms on spatial defect data. The fact that both quantum heads still trail the classical 85% baseline is stated plainly.

The soft spot is whether the heads are matched closely enough for the gap to be attributed cleanly to CV properties like the structured layer and phase-space encoding at d=2. The abstract describes the CV head with neural-network-analogue structure but gives no parallel detail on the DV head's circuit depth, variational parameter count, or exact layout. At equal Hilbert-space dimension any residual difference must come from those choices, so the 18-point gap cannot yet be locked to the paradigm label alone.

This is useful for readers working on quantum ML for industrial inspection tasks or anyone running controlled paradigm comparisons on image-like data. The empirical grounding and class breakdowns are strong enough to justify sending it to a serious referee, provided the methods section supplies the missing head-matching details.

Referee Report

2 major / 1 minor

Summary. The manuscript reports a controlled empirical comparison of continuous-variable (CV) and discrete-variable (DV) quantum neural network heads for eight-class defect classification on the WM-811K wafer-map dataset. A shared ~4.3M-parameter convolutional backbone feeds interchangeable heads (classical dense, CV-QNN, DV-QNN) scaled at 3/4/8 qumodes/qubits; the CV head reaches 79.7 +/- 1.8% accuracy versus 61.6 +/- 1.4% for DV at size 4, with the largest gap on the Edge-Loc class (recall 0.66 +/- 0.06 vs <=0.05). The advantage is attributed to intrinsic CV properties (structured NN-analogue layer and continuous phase-space encoding at Fock cutoff d=2) rather than Hilbert-space dimension; DV is described as hitting a representational-capacity ceiling. A shallow-depth IBM hardware check is included for DV, and both quantum heads remain below the classical baseline of 85.0%.

Significance. If the isolation of the quantum paradigm as the sole variable holds, the work supplies concrete, class-resolved benchmarks on paradigm suitability for a high-value industrial task in semiconductor yield. The non-overlapping error bars, multi-size scaling, and explicit contrast between optimization failure versus capacity ceiling constitute a useful empirical contribution; the attempt to hold the backbone fixed while varying only the head is a methodological strength worth preserving.

major comments (2)

[Abstract] The central claim that the observed 18-point gap (and Edge-Loc recall difference) reflects 'intrinsic properties' of the CV paradigm requires that the CV and DV heads be matched in variational parameter count, layer depth, and optimization protocol. The asymmetric description of the CV head as incorporating a 'structured, neural-network-analogue layer' without a parallel specification for the DV head leaves open the possibility that the gap arises from uncontrolled implementation differences rather than the paradigm itself.
[Abstract (quantum-head description)] The experimental design (shared backbone, 'interchangeable' heads) is load-bearing for attributing any gap to CV versus DV. Without reported details on the number of trainable parameters per head or explicit confirmation that the DV head receives an equivalent structured variational circuit at d=2, the isolation of the paradigm cannot be verified from the provided information.
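To illustrate why per-head parameter counts matter for both major comments, the sketch below tabulates trainable parameters under two assumed counting rules. Neither rule is taken from the paper: the CV rule assumes a layer of the form interferometer, squeezing, interferometer, displacement, Kerr gate, with n^2 real parameters per n-mode interferometer and one real parameter per mode for each remaining gate; the DV rule assumes three rotation angles per qubit per layer. The function names are hypothetical.

```python
import math


def cv_layer_param_count(n_modes: int) -> int:
    """Assumed rule: two interferometers (n^2 real parameters each) plus one
    real parameter per mode for squeezing, displacement and Kerr gates."""
    return 2 * n_modes * n_modes + 3 * n_modes


def dv_layer_param_count(n_qubits: int) -> int:
    """Assumed rule: three rotation angles per qubit in each layer."""
    return 3 * n_qubits


def dv_layers_to_match(n: int) -> int:
    """Smallest number of DV layers whose count reaches one CV layer's count."""
    return math.ceil(cv_layer_param_count(n) / dv_layer_param_count(n))


for n in (3, 4, 8):  # the three head sizes reported in the abstract
    print(
        f"n={n}: CV layer {cv_layer_param_count(n)} params, "
        f"DV layer {dv_layer_param_count(n)} params, "
        f"DV layers to match: {dv_layers_to_match(n)}"
    )
```

Under these assumptions a single CV layer at n = 4 carries 44 parameters against 12 for a single DV layer, so "same number of qumodes and qubits" does not by itself imply matched capacity. Whether the paper's heads differ in this way cannot be determined from the abstract, which is the point of the request.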

minor comments (1)

[Abstract] The abstract states that 'Training curves show the DV limitation is a representational-capacity ceiling'; a figure reference and brief description of how the curves distinguish capacity from optimization failure would improve verifiability.
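One possible way to make the requested distinction concrete is a simple plateau test on the training loss. This is a generic heuristic offered as an assumption, not the criterion the authors used: a loss that is flat and stable is consistent with a capacity ceiling, while a loss that is still falling or oscillating points toward an optimization issue. The function name, window, and tolerance are illustrative.

```python
from typing import List


def classify_training_curve(train_loss: List[float], window: int = 4,
                            tol: float = 1e-3) -> str:
    """Label the end of a training-loss curve.

    Compares the mean of the last `window` epochs with the mean of the
    `window` epochs before them, and checks the spread of the last window.
    """
    if len(train_loss) < 2 * window:
        raise ValueError("need at least 2 * window epochs")
    recent = train_loss[-window:]
    earlier = train_loss[-2 * window:-window]
    improvement = sum(earlier) / window - sum(recent) / window
    spread = max(recent) - min(recent)
    if improvement > tol:
        return "still improving"   # training has not converged yet
    if spread > 10 * tol:
        return "unstable"          # oscillating loss, suspect the optimizer
    return "flat plateau"          # converged; a low score here suggests a ceiling


# Synthetic curves for illustration only.
flat = [0.9] * 8
falling = [1.0 - 0.05 * i for i in range(8)]
oscillating = [1.0, 0.5] * 4

for name, curve in [("flat", flat), ("falling", falling), ("oscillating", oscillating)]:
    print(name, "->", classify_training_curve(curve))
```

A "flat plateau" label alone does not prove a capacity limit; it would need to be paired with low training accuracy and with runs from several initializations converging to the same level.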

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments on the abstract. The points raised correctly identify that the central claim of paradigm isolation requires explicit verification of head equivalence, which the current abstract does not fully supply. We will revise the abstract to add the missing details on parameter counts, layer structures, and confirmation of equivalent variational circuits, thereby strengthening the attribution of the observed gap to CV versus DV properties rather than implementation differences.

Referee: [Abstract] The central claim that the observed 18-point gap (and Edge-Loc recall difference) reflects 'intrinsic properties' of the CV paradigm requires that the CV and DV heads be matched in variational parameter count, layer depth, and optimization protocol. The asymmetric description of the CV head as incorporating a 'structured, neural-network-analogue layer' without a parallel specification for the DV head leaves open the possibility that the gap arises from uncontrolled implementation differences rather than the paradigm itself.

Authors: We agree that the abstract must explicitly demonstrate matching to support the claim. The manuscript states that the heads are interchangeable with the quantum paradigm as the sole structural difference and that both use identical optimization protocols and scaling sizes. However, the abstract does not report the per-head trainable parameter counts or provide a parallel description of the DV circuit. In revision we will add these specifications (including the number of variational parameters for each head at each size and confirmation that the DV head employs an equivalent structured variational ansatz at the d=2 cutoff) so that the isolation of the paradigm is verifiable from the abstract alone.
Revision: yes

Referee: [Abstract (quantum-head description)] The experimental design (shared backbone, 'interchangeable' heads) is load-bearing for attributing any gap to CV versus DV. Without reported details on the number of trainable parameters per head or explicit confirmation that the DV head receives an equivalent structured variational circuit at d=2, the isolation of the paradigm cannot be verified from the provided information.

Authors: We concur that the load-bearing claim of interchangeability requires these details to be stated. The full manuscript describes the heads as scaled identically (3/4/8) with the backbone fixed, but the abstract omits the concrete parameter counts and the explicit statement that the DV head is given a matching structured variational circuit at d=2. We will revise the abstract to include both the per-head parameter counts and the confirmation of equivalent DV circuit structure, ensuring the experimental design can be verified directly from the abstract.
Revision: yes

Circularity Check

0 steps flagged

No circularity: empirical accuracies from controlled training on public dataset

The paper reports measured classification accuracies from training neural networks (classical and quantum) on the WM-811K dataset. The architecture uses a fixed ~4.3M-parameter convolutional backbone with interchangeable heads; reported figures (e.g., 79.7% CV vs 61.6% DV at size 4) are direct experimental outcomes, not quantities derived from equations that reduce to the inputs by construction. No self-definitional relations, fitted parameters renamed as predictions, or load-bearing self-citations appear in the provided text. The comparison is presented as an empirical isolation experiment rather than a first-principles derivation.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central claim rests on the empirical performance numbers obtained under the shared-backbone design; the only explicit modeling choices are the Fock cutoff and the assumption that the backbone isolates the quantum variable.

free parameters (1)

Fock cutoff d = 2
Set to 2 for the CV head to enable direct comparison with DV at equivalent Hilbert-space dimension per mode.
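The equal-dimension statement can be verified directly. The sketch assumes the usual simulator convention in which a cutoff d keeps the Fock states |0> through |d-1> of each mode, so n truncated modes span d^n dimensions while n qubits span 2^n; the function names are illustrative.

```python
def truncated_cv_dim(n_modes: int, cutoff: int) -> int:
    """Dimension of n modes, each truncated to `cutoff` Fock states."""
    return cutoff ** n_modes


def qubit_dim(n_qubits: int) -> int:
    """Dimension of the state space of n qubits."""
    return 2 ** n_qubits


sizes = (3, 4, 8)  # head sizes reported in the abstract
table = {n: (truncated_cv_dim(n, 2), qubit_dim(n)) for n in sizes}

for n, (cv_dim, dv_dim) in table.items():
    print(f"n={n}: CV (d=2) dimension {cv_dim}, DV dimension {dv_dim}")
```

At d = 2 the two columns agree at every size (8, 16, and 256), which is why the abstract can rule out Hilbert-space dimensionality as the source of the gap; at any larger cutoff the CV dimension would exceed the DV one.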

axioms (1)

Domain assumption: the shared convolutional backbone isolates the quantum circuit as the sole variable
Explicitly stated as the design choice that allows attribution of the accuracy gap to the choice of quantum paradigm.

pith-pipeline@v0.9.1-grok · 5889 in / 1438 out tokens · 28280 ms · 2026-07-02T12:04:31.120819+00:00 · methodology
