Parametric Operator Inference to Simulate the Purging Process in Semiconductor Manufacturing

Boris Kramer; Hyeonghun Kim; Seunghyon Kang

arXiv: 2504.03990 · v3 · pith:FV5VZH77new · submitted 2025-04-04 · 🧮 math.NA · cs.NA · physics.comp-ph

Pith reviewed 2026-05-22 20:45 UTC · model grok-4.3

classification 🧮 math.NA · cs.NA · physics.comp-ph

keywords reduced-order modeling · operator inference · parametric modeling · semiconductor manufacturing · purging process · computational fluid dynamics · PECVD

The pith

Parametric operator inference creates reduced-order models that predict purging flow fields in semiconductor chambers across 25 parameter combinations with a maximum error of 9.32%.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper applies parametric Operator Inference to learn reduced-order models from CFD data for the argon purging process inside a PECVD chamber. It trains nine separate ROMs on varying inlet mass flow rates and outlet pressures, then uses linear interpolation to obtain predictions at 25 total parameter points including 16 combinations absent from the training set. The resulting models are evaluated on 64% of the data and shown to reproduce the key flow features while delivering a 142-fold reduction in online compute time relative to the full CFD solver. A reader would care because such models could support rapid evaluation of contamination-control strategies during semiconductor manufacturing without repeated expensive simulations.

Core claim

The parametric OpInf framework learns nine ROMs from CFD snapshot data for different argon mass flow rates at the inlet and different outlet pressures. It then interpolates these ROMs to forecast the flow field for 25 parameter combinations, 16 of which are unseen during training. Trained on only 36% of the available data, the interpolated models reproduce the purging behavior across the full parameter domain with a maximum error of 9.32% and run approximately 142 times faster than the original CFD simulation.
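
The 36%/64% split follows from the counts of training and unseen parameter points. A minimal sketch of that arithmetic, assuming (hypothetically) that the nine training points form a 3×3 sub-grid of a 5×5 parameter grid; the actual layout is the one in the paper's Figure 5 and is not reproduced here:

```python
from itertools import product

# Hypothetical index layout: a 5x5 grid of (flow rate, pressure) indices,
# with every other index in each direction used for training.
grid = list(product(range(5), range(5)))
train = [(i, j) for (i, j) in grid if i % 2 == 0 and j % 2 == 0]
test = [p for p in grid if p not in train]

train_fraction = len(train) / len(grid)   # 9 / 25
test_fraction = len(test) / len(grid)     # 16 / 25
print(len(train), len(test), train_fraction, test_fraction)
```

Whatever the true layout, 9 of 25 points gives the 36% training share and the remaining 16 give the 64% test share.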

What carries the argument

Parametric Operator Inference (OpInf), a non-intrusive data-driven method that learns low-dimensional dynamical models from simulation snapshots and interpolates the learned operators across parameter values.
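
To make the "learn operators from snapshots" step concrete, here is a minimal sketch of a single-parameter Operator Inference fit. It assumes a purely linear reduced model and exact time derivatives, and it uses synthetic data in place of CFD snapshots; the paper's actual model form, regularization, and derivative estimation are not stated in this review, so every name and value below is illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for CFD snapshots: a rank-4 linear system (illustrative only).
n, r = 50, 4
t = np.linspace(0.0, 2.0, 400)
lam = np.array([-1.0, -2.0, -3.0, -4.0])              # true reduced eigenvalues
V_true, _ = np.linalg.qr(rng.standard_normal((n, r)))  # hidden low-dimensional subspace
q0 = rng.uniform(1.0, 2.0, size=r)
Qhat_true = q0[:, None] * np.exp(lam[:, None] * t[None, :])  # shape (r, k)
Q = V_true @ Qhat_true                                 # snapshot matrix, shape (n, k)
Q_dot = V_true @ (lam[:, None] * Qhat_true)            # exact time derivatives

# Step 1: POD basis from the leading left singular vectors of the snapshots.
U, s, _ = np.linalg.svd(Q, full_matrices=False)
V = U[:, :r]

# Step 2: project snapshots and derivatives onto the basis.
S = V.T @ Q
S_dot = V.T @ Q_dot

# Step 3: least-squares fit  min_O || D O^T - S_dot^T ||_F  with data matrix D = S^T.
D = S.T
O_T, *_ = np.linalg.lstsq(D, S_dot.T, rcond=None)
A_hat = O_T.T                                          # learned reduced operator

learned_eigs = np.sort(np.linalg.eigvals(A_hat).real)
print(learned_eigs)
```

The fit never touches the full-order operators, which is what makes the method non-intrusive. In the parametric setting described above, one such fit would be repeated for each of the nine training parameter combinations.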

Load-bearing premise

The simplified CFD model that omits plasma dynamics and chemical reactions still captures the essential purging flow features, and linear interpolation among the nine learned ROMs accurately represents the flow at the 16 unseen parameter points.

What would settle it

Compute a full-order CFD solution at one of the 16 unseen parameter combinations and measure whether the L2 or pointwise error between that solution and the interpolated OpInf ROM exceeds 9.32%.
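
The check described above can be sketched as follows. The norm behind the 9.32% figure is not specified in this review, so the relative Frobenius (space-time L2) error used here is an assumption, and the arrays are toy data rather than CFD output.

```python
import numpy as np

MAX_REPORTED_ERROR = 0.0932  # the 9.32% maximum error quoted in the abstract

def relative_l2_error(fom: np.ndarray, rom: np.ndarray) -> float:
    """Relative Frobenius-norm error over all space-time entries (assumed metric)."""
    return float(np.linalg.norm(fom - rom) / np.linalg.norm(fom))

# Toy data: rows are spatial degrees of freedom, columns are time steps.
fom = np.ones((10, 5))
rom = 1.05 * fom                      # a uniform 5% overprediction

err = relative_l2_error(fom, rom)
exceeds = err > MAX_REPORTED_ERROR
print(err, exceeds)
```

A new full-order solution at an unseen parameter point would contradict the reported bound only if `exceeds` came out true under the same error definition the authors used.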

Figures

Figures reproduced from arXiv: 2504.03990 by Boris Kramer, Hyeonghun Kim, Seunghyon Kang.

**Figure 2.** (a) Computational domain of a simplified PECVD chamber. B.C.1: mass flow inlet boundary

**Figure 3.** Left: velocity field, Right: streamlines at the onset of the purging process, initiated upon …

**Figure 4.** (a) Normalized singular value decay. (b) The cumulative energy

**Figure 5.** (a) Datasets used for training and testing. Of the 25 datasets, nine (36%) are used for …

**Figure 6.** Contour plots of the temperature in the vertical cross-section for the case with the largest …

**Figure 7.** Contour plots of the y-velocity $v_y$ in the vertical cross-section for the case with the largest error, (µp, µq) = (0.56, 0.95). Each column represents t = 0 s, 0.5 s, and 1 s. (a) FOM, (b) ROM predictions, (c) Pointwise normalized absolute error. We compare the y-velocity and radial velocity flow fields at the monitor location for the case with the largest error, (µp, µq) = (0.56, 0.95), in Figures 10 and 11. …

**Figure 8.** Contour plots of the radial velocity $v_r = \sqrt{v_x^2 + v_z^2}$ in the vertical cross-section for the case with the largest error, (µp, µq) = (0.56, 0.95). Each column represents t = 0 s, 0.5 s, and 1 s. (a) FOM, (b) ROM predictions, (c) Pointwise normalized absolute error

**Figure 9.** Monitor location: a quarter circular xz-plane with a 150 mm radius, positioned 1 mm above the wafer surface.

**Figure 10.** Contour plots of the y-velocity at the monitor location for the case with the largest error, …

**Figure 11.** Contour plots of the radial velocity at the monitor location for the case with the largest …
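
Two quantities recur in these captions: the radial velocity, read here as $v_r = \sqrt{v_x^2 + v_z^2}$ from the Figure 8 caption, and a pointwise normalized absolute error. A minimal sketch of both follows; the normalization (dividing by the largest FOM magnitude) is an assumption, since the captions do not define it, and the numbers are toy values.

```python
import numpy as np

def radial_velocity(vx: np.ndarray, vz: np.ndarray) -> np.ndarray:
    """In-plane speed in the xz-plane, v_r = sqrt(vx^2 + vz^2)."""
    return np.sqrt(vx**2 + vz**2)

def pointwise_normalized_abs_error(fom: np.ndarray, rom: np.ndarray) -> np.ndarray:
    """|fom - rom| scaled by the largest FOM magnitude (assumed normalization)."""
    return np.abs(fom - rom) / np.max(np.abs(fom))

# Toy values at two grid points.
vr_fom = radial_velocity(np.array([3.0, 0.0]), np.array([4.0, 1.0]))  # [5.0, 1.0]
vr_rom = np.array([4.5, 1.0])
err_map = pointwise_normalized_abs_error(vr_fom, vr_rom)
print(vr_fom, err_map)
```

Normalizing by a single global scale avoids division by near-zero local velocities, which would otherwise dominate an error contour plot.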

Original abstract

This work presents the application of parametric Operator Inference (OpInf), a nonintrusive reduced-order modeling (ROM) technique that learns a low-dimensional representation of a high-fidelity model, to the numerical model of the purging process in semiconductor manufacturing. Leveraging the data-driven nature of the OpInf framework, we aim to forecast the flow field within a plasma-enhanced chemical vapor deposition (PECVD) chamber using computational fluid dynamics (CFD) simulation data. Our model simplifies the system by excluding plasma dynamics and chemical reactions, while still capturing the key features of the purging flow behavior. The parametric OpInf framework learns nine ROMs based on varying argon mass flow rates at the inlet and different outlet pressures. It then interpolates these ROMs to predict the system's behavior for 25 parameter combinations, including 16 scenarios that are not seen in training. The parametric OpInf ROMs, trained on 36% of the data and tested on 64%, demonstrate accuracy across the entire parameter domain, with a maximum error of 9.32%. Furthermore, the ROM achieves an approximate 142-fold speedup in online computations compared to the full-order model CFD simulation. These OpInf ROMs may be used for fast and accurate predictions of the purging flow in the PECVD chamber, which could facilitate effective particle contamination control in semiconductor manufacturing.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript applies parametric Operator Inference (OpInf) to create reduced-order models of the purging flow in a PECVD chamber for semiconductor manufacturing. Nine ROMs are learned from CFD data at selected combinations of argon inlet mass flow rate and outlet pressure; these ROMs are then interpolated to predict the flow field at 25 parameter points (16 of which are unseen in training). The underlying CFD model omits plasma dynamics and chemical reactions. Reported results include a maximum error of 9.32% on the 64% test portion of the data and an online speedup of approximately 142 times relative to the full-order CFD simulation.

Significance. If the accuracy claims hold under the stated modeling assumptions, the work supplies a practical, nonintrusive surrogate that could support rapid evaluation of purging scenarios for contamination control. The parametric OpInf construction and the reported computational speedup constitute a concrete engineering application of data-driven ROM techniques.

major comments (2)

[Abstract] The central claim that linear interpolation of the nine learned reduced operators produces accurate predictions at the 16 unseen parameter combinations (with max error 9.32%) is load-bearing. No evidence is supplied that the map from (argon flow rate, outlet pressure) to the reduced operators is sufficiently smooth for linear interpolation, nor is cross-validation on a denser parameter grid or an alternative interpolation scheme reported.
[Abstract] The modeling assumption that excluding plasma dynamics and chemical reactions leaves the dominant purging flow features intact for contamination-control purposes is not accompanied by a quantitative sensitivity study or comparison against a model that retains those effects.

minor comments (2)

The abstract states a 36%/64% training/testing split but does not specify how the split was performed across the 25-point parameter grid or which norm is used to compute the 9.32% maximum error.
Clarify whether the interpolation is performed on the reduced operators themselves or on their coefficients, and state the precise interpolation procedure (e.g., linear in parameter space).
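
The distinction raised in the last comment can be made concrete with a small sketch. It assumes that interpolation acts entrywise on the learned reduced operators and that it is piecewise linear over the two-dimensional parameter space, using SciPy's `LinearNDInterpolator` (a name that appears in the paper passage quoted further down this page). The parameter grid, operator size, and toy operators are hypothetical.

```python
import numpy as np
from scipy.interpolate import LinearNDInterpolator

r = 3
rng = np.random.default_rng(1)
A0, A1, A2 = (rng.standard_normal((r, r)) for _ in range(3))

def operator_at(mu_p: float, mu_q: float) -> np.ndarray:
    """Toy reduced operator that depends affinely on the two parameters."""
    return A0 + mu_p * A1 + mu_q * A2

# Nine hypothetical training points on a normalized 3x3 grid.
levels = [0.0, 0.5, 1.0]
train_points = np.array([(p, q) for p in levels for q in levels])    # shape (9, 2)
train_ops = np.stack([operator_at(p, q) for p, q in train_points])  # shape (9, r, r)

# Entrywise piecewise-linear interpolation of the operators over parameter space.
interp = LinearNDInterpolator(train_points, train_ops)

mu_new = np.array([[0.2, 0.9]])   # an unseen parameter combination
A_interp = interp(mu_new)[0]      # interpolated reduced operator, shape (r, r)
print(np.max(np.abs(A_interp - operator_at(0.2, 0.9))))
```

Because the toy operators are affine in the parameters, linear interpolation reproduces them to rounding error. Real learned operators need not vary this smoothly, which is the concern behind the first major comment.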

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. We address each major comment below with the strongest honest defense of the manuscript while acknowledging where revisions can strengthen the presentation.

Referee: [Abstract] The central claim that linear interpolation of the nine learned reduced operators produces accurate predictions at the 16 unseen parameter combinations (with max error 9.32%) is load-bearing. No evidence is supplied that the map from (argon flow rate, outlet pressure) to the reduced operators is sufficiently smooth for linear interpolation, nor is cross-validation on a denser parameter grid or an alternative interpolation scheme reported.

Authors: The reported maximum error of 9.32% on the 16 unseen test points constitutes empirical evidence that linear interpolation is effective for this parameter regime and sampling density. The manuscript does not supply a theoretical smoothness analysis or results from a denser grid or alternative schemes, as generating additional high-fidelity CFD data is computationally expensive. We will revise the methods and results sections to explicitly discuss the parameter sampling strategy and to present the observed interpolation accuracy as practical validation of the approach.
Revision: partial

Referee: [Abstract] The modeling assumption that excluding plasma dynamics and chemical reactions leaves the dominant purging flow features intact for contamination-control purposes is not accompanied by a quantitative sensitivity study or comparison against a model that retains those effects.

Authors: The manuscript states the modeling simplification explicitly and justifies it on the basis that the purging flow for particle contamination control is dominated by the inert-gas transport captured in the CFD model. No quantitative sensitivity study against a plasma-inclusive multiphysics model is provided, as constructing and validating such a model lies outside the scope of the present work on parametric OpInf for flow data. We will add a concise limitations paragraph in the introduction or conclusions to clarify this assumption and its implications.
Revision: yes

Circularity Check

0 steps flagged

No significant circularity: data-driven parametric OpInf from external CFD simulations

The paper trains parametric OpInf reduced-order models on CFD data at 9 parameter combinations (36% of data) and interpolates the learned operators to predict at 25 total points including 16 unseen ones (64% held-out test data). This is a standard non-intrusive data-driven workflow with explicit validation against full-order simulations; no derivation step reduces by construction to its own fitted inputs, no load-bearing self-citation chains, and no ansatz or uniqueness claim that loops back to the target result. The reported 9.32% max error and 142x speedup are empirical outcomes on external data rather than tautological predictions.

Axiom & Free-Parameter Ledger

2 free parameters · 2 axioms · 0 invented entities

Review performed on abstract only; ledger entries are inferred from the high-level description of the OpInf workflow and the stated modeling simplifications.

free parameters (2)

ROM dimension / rank
OpInf requires choosing the reduced dimension; abstract does not specify how it was selected or whether it was fitted.
Interpolation weights or kernel parameters
Parametric interpolation among the nine base ROMs requires additional choices whose values are not reported.
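
One common way to choose the reduced dimension is a cumulative-energy criterion on the snapshot singular values, the quantity plotted in Figure 4. The sketch below assumes that criterion with a hypothetical threshold; the abstract does not say how the authors selected the rank.

```python
import numpy as np

def rank_for_energy(singular_values: np.ndarray, threshold: float = 0.999):
    """Smallest r whose leading singular values capture `threshold` of the energy."""
    energy = np.cumsum(singular_values**2) / np.sum(singular_values**2)
    r = int(np.searchsorted(energy, threshold) + 1)
    return r, energy

# Toy singular values with fast decay (not taken from the paper).
s = np.array([10.0, 5.0, 1.0, 0.1])
r_999, energy = rank_for_energy(s, threshold=0.999)
r_99, _ = rank_for_energy(s, threshold=0.99)
print(r_999, r_99, energy)
```

A stricter threshold retains more modes, which trades online speed for accuracy; this is why the rank counts as a free parameter in the ledger.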

axioms (2)

domain assumption: The flow dynamics admit a low-dimensional linear operator representation learnable from snapshot data
Core premise of the Operator Inference framework invoked when the abstract states that nine ROMs are learned from CFD data.
ad hoc to paper: Excluding plasma and chemical reactions does not alter the dominant flow features relevant to purging
Explicit modeling choice stated in the abstract that underpins the entire reduced model.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear

Relation between the paper passage and the cited Recognition theorem: unclear.

The parametric OpInf framework learns nine ROMs … It then interpolates these ROMs … min … $D_\ell \widehat{O}_\ell^\top - \dot{\widehat{S}}_\ell^\top$ … LinearNDInterpolator

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.