pith. machine review for the scientific record.

arxiv: 2604.14579 · v2 · submitted 2026-04-16 · 📊 stat.ME · math.ST · stat.TH


HASOD: A Hybrid Adaptive Screening-Optimization Design for High-Dimensional Industrial Experiments


Pith reviewed 2026-05-10 10:57 UTC · model grok-4.3

classification 📊 stat.ME · math.ST · stat.TH
keywords HASOD · factor screening · response optimization · high-dimensional experiments · CWESS · definitive screening design · industrial experimentation · Gaussian process optimization

The pith

HASOD unifies factor screening and response optimization into one adaptive three-phase framework for high-dimensional industrial experiments.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents HASOD as a sequential design that identifies critical factors and optimizes the response surface without requiring a full redesign between those tasks. It begins with a modified definitive screening design paired with a new CWESS statistic that uses ElasticNet regression to flag interactions, then adaptively augments the design according to the factors found, and finishes with Gaussian-process global optimization guided by uncertainty. The authors prove that CWESS asymptotically distinguishes active from inactive factors with classification consistency. Simulations across six scenarios show that the approach maintains high detection rates and achieves lower prediction error than traditional separate-phase methods and eight competing techniques.

Core claim

The central claim is that a single adaptive structure can perform both screening and optimization for high-dimensional industrial experiments. Phase 1 employs a modified definitive screening design with the Cumulative Weighted Effect Screening Statistic that incorporates ElasticNet regression for interaction detection. Phase 2 selects augmentation strategies ranging from full factorial to response-surface designs on the basis of the factors identified. Phase 3 applies Gaussian-process optimization with uncertainty-guided refinement. The proof establishes that CWESS separates active and inactive factors asymptotically, and the empirical results show improved detection accuracy and prediction performance.
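The Phase 3 step can be sketched with a minimal Gaussian-process surrogate. The paper's exact kernel, acquisition rule, and hyperparameters are not given in this summary, so the RBF kernel, upper-confidence-bound pick, and the names (`gp_posterior`, `next_point`, `ls`, `kappa`) below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def gp_posterior(X, y, Xstar, ls=0.5, noise=1e-6):
    """GP posterior mean and std with an RBF kernel -- a minimal
    stand-in for a Phase-3 surrogate (kernel choice is assumed)."""
    def k(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-0.5 * d2 / ls ** 2)
    K = k(X, X) + noise * np.eye(len(X))   # jitter for stability
    Ks = k(Xstar, X)
    mu = Ks @ np.linalg.solve(K, y)
    v = np.linalg.solve(K, Ks.T)
    var = 1.0 - np.einsum('ij,ji->i', Ks, v)  # prior variance is 1
    return mu, np.sqrt(np.clip(var, 0.0, None))

def next_point(X, y, candidates, kappa=2.0):
    """Uncertainty-guided refinement: pick the candidate maximizing
    an upper confidence bound (one common acquisition rule)."""
    mu, sd = gp_posterior(X, y, candidates)
    return candidates[np.argmax(mu + kappa * sd)]
```

In this sketch, `kappa` trades off exploiting the predicted optimum against exploring high-uncertainty regions; the design described in the paper would iterate this pick, run the experiment, and refold the result into the training set.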

What carries the argument

The Cumulative Weighted Effect Screening Statistic (CWESS), which weights factor effects and incorporates ElasticNet regression to detect interactions while providing asymptotic separation of active from inactive factors.
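The paper ships no code, so the following is only a minimal sketch of what an ElasticNet-based screening score of this kind could look like: a coordinate-descent elastic net over main effects plus all two-factor interactions, with each factor's score accumulating its own coefficient and half of each interaction coefficient it participates in. The weighting rule, the penalty values (`l1`, `l2`), and the name `screening_scores` are assumptions standing in for CWESS, not the paper's actual definition.

```python
import numpy as np

def screening_scores(X, y, l1=0.1, l2=0.1, n_iter=500):
    """Hypothetical CWESS-style score: elastic net on standardized
    main effects and pairwise interactions, then a per-factor
    cumulative weighted sum of absolute coefficients."""
    n, k = X.shape
    cols = [X[:, j] for j in range(k)]
    labels = [(j,) for j in range(k)]
    for i in range(k):                      # all two-factor interactions
        for j in range(i + 1, k):
            cols.append(X[:, i] * X[:, j])
            labels.append((i, j))
    Z = np.column_stack(cols)
    Z = (Z - Z.mean(0)) / (Z.std(0) + 1e-12)  # unit-variance columns
    yc = y - y.mean()
    beta = np.zeros(Z.shape[1])
    for _ in range(n_iter):                 # cyclic coordinate descent
        for j in range(len(beta)):
            r = yc - Z @ beta + Z[:, j] * beta[j]
            rho = Z[:, j] @ r / n
            # soft-threshold (L1), then shrink (L2): elastic net update
            beta[j] = np.sign(rho) * max(abs(rho) - l1, 0.0) / (1.0 + l2)
    score = np.zeros(k)
    for b, lab in zip(beta, labels):        # illustrative weighting:
        for f in lab:                       # interactions split evenly
            score[f] += abs(b) / len(lab)
    return score
```

Thresholding such a score separates active from inactive factors; the paper's contribution is proving that its version of the statistic does so consistently as the run budget grows.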

If this is right

  • Eliminates the need for separate experimental redesign between the screening and optimization stages.
  • Maintains at least 90 percent factor detection accuracy even in interaction-heavy systems.
  • Delivers lower mean prediction error than traditional sequential methods while using only a moderate increase in total runs.
  • Supplies classification-consistency guarantees for the screening step that most existing methods lack.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The adaptive augmentation rule in phase two could be extended to include explicit cost or time constraints on run selection.
  • Testing the method on actual production data sets would reveal whether the simulated accuracy gains translate to messy real-world noise.
  • The Gaussian-process final stage could be swapped for other surrogate models suited to different response surfaces.
  • The asymptotic separation result for CWESS may motivate similar consistency proofs for other screening statistics used in sequential designs.

Load-bearing premise

The six simulated test scenarios represent the interaction patterns and sample sizes typical of real high-dimensional industrial experiments, allowing the asymptotic consistency of CWESS to hold in practice.

What would settle it

A real high-dimensional industrial dataset in which HASOD detects fewer than 90 percent of the known active factors or produces higher prediction error than a conventional sequential screening-then-optimization procedure.

Figures

Figures reproduced from arXiv: 2604.14579 by Kumarjit Pathak.

Figure 1
Figure 1: Comparison of different DOE methods including center points, axial points, and factorial corners, supporting quadratic modeling, interaction estimation, and precise optimization [2]. However, RSM fundamentally assumes that critical factors have already been identified. The moment we apply RSM to high-dimensional spaces with numerous potentially negligible factors, the experimental requirements become prohi…
Figure 2
Figure 2: The HASOD framework workflow showing the three-phase sequential strategy with decision points and adaptive design selection. Phase 1 performs…
Figure 3
Figure 3: HASOD vs competitors on detection accuracy by scenario.
Figure 4
Figure 4: HASOD detection accuracy distribution vs others.
Figure 5
Figure 5: HASOD comprehensive benchmark result: performance comparison across all scenarios and methods showing detection accuracy, prediction error,…
Original abstract

Industrial experimentation requires both factor screening to identify critical variables and response optimization to find optimal operating conditions. Traditional approaches treat these as separate phases, necessitating costly sequential experimentation and full experimental redesign between phases. This paper introduces HASOD (Hybrid Adaptive Screening-Optimization Design), a novel three-phase sequential framework that simultaneously addresses factor identification and response surface optimization within a unified adaptive structure. Phase 1 employs a modified Definitive Screening Design with an enhanced Cumulative Weighted Effect Screening Statistic (CWESS) incorporating interaction detection via ElasticNet regression. Phase 2 adaptively selects augmentation strategies -- from full factorial to Response Surface Methodology designs -- based on critical factors identified in Phase 1. Phase 3 applies Gaussian process-based global optimization with uncertainty-guided refinement near the predicted optimum. We prove that CWESS asymptotically separates active from inactive factors, providing classification consistency guarantees absent from most screening methodologies. Across six test scenarios, HASOD achieves 97.08% factor detection accuracy -- 13.75 percentage points above traditional sequential methods (83.33%) -- and significantly outperforms all eight competitor methods (p < 0.001). HASOD yields improved prediction performance (mean error: 3.61) while maintaining >=90% detection across all scenarios including interaction-heavy systems. The framework requires an average of 41.5 experimental runs -- a 43% increase over traditional approaches -- yet delivers superior detection accuracy with dramatically reduced prediction error. HASOD offers a theoretically grounded, unified framework that eliminates sequential redesign without sacrificing predictive capability.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper proposes HASOD, a three-phase sequential framework for high-dimensional industrial experiments that combines factor screening (modified DSD with CWESS statistic augmented by ElasticNet regression for interactions in Phase 1), adaptive design augmentation based on identified factors (Phase 2), and Gaussian process global optimization with uncertainty refinement (Phase 3). It claims to prove that CWESS asymptotically separates active from inactive factors with classification consistency, and reports simulation results across six scenarios showing 97.08% factor detection accuracy (13.75 pp above traditional sequential methods), superiority over eight competitors (p<0.001), mean prediction error of 3.61, and an average of 41.5 runs.

Significance. If the asymptotic consistency result for CWESS holds under conditions matching the simulation regimes and the scenarios adequately represent real high-dimensional industrial systems with interactions, the work would offer a theoretically grounded unification of screening and optimization that reduces redesign overhead. The explicit attempt at a consistency proof and the multi-method comparisons are positive features; however, the absence of shipped code or full experimental details limits immediate reproducibility and external validation.

major comments (2)
  1. [Theoretical analysis] Theoretical analysis section: the claimed proof that CWESS asymptotically separates active from inactive factors with classification consistency is load-bearing for the headline performance claims, yet the manuscript provides no derivation, no explicit regularity conditions (e.g., p = o(n), restricted eigenvalue or irrepresentable condition on the design matrix), and no discussion of how ElasticNet interaction detection interacts with those conditions; this gap prevents assessment of whether the finite-sample results (average 41.5 runs, interaction-heavy scenarios) are covered by the asymptotics.
  2. [Simulation study] Simulation study section: the six custom test scenarios are the sole basis for the 97.08% accuracy, p<0.001 superiority, and cross-scenario >=90% detection claims, but no data-exclusion rules, full design matrices, or reproducible code are supplied; without these, it is impossible to verify whether the reported numbers are robust or whether the scenarios satisfy the conditions needed for the CWESS consistency result to underwrite the finite-sample behavior.
minor comments (2)
  1. [Methods] Notation for CWESS and the three phases is introduced without a consolidated table of symbols or explicit pseudocode for the adaptive augmentation rule in Phase 2.
  2. [Results] The abstract and results text report mean prediction error of 3.61 but do not specify the error metric (RMSE, MAE, etc.) or the hold-out procedure used.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive and detailed comments on our manuscript. We address each major comment point by point below, outlining the revisions we will undertake to strengthen the paper.

Point-by-point responses
  1. Referee: [Theoretical analysis] Theoretical analysis section: the claimed proof that CWESS asymptotically separates active from inactive factors with classification consistency is load-bearing for the headline performance claims, yet the manuscript provides no derivation, no explicit regularity conditions (e.g., p = o(n), restricted eigenvalue or irrepresentable condition on the design matrix), and no discussion of how ElasticNet interaction detection interacts with those conditions; this gap prevents assessment of whether the finite-sample results (average 41.5 runs, interaction-heavy scenarios) are covered by the asymptotics.

    Authors: We acknowledge that the current manuscript does not provide a full derivation of the asymptotic classification consistency for CWESS or the complete set of regularity conditions. In the revised version, we will expand the theoretical analysis section to include the complete proof, explicitly stating the regularity conditions (including p = o(n), restricted eigenvalue conditions, and the irrepresentable condition on the design matrix) and detailing how the ElasticNet component for interaction detection is compatible with these conditions. This addition will directly address whether the reported finite-sample performance (including the average 41.5 runs and interaction-heavy scenarios) falls within the scope of the asymptotic guarantees. revision: yes

  2. Referee: [Simulation study] Simulation study section: the six custom test scenarios are the sole basis for the 97.08% accuracy, p<0.001 superiority, and cross-scenario >=90% detection claims, but no data-exclusion rules, full design matrices, or reproducible code are supplied; without these, it is impossible to verify whether the reported numbers are robust or whether the scenarios satisfy the conditions needed for the CWESS consistency result to underwrite the finite-sample behavior.

    Authors: We agree that the absence of these details limits independent verification. In the revision, we will include the data-exclusion rules, the complete design matrices for all six scenarios, and deposit the full simulation code in a public repository (with a permanent link in the manuscript). This will allow direct reproduction of the 97.08% accuracy figures and confirmation that the scenarios satisfy the regularity conditions underlying the CWESS consistency result. revision: yes

Circularity Check

0 steps flagged

No significant circularity in HASOD derivation chain

full rationale

The paper claims an independent asymptotic proof that CWESS separates active from inactive factors with classification consistency, then reports finite-sample performance on six defined test scenarios. No load-bearing step reduces the proof, the CWESS definition, or the reported accuracy gains to a self-definition, fitted-input prediction, or self-citation chain by construction. The simulations test the framework but do not constitute or presuppose the theoretical result; the structure remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

2 free parameters · 2 axioms · 2 invented entities

The central claims rest on an unshown asymptotic proof for CWESS, simulation-based performance, and standard assumptions from screening design and Gaussian process literature; several components are introduced without independent external validation.

free parameters (2)
  • ElasticNet regularization parameters
    Used within CWESS for interaction detection; values not specified in abstract.
  • Gaussian process kernel hyperparameters
    Control uncertainty-guided refinement in Phase 3; fitted or chosen per scenario.
axioms (2)
  • domain assumption CWESS provides asymptotic classification consistency for active vs inactive factors
    Invoked as the theoretical foundation but proof details absent from abstract.
  • ad hoc to paper The six test scenarios adequately represent high-dimensional industrial systems including interactions
    Basis for all reported accuracy and comparison results.
invented entities (2)
  • CWESS (Cumulative Weighted Effect Screening Statistic) no independent evidence
    purpose: Enhanced screening statistic incorporating interaction detection via ElasticNet
    Newly defined component central to Phase 1; no independent evidence provided beyond the paper's simulations.
  • HASOD three-phase adaptive framework no independent evidence
    purpose: Unified structure eliminating separate screening and optimization phases
    Core contribution; performance tied to internal test scenarios.

pith-pipeline@v0.9.0 · 5574 in / 1629 out tokens · 39066 ms · 2026-05-10T10:57:47.695647+00:00 · methodology


Reference graph

Works this paper leans on

25 extracted references · 1 canonical work page · 1 internal anchor

  1. [1] D. C. Montgomery, Design and Analysis of Experiments, 9th ed. John Wiley & Sons, 2017.

  2. [2] R. H. Myers, D. C. Montgomery, and C. M. Anderson-Cook, Response Surface Methodology: Process and Product Optimization Using Designed Experiments, 4th ed. John Wiley & Sons, 2016.

  3. [3] D. E. Coleman and D. C. Montgomery, "A systematic approach to planning for a designed industrial experiment," Technometrics, vol. 35, no. 1, pp. 1–12, 1993.

  4. [4] R. L. Plackett and J. P. Burman, "The design of optimum multifactorial experiments," Biometrika, vol. 33, no. 4, pp. 305–325, 1946.

  5. [5] G. E. Box and J. S. Hunter, "The 2^(k−p) fractional factorial designs," Technometrics, vol. 3, no. 3, pp. 311–351, 1961.

  6. [6] C. F. J. Wu and M. S. Hamada, Experiments: Planning, Analysis, and Optimization, 3rd ed. John Wiley & Sons, 2021.

  7. [7] G. E. Box and K. B. Wilson, "On the experimental attainment of optimum conditions," Journal of the Royal Statistical Society Series B, vol. 13, no. 1, pp. 1–45, 1951.

  8. [8] G. E. Box and D. W. Behnken, "Some new three level designs for the study of quantitative variables," Technometrics, vol. 2, no. 4, pp. 455–475, 1960.

  9. [9] M. D. McKay, R. J. Beckman, and W. J. Conover, "A comparison of three methods for selecting values of input variables in the analysis of output from a computer code," Technometrics, vol. 21, no. 2, pp. 239–245, 1979.

  10. [10] I. M. Sobol, "On the distribution of points in a cube and the approximate evaluation of integrals," USSR Computational Mathematics and Mathematical Physics, vol. 7, no. 4, pp. 86–112, 1967.

  11. [11] T. J. Santner, B. J. Williams, and W. I. Notz, The Design and Analysis of Computer Experiments, 2nd ed. Springer, 2018.

  12. [12] B. Shahriari, K. Swersky, Z. Wang, R. P. Adams, and N. De Freitas, "Taking the human out of the loop: A review of Bayesian optimization," Proceedings of the IEEE, vol. 104, no. 1, pp. 148–175, 2016.

  13. [13] P. I. Frazier, "A tutorial on Bayesian optimization," arXiv preprint arXiv:1807.02811, 2018.

  14. [14] C. E. Rasmussen and C. K. Williams, Gaussian Processes for Machine Learning. MIT Press, 2006.

  15. [15] A. Dean, M. Morris, J. Stufken, and D. Bingham, Handbook of Design and Analysis of Experiments. Chapman and Hall/CRC, 2015.

  16. [16] H. Zou and T. Hastie, "Regularization and variable selection via the elastic net," Journal of the Royal Statistical Society Series B, vol. 67, no. 2, pp. 301–320, 2005.

  17. [17] C.-S. Cheng and B. Tang, "A general theory of minimum aberration and its applications," The Annals of Statistics, vol. 33, no. 2, pp. 944–958, 2005.

  18. [18] B. Jones and C. J. Nachtsheim, "A class of three-level designs for definitive screening in the presence of second-order effects," Journal of Quality Technology, vol. 43, no. 1, pp. 1–15, 2011.

  19. [19] B. Jones and C. J. Nachtsheim, "Blocking schemes for definitive screening designs," Technometrics, vol. 59, no. 1, pp. 74–83, 2017.

  20. [20] V. R. Joseph, "Space-filling designs for computer experiments: A review," Quality Engineering, vol. 28, no. 1, pp. 28–35, 2016.

  21. [21] A. B. Owen, "Orthogonal arrays for computer experiments, integration and visualization," Statistica Sinica, vol. 2, no. 2, pp. 439–452, 1992.

  22. [22] R. K. Meyer and C. J. Nachtsheim, "The coordinate-exchange algorithm for constructing exact optimal experimental designs," Technometrics, vol. 37, no. 1, pp. 60–69, 1995.

  23. [23] A. C. Atkinson and A. N. Donev, Optimum Experimental Designs. Oxford University Press, 1992.

  24. [24] R. V. Lenth, "Quick and easy analysis of unreplicated factorials," Technometrics, vol. 31, no. 4, pp. 469–473, 1989.

  25. [25] R. Storn and K. Price, "Differential evolution – a simple and efficient heuristic for global optimization over continuous spaces," Journal of Global Optimization, vol. 11, no. 4, pp. 341–359, 1997.