Pith · machine review for the scientific record

arxiv: 2605.04905 · v2 · submitted 2026-05-06 · 💻 cs.LG · cs.DB

Recognition: 1 theorem link · Lean Theorem

Cross-Model Consistency of Feature Importance in Electrospinning: Separating Robust from Model-Dependent Features

Authors on Pith: no claims yet

Pith reviewed 2026-05-13 06:55 UTC · model grok-4.3

classification 💻 cs.LG cs.DB
keywords electrospinning · feature importance · SHAP · machine learning · model consistency · polyvinyl alcohol · process parameters · interpretability

The pith

Machine learning models for electrospinning agree on accuracy but produce divergent feature importance rankings.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper investigates whether feature importance from machine learning models remains consistent when modeling electrospinning data from a small set of 96 polyvinyl alcohol experiments. Twenty-one models across linear, tree-based, kernel-based, neural network, and instance-based families were trained and interpreted using SHAP values to derive comparable importance scores. A rank-based statistical analysis then quantified how much parameter rankings varied between models. The work finds that models can reach similar predictive performance while assigning very different importance to most inputs, with only solution concentration showing complete consistency across all models. This separation of predictive success from interpretive reliability implies that single-model feature rankings cannot be trusted without additional checks.

Core claim

Although several models achieved comparable predictive accuracy on the 96-experiment dataset, substantial differences were observed in their SHAP-derived feature importance rankings. Solution concentration emerged as the most robust and consistently influential parameter with variability of zero, whereas flow rate and applied voltage exhibited high ranking variability exceeding 0.9. The results demonstrate that predictive performance and interpretive reliability are fundamentally distinct properties, indicating that feature importance derived from a single ML model may be unreliable for small experimental datasets.

What carries the argument

The variability metric calculated from SHAP-based feature rankings across 21 models spanning five families (linear, tree-based, kernel-based, neural network, and instance-based), which quantifies inter-model agreement on the relative importance of process parameters.
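The paper does not state the exact aggregation rule behind the variability score (see the referee's first minor comment), so the sketch below is an assumption, not the authors' formula: rank features by importance within each model, then report each feature's rank standard deviation across models, normalized to [0, 1]. A feature every model ranks identically scores 0, matching the behavior reported for solution concentration.

```python
import numpy as np

def rank_variability(importance):
    """importance: (n_models, n_features) array of per-model importance
    scores (e.g. mean |SHAP| per feature). Returns each feature's rank
    standard deviation across models, normalized to [0, 1]."""
    n_models, n_features = importance.shape
    order = np.argsort(-importance, axis=1)          # most important first
    ranks = np.empty_like(order)
    ranks[np.arange(n_models)[:, None], order] = np.arange(n_features)
    # Max possible std of a value confined to {0, ..., n_features - 1}
    # is (n_features - 1) / 2, attained by a 50/50 split of the extremes.
    return ranks.std(axis=0) / ((n_features - 1) / 2)

# Toy example: 4 models, 3 features; feature 0 is always ranked first.
imp = np.array([[9.0, 2.0, 1.0],
                [8.0, 1.0, 3.0],
                [7.0, 3.0, 2.0],
                [9.5, 0.5, 4.0]])
v = rank_variability(imp)
print(v)  # feature 0 -> 0.0: fully consistent, like solution concentration
```

Any dispersion statistic over the rank matrix (range, entropy, Kendall's W) would serve the same role; the normalization just makes 0 mean "identical ranks in every model".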

If this is right

  • Single-model feature importance carries a high risk of being model-dependent rather than data-driven when datasets are small.
  • Solution concentration should be treated as the primary reliable control variable for electrospinning process optimization.
  • Flow rate and applied voltage need cross-model validation before any importance ranking is used in experimental planning.
  • Cross-model consistency analysis becomes necessary for trustworthy interpretation in machine learning applied to limited process data.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same consistency check could be applied to other fabrication processes to identify parameters that remain robust regardless of modeling approach.
  • Larger experimental datasets might reduce the model-dependent variability observed here, making single-model interpretations more reliable.
  • Incorporating physical constraints into the models could potentially increase agreement on feature rankings across families.

Load-bearing premise

That differences in SHAP feature rankings across model families directly reflect the true robustness of each parameter rather than model-specific biases or effects from the limited 96-experiment dataset.

What would settle it

Training the same 21 models on a substantially larger electrospinning dataset would settle it: convergence to a single feature importance ranking would show the divergence reported here is a finite-sample artifact, while persistent disagreement would confirm genuine model dependence.

Figures

Figures reproduced from arXiv: 2605.04905 by Ferenc Ender, Mehrab Mahdian, Tamas Pardy.

Figure 1. Measured nanofiber diameter distribution for the 96 PVA electrospin… · view at source ↗
Figure 3. SHAP-based feature importance rank matrix across all evaluated… · view at source ↗
read the original abstract

Electrospinning is a highly sensitive fabrication process in which small variations in operating parameters can significantly influence fiber morphology and material performance. Machine learning (ML) methods are increasingly employed to model these process-structure relationships and to identify the relative importance of processing variables. However, most existing studies rely on a single ML model, implicitly assuming that the resulting feature importance is robust and reproducible. In this study, the consistency of feature importance across multiple ML model families was systematically evaluated using a curated dataset of 96 polyvinyl alcohol (PVA) electrospinning experiments. Twenty-one ML models representing linear, tree-based, kernel-based, neural network, and instance-based approaches were trained and compared. To provide a unified interpretability framework, SHAP (SHapley Additive exPlanations) values were used to calculate feature importance consistently across all models. A rank-based statistical analysis was then performed to quantify inter-model agreement and assess the robustness of parameter rankings. The results demonstrate that predictive performance and interpretive reliability are fundamentally distinct properties. Although several models achieved comparable predictive accuracy, substantial differences were observed in their feature importance rankings. Solution concentration emerged as the most robust and consistently influential parameter (variability = 0), whereas flow rate and applied voltage exhibited high ranking variability (variability > 0.9), indicating strong model dependence. These findings suggest that feature importance derived from a single ML model may be unreliable, particularly for small experimental datasets, and highlight the importance of cross-model validation for achieving trustworthy interpretation in ML-assisted electrospinning research.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper trains 21 ML models spanning linear, tree-based, kernel, neural, and instance-based families on a curated set of 96 PVA electrospinning experiments. It applies SHAP uniformly to obtain feature-importance rankings, then computes a rank-variability statistic across models. Solution concentration is reported as fully robust (variability = 0) while flow rate and applied voltage show high variability (> 0.9), leading to the claim that single-model feature importance is unreliable on small experimental datasets and that cross-model consistency checks are required.

Significance. If the variability metric can be shown to exceed finite-sample noise, the work supplies a concrete cautionary result for the growing use of ML interpretability in process optimization: predictive accuracy and feature-rank stability are distinct, and reliance on any single model family risks model-specific artifacts. The uniform SHAP protocol across 21 models and the explicit separation of accuracy from interpretability are positive methodological steps.

major comments (3)
  1. [Methods (model training and SHAP computation)] The manuscript provides no description of the train/test splitting procedure, cross-validation scheme, or hyperparameter selection method used to train the 21 models. Because SHAP values and the subsequent rank-variability statistic are functionals of the fitted models, the reported values (concentration variability = 0, flow-rate/voltage variability > 0.9) cannot be assessed for sensitivity to these choices on an n = 96 table.
  2. [Results (rank-variability analysis)] No bootstrap, permutation, or repeated-subsampling experiment is presented to establish that the observed rank dispersion exceeds the variability expected from sampling noise alone. With only 96 rows, small perturbations in the training data can reorder top features; without such a stability check the claim that high variability indicates genuine model dependence rather than finite-sample artifact remains untested.
  3. [Results (predictive performance comparison)] The abstract states that several models achieved comparable predictive accuracy, yet no table or figure reports the actual performance metrics (R², MAE, or RMSE) for each of the 21 models. Without these numbers it is impossible to verify that the models whose rankings differ are all comparably competent predictors.
minor comments (2)
  1. [Methods] The exact formula or aggregation rule used to convert per-model SHAP rankings into the scalar variability score (0 for concentration, > 0.9 for others) should be stated explicitly, preferably with a small worked example.
  2. [Figures] Figure captions and axis labels should indicate the number of models (21) and the number of experiments (96) so that readers can immediately gauge the scale of the cross-model comparison.
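The training protocol requested in major comment 1 takes only a few lines once specified. The sketch below is one plausible setup, not the paper's: an 80/20 hold-out split and a 5-fold grid search over an invented hyperparameter grid, run on synthetic stand-in data shaped like the 96-row table.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, train_test_split

rng = np.random.default_rng(0)
# Synthetic stand-in shaped like the 96-row table: four process
# parameters (concentration, voltage, flow rate, distance) -> diameter.
X = rng.uniform(size=(96, 4))
y = 3.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=96)

# 80/20 hold-out split.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

# 5-fold grid search; the grid here is invented for illustration.
search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid={"n_estimators": [50, 100], "max_depth": [3, None]},
    cv=5,
    scoring="r2",
)
search.fit(X_tr, y_tr)
test_r2 = search.score(X_te, y_te)
print(search.best_params_, round(test_r2, 3))
```

Reporting exactly this tuple (split, folds, grid, seed) per model family is what the comment asks for; on n = 96 the variability statistic should also be re-run under a few alternative seeds and split ratios.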

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the thoughtful and detailed comments, which highlight important aspects of methodological transparency and statistical rigor. We address each major comment below and will revise the manuscript accordingly to strengthen the presentation.

read point-by-point responses
  1. Referee: [Methods (model training and SHAP computation)] The manuscript provides no description of the train/test splitting procedure, cross-validation scheme, or hyperparameter selection method used to train the 21 models. Because SHAP values and the subsequent rank-variability statistic are functionals of the fitted models, the reported values (concentration variability = 0, flow-rate/voltage variability > 0.9) cannot be assessed for sensitivity to these choices on an n = 96 table.

    Authors: We agree that these details are necessary for full reproducibility and for readers to evaluate potential sensitivity of the variability statistic. In the revised manuscript we will add a dedicated Methods subsection that specifies the 80/20 train/test split, the 5-fold cross-validation scheme used for hyperparameter tuning, and the grid-search procedure (with ranges and libraries) applied to each of the 21 model families. This addition will directly address the concern. revision: yes

  2. Referee: [Results (rank-variability analysis)] No bootstrap, permutation, or repeated-subsampling experiment is presented to establish that the observed rank dispersion exceeds the variability expected from sampling noise alone. With only 96 rows, small perturbations in the training data can reorder top features; without such a stability check the claim that high variability indicates genuine model dependence rather than finite-sample artifact remains untested.

    Authors: The referee is correct that a direct stability check against finite-sample noise would strengthen the interpretation. We will therefore add a bootstrap analysis to the revised Results section: 500 bootstrap resamples of the 96 experiments will be drawn, each model retrained, SHAP rankings recomputed, and the resulting distribution of the rank-variability statistic reported. We will show that the observed variability of 0 for concentration lies well within the bootstrap noise envelope while the values >0.9 for flow rate and voltage lie outside it, thereby confirming that the reported model dependence exceeds sampling variability. revision: yes

  3. Referee: [Results (predictive performance comparison)] The abstract states that several models achieved comparable predictive accuracy, yet no table or figure reports the actual performance metrics (R², MAE, or RMSE) for each of the 21 models. Without these numbers it is impossible to verify that the models whose rankings differ are all comparably competent predictors.

    Authors: We will insert a new table in the Results section that lists R², MAE, and RMSE on the test set for all 21 models. The table will demonstrate that the models with divergent feature rankings nevertheless achieve comparable predictive performance (within a narrow band of R² values), thereby supporting the manuscript’s central distinction between accuracy and interpretability. revision: yes
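The bootstrap check promised in the second response can be prototyped cheaply. In the sketch below, retraining 21 models is replaced by a stand-in importance measure (absolute Pearson correlation with the response) on synthetic data; only the resample-and-rerank loop mirrors the proposed protocol.

```python
import numpy as np

rng = np.random.default_rng(1)
n, B = 96, 500  # dataset size and bootstrap resamples, as proposed

# Synthetic stand-in: feature 0 dominates (like solution concentration),
# feature 1 is weak (like flow rate), features 2-3 are pure noise.
X = rng.uniform(size=(n, 4))
y = 3.0 * X[:, 0] + 0.2 * X[:, 1] + rng.normal(scale=0.5, size=n)

def importance_ranks(Xb, yb):
    # Stand-in importance: |Pearson correlation| with the response.
    # The real protocol would retrain each of the 21 models and
    # recompute SHAP rankings on every resample.
    imp = np.abs([np.corrcoef(Xb[:, j], yb)[0, 1] for j in range(Xb.shape[1])])
    order = np.argsort(-imp)                 # most important first
    ranks = np.empty_like(order)
    ranks[order] = np.arange(len(imp))
    return ranks

boot_ranks = []
for _ in range(B):
    i = rng.integers(0, n, size=n)           # resample rows with replacement
    boot_ranks.append(importance_ranks(X[i], y[i]))
boot_ranks = np.array(boot_ranks)

# Fraction of resamples in which the dominant feature keeps rank 0;
# a variability-0 feature should sit near 1.0 here.
stability = float((boot_ranks[:, 0] == 0).mean())
print(stability)
```

The key comparison is then between this within-model bootstrap dispersion and the across-model dispersion: only rank variability that exceeds the bootstrap noise envelope counts as model dependence.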

Circularity Check

0 steps flagged

No significant circularity in cross-model SHAP variability analysis

full rationale

The paper trains 21 independent models on the 96-row experimental dataset, computes SHAP values for each, derives feature ranks, and then calculates a direct statistical variability metric across those ranks. This variability (e.g., concentration = 0, flow rate/voltage > 0.9) is a straightforward dispersion statistic and does not reduce to any fitted parameter, self-definition, or self-citation chain. No ansatz is smuggled, no uniqueness theorem is invoked, and the central claim follows from the empirical computation on external data rather than tautological re-expression of inputs. The analysis is self-contained against the provided experimental table.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the empirical computation of SHAP values and rank variability from the 96-experiment dataset together with the assumption that SHAP provides comparable importance scores across model families.

axioms (1)
  • domain assumption: SHAP values provide a consistent and comparable measure of feature importance across linear, tree-based, kernel, neural, and instance-based models
    The paper adopts SHAP as the unified interpretability framework for all 21 models.
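That axiom traces back to the Shapley value itself, which is defined for any model via the same value function. A minimal brute-force illustration (not the sampling approximations real SHAP implementations use):

```python
from itertools import combinations
from math import factorial

def shapley_values(f, x, baseline):
    """Exact Shapley values of model f at x against a baseline input.
    v(S) evaluates f with features in S taken from x and the rest from
    the baseline; this is the value function SHAP approximates for any
    model family, which is what makes the scores comparable."""
    d = len(x)
    def v(S):
        return f([x[j] if j in S else baseline[j] for j in range(d)])
    phi = []
    for i in range(d):
        others = [j for j in range(d) if j != i]
        total = 0.0
        for k in range(d):
            for S in combinations(others, k):
                weight = factorial(k) * factorial(d - k - 1) / factorial(d)
                total += weight * (v(set(S) | {i}) - v(set(S)))
        phi.append(total)
    return phi

# For a linear model the exact values reduce to w_i * (x_i - baseline_i),
# no matter how the model was fitted.
w = [3.0, 0.5, -1.0]
linear = lambda z: sum(wi * zi for wi, zi in zip(w, z))
phi = shapley_values(linear, x=[1.0, 2.0, 3.0], baseline=[0.0, 0.0, 0.0])
print([round(p, 6) for p in phi])  # [3.0, 1.0, -3.0]
```

The shared value function gives comparability of scale; it does not guarantee that different fitted models attribute that scale to the same features, which is exactly the gap the paper measures.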

pith-pipeline@v0.9.0 · 5584 in / 1213 out tokens · 115655 ms · 2026-05-13T06:55:08.325439+00:00 · methodology

discussion (0)


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
  • matches: the paper's claim is directly supported by a theorem in the formal canon.
  • supports: the theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
  • extends: the paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
  • uses: the paper appears to rely on the theorem as machinery.
  • contradicts: the paper's claim conflicts with a theorem or certificate in the canon.
  • unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

26 extracted references · 26 canonical work pages · 1 internal anchor

  1. Tucker, N., Stanger, J. J., Staiger, M. P., Razzaq, H. & Hofman, K. The History of the Science and Technology of Electrospinning from 1600 to 1995. Journal of Engineered Fibers and Fabrics 7, 63–73 (2012). https://doi.org/10.1177/155892501200702S10
  2. Bhardwaj, N. & Kundu, S. C. Electrospinning: A fascinating fiber fabrication technique. Biotechnology Advances 28, 325–347 (2010). https://doi.org/10.1016/j.biotechadv.2010.01.004
  3. Pisani, S., De Santis, F. & Fracassi, F. A Design of Experiment (DOE) approach to correlate PLA electrospinning parameters with nanofiber diameter and mechanical properties for soft tissue regeneration purposes. Journal of Drug Delivery Science and Technology 63, Article 103060 (2021). https://doi.org/10.1016/j.jddst.2021.103060
  4. Shao, Z., Wang, Q., Gui, Z., Shen, R., Chen, R., Liu, Y. & Zheng, G. Electrospun bimodal nanofibrous membranes for high-performance, multifunctional, and light-weight air filtration: A review. Separation and Purification Technology 358, Article 130417 (2025). https://doi.org/10.1016/j.seppur.2024.130417
  5. Doroudkhani, Z. S., Mazloom, J. & Mahinzad Ghaziani, M. Optical and electrochemical performance of electrospun NiO–Mn3O4 nanocomposites for energy storage applications. Scientific Reports 15(1), Article 11436 (2025). https://doi.org/10.1038/s41598-025-96008-4
  6. Wang, H., Li, S., Dai, T., Yang, Y., Wang, L., Yao, J., Zhu, G., Guo, B., Khabibulla, P. & Zhang, M. Multi-structured nanofibers for advanced multifunctional protective fabrics via coaxial electrospinning. Journal of Applied Polymer Science 63, Article e57774 (2025). https://doi.org/10.1002/app.57774
  7. Haghi, A. K. & Akbari, M. Trends in electrospinning of natural nanofibers. Physica Status Solidi (A) 204, 1830–1834 (2007). https://doi.org/10.1002/pssa.200675301
  8. Medeiros, G. B., Lima, F. A., de Almeida, D. S., Guerra, V. G. & Aguiar, M. L. Modification and functionalization of fibers formed by electrospinning: A review. Membranes 12, 861 (2022). https://doi.org/10.3390/membranes12090861
  9. Ramakrishna, S., Fujihara, K., Teo, W. E., Lim, T. C. & Ma, Z. (eds.) An Introduction to Electrospinning and Nanofibers (World Scientific Publishing, 2005). https://doi.org/10.1142/9789812567611_0003
  10. Xue, J., Wu, T., Dai, Y. & Xia, Y. Electrospinning and electrospun nanofibers: Methods, materials, and applications. Chemical Reviews 119, 5298–5415 (2019). https://doi.org/10.1021/acs.chemrev.8b00593
  11. Wang, S.-X., Yap, C. C., He, J., Chen, C., Wong, S. Y. & Li, X. Electrospinning: A facile technique for fabricating functional nanofibers for environmental applications. Nanotechnology Reviews 5, 51–73 (2016). https://doi.org/10.1515/ntrev-2015-0065
  12. Mahdian, M., Ender, F. & Pardy, T. Electrospinning-Data.org: A FAIR, Structured Knowledge Resource for Nanofiber Fabrication. arXiv preprint arXiv:2603.27841 (2026). https://doi.org/10.48550/arXiv.2603.27841
  13. Mahdian, M., Stummer, T., Sepsik, N., Ender, F., Balogh-Weiser, D. & Pardy, T. Towards controllable electrospinning: a systematic analysis of the effect of input parameters on nanofiber morphology. TechRxiv 2025, 0507 (2025). https://doi.org/10.36227/techrxiv.174662365.53016549/v1
  14. D'Amour, A. et al. Underspecification presents challenges for credibility in modern machine learning. arXiv (2020). https://doi.org/10.48550/arXiv.2011.03395
  15. Breiman, L. Statistical modeling: The two cultures (with comments and a rejoinder by the author). Statistical Science 16, 199–231 (2001). https://doi.org/10.1214/ss/1009213726
  16. Khan, K.-u.-H. & Siddiqui, I. A. A predictive model for electrospun based polyvinyl alcohol (PVA) nanofibers diameter using an artificial neural network. Scientific Reports 15, Article 21591 (2025). https://doi.org/10.1038/s41598-025-92877-x
  17. Cuahuizo-Huitzil, G., Olivares-Xometl, O., Castro, M. E., Arellanes-Lozada, P., Meléndez-Bustamante, F. J., Pineda Torres, I. H., Santacruz-Vázquez, C. & Santacruz-Vázquez, V. Artificial neural networks for predicting the diameter of electrospun nanofibers synthesized from solutions/emulsions of biopolymers and oils. Materials 16, Article 5720 (2023)...
  18. Roldán, E. et al. FibreCastML: An open web platform for predicting electrospun nanofibre diameter distributions for biomedical applications. Frontiers in Bioengineering and Biotechnology 14, 1713804 (2026). https://doi.org/10.3389/fbioe.2026.1713804
  19. Mahdian, M., Ender, F. & Pardy, T. A surrogate-based inverse design framework for targeted diameter control of electrospun nanofibers. Scientific Reports (2026). https://doi.org/10.1038/s41598-026-40692-3
  20. Roldán, E. & Sabir, T. SpinCastML: An open decision-making application for inverse design of electrospinning manufacturing: a machine learning, optimal sampling and inverse Monte Carlo approach. arXiv (2026). https://doi.org/10.48550/arXiv.2602.09120
  21. Subeshan, B. & Asmatulu, E. Data-driven prediction and optimization of electrospun nanofibrous scaffold diameters for tissue engineering applications using machine learning and genetic algorithms. The International Journal of Advanced Manufacturing Technology 143, 2823–2854 (2026). https://doi.org/10.1007/s00170-026-17480-4
  22. Sarma, S., Phadkule, S. S. & Gaur, C. Electrospun Fiber Experimental Attributes Dataset (FEAD). Zenodo (2023). https://doi.org/10.5281/zenodo.10301664
  23. Mahdian, M., Stummer, T., Sepsik, N., Ender, F., Balogh-Weiser, D. & Pardy, T. Cogni-e-SpinDB 1.0: Open Dataset of Electrospinning Parameter Configurations and Resultant Nanofiber Morphologies. Scientific Data 13, 201 (2026). https://doi.org/10.1038/s41597-025-06520-5
  24. Ziabari, M., Mottaghitalab, V. & Haghi, A. A new approach for optimization of electrospun nanofiber formation process. Korean Journal of Chemical Engineering 27(1), 340–354 (2010). https://doi.org/10.1007/s11814-009-0309-1
  25. Mahdian, M., Stummer, T., Sepsik, N. et al. Cogni-e-Spin DB 1.0: Open Dataset of Electrospinning Parameter Configurations and Resultant Nanofiber Morphologies [Data set]. Zenodo (2025). https://doi.org/10.5281/zenodo.16731638
  26. Lundberg, S. M. & Lee, S. A unified approach to interpreting model predictions. In Advances in Neural Information Processing Systems 30, 4765–4774 (Curran Associates, Inc., 2017). https://proceedings.neurips.cc/paper/2017/hash/8a20a8621978632d76c43dfd28b67767-Abstract.html