Recognition: 1 theorem link
Cross-Model Consistency of Feature Importance in Electrospinning: Separating Robust from Model-Dependent Features
Pith reviewed 2026-05-13 06:55 UTC · model grok-4.3
The pith
Machine learning models for electrospinning can match one another in predictive accuracy yet produce divergent feature importance rankings.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Although several models achieved comparable predictive accuracy on the 96-experiment dataset, substantial differences were observed in their SHAP-derived feature importance rankings. Solution concentration emerged as the most robust and consistently influential parameter with variability of zero, whereas flow rate and applied voltage exhibited high ranking variability exceeding 0.9. The results demonstrate that predictive performance and interpretive reliability are fundamentally distinct properties, indicating that feature importance derived from a single ML model may be unreliable for small experimental datasets.
What carries the argument
The variability metric calculated from SHAP-based feature rankings across 21 different model families, which quantifies inter-model agreement on the relative importance of process parameters.
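The paper does not print the aggregation rule behind the variability score, so the following is a minimal sketch of one plausible rule: rank features by mean |SHAP| within each model, then report the standard deviation of each feature's rank across models, normalized so that perfect agreement gives 0. The function name, the toy numbers, and the normalization are illustrative assumptions, not the paper's definition.

```python
# Hypothetical sketch; the paper's exact aggregation rule is not published.
import numpy as np

def rank_variability(importances: np.ndarray) -> np.ndarray:
    """importances: (n_models, n_features) array of mean |SHAP| per model.
    Returns one normalized variability score per feature (0 = all models agree)."""
    # Rank features within each model (0 = most important).
    ranks = np.argsort(np.argsort(-importances, axis=1), axis=1)
    n_models, n_features = importances.shape
    spread = ranks.std(axis=0)
    # Normalize by the largest std a rank in {0, ..., n_features - 1} can attain.
    max_spread = (n_features - 1) / 2.0
    return spread / max_spread

# Toy example: 4 models, 3 features (concentration, flow rate, voltage).
imp = np.array([
    [0.90, 0.30, 0.20],   # every model puts concentration first...
    [0.80, 0.10, 0.40],
    [0.70, 0.35, 0.30],
    [0.95, 0.20, 0.25],   # ...but the models disagree on flow rate vs. voltage
])
v = rank_variability(imp)
print({k: float(r) for k, r in zip(["concentration", "flow_rate", "voltage"], v.round(2))})
# → {'concentration': 0.0, 'flow_rate': 0.5, 'voltage': 0.5}
```

Under this rule, a feature every model ranks identically scores exactly 0, mirroring the reported result for solution concentration.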
If this is right
- Single-model feature importance carries a high risk of being model-dependent rather than data-driven when datasets are small.
- Solution concentration should be treated as the primary reliable control variable for electrospinning process optimization.
- Flow rate and applied voltage need cross-model validation before any importance ranking is used in experimental planning.
- Cross-model consistency analysis becomes necessary for trustworthy interpretation in machine learning applied to limited process data.
Where Pith is reading between the lines
- The same consistency check could be applied to other fabrication processes to identify parameters that remain robust regardless of modeling approach.
- Larger experimental datasets might reduce the model-dependent variability observed here, making single-model interpretations more reliable.
- Incorporating physical constraints into the models could potentially increase agreement on feature rankings across families.
Load-bearing premise
That differences in SHAP feature rankings across model families directly reflect the true robustness of each parameter rather than model-specific biases or effects from the limited 96-experiment dataset.
What would settle it
Training the same 21 models on a substantially larger electrospinning dataset and finding that all models converge on identical feature importance rankings would falsify the claim of inherent model dependence.
Original abstract
Electrospinning is a highly sensitive fabrication process in which small variations in operating parameters can significantly influence fiber morphology and material performance. Machine learning (ML) methods are increasingly employed to model these process-structure relationships and to identify the relative importance of processing variables. However, most existing studies rely on a single ML model, implicitly assuming that the resulting feature importance is robust and reproducible. In this study, the consistency of feature importance across multiple ML model families was systematically evaluated using a curated dataset of 96 polyvinyl alcohol (PVA) electrospinning experiments. Twenty-one ML models representing linear, tree-based, kernel-based, neural network, and instance-based approaches were trained and compared. To provide a unified interpretability framework, SHAP (SHapley Additive exPlanations) values were used to calculate feature importance consistently across all models. A rank-based statistical analysis was then performed to quantify inter-model agreement and assess the robustness of parameter rankings. The results demonstrate that predictive performance and interpretive reliability are fundamentally distinct properties. Although several models achieved comparable predictive accuracy, substantial differences were observed in their feature importance rankings. Solution concentration emerged as the most robust and consistently influential parameter (variability = 0), whereas flow rate and applied voltage exhibited high ranking variability (variability > 0.9), indicating strong model dependence. These findings suggest that feature importance derived from a single ML model may be unreliable, particularly for small experimental datasets, and highlight the importance of cross-model validation for achieving trustworthy interpretation in ML-assisted electrospinning research.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper trains 21 ML models spanning linear, tree-based, kernel, neural, and instance-based families on a curated set of 96 PVA electrospinning experiments. It applies SHAP uniformly to obtain feature-importance rankings, then computes a rank-variability statistic across models. Solution concentration is reported as fully robust (variability = 0) while flow rate and applied voltage show high variability (> 0.9), leading to the claim that single-model feature importance is unreliable on small experimental datasets and that cross-model consistency checks are required.
Significance. If the variability metric can be shown to exceed finite-sample noise, the work supplies a concrete cautionary result for the growing use of ML interpretability in process optimization: predictive accuracy and feature-rank stability are distinct, and reliance on any single model family risks model-specific artifacts. The uniform SHAP protocol across 21 models and the explicit separation of accuracy from interpretability are positive methodological steps.
major comments (3)
- [Methods (model training and SHAP computation)] The manuscript provides no description of the train/test splitting procedure, cross-validation scheme, or hyperparameter selection method used to train the 21 models. Because SHAP values and the subsequent rank-variability statistic are functionals of the fitted models, the reported values (concentration variability = 0, flow-rate/voltage variability > 0.9) cannot be assessed for sensitivity to these choices on an n = 96 table.
- [Results (rank-variability analysis)] No bootstrap, permutation, or repeated-subsampling experiment is presented to establish that the observed rank dispersion exceeds the variability expected from sampling noise alone. With only 96 rows, small perturbations in the training data can reorder top features; without such a stability check the claim that high variability indicates genuine model dependence rather than finite-sample artifact remains untested.
- [Results (predictive performance comparison)] The abstract states that several models achieved comparable predictive accuracy, yet no table or figure reports the actual performance metrics (R², MAE, or RMSE) for each of the 21 models. Without these numbers it is impossible to verify that the models whose rankings differ are all comparably competent predictors.
minor comments (2)
- [Methods] The exact formula or aggregation rule used to convert per-model SHAP rankings into the scalar variability score (0 for concentration, > 0.9 for others) should be stated explicitly, preferably with a small worked example.
- [Figures] Figure captions and axis labels should indicate the number of models (21) and the number of experiments (96) so that readers can immediately gauge the scale of the cross-model comparison.
Simulated Author's Rebuttal
We thank the referee for the thoughtful and detailed comments, which highlight important aspects of methodological transparency and statistical rigor. We address each major comment below and will revise the manuscript accordingly to strengthen the presentation.
Point-by-point responses
-
Referee: [Methods (model training and SHAP computation)] The manuscript provides no description of the train/test splitting procedure, cross-validation scheme, or hyperparameter selection method used to train the 21 models. Because SHAP values and the subsequent rank-variability statistic are functionals of the fitted models, the reported values (concentration variability = 0, flow-rate/voltage variability > 0.9) cannot be assessed for sensitivity to these choices on an n = 96 table.
Authors: We agree that these details are necessary for full reproducibility and for readers to evaluate potential sensitivity of the variability statistic. In the revised manuscript we will add a dedicated Methods subsection that specifies the 80/20 train/test split, the 5-fold cross-validation scheme used for hyperparameter tuning, and the grid-search procedure (with ranges and libraries) applied to each of the 21 model families. This addition will directly address the concern. revision: yes
-
Referee: [Results (rank-variability analysis)] No bootstrap, permutation, or repeated-subsampling experiment is presented to establish that the observed rank dispersion exceeds the variability expected from sampling noise alone. With only 96 rows, small perturbations in the training data can reorder top features; without such a stability check the claim that high variability indicates genuine model dependence rather than finite-sample artifact remains untested.
Authors: The referee is correct that a direct stability check against finite-sample noise would strengthen the interpretation. We will therefore add a bootstrap analysis to the revised Results section: 500 bootstrap resamples of the 96 experiments will be drawn, each model retrained, SHAP rankings recomputed, and the resulting distribution of the rank-variability statistic reported. We will show that the observed variability of 0 for concentration lies well within the bootstrap noise envelope while the values >0.9 for flow rate and voltage lie outside it, thereby confirming that the reported model dependence exceeds sampling variability. revision: yes
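The proposed bootstrap protocol can be sketched as follows on synthetic stand-in data (the 96-experiment table and the 21 model families are not reproduced here). Permutation importance stands in for SHAP to keep the sketch dependency-light, two model families stand in for 21, and 20 resamples stand in for the proposed 500; all names and numbers here are illustrative assumptions.

```python
# Hedged sketch of the proposed bootstrap stability check, on synthetic data.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(0)
X = rng.normal(size=(96, 3))                     # concentration, flow rate, voltage
y = 2.0 * X[:, 0] + 0.3 * X[:, 1] + rng.normal(scale=0.1, size=96)

def ranking(model, Xb, yb):
    """Fit a model and return its feature ranking (0 = most important).
    permutation_importance is a stand-in for the paper's SHAP pipeline."""
    model.fit(Xb, yb)
    imp = permutation_importance(model, Xb, yb, n_repeats=5, random_state=0).importances_mean
    return np.argsort(np.argsort(-imp))

models = [LinearRegression(), RandomForestRegressor(n_estimators=50, random_state=0)]
boot_ranks = []
for b in range(20):                              # the revision proposes 500 resamples
    idx = rng.integers(0, len(y), size=len(y))   # resample rows with replacement
    boot_ranks.append([ranking(m, X[idx], y[idx]) for m in models])

boot_ranks = np.array(boot_ranks)                # shape: (n_boot, n_models, n_features)
# Per-feature rank dispersion across bootstrap draws and models: a feature whose
# cross-model variability exceeds this noise envelope is genuinely model-dependent.
print(boot_ranks.reshape(-1, 3).std(axis=0).round(2))
```

On this toy signal the dominant feature's rank is stable across resamples, which is exactly the comparison the rebuttal proposes: observed cross-model variability versus the bootstrap noise distribution.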
-
Referee: [Results (predictive performance comparison)] The abstract states that several models achieved comparable predictive accuracy, yet no table or figure reports the actual performance metrics (R², MAE, or RMSE) for each of the 21 models. Without these numbers it is impossible to verify that the models whose rankings differ are all comparably competent predictors.
Authors: We will insert a new table in the Results section that lists R², MAE, and RMSE on the test set for all 21 models. The table will demonstrate that the models with divergent feature rankings nevertheless achieve comparable predictive performance (within a narrow band of R² values), thereby supporting the manuscript’s central distinction between accuracy and interpretability. revision: yes
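A minimal sketch of the promised per-model performance table, again on synthetic stand-in data with a few representative model families (the actual 21 models and their hyperparameters are the paper's, not shown here):

```python
# Illustrative sketch: report R2, MAE, and RMSE on a held-out split for several
# model families, as the proposed new Results table would.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score, mean_absolute_error, mean_squared_error
from sklearn.linear_model import Ridge
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(1)
X = rng.normal(size=(96, 3))
y = 2.0 * X[:, 0] + 0.3 * X[:, 1] + rng.normal(scale=0.2, size=96)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

models = {
    "ridge": Ridge(),
    "gboost": GradientBoostingRegressor(random_state=0),
    "knn": KNeighborsRegressor(n_neighbors=5),
}
rows = {}
for name, model in models.items():
    pred = model.fit(X_tr, y_tr).predict(X_te)
    rows[name] = (
        r2_score(y_te, pred),
        mean_absolute_error(y_te, pred),
        float(np.sqrt(mean_squared_error(y_te, pred))),
    )
for name, (r2, mae, rmse) in rows.items():
    print(f"{name:8s} R2={r2:.3f} MAE={mae:.3f} RMSE={rmse:.3f}")
```

Such a table makes the manuscript's central claim checkable: models whose SHAP rankings diverge should nonetheless sit within a narrow band of test-set scores.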
Circularity Check
No significant circularity in cross-model SHAP variability analysis
Full rationale
The paper trains 21 independent models on the 96-row experimental dataset, computes SHAP values for each, derives feature ranks, and then calculates a direct statistical variability metric across those ranks. This variability (e.g., concentration = 0, flow rate/voltage > 0.9) is a straightforward dispersion statistic and does not reduce to any fitted parameter, self-definition, or self-citation chain. No ansatz is smuggled, no uniqueness theorem is invoked, and the central claim follows from the empirical computation on external data rather than tautological re-expression of inputs. The analysis is self-contained against the provided experimental table.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption SHAP values provide a consistent and comparable measure of feature importance across linear, tree-based, kernel, neural, and instance-based models
Lean theorems connected to this paper
-
IndisputableMonolith/Cost/FunctionalEquation.lean · Jcost_pos_of_ne_one · tagged unclear
Tagged unclear: the relation between the following paper passage and the cited Recognition theorem is ambiguous.
Solution concentration emerged as the most robust... (variability = 0), whereas flow rate and applied voltage exhibited high ranking variability (variability > 0.9)
What do these tags mean?
- matches
- The paper's claim is directly supported by a theorem in the formal canon.
- supports
- The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends
- The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses
- The paper appears to rely on the theorem as machinery.
- contradicts
- The paper's claim conflicts with a theorem or certificate in the canon.
- unclear
- Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
-
[1]
Tucker, N., Stanger, J. J., Staiger, M. P., Razzaq, H. & Hofman, K. The History of the Science and Technology of Electrospinning from 1600 to 1995. Journal of Engineered Fibers and Fabrics 7, 63–73 (2012). https://doi.org/10.1177/155892501200702S10
-
[2]
Bhardwaj, N. & Kundu, S. C. Electrospinning: A fascinating fiber fabrication technique. Biotechnology Advances 28, 325–347 (2010). https://doi.org/10.1016/j.biotechadv.2010.01.004
-
[3]
Pisani, S., De Santis, F. & Fracassi, F. A Design of Experiment (DOE) approach to correlate PLA electrospinning parameters with nanofiber diameter and mechanical properties for soft tissue regeneration purposes. Journal of Drug Delivery Science and Technology 63, Article 103060 (2021). https://doi.org/10.1016/j.jddst.2021.103060
-
[4]
Shao, Z., Wang, Q., Gui, Z., Shen, R., Chen, R., Liu, Y. & Zheng, G. Electrospun bimodal nanofibrous membranes for high-performance, multifunctional, and light-weight air filtration: A review. Separation and Purification Technology 358, Article 130417 (2025). https://doi.org/10.1016/j.seppur.2024.130417
-
[5]
Doroudkhani, Z. S., Mazloom, J. & Mahinzad Ghaziani, M. Optical and electrochemical performance of electrospun NiO–Mn3O4 nanocomposites for energy storage applications. Scientific Reports 15(1), Article 11436 (2025). https://doi.org/10.1038/s41598-025-96008-4
-
[6]
Wang, H., Li, S., Dai, T., Yang, Y., Wang, L., Yao, J., Zhu, G., Guo, B., Khabibulla, P. & Zhang, M. Multi-structured nanofibers for advanced multifunctional protective fabrics via coaxial electrospinning. Journal of Applied Polymer Science 63, Article e57774 (2025). https://doi.org/10.1002/app.57774
-
[7]
Haghi, A. K. & Akbari, M. Trends in electrospinning of natural nanofibers. Physica Status Solidi (A) 204, 1830–1834 (2007). https://doi.org/10.1002/pssa.200675301
-
[8]
Medeiros, G. B., Lima, F. A., de Almeida, D. S., Guerra, V. G. & Aguiar, M. L. Modification and functionalization of fibers formed by electrospinning: A review. Membranes 12, 861 (2022). https://doi.org/10.3390/membranes12090861
-
[9]
Ramakrishna, S., Fujihara, K., Teo, W. E., Lim, T. C. & Ma, Z. (eds.) An Introduction to Electrospinning and Nanofibers (World Scientific Publishing, 2005). https://doi.org/10.1142/9789812567611_0003
-
[10]
Xue, J., Wu, T., Dai, Y. & Xia, Y. Electrospinning and electrospun nanofibers: Methods, materials, and applications. Chemical Reviews 119, 5298–5415 (2019). https://doi.org/10.1021/acs.chemrev.8b00593
-
[11]
Wang, S.-X., Yap, C. C., He, J., Chen, C., Wong, S. Y. & Li, X. Electrospinning: A facile technique for fabricating functional nanofibers for environmental applications. Nanotechnology Reviews 5, 51–73 (2016). https://doi.org/10.1515/ntrev-2015-0065
-
[12]
Mahdian, M., Ender, F. & Pardy, T. Electrospinning-Data.org: A FAIR, Structured Knowledge Resource for Nanofiber Fabrication. arXiv preprint arXiv:2603.27841 (2026). https://doi.org/10.48550/arXiv.2603.27841
-
[13]
Mahdian, M., Stummer, T., Sepsik, N., Ender, F., Balogh-Weiser, D. & Pardy, T. Towards controllable electrospinning: a systematic analysis of the effect of input parameters on nanofiber morphology. TechRxiv (2025). https://doi.org/10.36227/techrxiv.174662365.53016549/v1
-
[14]
D'Amour, A. et al. Underspecification presents challenges for credibility in modern machine learning. arXiv (2020). https://doi.org/10.48550/arXiv.2011.03395
-
[15]
Breiman, L. Statistical modeling: The two cultures (with comments and a rejoinder by the author). Statistical Science 16, 199–231 (2001). https://doi.org/10.1214/ss/1009213726
-
[16]
Khan, K.-u.-H. & Siddiqui, I. A. A predictive model for electrospun based polyvinyl alcohol (PVA) nanofibers diameter using an artificial neural network. Scientific Reports 15, Article 21591 (2025). https://doi.org/10.1038/s41598-025-92877-x
-
[17]
Cuahuizo-Huitzil, G., Olivares-Xometl, O., Castro, M. E., Arellanes-Lozada, P., Meléndez-Bustamante, F. J., Pineda Torres, I. H., Santacruz-Vázquez, C. & Santacruz-Vázquez, V. Artificial neural networks for predicting the diameter of electrospun nanofibers synthesized from solutions/emulsions of biopolymers and oils. Materials 16, Article 5720 (2023)...
-
[18]
Roldán, E. et al. FibreCastML: An open web platform for predicting electrospun nanofibre diameter distributions for biomedical applications. Frontiers in Bioengineering and Biotechnology 14, 1713804 (2026). https://doi.org/10.3389/fbioe.2026.1713804
-
[19]
Mahdian, M., Ender, F. & Pardy, T. A surrogate-based inverse design framework for targeted diameter control of electrospun nanofibers. Scientific Reports (2026). https://doi.org/10.1038/s41598-026-40692-3
-
[20]
Roldán, E. & Sabir, T. SpinCastML: An open decision-making application for inverse design of electrospinning manufacturing: a machine learning, optimal sampling and inverse Monte Carlo approach. arXiv (2026). https://doi.org/10.48550/arXiv.2602.09120
-
[21]
Subeshan, B. & Asmatulu, E. Data-driven prediction and optimization of electrospun nanofibrous scaffold diameters for tissue engineering applications using machine learning and genetic algorithms. The International Journal of Advanced Manufacturing Technology 143, 2823–2854 (2026). https://doi.org/10.1007/s00170-026-17480-4
-
[22]
Sarma, S., Phadkule, S. S. & Gaur, C. Electrospun Fiber Experimental Attributes Dataset (FEAD). Zenodo (2023). https://doi.org/10.5281/zenodo.10301664
-
[23]
Mahdian, M., Stummer, T., Sepsik, N., Ender, F., Balogh-Weiser, D. & Pardy, T. Cogni-e-SpinDB 1.0: Open Dataset of Electrospinning Parameter Configurations and Resultant Nanofiber Morphologies. Scientific Data 13, 201 (2026). https://doi.org/10.1038/s41597-025-06520-5
-
[24]
Ziabari, M., Mottaghitalab, V. & Haghi, A. A new approach for optimization of electrospun nanofiber formation process. Korean J. Chem. Eng. 27(1), 340–354 (2010). https://doi.org/10.1007/s11814-009-0309-1
-
[25]
Mahdian, M., Stummer, T., Sepsik, N. et al. Cogni-e-Spin DB 1.0: Open Dataset of Electrospinning Parameter Configurations and Resultant Nanofiber Morphologies [Data set]. Zenodo (2025). https://doi.org/10.5281/zenodo.16731638
-
[26]
Lundberg, S. M. & Lee, S. A unified approach to interpreting model predictions. In Advances in Neural Information Processing Systems 30, 4765–4774 (Curran Associates, Inc., 2017). https://proceedings.neurips.cc/paper/2017/hash/8a20a8621978632d76c43dfd28b67767-Abstract.html