Topological Data Analysis combined with Machine Learning for Predicting Permeability of Porous Media
Pith reviewed 2026-05-19 22:13 UTC · model grok-4.3
The pith
Topological data analysis supplies effective features for machine learning models that predict permeability in porous media from structure.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Features that describe the geometry of synthetic porous media, their topological connectivity, and their representation as pore networks can be fed as inputs into machine learning algorithms together with exact permeability ground truth; topological data analysis features in particular form a useful set that yields meaningful permeability predictions.
What carries the argument
Topological data analysis features that capture connectivity of the pore space, supplied as inputs to machine learning models trained on exact permeability values.
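To make the notion of a connectivity feature concrete, the sketch below counts the connected pore components of a small binary image. Under an assumed 4-connectivity convention, this count is the zeroth Betti number of the pore space. The grid, the function name `count_pore_components`, and the connectivity choice are illustrative assumptions and are not taken from the paper.

```python
from collections import deque


def count_pore_components(grid):
    """Count 4-connected components of pore cells (value 1) in a 2D grid."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    components = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != 1 or seen[r][c]:
                continue
            # A new unvisited pore cell starts a new component.
            components += 1
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                i, j = queue.popleft()
                for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ni, nj = i + di, j + dj
                    inside = 0 <= ni < rows and 0 <= nj < cols
                    if inside and grid[ni][nj] == 1 and not seen[ni][nj]:
                        seen[ni][nj] = True
                        queue.append((ni, nj))
    return components


# Hypothetical sample: 1 = pore, 0 = solid.
sample = [
    [1, 1, 0, 0, 1],
    [0, 1, 0, 0, 1],
    [0, 0, 0, 0, 0],
    [1, 0, 1, 1, 1],
]
print(count_pore_components(sample))  # prints 4
```

A persistence-based pipeline tracks how such components (and loops and voids) appear and merge across a filtration, rather than counting them at a single threshold as this sketch does.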
If this is right
- Different combinations of structural, topological, and network features can be compared to identify which structural aspects most influence predicted permeability.
- Topological data analysis features integrate easily into existing machine learning pipelines for this prediction task.
- The trained models can estimate permeability for new porous structures without repeating full flow simulations.
- Results clarify the relative utility of geometric versus connectivity measures for flow properties.
Where Pith is reading between the lines
- The same feature pipeline could be applied to real experimental samples to test whether predictions track physical measurements outside the synthetic training distribution.
- Inverting the trained model might allow design of pore geometries that achieve a target permeability without exhaustive trial simulations.
- The approach could extend to forecasting other transport or mechanical properties in complex media once suitable ground-truth data become available.
Load-bearing premise
Features extracted from synthetic porous media are sufficient for a machine learning model to learn the physical relationships that determine permeability rather than fitting only to patterns in the training examples.
What would settle it
Train the model on one collection of synthetic porous media, then test its permeability predictions against direct measurements made on laboratory samples that share comparable structural statistics.
Original abstract
Flow in porous media is difficult to address using standard analytical or numerical methods due to its complexity. However, since synthetic representations of porous media are easy to produce and data from physical experiments are becoming more widely available, the problem is well-suited to studies that include machine learning (ML) techniques. We discuss a number of features that can be extracted from such data, and their utility as input variables into a standard ML algorithm. These features include structural measures describing the geometry of the porous media, topological measures describing the connectivity, and network measures obtained by modeling the porous media as simplified pore networks. These features enable the prediction of the permeability of the considered (synthetic) porous materials using ML techniques that also leverage the separately computed exact permeability (ground truth). Comparing results obtained using different input variables helps develop a better understanding of the utility of various measures for predicting permeability based on the porous media structure. We show, in particular, that topological data analysis (TDA) provides a useful set of features that can be easily combined with ML to yield meaningful results.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes using topological data analysis (TDA) in combination with machine learning to predict permeability in synthetic porous media. Features are extracted from the media including structural geometry measures, topological descriptors (via TDA such as persistence diagrams and Betti numbers), and network measures from pore-network models; these are fed into standard supervised regressors trained against separately computed exact permeability values as ground truth. The central claim is that TDA features are particularly useful and yield meaningful predictive results when combined with ML.
Significance. If validated, the work could offer a practical data-driven route to permeability estimation for complex porous structures where direct simulation is costly. Credit is due for the reproducible synthetic data generation pipeline and the explicit use of exact permeability as supervised target, which avoids circularity in the prediction task itself. However, the significance is limited by the absence of evidence that TDA features improve generalization or reflect flow physics beyond in-sample correlations on the synthetic ensemble.
major comments (2)
- §4.3 (Results): the reported cross-validated R² and MAE values for models including TDA features are presented without an ablation that removes the TDA descriptors while retaining structural and network features; this omission prevents assessment of whether TDA adds incremental predictive power or merely correlates with the synthetic generation process.
- §5 (Discussion): the claim that the approach 'captures the underlying physics' is not supported by any comparison against direct numerical simulation of Stokes flow on the identical geometries or by testing on experimental porous samples; all metrics remain within the synthetic distribution used for training.
minor comments (2)
- Figure 2: axis labels and color scales for the persistence diagrams are not defined in the caption, making it difficult to interpret the topological features being extracted.
- §3.2: the vectorization procedure that converts persistence diagrams into fixed-length feature vectors for the ML input should be stated explicitly (e.g., which summary statistics or kernel are used).
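As an illustration of what an explicitly stated vectorization could look like, the sketch below maps persistence diagrams to a fixed-length vector of summary statistics per homology dimension: total persistence (the sum of lifetimes), the longest lifetime, and the number of finite points. The particular statistics, the dictionary layout, and the names `lifetimes` and `vectorize` are assumptions made for this example; the paper's actual procedure may differ.

```python
import math


def lifetimes(diagram):
    """Lifetimes (death - birth) of the finite points of one diagram."""
    return [d - b for b, d in diagram if math.isfinite(d)]


def total_persistence(diagram):
    """Total persistence: the sum of all finite lifetimes."""
    return sum(lifetimes(diagram), 0.0)


def vectorize(diagrams_by_dim, max_dim=2):
    """Map {dimension: [(birth, death), ...]} to a fixed-length list.

    For each dimension 0..max_dim the output holds three numbers:
    total persistence, longest lifetime, and count of finite points.
    Missing dimensions contribute zeros, so the length is always
    3 * (max_dim + 1).
    """
    features = []
    for dim in range(max_dim + 1):
        diagram = diagrams_by_dim.get(dim, [])
        lt = lifetimes(diagram)
        features.extend([
            total_persistence(diagram),
            max(lt, default=0.0),
            float(len(lt)),
        ])
    return features


# Hypothetical diagrams; the infinite point in dimension 0 is skipped.
example = {
    0: [(0.0, 1.0), (0.0, 0.5), (0.0, math.inf)],
    1: [(0.25, 1.0)],
}
print(vectorize(example))
```

Simple summaries like these discard where points sit in the diagram; richer alternatives such as persistence landscapes or persistence images retain more of that information at the cost of longer feature vectors.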
Simulated Author's Rebuttal
We thank the referee for their constructive comments on our manuscript. We provide point-by-point responses to the major comments below, indicating the revisions made to address them.
- Referee: §4.3 (Results): the reported cross-validated R² and MAE values for models including TDA features are presented without an ablation that removes the TDA descriptors while retaining structural and network features; this omission prevents assessment of whether TDA adds incremental predictive power or merely correlates with the synthetic generation process.
Authors: We agree with the referee that an ablation study is required to properly assess the contribution of the TDA features. Accordingly, we have performed and included an ablation analysis in the revised Section 4.3. We report the cross-validated R² and MAE for models using structural and network features with and without the TDA descriptors. The results demonstrate that TDA features do provide incremental predictive power on top of the other features.
Revision: yes
- Referee: §5 (Discussion): the claim that the approach 'captures the underlying physics' is not supported by any comparison against direct numerical simulation of Stokes flow on the identical geometries or by testing on experimental porous samples; all metrics remain within the synthetic distribution used for training.
Authors: We thank the referee for this important point. The ground truth permeability is computed using direct numerical simulation, but we accept that the original discussion may have overstated the physical interpretation. In the revised manuscript, we have modified the discussion in Section 5 to avoid claiming that the approach 'captures the underlying physics'. Instead, we note that the topological features are selected for their relevance to pore space connectivity, which is physically linked to permeability. We have also added a statement that validation on experimental samples is left for future work.
Revision: partial
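The ablation the referee asks for can be sketched as follows: run the same cross-validation twice, once with all feature columns and once with the TDA column removed, and compare R² and MAE. Everything here is an assumption for illustration: the data are randomly generated with a target that depends strongly on the hypothetical TDA column, and a k-nearest-neighbour average stands in for the regressor. It is not the paper's model, data, or result.

```python
import random


def knn_predict(train_x, train_y, query, k=5):
    """Average the targets of the k training rows closest to the query."""
    scored = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, query)), y)
        for row, y in zip(train_x, train_y)
    )
    nearest = scored[:k]
    return sum(y for _, y in nearest) / len(nearest)


def cross_validate(x, y, columns, folds=5, k=5):
    """Return (R^2, MAE) from k-fold CV using only the given columns."""
    xs = [[row[c] for c in columns] for row in x]
    preds = [0.0] * len(y)
    for f in range(folds):
        train = [i for i in range(len(y)) if i % folds != f]
        test = [i for i in range(len(y)) if i % folds == f]
        tx = [xs[i] for i in train]
        ty = [y[i] for i in train]
        for i in test:
            preds[i] = knn_predict(tx, ty, xs[i], k)
    mean_y = sum(y) / len(y)
    ss_res = sum((t - p) ** 2 for t, p in zip(y, preds))
    ss_tot = sum((t - mean_y) ** 2 for t in y)
    mae = sum(abs(t - p) for t, p in zip(y, preds)) / len(y)
    return 1.0 - ss_res / ss_tot, mae


# Synthetic stand-in data. Columns: 0 = structural, 1 = network, 2 = TDA.
rng = random.Random(0)
x = [[rng.random(), rng.random(), rng.random()] for _ in range(200)]
y = [r[0] + 0.5 * r[1] + 3.0 * r[2] + rng.gauss(0.0, 0.05) for r in x]

r2_full, mae_full = cross_validate(x, y, columns=[0, 1, 2])
r2_ablated, mae_ablated = cross_validate(x, y, columns=[0, 1])

print(f"all features: R2={r2_full:.3f}, MAE={mae_full:.3f}")
print(f"without TDA:  R2={r2_ablated:.3f}, MAE={mae_ablated:.3f}")
```

Because the synthetic target is built to depend on the TDA column, the ablated run scores worse by construction; on real data the size of that gap is precisely what the ablation is meant to measure.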
Circularity Check
No circularity: supervised ML uses independently computed ground-truth permeability
The paper extracts geometric, topological (including TDA), and network features from synthetic porous media and trains standard regressors to predict permeability, with the target obtained via separate exact computation serving as ground truth. This is a conventional supervised learning pipeline with no self-definitional reduction, no fitted parameter renamed as a prediction, and no load-bearing self-citation chain. The claim that TDA features are useful rests on comparative performance across feature sets rather than any equation that equates the output to the inputs by construction. The setup remains falsifiable against external exact solvers and held-out data.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: Synthetic porous media generated by the authors' procedure are representative enough for ML training to capture permeability trends.
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/AlexanderDuality.lean · alexander_duality_circle_linking · unclear
Relation between the paper passage and the cited Recognition theorem: unclear.
Paper passage: "We show, in particular, that topological data analysis (TDA) provides a useful set of features... Total Persistence (TP), calculated by summing the lifetimes... 0th dimension Total Persistence, 1st dimension Total Persistence, 2nd dimension Total Persistence"
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
Relation between the paper passage and the cited Recognition theorem: unclear.
Paper passage: "the neural network... three dense layers of 128 nodes each... MAPE of approximately 4.70%"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.