arxiv: 2604.15169 · v1 · submitted 2026-04-16 · 💻 cs.LG


Assessing the Potential of Masked Autoencoder Foundation Models in Predicting Downhole Metrics from Surface Drilling Data

Aleksander Berezowski, Hassan Hassanzadeh, Gouri Ginde


Pith reviewed 2026-05-10 11:35 UTC · model grok-4.3


keywords masked autoencoders · drilling data · downhole prediction · surface sensors · self-supervised learning · time series · oil and gas · systematic review


The pith

Masked autoencoder foundation models offer an unexplored path to predict downhole drilling metrics from surface sensor data.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper conducts a systematic review of thirteen studies on predicting downhole metrics using surface drilling data. It finds that existing work relies on supervised neural networks like ANNs and LSTMs, which require scarce labeled downhole measurements. Masked autoencoder foundation models stand out because they pre-train on large amounts of unlabeled data by learning to reconstruct masked segments, allowing them to handle multiple prediction tasks and generalize across wells. A sympathetic reader would care because this approach could make better use of the vast unlabeled surface data collected in oil and gas operations. The review concludes that MAEFMs represent a technically feasible opportunity that has not yet been explored in this domain.

Core claim

This systematic mapping study reviews thirteen papers from 2015 to 2025 and identifies that no research has applied masked autoencoder foundation models to the task of predicting downhole metrics from surface drilling data. The study maps eight common surface metrics and seven target downhole metrics, noting that current methods use architectures such as artificial neural networks and long short-term memory networks. It establishes that masked autoencoder foundation models are technically feasible for drilling analytics due to their self-supervised pre-training on unlabeled data, support for multi-task prediction, and potential for improved generalization across wells.

What carries the argument

Masked Autoencoder Foundation Models (MAEFMs), which use self-supervised pre-training to reconstruct masked portions of time-series data and thereby learn representations useful for downstream prediction tasks.
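
As a rough illustration of the masked-reconstruction objective described above, the following Python sketch hides contiguous patches of a toy one-channel signal and scores a reconstruction only on the hidden steps. The patch length, mask ratio, toy signal, and the mean-fill stand-in for a trained encoder-decoder are all assumptions made for illustration; none of them come from the paper.

```python
import random


def make_mask(n_steps, patch_len, mask_ratio, rng):
    """Return one boolean per timestep; True marks a hidden (masked) step."""
    n_patches = n_steps // patch_len
    n_masked = round(n_patches * mask_ratio)
    masked_patches = set(rng.sample(range(n_patches), n_masked))
    return [(t // patch_len) in masked_patches for t in range(n_patches * patch_len)]


def masked_mse(target, reconstruction, mask):
    """Mean squared error computed over the masked timesteps only."""
    errs = [(y - y_hat) ** 2 for y, y_hat, m in zip(target, reconstruction, mask) if m]
    return sum(errs) / len(errs)


rng = random.Random(0)
signal = [float(t % 5) for t in range(20)]  # toy stand-in for one surface channel
mask = make_mask(len(signal), patch_len=4, mask_ratio=0.6, rng=rng)

# A real model would encode only the visible steps and decode the hidden ones.
# Here a trivial placeholder fills every hidden step with the visible mean.
visible = [x for x, m in zip(signal, mask) if not m]
fill = sum(visible) / len(visible)
reconstruction = [fill if m else x for x, m in zip(signal, mask)]

loss = masked_mse(signal, reconstruction, mask)
print(f"masked steps: {sum(mask)} of {len(mask)}, reconstruction loss: {loss:.3f}")
```

Scoring only the hidden steps is what makes the task self-supervised: the training target is the signal itself, so no downhole labels are needed at this stage.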

If this is right

MAEFMs can leverage abundant unlabeled surface sensor data for pre-training without needing downhole labels.
They enable simultaneous prediction of multiple downhole metrics through multi-task learning.
Improved generalization across different wells may result from the learned representations.
Future work should include empirical comparisons against ANN and LSTM baselines on drilling datasets.
Broader use in oil and gas operations becomes possible if the models prove effective.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Domain-specific fine-tuning or masking strategies might be necessary to adapt MAEFMs to the unique characteristics of drilling sensor noise and sampling rates.
Combining MAEFMs with physics-informed constraints could further improve prediction reliability in safety-critical drilling scenarios.
Similar self-supervised techniques may address label scarcity in other industrial monitoring applications involving time-series sensor data.
Public benchmarks of drilling datasets could accelerate testing of these models by the research community.

Load-bearing premise

The performance advantages of masked autoencoders in other time-series domains will transfer to drilling sensor data without specific adaptations.

What would settle it

Training a masked autoencoder foundation model and a standard LSTM on the same collection of surface drilling data with limited downhole labels, then comparing their prediction errors on unseen wells, would directly test whether the proposed advantages materialize.
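
A minimal sketch of the evaluation protocol this test implies, assuming each sample carries a well identifier: whole wells are held out, and both models are scored by RMSE on the unseen wells only. The record layout, the well names, the numbers, and the two stand-in predictors (`model_a`, `model_b`) are hypothetical; a real comparison would plug in a fine-tuned masked autoencoder and an LSTM.

```python
import math


def split_by_well(records, test_wells):
    """Split so that no well contributes samples to both train and test."""
    train = [r for r in records if r["well"] not in test_wells]
    test = [r for r in records if r["well"] in test_wells]
    return train, test


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))


# Toy records: well id, one surface feature x, one downhole target y (all hypothetical).
records = [
    {"well": "A", "x": 1.0, "y": 2.0},
    {"well": "A", "x": 2.0, "y": 4.0},
    {"well": "B", "x": 3.0, "y": 6.0},
    {"well": "C", "x": 4.0, "y": 9.0},
    {"well": "C", "x": 5.0, "y": 11.0},
]
train, test = split_by_well(records, test_wells={"C"})


# Placeholders for the two trained models under comparison.
def model_a(x):
    return 2.0 * x


def model_b(x):
    return 2.0 * x + 1.0


y_true = [r["y"] for r in test]
scores = {
    "model_a": rmse(y_true, [model_a(r["x"]) for r in test]),
    "model_b": rmse(y_true, [model_b(r["x"]) for r in test]),
}
print(scores)
```

Splitting by well rather than by row matters here: a random row-level split would let samples from the same well appear on both sides and would overstate generalization across wells.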

Figures

Figures reproduced from arXiv: 2604.15169 by Aleksander Berezowski, Gouri Ginde, Hassan Hassanzadeh.

**Figure 2.** Bar chart illustrating the frequency of downhole metrics.

**Figure 3.** Masked autoencoder training architecture [17].

Original abstract

Oil and gas drilling operations generate extensive time-series data from surface sensors, yet accurate real-time prediction of critical downhole metrics remains challenging due to the scarcity of labelled downhole measurements. This systematic mapping study reviews thirteen papers published between 2015 and 2025 to assess the potential of Masked Autoencoder Foundation Models (MAEFMs) for predicting downhole metrics from surface drilling data. The review identifies eight commonly collected surface metrics and seven target downhole metrics. Current approaches predominantly employ neural network architectures such as artificial neural networks (ANNs) and long short-term memory (LSTM) networks, yet no studies have explored MAEFMs despite their demonstrated effectiveness in time-series modeling. MAEFMs offer distinct advantages through self-supervised pre-training on abundant unlabeled data, enabling multi-task prediction and improved generalization across wells. This research establishes that MAEFMs represent a technically feasible but unexplored opportunity for drilling analytics, recommending future empirical validation of their performance against existing models and exploration of their broader applicability in oil and gas operations.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note

private letter to a colleague

This is a straightforward mapping review that correctly flags an empty spot for masked autoencoders in drilling analytics but gives no evidence the approach would survive the domain's noise and physics.


The paper's main contribution is a systematic scan of thirteen studies from 2015-2025 on predicting downhole metrics from surface drilling sensors. It lists eight common surface variables and seven target downhole ones, shows that current work relies on ANNs and LSTMs, and notes that no one has tried masked autoencoder foundation models yet. That gap identification is accurate and saves someone else the initial literature work. The self-supervised pre-training angle is presented as a natural fit for the large amounts of unlabeled surface data that drilling rigs produce, which is a fair observation on its face. The write-up stays focused and avoids overclaiming new methods or results.

Where it weakens is the jump to technical feasibility. The advantages cited come from generic time-series MAE papers, yet drilling streams carry rig-induced noise, irregular sampling, strong physical couplings between variables, and well-to-well distribution shifts that are not addressed. No comparison of data traits or even a high-level sketch of how masking would interact with the listed metrics appears. The abstract also leaves the search protocol, inclusion rules, and quality checks implicit, so the completeness of the gap claim is hard to judge without the full methods section.

This work is useful for applied researchers in petroleum engineering or energy-focused ML who need a quick map of what has been tried and where the white space sits. It will not change deployed models or deliver new predictions, but it can steer the next empirical paper. I would send it to peer review if the methods are documented properly, since a clean gap review in a narrow domain still helps the field even when it stays at the suggestion stage.

Referee Report

2 major / 1 minor

Summary. The manuscript is a systematic mapping study reviewing thirteen papers (2015-2025) on predicting downhole metrics from surface drilling sensor data. It identifies eight commonly collected surface metrics and seven target downhole metrics, observes that existing approaches rely primarily on ANNs and LSTMs, and documents the complete absence of Masked Autoencoder Foundation Models (MAEFMs). The authors argue that MAEFMs constitute a technically feasible but unexplored opportunity because self-supervised pre-training on abundant unlabeled data can enable multi-task prediction and improved generalization across wells, and they recommend future empirical validation.

Significance. If the mapping is comprehensive, the paper usefully documents a clear research gap in the application of self-supervised foundation models to drilling time-series analytics. Credit is given for the systematic identification of the unlabeled-data opportunity and for framing a concrete recommendation for empirical follow-up work. The significance remains preliminary because the feasibility assessment rests on cross-domain analogy without domain-specific grounding.

major comments (2)

[Abstract] Abstract: the claim that 'no studies have explored MAEFMs' is presented without any description of the search strategy, databases, keywords, inclusion/exclusion criteria, or quality assessment used to select and evaluate the thirteen papers; this directly affects the reliability of the identified gap.
[Discussion] Discussion (paragraph on MAEFM advantages): the assertion that MAEFMs are 'technically feasible' relies solely on demonstrated performance in generic time-series domains and does not address drilling-specific data characteristics (variable sampling rates, rig-induced noise, physical constraints, well-specific distributions) or provide even a high-level sketch of how masking would interact with the 8-to-7 metric mapping; this is load-bearing for the central 'feasible but unexplored opportunity' claim.

minor comments (1)

The eight surface and seven downhole metrics are listed in the abstract but would be clearer if summarized in a dedicated table with units and typical sampling characteristics.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive comments on our systematic mapping study. The feedback has prompted us to enhance the transparency of our methodology and to provide additional context on the feasibility of MAEFMs in the drilling domain. We address each major comment below and have revised the manuscript accordingly.


Referee: [Abstract] Abstract: the claim that 'no studies have explored MAEFMs' is presented without any description of the search strategy, databases, keywords, inclusion/exclusion criteria, or quality assessment used to select and evaluate the thirteen papers; this directly affects the reliability of the identified gap.

Authors: We agree that the abstract would benefit from briefly contextualizing the systematic mapping process to support the gap claim. The full manuscript (Section 2) already details the search strategy, including databases (Scopus, Web of Science, IEEE Xplore), keywords combining drilling/sensor/downhole terms, the 2015-2025 timeframe, inclusion criteria limited to peer-reviewed studies on surface-to-downhole prediction using ML, and the screening process that yielded the thirteen papers. We have revised the abstract to include a concise statement: 'This systematic mapping study reviews thirteen papers published between 2015 and 2025...' This addition improves transparency without altering length constraints.

revision: yes

Referee: [Discussion] Discussion (paragraph on MAEFM advantages): the assertion that MAEFMs are 'technically feasible' relies solely on demonstrated performance in generic time-series domains and does not address drilling-specific data characteristics (variable sampling rates, rig-induced noise, physical constraints, well-specific distributions) or provide even a high-level sketch of how masking would interact with the 8-to-7 metric mapping; this is load-bearing for the central 'feasible but unexplored opportunity' claim.

Authors: We acknowledge that the original discussion drew primarily from cross-domain time-series results and did not explicitly address drilling-specific traits or sketch the masking-to-mapping interaction. We have revised the Discussion to include a high-level sketch: masking can be applied selectively to the eight surface metrics (e.g., WOB, RPM, torque) while the decoder reconstructs or predicts the seven downhole targets; variable sampling rates can be handled via interpolation or positional encodings in the transformer backbone; rig noise via robust reconstruction losses; physical constraints via physics-informed regularization in the latent space; and well-specific distributions via domain-adversarial training or fine-tuning. We continue to emphasize that only empirical validation can confirm these adaptations, consistent with the paper's recommendation for future work.

revision: yes
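
Two of the adaptations named in the authors' response above, interpolation for variable sampling rates and a robust reconstruction loss, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, timestamps, and values are hypothetical, linear interpolation is just one possible resampling scheme, and the Huber loss is chosen here as one common robust loss; the response does not name a specific one.

```python
def resample_linear(times, values, grid):
    """Linearly interpolate irregular samples onto a regular, ascending grid.

    Assumes `times` is strictly increasing. Grid points outside the observed
    range take the nearest endpoint value.
    """
    out = []
    j = 0
    for g in grid:
        if g <= times[0]:
            out.append(values[0])
            continue
        if g >= times[-1]:
            out.append(values[-1])
            continue
        # Advance to the interval [times[j], times[j + 1]] that contains g.
        while times[j + 1] < g:
            j += 1
        t0, t1 = times[j], times[j + 1]
        w = (g - t0) / (t1 - t0)
        out.append(values[j] + w * (values[j + 1] - values[j]))
    return out


def huber(residual, delta=1.0):
    """Huber loss: quadratic for small residuals, linear for large ones."""
    a = abs(residual)
    return 0.5 * a * a if a <= delta else delta * (a - 0.5 * delta)


# Hypothetical irregularly sampled surface channel (seconds, arbitrary units).
times = [0.0, 1.0, 2.5, 4.0]
values = [10.0, 12.0, 15.0, 18.0]
grid = [0.0, 1.0, 2.0, 3.0, 4.0]

uniform = resample_linear(times, values, grid)
print(uniform)

# A noise spike is penalised linearly rather than quadratically.
print(huber(0.5), huber(3.0))
```

The linear tail of the Huber loss means a single sensor spike contributes far less to the reconstruction error than it would under squared error, which is the property a "robust reconstruction loss" is meant to provide.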

Circularity Check

0 steps flagged

No circularity: literature review with no derivations or self-referential reductions


This systematic mapping study reviews 13 external papers on surface-to-downhole prediction and notes the absence of MAEFM applications. It contains no equations, fitted parameters, predictions, or derivations that could reduce to its own inputs by construction. The claim of technical feasibility rests on cited results from other time-series domains rather than any self-citation chain or ansatz smuggled from prior author work. No load-bearing step equates to a renaming, self-definition, or fitted-input prediction within the paper itself.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on the domain assumption that MAEFM properties observed in general time-series tasks apply to drilling data, plus the implicit assumption that the selected thirteen papers adequately represent current approaches.

axioms (2)

domain assumption: Masked autoencoders have demonstrated effectiveness in time-series modeling outside drilling
Invoked when stating MAEFMs offer distinct advantages; the paper cites general literature but does not re-derive or test the claim for drilling data.
domain assumption: The thirteen reviewed papers cover the relevant state of the art
The mapping study selects these papers to conclude no MAEFM use; selection criteria are not detailed in the abstract.



Reference graph

Works this paper leans on

35 extracted references · 5 canonical work pages

[1]

AI-driven optimization of drilling performance through torque management using machine learning and differential evolution,

F. S. Boukredera, A. Hadjadj, M. R. Youcefi, and H. Ouadi, “AI-driven optimization of drilling performance through torque management using machine learning and differential evolution,” Processes, vol. 13, no. 5. [Online]. Available: https://www.mdpi.com/2227-9717/13/5/1472
[3]

A novel hybrid transfer learning method for bottom hole pressure prediction,

R. Zhang, X. Song, G. Li, Z. Lv, Z. Zhu, C. Zhang, and C. Gong, “A novel hybrid transfer learning method for bottom hole pressure prediction,” 2023. [3] Drilling Data Based Approach for Equivalent Circulation Density Prediction While Drilling, ser. U.S. Rock Mechanics/Geomechanics Symposium, vol. 57th U.S. Rock Mechanics/Geomechanics Symposium, 06 2023. [O...

work page doi:10.56952/arma-2023- 2023
[4]

Intelligent model for predicting downhole vibrations using surface drilling data during horizontal drilling,

R. Saadeldin, H. Gamal, S. Elkatatny, and A. Abdulraheem, “Intelligent model for predicting downhole vibrations using surface drilling data during horizontal drilling,” Journal of energy resources technology, vol. 144, no. 8, 2022

2022
[5]

Detecting downhole vibrations through drilling horizontal sections: machine learning study,

R. Saadeldin, H. Gamal, and S. Elkatatny, “Detecting downhole vibrations through drilling horizontal sections: machine learning study,” Scientific reports, vol. 13, no. 1, pp. 6204–14, 2023

2023
[6]

A systematic review of data science and machine learning applications to the oil and gas industry,

Z. Tariq, M. S. Aljawad, A. Hasan, M. Murtaza, E. Mohammed, A. El-Husseiny, S. A. Alarifi, M. Mahmoud, and A. Abdulraheem, “A systematic review of data science and machine learning applications to the oil and gas industry,” Journal of Petroleum Exploration and Production Technology, vol. 11, no. 12, pp. 4339–4374, Sep 2021

2021
[7]

Explainable machine-learning-based prediction of equivalent circulating density using surface-based drilling data,

G. Ekechukwu and A. Adejumo, “Explainable machine-learning-based prediction of equivalent circulating density using surface-based drilling data,” Scientific reports, vol. 14, no. 1, pp. 17 780–9, 2024

2024
[8]

Machine learning models for equivalent circulating density prediction from drilling data,

H. Gamal, A. Abdelaal, and S. Elkatatny, “Machine learning models for equivalent circulating density prediction from drilling data,” ACS omega, vol. 6, no. 41, pp. 27 430–27 442, 2021

2021
[9]

New approach to evaluate the equivalent circulating density (ecd) using artificial intelligence techniques,

K. Z. Abdelgawad, M. Elzenary, S. Elkatatny, M. Mahmoud, A. Abdulraheem, and S. Patil, “New approach to evaluate the equivalent circulating density (ecd) using artificial intelligence techniques,” Journal of petroleum exploration and production technology, vol. 9, no. 2, pp. 1569–1578, 2019

2019
[10]

The different member equivalent circulating density prediction model and drilling parameter optimization under narrow density window,

W. Zhao, Z. Yang, T. Wang, Y. Zhou, W. Song, J. Li, and P. Zhai, “The different member equivalent circulating density prediction model and drilling parameter optimization under narrow density window,” Frontiers in earth science (Lausanne), vol. 13, 2025

2025
[11]

Bottom hole pressure prediction based on hybrid neural networks and bayesian optimization,

C.-K. Zhang, R. Zhang, Z.-P. Zhu, X.-Z. Song, Y.-A. Su, G.-S. Li, and L. Han, “Bottom hole pressure prediction based on hybrid neural networks and bayesian optimization,” Petroleum science, vol. 20, no. 6, pp. 3712–3722, 2023

2023
[12]

An online hybrid prediction model for mud pit volume in the complex geological drilling process,

Y. Zhou, X. Chen, E. F. Fukushima, M. Wu, W. Cao, and T. Terano, “An online hybrid prediction model for mud pit volume in the complex geological drilling process,” Control engineering practice, vol. 111, pp. 104 793–, 2021

2021
[13]

Deep learning approach to prediction of drill-bit torque in directional drilling sliding mode: Energy saving,

W. Cao, D. Mei, Y. Guo, and H. Ghorbani, “Deep learning approach to prediction of drill-bit torque in directional drilling sliding mode: Energy saving,” Measurement: journal of the International Measurement Confederation, vol. 250, pp. 117 144–, 2025

2025
[14]

Machine learning-based trigger detection of drilling events based on drilling data,

J. Zhao, Y. Shen, W. Chen, Z. Zhang, and S. Johnston, “Machine learning-based trigger detection of drilling events based on drilling data,” 2017

2017
[15]

Downhole data correction for data-driven rate of penetration prediction modeling,

M. A. Encinas, A. T. Tunkiel, and D. Sui, “Downhole data correction for data-driven rate of penetration prediction modeling,” Journal of petroleum science & engineering, vol. 210, pp. 109 904–, 2022

2022
[16]

Using trees, bagging, and random forests to predict rate of penetration during drilling,

C. Hegde, S. Wallace, and K. Gray, “Using trees, bagging, and random forests to predict rate of penetration during drilling,” 2015

2015
[17]

Masked autoencoders are scalable vision learners,

K. He, X. Chen, S. Xie, Y. Li, P. Dollar, and R. Girshick, “Masked autoencoders are scalable vision learners,” 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Jun 2022

2022
[18]

Learning representations by back-propagating errors,

D. E. Rumelhart, G. E. Hinton, and R. J. Williams, “Learning representations by back-propagating errors,” Nature, vol. 323, no. 6088, pp. 533–536, Oct 1986

1986
[19]

Deep autoencoder neural networks: A comprehensive review and new perspectives,

I. D. Mienye and T. G. Swart, “Deep autoencoder neural networks: A comprehensive review and new perspectives,” Archives of Computational Methods in Engineering, vol. 32, no. 7, pp. 3981–4000, Mar 2025

2025
[20]

Systematic mapping studies in software engineering,

K. Petersen, R. Feldt, S. Mujtaba, and M. Mattsson, “Systematic mapping studies in software engineering,” Proceedings of the 12th International Conference on Evaluation and Assessment in Software Engineering, vol. 17, 06 2008

2008
[21]

Drilling industry glossary

P. Vertex, “Drilling industry glossary.” [Online]. Available: https://www.pvisoftware.com/drilling-glossary
[22]

Flow rate,

IADC85, “Flow rate,” 12 2010. [Online]. Available: https://iadclexicon.org/flow-rate/

2010
[23]

What is standpipe pressure?

Contact, “What is standpipe pressure?” 06 2024. [Online]. Available: https://contactinstruments.com/what-is-standpipe-pressure/

2024
[24]

Energy glossary

Schlumberger, “Energy glossary.” [Online]. Available: https://glossary.slb.com/en
[25]

Bottom hole pressure in drilling,

D. Manual, “Bottom hole pressure in drilling,” 12 2023. [Online]. Available: https://www.drillingmanual.com/bottom-hole-pressure-in-drilling/

2023
[26]

Drilling vibration monitoring and control system,

N. E. T. Laboratory, “Drilling vibration monitoring and control system,” [Online]. Available: https://www.netl.doe.gov/node/3788
[28]

Drilling parameters definitions & optimization,

D. Manual, “Drilling parameters definitions & optimization,” 10
[29]

[Online]. Available: https://www.drillingmanual.com/drilling-parameters-optimization-performance-oil-gas/ [28] Machine Learning–Based Trigger Detection of Drilling Events Based on Drilling Data, ser. SPE Eastern Regional Meeting, vol. SPE Eastern Regional Meeting, 10 2017. [Online]. Available: https://doi.org/10.2118/187512-MS

work page doi:10.2118/187512-ms 2017
[30]

Foundation models,

J. Schneider, C. Meske, and P. Kuss, “Foundation models,” Business & Information Systems Engineering, vol. 66, no. 2, pp. 221–231, Jan 2024

2024
[31]

Multimae: Multi-modal multi-task masked autoencoders,

R. Bachmann, D. Mizrahi, A. Atanov, and A. Zamir, “Multimae: Multi-modal multi-task masked autoencoders,” Lecture Notes in Computer Science, pp. 348–367, 2022

2022
[32]

Parameter-Efficient Transfer Learning for NLP

N. Houlsby, A. Giurgiu, S. Jastrzebski, B. Morrone, Q. de Laroussilhe, A. Gesmundo, M. Attariyan, and S. Gelly, “Parameter-efficient transfer learning for nlp,” 2019. [Online]. Available: https://arxiv.org/abs/1902.00751

work page Pith review arXiv 2019
[33]

The ucr time series classification archive,

H. A. Dau, E. Keogh, K. Kamgar, C.-C. M. Yeh, Y. Zhu, S. Gharghabi, C. A. Ratanamahatana, Yanping, B. Hu, N. Begum, A. Bagnall, A. Mueen, G. Batista, and Hexagon-ML, “The ucr time series classification archive,” October 2018

2018
[34]

Ti-mae: Self-supervised masked time series autoencoders, 2023

Z. Li, Z. Rao, L. Pan, P. Wang, and Z. Xu, “Ti-mae: Self-supervised masked time series autoencoders,” 2023. [Online]. Available: https://arxiv.org/abs/2301.08871

work page arXiv 2023
[35]

Mtsmae: Masked autoencoders for multivariate time-series forecasting,

P. Tang and X. Zhang, “Mtsmae: Masked autoencoders for multivariate time-series forecasting,” 2022. [Online]. Available: https://arxiv.org/abs/2210.02199

work page arXiv 2022