arXiv: 2606.00834 · v1 · pith:GKXW6NLO · submitted 2026-05-30 · stat.AP · cs.AI · cs.LG · math.PR

Hybrid Probabilistic Forecasting of Under-Five Malaria Admissions in Ghana: A Gaussian Process Regression with Holt-Winters Smoothing

Pith reviewed 2026-06-28 17:43 UTC · model grok-4.3

classification: stat.AP · cs.AI · cs.LG · math.PR
keywords: malaria forecasting, Gaussian process regression, Holt-Winters smoothing, hybrid probabilistic model, Ghana district data, under-five admissions, seasonal time series, early warning systems

The pith

A hybrid of Gaussian process regression and Holt-Winters smoothing produces accurate probabilistic forecasts of under-five malaria admissions across Ghana districts.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper sets out to show that pairing Gaussian process regression, which models non-linear patterns and uncertainty, with Holt-Winters exponential smoothing, which handles seasonality and long-range stability, creates forecasts that are more reliable than either component alone for monthly malaria case counts. This matters in settings like Ghana where strong seasonal swings, incomplete reporting, and shifting transmission make standard models unreliable for planning control efforts. The authors apply the hybrid to ten years of district data and demonstrate that it explains nearly all observed variation while keeping most predictions inside stated uncertainty intervals.

Core claim

The hybrid of Gaussian Process Regression and Holt-Winters exponential smoothing achieves an R-squared of 0.9906 on monthly under-five malaria admissions from 2014-2023, compared with 0.8213 for Holt-Winters alone, and places 94.2 percent of residuals inside two-standard-deviation bounds. Its 2024-2028 forecasts put average monthly admissions between approximately 8,000 and 12,200 cases. The spatio-temporal analysis identifies stable relative patterns in northern high-burden districts despite large absolute changes.
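To make the headline statistic concrete, here is a minimal sketch of how a coefficient of determination is usually computed. The function name and the toy numbers are illustrative assumptions, not the authors' code or data, and the paper may aggregate R-squared across validation folds differently.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: what the model fails to explain.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: variation of the observations around their mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed series has zero variance")
    return 1.0 - ss_res / ss_tot


# Hypothetical toy values, not data from the paper.
observed = [100.0, 120.0, 90.0, 110.0]
predicted = [102.0, 118.0, 91.0, 109.0]
print(round(r_squared(observed, predicted), 4))
```

A value near 1 says only that predictions track the observations on the evaluated points; it says nothing about whether the uncertainty intervals are well calibrated, which is why the coverage figure is reported alongside it.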

What carries the argument

The hybrid model that integrates Gaussian Process Regression for non-linear uncertainty quantification with Holt-Winters smoothing to preserve seasonal structure and stabilize long-horizon projections.
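The uncertainty quantification attributed to GPR comes from the Gaussian process posterior. The paper's own Eqs. (9)-(10), mentioned in the Figure 3 caption, are not reproduced on this page, so the block below gives the textbook form as a reference point only. The symbols are assumptions: $K$ is the kernel matrix on training inputs, $K_*$ the train-by-test cross-kernel, $K_{**}$ the kernel matrix on test inputs, $\sigma_n^2$ the observation-noise variance, and $\mathbf{y}$ the (standardised) training targets.

```latex
% Textbook GP regression posterior at test inputs; notation is assumed,
% not taken from the paper.
\begin{align}
  \bar{\mathbf{f}}_* &= K_*^{\top}\,\bigl(K + \sigma_n^{2} I\bigr)^{-1}\,\mathbf{y}, \\
  \operatorname{cov}(\mathbf{f}_*) &= K_{**} - K_*^{\top}\,\bigl(K + \sigma_n^{2} I\bigr)^{-1}\,K_* .
\end{align}
```

Under this form, a pointwise 95% interval for the latent function is the posterior mean plus or minus 1.96 times the square root of the corresponding diagonal entry of the covariance; adding $\sigma_n^2$ to that diagonal gives an interval for new observations instead.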

If this is right

  • District-level probabilistic forecasts can directly feed into Ghana's national malaria control planning for resource allocation.
  • Spatio-temporal heterogeneity shows that high-burden northern districts maintain stable relative rankings even when absolute numbers fluctuate.
  • The framework supplies a scalable probabilistic early-warning tool usable in other endemic sub-Saharan settings with similar data constraints.
  • Long-horizon projections to 2028 allow advance preparation for expected increases in average monthly case loads.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same hybrid structure could be tested on other seasonal infectious diseases where limited historical records are the main constraint.
  • Adding environmental or intervention covariates might further reduce uncertainty, though the paper does not examine that extension.
  • If the rolling-window validation proves robust, the approach offers a template for updating forecasts in real time as new surveillance data arrive.

Load-bearing premise

Rolling-origin expanding-window checks on the 2014-2023 records are sufficient to confirm that the model will generalize under future changes in transmission intensity and reporting completeness.
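A minimal sketch of what a rolling-origin expanding-window scheme looks like in practice may help here. The initial training length, forecast horizon, and step below are assumptions chosen for illustration; the page does not state the values the authors used. Only the series length (120 months for 2014-2023) follows from the source.

```python
def expanding_window_splits(n_obs, initial_train, horizon, step=1):
    """Yield (train_indices, test_indices) for rolling-origin evaluation.

    The training window always starts at index 0 and grows by `step`
    each fold; the test window is the next `horizon` observations.
    """
    if initial_train < 1 or horizon < 1 or step < 1:
        raise ValueError("initial_train, horizon and step must be positive")
    origin = initial_train
    while origin + horizon <= n_obs:
        yield list(range(origin)), list(range(origin, origin + horizon))
        origin += step


# 120 monthly observations; window sizes are hypothetical.
folds = list(expanding_window_splits(n_obs=120, initial_train=60,
                                     horizon=12, step=12))
for train, test in folds:
    print(f"train on 0..{train[-1]}, test on {test[0]}..{test[-1]}")
```

Every test window in such a scheme still lies inside the observed decade, which is the point of the premise above: the folds probe temporal generalization within 2014-2023, not behaviour under conditions the series never contained.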

What would settle it

New monthly admission counts for 2024 or 2025 that fall consistently outside the projected 8,000-to-12,200 range or that produce more than 5.8 percent of residuals beyond two standard deviations would falsify the performance claim.
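The 5.8 percent threshold is simply the complement of the reported coverage (100 - 94.2). A check of this kind could be sketched as follows. It assumes that sigma is a single sample standard deviation of the pooled residuals and that the bounds are centred at zero; the paper may instead use the pointwise GPR predictive standard deviation. The residual values are invented for illustration.

```python
import statistics


def coverage_within_k_sigma(residuals, k=2.0):
    """Fraction of residuals with |r| <= k * (sample SD of the residuals)."""
    sigma = statistics.stdev(residuals)
    inside = sum(1 for r in residuals if abs(r) <= k * sigma)
    return inside / len(residuals)


# Hypothetical residuals (observed minus forecast), not data from the paper.
residuals = [-1.0, 0.5, 0.2, -0.3, 0.1, 4.0, -0.4, 0.3, -0.2, 0.0]
coverage = coverage_within_k_sigma(residuals)
outside_share = 1.0 - coverage
print(f"inside: {coverage:.1%}, outside: {outside_share:.1%}")
print("exceeds the 5.8% threshold:", outside_share > 0.058)
```

With only a handful of new months, a single outlier moves the outside share by several points, so a judgement against the claim would need enough post-2023 observations for the share to be stable.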

Figures

Figures reproduced from arXiv: 2606.00834 by J. Bremang Tandoh, T. Ansah-Narh, Y. Asare Afrane.

Figure 1: Map of Ghana showing regional boundaries and the geospatial distribution of malaria case reporting sites. Red circles represent malaria surveillance locations aggregated at the district level, overlaid on the administrative boundaries of the sixteen regions of Ghana. Regional names are annotated in blue for reference.
Figure 2: Monthly spatial distribution of under-five malaria admissions across Ghana (2014–2023). Each panel represents a cumulative visualisation of reported cases in a given month aggregated over ten years. Marker size and colour intensity are proportional to the number of admissions, highlighting persistent and seasonal hotspots. Tamale emerges as the most consistent high-burden district across the year, particul…
Figure 3 summarises the complete hybrid modelling pipeline, linking standardisation, GPR fitting, posterior computation through Eqs. (9)–(10), multi-step forecasting, and seasonal smoothing via Eq. (13). This schematic highlights how each component contributes to the construction of the final hybrid forecast. The final forecasting results presented in this study were generated using the hybrid GPR–Holt–Winters pat…
Figure 4: Spatial distribution of the coefficient of variation in monthly malaria admissions among children under five. Marker size and colour intensity indicate the magnitude of CV for each district. Higher values reflect greater month-to-month variability in malaria burden.
Figure 5: Comparison of classical time-series forecasting models for monthly under-five malaria admissions in Ghana (2014–2023 observed; 2024–2028 forecast). The panels show forecasts from three baseline models: Linear Regression (left), Holt–Winters exponential smoothing (centre), and SARIMA (right). Black points represent observed admissions, dashed lines denote model forecasts, and shaded regions indicate 95% pre…
Figure 6: Residual diagnostics for the SARIMA model fitted to under-five malaria admissions (2014–2023). Panels display: (i) autocorrelation (ACF) and (ii) partial autocorrelation (PACF) of residuals, (iii) a Q–Q plot evaluating normality, (iv) a residual histogram with kernel density overlay, and (v) temporal evolution of standardised residuals including Augmented Dickey–Fuller (ADF) statistics. These diagnostics a…
Figure 7: Gaussian Process Regression (GPR) with Holt–Winters hybrid smoothing for monthly under-five malaria admissions, 2014–2028. The upper panel shows the complete time series (2014–2023 observed; 2024–2028 forecast). The green line represents the in-sample GPR fit with its associated 95% credible interval, while the blue line and light-blue shading denote the GPR predictive mean and its 95% forecast interval. T…
Figure 8: Residual diagnostics for the Gaussian Process Regression (GPR) model (2014–2023). The upper panels present (left) residuals over time and (right) residuals versus fitted values, assessing independence and homoscedasticity. The lower panels show (left) the residual distribution with a fitted normal density curve and (right) the Q–Q plot comparing empirical and theoretical quantiles. Residuals remain centred…
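The coefficient of variation mapped in Figure 4 is the standard deviation of a district's monthly admissions divided by their mean. A minimal sketch follows, assuming the sample (n - 1) standard deviation; the page does not say which convention the authors used, and the district values are invented.

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (unitless)."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for zero mean")
    return statistics.stdev(values) / mean


# Hypothetical monthly admissions for one district, not data from the paper.
monthly_admissions = [10.0, 20.0, 30.0]
print(coefficient_of_variation(monthly_admissions))
```

Because the measure is scale-free, a small district with erratic counts can show a higher CV than a high-burden district with large but regular seasonal swings.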
Original abstract

Accurate malaria forecasting remains a major challenge in sub-Saharan Africa, where strong seasonality, reporting uncertainty, and non-stationary transmission dynamics reduce the reliability of conventional models. In Ghana, district-level malaria surveillance requires forecasting frameworks that are probabilistically rigorous and robust under limited data. This study proposes a hybrid framework integrating Gaussian Process Regression (GPR) with Holt-Winters exponential smoothing for modelling monthly under-five malaria admissions. GPR captures non-linear behaviour and predictive uncertainty, while Holt-Winters stabilises long-horizon forecasts and preserves seasonal structure. Using ten years of district-level data (2014-2023), performance was evaluated via rolling-origin expanding-window validation. The hybrid model achieved $R^2 = 0.9906$ versus $0.8213$ for Holt-Winters alone, with $94.2\%$ of residuals within $\pm 2\sigma$ bounds. Forecasts for 2024-2028 project average monthly admissions from approximately 8{,}000 to 12{,}200 cases. Spatio-temporal analysis revealed pronounced ecological heterogeneity: northern high-burden districts exhibited stable relative patterns despite large absolute fluctuations. The framework provides a scalable probabilistic approach for malaria early warning and operational planning in endemic settings, supporting Ghana's national malaria control strategy.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript proposes a hybrid of Gaussian Process Regression (GPR) and Holt-Winters exponential smoothing for probabilistic forecasting of monthly under-five malaria admissions in Ghana. Using 2014-2023 district-level data and rolling-origin expanding-window validation, it reports R² = 0.9906 (vs. 0.8213 for Holt-Winters alone) with 94.2% of residuals within ±2σ bounds, and projects 2024-2028 average monthly admissions ranging from approximately 8,000 to 12,200 cases while highlighting ecological heterogeneity in northern districts.

Significance. If the performance and coverage claims are robust, the hybrid framework could provide a practical probabilistic tool for malaria surveillance and early warning in endemic settings with strong seasonality and limited data, supporting operational planning under Ghana's national strategy. The integration of GPR for uncertainty quantification with Holt-Winters for seasonal stability addresses a relevant gap in applied forecasting for non-stationary epidemiological time series.

major comments (2)
  1. [Abstract and Methods (validation)] Abstract and validation description: The rolling-origin expanding-window scheme applied within the 2014-2023 window cannot test generalization under the non-stationary transmission dynamics and reporting uncertainty explicitly flagged in the abstract; this directly affects the reliability of the 2024-2028 horizon projections and the 94.2% ±2σ coverage claim, as no stress tests for regime shifts (policy, climate, or surveillance changes) are described.
  2. [Abstract and Methods] Abstract and Methods: No information is provided on GPR kernel family, hyperparameter selection procedure, or missing-data handling, all of which are load-bearing for reproducing and interpreting the headline R² = 0.9906 and residual coverage statistics.
minor comments (1)
  1. [Abstract] Abstract: The notation '8{,}000' appears to be an unrendered LaTeX artifact and should be corrected to 8,000 for readability.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for these constructive comments, which highlight important aspects of validation scope and reproducibility. We respond to each major comment below.

Point-by-point responses
  1. Referee: [Abstract and Methods (validation)] Abstract and validation description: The rolling-origin expanding-window scheme applied within the 2014-2023 window cannot test generalization under the non-stationary transmission dynamics and reporting uncertainty explicitly flagged in the abstract; this directly affects the reliability of the 2024-2028 horizon projections and the 94.2% ±2σ coverage claim, as no stress tests for regime shifts (policy, climate, or surveillance changes) are described.

    Authors: We agree that the rolling-origin expanding-window validation remains internal to the 2014-2023 data and does not incorporate explicit stress tests for regime shifts. While this scheme successively evaluates performance on held-out later periods and thereby tests temporal generalization within the observed series, it cannot address external shocks. We will add a limitations paragraph in the Discussion section that explicitly notes this scope restriction and its implications for the long-horizon forecasts and coverage statistics, and we will suggest that post-2023 data be used for external validation when available. revision: partial

  2. Referee: [Abstract and Methods] Abstract and Methods: No information is provided on GPR kernel family, hyperparameter selection procedure, or missing-data handling, all of which are load-bearing for reproducing and interpreting the headline R² = 0.9906 and residual coverage statistics.

    Authors: The referee correctly identifies that these implementation details are absent from the current manuscript. We will expand the Methods section in the revision to specify the GPR kernel family, the hyperparameter selection procedure, and the missing-data handling approach. revision: yes

Circularity Check

0 steps flagged

No significant circularity in derivation or evaluation chain

full rationale

The paper fits a hybrid GPR + Holt-Winters model to the 2014-2023 district-level series and reports empirical performance via rolling-origin expanding-window validation on held-out segments within that interval. The quoted metrics (R² = 0.9906, 94.2% of residuals within ±2σ) are computed on those out-of-sample folds and are not algebraically forced by the parameter values themselves. The 2024-2028 projections are forward extrapolations from the fitted model; no self-definitional equations, fitted-input-renamed-as-prediction steps, or load-bearing self-citations appear in the provided text that would collapse the claimed results back to the inputs by construction. The validation scheme therefore supplies independent content relative to the model specification.

Axiom & Free-Parameter Ledger

2 free parameters · 2 axioms · 0 invented entities

Abstract-only review; free parameters and assumptions are inferred from the described methods rather than extracted from full text.

free parameters (2)
  • GPR kernel hyperparameters
    Standard GPR fitting to capture non-linear patterns in the time series.
  • Holt-Winters level, trend, and seasonal smoothing parameters
    Fitted to stabilize long-horizon seasonal forecasts.
axioms (2)
  • domain assumption: The malaria admission time series can be adequately modeled as a Gaussian process plus seasonal component
    Core modeling choice stated in the hybrid framework description.
  • domain assumption: Ten years of district-level data suffice to learn parameters that generalize beyond the training window
    Implicit in the use of 2014-2023 data for both fitting and rolling validation.
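The Holt-Winters level, trend, and seasonal smoothing parameters in the ledger can be made concrete with a small sketch of the additive recursion. Everything specific here is an assumption: the additive (rather than multiplicative) form, the initialisation from the first two seasons, the function name, and the parameter values. The page does not say which variant the authors fitted or how they combined it with the GPR output.

```python
def holt_winters_additive(y, season_length, alpha, beta, gamma, horizon):
    """Additive Holt-Winters; returns `horizon` forecasts past the end of y.

    alpha, beta, gamma are the level, trend and seasonal smoothing weights.
    """
    m = season_length
    if len(y) < 2 * m:
        raise ValueError("need at least two full seasons to initialise")
    # Simple initialisation from the first two seasons (an assumption).
    level = sum(y[:m]) / m
    trend = (sum(y[m:2 * m]) - sum(y[:m])) / (m * m)
    seasonal = [y[i] - level for i in range(m)]
    for t in range(m, len(y)):
        s = seasonal[t % m]
        prev_level = level
        level = alpha * (y[t] - s) + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        seasonal[t % m] = gamma * (y[t] - level) + (1 - gamma) * s
    n = len(y)
    # h-step forecast: extrapolated level plus the matching seasonal term.
    return [level + h * trend + seasonal[(n + h - 1) % m]
            for h in range(1, horizon + 1)]


# Hypothetical series with a period of 4, not data from the paper.
series = [10.0, 20.0, 30.0, 20.0] * 3
forecast = holt_winters_additive(series, season_length=4,
                                 alpha=0.3, beta=0.1, gamma=0.2, horizon=4)
print([round(v, 3) for v in forecast])
```

For monthly admissions the season length would be 12. In practice the three weights are chosen by minimising a one-step-ahead error, which is what makes them free parameters in the sense of the ledger.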


