A joint meta-analysis framework for the accuracy of two diagnostic tests accounting for varying study designs
Pith reviewed 2026-06-29 02:50 UTC · model grok-4.3
The pith
A Bayesian hierarchical model enables joint meta-analysis of the accuracy of two diagnostic tests while accounting for conditional dependence between them.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We propose a Bayesian hierarchical model for joint meta-analysis of the accuracy of two binary tests, modelling conditional dependence through study-specific log-odds ratios. The model accommodates studies that do not report joint classification data. We show how the model extends to accommodate data from varied study designs, including studies without a gold standard and studies with partial verification, without assuming imperfect reference standards are error-free.
What carries the argument
Study-specific log-odds ratios within a Bayesian hierarchical model; these capture conditional dependence between tests and support extension to incomplete data from varied designs.
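As a rough illustration of what a log-odds-ratio parametrization of dependence involves, the sketch below rebuilds the four joint cell probabilities for two binary tests within one disease class from the two marginal positivity probabilities and a log-odds ratio. This is a generic Plackett-style construction, not necessarily the one used in the paper; the function name `joint_cells` and the numeric inputs are hypothetical.

```python
import math

def joint_cells(p1, p2, log_or):
    """Joint cell probabilities (p11, p10, p01, p00) for two binary tests.

    p1, p2 : marginal probabilities of a positive result on test 1 and test 2
             within one disease class (e.g. the two sensitivities)
    log_or : log-odds ratio between the two results; 0 means independence

    Hypothetical helper for illustration only.
    """
    psi = math.exp(log_or)
    if abs(psi - 1.0) < 1e-12:
        # Independence: the joint probability is the product of the marginals.
        p11 = p1 * p2
    else:
        # Solve (psi - 1) p11^2 - s p11 + psi p1 p2 = 0 and take the root
        # that keeps every cell probability inside [0, 1].
        s = 1.0 + (psi - 1.0) * (p1 + p2)
        disc = s * s - 4.0 * psi * (psi - 1.0) * p1 * p2
        p11 = (s - math.sqrt(disc)) / (2.0 * (psi - 1.0))
    p10 = p1 - p11              # test 1 positive, test 2 negative
    p01 = p2 - p11              # test 1 negative, test 2 positive
    p00 = 1.0 - p1 - p2 + p11   # both negative
    return p11, p10, p01, p00

# Illustrative values: sensitivities 0.8 and 0.7, log-odds ratio 1.5.
cells_dependent = joint_cells(0.8, 0.7, 1.5)
cells_independent = joint_cells(0.8, 0.7, 0.0)
```

One appeal of this kind of construction is that any real-valued log-odds ratio yields a valid probability table for any pair of marginals. That is one plausible reason a log-odds-ratio scale could be numerically better behaved than a covariance scale, whose admissible range depends on the marginals.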
Load-bearing premise
The parametrization of conditional dependence with study-specific log-odds ratios maintains stability and supports valid inference for extended study designs.
What would settle it
Running the model on simulated data with known conditional dependence and verifying that its joint accuracy estimates match the generating parameters more closely than those from models that assume conditional independence.
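The data-generating half of such a check can be sketched in a few lines. The example below does not fit the paper's Bayesian model. It simulates paired results among diseased participants with a known log-odds ratio, then compares the directly observed sensitivity of an "either test positive" rule with the value obtained by combining the two marginal sensitivities as if the tests were independent. The combination rule, function names, sample size, and parameter values are all illustrative assumptions.

```python
import math
import random

def p_both_positive(p1, p2, log_or):
    """P(both tests positive) given the marginals and a log-odds ratio."""
    psi = math.exp(log_or)
    if abs(psi - 1.0) < 1e-12:
        return p1 * p2
    s = 1.0 + (psi - 1.0) * (p1 + p2)
    disc = s * s - 4.0 * psi * (psi - 1.0) * p1 * p2
    return (s - math.sqrt(disc)) / (2.0 * (psi - 1.0))

def simulate_either_positive(n, se1, se2, log_or, seed=1):
    """Simulate n diseased participants; return two sensitivity estimates."""
    rng = random.Random(seed)
    p11 = p_both_positive(se1, se2, log_or)
    both = pos1 = pos2 = 0
    for _ in range(n):
        u = rng.random()
        if u < p11:                    # both positive
            both += 1
            pos1 += 1
            pos2 += 1
        elif u < se1:                  # test 1 only, probability se1 - p11
            pos1 += 1
        elif u < se1 + se2 - p11:      # test 2 only, probability se2 - p11
            pos2 += 1
    se1_hat, se2_hat = pos1 / n, pos2 / n
    observed = (pos1 + pos2 - both) / n                     # uses joint data
    assuming_indep = 1.0 - (1.0 - se1_hat) * (1.0 - se2_hat)
    return observed, assuming_indep

se1, se2, log_or = 0.8, 0.7, 1.5
truth = se1 + se2 - p_both_positive(se1, se2, log_or)
observed, assuming_indep = simulate_either_positive(20000, se1, se2, log_or)
print(f"truth={truth:.3f} observed={observed:.3f} independence={assuming_indep:.3f}")
```

With these values the independence calculation targets 1 − 0.2 × 0.3 = 0.94 whatever the true dependence, so with positive dependence it overstates the sensitivity of the either-positive rule. A full check of the paper's claim would replace the direct estimate with posterior summaries from the fitted hierarchical model.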
Original abstract
Meta-analyses of the accuracy of two diagnostic tests typically assume tests are independent conditional on true disease status. This assumption is often unrealistic and violation leads to biased estimates of the accuracy of tests used in combination. Existing models accounting for conditional dependence require 'joint classification' data (results for both tests and the 'gold standard' on all participants) from all studies and/or suffer from computational instability. We propose a Bayesian hierarchical model for joint meta-analysis of the accuracy of two binary tests, modelling conditional dependence through study-specific log-odds ratios. The model accommodates studies that do not report joint classification data. We show how the model extends to accommodate data from varied study designs, including studies without a gold standard and studies with partial verification, without assuming imperfect reference standards are error-free. We demonstrate the framework with two example meta-analyses. Our modelling framework retains key features of standard diagnostic test accuracy meta-analysis methods, while allowing for conditional dependence. Ignoring conditional dependence yields biased joint accuracy estimates when conditional dependence is substantial. Our parametrisation maintains computational stability and accommodates data from varied study designs, without requiring an initial data imputation step or assuming error-free reference standards in all studies.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes a Bayesian hierarchical model for the joint meta-analysis of the accuracy of two binary diagnostic tests. Conditional dependence between tests is parametrized via study-specific log-odds ratios. The model accommodates studies that do not provide joint classification data and extends to varied designs including those without a gold standard and those with partial verification, without requiring the assumption that imperfect reference standards are error-free. The framework is illustrated on two example meta-analyses; the authors claim that the chosen parametrization maintains computational stability and avoids the need for initial data imputation.
Significance. If the model is shown to be identifiable and stable under the claimed extensions, the work would provide a practically useful advance for diagnostic-test meta-analysis by relaxing the conditional-independence assumption while retaining the ability to handle heterogeneous study designs. The hierarchical borrowing of strength across studies and the avoidance of data-imputation steps are potentially valuable features. The central claim that valid posterior inference is obtained for studies lacking a gold standard rests on the log-OR dependence parameters remaining non-redundant under the latent-class likelihood.
major comments (1)
- [model extension to no-gold-standard studies] In the section describing the extension to studies without a gold standard (and similarly for partial-verification designs), the 2×2 cross-classification of the two test results, which is all that is observed in the absence of any reference standard, carries only three degrees of freedom and is already over-parametrized by the four test-accuracy parameters plus prevalence. The additional study-specific log-OR dependence parameter is therefore at best weakly identified from that study's own data; its identifiability relies on the hierarchical prior informed by gold-standard studies. The manuscript does not appear to contain simulation studies or analytic results quantifying the degree of borrowing required for stable posterior inference on the dependence parameters in this setting.
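The counting argument behind this comment can be written out as a short sketch. The notation is assumed here rather than taken from the paper: \pi is prevalence, Se_k and Sp_k are the sensitivity and specificity of test k, and D is the unobserved true disease status.

```latex
\begin{align*}
P(T_1 = i,\, T_2 = j)
  &= \pi \, P(T_1 = i,\, T_2 = j \mid D = 1)
   + (1 - \pi) \, P(T_1 = i,\, T_2 = j \mid D = 0),
   \qquad i, j \in \{0, 1\}, \\
\text{free observed cell probabilities} &= 4 - 1 = 3, \\
\text{parameters under conditional independence}
  &= \bigl|\{\pi,\, Se_1,\, Se_2,\, Sp_1,\, Sp_2\}\bigr| = 5 .
\end{align*}
```

Every dependence parameter added to the two conditional distributions increases the parameter count while the number of free observed cell probabilities stays at three. A single study of this type therefore cannot pin down the parameters on its own, and any information about them has to come from the hierarchical structure and the priors.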
minor comments (2)
- Notation for the study-specific log-odds-ratio parameters and their hierarchical distribution should be introduced with an explicit equation number and linked to the likelihood contribution for each study design.
- The two example meta-analyses would benefit from a brief sensitivity analysis varying the hyperparameters of the hierarchical prior on the log-OR parameters to illustrate robustness.
Simulated Author's Rebuttal
We thank the referee for their constructive comments on our manuscript. The major comment correctly identifies a gap in the current version regarding quantification of identifiability for the dependence parameters in no-gold-standard studies. We address this below and will revise the manuscript accordingly.
Point-by-point responses
- Referee: [model extension to no-gold-standard studies] In the section describing the extension to studies without a gold standard (and similarly for partial-verification designs), the 2×2 cross-classification of the two test results, which is all that is observed in the absence of any reference standard, carries only three degrees of freedom and is already over-parametrized by the four test-accuracy parameters plus prevalence. The additional study-specific log-OR dependence parameter is therefore at best weakly identified from that study's own data; its identifiability relies on the hierarchical prior informed by gold-standard studies. The manuscript does not appear to contain simulation studies or analytic results quantifying the degree of borrowing required for stable posterior inference on the dependence parameters in this setting.
Authors: We agree that the manuscript does not currently include simulation studies or analytic results that quantify the degree of borrowing required from gold-standard studies for stable posterior inference on the study-specific log-OR parameters in no-gold-standard or partial-verification designs. The hierarchical structure is designed to enable such borrowing, and the two empirical examples demonstrate stable inference, but this does not substitute for targeted simulations. We will add a new simulation study in the revision that varies the number and proportion of gold-standard studies, examines coverage and precision of the dependence parameters, and assesses overall model stability under the latent-class likelihood.
Revision: yes
Circularity Check
No circularity: new parametrization proposed as modeling advance
Full rationale
The paper presents a novel Bayesian hierarchical model that parametrizes conditional dependence via study-specific log-odds ratios and claims this supports extensions to studies without gold standards or with partial verification. No equations, derivations, or claims in the abstract or described framework reduce any central quantity (such as the dependence parameters or joint accuracy estimates) to a fitted input, self-citation, or ansatz by construction. The proposal is self-contained as a modeling framework whose validity rests on the likelihood and prior structure rather than re-expressing prior results.
Axiom & Free-Parameter Ledger
free parameters (1)
- study-specific log-odds ratios for conditional dependence
axioms (2)
- domain assumption: Conditional dependence between two binary tests can be adequately captured by a study-specific log-odds ratio parameter
- domain assumption: Bayesian hierarchical priors can be specified such that the model remains computationally stable across study designs