A Review of Statistical Methods for Spontaneous Reporting System Data Mining: Signal Detection and Beyond
Pith reviewed 2026-05-10 02:43 UTC · model grok-4.3
The pith
Contemporary statistical methods for spontaneous reporting system (SRS) data support both binary signal detection and estimation of signal strength with quantified uncertainty in drug safety assessment.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper claims that a review of contemporary SRS data mining methods and their statistical underpinnings, paired with explicit guidance on constructing contingency tables from aggregated AE-drug counts, supplies a usable foundation for safety assessment across major pharmacovigilance databases.
What carries the argument
Statistical signal detection methods (including disproportionality analyses) together with the preprocessing step of building SRS contingency tables from publicly available aggregated counts.
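The preprocessing step named above can be illustrated with a minimal Python sketch. It assumes four aggregated counts are available for a drug-AE pair (pair count, drug total, AE total, grand total) and that all four count the same unit, such as reports. The function names and the example numbers are hypothetical and are not taken from the paper; the proportional reporting ratio (PRR) and reporting odds ratio (ROR) are used here only as familiar examples of disproportionality measures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Table2x2:
    a: int  # reports mentioning both the drug and the AE
    b: int  # reports mentioning the drug but not the AE
    c: int  # reports mentioning the AE but not the drug
    d: int  # reports mentioning neither


def table_from_aggregates(n_pair: int, n_drug: int, n_ae: int, n_total: int) -> Table2x2:
    """Build a 2x2 table from aggregated counts (hypothetical helper).

    Assumes n_drug, n_ae and n_total are marginal totals measured in the
    same unit as n_pair; real databases may need dataset-specific checks.
    """
    b = n_drug - n_pair
    c = n_ae - n_pair
    d = n_total - n_drug - n_ae + n_pair
    if min(n_pair, b, c, d) < 0:
        raise ValueError("aggregated counts are mutually inconsistent")
    return Table2x2(n_pair, b, c, d)


def prr(t: Table2x2) -> float:
    """Proportional reporting ratio: AE share with the drug vs. without it."""
    return (t.a / (t.a + t.b)) / (t.c / (t.c + t.d))


def ror(t: Table2x2) -> float:
    """Reporting odds ratio: cross-product ratio of the 2x2 table."""
    return (t.a * t.d) / (t.b * t.c)


# Made-up counts, for illustration only.
example = table_from_aggregates(n_pair=20, n_drug=200, n_ae=500, n_total=10_000)
example_prr = prr(example)
example_ror = ror(example)
```

The consistency check matters in practice: if the marginal totals and the pair count come from different extraction dates or count different units, one of the derived cells can become negative, which the sketch reports as an error rather than silently continuing.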
Load-bearing premise
The selected contemporary methods and preprocessing steps using aggregated counts adequately represent current best practice and can be applied directly without further validation or dataset-specific adjustments.
What would settle it
An analysis of a confirmed drug-adverse event pair that produces materially weaker or stronger signals when the recommended preprocessing steps are omitted.
Original abstract
Postmarketing safety surveillance relies on data from spontaneous reporting systems (SRS) such as FAERS, EudraVigilance and VigiBase, and commonly uses SRS data mining methods to assess the associations between drugs and adverse events (AEs). Traditionally, these analyses have focused on signal detection framed as a binary decision problem, whereas more recent work has emphasized more nuanced inference involving signal strength estimation and uncertainty quantification. In this paper, we review contemporary SRS data mining approaches and their statistical underpinnings for safety assessment using data from major pharmacovigilance databases worldwide. In addition to methodological review, we provide practical guidance on data preprocessing for such analysis, including construction of SRS contingency tables using only aggregated AE-drug counts, as are publicly available from databases such as VigiBase and EudraVigilance. We illustrate the guidance via opioid-related datasets obtained from FAERS and VigiBase, complied with subsequent downstream SRS data analyses.
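The contrast the abstract draws between a binary signal decision and signal strength estimation with uncertainty can be sketched in Python as follows. This is a minimal illustration, not the paper's method: it uses the reporting odds ratio with a Wald-type 95% interval on the log scale, and flags a signal when the lower bound exceeds 1, which is one common convention. The function name and the cell counts are hypothetical.

```python
import math


def ror_with_interval(a: int, b: int, c: int, d: int, z: float = 1.959964):
    """Return (ROR, lower, upper) for a 2x2 table with cells a, b, c, d.

    Uses the standard large-sample standard error of log(ROR); all four
    cells must be positive. z defaults to the normal quantile for 95%.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for a Wald interval")
    log_ror = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return (
        math.exp(log_ror),
        math.exp(log_ror - z * se),
        math.exp(log_ror + z * se),
    )


# Made-up cell counts, for illustration only.
estimate, lower, upper = ror_with_interval(a=20, b=180, c=480, d=9320)

# Binary view: a single yes/no flag (assumed threshold: lower bound above 1).
is_signal = lower > 1.0

# Strength-with-uncertainty view: report the estimate together with its interval.
summary = f"ROR = {estimate:.2f} (95% CI {lower:.2f} to {upper:.2f})"
```

The binary flag discards how far the interval sits from 1 and how wide it is, whereas the summary string keeps both, which is the kind of additional information the more recent approaches aim to provide.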
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript reviews contemporary statistical methods for mining spontaneous reporting system (SRS) data from databases such as FAERS, EudraVigilance, and VigiBase, covering traditional signal detection framed as binary decisions as well as more recent approaches to signal strength estimation and uncertainty quantification. It also supplies practical guidance on preprocessing steps to construct contingency tables from publicly available aggregated AE-drug counts and illustrates the guidance with opioid-related datasets from FAERS and VigiBase.
Significance. If the summaries of methods are accurate and the preprocessing guidance is internally consistent with the stated scope of public aggregated tables, the paper would serve as a useful reference for pharmacovigilance researchers seeking to move beyond binary signal detection toward nuanced inference while working with readily accessible data sources.
minor comments (1)
- [Abstract] Final sentence: the word 'complied' is almost certainly a typographical error and should read 'combined' to make the intended meaning clear.
Simulated Author's Rebuttal
We thank the referee for their positive summary of our manuscript, accurate characterization of its scope, and recommendation for minor revision. The referee's assessment aligns well with our intent to provide both a methodological review and practical preprocessing guidance for SRS data mining.
Circularity Check
No significant circularity in this review paper
This is a review paper summarizing existing SRS data mining methods from external literature and offering practical preprocessing guidance for aggregated counts from public databases. No new derivations, predictions, or equations are introduced that could reduce to the paper's own inputs by construction. The claims are descriptive and illustrative (e.g., opioid example as demonstration, not proof), with methods attributed to cited sources rather than self-referential fits or definitions. Any self-citations are incidental and non-load-bearing for novel results.