pith. machine review for the scientific record.

arxiv: 2604.07368 · v1 · submitted 2026-04-06 · 🧬 q-bio.QM

Time-Varying Environmental and Polygenic Predictors of Substance Use Initiation in Youth: A Survival and Causal Modeling Study in the ABCD Cohort

Mengman Wei, Qian Peng

Pith reviewed 2026-05-10 18:51 UTC · model grok-4.3

classification 🧬 q-bio.QM
keywords substance use initiation · polygenic risk scores · time-varying covariates · adolescent substance use · survival analysis · marginal structural models · environmental predictors · longitudinal cohort

The pith

Time-varying factors like impulsivity and parental monitoring, combined with genetic risk, predict earlier substance use initiation in adolescents.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper examines how changing environmental conditions and inherited genetic risks together influence when adolescents first try alcohol, nicotine, cannabis, or other substances. It draws on repeated measurements from a large cohort tracked over four years and applies survival models to isolate the strongest predictors while accounting for shifting influences over time. Early substance use raises the chance of later addiction and related problems, so clarifying which factors can be modified offers practical leads for prevention programs. The analysis shows that impulsivity and parental monitoring remain linked to timing even after including genetic scores and many other variables, with nicotine-related genetic risk emerging as particularly consistent.

Core claim

Integrating repeated assessments of family, school, neighborhood, behavioral, and health factors with polygenic risk scores in time-to-event models identifies robust predictors of substance initiation. Multivariable time-varying Cox models narrow the set to impulsivity traits, parental monitoring, and select lifestyle elements, while nicotine genetic risk shows the strongest and most stable association. Marginal structural models indicate that higher parental monitoring lowers risk whereas greater impulsivity and caffeine exposure raise it.

What carries the argument

Time-varying Cox proportional hazards models paired with marginal structural modeling to handle dynamic confounding in longitudinal repeated measures.
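
For reference, these two model families are usually written in the textbook forms below. This is a sketch only: the notation (X_i(t) for time-varying covariates, A_ik for exposure at visit k, L_ik for time-varying confounders, V_i for baseline covariates) is assumed for illustration, and the abstract does not give the paper's exact specification.

```latex
% Time-varying Cox model: hazard of first use for adolescent i at time t,
% with baseline hazard h_0(t) and covariates X_i(t) updated at each visit.
h_i(t) = h_0(t)\,\exp\!\big(\beta^{\top} X_i(t)\big)

% Stabilized inverse-probability-of-treatment weight accumulated up to visit t.
% Numerator: exposure model given past exposure and baseline covariates only.
% Denominator: exposure model that also conditions on time-varying confounder history.
SW_i(t) = \prod_{k=0}^{t}
  \frac{f\big(A_{ik} \mid \bar{A}_{i,k-1},\, V_i\big)}
       {f\big(A_{ik} \mid \bar{A}_{i,k-1},\, \bar{L}_{ik}\big)}
```

Fitting the outcome model in the weighted sample is what lets a marginal structural model adjust for confounders that are themselves affected by earlier exposure, provided the weight models are correctly specified.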

If this is right

  • Impulsivity traits remain tied to earlier initiation across multiple substances after adjustment.
  • Higher parental monitoring consistently shows a protective association with delayed first use.
  • Polygenic scores for nicotine use add independent predictive value beyond measured environmental factors.
  • Caffeine exposure and certain health indicators emerge as additional risk markers in the adjusted models.
  • The identified factors point to concrete targets such as monitoring and behavioral regulation for prevention efforts.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The modeling strategy could be applied to forecast progression from initiation to substance use disorders rather than stopping at first use.
  • School or family programs aimed at impulsivity or monitoring levels might be tested for their ability to shift initiation timing in similar populations.
  • Interactions between specific genetic scores and changing environmental conditions over longer periods remain open for direct examination.
  • Repeated-measures designs like this one could improve prediction in other adolescent behavioral outcomes where both genes and shifting contexts matter.

Load-bearing premise

That the models have accounted for all important time-varying confounders, and that findings from this youth cohort apply more broadly without major unmeasured biases or selection effects.

What would settle it

A randomized intervention that raises parental monitoring and lowers impulsivity but produces no change in substance initiation rates among comparable adolescents would undermine the causal claims.

Figures

Figures reproduced from arXiv: 2604.07368 by Mengman Wei, Qian Peng.

Figure 1. Domain-level counts of significant predictors for alcohol initiation from univariate time-varying Cox models (FDR and Bonferroni thresholds).
Figure 2. Domain-level counts of significant predictors for any substance initiation from univariate time-varying Cox models (FDR and Bonferroni thresholds).
Figure 3. Domain-level counts of significant predictors for nicotine initiation from univariate time-varying Cox models (FDR and Bonferroni thresholds).
Figure 4. Domain-level counts of significant predictors for cannabis initiation from univariate time-varying Cox models (FDR and Bonferroni thresholds).
Figure 5. Top causal effects estimated using MSM–IPTW models across substance use initiation outcomes (FDR q < 0.05). Points represent estimated odds ratios with 95% confidence intervals. Panel (A): alcohol initiation; panel (B): any substance initiation; panel (C): cannabis initiation; panel (D): nicotine initiation.
Author Contributions: Mengman Wei conceived the study, designed the analytical framework, performed …
Original abstract

Early initiation of alcohol, nicotine, cannabis, and other substances predicts later substance use disorders and related psychopathology. We integrate time-varying environmental factors with polygenic risk scores (PRS) in a longitudinal framework to identify determinants of substance initiation in adolescence. Using data from the Adolescent Brain Cognitive Development (ABCD) Study with repeated assessments over approximately four years, we defined time-to-event outcomes for first use of alcohol, nicotine, cannabis, and any substance. We constructed high-dimensional panels of time-varying environmental covariates across family, school, neighborhood, behavioral, and health domains, alongside time-invariant covariates and PRS for alcohol, cannabis, nicotine, and general substance use disorders. Time-varying Cox models with clustered standard errors were applied. Univariate analyses showed broad associations between earlier initiation and multiple environmental domains, including impulsivity, sleep disturbance, parental monitoring, caffeine use, and school functioning. In multivariable models, a smaller set of predictors remained robust, particularly impulsivity traits, parental monitoring, and selected health and lifestyle factors. PRS were positively associated with earlier initiation, with the strongest and most consistent effects for nicotine-related genetic risk. Secondary analyses using marginal structural models suggested that higher parental monitoring is protective, whereas higher impulsivity and caffeine exposure are associated with increased risk. These results demonstrate that integrating dynamic environmental exposures with genetic liability can identify key risk factors for adolescent substance initiation and highlight actionable targets for prevention.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript analyzes data from the ABCD cohort (~4 years of repeated assessments) to model time-to-first-use of alcohol, nicotine, cannabis, and any substance. It applies time-varying Cox models with clustered errors to high-dimensional panels of time-varying environmental covariates (family, school, neighborhood, behavioral, health domains) plus time-invariant factors and PRS for substance use disorders, followed by marginal structural models (MSM) for causal inference on selected predictors.
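
The data layout that a time-varying Cox model typically expects can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes each covariate value measured at a visit is carried forward until the next visit, and the function name `to_intervals`, the field names, and the month-based time scale are all hypothetical.

```python
from typing import Dict, List, Tuple

Visit = Tuple[float, Dict[str, float]]  # (time of visit, covariates measured then)


def to_intervals(visits: List[Visit], end_time: float, event: bool) -> List[dict]:
    """Split one participant's follow-up into (start, stop] rows.

    visits   : visits sorted by time; values are assumed to hold until the next visit
    end_time : time of first use if event is True, otherwise the censoring time
    event    : whether initiation was observed at end_time
    """
    rows = []
    for idx, (start, covariates) in enumerate(visits):
        if start >= end_time:
            break  # visits at or after the end of follow-up contribute no risk time
        next_visit = visits[idx + 1][0] if idx + 1 < len(visits) else end_time
        stop = min(next_visit, end_time)
        # Only the final interval can carry the event indicator.
        is_final = stop == end_time
        rows.append({"start": start, "stop": stop,
                     "event": int(event and is_final), **covariates})
    return rows


# Hypothetical participant: three visits (months since baseline), first use at month 30.
example = to_intervals(
    [(0, {"monitoring": 4.0}), (12, {"monitoring": 3.5}), (24, {"monitoring": 3.0})],
    end_time=30,
    event=True,
)
for row in example:
    print(row)
```

Because one participant contributes several rows, the rows are not independent, which is why standard errors in such models are usually clustered on participant (and, in family-based cohorts, possibly on family or site).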

Significance. If the MSM assumptions hold, the work could help prioritize modifiable targets (e.g., parental monitoring) for preventing adolescent substance initiation by jointly modeling dynamic environmental exposures and genetic liability in a large, longitudinal sample. The longitudinal design and use of externally derived PRS are strengths for reducing some forms of bias.

major comments (2)
  1. Methods (marginal structural models subsection): The claim that higher parental monitoring is protective while impulsivity and caffeine increase risk rests on the MSM correctly specifying treatment and censoring weights for high-dimensional time-varying confounders. No weight-distribution diagnostics, positivity checks, truncation procedures, or sensitivity analyses for unmeasured confounding (e.g., via e-value or g-computation comparison) are described, which are load-bearing for the causal interpretation.
  2. Results and Abstract: The transition from broad univariate associations to a smaller set of 'robust' multivariable predictors (impulsivity, parental monitoring, selected health/lifestyle factors) and the PRS effects (strongest for nicotine) lacks reported details on covariate selection, missing-data handling, multiple-testing correction across domains, or full effect-size reporting (hazard ratios with CIs), undermining evaluation of whether the reduced model supports the central integrative claim.
minor comments (2)
  1. Abstract: Include a concise statement of sample size, exact follow-up structure, and primary limitations to contextualize the generalizability of the findings.
  2. Methods (PRS construction): Specify the source GWAS, construction method (e.g., clumping/thresholding), and any ancestry or population-structure adjustments applied.
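
The e-value mentioned in major comment 1 has a simple closed form, sketched below. The formula is defined on the risk-ratio scale; applying it to the odds ratios from the MSM would need a rare-outcome assumption or an approximate conversion, which this sketch does not attempt. The function name and input values are illustrative, not estimates from the paper.

```python
import math


def e_value(risk_ratio: float) -> float:
    """E-value for a point estimate on the risk-ratio scale.

    It is the minimum strength of association (as a risk ratio) that an
    unmeasured confounder would need with both exposure and outcome to
    fully explain away the observed estimate.
    """
    if risk_ratio <= 0:
        raise ValueError("risk ratio must be positive")
    # Protective estimates (< 1) are inverted so the formula applies symmetrically.
    rr = risk_ratio if risk_ratio >= 1 else 1.0 / risk_ratio
    return rr + math.sqrt(rr * (rr - 1.0))


# Hypothetical inputs: a harmful and a protective association of equal strength.
print(e_value(2.0))
print(e_value(0.5))
```

A null estimate (risk ratio of 1) gives an e-value of 1, meaning no unmeasured confounding is needed to explain it; larger e-values indicate estimates that are harder to attribute to confounding alone.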

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. The comments highlight important areas for improving the transparency and robustness of our causal analyses and reporting. We address each major comment below and will revise the manuscript to incorporate the suggested additions.

Point-by-point responses
  1. Referee: Methods (marginal structural models subsection): The claim that higher parental monitoring is protective while impulsivity and caffeine increase risk rests on the MSM correctly specifying treatment and censoring weights for high-dimensional time-varying confounders. No weight-distribution diagnostics, positivity checks, truncation procedures, or sensitivity analyses for unmeasured confounding (e.g., via e-value or g-computation comparison) are described, which are load-bearing for the causal interpretation.

    Authors: We agree that these diagnostics and sensitivity analyses are essential to support the causal interpretations from the marginal structural models. In the revised manuscript, we will add a dedicated subsection in Methods describing: (i) stabilized weight distributions (means, medians, ranges, and histograms); (ii) positivity checks via covariate overlap plots and assessment of extreme weights; (iii) truncation at the 1st/99th percentiles with sensitivity to alternative thresholds; and (iv) sensitivity analyses including e-values for the key MSM estimates and, where computationally feasible, comparison with g-computation results. These will be reported in Results with a new supplementary table of diagnostics. This will strengthen the causal claims without altering the primary findings. revision: yes
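
    The weight construction and the 1st/99th percentile truncation promised in this response can be sketched for a single time point with a binary exposure. This is an illustration under simplifying assumptions, not the authors' procedure: the exposure probabilities are taken as given rather than estimated, and in the longitudinal setting the per-visit ratios would be multiplied across visits. All names and numbers are hypothetical.

```python
import math
from typing import List


def stabilized_weights(exposed: List[int], p_marginal: float,
                       p_conditional: List[float]) -> List[float]:
    """Stabilized IPT weights for a binary exposure at one time point.

    p_marginal    : P(A = 1), the numerator model without confounders
    p_conditional : P(A = 1 | L_i), the denominator model with confounders
    """
    weights = []
    for a, p_c in zip(exposed, p_conditional):
        numerator = p_marginal if a == 1 else 1.0 - p_marginal
        denominator = p_c if a == 1 else 1.0 - p_c
        weights.append(numerator / denominator)
    return weights


def percentile(values: List[float], pct: float) -> float:
    """Percentile with linear interpolation between order statistics."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * pct / 100.0
    lower, upper = math.floor(rank), math.ceil(rank)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (rank - lower)


def truncate(weights: List[float], lower_pct: float = 1.0,
             upper_pct: float = 99.0) -> List[float]:
    """Clip weights to the given percentiles to limit the influence of extremes."""
    low, high = percentile(weights, lower_pct), percentile(weights, upper_pct)
    return [min(max(w, low), high) for w in weights]


# Hypothetical values: the third participant was exposed despite a low predicted probability.
w = stabilized_weights([1, 0, 1, 0], 0.5, [0.5, 0.5, 0.25, 0.1])
print(w)
print(truncate(w))
```

    Stabilized weights should average close to 1 in a well-specified model, and a long right tail is a common sign of near-violations of positivity. Truncation trades some bias for lower variance, which is why sensitivity to alternative thresholds matters.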

  2. Referee: Results and Abstract: The transition from broad univariate associations to a smaller set of 'robust' multivariable predictors (impulsivity, parental monitoring, selected health/lifestyle factors) and the PRS effects (strongest for nicotine) lacks reported details on covariate selection, missing-data handling, multiple-testing correction across domains, or full effect-size reporting (hazard ratios with CIs), undermining evaluation of whether the reduced model supports the central integrative claim.

    Authors: We acknowledge the need for greater methodological transparency. In the revision we will expand the Methods section to explicitly detail: (1) the covariate selection process (univariate screening followed by domain-informed multivariable inclusion and stepwise backward elimination with a p<0.05 retention threshold, plus sensitivity to LASSO); (2) missing-data handling (multiple imputation by chained equations with 20 imputations, reporting fraction missing and sensitivity to complete-case analysis); (3) multiple-testing correction (FDR within each domain and across the four substance outcomes); and (4) full reporting of hazard ratios and 95% CIs for all predictors in both univariate and final multivariable models, including PRS. A comprehensive supplementary table will present all effect sizes. The Abstract will be updated to note these details and the key HRs. These changes will allow readers to fully evaluate the reduced models. revision: yes
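
    The multiple-testing step in item (3) can be illustrated with the Benjamini-Hochberg step-up procedure. The response does not say which FDR method is used, so Benjamini-Hochberg is an assumption here (it is the most common default), and the p-values are made up.

```python
from typing import List


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Return BH-adjusted p-values (q-values) in the original input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices, smallest p first
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical p-values for four predictors within one domain.
p = [0.01, 0.04, 0.03, 0.005]
q = benjamini_hochberg(p)
significant = [qi < 0.05 for qi in q]
print(q, significant)
```

    Applying the correction within each domain and then again across the four outcomes, as the response describes, depends on how the families of tests are defined, so reporting those family sizes alongside the q-values would make the thresholds reproducible.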

Circularity Check

0 steps flagged

No significant circularity: results derive from fitting standard models to independent cohort data

full rationale

The paper fits time-varying Cox models and marginal structural models to ABCD longitudinal data using externally sourced PRS and observed time-varying covariates. No derivation step reduces a reported association or causal estimate to a quantity defined by the model's own fitted parameters or by self-citation chains. The central claims (associations with impulsivity, parental monitoring, etc.) are statistical outputs from the data rather than tautological re-expressions of inputs. This is the normal non-circular case for empirical modeling papers.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract provides no explicit details on free parameters, axioms, or invented entities; full text required for complete ledger. Models implicitly rely on standard statistical assumptions such as proportional hazards and no unmeasured confounding.

pith-pipeline@v0.9.0 · 5560 in / 1125 out tokens · 59785 ms · 2026-05-10T18:51:24.637897+00:00 · methodology
