arxiv: 2605.12852 · v1 · submitted 2026-05-13 · 💻 cs.LG · q-bio.QM

Multitask Multimodal Fusion with Tabular Foundation Models for Peak and Durability Prediction of Pertussis Booster Response

Divya Sitani

Pith reviewed 2026-05-14 20:33 UTC · model grok-4.3

keywords pertussis booster · immune response prediction · multimodal fusion · multitask learning · TabPFN · contrastive loss · peak response · durability

The pith

A multitask fusion model using tabular foundation models jointly predicts peak magnitude and long-term durability of pertussis booster immune responses.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes a multi-task contrastive multimodal fusion architecture to jointly predict the peak magnitude and long-term durability of immune responses to pertussis booster vaccination from heterogeneous modalities. These two outcomes are biologically dissociated rather than redundant, with peak reflecting acute B-cell activation and durability reflecting establishment of long-term humoral memory, yet most models target only one endpoint. By freezing TabPFN-v2 encoders per modality, applying a dual-label supervised contrastive loss that pairs subjects agreeing on either task label, and using missingness-masked attention fusion, the model reaches test AUROCs of 0.797 for peak and 0.755 for durability on a curated subset of 158 subjects. Compared against logistic regression, XGBoost, and MLP baselines, it is the only model whose 95% confidence intervals lie entirely above chance on both tasks simultaneously, and per-modality contribution analyses recover task-specific patterns consistent with the underlying immunology.

Core claim

The proposed multi-task contrastive multimodal fusion architecture, combining frozen TabPFN-v2 per-modality encoders, a dual-label supervised contrastive loss, modality dropout calibrated to empirical missingness, and missingness-masked attention fusion, achieves test AUROC 0.797 (95% CI [0.621, 0.948]) for peak response and 0.755 (95% CI [0.519, 0.945]) for durability on the CMI-PB pertussis booster dataset (n=158; Spearman r=-0.58 between endpoints, computed on n=96). Both metrics are statistically significant under joint label permutation testing (N=1000; p=0.002 and p=0.045), and the model is the only one among raw-feature and TabPFN-embedding baselines whose 95% CIs lie above chance on both tasks at once.
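One way to read "missingness-masked attention fusion" is a softmax over per-modality scores in which absent modalities receive exactly zero weight. The following is a minimal sketch under that assumption, not the author's implementation: the attention scores are passed in as plain numbers (in the paper they would presumably be produced by learned parameters), and the function and argument names are hypothetical.

```python
import math

def masked_attention_fusion(embeddings, scores, present):
    """Weighted sum of per-modality embeddings, ignoring missing modalities.

    embeddings: one equal-length vector per modality (content is ignored if absent)
    scores:     one unnormalised attention score per modality (assumed given)
    present:    one bool per modality, False where the modality is missing
    """
    if not any(present):
        raise ValueError("at least one modality must be present")
    # Softmax restricted to present modalities; subtract the max for stability.
    top = max(s for s, p in zip(scores, present) if p)
    exps = [math.exp(s - top) if p else 0.0 for s, p in zip(scores, present)]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(embeddings[0])
    # Skip absent modalities entirely so placeholder values (even NaN) cannot leak in.
    fused = [
        sum(w * emb[d] for w, emb, p in zip(weights, embeddings, present) if p)
        for d in range(dim)
    ]
    return fused, weights
```

With two of four modalities present and equal scores, each present modality gets weight 0.5 and the missing ones get 0, regardless of what their placeholder embeddings contain.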

What carries the argument

Dual-label supervised contrastive loss that treats two subjects as a positive pair if they agree on the peak label or the durability label, fused via missingness-masked attention over frozen TabPFN-v2 per-modality encoders.
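The positive-pair rule can be sketched in a few lines. This is a minimal pure-Python illustration, not the author's implementation: it assumes binary labels for both tasks, the commonly used supervised contrastive form over L2-normalised embeddings, and a hypothetical temperature of 0.1. All function names are invented for this example.

```python
import math

def dual_label_positive_mask(peak, durability):
    """mask[i][j] is True when subjects i != j agree on the peak OR the durability label."""
    n = len(peak)
    return [
        [i != j and (peak[i] == peak[j] or durability[i] == durability[j])
         for j in range(n)]
        for i in range(n)
    ]

def _normalize(v):
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

def supcon_loss(embeddings, mask, temperature=0.1):
    """Supervised contrastive loss averaged over anchors that have at least one positive.

    temperature=0.1 is an assumed value, not one taken from the paper.
    """
    z = [_normalize(v) for v in embeddings]
    n = len(z)
    losses = []
    for i in range(n):
        positives = [j for j in range(n) if mask[i][j]]
        if not positives:
            continue
        # Scaled cosine similarity between the anchor and every other subject.
        logits = {j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
                  for j in range(n) if j != i}
        top = max(logits.values())
        log_denom = top + math.log(sum(math.exp(v - top) for v in logits.values()))
        losses.append(-sum(logits[j] - log_denom for j in positives) / len(positives))
    return sum(losses) / len(losses) if losses else 0.0
```

Because the rule is a logical OR, a pair counts as a negative only when the two subjects disagree on both labels.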

If this is right

Peak prediction is carried primarily by cytokine signatures while durability prediction draws from baseline antibody features, recovering task-specific biological signals.
The architecture handles structured missingness without collapse and remains the sole model with confidence intervals above chance on both tasks.
Joint modeling captures the full boost-and-wane trajectory instead of isolated endpoints.
Statistical significance on both tasks holds under joint label permutation testing.
Per-modality contribution analyses align with known immunology of acute activation versus long-term memory.
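The permutation check mentioned in the list above can be illustrated as follows. This sketch makes several assumptions: "joint" is read as applying one shuffle of subject indices to both label vectors at once, so their mutual correlation is preserved under the null. Labels are permuted against fixed model scores rather than retraining per permutation, which is a cheaper simplification and may differ from the paper's procedure. Both labels are assumed available for the same subjects. Function names are hypothetical.

```python
import random

def auroc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def joint_permutation_pvalues(peak, durability, peak_scores, dur_scores,
                              n_perm=1000, seed=0):
    """One-sided p-values for both AUROCs under a shared label permutation."""
    rng = random.Random(seed)
    observed = (auroc(peak, peak_scores), auroc(durability, dur_scores))
    exceed = [0, 0]
    idx = list(range(len(peak)))
    for _ in range(n_perm):
        rng.shuffle(idx)  # the same shuffle is applied to both label vectors
        perm_peak = [peak[i] for i in idx]
        perm_dur = [durability[i] for i in idx]
        exceed[0] += auroc(perm_peak, peak_scores) >= observed[0]
        exceed[1] += auroc(perm_dur, dur_scores) >= observed[1]
    # Add-one correction so that a p-value of exactly zero is never reported.
    return tuple((e + 1) / (n_perm + 1) for e in exceed)
```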

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same fusion approach could be tested on booster responses for other vaccines that exhibit dissociated peak and durability phases.
Predicted durability values might support individualized booster timing schedules if validated on larger cohorts.
Adding genomic or transcriptomic modalities could further separate the biological compartments driving each task.
The interpretable modality contributions suggest candidate biomarkers for early assessment of vaccine response quality.

Load-bearing premise

The small sample of 158 subjects, 44.9 percent of whom are missing at least one modality, allows the dual-label contrastive loss to reliably capture the biological dissociation between peak and durability without overfitting or spurious correlations.

What would settle it

An independent replication cohort in which the model's 95 percent confidence intervals for both AUROCs include 0.5, or in which simpler baselines achieve comparable or better intervals on both tasks simultaneously, would falsify the claim of effective joint prediction.

Figures

Figures reproduced from arXiv: 2605.12852 by Divya Sitani.

**Figure 1.** Cohort × modality missingness pattern in CMI-PB. (A) Subject level modality availability for the Task 1 (peak response) cohort, organised by annual cohort (2020-2023). Each row is one subject; each column is one of the four data modalities. Filled cells indicate the modality is present; white cells indicate it is missing. Within each cohort, subjects are sorted by missingness pattern (subjects with more mo…

**Figure 2.** Multi-task contrastive multimodal fusion architecture.

**Figure 3.** Peak and durability are anti-correlated phases of the humoral response.

**Figure 4.** ROC curves on the held-out test set with 95% bootstrap confidence intervals, and …

**Figure 5.** Bootstrap AUROC distributions on the held-out test set, both tasks overlaid.

**Figure 6.** Label-permutation null distributions confirm learned signal for both tasks.

**Figure 7.** Per-modality contribution to peak and durability prediction. Top: …

**Figure 8.** Graceful degradation under inference-time modality missingness.

Original abstract

Pertussis booster vaccination produces immune responses that vary widely across individuals in both peak magnitude and long-term durability. These two phases are governed by partly distinct biological compartments: peak reflects acute B-cell activation and antibody secretion, while durability reflects the establishment of long-term humoral memory. Yet most computational models target only one, missing the full boost-and-wane trajectory. Jointly predicting both is non-trivial because the two endpoints are biologically dissociated rather than redundant; samples are small, modalities are heterogeneous with structured missingness, and the two tasks rely on different measurement windows. We propose a multi-task contrastive multimodal fusion architecture combining frozen TabPFN-v2 per-modality encoders, a dual-label supervised contrastive loss that treats two subjects as a positive pair if they agree on the Task 1 label or the Task 2 label, modality dropout calibrated to empirical missingness, and missingness-masked attention fusion. Applied to a curated subset of the CMI-PB pertussis booster dataset (n = 158 subjects, four modalities, 44.9% with at least one modality missing; Spearman r = -0.58 between peak and durability, n = 96), the model achieves test AUROC 0.797 (95% CI [0.621, 0.948]) for peak response and 0.755 (95% CI [0.519, 0.945]) for durability, with both significant under joint label permutation (N = 1000; p = 0.002 and p = 0.045). Across logistic regression, XGBoost, and MLP baselines on raw features and on TabPFN embeddings, the proposed model is the only one whose 95% CIs lie above chance on both tasks simultaneously. Per-modality contribution analyses recover task-specific modality contributions consistent with the underlying immunology: peak prediction is carried by cytokine signatures, while durability is carried by baseline antibody features.
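The abstract does not spell out how modality dropout is "calibrated to empirical missingness". One plausible reading, sketched here purely as an assumption, is that during training each present modality is hidden with probability equal to that modality's observed missing rate in the cohort, while at least one modality is always kept. The function names and the keep-one fallback are hypothetical.

```python
import random

def empirical_missing_rates(presence):
    """presence: one list of bools per subject, one flag per modality.
    Returns the fraction of subjects missing each modality."""
    n_subjects = len(presence)
    n_modalities = len(presence[0])
    return [
        sum(1 for row in presence if not row[m]) / n_subjects
        for m in range(n_modalities)
    ]

def modality_dropout(present, rates, rng):
    """Hide each present modality with probability equal to its empirical missing rate.

    Modalities that are already missing stay missing. If every present modality
    would be dropped, one of them is kept at random (an assumed safeguard).
    """
    kept = [bool(p) and rng.random() >= r for p, r in zip(present, rates)]
    candidates = [m for m, p in enumerate(present) if p]
    if candidates and not any(kept):
        kept[rng.choice(candidates)] = True
    return kept
```

A typical call would compute `rates` once from the training cohort and then apply `modality_dropout` to each subject's presence flags at every training step, feeding the result into the fusion mask.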

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper introduces a multitask contrastive fusion model with frozen TabPFN encoders and missingness-masked attention for joint peak/durability prediction in pertussis responses. It is the only model among those compared whose AUROC confidence intervals lie above chance on both tasks, but the small n and wide CIs leave room for overfitting concerns.

The main takeaway is that this work combines frozen TabPFN-v2 encoders per modality with a dual-label supervised contrastive loss and missingness-masked attention to predict both peak antibody response and durability after pertussis booster vaccination. The two endpoints are negatively correlated, so the architecture treats samples as positives if they match on either label. On the CMI-PB subset of 158 subjects, 44.9% of whom are missing at least one modality, the model reaches test AUROCs of 0.797 and 0.755, both significant under joint permutation testing, and it is the only approach whose confidence intervals clear chance on both tasks at once. The per-modality contribution analysis also recovers the expected split, with cytokines driving peak prediction and baseline antibodies driving durability, which lines up with the immunology background they cite. That is the concrete advance: a practical way to fuse heterogeneous modalities while respecting structured missingness and the non-redundant nature of the two targets.

The soft spot is the sample size. With n=158 and nearly half the cases incomplete, the wide intervals (especially durability at [0.519, 0.945]) make it hard to rule out that the contrastive loss is picking up noise or missingness patterns rather than stable biological signal. The permutation test is a reasonable check, but a single held-out split on this scale does not fully address generalization.

This is for people working on multimodal methods in small clinical datasets or on TabPFN-style foundation models for tabular medical data. A reader focused on vaccine response modeling or contrastive multitask setups could extract useful implementation details. I would send it for peer review. The empirical framing is careful and the architecture choices are well motivated, even if the claims will need tighter validation on larger cohorts.

Referee Report

2 major / 1 minor

Summary. The manuscript proposes a multitask multimodal fusion architecture that combines frozen TabPFN-v2 per-modality encoders, a dual-label supervised contrastive loss (positive pairs if subjects agree on peak OR durability label), modality dropout, and missingness-masked attention to jointly predict peak magnitude and durability of pertussis booster responses. On a curated CMI-PB subset (n=158 subjects, four modalities, 44.9% with at least one missing modality, Spearman r=-0.58 between endpoints), it reports test AUROCs of 0.797 (95% CI [0.621, 0.948]) for peak and 0.755 (95% CI [0.519, 0.945]) for durability, both significant under joint label permutation (N=1000; p=0.002 and p=0.045), outperforming logistic regression, XGBoost, and MLP baselines on raw features and embeddings, with per-modality contributions aligning with immunology.

Significance. If the performance claims hold under more rigorous validation, the work would demonstrate a practical way to leverage tabular foundation models and contrastive learning for jointly modeling biologically dissociated endpoints in small-sample, high-missingness multimodal settings, with direct relevance to personalized vaccination and immune-response modeling.

major comments (2)

[Methods] Methods (dual-label contrastive loss definition): the loss treats pairs as positive if subjects agree on either task label, yet the endpoints show negative correlation (r=-0.58 on n=96); with only 158 subjects this formulation risks capturing dataset-specific noise or missingness patterns rather than dissociated biology, and no ablation isolating the loss from the TabPFN encoders is reported.
[Results] Results (evaluation protocol): the reported 95% CI for durability, [0.519, 0.945], has a lower bound only marginally above chance, and the evaluation relies on a single held-out split plus a permutation test (N=1000), with no nested cross-validation or multiple random splits; given n=158 and 44.9% of subjects missing at least one modality, this leaves open the possibility that outperformance on both tasks simultaneously is split-specific rather than robust.
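To make the interval discussion concrete, here is a minimal percentile-bootstrap sketch for an AUROC confidence interval on a held-out set. The resampling scheme, the number of resamples, and the function names are assumptions for illustration; the paper's exact bootstrap procedure is not described on this page.

```python
import random

def auroc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_auroc_ci(labels, scores, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI: resample subjects with replacement, recompute AUROC."""
    if len(set(labels)) < 2:
        raise ValueError("both classes are required")
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # AUROC is undefined when a resample contains one class only
        stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper
```

An interval "lies above chance" in the sense used here when `lower > 0.5`. With a small test set the bootstrap distribution is coarse, so a lower bound close to 0.5 can move to either side of it under a different split or seed.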

minor comments (1)

[Abstract] Abstract: the number of modalities and the precise baseline configurations (raw vs. embedded) could be stated more explicitly to aid quick assessment of the comparison.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments. We address each major point below with clarifications on our design choices and indicate revisions to strengthen the validation and analysis.

Referee: [Methods] Methods (dual-label contrastive loss definition): the loss treats pairs as positive if subjects agree on either task label, yet the endpoints show negative correlation (r=-0.58 on n=96); with only 158 subjects this formulation risks capturing dataset-specific noise or missingness patterns rather than dissociated biology, and no ablation isolating the loss from the TabPFN encoders is reported.

Authors: The dual-label supervised contrastive loss was chosen precisely because the endpoints are biologically dissociated (Spearman r = -0.58), enabling the model to learn representations that capture similarity in at least one task without forcing redundancy. This aligns with the multitask goal of jointly modeling peak and durability. We agree that an ablation isolating the loss contribution from the TabPFN encoders is needed to rule out noise capture. In the revised manuscript we will add such an ablation, comparing the full dual-label model against single-label contrastive variants and a non-contrastive fusion baseline with frozen encoders. revision: yes
Referee: [Results] Results (evaluation protocol): the reported 95% CI for durability, [0.519, 0.945], has a lower bound only marginally above chance, and the evaluation relies on a single held-out split plus a permutation test (N=1000), with no nested cross-validation or multiple random splits; given n=158 and 44.9% of subjects missing at least one modality, this leaves open the possibility that outperformance on both tasks simultaneously is split-specific rather than robust.

Authors: We acknowledge that the CI for durability AUROC is wide and that its lower bound (0.519) sits only marginally above chance, which reflects uncertainty from the modest sample size and missingness. The single held-out split was used to preserve test integrity under structured missingness. To demonstrate robustness beyond a single split, we will add results from five independent random train-test splits, reporting mean AUROC and standard deviation for both tasks. Full nested cross-validation remains computationally prohibitive with the TabPFN encoders, but the multi-split evaluation will provide clearer evidence that simultaneous outperformance is not split-specific. revision: partial
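The multi-split evaluation promised in the response could be organised along these lines. This is a sketch only: stratifying on the joint (peak, durability) label, the 20% test fraction, and the helper names are assumptions, and the model training step is left out entirely.

```python
import random
import statistics
from collections import defaultdict

def stratified_splits(peak, durability, n_splits=5, test_frac=0.2, seed=0):
    """Return n_splits (train_idx, test_idx) pairs stratified on the joint label."""
    strata = defaultdict(list)
    for i, key in enumerate(zip(peak, durability)):
        strata[key].append(i)
    splits = []
    for s in range(n_splits):
        rng = random.Random(seed + s)  # a different shuffle for every split
        train, test = [], []
        for key in sorted(strata):
            members = strata[key][:]
            rng.shuffle(members)
            # Strata with a single member go entirely to the training side.
            k = max(1, round(test_frac * len(members))) if len(members) > 1 else 0
            test.extend(members[:k])
            train.extend(members[k:])
        splits.append((sorted(train), sorted(test)))
    return splits

def mean_std(values):
    """Mean and sample standard deviation of per-split metrics such as AUROC."""
    return statistics.mean(values), statistics.stdev(values)
```

After fitting and scoring the model on each split, the per-split AUROCs for each task would be passed to `mean_std` to obtain the summary the response describes.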

Circularity Check

0 steps flagged

No circularity: performance claims rest on independent held-out evaluation with external encoders

The paper reports test AUROCs (0.797 and 0.755) obtained on a single held-out test split of the n=158 dataset, using frozen external TabPFN-v2 encoders and a dual-label contrastive loss whose parameters are optimized on training data only. No equation or claim reduces a reported metric to a quantity defined by the same fitted parameters (no self-definitional loops, no fitted-input-called-prediction, no load-bearing self-citation). The architecture description and per-modality analyses are post-hoc interpretations of the trained model, not derivations that presuppose the final AUROCs. This is a standard empirical ML evaluation pipeline with no reduction of outputs to inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The paper introduces no new free parameters or invented entities; it relies on standard assumptions in multimodal ML and the pre-trained foundation model.

axioms (2)

domain assumption: TabPFN-v2 provides effective frozen encoders for the tabular modalities in this immunology domain. Assumed without re-training or fine-tuning details provided in the abstract.
ad hoc to paper: The dual-label supervised contrastive loss appropriately captures agreement on either task label for dissociated biological endpoints. Defined specifically for this multitask setup to handle non-redundant tasks.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

Each tag describes the relation between the paper passage and the cited Recognition theorem.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
Paper passage: dual-label supervised contrastive loss that treats two subjects as a positive pair if they agree on the Task 1 label or the Task 2 label

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear
Paper passage: multi-task contrastive multimodal fusion architecture

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
