A Data-Centric Framework for Intraoperative Fluorescence Lifetime Imaging for Glioma Surgical Guidance
Pith reviewed 2026-05-07 16:39 UTC · model grok-4.3
The pith
A data-centric framework uses confident learning to merge seven glioma cellularity classes into three, yielding 96% FLIm classification accuracy.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors demonstrate that their data-centric AI framework, by applying confident learning to quantify point-level confidence in FLIm signals from 192 tissue margins across 31 GBM patients, identifies inconsistencies in the initial seven-class labeling and guides iterative merging into a three-class scheme of low, moderate, and high infiltration. The resulting dataset trains a classifier that achieves 96% accuracy, with SHAP analysis exposing class-specific feature importance and targeted analysis linking low-confidence predictions to biological and technical variables. Selective re-evaluation of CL-flagged margins reveals labeling variability, showing the framework improves data reliability.
What carries the argument
Confident learning (CL) to quantify FLIm point-level confidence, flag inconsistencies, and direct iterative class merging from seven to three tumor cellularity categories.
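The confident-learning rule that carries the argument can be sketched in a few lines (a minimal numpy illustration simplified from Northcutt et al.'s confident-joint idea; the toy labels and probabilities are invented, and this is not the authors' implementation):

```python
import numpy as np

def flag_label_issues(labels, pred_probs):
    """Flag points whose given label disagrees with a confidently
    predicted class (simplified confident-learning rule)."""
    n_classes = pred_probs.shape[1]
    # Per-class threshold: mean predicted probability of class j
    # among points labeled j (average self-confidence).
    thresholds = np.array([
        pred_probs[labels == j, j].mean() for j in range(n_classes)
    ])
    flagged = np.zeros(len(labels), dtype=bool)
    for i, (y, p) in enumerate(zip(labels, pred_probs)):
        # Classes predicted above their own class threshold.
        confident = np.flatnonzero(p >= thresholds)
        if confident.size and y not in confident:
            flagged[i] = True  # likely label inconsistency
    return flagged

# Toy data: 6 points, 3 classes; point 2 is labeled 0 but is
# confidently predicted as class 1, so it gets flagged.
labels = np.array([0, 0, 0, 1, 1, 2])
pred_probs = np.array([
    [0.9, 0.05, 0.05],
    [0.8, 0.1, 0.1],
    [0.1, 0.85, 0.05],  # suspicious point
    [0.1, 0.8, 0.1],
    [0.2, 0.7, 0.1],
    [0.1, 0.1, 0.8],
])
print(flag_label_issues(labels, pred_probs))
```

In the paper's pipeline, flagged points would then drive selective re-evaluation and the class-merging decisions rather than being discarded outright.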
If this is right
- The refined three-class model supports real-time intraoperative decisions that balance tumor resection with tissue preservation.
- SHAP-derived optical signatures provide a biological basis for interpreting FLIm contrast across the infiltration spectrum.
- Targeted analysis of low-confidence predictions identifies specific confounders like blood contamination that can be mitigated in future acquisitions.
- Selective relabeling based on CL reduces the need for exhaustive expert review while maintaining high model performance.
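The seven-to-three merge itself is a simple relabeling once the grouping is fixed; a hypothetical mapping is sketched below (the paper's exact assignment of the seven cellularity classes to "low"/"moderate"/"high" is not given here, so the boundaries are assumed):

```python
import numpy as np

# Hypothetical grouping of the seven original cellularity labels
# (0..6, ordered by increasing infiltration) into the three-class
# scheme; the actual merge boundaries are assumptions.
MERGE_MAP = {0: "low", 1: "low", 2: "moderate", 3: "moderate",
             4: "moderate", 5: "high", 6: "high"}

def merge_labels(labels_7):
    """Map seven-class cellularity labels to the merged scheme."""
    return np.array([MERGE_MAP[y] for y in labels_7])

labels_7 = np.array([0, 2, 6, 3, 5])
print(merge_labels(labels_7))  # ['low' 'moderate' 'high' 'moderate' 'high']
```

The point of the CL loop is that this mapping is not chosen a priori: classes are merged iteratively where confident learning shows the labels cannot be reliably separated.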
Where Pith is reading between the lines
- The same CL-driven refinement process could be tested on other label-free optical modalities used in neurosurgery to check for similar gains in label fidelity.
- If the three-class scheme generalizes to new patients, it might enable deployment of FLIm probes with built-in uncertainty estimates for surgical navigation systems.
- The observed pathologist variability suggests that future studies should collect multi-rater labels from the start to quantify the upper bound on achievable accuracy.
Load-bearing premise
The single expert neuropathologist's initial seven-class labeling is consistent enough for confident learning to detect and fix errors without introducing new systematic biases.
What would settle it
Independent multi-pathologist re-labeling of the CL-flagged margins showing no greater inconsistency rate than unflagged margins would falsify the claim that CL improves label reliability.
Original abstract
Accurate intraoperative assessment of glioma infiltration is essential for maximizing tumor resection while preserving functional brain tissue. Fluorescence lifetime imaging (FLIm) offers real-time, label-free biochemical contrast, but its clinical utility is challenged by biological heterogeneity, class imbalance, and variability in histopathological labeling. We present a data-centric AI (DC-AI) framework that integrates confident learning (CL), class refinement, and targeted label evaluation to develop a robust multi-class FLIm classifier for glioblastoma (GBM) resection margins. FLIm data were collected from 192 tissue margins across 31 newly diagnosed IDH-wildtype GBM patients and initially labeled into seven tumor cellularity classes by an expert neuropathologist. CL was applied to quantify FLIm point-level confidence, identify label inconsistencies, and guide iterative class merging into a three-class scheme ("low", "moderate", "high"). The resulting high-fidelity dataset enabled training a model that achieved 96% accuracy in the three-class task. SHAP analysis revealed class-specific FLIm feature importance, highlighting distinct optical signatures across the infiltration spectrum. Targeted FLIm analysis further identified biological (e.g., gray matter composition) and acquisition-related (e.g., blood contamination) contributors to low-confidence predictions. Blinded re-evaluation of margins flagged by CL demonstrated intra-pathologist variability, underscoring the value of selective relabeling rather than exhaustive review. Together, these findings demonstrate that a DC-AI framework can systematically improve data reliability, enhance model robustness, and refine biological interpretation of FLIm signals, supporting the development of clinically actionable optical tools for real-time glioma margin assessment.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes a data-centric AI framework for intraoperative fluorescence lifetime imaging (FLIm) to guide glioma resection. It collects FLIm data from 192 margins in 31 GBM patients, obtains initial seven-class cellularity labels from a single expert neuropathologist, applies confident learning (CL) to detect inconsistencies and iteratively merge into a three-class scheme (low/moderate/high), trains a classifier that reaches 96% accuracy on the refined labels, uses SHAP to interpret feature importance, and performs targeted analysis plus blinded re-evaluation of low-confidence cases.
Significance. If the 96% accuracy reflects genuine discriminative power on biologically faithful labels rather than artifacts of the CL-driven merging process, the work could meaningfully advance label-free optical tools for real-time margin assessment in neurosurgery. The data-centric emphasis on systematic label refinement, combined with SHAP-based biological interpretation and selective re-evaluation, offers a practical template for handling noisy, imbalanced medical imaging data; this is especially relevant given the heterogeneity of glioma infiltration.
major comments (2)
- [Abstract] The headline claim of 96% accuracy on the three-class task supplies no validation details (train/test split, cross-validation scheme, baseline models, or error analysis). Because CL uses model predictions both to flag inconsistencies and to guide class merging, the accuracy figure risks circularity unless the test partition was strictly held out from the entire CL procedure; performance against the original seven-class labels should also be reported to demonstrate that merging preserved rather than erased discriminative signal.
- [Abstract] The initial seven-class labels come from a single expert with no reported inter-rater reliability or independent ground-truth validation. The subsequent CL-guided merging into three classes, followed by training and 96% accuracy, therefore rests on the untested assumption that the refined labels remain faithful to underlying biology; the blinded re-evaluation is limited to CL-flagged margins and does not quantify how much the merging step alters the infiltration spectrum.
minor comments (2)
- [Abstract] The phrase 'targeted FLIm analysis further identified biological (e.g., gray matter composition) and acquisition-related (e.g., blood contamination) contributors' is stated without naming the exact statistical or visualization methods used to link these factors to low-confidence predictions.
- [Abstract] SHAP analysis is invoked to reveal 'class-specific FLIm feature importance', but the manuscript does not indicate which FLIm parameters (lifetime, intensity, etc.) were input to the model or how many features were retained after any preprocessing.
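The kind of per-feature attribution the referee asks to see specified can be illustrated with a permutation-importance stand-in (not the paper's SHAP method; the FLIm-style feature names and the toy classifier below are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical FLIm-derived features; the abstract does not list the
# actual model inputs, so these names are illustrative only.
FEATURES = ["lifetime_ch1", "lifetime_ch2", "intensity_ratio", "laguerre_c1"]

def permutation_importance(predict, X, y, n_repeats=10):
    """Mean drop in accuracy when each feature column is shuffled,
    a simple stand-in for per-feature SHAP attribution."""
    base = (predict(X) == y).mean()
    out = {}
    for j, name in enumerate(FEATURES):
        drops = []
        for _ in range(n_repeats):
            Xp = X.copy()
            Xp[:, j] = rng.permutation(Xp[:, j])  # break feature j
            drops.append(base - (predict(Xp) == y).mean())
        out[name] = float(np.mean(drops))
    return out

def predict(X):
    # Toy classifier that only looks at lifetime_ch1.
    return (X[:, 0] > 0).astype(int)

X = rng.normal(size=(200, 4))
y = predict(X)  # labels perfectly match the toy rule
imp = permutation_importance(predict, X, y)
# Shuffling the used feature hurts accuracy; unused features do not.
```

Reporting something of this shape per class, alongside the list of input FLIm parameters, would answer the minor comment directly.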
Simulated Author's Rebuttal
We thank the referee for their careful reading and valuable comments, which have helped us improve the clarity and rigor of our manuscript. Below, we provide point-by-point responses to the major comments.
Point-by-point responses
- Referee: [Abstract] The headline claim of 96% accuracy on the three-class task supplies no validation details (train/test split, cross-validation scheme, baseline models, or error analysis). Because CL uses model predictions both to flag inconsistencies and to guide class merging, the accuracy figure risks circularity unless the test partition was strictly held out from the entire CL procedure; performance against the original seven-class labels should also be reported to demonstrate that merging preserved rather than erased discriminative signal.
Authors: We agree that additional details on the validation procedure should be included in the abstract for completeness. In the revised manuscript, we will expand the abstract to describe the cross-validation scheme, train/test split, and comparison to baseline models. To address the risk of circularity, we clarify that while CL was used to refine labels on the dataset, the performance evaluation was conducted using a held-out test set that was not involved in the label refinement process. We will also report the classification performance on the original seven-class labels to show that the merging step retained important discriminative information rather than erasing it. These changes will be reflected in both the abstract and the detailed methods section. revision: yes
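The evaluation ordering the authors commit to can be sketched as follows (illustrative only: `run_confident_learning` is a named stand-in for the actual CL step, and the sizes and labels are made up; the essential point is that the split happens before any refinement):

```python
import numpy as np

rng = np.random.default_rng(0)

def split_before_cleaning(n_points, test_frac=0.2):
    """Hold out a test partition BEFORE any label refinement, so the
    CL step never sees, scores, or alters evaluation labels."""
    idx = rng.permutation(n_points)
    n_test = int(n_points * test_frac)
    return idx[n_test:], idx[:n_test]  # train indices, test indices

def run_confident_learning(y_train):
    # Placeholder for the paper's CL refinement: returns the training
    # labels unchanged (a stand-in, not the real method).
    return y_train

y = rng.integers(0, 3, size=50)                  # toy 3-class labels
train_idx, test_idx = split_before_cleaning(len(y))
y_clean = run_confident_learning(y[train_idx])   # touches train only
# y[test_idx] is evaluated as-is; CL never saw it.
```

Any accuracy reported on `test_idx` under this ordering cannot be inflated by the cleaning step, which is exactly the circularity concern the referee raises.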
- Referee: [Abstract] The initial seven-class labels come from a single expert with no reported inter-rater reliability or independent ground-truth validation. The subsequent CL-guided merging into three classes, followed by training and 96% accuracy, therefore rests on the untested assumption that the refined labels remain faithful to underlying biology; the blinded re-evaluation is limited to CL-flagged margins and does not quantify how much the merging step alters the infiltration spectrum.
Authors: We recognize that relying on labels from a single expert neuropathologist without inter-rater reliability or independent validation is a limitation. The CL framework is specifically designed to handle label noise in such scenarios by identifying and correcting inconsistencies. The blinded re-evaluation was intentionally focused on the low-confidence cases identified by CL to efficiently assess variability, and it did demonstrate intra-pathologist variability. However, we agree that this approach does not fully quantify the impact of the class merging on the entire spectrum of infiltration. In the revision, we will include an analysis of the label distributions before and after merging and discuss the implications for biological fidelity. We will also highlight this as a limitation in the discussion section. revision: partial
- We cannot provide inter-rater reliability or independent ground-truth validation data, as the initial labeling was performed by a single expert and no additional raters were involved in the study.
Circularity Check
No circularity: label refinement and model evaluation remain independent of each other's inputs
full rationale
The paper presents a standard data-centric workflow: an expert neuropathologist provides initial seven-class labels on FLIm data, confident learning is applied to flag inconsistencies and guide merging to three classes, and a classifier is then trained and evaluated on the resulting dataset, yielding 96% accuracy. No equations, derivations, fitted parameters renamed as predictions, or self-citations appear in the load-bearing steps. The accuracy metric is computed on the refined labels via conventional supervised training (with mention of blinded re-evaluation), not by construction from the same model predictions used in cleaning. The framework is evaluated internally rather than against external benchmarks, and no claimed result reduces to its own inputs.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption Initial expert histopathological labels serve as a sufficiently accurate starting point despite known variability
- domain assumption FLIm signals contain distinguishable biochemical features across infiltration levels