eTFCE: Exact Threshold-Free Cluster Enhancement via Fast Cluster Retrieval

Jelle J. Goeman; Thomas E. Nichols; Wouter D. Weeda; Xu Chen

arXiv: 2603.03004 · v3 · submitted 2026-03-03 · stat.ME · stat.AP · stat.CO

Xu Chen, Wouter D. Weeda, Thomas E. Nichols, Jelle J. Goeman

Pith reviewed 2026-05-15 16:53 UTC · model grok-4.3

keywords: threshold-free cluster enhancement, TFCE, cluster-based inference, neuroimaging, nonparametric statistics, exact computation, permutation testing

The pith

eTFCE computes the TFCE integral exactly by retrieving all clusters at every threshold instead of using discrete approximations.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents eTFCE as a method that evaluates the threshold-free cluster enhancement (TFCE) integral with numerical exactness rather than through the finite threshold steps used in standard implementations. This exactness comes from an optimized retrieval process that sums every cluster contribution for each voxel across the continuous range of thresholds. The standard method yields smaller p-values for more voxels, a pattern consistent with discretization effects in how evidence accumulates across thresholds, while eTFCE concentrates stronger evidence in a smaller subset of voxels; the differences are mostly confined to voxels near the significance boundary. The two approaches agree on overall inference conclusions, yet eTFCE runs faster on average (71.3% of the standard runtime) and supports computing several cluster statistics together inside one permutation test.

Core claim

eTFCE replaces the usual discretized approximation of the TFCE integral with a numerically exact computation performed by an optimized cluster retrieval algorithm that enumerates every contributing cluster at every threshold for every voxel.

What carries the argument

The optimized cluster retrieval algorithm that enumerates and sums cluster extents across all thresholds without discretization steps to evaluate the TFCE integral exactly.
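
For orientation, the quantity at stake can be written out. The block below uses the standard TFCE definition of Smith and Nichols (2009); the symbols $x_v$ (statistic at voxel $v$, assumed positive), $e_v(h)$ (extent of the cluster containing $v$ at threshold $h$) and $h_1 < \dots < h_K$ (the distinct positive voxel values) are notational assumptions made here and need not match the paper's. Because $e_v(h)$ can change only when $h$ crosses an observed voxel value, it is piecewise constant, so the integral reduces to a finite sum. That is why an exact evaluation is possible in principle once every cluster is retrieved.

```latex
\mathrm{TFCE}(v)
  = \int_{0}^{x_v} e_v(h)^{E}\, h^{H}\, \mathrm{d}h
  = \sum_{k:\; h_k \le x_v} e_v(h_k)^{E}\,
    \frac{h_k^{H+1} - h_{k-1}^{H+1}}{H+1},
\qquad 0 = h_0 < h_1 < \dots < h_K .
```

The commonly used exponents are $E = 0.5$ and $H = 2$. A discretized implementation instead samples $h$ on a fixed grid with step $\Delta h$, which is where approximation error can enter.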

If this is right

Standard TFCE implementations introduce small systematic biases in p-value accumulation that are removed by exact integration.
Runtime drops to roughly 71 percent of the standard method on average while preserving inference decisions.
Multiple cluster-based statistics can be computed together inside a single permutation framework without extra cost.
Discrepancies between exact and approximate TFCE remain confined to voxels near the statistical decision boundary.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The exact formulation could serve as a reference standard for validating future approximations in other integral-based neuroimaging statistics.
The retrieval algorithm might extend directly to related methods such as cluster-mass or thresholded extent statistics.
Adoption would allow cleaner meta-analyses across studies that currently differ only because of discretization choices.

Load-bearing premise

The retrieval algorithm correctly finds and accounts for every cluster contribution at every threshold level for every voxel without omissions or misidentifications.

What would settle it

A voxel-by-voxel comparison on a small test volume between eTFCE output and a brute-force numerical integration of the TFCE formula over thousands of finely spaced thresholds.
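
The check described above can be prototyped on a toy 1-D signal. The sketch below is illustrative only: the function names (`cluster_extents_1d`, `tfce_riemann`, `tfce_exact`), the 1-D setting, the exponents E = 0.5 and H = 2, and the plain Riemann sum are all assumptions made here, not the paper's implementation and not any particular package's stepping scheme. It compares a brute-force sum over finely spaced thresholds against the closed-form piecewise integral.

```python
import numpy as np


def cluster_extents_1d(x, h):
    """For each sample, length of the run of consecutive samples >= h
    that contains it (0 for samples below h)."""
    above = x >= h
    ext = np.zeros(x.size)
    i = 0
    while i < x.size:
        if above[i]:
            j = i
            while j < x.size and above[j]:
                j += 1
            ext[i:j] = j - i  # every sample in the run shares its extent
            i = j
        else:
            i += 1
    return ext


def tfce_riemann(x, dh, E=0.5, H=2.0):
    """Brute-force approximation: sum e(h)^E * h^H * dh over h = dh, 2*dh, ..."""
    out = np.zeros(x.size)
    for h in np.arange(dh, x.max() + dh / 2, dh):
        out += cluster_extents_1d(x, h) ** E * h ** H * dh
    return out


def tfce_exact(x, E=0.5, H=2.0):
    """Closed form: extents are constant between consecutive distinct values,
    so each interval (lo, hi] contributes e^E * (hi^(H+1) - lo^(H+1)) / (H+1)."""
    out = np.zeros(x.size)
    lo = 0.0
    for hi in np.unique(x[x > 0]):  # distinct positive values, ascending
        weight = (hi ** (H + 1) - lo ** (H + 1)) / (H + 1)
        out += cluster_extents_1d(x, hi) ** E * weight
        lo = hi
    return out


# Hypothetical toy signal used only for this demonstration.
x = np.array([0.0, 1.0, 3.0, 2.0, 0.0, 2.0, 1.0])
exact = tfce_exact(x)
for dh in (0.5, 0.1, 0.001):
    gap = np.max(np.abs(tfce_riemann(x, dh) - exact))
    print(f"dh={dh}: max |riemann - exact| = {gap:.4f}")
```

On a real volume the same comparison would need 3-D connectivity and an independent cluster labelling routine, so that the brute-force side shares no code with the method under test.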

Figures

Figures reproduced from arXiv: 2603.03004 by Jelle J. Goeman, Thomas E. Nichols, Wouter D. Weeda, Xu Chen.

**Figure 2.** Comparison of eTFCE and FSL’s default TFCE on the auditory task data (vocal vs non...

**Figure 3.** Example visualization of integrated TFCE and cluster mass inference (CMI) on HCP Emotion...

Original abstract

Threshold-free cluster enhancement (TFCE) is widely used for cluster-based inference in neuroimaging, but existing implementations typically rely on discretized approximations that may introduce numerical variability. We present eTFCE, an efficient framework that provides a numerically exact evaluation of the TFCE integral using an optimized cluster retrieval algorithm. Across multiple datasets, eTFCE and the standard implementation produce highly consistent inference results. Voxel-wise comparisons reveal a systematic asymmetry: the standard method yields smaller p-values for more voxels, while eTFCE concentrates stronger statistical evidence within a smaller subset. These differences are primarily confined to voxels near the inference boundary and have minimal impact on overall inference. This pattern is consistent with discretization effects in standard implementations, where the TFCE integral is approximated using a finite set of threshold levels, introducing subtle biases in statistical evidence accumulation across thresholds. Furthermore, eTFCE improves computational efficiency (71.3% of runtime on average) and enables unified computation of multiple cluster-based statistics within a single permutation framework. Overall, eTFCE provides an exact, efficient, and extensible approach to nonparametric neuroimaging inference.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

eTFCE replaces discretized TFCE with a fast cluster retrieval algorithm but offers no formal proof that the retrieval is exact.

The key point is that this paper gives an exact computation of the TFCE statistic through a fast cluster retrieval algorithm, replacing the discretized approximations that everyone has used so far. What works is the efficiency gain and the observation that the standard method has small systematic differences in p-values, mostly near the decision boundary. They report 71 percent of the runtime on average and show that multiple cluster statistics can be computed together in one permutation run. The results across datasets line up closely enough that inference conclusions stay the same in practice.

The main weakness is that exactness is asserted without a proof or a set of synthetic tests where the true TFCE value is known analytically. Matching the discretized version only shows they agree within the old error; it does not rule out new errors in how clusters are retrieved at every height. The abstract mentions the algorithm but does not detail why it enumerates every contribution correctly for arbitrary fields.

This paper is for statisticians and neuroimaging analysts who care about the numerical details of cluster enhancement. Someone implementing or extending nonparametric inference tools would find the retrieval approach useful to examine. I would send it to peer review. The efficiency numbers and the unified framework are concrete enough to justify referee time, even though the exactness part needs verification.

Referee Report

1 major / 2 minor

Summary. The paper introduces eTFCE, an efficient framework for numerically exact evaluation of the Threshold-Free Cluster Enhancement (TFCE) integral in neuroimaging via an optimized cluster retrieval algorithm. It reports high consistency with standard discretized TFCE implementations across multiple datasets, a systematic asymmetry in voxel-wise p-values (standard method smaller for more voxels, eTFCE stronger in fewer), efficiency gains averaging 71.3% of runtime, and the ability to unify computation of multiple cluster-based statistics in one permutation framework. Differences are attributed to discretization biases and are said to have minimal impact on overall inference.

Significance. If the exactness claim holds, the work would provide a meaningful advance for nonparametric inference in neuroimaging by removing discretization-induced variability in TFCE values. The reported efficiency improvement and unified statistic computation are practical strengths. The empirical consistency across datasets is a positive indicator, but the absence of formal verification limits the assessed impact.

major comments (1)

[Methods (cluster retrieval algorithm description)] The central claim of numerically exact TFCE evaluation rests on the optimized cluster retrieval algorithm correctly enumerating all supra-threshold clusters at every height h without omissions or misassignments. The manuscript describes the algorithm but provides neither a formal correctness argument (e.g., proof that retrieval order and merging rules preserve the continuous integral) nor exhaustive tests on synthetic fields with analytically known TFCE values. Empirical agreement with the discretized implementation only demonstrates consistency within discretization error and cannot detect systematic boundary or topology errors.

minor comments (2)

[Abstract and Results] Clarify the exact meaning of the reported 71.3% runtime figure (reduction to 71.3% versus savings of 71.3%) and specify the hardware and dataset sizes used for timing.
[Results] The voxel-wise asymmetry in p-values is described as confined to the inference boundary; provide a quantitative breakdown (e.g., fraction of voxels affected and magnitude of p-value shifts) to support the claim of minimal overall impact.
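
The ambiguity raised in the first minor comment matters because the two readings differ by a large factor. A short worked calculation, using only the 71.3% figure from the abstract, makes the gap explicit; the abstract's wording ("71.3% of runtime") suggests the first reading.

```latex
\begin{aligned}
\text{Reading 1 (runtime ratio):}\quad
  & t_{\text{eTFCE}} = 0.713\, t_{\text{std}}
  \;\Rightarrow\; \text{savings} = 1 - 0.713 = 28.7\%,
  \quad \text{speedup} = \tfrac{1}{0.713} \approx 1.40\times \\
\text{Reading 2 (runtime savings):}\quad
  & t_{\text{eTFCE}} = (1 - 0.713)\, t_{\text{std}} = 0.287\, t_{\text{std}}
  \;\Rightarrow\; \text{speedup} = \tfrac{1}{0.287} \approx 3.48\times
\end{aligned}
```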

Simulated Author's Rebuttal

1 response · 1 unresolved

We thank the referee for their constructive feedback on the manuscript. We address the major comment on the cluster retrieval algorithm below.

Referee: The central claim of numerically exact TFCE evaluation rests on the optimized cluster retrieval algorithm correctly enumerating all supra-threshold clusters at every height h without omissions or misassignments. The manuscript describes the algorithm but provides neither a formal correctness argument (e.g., proof that retrieval order and merging rules preserve the continuous integral) nor exhaustive tests on synthetic fields with analytically known TFCE values. Empirical agreement with the discretized implementation only demonstrates consistency within discretization error and cannot detect systematic boundary or topology errors.

Authors: We appreciate the referee highlighting the need for rigorous validation of exactness. The algorithm sorts voxels by decreasing intensity and maintains connectivity via a dynamic union-find structure, updating the TFCE integral exactly at each critical height where clusters merge or split. This directly implements the continuous integral without discretization steps. While the original manuscript lacks a formal proof, the approach follows standard properties of connected-component tracking on voxel grids. We will revise to include pseudocode and add exhaustive validation on synthetic fields with analytically known TFCE values (e.g., constant-intensity blobs and simple Gaussian fields). We agree that empirical consistency with the discretized method alone cannot confirm exactness and will update the discussion to reflect the new tests. The reported boundary asymmetries align with expected discretization bias.

Revision: partial
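
The sort-and-merge idea in the response can be illustrated with a small union-find sketch. This is a hedged toy version, not the authors' algorithm: the function name `tfce_union_find`, the 2-D grid, 4-connectivity and the exponents E = 0.5 and H = 2 are assumptions made here. It sweeps the distinct voxel values from high to low, merges newly activated voxels into neighbouring clusters, and credits every active voxel with the exact integral over the current threshold interval.

```python
import numpy as np


def tfce_union_find(img, E=0.5, H=2.0):
    """Exact TFCE on a 2-D array with 4-connectivity (illustrative sketch)."""
    rows, cols = img.shape
    flat = img.ravel()
    n = flat.size
    parent = list(range(n))   # union-find forest
    size = [1] * n            # cluster size, valid at root nodes
    active = np.zeros(n, dtype=bool)
    out = np.zeros(n)

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra == rb:
            return
        if size[ra] < size[rb]:
            ra, rb = rb, ra
        parent[rb] = ra       # attach smaller tree under larger
        size[ra] += size[rb]

    levels = np.unique(flat[flat > 0])  # distinct positive values, ascending
    for k in range(len(levels) - 1, -1, -1):  # sweep thresholds downward
        hi = levels[k]
        lo = levels[k - 1] if k > 0 else 0.0
        # Activate voxels at this height and merge with active neighbours.
        for i in np.flatnonzero(flat == hi):
            i = int(i)
            active[i] = True
            r, c = divmod(i, cols)
            for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                if 0 <= rr < rows and 0 <= cc < cols:
                    j = rr * cols + cc
                    if active[j]:
                        union(i, j)
        # Clusters are fixed for h in (lo, hi]; integrate h^H exactly there.
        weight = (hi ** (H + 1) - lo ** (H + 1)) / (H + 1)
        for i in np.flatnonzero(active):
            out[int(i)] += size[find(int(i))] ** E * weight
    return out.reshape(img.shape)


# Hypothetical toy input used only for this demonstration.
demo = np.array([[0.0, 1.0, 3.0, 2.0, 0.0, 2.0, 1.0]])
print(tfce_union_find(demo))
```

The inner loop over all active voxels at every level is deliberately naive and costs time proportional to the number of voxels times the number of distinct values. An optimized retrieval would presumably avoid that, for example by accumulating contributions per cluster rather than per voxel, but that design is not specified in the text above.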

standing simulated objections not resolved

Formal mathematical proof of the cluster retrieval algorithm's correctness

Circularity Check

0 steps flagged

No circularity: algorithmic reformulation independent of inputs

The paper's central claim is that eTFCE computes the TFCE integral exactly via an optimized cluster retrieval algorithm, presented as a direct reformulation rather than a fit or self-referential prediction. No equations or steps in the abstract or described content reduce by construction to the inputs (e.g., no parameter fitted to data then renamed as prediction, no self-citation chain justifying uniqueness, no ansatz smuggled in). The derivation chain is self-contained as a computational method whose correctness is asserted via description of the algorithm, not by re-expressing the standard TFCE as itself. This matches the default expectation for non-circular papers; the skeptic concern is about missing proof, which is a correctness issue outside circularity analysis.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The method assumes standard permutation testing and cluster connectivity rules from prior neuroimaging literature; the only new element is the exact integration step, which rests on the domain assumption that cluster retrieval can be made exact and complete.

axioms (1)

Domain assumption: The TFCE integral can be exactly recovered by enumerating clusters across all thresholds using an optimized retrieval procedure.
This assumption underpins the claim of numerical exactness and is invoked in the description of the framework.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Hybrid eTFCE-GRF: Exact Cluster-Size Retrieval with Analytical p-Values for Voxel-Based Morphometry
eess.IV · 2026-03 · conditional · novelty 6.0

Hybrid eTFCE-GRF retrieves exact cluster sizes via union-find in one pass and converts them to analytical p-values using GRF theory, delivering permutation-free TFCE inference that is faster and exact.

Reference graph

Works this paper leans on

17 extracted references · 17 canonical work pages · cited by 1 Pith paper

[1] Andreella, A., Hemerik, J., Finos, L., Weeda, W., and Goeman, J. (2023). Permutation-based true discovery proportions for functional magnetic resonance imaging cluster analysis. Statistics in Medicine, 42(14):2311–2340.

[2] Johansen-Berg, H., Snyder, A., Van Essen, D., and Consortium, W.-M. H. (2013). Function in the human connectome: task-fMRI and individual differences in behavior. NeuroImage, 80:169–189.

[3] Chen, X., Goeman, J. J., Krebs, T. J. P., Meijer, R. J., and Weeda, W. D. (2023). Adaptive cluster thresholding with spatial activation guarantees using all-resolutions inference.

[4] Davenport, S. (2024). Statbrainz: A MATLAB package for analysing brain imaging data using statistics. https://github.com/sjdavenport/Statbrainz

[5] Eklund, A., Nichols, T. E., and Knutsson, H. (2016). Cluster failure: why fMRI inferences for spatial extent have inflated false-positive rates. Proceedings of the National Academy of Sciences of the United States of America, 113(28):7900–7905.

[6] Glasser, M. F., Sotiropoulos, S. N., Wilson, J. A., Coalson, T. S., Fischl, B., Andersson, J. L., Xu, J., Jbabdi, S., Webster, M., Polimeni, J. R., Van Essen, D. C., Jenkinson, M., and Consortium, W.-M. H. (2013). The minimal preprocessing pipelines for the human connectome project. NeuroImage, 80:105–124.

[7] Jenkinson, M., Beckmann, C. F., Behrens, T. E., Woolrich, M. W., and Smith, S. M. (2012). FSL. NeuroImage, 62(2):782–790.

[8] Nichols, T. E. and Holmes, A. P. (2002). Nonparametric permutation tests for functional neuroimaging: a primer with examples. Human Brain Mapping, 15(1):1–25.

[9] Noble, S., Scheinost, D., and Constable, R. T. (2020). Cluster failure or power failure? Evaluating sensitivity in cluster-level inference. NeuroImage, 209:116468.

[10] Pernet, C. R., McAleer, P., Latinus, M., Gorgolewski, K. J., Charest, I., Bestelmeyer, P. E. G., Watson, R. H., Fleming, D., Crabbe, F., Valdes-Sosa, M., and Belin, P. (2015). The human voice areas: spatial organization and inter-individual variability in temporal and extra-temporal cortices. NeuroImage, 119:164–174.

[11] Rosenblatt, J. D., Finos, L., Weeda, W. D., Solari, A., and Goeman, J. J. (2018). All-resolutions inference for brain imaging. NeuroImage, 181:786–796.

[12] Salimi-Khorshidi, G., Smith, S. M., and Nichols, T. E. (2011). Adjusting the effect of nonstationarity in cluster-based and TFCE inference. NeuroImage, 54(3):2006–19.

[13] Smith, S. M. and Nichols, T. E. (2009). Threshold-free cluster enhancement: addressing problems of smoothing, threshold dependence and localisation in cluster inference. NeuroImage, 44(1):83–98. Spisák, T., Spisák, Z., Zunhammer, M., Bingel, U., Smith, S., Nichols, T., and Kincses, T. (2019). Probabilistic TFCE: A generalized combination of cluster siz...

[14] Weeda, W. D., Davenport, S., and Goeman, J. J. (2025). Localized cluster enhancement (LCE): improving threshold free cluster enhancement (TFCE) for better localization of brain activity.

[15] Whitfield-Gabrieli, S. and Nieto-Castanon, A. (2012). Conn: a functional connectivity toolbox for correlated and anticorrelated brain networks. Brain Connectivity, 2(3):125–141.

[16] Winkler, A., Ridgway, G., Webster, M., Smith, S., and Nichols, T. (2014). Permutation inference for the general linear model. NeuroImage, 92(100):381–97.

[17] Woo, C. W., Krishnan, A., and Wager, T. D. (2014). Cluster-extent based thresholding in fMRI analyses: pitfalls and recommendations. NeuroImage, 91:412–419.
