Statistical evaluation of measurement precision in linear dose-response relationships via interlaboratory studies
Pith reviewed 2026-05-13 02:11 UTC · model grok-4.3
The pith
For balanced interlaboratory designs, a linear mixed-effects model yields exact ANOVA estimators of repeatability and between-laboratory variances along with F-tests for trend, intercept homogeneity, and slope homogeneity.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
For fully balanced designs with common dose levels and equal replication, the total sum of squares decomposes exactly, closed-form ANOVA estimators recover the repeatability and between-laboratory variances, and three F-tests assess the overall dose-response trend, the homogeneity of intercepts, and the homogeneity of slopes across laboratories. This formulation quantifies precision directly at the method level and distinguishes whether between-laboratory discrepancies stem primarily from baseline shifts or from differences in sensitivity.
What carries the argument
The linear mixed-effects model with laboratory-specific intercepts and slopes, together with the exact sum-of-squares decomposition that holds only under fully balanced designs.
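A hedged sketch of that structure, in notation assumed for this note rather than taken from the paper: $p$ laboratories, $q$ common dose levels $x_j$, $n$ replicates per cell, centered doses $c_j = x_j - \bar{x}$ with $S_{xx} = \sum_j c_j^2$, random effects $a_i$ and $b_i$, and repeatability error $e_{ijk}$. Here $\hat\beta_i$ is the within-laboratory least-squares slope and $\hat\beta$ is the average of the $\hat\beta_i$.

```latex
\begin{align*}
y_{ijk} &= (\alpha + a_i) + (\beta + b_i)\,c_j + e_{ijk},
  \qquad i = 1,\dots,p,\; j = 1,\dots,q,\; k = 1,\dots,n, \\
\hat\beta_i &= \frac{1}{S_{xx}} \sum_{j=1}^{q} c_j\,\bar{y}_{ij\cdot},
  \qquad \hat\beta = \frac{1}{p} \sum_{i=1}^{p} \hat\beta_i, \\
\sum_{i,j,k} (y_{ijk} - \bar{y}_{\cdot\cdot\cdot})^2
  &= \underbrace{p\,n\,S_{xx}\,\hat\beta^{2}}_{SS_{\mathrm{trend}}}
   + \underbrace{q\,n \sum_{i} (\bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot\cdot\cdot})^2}_{SS_{\mathrm{int}}}
   + \underbrace{n\,S_{xx} \sum_{i} (\hat\beta_i - \hat\beta)^2}_{SS_{\mathrm{slope}}}
   + \underbrace{\sum_{i,j,k} (y_{ijk} - \bar{y}_{i\cdot\cdot} - \hat\beta_i c_j)^2}_{SS_{\mathrm{res}}}.
\end{align*}
```

The cross terms vanish because $\sum_j c_j = 0$ and $\sum_i (\hat\beta_i - \hat\beta) = 0$, and because within-laboratory residuals are orthogonal to the constant and to $c_j$. With unequal replication or laboratory-specific dose grids the shared centering no longer holds, which is why the identity is tied to balance. The paper's own partition may differ, for instance by splitting the residual into lack of fit and pure error.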
If this is right
- Repeatability and between-laboratory variances become directly estimable without iterative numerical methods.
- Analysts can test whether between-laboratory variation is driven more by intercept differences or by slope differences.
- The three F-tests provide separate significance statements for overall dose response, baseline consistency, and sensitivity consistency.
- Precision metrics remain defined even when laboratories differ systematically in level or in dose sensitivity.
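The consequences listed above can be illustrated with a minimal sketch, assuming a standard balanced random-coefficient ANOVA with centered doses. The function name `balanced_anova`, the data layout `y[i][j][k]`, the simulated values, and the choice of the slope mean square as the denominator of the trend test are assumptions of this sketch, not the paper's stated formulas.

```python
import random

def balanced_anova(y, doses):
    """y[i][j][k]: lab i, dose j, replicate k; doses: common dose levels.

    Hypothetical helper: assumes a fully balanced design with q*n > 2.
    """
    p, q, n = len(y), len(doses), len(y[0][0])
    xbar = sum(doses) / q
    c = [x - xbar for x in doses]                 # centered doses, sum to zero
    sxx = sum(cj * cj for cj in c)
    cell = [[sum(y[i][j]) / n for j in range(q)] for i in range(p)]
    lab_mean = [sum(cell[i]) / q for i in range(p)]
    grand = sum(lab_mean) / p
    # Per-laboratory least-squares slopes and their average
    slope = [sum(c[j] * cell[i][j] for j in range(q)) / sxx for i in range(p)]
    slope_bar = sum(slope) / p

    ss_total = sum((v - grand) ** 2 for lab in y for cellv in lab for v in cellv)
    ss_trend = p * n * sxx * slope_bar ** 2
    ss_int = q * n * sum((m - grand) ** 2 for m in lab_mean)
    ss_slope = n * sxx * sum((b - slope_bar) ** 2 for b in slope)
    ss_res = sum((y[i][j][k] - lab_mean[i] - slope[i] * c[j]) ** 2
                 for i in range(p) for j in range(q) for k in range(n))

    df_lab, df_res = p - 1, p * (q * n - 2)
    ms_int, ms_slope, ms_res = ss_int / df_lab, ss_slope / df_lab, ss_res / df_res
    return {
        "ss": {"total": ss_total, "trend": ss_trend, "intercept": ss_int,
               "slope": ss_slope, "residual": ss_res},
        "var_repeatability": ms_res,
        "var_intercept": (ms_int - ms_res) / (q * n),   # may come out negative
        "var_slope": (ms_slope - ms_res) / (n * sxx),   # may come out negative
        "F_trend": ss_trend / ms_slope,      # assumed denominator: slope mean square
        "F_intercept": ms_int / ms_res,
        "F_slope": ms_slope / ms_res,
    }

# Illustrative data only: 4 hypothetical laboratories, 4 doses, 3 replicates.
rng = random.Random(0)
doses = [0.0, 1.0, 2.0, 4.0]
data = []
for _ in range(4):
    a, b = rng.gauss(0, 1.0), rng.gauss(0, 0.3)
    data.append([[10 + a + (2 + b) * x + rng.gauss(0, 0.5) for _ in range(3)]
                 for x in doses])
result = balanced_anova(data, doses)
print(result)
```

The four components in `result["ss"]` add up to the total sum of squares, and the variance estimates come directly from differences of mean squares, with no iterative fitting.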
Where Pith is reading between the lines
- The same decomposition might guide sample-size planning for future balanced interlaboratory protocols.
- When designs are mildly unbalanced, the closed-form estimators could serve as starting values for more general mixed-model fitting.
- The separation of intercept and slope effects could help regulators decide whether harmonization efforts should target calibration offsets or assay sensitivity.
Load-bearing premise
The data-generating process follows a linear mixed-effects model with lab-specific intercepts and slopes, and the experimental design is fully balanced with common dose levels and equal replication.
What would settle it
In a fully balanced interlaboratory study, or in data simulated from the assumed model, compute the proposed ANOVA estimators and check whether they produce negative variance components far more often than sampling error would explain, or whether the three F-statistics fail to follow their expected central F distributions under the null hypotheses of no trend, intercept homogeneity, and slope homogeneity.
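Such a check could be run by simulation. The sketch below is one hypothetical way to do it for the slope-homogeneity component only, assuming normal errors, a standard balanced ANOVA with centered doses, and made-up parameter values; the helper names `slope_f_and_variance` and `simulate` are illustrative and not from the paper. Under the null of no slope variance, the ANOVA estimate is negative whenever the F-ratio falls below 1, which happens in a sizeable share of runs even when the model is correct.

```python
import random

def slope_f_and_variance(y, doses):
    """Return (F_slope, ANOVA slope-variance estimate) for balanced y[i][j][k]."""
    p, q, n = len(y), len(doses), len(y[0][0])
    xbar = sum(doses) / q
    c = [x - xbar for x in doses]                 # centered doses
    sxx = sum(cj * cj for cj in c)
    slopes, ss_res = [], 0.0
    for lab in y:
        mean_i = sum(sum(cell) for cell in lab) / (q * n)
        b_i = sum(c[j] * sum(lab[j]) for j in range(q)) / (n * sxx)
        slopes.append(b_i)
        # Residuals around this laboratory's own fitted line
        ss_res += sum((v - mean_i - b_i * c[j]) ** 2
                      for j in range(q) for v in lab[j])
    b_bar = sum(slopes) / p
    ms_slope = n * sxx * sum((b - b_bar) ** 2 for b in slopes) / (p - 1)
    ms_res = ss_res / (p * (q * n - 2))
    return ms_slope / ms_res, (ms_slope - ms_res) / (n * sxx)

def simulate(rng, sd_slope, p=5, doses=(0.0, 1.0, 2.0, 4.0), n=3):
    """Draw one balanced data set; all parameter values are illustrative."""
    labs = []
    for _ in range(p):
        a, b = rng.gauss(0.0, 1.0), rng.gauss(0.0, sd_slope)
        labs.append([[10 + a + (2 + b) * x + rng.gauss(0.0, 0.5)
                      for _ in range(n)] for x in doses])
    return labs, doses

rng = random.Random(1)
runs = 500
null_runs = [slope_f_and_variance(*simulate(rng, sd_slope=0.0)) for _ in range(runs)]
negative_fraction = sum(1 for _, v in null_runs if v < 0) / runs
mean_f_null = sum(f for f, _ in null_runs) / runs

alt_runs = [slope_f_and_variance(*simulate(rng, sd_slope=1.0)) for _ in range(200)]
mean_f_alt = sum(f for f, _ in alt_runs) / 200

print("share of negative slope-variance estimates under the null:", negative_fraction)
print("mean F under the null:", mean_f_null, "and with slope variance:", mean_f_alt)
```

A fuller check would compare the simulated F-ratios against quantiles of the reference F distribution and repeat the exercise for the trend and intercept tests.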
Original abstract
This paper proposes a framework for evaluating the statistical precision of measurement methods from interlaboratory studies where the outcome is a dose-response relationship summarized by a regression line. For such measurement methods, where a linear mixed-effects model is applied that allows laboratories to differ in both baseline level and dose-response slope, we define precision evaluation metrics specified in ISO 5725, repeatability and between-laboratory variances. These are method-level precision metrics, and the latter are constructed as design-averaged dose-specific between-laboratory variances over the dose levels and the participating laboratories. For fully balanced designs with common dose levels and equal replication, we obtain an exact decomposition of the total sum of squares, closed-form analysis of variance (ANOVA) estimators of the precision variances, and three associated $F$-tests targeting (i) the overall dose-response trend, (ii) homogeneity of intercepts, and (iii) homogeneity of slopes across laboratories. This formulation enables precision to be quantified and estimated directly and supports an evaluation of whether between-laboratory discrepancies are caused primarily by baseline shifts or by differences in sensitivity, in contrast to fixed-effect comparisons that only detect the presence of differences. Furthermore, we analyze data obtained from an interlaboratory study on observations in bronchoalveolar lavage fluid from experiments involving the intratracheal administration of nanomaterials to rats, using the proposed method as a case study.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes a framework for evaluating measurement precision in interlaboratory studies of linear dose-response relationships. It applies a linear mixed-effects model with random laboratory-specific intercepts and slopes, defines repeatability and between-laboratory precision variances following ISO 5725, and constructs the latter as a design-averaged dose-specific quantity. For fully balanced designs with common dose levels and equal replication, the paper claims an exact total sum-of-squares decomposition, closed-form ANOVA estimators of the precision variances, and three F-tests for the overall dose-response trend, homogeneity of intercepts, and homogeneity of slopes. The approach is demonstrated in a case study on bronchoalveolar lavage fluid observations from experiments involving intratracheal administration of nanomaterials to rats.
Significance. If the central derivations hold, the work provides a useful extension of ISO 5725-style precision metrics to dose-response settings by separating baseline shifts from sensitivity differences across laboratories. This distinction is practically relevant for standardizing measurement methods in toxicology and analytical chemistry, and the closed-form estimators for balanced designs represent a clear computational advantage over general REML fitting.
major comments (1)
- The exact sum-of-squares decomposition and closed-form ANOVA estimators are asserted for fully balanced designs under the random-coefficient model, but the manuscript should explicitly equate the observed mean squares to their expectations (including the design-averaged between-laboratory variance functional) to confirm that the three F-tests are the standard ratios for fixed slope, random-intercept component, and random-slope component.
minor comments (3)
- In the case-study section, report the numerical values of the estimated variance components, the design-averaged between-laboratory variances at each dose, and the p-values of the three F-tests so that readers can assess the practical magnitude of intercept versus slope heterogeneity.
- Clarify the precise definition and weighting scheme used for the design-averaged between-laboratory variance; although it is a well-defined functional of the estimated random-coefficient covariance matrix, the averaging weights over dose levels and laboratories should be stated explicitly.
- Add a brief simulation study (or reference to one) confirming that the closed-form estimators recover the true variance components under the balanced design and the stated linear mixed model.
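One possible explicit weighting for the design-averaged variance raised in the minor comments, offered as an assumption rather than as the paper's definition: give each of the $q$ common dose levels equal weight and treat laboratories as exchangeable draws. In the notation assumed here, $a_i$ is the random intercept at the mean dose, $b_i$ the random slope, $\sigma_{ab}$ their covariance, and $c_j = x_j - \bar{x}$ the centered dose.

```latex
\begin{align*}
\sigma_L^2(x_j) &= \operatorname{Var}(a_i + b_i c_j)
  = \sigma_a^2 + 2 c_j \sigma_{ab} + c_j^2 \sigma_b^2, \\
\bar{\sigma}_L^2 &= \frac{1}{q} \sum_{j=1}^{q} \sigma_L^2(x_j)
  = \sigma_a^2 + \frac{S_{xx}}{q}\,\sigma_b^2,
  \qquad S_{xx} = \sum_{j=1}^{q} c_j^2 .
\end{align*}
```

Under equal weights the covariance term drops out because the centered doses sum to zero; other weightings would generally retain it, which is one reason the weights need to be stated.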
Simulated Author's Rebuttal
We thank the referee for the positive evaluation and constructive comment. The suggestion to make the expected mean squares explicit improves the clarity of the ANOVA justification, and we have incorporated this into the revised manuscript.
Point-by-point responses
- Referee: The exact sum-of-squares decomposition and closed-form ANOVA estimators are asserted for fully balanced designs under the random-coefficient model, but the manuscript should explicitly equate the observed mean squares to their expectations (including the design-averaged between-laboratory variance functional) to confirm that the three F-tests are the standard ratios for fixed slope, random-intercept component, and random-slope component.
Authors: We agree that an explicit derivation of the expected mean squares strengthens the presentation. In the revised manuscript we have added a new subsection (Section 3.3) that equates each observed mean square to its model expectation under the balanced random-coefficient design. The derivation shows that the design-averaged between-laboratory variance functional appears precisely in the expectation of the mean square for the slope-homogeneity test, while the intercept-homogeneity test isolates the random-intercept component and the overall-trend test isolates the fixed slope. Consequently the three reported F-ratios are the standard ANOVA ratios for these respective hypotheses.
Revision: yes
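For orientation, a sketch of what such expected mean squares look like under one assumed parametrization, not verified against the paper's Section 3.3: the model is $y_{ijk} = (\alpha + a_i) + (\beta + b_i) c_j + e_{ijk}$ with centered doses $c_j$, $S_{xx} = \sum_j c_j^2$, $\operatorname{Var}(a_i) = \sigma_a^2$, $\operatorname{Var}(b_i) = \sigma_b^2$, $\operatorname{Var}(e_{ijk}) = \sigma_r^2$, and $p$ laboratories, $q$ doses, $n$ replicates. The mean squares are for trend (1 degree of freedom), intercepts and slopes ($p-1$ each), and residual ($p(qn-2)$).

```latex
\begin{align*}
E[MS_{\mathrm{res}}]   &= \sigma_r^2, \\
E[MS_{\mathrm{int}}]   &= \sigma_r^2 + q n\,\sigma_a^2, \\
E[MS_{\mathrm{slope}}] &= \sigma_r^2 + n S_{xx}\,\sigma_b^2, \\
E[MS_{\mathrm{trend}}] &= \sigma_r^2 + n S_{xx}\,\sigma_b^2 + p n S_{xx}\,\beta^2, \\[4pt]
\hat\sigma_r^2 = MS_{\mathrm{res}}, \quad
\hat\sigma_a^2 &= \frac{MS_{\mathrm{int}} - MS_{\mathrm{res}}}{q n}, \quad
\hat\sigma_b^2 = \frac{MS_{\mathrm{slope}} - MS_{\mathrm{res}}}{n S_{xx}}, \\
F_{\mathrm{trend}} = \frac{MS_{\mathrm{trend}}}{MS_{\mathrm{slope}}}, \quad
F_{\mathrm{int}} &= \frac{MS_{\mathrm{int}}}{MS_{\mathrm{res}}}, \quad
F_{\mathrm{slope}} = \frac{MS_{\mathrm{slope}}}{MS_{\mathrm{res}}}.
\end{align*}
```

Under normality these ratios follow central $F$ distributions with $(1, p-1)$, $(p-1, p(qn-2))$, and $(p-1, p(qn-2))$ degrees of freedom under the respective nulls $\beta = 0$, $\sigma_a^2 = 0$, and $\sigma_b^2 = 0$. In this parametrization an equal-weight dose average of the between-laboratory variance, $\sigma_a^2 + (S_{xx}/q)\,\sigma_b^2$, would be estimated by $(MS_{\mathrm{int}} + MS_{\mathrm{slope}} - 2\,MS_{\mathrm{res}})/(qn)$.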
Circularity Check
No significant circularity detected
Full rationale
The paper's central claims concern standard derivations for balanced linear mixed models: an exact sum-of-squares decomposition follows from the orthogonality of fixed and random effects in fully balanced designs with common dose levels and equal replication; closed-form ANOVA estimators arise by equating observed mean squares to their expectations under the random-coefficient model; and the three F-tests are the usual ratios of mean squares for the fixed slope, random-intercept component, and random-slope component. Precision metrics are defined directly from the model's variance components using ISO 5725 terminology, with the between-laboratory variance expressed as a design-averaged functional; these definitions do not reduce any claimed result to a fitted parameter by construction. No load-bearing self-citations, uniqueness theorems, or ansatzes are invoked, and the case-study application simply plugs the derived estimators into observed data without circular loops.
Axiom & Free-Parameter Ledger
axioms (2)
- Domain assumption: a linear mixed-effects model with lab-specific random intercepts and slopes is appropriate for the interlaboratory dose-response data.
- Domain assumption: the ISO 5725 definitions of repeatability and between-laboratory variances extend directly to method-level summaries of regression lines.