The Measurement of Statistical Evidence as the Basis for Statistical Reasoning

Michael Evans

arxiv: 1906.09484 · v1 · pith:LPIXXES2new · submitted 2019-06-22 · 🧮 math.ST · stat.TH

Pith reviewed 2026-05-25 17:45 UTC · model grok-4.3

keywords statistical evidence · statistical reasoning · evidence measurement · statistical methodologies · contradictory conclusions · data analysis · statistical theory

The pith

Precise measurement of statistical evidence resolves contradictions among different statistical methodologies.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Different statistical approaches often reach contradictory conclusions from the same data because none of them states explicitly how the evidence in that data is to be characterized and measured. The paper develops a theory of statistical reasoning that treats the precise measurement of statistical evidence as its foundation. This approach is presented as a way to eliminate the inconsistencies that arise when methodologies remain vague about evidence. A sympathetic reader would care because it offers a path toward consistent conclusions in data-driven work across science and policy.

Core claim

There are various approaches to the problem of how one is supposed to conduct a statistical analysis. Different analyses can lead to contradictory conclusions in some problems so this is not a satisfactory state of affairs. It seems that all approaches make reference to the evidence in the data concerning questions of interest as a justification for the methodology employed. It is fair to say, however, that none of the most commonly used methodologies is absolutely explicit about how statistical evidence is to be characterized and measured. Developing a theory based on being precise about statistical evidence leads to the resolution of a number of problems.

What carries the argument

The explicit characterization and measurement of statistical evidence, which forms the basis for a unified theory of statistical reasoning.

If this is right

Statistical methodologies will become consistent when they are all grounded in the same explicit measure of evidence.
Ambiguities that currently allow contradictory conclusions from the same data will be removed.
Statistical reasoning will rest on a single, evidence-based foundation rather than competing implicit references to evidence.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The theory could provide a common language for comparing the strength of evidence produced by frequentist, Bayesian, and other frameworks.
It might guide the choice of statistical procedures by quantifying which ones extract more evidence for a given question.
In applied fields, consistent evidence measures could reduce disputes over the interpretation of the same dataset.

Load-bearing premise

That explicitly characterizing and measuring statistical evidence will resolve contradictions among existing statistical methodologies.

What would settle it

A specific statistical problem in which two different analyses, each using an explicit and precise measure of the evidence in the data, still reach incompatible conclusions about the same question.

Figures

Figures reproduced from arXiv: 1906.09484 by Michael Evans.

**Figure 2.** Bias in favor of µ maximized over µ ± δ based on a N(0, 1) prior with σ0 = 1, n = 20, δ = 0.5.

| n | (µ∗, τ∗) = (0, 1), δ = 1.0 | (µ∗, τ∗) = (0, 1), δ = 0.5 |
| --- | --- | --- |
| 5 | 0.451 | 0.798 |
| 10 | 0.185 | 0.690 |
| 20 | 0.025 | 0.486 |
| 50 | 0.000 | 0.131 |
| 100 | 0.000 | 0.009 |

Original abstract

There are various approaches to the problem of how one is supposed to conduct a statistical analysis. Different analyses can lead to contradictory conclusions in some problems so this is not a satisfactory state of affairs. It seems that all approaches make reference to the evidence in the data concerning questions of interest as a justification for the methodology employed. It is fair to say, however, that none of the most commonly used methodologies is absolutely explicit about how statistical evidence is to be characterized and measured. We will discuss the general problem of statistical reasoning and the development of a theory for this that is based on being precise about statistical evidence. This will be shown to lead to the resolution of a number of problems.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

Evans pushes for an explicit theory of statistical evidence to fix contradictory conclusions across methods, but the abstract gives no derivations or examples to show it works.

The main takeaway is that this paper identifies a real problem, inconsistent statistical conclusions, and argues that pinning down exactly what 'evidence' means will clean it up. It states the issue plainly: methods disagree on some problems, and all of them invoke evidence without being precise about how to measure it. That observation is fair and worth repeating in foundations discussions.

What the paper does well is keep the focus on measurement rather than jumping straight to procedures or priors. If the body actually develops a workable characterization and applies it to cases where, say, p-values and likelihood ratios clash, that would be the useful part.

From the abstract alone, though, nothing new appears: no formal definition, no derivation, and no concrete resolution of a contradiction. The claim that precision 'will be shown' to resolve problems stays at the level of a promise. The soft spot is exactly there: without at least one worked example or a sketch of the measurement rule, the central claim rests on assertion rather than demonstration.

This is aimed at readers who already care about the foundations of statistical reasoning: people thinking about relative belief, evidence functions, or why frequentist and Bayesian answers sometimes diverge. It is not for someone needing a new tool or theorem to apply tomorrow. A serious editor could send it to referees if the full text supplies the missing mechanism and checks it against existing literature; otherwise it stays too programmatic to justify the time.

Referee Report

1 major / 0 minor

Summary. The manuscript discusses inconsistencies among statistical methodologies that can produce contradictory conclusions. It notes that all approaches implicitly reference statistical evidence in the data but none provides an explicit characterization and measurement of that evidence. The paper proposes developing a theory of statistical reasoning grounded in precise measurement of statistical evidence and asserts that this will resolve a number of problems in the field.

Significance. An explicit, operational theory of statistical evidence that demonstrably reconciles conflicting methodologies would be of high significance for statistical theory and practice. The manuscript's conceptual framing identifies a genuine gap, but its significance is limited by the absence of concrete measurement procedures, derivations, or examples showing resolution of specific contradictions.

major comments (1)

[Abstract] The central assertion that the proposed theory 'will be shown to lead to the resolution of a number of problems' is not supported by any formal definition of the evidence measure, derivation, worked example, or data analysis demonstrating resolution of a methodological contradiction. This leaves the primary claim unsubstantiated.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive comments. The manuscript presents a conceptual program for grounding statistical reasoning in an explicit theory of evidence measurement. We address the single major comment below.

Referee: [Abstract] The central assertion that the proposed theory 'will be shown to lead to the resolution of a number of problems' is not supported by any formal definition of the evidence measure, derivation, worked example, or data analysis demonstrating resolution of a methodological contradiction. This leaves the primary claim unsubstantiated.

Authors: The referee is correct that the manuscript does not contain a fully operational evidence measure together with derivations and concrete examples that resolve specific contradictions. The paper is framed as a discussion of the general problem and the rationale for developing such a theory; the phrase 'will be shown' in the abstract is therefore prospective rather than a claim of completed demonstrations within this work. We will revise the abstract to remove the forward-looking claim and instead describe the manuscript as outlining the motivation and conceptual basis for an evidence-centered approach, with detailed resolutions reserved for subsequent papers.

Revision: yes

Circularity Check

0 steps flagged

No significant circularity; derivation remains conceptual and self-contained

The manuscript is a high-level discussion paper whose central claim is that an explicit theory of statistical evidence will resolve methodological contradictions. The abstract and available text supply no equations, no fitted parameters, no self-citations invoked as uniqueness theorems, and no derivations that reduce a claimed result to its own inputs by construction. No load-bearing steps of the enumerated kinds are present; the argument is framed as a program rather than a closed formal chain. This is the expected non-finding for a foundational conceptual paper.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review; no free parameters, axioms, or invented entities can be identified from the provided text.

pith-pipeline@v0.9.0 · 5627 in / 929 out tokens · 22553 ms · 2026-05-25T17:45:46.902021+00:00 · methodology

Reference graph

Works this paper leans on

2 extracted references · 2 canonical work pages

[1]

pure frequentist

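with $\sigma_0^2$ known and $\pi$ is a $N(\mu_*, \tau_*^2)$ prior and the hypothesis is $H_0 : \mu = \mu_0$. So

$$RB(\mu_0 \mid x) = \left(1 + \frac{n\tau_*^2}{\sigma_0^2}\right)^{1/2} \exp\left\{ -\frac{1}{2}\left(1 + \frac{\sigma_0^2}{n\tau_*^2}\right)^{-1} \left( \frac{\sqrt{n}\,(\bar{x} - \mu_0)}{\sigma_0} + \frac{\sigma_0(\mu_* - \mu_0)}{\sqrt{n}\,\tau_*^2} \right)^2 + \frac{(\mu_0 - \mu_*)^2}{2\tau_*^2} \right\},$$

which, in this case is the same as the Bayes factor for $\mu_0$ obtained via Jeffreys' mixture approach. From this it is easy to se...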

work page 2014
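
As a sanity check on the relative belief ratio quoted in entry [1], the following minimal Python sketch evaluates the closed form and compares it with the ratio of the posterior density to the prior density at µ0, which is what a relative belief ratio measures. The function names are hypothetical, and the sketch assumes that the variance in the final exponent term is the prior variance τ∗², the only prior variance defined in the excerpt.

```python
import math


def rb_closed_form(mu0, xbar, n, sigma0, mu_star, tau_star):
    """Closed-form RB(mu0 | x) for N(mu, sigma0^2) data with a N(mu_star, tau_star^2) prior."""
    scale = math.sqrt(1.0 + n * tau_star**2 / sigma0**2)
    shrink = 1.0 / (1.0 + sigma0**2 / (n * tau_star**2))
    z = (math.sqrt(n) * (xbar - mu0) / sigma0
         + sigma0 * (mu_star - mu0) / (math.sqrt(n) * tau_star**2))
    # Assumption: the last term uses the prior variance tau_star^2.
    return scale * math.exp(-0.5 * shrink * z**2
                            + (mu0 - mu_star)**2 / (2.0 * tau_star**2))


def normal_pdf(x, mean, sd):
    """Density of N(mean, sd^2) at x."""
    return math.exp(-0.5 * ((x - mean) / sd)**2) / (sd * math.sqrt(2.0 * math.pi))


def rb_density_ratio(mu0, xbar, n, sigma0, mu_star, tau_star):
    """RB(mu0 | x) computed directly as posterior density over prior density."""
    # Standard conjugate update for a normal mean with known variance.
    post_var = 1.0 / (n / sigma0**2 + 1.0 / tau_star**2)
    post_mean = post_var * (n * xbar / sigma0**2 + mu_star / tau_star**2)
    posterior = normal_pdf(mu0, post_mean, math.sqrt(post_var))
    prior = normal_pdf(mu0, mu_star, tau_star)
    return posterior / prior


if __name__ == "__main__":
    # Illustrative inputs only; they are not taken from the paper.
    args = dict(mu0=0.3, xbar=0.5, n=20, sigma0=1.0, mu_star=0.0, tau_star=1.0)
    print(rb_closed_form(**args), rb_density_ratio(**args))
```

The two functions should agree up to floating-point error for any valid inputs. A value above 1 indicates that the data have increased belief in µ0 relative to the prior, and a value below 1 indicates a decrease.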
[2]

incoherence

949 so $Pl(x)$ is a 0.949 Bayesian confidence interval for $\mu$. To use (7) it is necessary to maximize $M(RB(\mu \mid X) \le 1 \mid \mu)$ as a function of $\mu$ and it is seen, at least when the prior is not overly concentrated, that this maximum occurs at $\mu = \mu_*$. When using the $N(0, 1)$ prior the maximum occurs at $\mu = 0$ when $n = 5$ and from the second column of Table 1...

work page doi:10.1139/facets-2017-0121 2019
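
The maximization described in entry [2] can be explored numerically. The sketch below assumes that M(· | µ) denotes the sampling distribution of the data when the true mean is µ, so that for the normal model with known σ0 the probability of RB(µ | X) ≤ 1 reduces to a two-sided normal tail probability. The function names and the search grid are illustrative choices, not taken from the paper.

```python
import math


def std_normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def bias_against(mu, n, sigma0=1.0, mu_star=0.0, tau_star=1.0):
    """Probability that RB(mu | X) <= 1 when mu is the true mean (assumed meaning of M)."""
    post_var = 1.0 / (n / sigma0**2 + 1.0 / tau_star**2)
    log_scale = math.log(1.0 + n * tau_star**2 / sigma0**2)
    # RB(mu | x) <= 1 exactly when |A| >= c, where
    # A = n (xbar - mu) / sigma0^2 + (mu_star - mu) / tau_star^2.
    c = math.sqrt((log_scale + (mu - mu_star)**2 / tau_star**2) / post_var)
    # When mu is true, A is normal with mean a and standard deviation s.
    a = (mu_star - mu) / tau_star**2
    s = math.sqrt(n) / sigma0
    inside = std_normal_cdf((c - a) / s) - std_normal_cdf((-c - a) / s)
    return 1.0 - inside


def argmax_bias_against(n, grid, **prior):
    """Grid point at which the bias against is largest."""
    return max(grid, key=lambda mu: bias_against(mu, n, **prior))


if __name__ == "__main__":
    grid = [i / 100.0 for i in range(-300, 301)]  # hypothetical search range
    best = argmax_bias_against(5, grid)
    print(best, bias_against(best, 5))
```

Under these assumptions, with the N(0, 1) prior, σ0 = 1 and n = 5, the grid maximum falls at µ = 0, consistent with the excerpt. A grid search is used here only for transparency; a one-dimensional optimizer would serve equally well.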
