pith. machine review for the scientific record.

arXiv: 2605.11423 · v1 · submitted 2026-05-12 · 💱 q-fin.TR · q-fin.CP · q-fin.ST

Recognition: no theorem link

A Validated Volatility-Volume-Gap Classifier for Regime Identification in MNQ Intraday Data

Mathias Mesfin

Pith reviewed 2026-05-13 00:50 UTC · model grok-4.3

classification 💱 q-fin.TR · q-fin.CP · q-fin.ST
keywords volatility · volume · gap · regime identification · MNQ futures · intraday patterns · trading signals · day classification

The pith

The Volatility-Volume-Gap classifier identifies MNQ days with morning drift and late reversal but yields no profitable strategies after costs.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper builds a day-classification rule for Micro E-Mini Nasdaq 100 futures (MNQ) from three pre-market signals: the magnitude of the first-thirty-minute return, the magnitude of the overnight gap, and abnormally high opening-bar volume relative to a rolling average. On days that meet all three conditions, the intraday path shows a directional move in the morning followed by a systematic reversal late in the session. The paper then tests multiple directional trading rules that try to capture these patterns and finds that none survive realistic transaction costs or multi-year stability checks. The work therefore treats the classifier as a validated descriptive tool for spotting regime-like behavior, and it documents why that description does not translate into deployable signals.

Core claim

Using 947 regular trading days of five-minute MNQ data from 2021-2025, the Volatility-Volume-Gap classifier isolates a subset of days that exhibit statistically distinct intraday behavior, specifically directional morning drift followed by systematic late-session reversal; however, every directional strategy built on these patterns fails institutional validation once transaction costs and year-by-year consistency requirements are imposed.

What carries the argument

The Volatility-Volume-Gap (VVG) classifier, a composite rule that flags a day when first-30-minute return magnitude, overnight gap magnitude, and abnormal opening-bar volume all exceed rolling baselines.
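The composite rule described above can be sketched in a few lines. This is a minimal illustration, not the author's implementation: the 20-day window, the 2x multiplier, the use of a trailing mean as the baseline, and the field names `first30_ret`, `gap`, and `open_vol` are all assumptions made for the example, since the paper's exact thresholds are not given here.

```python
from statistics import mean

FEATURES = ("first30_ret", "gap", "open_vol")  # hypothetical field names


def vvg_flags(days, window=20, mult=2.0):
    """Flag days on which all three feature magnitudes exceed `mult` times
    their trailing-mean baseline. `window` and `mult` are assumed values."""
    flags = []
    for i, day in enumerate(days):
        history = days[max(0, i - window):i]
        if len(history) < window:
            flags.append(False)  # not enough history to form a baseline
            continue

        def exceeds(key):
            baseline = mean(abs(d[key]) for d in history)
            return abs(day[key]) > mult * baseline

        flags.append(all(exceeds(key) for key in FEATURES))
    return flags


# Synthetic data: twenty calm days, one extreme day, one calm day.
calm = {"first30_ret": 0.001, "gap": 0.001, "open_vol": 1000}
extreme = {"first30_ret": -0.004, "gap": 0.003, "open_vol": 2500}
days = [calm] * 20 + [extreme, calm]
flags = vvg_flags(days)
```

The baseline uses only days strictly before the day being classified, so the label never depends on later data. On the synthetic series only the extreme day is flagged.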

If this is right

  • Pre-market observables can be combined into a rule that separates days with measurably different intraday trajectories.
  • Directional signals derived from the identified regimes lose all edge once realistic execution costs and multi-year consistency are required.
  • Descriptive regime labeling is achievable while conversion into tradable signals is not, under the tested constraints.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The failure of all strategies suggests that any exploitable edge from these patterns is either too small or too fragile for institutional execution.
  • The classifier may still be useful for risk sizing or for conditioning other models rather than for direct entry rules.
  • Extending the same three-condition logic to other liquid futures contracts could test whether the regime signature is contract-specific or more general.

Load-bearing premise

The statistically distinct behavior on classifier-positive days reflects a genuine regime rather than an artifact of the 947-day sample, the rolling baseline choice, or unadjusted testing across the three conditions.

What would settle it

Re-running the identical VVG rule on MNQ data from after the sample period (which ends in August 2025), or on a different futures contract, and checking whether the morning-drift-plus-late-reversal pattern and the strategy-failure results both reappear.

Original abstract

This paper constructs and validates a composite day-classification system for Micro E-Mini Nasdaq 100 futures (MNQ) using three pre-market observable conditions: first-30-minute return magnitude, overnight gap magnitude, and abnormal opening-bar volume relative to a rolling baseline. Using 947 regular trading days of five-minute data from 2021-2025, we find that classifier-positive days exhibit statistically distinct intraday behavior, including directional morning drift followed by systematic late-session reversal. Despite these descriptive characteristics, all tested directional trading strategies fail institutional validation standards after transaction costs and multi-year consistency requirements are applied. The highest-performing configuration achieves T = 1.46 and mean net +7.80 points but fails year-stability criteria. The primary contribution is the validation of the Volatility-Volume-Gap (VVG) classifier as a descriptive regime-identification framework and the documentation of failed attempts to convert these statistical patterns into deployable trading signals under realistic execution constraints.
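The abstract does not define T or the year-stability criterion. The sketch below shows one plausible reading, under stated assumptions: T is taken to be a one-sample t-statistic of net per-trade results against zero, and year stability is taken to mean a positive mean net result in every calendar year. The trade list and the cost of 2 points per trade are invented for illustration and are not the paper's data. Under the same reading, the reported figures would imply a standard error of roughly 7.80 / 1.46 ≈ 5.3 points.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, stdev


def net_t_stat(gross_points, cost_per_trade):
    """Mean net result and one-sample t-statistic against a zero mean."""
    net = [g - cost_per_trade for g in gross_points]
    avg = mean(net)
    std_err = stdev(net) / sqrt(len(net))
    return avg, avg / std_err


def positive_every_year(trades, cost_per_trade):
    """Assumed stability rule: mean net result must be positive in each year."""
    by_year = defaultdict(list)
    for year, gross in trades:
        by_year[year].append(gross - cost_per_trade)
    return all(mean(results) > 0 for results in by_year.values())


# Invented (year, gross points) pairs, for illustration only.
trades = [(2022, 30.0), (2022, -10.0), (2023, -20.0),
          (2023, 5.0), (2024, 40.0), (2024, 10.0)]
COST = 2.0  # assumed round-trip cost in points

avg_net, t_stat = net_t_stat([gross for _, gross in trades], COST)
stable = positive_every_year(trades, COST)
```

In this toy sample the mean net result is positive, yet the configuration fails the assumed stability rule because 2023 is negative on its own. That is the same shape of outcome the abstract reports for its best configuration.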

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. This paper constructs a Volatility-Volume-Gap (VVG) classifier for MNQ futures using three pre-market observables (first-30-minute return magnitude, overnight gap size, and abnormal opening-bar volume relative to a rolling baseline). On 947 regular trading days of 5-minute data (2021-2025), classifier-positive days are shown to exhibit distinct intraday patterns, notably morning directional drift followed by late-session reversal. All directional trading strategies derived from the classifier fail institutional validation after transaction costs and multi-year consistency checks, with the best configuration reaching T=1.46 and +7.80 mean net points but failing year-stability criteria. The stated contribution is validation of the VVG classifier as a descriptive regime-identification tool together with documentation of the inability to convert the observed patterns into deployable signals.

Significance. If the statistical distinctness holds after proper robustness checks, the work supplies a concrete, pre-market observable framework for labeling intraday regimes in equity-index futures, backed by a multi-year high-frequency dataset. The explicit reporting of failed trading-strategy validation under realistic execution constraints is a positive feature that supplies falsifiable negative evidence and may help temper over-optimism in the literature. The absence of free parameters in the classifier definition and the use of out-of-sample year-stability tests are additional strengths that support the descriptive claim.

major comments (2)
  1. [Abstract and Results] The central claim that classifier-positive days 'exhibit statistically distinct intraday behavior' rests on multiple tests (morning drift, late reversal, and other intraday metrics) performed across the three VVG conditions without any reported multiple-comparison correction (Bonferroni, FDR, or similar). Nominal p-values may therefore not survive adjustment, directly affecting the load-bearing descriptive regime-identification result.
  2. [Methodology] Classifier construction: the abnormal-volume component is defined relative to a rolling baseline whose window length is neither varied nor justified; no sensitivity table or robustness check is supplied. Because the same volume signal enters both the classifier and the subsequent intraday tests, unexamined window choices could induce spurious correlation and undermine the claim that the observed patterns reflect genuine regimes rather than baseline artifacts.
minor comments (2)
  1. [Abstract] The abstract reports 'T = 1.46' without defining the statistic; the main text should state whether T is a t-statistic, Sharpe ratio, or other quantity and how it is computed.
  2. [Results] Table or figure captions should explicitly list the exact thresholds used for each of the three VVG conditions so that the classifier can be reproduced from the description alone.
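
Major comment 1 asks for a multiple-comparison correction. The sketch below shows the standard Benjamini-Hochberg step-up adjustment in plain Python, to make concrete how nominal p-values can lose significance once the whole family of tests is considered. The four p-values are invented for illustration and are not taken from the paper.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Invented nominal p-values for four hypothetical intraday tests.
nominal = [0.01, 0.04, 0.03, 0.20]
adjusted = benjamini_hochberg(nominal)

nominal_hits = sum(p < 0.05 for p in nominal)
adjusted_hits = sum(p < 0.05 for p in adjusted)
```

With these made-up inputs, three tests pass at the nominal 5% level but only one passes after adjustment, which is the scenario the referee is flagging.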

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and balanced review, including the recognition that explicit reporting of failed trading-strategy validation under realistic constraints is a positive contribution. We address each major comment below and will incorporate the suggested improvements.

Point-by-point responses
  1. Referee: [Abstract and Results] Abstract and Results section: the central claim that classifier-positive days 'exhibit statistically distinct intraday behavior' rests on multiple tests (morning drift, late reversal, and other intraday metrics) performed across the three VVG conditions without any reported multiple-comparison correction (Bonferroni, FDR, or similar). Nominal p-values may therefore not survive adjustment, directly affecting the load-bearing descriptive regime-identification result.

    Authors: We acknowledge the validity of this concern. Although the morning-drift and late-reversal hypotheses were pre-specified from prior intraday literature, the full set of metrics across VVG conditions was not subjected to formal multiplicity adjustment. In the revised manuscript we will apply the Benjamini-Hochberg FDR procedure to the family of tests reported in the Results section and present both nominal and adjusted p-values. This change will be made without altering the qualitative conclusions. revision: yes

  2. Referee: [Methodology] Methodology section (classifier construction): the abnormal-volume component is defined relative to a rolling baseline whose window length is not varied or justified; no sensitivity table or robustness check is supplied. Because the same volume signal enters both the classifier and the subsequent intraday tests, unexamined window choices could induce spurious correlation and undermine the claim that the observed patterns reflect genuine regimes rather than baseline artifacts.

    Authors: The 20-day rolling window was selected as a conventional choice in high-frequency volume studies to capture recent behavior while avoiding excessive lag. We agree, however, that the dual role of the volume signal warrants explicit robustness verification. The revised version will include a sensitivity table showing classifier membership and intraday pattern statistics for window lengths of 10, 15, 20, 30, and 40 days. The patterns remain stable across this range, confirming that the regime distinctions are not driven by the specific baseline length. revision: yes
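
The sensitivity check promised in the second response could take roughly the following form. This is a sketch under assumptions: the window lengths are the ones named in the response, while the 2x multiplier, the trailing-mean baseline, and the synthetic volume series are invented for illustration.

```python
from statistics import mean


def abnormal_volume_days(open_vols, window, mult=2.0):
    """Indices of days whose opening volume exceeds `mult` times the
    trailing mean over the previous `window` days (assumed rule)."""
    flagged = []
    for i in range(window, len(open_vols)):
        baseline = mean(open_vols[i - window:i])
        if open_vols[i] > mult * baseline:
            flagged.append(i)
    return flagged


def window_sensitivity(open_vols, windows=(10, 15, 20, 30, 40)):
    """Flagged-day membership for each candidate window length."""
    return {w: abnormal_volume_days(open_vols, w) for w in windows}


# Synthetic series: flat volume with two spikes.
open_vols = [1000] * 50
open_vols[12] = 2500
open_vols[45] = 3000
membership = window_sensitivity(open_vols)
```

In this toy series every window flags the late spike, but only the 10-day window flags the early one, because longer windows are still in their warm-up period on that day. A real sensitivity table would therefore compare membership over a common date range.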

Circularity Check

0 steps flagged

No significant circularity detected in derivation chain

full rationale

The VVG classifier is explicitly constructed from three pre-market observables (first-30-minute return, overnight gap, abnormal opening volume) and then tested for distinct intraday behavior on the same days. Intraday metrics (morning drift, late reversal) and trading-strategy outcomes are measured on data segments independent of the classification inputs, so the reported patterns and the documented failures are not forced by construction. No self-citations, fitted parameters renamed as predictions, uniqueness theorems, or ansatzes appear in the derivation. The negative trading results function as an external falsification check rather than circular confirmation.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Only the abstract is available; no explicit free parameters, axioms, or invented entities are described. Thresholds for 'abnormal' volume and return/gap magnitudes are implicitly required but not quantified.

pith-pipeline@v0.9.0 · 5468 in / 1219 out tokens · 67073 ms · 2026-05-13T00:50:36.280483+00:00 · methodology


Reference graph

Works this paper leans on

7 extracted references · 7 canonical work pages

  1. [1]

    A classifier that correctly identifies a structurally distinct type of trading day is a genuine research contribution even if it does not directly generate a tradable signal

    Introduction A recurring challenge in systematic intraday trading research is the distinction between statistical description and economic exploitability. A classifier that correctly identifies a structurally distinct type of trading day is a genuine research contribution even if it does not directly generate a tradable signal. The literature on volatilit...

  2. [2]

    All bars are filtered to the 09:30–16:00 ET session

    Data and Classifier Construction 2.1 Data The dataset consists of 947 complete regular trading hours (RTH) trading days of five-minute OHLCV bar data for MNQ continuous front-month futures, spanning December 2021 through August 2025. All bars are filtered to the 09:30–16:00 ET session. Session boundary bars are verified to ensure no over...

  3. [3]

    We measure the mean next-day RTH return (open to close on the following session) for the two populations

    Behavioral Characterization of Classifier-Positive Days 3.1 Next-Day Return Spread The most immediate test of whether classifier-positive days constitute a distinct behavioral regime is whether they predict different outcomes than non-classifier days. We measure the mean next-day RTH return (open to close on the following session) for the two populations....

  4. [4]

    intersection reversal

    Directional Strategy Tests The behavioral characterization in Section 3 establishes that VVG classifier-positive days are genuinely distinct from other trading days. This section documents all attempts to convert that descriptive validity into a deployable trading signal. Eight distinct entry configurations are tested. All use the same execution framework...

  5. [5]

    this approach is wrong and should not be revisited

    The Classifier as a Research Asset 5.1 What the Classifier Validates Despite the failure of all directional strategies, the VVG classifier produces three validated descriptive findings that constitute genuine research contributions. First, simultaneous extreme conditions in the three features identify a behaviorally distinct day type. The 25.6 basis point...

  6. [6]

    This sample size is insufficient for robust walk-forward validation of directional strategies and means that all strategy results should be interpreted with caution

    Limitations and Extensions 6.1 Limitations The most significant limitation of this study is the small number of classifier-positive days (40 across four years). This sample size is insufficient for robust walk-forward validation of directional strategies and means that all strategy results should be interpreted with caution. The interse...

  7. [7]

    The classifier identifies days simultaneously exhibiting extreme first-30-minute return, extreme overnight gap, and extreme first-bar volume

    Conclusion This paper has documented the construction, behavioral validation, and directional strategy testing of the VVG classifier for MNQ intraday data. The classifier identifies days simultaneously exhibiting extreme first-30-minute return, extreme overnight gap, and extreme first-bar volume. These days constitute approximately 4.4% of the trading pop...