A GPH-Filtered Hannan--Rissanen Information Criterion for ARFIMA Order Selection

Chunhao Cai

arxiv: 2606.04561 · v2 · pith:T3PBNOOSnew · submitted 2026-06-03 · 🧮 math.ST · stat.TH


Pith reviewed 2026-06-28 04:19 UTC · model grok-4.3


keywords: ARFIMA, order selection, information criterion, consistency, long memory, Hannan-Rissanen, GPH estimator


The pith

A GPH-filtered Hannan-Rissanen criterion consistently selects finite ARFIMA orders even as candidate bounds grow.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes and proves consistency for a two-stage order selector for ARFIMA models. It first applies a preliminary log-periodogram estimate to filter the long-memory component, then uses a Hannan-Rissanen method on the filtered data to choose autoregressive and moving-average orders via a generalized information criterion. The penalty is strengthened beyond standard BIC to handle errors from the preliminary step and approximation. This matters because correct order selection underpins valid statistical inference for long-memory processes where orders are unknown but finite.
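The first stage described above can be sketched in a few lines. This is a minimal illustration and not the paper's implementation: the function names `gph_estimate` and `frac_filter`, the bandwidth argument `m`, and the truncation of the fractional filter at the start of the sample are all assumptions made for this example.

```python
import numpy as np

def gph_estimate(x, m):
    """Log-periodogram (GPH) estimate of the memory parameter d.

    Regresses log I(lambda_j) on -2*log(2*sin(lambda_j/2)) over the first
    m Fourier frequencies; the OLS slope is the estimate of d.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    lam = 2.0 * np.pi * np.arange(1, m + 1) / n
    dft = np.fft.fft(x - x.mean())[1:m + 1]
    log_periodogram = np.log(np.abs(dft) ** 2 / (2.0 * np.pi * n))
    regressor = -2.0 * np.log(2.0 * np.sin(lam / 2.0))
    centered = regressor - regressor.mean()
    return float(centered @ log_periodogram / (centered @ centered))

def frac_filter(x, d):
    """Apply the truncated binomial expansion of (1 - B)^d to x."""
    x = np.asarray(x, dtype=float)
    n = x.size
    weights = np.empty(n)
    weights[0] = 1.0
    for k in range(1, n):
        # Recursion for the coefficients of (1 - B)^d.
        weights[k] = weights[k - 1] * (k - 1 - d) / k
    # Keep only the first n terms: y_t = sum_{k=0}^{t} weights[k] * x[t-k].
    return np.convolve(x, weights)[:n]
```

A call such as `frac_filter(x, gph_estimate(x, m))` would then produce the filtered series that the second stage works on. How `m` should grow with the sample size is one of the rate conditions the paper has to specify; the sketch leaves it as a free argument.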

Core claim

We prove a uniform residual-variance approximation over the growing rectangle and combine it with a population separation argument between the true finite ARMA representation and underfitted alternatives. The resulting generalized information-criterion selector is consistent.

What carries the argument

The uniform residual-variance approximation over the growing candidate rectangle combined with a population separation argument between the true model and underfitted alternatives.

Load-bearing premise

The penalty must exceed the ordinary BIC penalty enough to dominate errors from the preliminary long-memory estimation and the Hannan-Rissanen residual approximation.

What would settle it

Repeated sampling from an ARFIMA process with known fixed orders where the proportion of times the selector recovers the true orders fails to approach one as sample size grows.
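Such a check could be organized as a small Monte Carlo harness. The sketch below is hypothetical: `simulate` and `select` are placeholder callables that a reader would supply (an ARFIMA simulator with known orders and the two-stage selector, respectively), and nothing here comes from the paper itself.

```python
import numpy as np

def recovery_rate(simulate, select, true_order, n, reps, seed=0):
    """Fraction of replications in which `select` returns `true_order`.

    simulate(n, rng) -> 1-D array of length n   (placeholder, user supplied)
    select(series)   -> (p, q)                  (placeholder, user supplied)
    """
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(reps):
        series = simulate(n, rng)
        if tuple(select(series)) == tuple(true_order):
            hits += 1
    return hits / reps
```

Consistency predicts that this rate approaches one as `n` increases. A rate that plateaus below one over a range of growing sample sizes would be the kind of evidence described above.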

Original abstract

We propose a simple two-stage order selector for finite-order ARFIMA models. First, a preliminary log-periodogram estimate of the memory parameter is used to fractionally filter the data. Second, a Hannan--Rissanen residual construction is applied to the filtered series, and the autoregressive and moving-average orders are selected by a generalized information criterion over a growing candidate rectangle. The search bounds are allowed to satisfy \(P_n,Q_n\to\infty\), whereas the true orders remain fixed and finite. The penalty is allowed to be larger than the ordinary BIC penalty so that it dominates the error introduced by preliminary long-memory estimation and by the Hannan--Rissanen residual approximation. We prove a uniform residual-variance approximation over the growing rectangle and combine it with a population separation argument between the true finite ARMA representation and underfitted alternatives. The resulting generalized information-criterion selector is consistent.
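In symbols, the selector described in the abstract can be written schematically as follows. The notation (\(\hat d\), \(\hat y_t\), \(\hat\sigma_n^2(p,q)\), \(C_n\)) is assumed here for illustration and may differ from the paper's.

```latex
\[
\hat y_t = (1-B)^{\hat d} x_t, \qquad
\mathrm{GIC}_n(p,q) = \log \hat\sigma_n^2(p,q) + \frac{(p+q)\,C_n}{n},
\]
\[
(\hat p,\hat q) = \arg\min_{0\le p\le P_n,\; 0\le q\le Q_n} \mathrm{GIC}_n(p,q).
\]
```

Here \(\hat\sigma_n^2(p,q)\) stands for the Hannan--Rissanen residual variance of the candidate ARMA\((p,q)\) fit to the filtered series. In this parametrisation the ordinary BIC corresponds to \(C_n = \log n\), so a penalty "larger than the ordinary BIC penalty" would mean a sequence \(C_n\) that grows faster than \(\log n\).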

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The paper gives a consistent two-stage GPH-filtered Hannan-Rissanen IC for ARFIMA order selection over growing rectangles, but the uniform residual approximation after filtering is the part that needs the closest check.


The paper proposes a two-stage procedure for selecting AR and MA orders in finite-order ARFIMA models. First a GPH log-periodogram estimate of the memory parameter is used to fractionally filter the series. Then Hannan-Rissanen residuals are formed on the filtered data and a generalized information criterion is minimized over a rectangle of candidate orders whose dimensions both diverge. The claim is that this selector is consistent when the true orders are fixed and finite.
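The second stage can be sketched as a grid search over candidate orders. This is a hedged illustration under stated assumptions, not the paper's algorithm: the long autoregression order `h`, the penalty constant `c_n`, the use of a common estimation sample for all candidates, and the names `hr_gic_select` and `_lags` are all choices made for this example.

```python
import numpy as np

def _lags(v, k, start):
    """Matrix whose columns are v[t-1], ..., v[t-k] for t = start, ..., n-1."""
    n = v.size
    if k == 0:
        return np.empty((n - start, 0))
    return np.column_stack([v[start - j:n - j] for j in range(1, k + 1)])

def hr_gic_select(y, P, Q, h, c_n):
    """Return the (p, q) minimising log(sigma2_hat) + (p + q) * c_n / N.

    y is the (already fractionally filtered) series; h >= 1 is the order
    of the preliminary long autoregression; c_n is the penalty constant.
    """
    y = np.asarray(y, dtype=float)
    y = y - y.mean()
    n = y.size
    # Step 1: a long AR(h) least-squares fit gives proxies for the innovations.
    X = _lags(y, h, h)
    phi, *_ = np.linalg.lstsq(X, y[h:], rcond=None)
    e = np.zeros(n)
    e[h:] = y[h:] - X @ phi
    # Step 2: regress y_t on its own lags and lagged proxies, using the same
    # sample for every candidate so that the criteria are comparable.
    start = h + max(P, Q)
    target = y[start:]
    N = target.size
    best, best_crit = (0, 0), np.inf
    for p in range(P + 1):
        for q in range(Q + 1):
            if p + q > 0:
                Z = np.column_stack([_lags(y, p, start), _lags(e, q, start)])
                beta, *_ = np.linalg.lstsq(Z, target, rcond=None)
                resid = target - Z @ beta
            else:
                resid = target
            crit = np.log(np.mean(resid ** 2)) + (p + q) * c_n / N
            if crit < best_crit:
                best, best_crit = (p, q), crit
    return best
```

Setting `c_n = np.log(n)` gives a BIC-type penalty; a heavier choice such as a power of `np.log(n)` mimics the strengthened penalty the letter describes, although which sequences are admissible is determined by the paper's conditions and not by this sketch.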

What is new is the explicit handling of the preliminary long-memory estimation inside the order-selection step while still allowing the search set to grow. The technical steps are a uniform residual-variance approximation over the rectangle together with a population separation argument between the true ARMA representation and underfitted models. The penalty is permitted to exceed the usual BIC rate so that it can absorb the extra error coming from the GPH step and the residual approximation.

The soft spot is whether that uniform approximation really controls the GPH-induced perturbation over the whole rectangle at once. The filtered series carries an additive error whose size depends on the bandwidth and the rate of the GPH estimator. When this error is fed into the Hannan-Rissanen construction for every pair up to the growing bounds, the resulting error in the variance estimates must stay smaller than the minimal population gap. If the uniform bound is only o(1) without a rate that beats the separation term, inflating the penalty may not be enough to restore consistency. The abstract states that the penalty is chosen large enough to dominate these errors, but the rate alignment among the GPH bandwidth m_n, max(P_n, Q_n), and the separation needs to be verified in the proofs.
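The rate question can be made concrete with a schematic sketch. The first display is the classical limit for the log-periodogram estimator under standard regularity conditions; the second lists heuristic sufficient conditions for a generic penalty \(C_n\). The symbols \(\delta_n\) and \(\Delta\) are introduced here for illustration only, and the paper's actual conditions may differ.

```latex
\[
\sqrt{m_n}\,(\hat d - d) \xrightarrow{d} N\!\left(0, \tfrac{\pi^2}{24}\right)
\quad\Longrightarrow\quad
\hat d - d = O_p\!\left(m_n^{-1/2}\right),
\]
\[
\delta_n := \sup_{p\le P_n,\; q\le Q_n}
\left|\hat\sigma_n^2(p,q) - \sigma^2(p,q)\right|,
\qquad
\Delta := \inf_{p<p_0 \text{ or } q<q_0}
\left\{\sigma^2(p,q) - \sigma^2(p_0,q_0)\right\} > 0,
\]
\[
\text{overfitting: } \delta_n = o_p\!\left(\frac{C_n}{n}\right),
\qquad
\text{underfitting: } \delta_n = o_p(1) \ \text{ and } \ \frac{C_n}{n} \to 0 .
\]
```

Read this way, the letter's concern is whether the contribution of \(\hat d - d\) to \(\delta_n\) is small enough, uniformly over the rectangle, for a penalty with \(C_n/n \to 0\) to absorb it.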

This work is aimed at time-series researchers who need order-selection tools for long-memory processes. A reader already working in that subfield would get a usable procedure and a consistency result. It deserves a serious referee to check the derivations and the handling of the preliminary estimator.

Referee Report

2 major / 2 minor

Summary. The paper proposes a two-stage procedure for consistent order selection in finite-order ARFIMA models. A preliminary GPH log-periodogram estimator of the memory parameter d is used to fractionally difference the series; a Hannan-Rissanen residual construction is then applied to the filtered data and orders (p,q) are chosen by a generalized information criterion over a candidate rectangle whose dimensions P_n and Q_n may diverge to infinity. The penalty is permitted to exceed the usual BIC rate so that it dominates both the GPH estimation error and the Hannan-Rissanen approximation error. A uniform residual-variance approximation over the growing rectangle is established and combined with a population separation argument to yield consistency of the selector.

Significance. If the uniform approximation and rate conditions hold, the result supplies a theoretically justified, computationally feasible selector for ARFIMA orders that accommodates growing search bounds while remaining consistent. This addresses a practical gap between existing information criteria (which typically fix the candidate set) and the needs of long-memory modeling. The explicit allowance for a larger penalty to absorb preliminary-estimator error is a constructive feature.

major comments (2)

[§4] Uniform residual-variance approximation: the stated o_p(1) bound is not shown to be uniform in a manner that explicitly controls the additive perturbation induced by |d̂ - d| (whose rate depends on the GPH bandwidth m_n) simultaneously for all (p,q) up to (P_n,Q_n). Without an explicit comparison of this error to the minimal population gap and to the chosen penalty sequence, it is unclear whether any penalty that still satisfies the BIC-type consistency condition can dominate the approximation error for all admissible sequences P_n,Q_n→∞.
[Theorem 5.1] Consistency: the population separation argument is applied after GPH filtering, yet the proof sketch does not verify that the filtered process preserves the strict inequality between the true ARMA(p0,q0) residual variance and all underfitted alternatives at a rate faster than the penalty term. The dependence on the preliminary estimator appears only through an o(1) term whose interaction with the growing rectangle is not quantified.

minor comments (2)

[§3] Notation for the filtered series and the Hannan-Rissanen residuals should be introduced with an explicit equation number in §3 to avoid ambiguity when the uniform bound is invoked later.
[§2] The statement that the penalty 'is allowed to be larger than the ordinary BIC penalty' would benefit from an explicit lower bound on the penalty coefficient in terms of m_n and log(P_n Q_n).

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading and the constructive major comments. The points raised concern the explicit uniformity of the residual-variance approximation with respect to the GPH estimation error and the quantitative preservation of the population separation after filtering. We address each comment below and will revise the manuscript to strengthen the relevant proofs and comparisons.


Referee: [§4] Uniform residual-variance approximation: the stated o_p(1) bound is not shown to be uniform in a manner that explicitly controls the additive perturbation induced by |d̂ - d| (whose rate depends on the GPH bandwidth m_n) simultaneously for all (p,q) up to (P_n,Q_n). Without an explicit comparison of this error to the minimal population gap and to the chosen penalty sequence, it is unclear whether any penalty that still satisfies the BIC-type consistency condition can dominate the approximation error for all admissible sequences P_n,Q_n→∞.

Authors: We agree that an explicit decomposition separating the GPH filtering error from the Hannan-Rissanen approximation error, together with a direct comparison to the penalty, is needed for full rigor when P_n and Q_n diverge. The current argument bounds the filter perturbation via the GPH rate and the continuity of the fractional difference operator, then takes the supremum over the rectangle; however, the interaction with the minimal gap is only implicit. In the revision we will insert a new lemma that isolates the O_p term arising from |d̂ - d| (uniformly in (p,q) ≤ (P_n,Q_n) under the maintained bandwidth conditions) and verifies that this term is dominated by the allowed penalty sequence, which is permitted to exceed the usual BIC rate precisely to absorb preliminary-estimator error. The revised §4 will contain the explicit comparison.
Revision: yes

Referee: [Theorem 5.1] Consistency: the population separation argument is applied after GPH filtering, yet the proof sketch does not verify that the filtered process preserves the strict inequality between the true ARMA(p0,q0) residual variance and all underfitted alternatives at a rate faster than the penalty term. The dependence on the preliminary estimator appears only through an o(1) term whose interaction with the growing rectangle is not quantified.

Authors: The separation is first established for the correctly filtered ARMA process; continuity of the residual variance map with respect to the memory parameter then transfers the strict inequality to the GPH-filtered series. To quantify the rate, the proof will be expanded to show that the gap after filtering remains bounded below by a positive constant minus an o_p(1) term that is uniform over underfitted models and is controlled by the same uniform approximation result from §4. Because the penalty is chosen to dominate both the GPH error and the Hannan-Rissanen error uniformly over the rectangle, the o_p(1) term is eventually smaller than the penalty increment. The revised proof of Theorem 5.1 will include these rate calculations (or an appendix reference to them).
Revision: yes

Circularity Check

0 steps flagged

No circularity; consistency derived from independent uniform approximation and separation arguments

full rationale

The paper proves consistency of the order selector via two explicit steps: (1) establishing a uniform residual-variance approximation over the growing (P_n, Q_n) rectangle after GPH filtering, and (2) combining that bound with a population separation argument between the true ARMA representation and underfitted models. These are standard asymptotic arguments that do not reduce by construction to fitted quantities, self-citations, or ansatzes smuggled from prior work by the same authors. The allowance for a larger penalty is justified directly by the need to dominate the approximation error, without any self-definitional loop or renaming of known results. The derivation is therefore self-contained and does not exhibit any of the enumerated circularity patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The abstract supplies no explicit free parameters, invented entities, or non-standard axioms. The consistency claim rests on standard time-series regularity conditions and the stated penalty dominance, all of which are domain assumptions and not inventions specific to this paper.

axioms (2)

domain assumption: Standard regularity conditions for stationary long-memory processes and consistency of the preliminary GPH estimator. Invoked to justify the filtering step and the uniform approximation over the candidate rectangle.
domain assumption: Existence of a population separation between the true finite-order ARMA representation and underfitted models. Central to the information-criterion consistency argument.


