pith. machine review for the scientific record. sign in

arxiv: 2604.26069 · v1 · submitted 2026-04-28 · 🧮 math.ST · stat.TH

Recognition: unknown

Estimating the tail index of Pareto-type distributions from geometric records

F. Javier L\'opez, Gerardo Sanz, Mart\'in Alcalde, Miguel Lafuente, Ra\'ul Gouet

Pith reviewed 2026-05-07 14:11 UTC · model grok-4.3

classification 🧮 math.ST stat.TH
keywords tail indexPareto distributionsgeometric recordsmaximum likelihood estimatorasymptotic normalityheavy-tailed distributionsdestructive testing
0
0 comments X

The pith

A maximum likelihood estimator built from geometric records for the Pareto tail index is strongly consistent and asymptotically normal with explicit variance.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops an inferential method that uses geometric records to estimate the tail index of heavy-tailed distributions without needing every observation measured in full. It derives a maximum likelihood estimator specifically for the Pareto model, then proves this estimator is strongly consistent and asymptotically normal while supplying a closed-form expression for the limiting variance. The same consistency and normality results carry over to a larger family of Pareto-type distributions. Monte Carlo experiments show the estimator performs at least as well as classical alternatives such as Hill's estimator, and the method is shown to be especially efficient when data arrive one at a time or when full measurement is expensive.

Core claim

We construct a maximum likelihood estimator for the Pareto model and establish its strong consistency and asymptotic normality, providing also an explicit expression for its asymptotic variance. These results are then extended to a broad class of Pareto-type distributions.

What carries the argument

The likelihood function formed directly from geometric records, which serves as the basis for the maximum likelihood estimator of the tail index.

If this is right

  • The estimator produces smooth trajectories when data arrive sequentially.
  • In destructive testing the method reaches accuracy comparable to or better than Hill's estimator while using substantially fewer fully measured observations.
  • The approach applies directly to heavy-tailed data such as fluctuations of the Dow Jones Industrial Average.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The geometric-record sampling scheme may reduce measurement costs in other online or resource-limited monitoring settings beyond destructive testing.
  • Because the construction relies only on exceedance indicators at geometrically spaced points, the same likelihood idea could be adapted to estimate other extreme-value parameters.
  • The explicit asymptotic variance formula supplies a ready-made way to build approximate confidence intervals once the estimator is computed.

Load-bearing premise

The observations are independent and identically distributed draws from a Pareto or Pareto-type distribution so that the geometric records form a valid likelihood for the tail index.

What would settle it

A Monte Carlo experiment or real data set in which the estimator fails to converge to the true tail index or its finite-sample distribution deviates from the predicted asymptotic normal law as the number of records grows.

read the original abstract

In this paper we develop a novel inferential approach based on geometric records for estimating the tail index of heavy-tailed distributions. We construct a maximum likelihood estimator for the Pareto model and establish its strong consistency and asymptotic normality, providing also an explicit expression for its asymptotic variance. These results are then extended to a broad class of Pareto-type distributions. The performance of the estimator is assessed via Monte Carlo simulation and compared with classical estimators from the literature. The proposed method is particularly well suited for settings where data arrive sequentially, as it yields smooth estimation trajectories. It is also especially advantageous in applications such as destructive testing, where measuring each observation exactly is costly. In this context, the estimator clearly outperforms Hill's estimator, achieving comparable or better accuracy while requiring a substantially smaller number of measured observations. An application to the analysis of the distribution of fluctuations of the Dow Jones Industrial Average (DJI) is also presented.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper develops a maximum likelihood estimator for the tail index using geometric records from i.i.d. samples of a Pareto distribution, establishes its strong consistency and asymptotic normality with an explicit asymptotic variance formula, and extends these properties to Pareto-type distributions (regularly varying tails). It compares the estimator's performance via Monte Carlo simulations against classical methods such as Hill's estimator, emphasizes advantages for sequential data arrival and destructive testing scenarios, and illustrates the method on Dow Jones Industrial Average fluctuation data.

Significance. If the asymptotic results hold under the stated conditions, the approach offers a practical alternative for tail-index estimation that requires fewer fully measured observations while maintaining competitive accuracy, which is valuable in applications with high measurement costs. The explicit variance expression and the focus on smooth sequential estimation trajectories are strengths, as is the direct comparison to existing estimators in simulations and the real-data example. The extension to the broader Pareto-type class, if rigorously justified, broadens applicability in extreme-value theory.

major comments (2)
  1. [§4] §4 (extension to Pareto-type distributions): the claim that strong consistency and asymptotic normality carry over directly to distributions with regularly varying tails does not address the effect of a non-constant slowly varying function L on the record-value likelihood. When L is non-constant the score and Fisher information generally acquire extra terms that can alter the centering and the rate of convergence unless L satisfies a second-order regular-variation condition; the manuscript provides no such assumption and no verification that the normalized MLE still converges to the claimed normal limit with the same explicit variance.
  2. [Theorem 3.2] Theorem 3.2 (asymptotic normality for exact Pareto): the explicit variance formula is derived under the geometric-record likelihood, but the proof sketch does not show that the information matrix remains non-degenerate uniformly in the record index; a concrete check that the variance expression remains positive and finite for all admissible tail indices would strengthen the result.
minor comments (2)
  1. [Simulation study] The Monte Carlo section would benefit from reporting the exact number of replications, the range of sample sizes, and the precise definition of the geometric-record sampling scheme used in the simulations.
  2. [Notation] Notation for the tail quantile function and the slowly varying component should be introduced once and used consistently; occasional switches between U(t) and the record-based formulation create minor ambiguity.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading and constructive comments. Below we respond point by point to the major comments and indicate the revisions that will be made to the manuscript.

read point-by-point responses
  1. Referee: [§4] §4 (extension to Pareto-type distributions): the claim that strong consistency and asymptotic normality carry over directly to distributions with regularly varying tails does not address the effect of a non-constant slowly varying function L on the record-value likelihood. When L is non-constant the score and Fisher information generally acquire extra terms that can alter the centering and the rate of convergence unless L satisfies a second-order regular-variation condition; the manuscript provides no such assumption and no verification that the normalized MLE still converges to the claimed normal limit with the same explicit variance.

    Authors: We agree that the extension requires additional justification when L is non-constant. In the revised manuscript we will impose a second-order regular-variation condition on L and show that the resulting perturbation terms in the score and Fisher information are asymptotically negligible, thereby preserving both strong consistency and the stated asymptotic normality with the same explicit variance. revision: yes

  2. Referee: [Theorem 3.2] Theorem 3.2 (asymptotic normality for exact Pareto): the explicit variance formula is derived under the geometric-record likelihood, but the proof sketch does not show that the information matrix remains non-degenerate uniformly in the record index; a concrete check that the variance expression remains positive and finite for all admissible tail indices would strengthen the result.

    Authors: We accept that the current proof sketch is incomplete on this point. The revision will contain an explicit verification that the Fisher information matrix is non-degenerate for every record index and that the asymptotic variance remains positive and finite for all admissible tail indices γ > 0. revision: yes

Circularity Check

0 steps flagged

No circularity: MLE construction and asymptotic results follow from standard likelihood theory without reduction to inputs by definition or self-citation.

full rationale

The paper constructs an MLE directly from the geometric records likelihood under the Pareto model, then invokes standard theorems for strong consistency and asymptotic normality (with explicit variance) to establish its properties. The extension to Pareto-type distributions is framed as a direct generalization of these results. No steps match the enumerated circularity patterns: there is no self-definitional loop (e.g., defining the estimator in terms of its own predicted quantities), no fitted parameter relabeled as a prediction, no load-bearing self-citation chain, no imported uniqueness theorem from the authors' prior work, no smuggled ansatz, and no renaming of known results. The derivation remains self-contained against external statistical benchmarks such as classical MLE asymptotics for i.i.d. samples, with the reader's assessment of score 1.0 aligning with the absence of any quoted reduction to inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

Only the abstract was available; detailed model assumptions and any additional technical conditions are not accessible.

axioms (1)
  • domain assumption Data consist of i.i.d. observations from a Pareto or Pareto-type distribution
    Required for the likelihood construction and asymptotic results stated in the abstract.

pith-pipeline@v0.9.0 · 5463 in / 1302 out tokens · 63883 ms · 2026-05-07T14:11:33.376392+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

50 extracted references · 42 canonical work pages

  1. [1]

    Business cycle aned herding behav- ior in stock returns: theory and evid Hence.Financ

    Ahn, K., Cong, L., Jang, H.et al.(2024). Business cycle aned herding behav- ior in stock returns: theory and evid Hence.Financ. Innov.,10, 6. DOI: https://doi.org/10.1186/s40854-023-00540-z

  2. [2]

    C., Balakrishnan, N., & Nagaraja, H

    Arnold, B. C., Balakrishnan, N., & Nagaraja, H. N. (1998).Records. John Wiley & Sons. DOI: https://doi.org/10.1002/9781118150412

  3. [3]

    G., & Stepanov, A

    Balakrishnan, N., Pakes, A. G., & Stepanov, A. On the number and sum of near-record observations.Adv. Appl. Prob.,37, 765–780. DOI: https://doi.org/10.1239/aap/1127483746

  4. [4]

    E., & Podobnik, B

    Beguˇ si´ c, S., Kostanjˇ car, Z., Stanley, H. E., & Podobnik, B. (2018). Scal- ing porperties of extreme price fluctuations in Bitcoin markets.Phys- ica A: Statistical Mechanics and its Applications,510, 400–406. DOI: https://doi.org/10.1016/j.physa.2018.06.131

  5. [5]

    Beirlant, J., Vynckier, P., & Teugels, J. L. (1996). Tail index estimation, Pareto quantile plots, and regression diagnostics.J. Amer. Statist. Assoc.,91, 1659–

  6. [6]

    DOI: https://doi.org/10.2307/2291593

  7. [7]

    Berred, M. (1992). On record values and the exponent of a distribution with regularly varying upper tail.J. Appl. Prob.,29(3), 575–586. DOI: https://doi.org/10.2307/3214894

  8. [8]

    Bertail, P., Cl´ emen¸ con, S., & Fern´ andez, C. (2025). Tail index estimation for dis- crete heavy-tailed distributions with application to statistical inference for regular 30 markov chains.TEST,34, 691–713. DOI: https://doi.org/10.1007/s11749-025- 00975-9

  9. [9]

    H., Goldie, C

    Bingham, N. H., Goldie, C. M., & Teugels, J. L. (1989).Regular Variation. Encyclopedia of Mathematics and its Applications. Cambrigde University Press

  10. [10]

    M., Chen, X

    Chaudhry, S. M., Chen, X. H., Ahmed, R., & Nasir, M. A. (2025). Risk modelling of ESG (environmental, social, and governance), healthcare, and financial sectors. Risk Analysis,45, 477–495. DOI: https://doi.org/10.1111/risa.14195

  11. [11]

    Chen, Q., & Liu, J. (2017). The conditional Borel-Cantelli lemma and applications.J. Korean Math. Soc.,54(2), 441–460. DOI: https://doi.org/10.4134/JKMS.j160036

  12. [12]

    Cs¨ org˝ o, S., & Viharos, L. (1995). On the asymptotic normality of Hill’s estimator.Math. Proc. Camb. Phil. Soc.,118(2), 375–382. DOI: https://doi.org/10.1017/S0305004100073710

  13. [13]

    Danielsson, J., de Haan, L., Peng, L., & de Vries, C. G. (2001). Using a bootstrap method to choose the sample fraction in tail index estimation.Journal of Multi- variate Analysis,76(2), 226–248. DOI: https://doi.org/10.1006/jmva.2000.1903

  14. [14]

    Davletov, F. (2022). Estimating the tail index of conditional distribution of asset returns.International Journal of Financial Research,13(2), 21615. DOI: https://doi.org/10.5430/ijfr.v13n2p14

  15. [15]

    Deheuvels, P., Haeusler, E., & Mason, D. M. (1988). Almost sure convergence of the Hill estimator.Math. Proc. Camb. Phil. Soc.,104(2), 371–381. DOI: https://doi.org/10.1017/S0305004100065531

  16. [16]

    Drees, H., de Haan, L., & Resnick, S. (2000). How to make a Hill plot.Ann. Stat., 28(1), 254–274

  17. [17]

    Eliazar, I. (2005). On geometric record times.Physica A: Stat. Mech. Appl.,348, 181-198. DOI: https://doi.org/10.1016/j.physa.2004.09.009

  18. [18]

    Fedotenkov, I. (2020). A review of more than one hundred Pareto-tail index estimators.Statistica,80(3), 245–299. DOI: https://doi.org/10.6092/issn.1973- 2201/9533

  19. [19]

    Ferguson, T. S. (1996).A Course in Large Sample Theory. Texts in Statistical Science. Chapman & Hall/CRC

  20. [20]

    Glick, N. (1978). Breaking records and breaking boards.The American Mathe- matical Monthly,85(1), 2–26. DOI: https://doi.org/10.2307/2978044. 31

  21. [21]

    I., Caeiro, F., Figueiredo, F., Henriques-Rodrigues, L., & Pestana, D

    Gomes, M. I., Caeiro, F., Figueiredo, F., Henriques-Rodrigues, L., & Pestana, D. (2020). Corrected-Hill versus partially reduced-bias value-at-risk estima- tion.Communications in Statistics-Simulation and Computation,49(4), 867–885. DOI: https://doi.org/10.1080/03610918.2018.1489053

  22. [22]

    Gopikrishnan, P., Meyer, M., Amaral, L. A. N., & Stanley, H. E. (1998). Inverse cubic law for the distribution of stock price variations.Eur. Phys. J. B,3, 139–

  23. [23]

    DOI: https://doi.org/10.1007/s100510050292

  24. [24]

    Gopikrishnan, P., Plerou, V., Amaral, L. A. N., Meyer, M., & Stan- ley, H. E. (1999). Scaling of the distribution of fluctuations of financial market indices.Physical Review E,60(5), 5305–5316. DOI: https://doi.org/10.1103/PhysRevE.60.5305

  25. [25]

    J., & Sanz, G

    Gouet, R., L´ opez, F. J., & Sanz, G. (2007). Asymptotic normality for the counting process of weak records andδ-records in discrete models.Bernoulli,13(3), 754–

  26. [26]

    DOI: https://doi.org/10.3150/07-BEJ6027

  27. [27]

    J., & Sanz, G

    Gouet, R., L´ opez, F. J., & Sanz, G. (2012). Onδ-record observations: asymptotic rates for the counting process and elements of maximum likelihood estimation. TEST,21(1), 188–214. DOI: https://doi.org/10.1007/s11749-011-0242-6

  28. [28]

    J., Maldonado, L., & Sanz, G

    Gouet, R., L´ opez, F. J., Maldonado, L., & Sanz, G. (2014). Statistical inference for the geometric distribution based onδ-records.Computational Statistics & Data Analysis,78, 21–32. DOI: https://doi.org/10.1016/j.csda.2014.04.002

  29. [29]

    J., Maldonado, L., & Sanz, G

    Gouet, R., L´ opez, F. J., Maldonado, L., & Sanz, G. (2020). Statistical inference for the Weibull distribution based onδ-record data.Symmetry,12, 20. DOI: https://doi.org/10.3390/sym12010020

  30. [30]

    Haeusler, E., & Teugles, J. L. (1985). On asymptotic normality of Hill’s esti- mator for the exponent of regular variation.Ann. Stat.,13(2), 743–756. DOI: https://doi.org/10.1214/aos/1176349551

  31. [31]

    Hall, P. (1982). On some simple estimates of an exponent of regular variation. Journal of the Royal Statistical Society: Series B (Methodological),44(1), 37–42

  32. [32]

    Hill, B. M. (1975). A simple approach to inference about the tail of a distribution. Ann. Stat.,3(5), 1163–1174. DOI: https://doi.org/10.1214/aos/1176343247

  33. [33]

    Keles, D., Hadzi-Mishev, R., & Paraschiv, F. (2016). Extreme value theory for heavy tails in electricity prices.Journal of Energy Markets,9(2), 21–50. DOI: https://doi.org/10.21314/JEM.2016.141

  34. [34]

    L., Bousquet, N., & Remy, E

    Keller, M., Popelin, A. L., Bousquet, N., & Remy, E. (2015). Nonpara- metric estimation of the probability of detection of flaws in an indus- trial component, from destructive and nondestructive testing data, using 32 approximate Bayesian computation.Risk Analysis,35(9), 1595–1610. DOI: https://doi.org/10.1111/risa.12484

  35. [35]

    D., & Passalidis, C

    Konstantinides, G. D., & Passalidis, C. D. (2025). A new approach in two- dimensional heavy-tailed distributions.Annals of Actuarial Science,19(2), 317–349. DOI: https://doi.org/10.1017/S1748499525000041

  36. [36]

    R., & Chandak, N

    Kumavat, H. R., & Chandak, N. R. (2024). Statistical analysis for evaluating concrete strength of existing structure using non-destructive and destructive test.Innovative Infrastructure Solutions,9(5), 173. DOI: https://doi.org/10.1007/s41062-024-01490-w

  37. [37]

    Langousis, A., Mamalakis, A., Puliga, M., & Deidda, R. (2016). Threshold detec- tion for the generalized Pareto distribution: Review of representative methods and application to the NOAA NCDC daily rainfall database.Water Resources Research,52(4), 2659–2681. DOI: https://doi.org/10.1002/2015WR018502

  38. [38]

    Louzaoui, A., & El Arrouchi, M. (2020). On the maximum likelihood estima- tion of extreme value index based onk-record values.Journal of Probability and Statistics,2020(2), 1–9. DOI: https://doi.org/10.1155/2020/5497413

  39. [39]

    Louzaoui, A., & El Arrouchi, M. (2023). Improving the bias of a pseudo-maximum likelihood estimate of the extreme value index byk-records.J. Stat. Theory Appl., 22, 54–69. DOI: https://doi.org/10.1007/s44199-023-00055-7

  40. [40]

    Mason, D. M. (1982). Laws of large numbers for sums of extreme values.Ann. Probab.,10(3), 756–764. DOI: https://doi.org/10.1214/AOP/1176993783

  41. [41]

    Mudelsee, M., & Bermejo, M. A. (2017). Optimal heavy tail estimation - Part 1: Order selection.Nonlin. Processes Geophys.,24(4), 737–744. DOI: https://doi.org/10.5194/npg-24-737-2017

  42. [42]

    D., & Papalexiou, S

    Nerantzaki, S. D., & Papalexiou, S. M. (2022). Assessing extremes in hydrocli- matology: A review on probabilistic methods.Journal of Hydrology,605, 127302. DOI: https://doi.org/10.1016/j.jhydrol.2021.127302

  43. [43]

    Nolan, J. P. (2020).Univariate Stable Distributions. Springer Series in Operations Research and Financial Engineering. Springer Naure. DOI: https://doi.org/10.1007/978-3-030-52915-4

  44. [44]

    O., Semary, H

    Obulezi, J. O., Semary, H. E., Nadir, S., Igbokwe, C. P., Orji, G. O., Al-Moisheer, A. S., & Elgarhy, M. (2025). Type-I heavy-tailed Burr XII distribution with applications to quality control, skewed reliability engineering systems and lifetime data.Computer Modeling in Engineering & Sciences,144(3), 2991–3027. DOI: https://doi.org/10.32604/cmes.2025.069553. 33

  45. [45]

    The Annals of Statistics , author =

    Pickands, J. III (1975). Statistical inference using extreme order statistics.Ann. Statist.,3(1), 119–131. DOI: https://doi.org/10.1214/aos/1176343003

  46. [46]

    Resnick, S. I. (2007).Heavy-tail phenomena. Probabilistic and statistical modeling. Springer Nature

  47. [47]

    I., & Stˇ aricˇ a, C

    Resnick, S. I., & Stˇ aricˇ a, C. (1997). Smoothing the Hill estimator.Advances in Applied Probability,29(1), 271–293. DOI: https://doi.org/10.2307/1427870

  48. [48]

    She, R., Dai, L., & Ling, S. (2025). Testing for change-points in heavy-tailed time series—A Winsorized CUSUM approach.Journal of Business & Economic Statistics, 1–13. DOI: https://doi.org/10.1080/07350015.2025.2561747

  49. [49]

    Shorrock, R. W. (1972). On record values and record times.Journal of Applied Probability,9(2), 316–326. DOI: https://doi.org/10.2307/3212801

  50. [50]

    M., Papalexiou, S

    Vogel, R. M., Papalexiou, S. M., Lamontagne, J. R., & Dolan, F. C. (2024). When heavy tails disrupt statistical inference.The American Statistician,79(2), 221–235. DOI: https://doi.org/10.1080/00031305.2024.2402898. Appendix A Lemma A.1Letp∈(0,1)andn∈N. IfXfollows a truncated geometric distribution to {0,1, . . . , n}, then a) E(X) = 1−p p −(n+ 1) (1−p) n...