arxiv: 2604.26069 · v1 · submitted 2026-04-28 · 🧮 math.ST · stat.TH

Recognition: unknown

Estimating the tail index of Pareto-type distributions from geometric records

F. Javier L\'opez, Gerardo Sanz, Mart\'in Alcalde, Miguel Lafuente, Ra\'ul Gouet

Pith reviewed 2026-05-07 14:11 UTC · model grok-4.3

classification 🧮 math.ST stat.TH

keywords tail indexPareto distributionsgeometric recordsmaximum likelihood estimatorasymptotic normalityheavy-tailed distributionsdestructive testing

0 comments

The pith

A maximum likelihood estimator built from geometric records for the Pareto tail index is strongly consistent and asymptotically normal with explicit variance.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops an inferential method that uses geometric records to estimate the tail index of heavy-tailed distributions without needing every observation measured in full. It derives a maximum likelihood estimator specifically for the Pareto model, then proves this estimator is strongly consistent and asymptotically normal while supplying a closed-form expression for the limiting variance. The same consistency and normality results carry over to a larger family of Pareto-type distributions. Monte Carlo experiments show the estimator performs at least as well as classical alternatives such as Hill's estimator, and the method is shown to be especially efficient when data arrive one at a time or when full measurement is expensive.

Core claim

We construct a maximum likelihood estimator for the Pareto model and establish its strong consistency and asymptotic normality, providing also an explicit expression for its asymptotic variance. These results are then extended to a broad class of Pareto-type distributions.

What carries the argument

The likelihood function formed directly from geometric records, which serves as the basis for the maximum likelihood estimator of the tail index.

If this is right

The estimator produces smooth trajectories when data arrive sequentially.
In destructive testing the method reaches accuracy comparable to or better than Hill's estimator while using substantially fewer fully measured observations.
The approach applies directly to heavy-tailed data such as fluctuations of the Dow Jones Industrial Average.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The geometric-record sampling scheme may reduce measurement costs in other online or resource-limited monitoring settings beyond destructive testing.
Because the construction relies only on exceedance indicators at geometrically spaced points, the same likelihood idea could be adapted to estimate other extreme-value parameters.
The explicit asymptotic variance formula supplies a ready-made way to build approximate confidence intervals once the estimator is computed.

Load-bearing premise

The observations are independent and identically distributed draws from a Pareto or Pareto-type distribution so that the geometric records form a valid likelihood for the tail index.

What would settle it

A Monte Carlo experiment or real data set in which the estimator fails to converge to the true tail index or its finite-sample distribution deviates from the predicted asymptotic normal law as the number of records grows.

read the original abstract

In this paper we develop a novel inferential approach based on geometric records for estimating the tail index of heavy-tailed distributions. We construct a maximum likelihood estimator for the Pareto model and establish its strong consistency and asymptotic normality, providing also an explicit expression for its asymptotic variance. These results are then extended to a broad class of Pareto-type distributions. The performance of the estimator is assessed via Monte Carlo simulation and compared with classical estimators from the literature. The proposed method is particularly well suited for settings where data arrive sequentially, as it yields smooth estimation trajectories. It is also especially advantageous in applications such as destructive testing, where measuring each observation exactly is costly. In this context, the estimator clearly outperforms Hill's estimator, achieving comparable or better accuracy while requiring a substantially smaller number of measured observations. An application to the analysis of the distribution of fluctuations of the Dow Jones Industrial Average (DJI) is also presented.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The geometric records MLE gives clean consistency and normality for exact Pareto plus some practical edges in simulations, but the extension to Pareto-type distributions looks incomplete without extra conditions on the slowly varying function.

read the letter

The paper builds a maximum likelihood estimator for the tail index directly from geometric records. For the exact Pareto model it proves strong consistency, asymptotic normality, and supplies an explicit asymptotic variance. That core derivation is the main new piece, and it is presented as distinct from the usual order-statistic estimators like Hill's. The sequential nature of the records also lets the estimator update smoothly as data arrive, which is a clear practical plus when observations come in streams or when full measurement is expensive, such as destructive testing. The Monte Carlo comparisons and the Dow Jones application show the estimator holding its own or beating Hill's while using fewer exact measurements, so the efficiency claim has some backing in the numbers they report. The extension of the same asymptotic results to the broader Pareto-type class is the part that needs scrutiny. When the tail is regularly varying but the slowly varying function L is not constant, the likelihood score and information can pick up additional terms that affect the rate and the limiting distribution. The abstract states the extension without mentioning second-order regular variation or similar controls on L, and the stress-test concern on this point is reasonable. If the proofs do not impose or verify those conditions, the general result may not hold at the claimed rate. The paper is aimed at extreme-value statisticians and applied researchers who work with heavy tails in risk or reliability settings, especially when data are sequential or costly to measure fully. A reader who wants a new likelihood-based estimator with some theory and simulation checks will find usable material here. It deserves peer review because the Pareto-case results are self-contained and the practical comparisons are concrete, even though the general extension would benefit from tighter conditions and clearer proof steps.

Referee Report

2 major / 2 minor

Summary. The paper develops a maximum likelihood estimator for the tail index using geometric records from i.i.d. samples of a Pareto distribution, establishes its strong consistency and asymptotic normality with an explicit asymptotic variance formula, and extends these properties to Pareto-type distributions (regularly varying tails). It compares the estimator's performance via Monte Carlo simulations against classical methods such as Hill's estimator, emphasizes advantages for sequential data arrival and destructive testing scenarios, and illustrates the method on Dow Jones Industrial Average fluctuation data.

Significance. If the asymptotic results hold under the stated conditions, the approach offers a practical alternative for tail-index estimation that requires fewer fully measured observations while maintaining competitive accuracy, which is valuable in applications with high measurement costs. The explicit variance expression and the focus on smooth sequential estimation trajectories are strengths, as is the direct comparison to existing estimators in simulations and the real-data example. The extension to the broader Pareto-type class, if rigorously justified, broadens applicability in extreme-value theory.

major comments (2)

[§4] §4 (extension to Pareto-type distributions): the claim that strong consistency and asymptotic normality carry over directly to distributions with regularly varying tails does not address the effect of a non-constant slowly varying function L on the record-value likelihood. When L is non-constant the score and Fisher information generally acquire extra terms that can alter the centering and the rate of convergence unless L satisfies a second-order regular-variation condition; the manuscript provides no such assumption and no verification that the normalized MLE still converges to the claimed normal limit with the same explicit variance.
[Theorem 3.2] Theorem 3.2 (asymptotic normality for exact Pareto): the explicit variance formula is derived under the geometric-record likelihood, but the proof sketch does not show that the information matrix remains non-degenerate uniformly in the record index; a concrete check that the variance expression remains positive and finite for all admissible tail indices would strengthen the result.

minor comments (2)

[Simulation study] The Monte Carlo section would benefit from reporting the exact number of replications, the range of sample sizes, and the precise definition of the geometric-record sampling scheme used in the simulations.
[Notation] Notation for the tail quantile function and the slowly varying component should be introduced once and used consistently; occasional switches between U(t) and the record-based formulation create minor ambiguity.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading and constructive comments. Below we respond point by point to the major comments and indicate the revisions that will be made to the manuscript.

read point-by-point responses

Referee: [§4] §4 (extension to Pareto-type distributions): the claim that strong consistency and asymptotic normality carry over directly to distributions with regularly varying tails does not address the effect of a non-constant slowly varying function L on the record-value likelihood. When L is non-constant the score and Fisher information generally acquire extra terms that can alter the centering and the rate of convergence unless L satisfies a second-order regular-variation condition; the manuscript provides no such assumption and no verification that the normalized MLE still converges to the claimed normal limit with the same explicit variance.

Authors: We agree that the extension requires additional justification when L is non-constant. In the revised manuscript we will impose a second-order regular-variation condition on L and show that the resulting perturbation terms in the score and Fisher information are asymptotically negligible, thereby preserving both strong consistency and the stated asymptotic normality with the same explicit variance. revision: yes
Referee: [Theorem 3.2] Theorem 3.2 (asymptotic normality for exact Pareto): the explicit variance formula is derived under the geometric-record likelihood, but the proof sketch does not show that the information matrix remains non-degenerate uniformly in the record index; a concrete check that the variance expression remains positive and finite for all admissible tail indices would strengthen the result.

Authors: We accept that the current proof sketch is incomplete on this point. The revision will contain an explicit verification that the Fisher information matrix is non-degenerate for every record index and that the asymptotic variance remains positive and finite for all admissible tail indices γ > 0. revision: yes

Circularity Check

0 steps flagged

No circularity: MLE construction and asymptotic results follow from standard likelihood theory without reduction to inputs by definition or self-citation.

full rationale

The paper constructs an MLE directly from the geometric records likelihood under the Pareto model, then invokes standard theorems for strong consistency and asymptotic normality (with explicit variance) to establish its properties. The extension to Pareto-type distributions is framed as a direct generalization of these results. No steps match the enumerated circularity patterns: there is no self-definitional loop (e.g., defining the estimator in terms of its own predicted quantities), no fitted parameter relabeled as a prediction, no load-bearing self-citation chain, no imported uniqueness theorem from the authors' prior work, no smuggled ansatz, and no renaming of known results. The derivation remains self-contained against external statistical benchmarks such as classical MLE asymptotics for i.i.d. samples, with the reader's assessment of score 1.0 aligning with the absence of any quoted reduction to inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

Only the abstract was available; detailed model assumptions and any additional technical conditions are not accessible.

axioms (1)

domain assumption Data consist of i.i.d. observations from a Pareto or Pareto-type distribution
Required for the likelihood construction and asymptotic results stated in the abstract.

pith-pipeline@v0.9.0 · 5463 in / 1302 out tokens · 63883 ms · 2026-05-07T14:11:33.376392+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

50 extracted references · 42 canonical work pages

[1]

Business cycle aned herding behav- ior in stock returns: theory and evid Hence.Financ

Ahn, K., Cong, L., Jang, H.et al.(2024). Business cycle aned herding behav- ior in stock returns: theory and evid Hence.Financ. Innov.,10, 6. DOI: https://doi.org/10.1186/s40854-023-00540-z

work page doi:10.1186/s40854-023-00540-z 2024
[2]

C., Balakrishnan, N., & Nagaraja, H

Arnold, B. C., Balakrishnan, N., & Nagaraja, H. N. (1998).Records. John Wiley & Sons. DOI: https://doi.org/10.1002/9781118150412

work page doi:10.1002/9781118150412 1998
[3]

G., & Stepanov, A

Balakrishnan, N., Pakes, A. G., & Stepanov, A. On the number and sum of near-record observations.Adv. Appl. Prob.,37, 765–780. DOI: https://doi.org/10.1239/aap/1127483746

work page doi:10.1239/aap/1127483746
[4]

E., & Podobnik, B

Beguˇ si´ c, S., Kostanjˇ car, Z., Stanley, H. E., & Podobnik, B. (2018). Scal- ing porperties of extreme price fluctuations in Bitcoin markets.Phys- ica A: Statistical Mechanics and its Applications,510, 400–406. DOI: https://doi.org/10.1016/j.physa.2018.06.131

work page doi:10.1016/j.physa.2018.06.131 2018
[5]

Beirlant, J., Vynckier, P., & Teugels, J. L. (1996). Tail index estimation, Pareto quantile plots, and regression diagnostics.J. Amer. Statist. Assoc.,91, 1659–

1996
[6]

DOI: https://doi.org/10.2307/2291593

work page doi:10.2307/2291593
[7]

Berred, M. (1992). On record values and the exponent of a distribution with regularly varying upper tail.J. Appl. Prob.,29(3), 575–586. DOI: https://doi.org/10.2307/3214894

work page doi:10.2307/3214894 1992
[8]

Bertail, P., Cl´ emen¸ con, S., & Fern´ andez, C. (2025). Tail index estimation for dis- crete heavy-tailed distributions with application to statistical inference for regular 30 markov chains.TEST,34, 691–713. DOI: https://doi.org/10.1007/s11749-025- 00975-9

work page doi:10.1007/s11749-025- 2025
[9]

H., Goldie, C

Bingham, N. H., Goldie, C. M., & Teugels, J. L. (1989).Regular Variation. Encyclopedia of Mathematics and its Applications. Cambrigde University Press

1989
[10]

M., Chen, X

Chaudhry, S. M., Chen, X. H., Ahmed, R., & Nasir, M. A. (2025). Risk modelling of ESG (environmental, social, and governance), healthcare, and financial sectors. Risk Analysis,45, 477–495. DOI: https://doi.org/10.1111/risa.14195

work page doi:10.1111/risa.14195 2025
[11]

Chen, Q., & Liu, J. (2017). The conditional Borel-Cantelli lemma and applications.J. Korean Math. Soc.,54(2), 441–460. DOI: https://doi.org/10.4134/JKMS.j160036

work page doi:10.4134/jkms.j160036 2017
[12]

Cs¨ org˝ o, S., & Viharos, L. (1995). On the asymptotic normality of Hill’s estimator.Math. Proc. Camb. Phil. Soc.,118(2), 375–382. DOI: https://doi.org/10.1017/S0305004100073710

work page doi:10.1017/s0305004100073710 1995
[13]

Danielsson, J., de Haan, L., Peng, L., & de Vries, C. G. (2001). Using a bootstrap method to choose the sample fraction in tail index estimation.Journal of Multi- variate Analysis,76(2), 226–248. DOI: https://doi.org/10.1006/jmva.2000.1903

work page doi:10.1006/jmva.2000.1903 2001
[14]

Davletov, F. (2022). Estimating the tail index of conditional distribution of asset returns.International Journal of Financial Research,13(2), 21615. DOI: https://doi.org/10.5430/ijfr.v13n2p14

work page doi:10.5430/ijfr.v13n2p14 2022
[15]

Deheuvels, P., Haeusler, E., & Mason, D. M. (1988). Almost sure convergence of the Hill estimator.Math. Proc. Camb. Phil. Soc.,104(2), 371–381. DOI: https://doi.org/10.1017/S0305004100065531

work page doi:10.1017/s0305004100065531 1988
[16]

Drees, H., de Haan, L., & Resnick, S. (2000). How to make a Hill plot.Ann. Stat., 28(1), 254–274

2000
[17]

Eliazar, I. (2005). On geometric record times.Physica A: Stat. Mech. Appl.,348, 181-198. DOI: https://doi.org/10.1016/j.physa.2004.09.009

work page doi:10.1016/j.physa.2004.09.009 2005
[18]

Fedotenkov, I. (2020). A review of more than one hundred Pareto-tail index estimators.Statistica,80(3), 245–299. DOI: https://doi.org/10.6092/issn.1973- 2201/9533

work page doi:10.6092/issn.1973- 2020
[19]

Ferguson, T. S. (1996).A Course in Large Sample Theory. Texts in Statistical Science. Chapman & Hall/CRC

1996
[20]

Glick, N. (1978). Breaking records and breaking boards.The American Mathe- matical Monthly,85(1), 2–26. DOI: https://doi.org/10.2307/2978044. 31

work page doi:10.2307/2978044 1978
[21]

I., Caeiro, F., Figueiredo, F., Henriques-Rodrigues, L., & Pestana, D

Gomes, M. I., Caeiro, F., Figueiredo, F., Henriques-Rodrigues, L., & Pestana, D. (2020). Corrected-Hill versus partially reduced-bias value-at-risk estima- tion.Communications in Statistics-Simulation and Computation,49(4), 867–885. DOI: https://doi.org/10.1080/03610918.2018.1489053

work page doi:10.1080/03610918.2018.1489053 2020
[22]

Gopikrishnan, P., Meyer, M., Amaral, L. A. N., & Stanley, H. E. (1998). Inverse cubic law for the distribution of stock price variations.Eur. Phys. J. B,3, 139–

1998
[23]

DOI: https://doi.org/10.1007/s100510050292

work page doi:10.1007/s100510050292
[24]

Gopikrishnan, P., Plerou, V., Amaral, L. A. N., Meyer, M., & Stan- ley, H. E. (1999). Scaling of the distribution of fluctuations of financial market indices.Physical Review E,60(5), 5305–5316. DOI: https://doi.org/10.1103/PhysRevE.60.5305

work page doi:10.1103/physreve.60.5305 1999
[25]

J., & Sanz, G

Gouet, R., L´ opez, F. J., & Sanz, G. (2007). Asymptotic normality for the counting process of weak records andδ-records in discrete models.Bernoulli,13(3), 754–

2007
[26]

DOI: https://doi.org/10.3150/07-BEJ6027

work page doi:10.3150/07-bej6027
[27]

J., & Sanz, G

Gouet, R., L´ opez, F. J., & Sanz, G. (2012). Onδ-record observations: asymptotic rates for the counting process and elements of maximum likelihood estimation. TEST,21(1), 188–214. DOI: https://doi.org/10.1007/s11749-011-0242-6

work page doi:10.1007/s11749-011-0242-6 2012
[28]

J., Maldonado, L., & Sanz, G

Gouet, R., L´ opez, F. J., Maldonado, L., & Sanz, G. (2014). Statistical inference for the geometric distribution based onδ-records.Computational Statistics & Data Analysis,78, 21–32. DOI: https://doi.org/10.1016/j.csda.2014.04.002

work page doi:10.1016/j.csda.2014.04.002 2014
[29]

J., Maldonado, L., & Sanz, G

Gouet, R., L´ opez, F. J., Maldonado, L., & Sanz, G. (2020). Statistical inference for the Weibull distribution based onδ-record data.Symmetry,12, 20. DOI: https://doi.org/10.3390/sym12010020

work page doi:10.3390/sym12010020 2020
[30]

Haeusler, E., & Teugles, J. L. (1985). On asymptotic normality of Hill’s esti- mator for the exponent of regular variation.Ann. Stat.,13(2), 743–756. DOI: https://doi.org/10.1214/aos/1176349551

work page doi:10.1214/aos/1176349551 1985
[31]

Hall, P. (1982). On some simple estimates of an exponent of regular variation. Journal of the Royal Statistical Society: Series B (Methodological),44(1), 37–42

1982
[32]

Hill, B. M. (1975). A simple approach to inference about the tail of a distribution. Ann. Stat.,3(5), 1163–1174. DOI: https://doi.org/10.1214/aos/1176343247

work page doi:10.1214/aos/1176343247 1975
[33]

Keles, D., Hadzi-Mishev, R., & Paraschiv, F. (2016). Extreme value theory for heavy tails in electricity prices.Journal of Energy Markets,9(2), 21–50. DOI: https://doi.org/10.21314/JEM.2016.141

work page doi:10.21314/jem.2016.141 2016
[34]

L., Bousquet, N., & Remy, E

Keller, M., Popelin, A. L., Bousquet, N., & Remy, E. (2015). Nonpara- metric estimation of the probability of detection of flaws in an indus- trial component, from destructive and nondestructive testing data, using 32 approximate Bayesian computation.Risk Analysis,35(9), 1595–1610. DOI: https://doi.org/10.1111/risa.12484

work page doi:10.1111/risa.12484 2015
[35]

D., & Passalidis, C

Konstantinides, G. D., & Passalidis, C. D. (2025). A new approach in two- dimensional heavy-tailed distributions.Annals of Actuarial Science,19(2), 317–349. DOI: https://doi.org/10.1017/S1748499525000041

work page doi:10.1017/s1748499525000041 2025
[36]

R., & Chandak, N

Kumavat, H. R., & Chandak, N. R. (2024). Statistical analysis for evaluating concrete strength of existing structure using non-destructive and destructive test.Innovative Infrastructure Solutions,9(5), 173. DOI: https://doi.org/10.1007/s41062-024-01490-w

work page doi:10.1007/s41062-024-01490-w 2024
[37]

Langousis, A., Mamalakis, A., Puliga, M., & Deidda, R. (2016). Threshold detec- tion for the generalized Pareto distribution: Review of representative methods and application to the NOAA NCDC daily rainfall database.Water Resources Research,52(4), 2659–2681. DOI: https://doi.org/10.1002/2015WR018502

work page doi:10.1002/2015wr018502 2016
[38]

Louzaoui, A., & El Arrouchi, M. (2020). On the maximum likelihood estima- tion of extreme value index based onk-record values.Journal of Probability and Statistics,2020(2), 1–9. DOI: https://doi.org/10.1155/2020/5497413

work page doi:10.1155/2020/5497413 2020
[39]

Louzaoui, A., & El Arrouchi, M. (2023). Improving the bias of a pseudo-maximum likelihood estimate of the extreme value index byk-records.J. Stat. Theory Appl., 22, 54–69. DOI: https://doi.org/10.1007/s44199-023-00055-7

work page doi:10.1007/s44199-023-00055-7 2023
[40]

Mason, D. M. (1982). Laws of large numbers for sums of extreme values.Ann. Probab.,10(3), 756–764. DOI: https://doi.org/10.1214/AOP/1176993783

work page doi:10.1214/aop/1176993783 1982
[41]

Mudelsee, M., & Bermejo, M. A. (2017). Optimal heavy tail estimation - Part 1: Order selection.Nonlin. Processes Geophys.,24(4), 737–744. DOI: https://doi.org/10.5194/npg-24-737-2017

work page doi:10.5194/npg-24-737-2017 2017
[42]

D., & Papalexiou, S

Nerantzaki, S. D., & Papalexiou, S. M. (2022). Assessing extremes in hydrocli- matology: A review on probabilistic methods.Journal of Hydrology,605, 127302. DOI: https://doi.org/10.1016/j.jhydrol.2021.127302

work page doi:10.1016/j.jhydrol.2021.127302 2022
[43]

Nolan, J. P. (2020).Univariate Stable Distributions. Springer Series in Operations Research and Financial Engineering. Springer Naure. DOI: https://doi.org/10.1007/978-3-030-52915-4

work page doi:10.1007/978-3-030-52915-4 2020
[44]

O., Semary, H

Obulezi, J. O., Semary, H. E., Nadir, S., Igbokwe, C. P., Orji, G. O., Al-Moisheer, A. S., & Elgarhy, M. (2025). Type-I heavy-tailed Burr XII distribution with applications to quality control, skewed reliability engineering systems and lifetime data.Computer Modeling in Engineering & Sciences,144(3), 2991–3027. DOI: https://doi.org/10.32604/cmes.2025.069553. 33

work page doi:10.32604/cmes.2025.069553 2025
[45]

The Annals of Statistics , author =

Pickands, J. III (1975). Statistical inference using extreme order statistics.Ann. Statist.,3(1), 119–131. DOI: https://doi.org/10.1214/aos/1176343003

work page doi:10.1214/aos/1176343003 1975
[46]

Resnick, S. I. (2007).Heavy-tail phenomena. Probabilistic and statistical modeling. Springer Nature

2007
[47]

I., & Stˇ aricˇ a, C

Resnick, S. I., & Stˇ aricˇ a, C. (1997). Smoothing the Hill estimator.Advances in Applied Probability,29(1), 271–293. DOI: https://doi.org/10.2307/1427870

work page doi:10.2307/1427870 1997
[48]

She, R., Dai, L., & Ling, S. (2025). Testing for change-points in heavy-tailed time series—A Winsorized CUSUM approach.Journal of Business & Economic Statistics, 1–13. DOI: https://doi.org/10.1080/07350015.2025.2561747

work page doi:10.1080/07350015.2025.2561747 2025
[49]

Shorrock, R. W. (1972). On record values and record times.Journal of Applied Probability,9(2), 316–326. DOI: https://doi.org/10.2307/3212801

work page doi:10.2307/3212801 1972
[50]

M., Papalexiou, S

Vogel, R. M., Papalexiou, S. M., Lamontagne, J. R., & Dolan, F. C. (2024). When heavy tails disrupt statistical inference.The American Statistician,79(2), 221–235. DOI: https://doi.org/10.1080/00031305.2024.2402898. Appendix A Lemma A.1Letp∈(0,1)andn∈N. IfXfollows a truncated geometric distribution to {0,1, . . . , n}, then a) E(X) = 1−p p −(n+ 1) (1−p) n...

work page doi:10.1080/00031305.2024.2402898 2024