
arXiv: 2409.10030 · v4 · submitted 2024-09-16 · stat.ME · econ.EM · stat.ML

LASSO Inference for High Dimensional Predictive Regressions

Pith reviewed 2026-05-23 21:09 UTC · model grok-4.3

classification stat.ME · econ.EM · stat.ML
keywords high-dimensional inference · LASSO · predictive regression · Stambaugh bias · IVX · desparsified LASSO · nonstationary regressors · stock return predictability

The pith

XDlasso corrects both LASSO shrinkage bias and Stambaugh bias in high-dimensional predictive regressions without classifying regressors.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

LASSO estimation in high-dimensional predictive regressions produces shrinkage bias that prevents standard t-tests from working. Nonstationary predictors modeled as local unit roots introduce an additional Stambaugh bias. The paper introduces the IVX-desparsified LASSO estimator, called XDlasso, that removes both sources of bias in one step. XDlasso requires no prior information on which predictors are nonstationary. The authors derive the asymptotic normality needed for valid hypothesis testing and illustrate the procedure with Monte Carlo experiments plus applications to U.S. stock returns and inflation predictability.

Core claim

We propose the IVX-desparsified LASSO (XDlasso) that simultaneously eliminates both shrinkage bias and Stambaugh bias. XDlasso does not require prior knowledge about the identities of nonstationary and stationary regressors. We establish the asymptotic properties of XDlasso for hypothesis testing.

What carries the argument

The IVX-desparsified LASSO (XDlasso), which augments desparsified LASSO with an IVX instrument to jointly remove shrinkage and Stambaugh biases.
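The IVX step mentioned above can be illustrated with the standard IVX recursion from the IVX literature, which turns a possibly persistent regressor into a mildly integrated instrument. This is a minimal sketch, not the paper's implementation: the function name `ivx_instrument` is hypothetical, and the defaults `c_z = -1` and `tau = 0.95` are illustrative assumptions, not values confirmed for XDlasso.

```python
def ivx_instrument(x, c_z=-1.0, tau=0.95):
    """Build a mildly integrated IVX instrument from one regressor series.

    Recursion: z_t = rho_z * z_{t-1} + (x_t - x_{t-1}), with z_0 = 0 and
    rho_z = 1 + c_z / T**tau, where c_z < 0 and 0 < tau < 1.
    The defaults for c_z and tau are illustrative assumptions.
    """
    if len(x) < 2:
        raise ValueError("need at least two observations")
    if not (c_z < 0 and 0 < tau < 1):
        raise ValueError("need c_z < 0 and 0 < tau < 1")
    T = len(x) - 1                      # number of first differences
    rho_z = 1.0 + c_z / T ** tau        # instrument persistence, below 1
    z = [0.0]
    for t in range(1, len(x)):
        # Filter the first difference of x; no knowledge of x's persistence is used.
        z.append(rho_z * z[-1] + (x[t] - x[t - 1]))
    return z, rho_z


if __name__ == "__main__":
    series = [0.0, 1.0, 2.0, 3.0]
    instrument, rho = ivx_instrument(series)
    print(rho, instrument)
```

The filter is applied identically to every regressor, which is consistent with the summary's point that no classification into stationary and nonstationary predictors is needed.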

If this is right

  • Standard t-statistics become asymptotically valid for testing individual coefficients after XDlasso estimation.
  • The estimator applies directly to predictive regressions that mix stationary and nonstationary predictors.
  • No pre-testing or classification of regressors is needed before inference.
  • The method supports empirical questions such as earnings-price-ratio predictability of stock returns and unemployment predictability of inflation.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same bias-correction logic could be tested on other high-dimensional time-series settings that mix I(0) and I(1) variables.
  • If the local-unit-root modeling assumption holds in practice, XDlasso may allow routine use of penalized estimators in macroeconometric forecasting without separate stationarity checks.
  • Extensions to panel or multivariate predictive systems would follow naturally from the current single-equation theory.

Load-bearing premise

Nonstationary regressors follow local unit root processes so that the IVX correction can be combined with the desparsified LASSO adjustment.
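To make the premise concrete, the sketch below simulates a single local-unit-root predictor and measures the finite-sample bias of the OLS slope, which is the Stambaugh bias in its simplest univariate form. Everything here is an illustrative assumption: the function names, the convention rho = 1 - c/T with c > 0, the Gaussian innovations, and the chosen values of T, c, and the innovation correlation `delta`. It is not the paper's simulation design.

```python
import random


def simulate_ols_slope(T, c, delta, beta, rng):
    """One draw of the OLS slope (with intercept) in y_t = beta * x_{t-1} + u_t,
    where x_t = (1 - c/T) * x_{t-1} + v_t and corr(u_t, v_t) = delta."""
    rho = 1.0 - c / T                      # local-to-unity AR coefficient (assumed convention)
    x_prev = 0.0
    xs, ys = [], []
    for _ in range(T):
        v = rng.gauss(0.0, 1.0)
        e = rng.gauss(0.0, 1.0)
        u = delta * v + (1.0 - delta ** 2) ** 0.5 * e   # unit variance, correlated with v
        xs.append(x_prev)                  # lagged predictor x_{t-1}
        ys.append(beta * x_prev + u)       # y_t
        x_prev = rho * x_prev + v          # x_t
    x_bar = sum(xs) / T
    y_bar = sum(ys) / T
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(xs, ys))
    sxx = sum((a - x_bar) ** 2 for a in xs)
    return sxy / sxx


def mean_bias(T=100, c=5.0, delta=-0.9, beta=0.0, reps=2000, seed=0):
    """Monte Carlo average of (OLS slope - beta) across replications."""
    rng = random.Random(seed)
    draws = [simulate_ols_slope(T, c, delta, beta, rng) for _ in range(reps)]
    return sum(draws) / reps - beta


if __name__ == "__main__":
    print(mean_bias())
```

With negatively correlated innovations the average slope estimate sits above the true value, and the sign flips when `delta` is positive; the bias vanishes when `delta` is zero.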

What would settle it

Monte Carlo experiments or real-data applications in which XDlasso t-statistics fail to attain correct size or power when the regressors are local-to-unity processes.
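A size check of this kind reduces to counting how often the t-statistic rejects a true null. The helper below is a minimal sketch of that bookkeeping; the function names, the 1.96 critical value, and the two-standard-error tolerance band are assumptions chosen for illustration, and the t-statistics fed to it would come from whatever estimator is under test.

```python
import random


def rejection_rate(t_stats, crit=1.96):
    """Share of |t| values exceeding the two-sided critical value."""
    if not t_stats:
        raise ValueError("t_stats must be non-empty")
    return sum(abs(t) > crit for t in t_stats) / len(t_stats)


def size_within_band(rate, nominal=0.05, reps=1000, width=2.0):
    """True if `rate` lies within `width` Monte Carlo standard errors of `nominal`."""
    se = (nominal * (1.0 - nominal) / reps) ** 0.5
    return abs(rate - nominal) <= width * se


if __name__ == "__main__":
    # Stand-in for simulated t-statistics: exact N(0, 1) draws under the null.
    rng = random.Random(1)
    draws = [rng.gauss(0.0, 1.0) for _ in range(20000)]
    print(rejection_rate(draws))
```

An estimator whose rejection rate falls well outside the band under a true null, in a design with local-to-unity regressors, would be the kind of evidence this section describes.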

Original abstract

LASSO inflicts shrinkage bias on estimated coefficients, which undermines asymptotic normality and invalidates standard inferential procedures based on the t-statistic. Given cross sectional data, the desparsified LASSO has emerged as a well-known remedy for correcting the shrinkage bias. In the context of high dimensional predictive regression, the desparsified LASSO faces an additional challenge: the Stambaugh bias arising from nonstationary regressors modeled as local unit roots. To restore standard inference, we propose a novel estimator called IVX-desparsified LASSO (XDlasso). XDlasso simultaneously eliminates both shrinkage bias and Stambaugh bias and does not require prior knowledge about the identities of nonstationary and stationary regressors. We establish the asymptotic properties of XDlasso for hypothesis testing, and our theoretical findings are supported by Monte Carlo simulations. Applying our method to real-world applications from the FRED-MD database, we investigate two important empirical questions: (i) the predictability of the U.S. stock returns based on the earnings-price ratio, and (ii) the predictability of the U.S. inflation using the unemployment.
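For orientation, the standard desparsified LASSO correction that the abstract refers to can be written as follows. This is the textbook form adapted to a predictive regression of y_t on the lagged p-vector x_{t-1}; the symbols and the timing index are notation chosen here, and the IVX-based modification that defines XDlasso is given in the paper and is not reproduced.

```latex
% Initial LASSO fit:
\hat\beta = \arg\min_{\beta} \frac{1}{T}\sum_{t=1}^{T} \bigl(y_t - x_{t-1}^{\top}\beta\bigr)^2 + \lambda \lVert \beta \rVert_1

% Nodewise LASSO residual for coordinate j (x_j regressed on the other regressors):
\hat r_{j,t} = x_{j,t-1} - x_{-j,t-1}^{\top}\hat\gamma_j

% One-step bias correction:
\hat b_j = \hat\beta_j + \frac{\sum_{t=1}^{T} \hat r_{j,t}\,\bigl(y_t - x_{t-1}^{\top}\hat\beta\bigr)}{\sum_{t=1}^{T} \hat r_{j,t}\, x_{j,t-1}}
```

The added term removes the first-order shrinkage bias of the LASSO coefficient. It does not address the Stambaugh bias, which is why, per the abstract, an IVX component is combined with it.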

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper proposes the IVX-desparsified LASSO (XDlasso) estimator for high-dimensional predictive regressions. It claims that XDlasso simultaneously removes LASSO shrinkage bias (via desparsification) and Stambaugh bias (via IVX instrumentation) for regressors that may include an unknown mix of stationary and local unit root processes, without requiring prior classification of which regressors are nonstationary. Asymptotic normality is established to justify standard t-based inference, with supporting Monte Carlo simulations and two empirical applications to FRED-MD data on stock-return predictability (earnings-price ratio) and inflation predictability (unemployment).

Significance. If the asymptotic result holds for arbitrary unknown subsets of local-to-unity regressors, the contribution would be substantial: it would enable valid post-selection inference in the common setting of high-dimensional macro/finance predictive regressions with mixed persistence, without auxiliary classification steps. The Monte Carlo design and real-data illustrations provide direct evidence of practical utility.

major comments (2)
  1. [Theoretical results (asymptotic expansion of XDlasso)] The central claim that a single IVX filter (applied uniformly) jointly eliminates Stambaugh bias for any unknown mix of stationary and local-unit-root regressors is load-bearing. The asymptotic expansion must explicitly verify that cross terms between the two persistence classes remain o_p(T^{-1/2}) after the nodewise Lasso precision-matrix step; otherwise the desparsification correction alone does not guarantee the claimed normality. This requires a concrete argument (or counter-example) for the case in which the cardinality and identities of the local-unit-root regressors are unknown.
  2. [Monte Carlo simulations] The Monte Carlo section reports support for the asymptotic normality claim, but does not specify the exact persistence parameters (c values) used for the local-unit-root regressors, the dimension-to-sample-size ratios, or the rule for declaring a regressor 'nonstationary' in the design. Without these details it is impossible to assess whether the simulations actually probe the mixed-persistence regime that the theory must cover.
minor comments (2)
  1. Notation for the IVX tuning parameter and the nodewise Lasso penalty should be unified across the theoretical statements and the algorithm box.
  2. [Empirical applications] The empirical section would benefit from reporting the number of selected regressors and the effective sample size after any trimming, to allow readers to gauge the high-dimensional regime actually encountered.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments on our manuscript. We address each major comment below and will revise the paper to incorporate the requested clarifications and explicit arguments.

Point-by-point responses
  1. Referee: The central claim that a single IVX filter (applied uniformly) jointly eliminates Stambaugh bias for any unknown mix of stationary and local-unit-root regressors is load-bearing. The asymptotic expansion must explicitly verify that cross terms between the two persistence classes remain o_p(T^{-1/2}) after the nodewise Lasso precision-matrix step; otherwise the desparsification correction alone does not guarantee the claimed normality. This requires a concrete argument (or counter-example) for the case in which the cardinality and identities of the local-unit-root regressors are unknown.

    Authors: We appreciate the referee's emphasis on this point. Theorem 3.2 and its proof already establish asymptotic normality for arbitrary unknown subsets by showing that IVX filtering eliminates Stambaugh bias uniformly across persistence classes, with the nodewise Lasso applied to the filtered series ensuring the required orthogonality. The cross terms are controlled to o_p(T^{-1/2}) via the moment conditions and the fact that IVX instruments induce similar asymptotic behavior regardless of the original persistence. To address the request for explicit verification, the revised manuscript will add a dedicated lemma in the appendix that directly bounds these cross terms for unknown cardinality and identities of local-to-unity regressors.

    revision: yes

  2. Referee: The Monte Carlo section reports support for the asymptotic normality claim, but does not specify the exact persistence parameters (c values) used for the local-unit-root regressors, the dimension-to-sample-size ratios, or the rule for declaring a regressor 'nonstationary' in the design. Without these details it is impossible to assess whether the simulations actually probe the mixed-persistence regime that the theory must cover.

    Authors: We agree that additional specification is needed for reproducibility and to demonstrate coverage of the mixed-persistence setting. The revised Monte Carlo section will explicitly report the persistence parameters (c=0 for stationary regressors and c values in {5,10,20} for local unit roots), the p/T ratios examined (including p/T=0.25 and p/T=0.5), and clarify that XDlasso requires no classification rule while comparison estimators use an ADF-based threshold for benchmarking purposes. These details will confirm that the design probes the relevant regime.

    revision: yes

Circularity Check

0 steps flagged

No circularity: XDlasso asymptotics derived from standard desparsified LASSO + IVX combination with independent theory

Full rationale

The paper proposes XDlasso by combining desparsified LASSO (to remove shrinkage bias) with IVX filtering (to remove Stambaugh bias) without requiring prior classification of regressors. It states that asymptotic properties for hypothesis testing are established and validated by Monte Carlo simulations plus real-data applications. No quoted steps reduce a claimed prediction or uniqueness result to a fitted parameter, self-citation chain, or definitional tautology. The central claim rests on a new estimator whose properties are derived rather than presupposed by construction. This is the normal non-circular outcome for a methods paper that supplies its own asymptotic expansion and external checks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The approach rests on standard high-dimensional asymptotics and the local-unit-root modeling of nonstationarity; no free parameters or invented entities are explicitly introduced in the abstract.

axioms (1)
  • domain assumption: Nonstationary regressors are modeled as local unit roots
    Explicitly identified in the abstract as the source of Stambaugh bias that the IVX component must correct.

pith-pipeline@v0.9.0 · 5741 in / 1185 out tokens · 40557 ms · 2026-05-23T21:09:03.908196+00:00 · methodology


Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Feature Screening for High-Dimensional Structural Break Predictive Regression

    stat.ME · 2026-06 · unverdicted · novelty 4.0

    Develops SICS and RCRS screening methods for consistent selection of sparse active predictors and change points in high-dimensional structural break predictive regressions that may involve stationary or cointegrated series.
