pith · machine review for the scientific record

arxiv: 2605.05076 · v1 · submitted 2026-05-06 · 🧮 math.ST · stat.CO · stat.ME · stat.ML · stat.TH

Recognition: unknown

High-Dimensional Statistics: Reflections on Progress and Open Problems

Ali Shojaie, Anru Zhang, Arian Maleki, Chao Gao, Christos Thrampoulidis, Jason M. Klusowski, Po-Ling Loh, Rishabh Dudeja, Sivaraman Balakrishnan, Subhabrata Sen, Verena Zuber, Weijie Su

Pith reviewed 2026-05-08 15:30 UTC · model grok-4.3

classification 🧮 math.ST · stat.CO · stat.ME · stat.ML · stat.TH
keywords high-dimensional statistics · estimation and inference · open problems · complex datasets · random matrix theory · optimization · interdisciplinary connections

The pith

High-dimensional statistics has evolved to tackle sophisticated problems in complex datasets by building connections across multiple mathematical and computational fields.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper synthesizes progress in high-dimensional statistics over the last twenty years, noting how cheaper data collection has led to more intricate datasets. It describes how the field has developed advanced estimation and inference techniques in response. These developments matter because they link statistics to optimization, random matrix theory, and other areas, opening pathways to better understanding in sciences such as biology and medicine. The review also flags open problems to guide future work.

Core claim

Over the past two decades, the field of high-dimensional statistics has experienced substantial progress, driven largely by technological advances that have dramatically reduced the cost and effort for data collection and storage across a broad range of domains. Modern datasets are increasingly complex, often exhibiting rich dependency, heterogeneity, and other features that challenge traditional statistical methods. In response, high-dimensional statistics has evolved to address more sophisticated estimation and inference problems, fostering deep connections with optimization, concentration of measure, random matrix theory, information theory, and theoretical computer science.
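To make the estimation problems at the center of this synthesis concrete, the sketch below works through the canonical high-dimensional example: sparse linear regression with far more features than samples, fit by Lasso coordinate descent. The model, problem sizes, penalty scaling, and helper names (`lasso_cd`, `soft_threshold`) are standard illustrative assumptions on our part, not details taken from the paper.

```python
# Minimal sketch: sparse linear regression with p >> n via Lasso
# coordinate descent. Illustrative only; sizes and penalty are assumptions.
import numpy as np

rng = np.random.default_rng(0)
n, p, s, sigma = 100, 1000, 5, 0.5        # samples, features, nonzeros, noise
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:s] = 3.0
y = X @ beta_true + sigma * rng.standard_normal(n)

def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=100):
    """Cyclic coordinate descent for (1/2n)||y - Xb||^2 + lam * ||b||_1."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n     # per-coordinate curvature
    resid = y.copy()                      # maintains y - X @ beta
    for _ in range(n_iter):
        for j in range(p):
            rho = X[:, j] @ resid / n + col_sq[j] * beta[j]
            b_new = soft_threshold(rho, lam) / col_sq[j]
            resid -= X[:, j] * (b_new - beta[j])
            beta[j] = b_new
    return beta

lam = sigma * np.sqrt(2 * np.log(p) / n)  # classical high-dimensional scaling
beta_hat = lasso_cd(X, y, lam)
print("estimated support:", np.flatnonzero(np.abs(beta_hat) > 1e-8))
```

Under this penalty scaling the maximal noise correlation concentrates just below lam, so the fit typically zeroes out the 995 null coordinates while retaining the five true ones, which is the basic phenomenon the field's estimation theory formalizes.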

What carries the argument

The synthesis of representative advances, common themes, and open problems that serve as entry points into high-dimensional statistics.

If this is right

  • The field's connections to other areas will continue to produce new tools for data analysis.
  • Open problems identified will direct research toward handling data dependency and heterogeneity.
  • Entry points provided will help new researchers engage with the literature efficiently.
  • Practical applications in medicine and astronomy will benefit from refined estimation methods.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The review implies that ignoring these interdisciplinary links could slow progress in statistical methodology.
  • Future work might test whether addressing the open problems leads to measurable improvements in prediction accuracy on real datasets.
  • Connections to theoretical computer science could influence algorithm design for large-scale data processing.

Load-bearing premise

The chosen representative advances and open problems accurately reflect the field's key developments without significant omissions.

What would settle it

A systematic survey revealing a major advance or open problem in high-dimensional statistics that this reflection omits would undercut its claim to representativeness.

Figures

Figures reproduced from arXiv: 2605.05076 by Ali Shojaie, Anru Zhang, Arian Maleki, Chao Gao, Christos Thrampoulidis, Jason M. Klusowski, Po-Ling Loh, Rishabh Dudeja, Sivaraman Balakrishnan, Subhabrata Sen, Verena Zuber, Weijie Su.

Figure 1: Schematic phase diagram illustrating the computational-statistical gap. The solid blue curve marks …
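For readers new to such phase diagrams, sparse principal component analysis is a standard illustration of a computational-statistical gap. The thresholds below follow the usual presentation in the literature and are our illustrative assumption, not necessarily the model plotted in the paper's figure.

```latex
% Sparse PCA as a canonical computational-statistical gap (illustrative).
% Data: X_1, \dots, X_n \sim N(0,\, I_p + \theta\, v v^\top),
% with v a k-sparse unit vector and \theta the signal strength.
\[
\theta \;\gtrsim\; \sqrt{\frac{k \log p}{n}}
  \quad \text{detection is information-theoretically possible;}
\]
\[
\theta \;\gtrsim\; \sqrt{\frac{k^{2}}{n}}
  \quad \text{threshold achieved by known polynomial-time algorithms.}
\]
% The region between the two curves is the "gap": statistically solvable,
% but conjectured to be computationally hard.
```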
Figure 2: (a) Schematic illustration of data integration from summary-level data. (b) Illustration of analysis …
Figure 3: A visual comparison of (a) one-shot averaging vs. (b) iterative optimization. Machine …
Figure 4: Some interesting open directions in distributed learning. (a) An illustration of a sequential setting, …
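Figure 3's contrast can be made concrete with a small simulation: each machine fits least squares on its own data shard and the estimates are averaged once, versus pooling all data and fitting centrally. The setup below (sizes, the `local_ols` helper, the averaging scheme) is a hypothetical sketch under our own assumptions, not the paper's code.

```python
# Minimal sketch of Figure 3's contrast: one-shot averaging of local
# least-squares fits across m machines vs. the centralized pooled fit.
import numpy as np

rng = np.random.default_rng(1)
m, n_local, p = 10, 200, 5                 # machines, samples/machine, dims
beta_true = rng.standard_normal(p)

def local_ols(X, y):
    # least-squares fit on a single machine's shard
    return np.linalg.lstsq(X, y, rcond=None)[0]

shards = []
for _ in range(m):
    X = rng.standard_normal((n_local, p))
    y = X @ beta_true + rng.standard_normal(n_local)
    shards.append((X, y))

# one-shot averaging: each machine fits once, estimates are averaged
beta_avg = np.mean([local_ols(X, y) for X, y in shards], axis=0)

# centralized baseline: pool all data and fit once
X_all = np.vstack([X for X, _ in shards])
y_all = np.concatenate([y for _, y in shards])
beta_pooled = local_ols(X_all, y_all)

print("one-shot avg error:", np.linalg.norm(beta_avg - beta_true))
print("pooled error      :", np.linalg.norm(beta_pooled - beta_true))
```

For plain linear models one-shot averaging is nearly as accurate as pooling; the interesting open directions the figure points to arise once the local problems are nonlinear or heterogeneous, where iterative communication starts to matter.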
Original abstract

Over the past two decades, the field of high-dimensional statistics has experienced substantial progress, driven largely by technological advances that have dramatically reduced the cost and effort for data collection and storage across a broad range of domains, including biology, medicine, astronomy, and the social and environmental sciences. Modern datasets are increasingly complex, often exhibiting rich dependency, heterogeneity, and other features that challenge traditional statistical methods. In response, high-dimensional statistics has evolved to address more sophisticated estimation and inference problems. This evolution has, in turn, fostered deep connections with and contributions to a wide range of research areas, including optimization, concentration of measure, random matrix theory, information theory, and theoretical computer science. Given the rapid pace of recent developments in high-dimensional statistics, our goal is to synthesize representative advances, highlight common themes and open problems, and point to important works that offer entry points into the field.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 2 minor

Summary. The manuscript is a reflective survey on high-dimensional statistics over the past two decades. It claims that technological advances enabling large-scale data collection have produced complex datasets with dependencies and heterogeneity, driving the field to develop more sophisticated estimation and inference techniques. These developments have created interdisciplinary links with optimization, concentration of measure, random matrix theory, information theory, and theoretical computer science. The paper synthesizes representative advances, identifies common themes and open problems, and provides pointers to key literature as entry points, while explicitly framing the selection as non-exhaustive.

Significance. If the synthesis is balanced, the paper offers a useful high-level overview and set of entry points for a rapidly evolving field. Its explicit acknowledgment of non-exhaustiveness and focus on interdisciplinary connections could help orient new researchers and highlight cross-field opportunities. The survey format itself is a strength when it successfully points readers to primary sources rather than attempting exhaustive coverage.

minor comments (2)
  1. [Abstract] The abstract and introduction would benefit from a brief explicit statement of the manuscript's intended audience (e.g., researchers new to the area versus specialists) to help readers calibrate expectations for depth versus breadth.
  2. [Introduction] Section headings and transitions between thematic blocks could be strengthened with short forward-looking sentences that preview how each advance connects to the open problems listed later.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive summary, significance assessment, and recommendation of minor revision. The manuscript is framed as a non-exhaustive synthesis of representative advances, common themes, open problems, and interdisciplinary connections in high-dimensional statistics, with pointers to key entry-point works.

Circularity Check

0 steps flagged

No significant circularity in this reflective review

full rationale

This paper is a high-level synthesis and reflection on progress in high-dimensional statistics. It explicitly frames its goal as summarizing representative advances from the literature, highlighting themes and open problems, and directing readers to external entry-point works. No original derivations, theorems, predictions, fitted parameters, or equations are presented that could reduce to the paper's own inputs by construction. Central claims are descriptive and non-exhaustive, with no self-citation chains serving as load-bearing justifications for any technical result. The structure relies on external references rather than internal self-reference, satisfying the criteria for a self-contained review with no circularity.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

This is a survey paper. It introduces no new free parameters, axioms, or invented entities; all content is drawn from the cited literature.

pith-pipeline@v0.9.0 · 5507 in / 955 out tokens · 61066 ms · 2026-05-08T15:30:14.838094+00:00 · methodology

