arXiv: 2604.13399 · v1 · submitted 2026-04-15 · econ.EM

Root-n Asymptotically Normal Maximum Score Estimation

Nan Liu, Yanbo Liu, Yuanyuan Wan, Yuya Sasaki

Pith reviewed 2026-05-10 12:27 UTC · model grok-4.3

classification: econ.EM

keywords: maximum score estimation · root-n convergence · asymptotic normality · binary choice models · surrogate score functions · semiparametric estimation · identification · smooth criterion

The pith

Under primitive conditions, strictly concave surrogate scores turn maximum score estimation into a root-n normal procedure.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper investigates when strictly concave surrogate score functions can be substituted into the maximum score criterion for binary choice models. This substitution produces a smooth objective that identifies the parameter of interest and supports the standard root-n rate of convergence to a normal limiting distribution. The authors express the required conditions in terms of primitive features of the model rather than high-level assumptions. If the conditions hold, standard t-tests and confidence intervals become valid without resorting to nonstandard asymptotics or specialized inference methods. Simulation evidence in the paper illustrates that the root-n rate and normality appear in finite samples.

Core claim

Strictly concave surrogate score functions can be used to construct a smooth criterion function that identifies the parameters in binary choice models and delivers root-n asymptotic normality, with the necessary conditions characterized in primitive terms on the data generating process.

What carries the argument

Strictly concave surrogate score functions that replace the indicator-based maximum score criterion to create a differentiable objective function.
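For concreteness, the replacement can be written as follows. The notation is an assumption of this sketch rather than something taken from the paper: $Y_i \in \{0,1\}$ is the binary outcome, $X_i$ the covariate vector, $b$ a candidate parameter, and $\phi$ a strictly concave surrogate. The paper's exact criterion and scale normalization may differ.

```latex
% Manski's maximum score criterion (a step function of b):
S_n(b) = \frac{1}{n}\sum_{i=1}^{n} (2Y_i - 1)\,\operatorname{sgn}\bigl(X_i^\top b\bigr)

% Surrogate criterion (assumed form; smooth and concave in b when \phi is):
Q_n(b) = \frac{1}{n}\sum_{i=1}^{n} \phi\bigl((2Y_i - 1)\,X_i^\top b\bigr),
\qquad \phi'' < 0 .
```

One familiar strictly concave choice from the classification literature is the logistic surrogate $\phi(z) = -\log(1 + e^{-z})$; it is mentioned here only as an example and is not confirmed to be the surrogate the authors use.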

If this is right

The estimator converges at the parametric root-n rate instead of the slower cube-root rate of the original maximum score estimator.
The limiting distribution is normal, so conventional standard errors and Wald tests are asymptotically valid.
The identification and smoothness properties hold under conditions stated directly in terms of the joint distribution of covariates and errors.
Monte Carlo experiments confirm that the asymptotic results translate to reliable finite-sample behavior.
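The practical payoff described above (a smooth criterion, conventional standard errors, a Wald statistic) can be sketched in a few lines of standard-library Python. Everything named here is hypothetical and chosen for illustration: the logistic surrogate, the helper names `moments`, `fit_surrogate`, and `simulate`, and the design with one regressor, an intercept, and logistic errors. In this particular design the surrogate criterion coincides with the logit log-likelihood, so the sketch shows the mechanics only and does not verify the paper's primitive conditions.

```python
import math
import random

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def moments(xs, ys, b0, b1):
    """Sums of scores, Hessian terms and score outer products at (b0, b1)
    for the assumed surrogate phi(z) = -log(1 + exp(-z))."""
    g0 = g1 = h00 = h01 = h11 = v00 = v01 = v11 = 0.0
    for x, y in zip(xs, ys):
        s = 2 * y - 1                      # recode y in {0, 1} to s in {-1, +1}
        z = s * (b0 + b1 * x)
        d1 = sigmoid(-z) * s               # phi'(z) * s
        d2 = -sigmoid(z) * sigmoid(-z)     # phi''(z) < 0 (s * s == 1 drops out)
        g0 += d1
        g1 += d1 * x
        h00 += d2
        h01 += d2 * x
        h11 += d2 * x * x
        v00 += d1 * d1
        v01 += d1 * d1 * x
        v11 += d1 * d1 * x * x
    return (g0, g1), (h00, h01, h11), (v00, v01, v11)

def fit_surrogate(xs, ys, max_iter=50, tol=1e-10):
    """Newton ascent on the concave surrogate criterion; returns the
    estimate and sandwich standard errors."""
    b0 = b1 = 0.0
    for _ in range(max_iter):
        (g0, g1), (h00, h01, h11), _v = moments(xs, ys, b0, b1)
        det = h00 * h11 - h01 * h01
        step0 = (h11 * g0 - h01 * g1) / det    # first row of H^{-1} g
        step1 = (h00 * g1 - h01 * g0) / det    # second row of H^{-1} g
        b0, b1 = b0 - step0, b1 - step1
        if abs(step0) + abs(step1) < tol:
            break
    _g, (h00, h01, h11), (v00, v01, v11) = moments(xs, ys, b0, b1)
    det = h00 * h11 - h01 * h01
    a00, a01, a11 = h11 / det, -h01 / det, h00 / det   # entries of H^{-1}
    # Diagonal of H^{-1} V H^{-1}; using sums instead of means absorbs the 1/n.
    var0 = a00 * a00 * v00 + 2 * a00 * a01 * v01 + a01 * a01 * v11
    var1 = a01 * a01 * v00 + 2 * a01 * a11 * v01 + a11 * a11 * v11
    return (b0, b1), (math.sqrt(var0), math.sqrt(var1))

def simulate(n, beta0=0.5, beta1=1.0, seed=0):
    """Hypothetical design: Y = 1{beta0 + beta1 * X + eps >= 0}, logistic eps."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        eps = math.log(u / (1.0 - u))      # logistic draw, median zero
        xs.append(x)
        ys.append(1 if beta0 + beta1 * x + eps >= 0 else 0)
    return xs, ys

xs, ys = simulate(4000)
(b0, b1), (se0, se1) = fit_surrogate(xs, ys)
t_stat = (b1 - 1.0) / se1                  # Wald t-statistic for H0: beta1 = 1
print(f"b0={b0:.3f} (se {se0:.3f}), b1={b1:.3f} (se {se1:.3f}), t={t_stat:.2f}")
```

Because the criterion is strictly concave, a plain Newton iteration from zero is enough here; the original maximum score objective is a step function and would need a combinatorial or grid search instead.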

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The method could be extended to other models with discontinuous objective functions, such as quantile regression at non-differentiable points, by designing analogous concave surrogates.
Practitioners facing large datasets may prefer this smoothed version for faster computation while retaining the robustness properties of maximum score.
Future work could examine whether data-driven selection of the surrogate function preserves the root-n normality result.

Load-bearing premise

The chosen surrogate functions must be strictly concave and produce a criterion smooth enough for standard asymptotic arguments to yield root-n normality.

What would settle it

A data generating process satisfying the paper's primitive conditions yet producing an estimator whose convergence rate is slower than root-n or whose limiting distribution is non-normal would falsify the central claim.
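A check of this kind could be run as a small Monte Carlo experiment: quadruple the sample size and see how much the sampling spread of the estimator shrinks. Under a root-n rate the standard deviation should roughly halve (ratio near 2), while a cube-root rate would give a ratio near 4^(1/3) ≈ 1.59. The sketch below is an assumption-laden illustration: the scalar model without intercept, the logistic surrogate, the logistic errors, and the names `surrogate_estimate`, `draw_sample`, and `sd_of_estimates` are all hypothetical, and this design is a correctly specified logit rather than one known to match the paper's primitive conditions.

```python
import math
import random
import statistics

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def surrogate_estimate(xs, ys, max_iter=50, tol=1e-10):
    """Scalar Newton ascent on sum phi(s_i * b * x_i) with the assumed
    surrogate phi(z) = -log(1 + exp(-z)) and s_i = 2 * y_i - 1."""
    b = 0.0
    for _ in range(max_iter):
        g = h = 0.0
        for x, y in zip(xs, ys):
            s = 2 * y - 1
            p = sigmoid(s * b * x)
            g += (1.0 - p) * s * x        # phi'(z) * s * x
            h -= p * (1.0 - p) * x * x    # phi''(z) * x^2, always negative
        step = g / h
        b -= step
        if abs(step) < tol:
            break
    return b

def draw_sample(n, rng, beta=1.0):
    """Hypothetical design: Y = 1{beta * X + eps >= 0} with logistic eps."""
    xs, ys = [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        eps = math.log(u / (1.0 - u))
        xs.append(x)
        ys.append(1 if beta * x + eps >= 0 else 0)
    return xs, ys

def sd_of_estimates(n, reps, rng):
    """Monte Carlo standard deviation of the estimator at sample size n."""
    draws = [surrogate_estimate(*draw_sample(n, rng)) for _ in range(reps)]
    return statistics.stdev(draws)

rng = random.Random(12345)
sd_small = sd_of_estimates(200, 300, rng)
sd_large = sd_of_estimates(800, 300, rng)
ratio = sd_small / sd_large               # root-n predicts about sqrt(4) = 2
print(f"sd ratio when n is quadrupled: {ratio:.2f}")
```

With only 300 replications the ratio is itself noisy, so a serious version of this check would use more replications, several sample sizes, and a design verified against the paper's stated conditions.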

Figures

Figures reproduced from arXiv: 2604.13399 by Nan Liu, Yanbo Liu, Yuanyuan Wan, Yuya Sasaki.

**Figure 2.** Evidence of the Asymptotic Normality by Q-Q Plots.

Original abstract

The maximum score method (Manski, 1975, 1985) is a powerful approach for binary choice models, yet it is known to face both practical and theoretical challenges. In particular, the estimator converges at a slower-than-root-$n$ rate to a nonstandard limiting distribution. We investigate conditions under which strictly concave surrogate score functions can be employed to achieve identification through a smooth criterion function. This criterion enables root-$n$ convergence to a normal limiting distribution. While the conditions to guarantee these desired properties are nontrivial, we characterize them in terms of primitive conditions. Extensive simulation studies support the root-$n$ convergence rate, the asymptotic normality, and the validity of the standard inference methods.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

This paper restores root-n normality to the Manski maximum score estimator by smoothing the criterion with a fixed strictly concave surrogate under explicit primitive conditions.

The main point is that you can get standard root-n asymptotic normality for the maximum score estimator by swapping the usual discontinuous score for a strictly concave surrogate function. The paper works out primitive conditions under which the surrogate population criterion is smooth, has a unique interior maximum at the true parameter, and has a nonsingular Hessian, so ordinary M-estimator arguments deliver the root-n normal limit. Simulations then check that the rate, normality, and even standard inference hold in designs that meet those conditions. That is the actual advance over the classic results for Manski's estimator, which stop at cube-root convergence to a non-normal limit (Kim and Pollard, 1990).

The argument is direct and avoids circularity: the surrogate is fixed rather than data-dependent, and the conditions are stated in terms of the data distribution and the surrogate's properties rather than high-level smoothness assumptions. The simulations are useful corroboration rather than the main evidence. The nontrivial part is that the primitive conditions still need to be verified in any given application, and the choice of surrogate will affect finite-sample behavior even if the asymptotics go through. The paper does not overstate how automatic this makes the estimator, which keeps the claim proportionate.

This is useful for econometricians who want to keep the robustness properties of maximum score but also need reliable standard errors and faster rates. It is also relevant for people working on smoothing techniques for other nonsmooth semiparametric estimators. The work is coherent on its own terms and the central claim holds up, so it deserves a serious referee rather than a desk rejection.

Referee Report

0 major / 3 minor

Summary. The paper claims that strictly concave surrogate score functions can replace the discontinuous indicator in Manski's maximum score estimator for binary choice models. Under primitive conditions ensuring the resulting population criterion is smooth, strictly concave, and uniquely maximized at the true parameter with non-singular Hessian, the estimator attains root-n consistency and asymptotic normality via standard M-estimation arguments. Simulations are used to corroborate the rate, normality, and validity of conventional inference.

Significance. If the primitive conditions hold in practice, the approach addresses a longstanding limitation of the maximum score estimator by delivering root-n rates and standard asymptotics without abandoning its semiparametric robustness. The explicit characterization of the required conditions in primitive terms, rather than high-level assumptions, is a clear strength, as is the simulation evidence that directly checks the claimed rate and normality under designs satisfying those primitives.

minor comments (3)

Introduction: the motivation section would benefit from a brief comparison to existing smoothed maximum-score estimators (e.g., those based on kernel or logistic smoothing) to clarify the distinct role of the strictly concave surrogate.
§4, Assumption 4.2: while the non-singularity of the Hessian is stated, a short remark on how this condition can be checked for common covariate distributions would aid applicability.
Section 5: the simulation tables should report the exact number of Monte Carlo replications and the precise functional forms of the surrogates employed, to facilitate exact reproduction.

Simulated Authors' Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive assessment of the manuscript, accurate summary of the contribution, and recommendation for minor revision. We are pleased that the referee highlights the value of the primitive conditions and simulation evidence.

Circularity Check

0 steps flagged

No significant circularity in derivation chain

The paper derives primitive conditions under which a strictly concave surrogate produces a smooth population criterion with a unique interior maximum and nonsingular Hessian at the target parameter. Root-n asymptotic normality is then obtained by applying standard M-estimator theorems to this criterion. No equation or claim reduces by construction to a fitted input, self-definition, or self-citation chain; the limiting distribution is imported from external asymptotic theory once the stated primitives hold. Simulations are used only for corroboration, not as load-bearing evidence for the main result.
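The "standard M-estimator theorems" step has a familiar textbook form (see, for example, Newey and McFadden, 1994). The display below is a sketch in assumed notation, not the paper's own statement: $q(W; b)$ denotes the per-observation surrogate criterion, $W$ the observed data, and $\beta_0$ the target parameter. The paper's regularity conditions are not reproduced on this page.

```latex
\sqrt{n}\,\bigl(\hat\beta - \beta_0\bigr) \xrightarrow{d}
N\bigl(0,\; H^{-1} V H^{-1}\bigr),
\qquad
H = E\bigl[\nabla_b^2\, q(W;\beta_0)\bigr],
\quad
V = E\bigl[\nabla_b q(W;\beta_0)\,\nabla_b q(W;\beta_0)^\top\bigr].
```

The nonsingular Hessian delivered by the primitive conditions is what makes $H^{-1}$ well defined, and the unique interior maximum is what justifies the first-order expansion around $\beta_0$.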

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Based solely on the abstract, no explicit free parameters, axioms, or invented entities are identified. The paper states that conditions are characterized in primitive terms but does not list them here.

Reference graph

Works this paper leans on

29 extracted references · 3 canonical work pages

[1] Abrevaya, J. and J. Huang (2005, July). On the bootstrap of the maximum score estimator. Econometrica 73 (4), 1175–1204.

[2] Babii, A., E. Ghysels, and X. Chen (2020). Binary choice with asymmetric loss in a data-rich environment: Theory and an application to racial justice.

[3] Bartlett, P. L., M. I. Jordan, and J. D. McAuliffe (2006). Convexity, classification, and risk bounds. Journal of the American Statistical Association 101 (473), 138–156.

[4] Bickel, P. J., F. Götze, and W. R. van Zwet (2011). Resampling fewer than n observations: gains, losses, and remedies for losses. In Selected works of Willem van Zwet, pp. 267–297. Springer.

[5] Cattaneo, M. D., G. F. Cox, M. Jansson, and K. Nagasawa (2026). Continuity of the distribution function of the arg max of a Gaussian process. Econometrica.

[6] Cattaneo, M. D., M. Jansson, and K. Nagasawa (2020). Bootstrap-based inference for cube root asymptotics. Econometrica 88 (5), 2203–2219.

[7] Cattaneo, M. D., M. Jansson, and K. Nagasawa (2024). Bootstrap-assisted inference for generalized Grenander-type estimators. The Annals of Statistics 52 (4), 1509–1533.

[8] Chen, X., W. Y. Gao, and L. Wen (2025). ReLU-based and DNN-based generalized maximum score estimators. arXiv preprint arXiv:2511.19121.

[9] Cheng, Y. and S. Yang (2024). Inference for optimal linear treatment regimes in personalized decision-making. arXiv preprint arXiv:2405.16161.

[10] Delgado, M. A., J. M. Rodríguez-Poo, and M. Wolf (2001). Subsampling inference in cube root asymptotics with an application to Manski's maximum score estimator. Economics Letters 73 (2), 241–250.

[11] Hall, P. (1992). The Bootstrap and Edgeworth Expansion. Springer Series in Statistics. New York: Springer.

[12] Horowitz, J. L. (1992). A smoothed maximum score estimator for the binary response model. Econometrica: Journal of the Econometric Society, 505–531.

[13] Horowitz, J. L. (2001). The bootstrap. In J. J. Heckman and E. Leamer (Eds.), Handbook of Econometrics, Volume 5, pp. 3159–3228. Elsevier.

[14] Horowitz, J. L. (2019). Bootstrap methods in econometrics. Annual Review of Economics 11 (1), 193–224.

[15] Kim, J. and D. Pollard (1990). Cube root asymptotics. The Annals of Statistics, 191–219.

[16] Kitagawa, T., S. Sakaguchi, and A. Tetenov (2023). Constrained classification and policy learning. Technical report.

[17] Klein, R. W. and R. H. Spady (1993). An efficient semiparametric estimator for binary response models. Econometrica: Journal of the Econometric Society, 387–421.

[18] Lee, S. M. S. and M. Pun (2006). On m out of n bootstrapping for nonstandard M-estimation with nuisance parameters. Journal of the American Statistical Association 101 (475), 1185–1197.

[19] Léger, C. and B. MacGibbon (2006). On the bootstrap in cube root asymptotics. Canadian Journal of Statistics 34 (1), 29–44.

[20] Liu, N., Y. Liu, Y. Sasaki, and Y. Wan (2025). Nonparametric uniform inference in binary classification and policy values. arXiv preprint arXiv:2511.14700.

[21] Lugosi, G. and N. Vayatis (2004). On the Bayes-risk consistency of regularized boosting methods. The Annals of Statistics 32 (1), 30–55.

[22] Manski, C. F. (1975). Maximum score estimation of the stochastic utility model of choice. Journal of Econometrics 3 (3), 205–228.

[23] Manski, C. F. (1985). Semiparametric analysis of discrete response: Asymptotic properties of the maximum score estimator. Journal of Econometrics 27 (3), 313–333.

[24] Newey, W. K. and D. McFadden (1994). Large sample estimation and hypothesis testing. Handbook of Econometrics 4, 2111–2245.

[25] Rosen, A. M. and T. Ura (2025, 01). Finite sample inference for the maximum score estimand. The Review of Economic Studies 92 (6), 4117–4151.

[26] Seo, M. H. and T. Otsu (2018). Local M-estimation with discontinuous criterion for dependent and limited observations. The Annals of Statistics 46 (1), 344–369.

[27] Steinwart, I. (2005). Consistency of support vector machines and other regularized kernel classifiers. IEEE Transactions on Information Theory 51 (1), 128–142.

[28] Zhang, T. (2004). Statistical behavior and consistency of classification methods based on convex risk minimization. The Annals of Statistics 32 (1), 56–85.

[29] Zhao, Y., D. Zeng, A. J. Rush, and M. R. Kosorok (2012). Estimating individualized treatment rules using outcome weighted learning. Journal of the American Statistical Association 107 (499), 1106–1118.