Optimized questionnaire item selection for tracking the progression of motor symptoms in Parkinson's disease
Pith reviewed 2026-05-10 15:49 UTC · model grok-4.3
The pith
Coordinate descent and adaptive selection of MDS-UPDRS items cut the expected standard deviation of disease severity estimates by 26 and 34 percent, respectively, for five-item subsets.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Three item selection methods were compared for minimizing uncertainty in disease severity estimates from the MDS-UPDRS: ranking items by expected Fisher information, coordinate descent to directly minimize estimate standard deviation, and adaptive selection based on true latent scores. For five-item subsets the reductions in expected standard deviation relative to random selection were 14, 26, and 34 percent respectively. The adaptive method represents a best-case performance limit. Gains from sophisticated methods are largest for small subsets and diminish as more items are included. Reduced sets measure the same latent construct because item parameters are unchanged from the full test.
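None of the three methods is shown in code. The simplest, ranking items by expected Fisher information, can be sketched under a hypothetical two-parameter logistic (2PL) model (the MDS-UPDRS items are polytomous, so the paper's actual IRT specification will differ), with the expectation taken over a sample of latent severity scores:

```python
import numpy as np

def expected_info(a, b, thetas):
    """Expected 2PL Fisher information per item, averaged over a sample
    of latent severity scores.  a: discriminations, b: difficulties,
    thetas: draws from the assumed severity distribution."""
    # p[i, j] = P(positive response to item i at severity thetas[j])
    p = 1.0 / (1.0 + np.exp(-(np.outer(a, thetas) - (a * b)[:, None])))
    return np.mean(a[:, None] ** 2 * p * (1.0 - p), axis=1)

def rank_items(a, b, thetas, k):
    """Keep the k items with the highest expected information."""
    return np.argsort(expected_info(a, b, thetas))[::-1][:k]
```

Ranking scores each item in isolation and ignores redundancy between items, which is consistent with the paper finding it weaker than a search that evaluates whole subsets.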
What carries the argument
Coordinate descent local search that directly minimizes the standard error of the latent trait estimate in the IRT model.
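A minimal sketch of such a search, again assuming a hypothetical 2PL model in which item information at severity theta is a^2 p(1-p) and the standard error of the severity estimate is approximated by the inverse square root of the summed information (the paper's actual model and variance estimator are not shown):

```python
import numpy as np

def item_info(a, b, theta):
    """2PL item information a^2 * p * (1 - p) at severity theta."""
    p = 1.0 / (1.0 + np.exp(-a * (theta - b)))
    return a ** 2 * p * (1.0 - p)

def expected_se(subset, a, b, thetas):
    """Expected standard error over a sample of severities, via the
    inverse-information approximation SE ~ 1 / sqrt(total information)."""
    info = sum(item_info(a[i], b[i], thetas) for i in subset)
    return float(np.mean(1.0 / np.sqrt(info)))

def coordinate_descent(a, b, thetas, k, seed=0):
    """Swap-based local search: start from a random k-item subset and
    keep exchanging one chosen item for one unchosen item whenever the
    swap lowers the expected standard error."""
    rng = np.random.default_rng(seed)
    n = len(a)
    subset = set(rng.choice(n, size=k, replace=False).tolist())
    improved = True
    while improved:
        improved = False
        current = expected_se(subset, a, b, thetas)
        for i in sorted(subset):
            for j in range(n):
                if j in subset:
                    continue
                trial = (subset - {i}) | {j}
                se = expected_se(trial, a, b, thetas)
                if se < current:
                    subset, current = trial, se
                    improved = True
                    break
    return sorted(subset)
```

The search terminates at a subset where no single swap improves the objective, i.e. a local optimum; restarts from several seeds would guard against poor local optima.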
If this is right
- For five-item subsets, coordinate descent reduces expected standard deviation by 26 percent compared with random selection.
- Adaptive selection achieves a 34 percent reduction, but only as an upper bound attained with oracle knowledge of the true latent scores.
- Advantages of the optimization methods shrink as the number of retained items increases.
- All reduced sets continue to measure the identical latent trait because they reuse the original item parameters.
Where Pith is reading between the lines
- In real use, where only estimated scores are available, the adaptive method's reported advantage is likely to shrink.
- The same selection procedures could be applied to other IRT-based clinical scales to create shorter yet precise versions.
- Routine clinical monitoring might become feasible at higher frequency if shorter forms maintain acceptable precision.
Load-bearing premise
The adaptive selection gains assume the optimal items can be chosen using the patient's true unknown disease severity score rather than scores estimated from the responses.
What would settle it
A simulation that repeatedly selects the five items using only the estimated severity score obtained from those same items and measures whether the reduction in expected standard deviation still reaches 34 percent.
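A sketch of one replication of that simulation, under assumed 2PL item parameters and with a simple grid posterior standing in for the paper's (unspecified) severity estimator; repeating `adaptive_run` many times and comparing the spread of `theta_hat` against a random-item baseline would show how much of the 34 percent survives:

```python
import numpy as np

def p2pl(a, b, theta):
    """2PL response probability at severity theta."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def adaptive_run(a, b, true_theta, k, rng, grid):
    """Administer k items adaptively: after each simulated response,
    update a grid posterior over severity and give the unused item with
    the highest Fisher information at the current posterior mean."""
    post = np.ones_like(grid) / len(grid)       # flat prior on the grid
    used = []
    theta_hat = float(np.sum(grid * post))
    for _ in range(k):
        p = p2pl(a, b, theta_hat)
        info = a ** 2 * p * (1.0 - p)
        info[used] = -np.inf                    # never repeat an item
        i = int(np.argmax(info))
        used.append(i)
        y = rng.random() < p2pl(a[i], b[i], true_theta)   # simulate response
        like = p2pl(a[i], b[i], grid) if y else 1.0 - p2pl(a[i], b[i], grid)
        post *= like
        post /= post.sum()
        theta_hat = float(np.sum(grid * post))
    return used, theta_hat
```

Because item choice here depends only on `theta_hat` estimated from the responses collected so far, this is the practical regime in which the oracle 34 percent figure would be expected to shrink.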
Figures
Original abstract
Long questionnaires increase the response burden for patients and healthcare workers. In the treatment of Parkinson's disease, the MDS-UPDRS questionnaire to track disease progression may be underutilized due to time requirements. While reduced item sets have been studied using Fisher information from Item Response Theory (IRT) models, optimal selection methods remain unclear. We compared three methods for selecting an optimal subset of items, with the aim of minimizing the uncertainty in the estimates of the disease severity: Ranking by the Fisher information, coordinate descent local search to directly minimize estimate uncertainty, and adaptive selection. Whereas item ranking based on the expected Fisher information outperformed random choice of items, we saw further gains with the coordinate descent algorithm that directly minimizes the uncertainty of the disease severity estimate. An adaptive algorithm gave an additional slight gain compared to the coordinate descent method. However, the performance of the adaptive method is a best-case limit as we assume that we find the optimal set for the true latent trait scores. For a 5-item subset, the ranked Fisher information method reduced the expected standard deviation by 14 percent compared to random item selection. The corresponding reductions for coordinate descent and adaptive selection were 26 percent and 34 percent respectively. More sophisticated selection methods substantially improved estimate accuracy for small item sets, with diminishing returns for larger subsets. Because item parameters are retained from the full test, reduced item sets measure the same latent construct as the original test. The choice of method entails a trade-off between methodological complexity and precision.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript compares three methods for selecting optimal subsets of items from the MDS-UPDRS questionnaire to minimize uncertainty in IRT-based estimates of Parkinson's disease severity: ranking by expected Fisher information, coordinate descent optimization, and adaptive selection. It reports concrete performance gains for a 5-item subset (14%, 26%, and 34% reductions in expected standard deviation versus random selection) and notes that the adaptive method represents a best-case limit assuming knowledge of true latent trait scores.
Significance. If the results hold under practical conditions, the work provides a useful framework for shortening clinical questionnaires while preserving precision on the same latent construct, which could increase routine use of the MDS-UPDRS in Parkinson's monitoring. The direct optimization of estimate uncertainty and the transparent labeling of the adaptive method's oracle limitation are methodological strengths that advance optimal test design in applied psychometrics.
Major comments (2)
- Abstract: The 34% reduction for adaptive selection is obtained by selecting the optimal set using the true (unknown) latent trait scores rather than estimates derived from responses. Although labeled a 'best-case limit,' including this figure in the headline comparison with the 26% coordinate-descent result overstates the attainable incremental gain; a practical evaluation using estimated trait scores is needed to support the claim of 'additional slight gain' from the adaptive algorithm.
- Methods section: The concrete percentage improvements (14%, 26%, 34%) and the soundness of the simulation-based comparisons cannot be fully assessed without visible details on the data source, IRT model specification, number of Monte Carlo replications, variance estimation procedure, and any error analysis or confidence intervals around the reported reductions.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on our manuscript. The comments highlight important issues of clarity and practical relevance, which we address point by point below.
Point-by-point responses
Referee: Abstract: The 34% reduction for adaptive selection is obtained by selecting the optimal set using the true (unknown) latent trait scores rather than estimates derived from responses. Although labeled a 'best-case limit,' including this figure in the headline comparison with the 26% coordinate-descent result overstates the attainable incremental gain; a practical evaluation using estimated trait scores is needed to support the claim of 'additional slight gain' from the adaptive algorithm.
Authors: We agree that the adaptive result is an oracle bound and that its direct juxtaposition with the coordinate-descent figure in the abstract risks overstating the incremental practical gain. In the revised version we will rephrase the abstract to foreground that the 34% figure is a theoretical upper limit, remove the phrase 'additional slight gain' from the headline comparison, and add a short paragraph (or supplementary simulation) that implements adaptive selection using trait estimates obtained from an initial non-adaptive item set. This will provide the requested realistic evaluation while preserving the theoretical comparison. revision: yes
Referee: Methods section: The concrete percentage improvements (14%, 26%, 34%) and the soundness of the simulation-based comparisons cannot be fully assessed without visible details on the data source, IRT model specification, number of Monte Carlo replications, variance estimation procedure, and any error analysis or confidence intervals around the reported reductions.
Authors: We acknowledge that the current Methods section does not make these simulation parameters sufficiently explicit for independent assessment. In the revision we will expand the relevant subsection to state: the source of the item-parameter estimates (the specific PD cohort or public dataset used for IRT calibration), the precise IRT model and estimation method, the number of Monte Carlo replications performed, the exact procedure used to obtain the standard deviation of the latent-trait estimates (analytical inverse-information approximation), and Monte-Carlo-based standard errors or 95% confidence intervals for the reported percentage reductions. These additions will render the numerical results fully reproducible and allow readers to judge the precision of the comparisons. revision: yes
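For the promised Monte-Carlo uncertainty, a plain bootstrap over per-replication standard deviations would suffice; a hypothetical sketch (not the authors' procedure):

```python
import numpy as np

def reduction_ci(sd_method, sd_random, n_boot=2000, seed=0):
    """Bootstrap 95% CI for the percentage reduction in expected
    standard deviation of a selection method versus random selection,
    given one SD value per simulation replication."""
    rng = np.random.default_rng(seed)
    sd_method = np.asarray(sd_method, dtype=float)
    sd_random = np.asarray(sd_random, dtype=float)
    stats = np.empty(n_boot)
    for t in range(n_boot):
        m = rng.choice(sd_method, size=sd_method.size, replace=True)
        r = rng.choice(sd_random, size=sd_random.size, replace=True)
        stats[t] = 100.0 * (1.0 - m.mean() / r.mean())
    return np.percentile(stats, [2.5, 97.5])
```

Reporting such intervals alongside the 14, 26, and 34 percent point estimates would let readers judge whether the methods are distinguishable at the chosen number of replications.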
Circularity Check
No circularity; standard IRT optimization on external questionnaire data
Full rationale
The paper fits an IRT model to the full MDS-UPDRS questionnaire, retains the item parameters, and applies standard Fisher information ranking, coordinate descent minimization of estimate variance, and adaptive selection to choose subsets. All reported reductions (14%, 26%, 34%) are direct empirical comparisons of expected standard deviation against random selection on the same fitted model; the adaptive result is explicitly qualified as an oracle best-case limit rather than a practical prediction. No step equates a fitted quantity to a derived output by construction, no self-citation chain bears the central claim, and no known result is renamed as a new derivation. The derivation chain is therefore self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
Axioms (1)
- Domain assumption: The MDS-UPDRS items follow a standard Item Response Theory model relating responses to a latent disease severity trait.