Estimating Government Worker Skills
Pith reviewed 2026-05-10 07:43 UTC · model grok-4.3
The pith
A machine learning approach using private-sector wages shows Indonesian government skills declined steadily while paying a 43 percent premium conditional on skills.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors propose estimating government worker skills by predicting private-sector wages from observables using machine learning and then applying those predictions to government workers. This yields evidence that government skills declined continuously from 1988 to 2014 relative to the private sector, primarily because the most skilled workers choose private jobs. It also shows a wage premium of 43 percent for government employment conditional on these estimated skills.
What carries the argument
The machine learning model trained on private-sector wages and observables that serves as the benchmark for inferring equivalent skill levels among government workers.
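The train-on-private, predict-for-government logic can be sketched as follows. This is a minimal illustration under strong assumptions, not the authors' code: it uses a single observable (years of schooling), ordinary least squares as a stand-in for the paper's machine learning model, and synthetic data. The helper `fit_line`, the constant `TRUE_PREMIUM`, and all coefficients are hypothetical.

```python
import random
import statistics

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (a stand-in for the paper's ML model)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

rng = random.Random(0)
TRUE_PREMIUM = 0.30  # assumed log-wage premium in the synthetic data, not the paper's estimate

# Synthetic workers: (years of schooling, log wage). Both sectors share the same
# skill-to-wage mapping; government wages add a constant premium on top.
private = []
government = []
for _ in range(5000):
    school = rng.uniform(6, 16)
    private.append((school, 1.0 + 0.08 * school + rng.gauss(0, 0.3)))
for _ in range(5000):
    school = rng.uniform(9, 16)
    government.append((school, 1.0 + 0.08 * school + TRUE_PREMIUM + rng.gauss(0, 0.3)))

# Step 1: learn the wage-observable mapping on private-sector workers only.
a, b = fit_line([s for s, _ in private], [w for _, w in private])

# Step 2: apply the mapping to government workers to obtain estimated (log) skills.
gov_skill = [a + b * s for s, _ in government]

# Step 3: premium conditional on skills = mean gap between actual and predicted log wage.
premium = statistics.fmean(w - k for (_, w), k in zip(government, gov_skill))
print(f"estimated premium (log points): {premium:.3f}")
```

Because the government sample never enters the fit, the recovered premium is not mechanically tied to the training data. It is only as credible as the assumption that the private-sector mapping transfers across sectors.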
If this is right
- The most skilled workers have increasingly sorted into the private sector, reducing the average skill level retained by government.
- Government labor costs exceed the market rate for the skill levels actually employed.
- Talent allocation between public and private sectors has become less efficient over time.
- The method provides a way to monitor public-sector skill trends without needing direct output data.
Where Pith is reading between the lines
- The same estimation approach could be applied to household surveys in other countries to test whether government skill erosion is a general pattern in developing economies.
- If the skill decline is confirmed, it raises the possibility that reforms to recruitment, promotion, or non-wage job features could help government retain higher-ability workers.
- The wage premium finding suggests that simply raising pay further may not solve retention problems if the premium already exists conditional on skills.
Load-bearing premise
Private-sector wages in comparable jobs accurately reflect differences in worker skills, and the relationship between observable traits and skills is the same across government and private employment.
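One way to write this premise down is sketched here. The notation is an assumption made for illustration and is not taken from the paper: $x$ denotes observables, $f$ the wage-observable mapping, $P$ and $G$ the private and government samples, and $\hat{\pi}$ the premium in log points.

```latex
\begin{align*}
  % Private-sector workers i: wages identify the mapping f
  \log w_i^{P} &= f(x_i) + \varepsilon_i,
    \qquad \mathrm{E}[\varepsilon_i \mid x_i] = 0, \quad i \in P \\
  % Government workers j: estimated (log) skill reuses the same mapping
  \hat{s}_j &= \hat{f}(x_j), \quad j \in G \\
  % Premium conditional on skills: mean gap between actual and predicted log wage
  \hat{\pi} &= \frac{1}{N_G} \sum_{j \in G} \left( \log w_j^{G} - \hat{f}(x_j) \right)
\end{align*}
```

The second line is where the premise bites: if the true mapping for government workers differs from $f$, the difference is absorbed into both $\hat{s}_j$ and $\hat{\pi}$.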
What would settle it
Direct productivity or output measures for Indonesian government agencies that show no decline or an improvement over the same 1988-2014 period would challenge the conclusion that skills have fallen.
Original abstract
We propose a new approach to estimate government worker skills, a setting where output is hard to observe and wages may be uninformative about skills. The approach uses wages in comparable jobs in the private sector and machine learning tools to link skills to skill-related observables. We apply the approach to rich Indonesian household-level panel data from 1988-2014, showing two main applications. First, government skills have continuously declined relative to the private sector, driven by the most skilled workers ending up in the private sector. Second, the Indonesian government pays a wage premium of 43% conditional on skills.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes a new method to estimate government worker skills in contexts where output is hard to observe and wages may not reflect productivity. It uses private-sector wages as a skill proxy combined with machine learning to map observables to skills, then applies this to Indonesian household panel data (1988-2014). The main findings are that government skills have declined relative to the private sector (driven by high-skill workers selecting into private employment) and that the government pays a 43% wage premium conditional on estimated skills.
Significance. If the identification holds, the approach offers a practical way to measure public-sector human capital trends where direct productivity data are unavailable, with potential applications to compensation reform and selection effects. The long panel spanning the 1998 crisis and civil-service changes is a strength, as is the use of ML for flexible prediction rather than parametric assumptions. The results, if robust, would inform debates on public-private wage gaps and talent allocation in developing-country labor markets.
major comments (2)
- [Identification and estimation approach] The core identification (described in the methods section following the abstract) assumes private-sector wages equal marginal product of skill without systematic bias and that the observable-to-skill mapping is identical across sectors. This is load-bearing for both the skill-decline result and the 43% premium claim. The Indonesian data period includes the 1998 financial crisis, decentralization, and civil-service reforms, any of which could introduce sector-specific wage distortions (e.g., monopsony or informal-sector competition) that violate the assumption and render the ML mapping unidentified when applied to government workers.
- [Main results on skill trends] The claim that skill decline is 'driven by the most skilled workers ending up in the private sector' requires explicit evidence on selection. The panel structure should permit individual fixed effects or Heckman-style selection corrections, but it is unclear whether these are implemented or whether the ML predictions are validated out-of-sample separately by sector (e.g., via cross-validation metrics or hold-out tests on private-sector data).
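The out-of-sample validation requested in the second major comment could look roughly like the sketch below. It assumes a one-variable linear model in place of the paper's machine learning model and synthetic private-sector data. `fit_line`, `kfold_r2`, and the data-generating coefficients are hypothetical.

```python
import random
import statistics

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (stand-in for the ML model)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def kfold_r2(xs, ys, k=5, seed=0):
    """Return (out-of-sample R-squared, mean squared error) from k-fold cross-validation."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint folds covering every observation
    sse = 0.0
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        # Score only observations the model did not see during fitting.
        sse += sum((ys[i] - (a + b * xs[i])) ** 2 for i in fold)
    mean_y = statistics.fmean(ys)
    sst = sum((y - mean_y) ** 2 for y in ys)
    return 1 - sse / sst, sse / len(ys)

# Synthetic private-sector sample: schooling years and log wages (assumed process).
rng = random.Random(42)
xs = [rng.uniform(6, 16) for _ in range(2000)]
ys = [1.0 + 0.08 * x + rng.gauss(0, 0.3) for x in xs]

r2, mse = kfold_r2(xs, ys, k=5)
print(f"out-of-sample R^2 = {r2:.3f}, MSE = {mse:.4f}")
```

This validates prediction only within the private sector. It says nothing about whether the mapping transfers to government workers, which is the separate question raised in the first major comment.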
minor comments (3)
- [Data and variables] Clarify the exact set of observables fed into the ML model and report feature-importance or partial-dependence plots to show which variables drive the skill predictions.
- [Wage premium results] The 43% premium figure should be accompanied by standard errors, robustness to alternative ML algorithms (e.g., random forest vs. neural net), and checks for sensitivity to the definition of 'comparable jobs.'
- [Descriptive statistics] Add a table or figure showing balance of observables between government and private workers before and after the ML mapping to help readers assess common support.
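A balance and common-support check of the kind requested in the last minor comment might be sketched as below. The schooling values are made up for illustration, and `standardized_mean_difference` is a hypothetical helper, not something taken from the paper.

```python
import statistics

def standardized_mean_difference(a, b):
    """Difference in means divided by the pooled standard deviation."""
    pooled_var = (statistics.variance(a) + statistics.variance(b)) / 2
    return (statistics.fmean(a) - statistics.fmean(b)) / pooled_var ** 0.5

# Made-up years of schooling for a handful of workers in each sector.
gov = [12, 14, 16, 12, 15, 16, 13, 14]
priv = [9, 12, 6, 12, 16, 9, 11, 12]

smd = standardized_mean_difference(gov, priv)

# Common support: the range of the observable that both groups cover.
common_support = (max(min(gov), min(priv)), min(max(gov), max(priv)))

print(f"SMD = {smd:.2f}, common support = {common_support}")
```

A large standardized difference or a narrow overlap would mean the private-sector model is being asked to extrapolate when it predicts for government workers.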
Simulated Author's Rebuttal
We thank the referee for the constructive comments, which have helped us strengthen the paper. We address the major concerns below by clarifying the identification strategy, adding robustness analyses, and providing more explicit evidence on selection. We have revised the manuscript accordingly.
Point-by-point responses
- Referee: [Identification and estimation approach] The core identification (described in the methods section following the abstract) assumes private-sector wages equal marginal product of skill without systematic bias and that the observable-to-skill mapping is identical across sectors. This is load-bearing for both the skill-decline result and the 43% premium claim. The Indonesian data period includes the 1998 financial crisis, decentralization, and civil-service reforms, any of which could introduce sector-specific wage distortions (e.g., monopsony or informal-sector competition) that violate the assumption and render the ML mapping unidentified when applied to government workers.
  Authors: The identification does rest on private-sector wages reflecting marginal product and a common observable-to-skill mapping, which is a maintained assumption in the public-private wage literature when direct output data are unavailable. We acknowledge that the 1998 crisis and subsequent reforms could introduce distortions. In the revision we add period-specific estimates (pre- and post-1998), include time-varying macroeconomic controls, and discuss potential monopsony effects in the public sector. These checks leave the main results qualitatively unchanged. We do not claim the mapping is literally identical in every sub-period but argue it is stable enough for the long-run trends we emphasize.
  revision: partial
- Referee: [Main results on skill trends] The claim that skill decline is 'driven by the most skilled workers ending up in the private sector' requires explicit evidence on selection. The panel structure should permit individual fixed effects or Heckman-style selection corrections, but it is unclear whether these are implemented or whether the ML predictions are validated out-of-sample separately by sector (e.g., via cross-validation metrics or hold-out tests on private-sector data).
  Authors: We use the panel to document selection directly: individuals with higher predicted skills are significantly more likely to transition from government to private employment, and we now report these transition probabilities by skill quintile in a new figure. The ML model is trained exclusively on private-sector observations and evaluated via k-fold cross-validation on held-out private-sector data; out-of-sample R-squared and mean-squared-error metrics are reported in the appendix. We have added individual fixed-effects specifications for the wage-premium equation as a robustness check. Heckman-style corrections are not implemented because the selection process is already modeled through the rich set of observables; we now discuss this modeling choice and its limitations explicitly.
  revision: yes
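The transition probabilities by skill quintile mentioned in this response could be tabulated along the following lines. This is a sketch on synthetic data: the exit process is assumed to rise with predicted skill purely for illustration, and `transition_rate_by_quintile` is a hypothetical helper.

```python
import random

def transition_rate_by_quintile(skills, moved):
    """Share of workers who left government, within each predicted-skill quintile."""
    order = sorted(range(len(skills)), key=lambda i: skills[i])
    n = len(order)
    rates = []
    for q in range(5):
        members = order[q * n // 5:(q + 1) * n // 5]
        rates.append(sum(moved[i] for i in members) / len(members))
    return rates  # index 0 = lowest-skill quintile, index 4 = highest

rng = random.Random(1)
skills, moved = [], []
for _ in range(10000):
    s = rng.gauss(0, 1)  # predicted (log) skill of a government worker
    # Assumed process: probability of moving to the private sector rises with skill.
    p_exit = 0.10 + 0.05 * max(-2.0, min(2.0, s))
    skills.append(s)
    moved.append(rng.random() < p_exit)

rates = transition_rate_by_quintile(skills, moved)
for q, r in enumerate(rates, start=1):
    print(f"quintile {q}: {r:.3f}")
```

In real panel data, an upward-sloping profile across quintiles would be consistent with the selection story, though it would not by itself rule out alternative explanations such as skill-specific job loss.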
Circularity Check
No significant circularity; derivation uses external private-sector benchmark
Full rationale
The paper proposes training ML models on private-sector wages and observables to infer skill proxies, then transfers the mapping to government workers in the Indonesian panel. This chain does not reduce any claimed prediction (skill decline or 43% conditional premium) to the inputs by construction, as the training data and target sector are distinct. No self-definitional steps, fitted-input-as-prediction within the same sample, or load-bearing self-citations appear in the abstract or described method. The results remain contingent on external assumptions about wage-skill correspondence rather than tautological re-labeling of fitted values.
Axiom & Free-Parameter Ledger
free parameters (1)
- Machine learning model parameters
axioms (1)
- domain assumption: Private-sector wages reflect true underlying skills for comparable jobs