Queue & AI: When Faster Tasks Slow Down the Workflow
Pith reviewed 2026-07-01 15:51 UTC · model grok-4.3
The pith
AI assistance stabilizes overloaded workflows only when it exceeds a critical task fraction and review attention costs less than manual completion.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
In workflows modeled as queues where tasks compete for scarce human attention, AI assistance produces a variance wedge: mean task completion times decrease because of fast drafts, yet system-level delay increases when a fraction of AI outputs contain errors that escape review and generate rework. Stabilization of an overloaded system occurs only when the AI task fraction exceeds a critical threshold and the attention required for review plus expected rework is lower than the attention for fully manual completion.
What carries the argument
A queueing model in which tasks arrive, receive either manual or AI processing, and compete for limited reviewer attention, with a fraction of AI outputs escaping to create additional rework that re-enters the queue.
If this is right
- Reviewers rationally raise their risk threshold for checking AI outputs under congestion, reducing scrutiny when it matters most.
- The time-efficient processing regime can switch between fully AI-assisted and fully manual depending on load parameters.
- Mean-based productivity metrics can show gains while system congestion and total delay worsen.
- AI deployment must be assessed on effects to congestion, rework volume, and robustness of oversight rather than average task speed alone.
Where Pith is reading between the lines
- If the escaped-error fraction can be driven to zero through better AI or review processes, the variance wedge and its thresholds would disappear.
- The same mechanism could appear in any attention-constrained pipeline, such as code review or customer-service ticket handling, even outside the paper's explicit examples.
- Organizations might need to temporarily increase review resources to cross the critical AI fraction before seeing net stabilization.
- A direct test would involve running parallel workflows with controlled AI adoption rates and measuring both mean task time and end-to-end completion time under varying loads.
Load-bearing premise
A non-zero fraction of AI-generated errors escape review and generate costly downstream rework that competes for the same human attention pool as new tasks.
What would settle it
Measure whether increasing the AI task fraction above the model's critical threshold, while keeping review-plus-rework attention below manual attention, actually reduces total workflow delay in a controlled queueing experiment or real deployment.
Figures
read the original abstract
Quantifying the workplace productivity effects of Generative Artificial Intelligence is now central to economics, management, and public policy. The deployment of AI tools in customer service, writing, software development, and consulting operations has been reported to generate large per-task productivity gains, typically measured as tasks completed per worker-hour or reductions in mean handle time. We argue that such mean-based metrics can misrepresent AI's effects in workflows where tasks accumulate and compete for scarce human attention. AI assistance can generate a deceptive productivity signature: average completion times fall because AI tools typically supply a fast first draft, yet workflow-level performance deteriorates when a subset of AI errors escapes review and returns as costly downstream rework. We call this divergence between mean task speed and system-level delay the variance wedge. Depending on the operational parameters, the most time-efficient way to complete a workflow may undergo a transition between two task-processing regimes, a fully AI-assisted and a fully manual one. We formalize the mechanism as a queueing model and derive two main implications analytically. First, under congestion, reviewers rationally raise the risk threshold for checking AI outputs, reducing scrutiny precisely when it would matter the most. Second, AI assistance can stabilize an overloaded workflow only when (i) the fraction of tasks handled by AI exceeds a critical threshold, and (ii) the human attention required for review and expected rework is lower than the attention for manual completion, a requirement substantially more stringent than faster draft generation. These results suggest that AI deployment should be evaluated not only by average task speed, but by its overall effects on congestion, rework, and the robustness of human oversight under load.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper develops a queueing model of workflows in which AI assistance supplies fast initial drafts but can introduce rework when errors escape review. It derives analytically a 'variance wedge' between per-task speed and system-level delay, plus two stabilization conditions: AI assistance stabilizes an overloaded system only when the AI task fraction exceeds a critical threshold and when the attention cost of review plus expected rework is strictly less than the attention cost of fully manual completion. A secondary result is that congestion induces rational reduction in scrutiny of AI outputs.
Significance. The analytical derivation of the variance wedge and the rational-scrutiny result under load offers a clean mechanism for why mean-based productivity metrics can mislead in attention-constrained settings. If the model assumptions are satisfied, the two explicit stabilization conditions supply falsifiable, policy-relevant thresholds that go beyond the usual 'AI is faster' claim. The work is entirely analytical and contains no fitted parameters or data, which strengthens its internal logic but also makes the rework-probability premise central.
major comments (2)
- [Queueing model and analytical implications] The derivation of both the variance wedge and the two stabilization conditions (critical AI fraction and review-attention bound) rests on the premise that a strictly positive fraction of AI outputs escape review and re-enter the attention pool as rework. The manuscript supplies no independent bound, empirical anchor, or sensitivity analysis on this rework probability; setting it to zero eliminates the wedge and the thresholds. This modeling choice is load-bearing for the headline claims (see the queueing-model section and the paragraph deriving the two implications).
- [Abstract and model derivation] The abstract states that the results are 'derived analytically' and 'parameter-free' in their qualitative form, yet the stability conditions are expressed in terms of the (unquantified) rework probability and the relative attention costs. Without the explicit equations or a demonstration that the thresholds survive when the rework probability is treated as a free parameter, it is impossible to confirm that the derivations are free of post-hoc restrictions.
minor comments (2)
- [Abstract] The abstract and introduction would benefit from a single sentence stating the minimal modeling assumptions (Poisson arrivals, exponential service, fixed rework probability) before the claims are presented.
- [Model section] Notation for the attention costs (review vs. manual completion) and the AI fraction should be introduced once with a table or equation reference rather than repeated in prose.
Simulated Author's Rebuttal
We thank the referee for the careful reading and for identifying the central role of the rework probability in our queueing model. The comments correctly highlight that this parameter drives the variance wedge and the stabilization conditions. Below we respond point by point, indicating where revisions will be made and where the purely analytical nature of the work limits what can be supplied.
read point-by-point responses
-
Referee: [Queueing model and analytical implications] The derivation of both the variance wedge and the two stabilization conditions (critical AI fraction and review-attention bound) rests on the premise that a strictly positive fraction of AI outputs escape review and re-enter the attention pool as rework. The manuscript supplies no independent bound, empirical anchor, or sensitivity analysis on this rework probability; setting it to zero eliminates the wedge and the thresholds. This modeling choice is load-bearing for the headline claims (see the queueing-model section and the paragraph deriving the two implications).
Authors: We agree that a positive rework probability is essential for the wedge and the two stabilization thresholds; the model is constructed precisely to study the case in which AI outputs are imperfect. The qualitative results (existence of the wedge, the critical AI-fraction threshold, and the stricter attention-cost condition) hold for any rework probability in (0,1) that satisfies the derived inequalities. We will add a sensitivity analysis in the revision that varies the rework probability over its admissible range and shows that the sign of the key comparative-statics results is preserved. Because the manuscript is entirely analytical, we cannot supply an empirical anchor or independent bound on the parameter; we will instead discuss how the parameter could be calibrated from future observational data on AI error rates. revision: partial
-
Referee: [Abstract and model derivation] The abstract states that the results are 'derived analytically' and 'parameter-free' in their qualitative form, yet the stability conditions are expressed in terms of the (unquantified) rework probability and the relative attention costs. Without the explicit equations or a demonstration that the thresholds survive when the rework probability is treated as a free parameter, it is impossible to confirm that the derivations are free of post-hoc restrictions.
Authors: The phrase 'parameter-free in their qualitative form' was intended to convey that the direction of the effects and the existence of the thresholds do not require specific numerical values, only that the rework probability is positive and that the attention-cost inequality holds. We will revise the abstract to remove any ambiguity and will include the explicit stability conditions (as inequalities involving the rework probability p and the attention-cost ratio) in the main text. These conditions are derived directly from the steady-state balance equations without additional restrictions; we will add a short appendix derivation that treats p as a free parameter in (0,1) and shows the thresholds remain well-defined. revision: yes
- Supplying an empirical anchor or independent bound on the rework probability, as the manuscript contains no data and is purely theoretical.
Circularity Check
No significant circularity in analytical queueing derivations
full rationale
The manuscript formalizes a queueing model and derives the variance wedge, critical AI fraction threshold, and review-attention bound analytically from the stated assumptions (including non-zero rework probability). No equations or claims reduce the derived stability conditions to fitted parameters, self-citations, or inputs by construction. The central results are presented as consequences of the model rather than tautological restatements of its premises. This is the expected outcome for a self-contained mathematical derivation without data-fitting steps or load-bearing self-references.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption Tasks accumulate and compete for scarce human attention
- domain assumption A subset of AI errors escapes review and returns as costly downstream rework
Reference graph
Works this paper leans on
-
[1]
Daniotti, S., Wachs, J., Feng, X. & Neffke, F. Who is using AI to code? global diffusion and impact of generative AI. Scienceeadz9311, DOI: 10.1126/science.adz9311 (2026). 2.Wu, L. & Vasilescu, B. AI raises the productivity bar.Science391, 763–764, DOI: 10.1126/science.aef5239 (2026)
-
[2]
& O’Connor, S
Burn-Murdoch, J. & O’Connor, S. Are we thinking about AI and productivity all wrong? Financial Times, The AI Shift (2026)
2026
-
[3]
Available: https://doi.org/10.1145/3571730
Ji, Z.et al.Survey of hallucination in natural language generation.ACM Comput. Surv.55, 1–38, DOI: 10.1145/3571730 (2023)
-
[4]
Huang, L.et al.A survey on hallucination in large language models: Principles, taxonomy, challenges, and open questions. ACM Transactions on Inf. Syst.43, 1–55, DOI: 10.1145/3703155 (2025)
-
[5]
Brynjolfsson, E., Li, D. & Raymond, L. R. Generative AI at work.Q. J. Econ.140, 889–942, DOI: 10.1093/qje/qjae044 (2025)
-
[6]
Noy, S. & Zhang, W. Experimental evidence on the productivity effects of generative artificial intelligence.Science381, 187–192, DOI: 10.1126/science.adh2586 (2023)
-
[7]
Sci.37, 403–423, DOI: 10.1287/orsc.2025.21838 (2026)
Dell’Acqua, F.et al.Navigating the jagged technological frontier: Field experimental evidence of the effects of artificial intelligence on knowledge worker productivity and quality.Organ. Sci.37, 403–423, DOI: 10.1287/orsc.2025.21838 (2026)
-
[8]
& Deming, D
Bick, A., Blandin, A. & Deming, D. The impact of generative AI on work productivity. Federal Reserve Bank of St. Louis, On the Economy (2025)
2025
-
[9]
The Impact of AI on Developer Productivity: Evidence from GitHub Copilot
Peng, S., Kalliamvakou, E., Cihon, P. & Demirer, M. The impact of AI on developer productivity: Evidence from GitHub Copilot.arXiv preprint arXiv:2302.06590(2023)
work page internal anchor Pith review Pith/arXiv arXiv 2023
-
[10]
M., Ernst, E., Merola, R., Samaan, D
del Rio-Chanona, R. M., Ernst, E., Merola, R., Samaan, D. & Teutloff, O. AI and jobs. a review of theory, estimates, and evidence.arXiv preprint arXiv:2509.15265(2025). 12.Acemoglu, D. The simple macroeconomics of AI.Econ. Policy40, 13–58, DOI: doi.org/10.1093/epolic/eiae042 (2025)
-
[11]
Über eine Aufgabe der Wahrscheinlichkeitstheorie
Pollaczek, F. Über eine Aufgabe der Wahrscheinlichkeitstheorie. I.Math. Zeitschrift32, 64–100, DOI: doi.org/10.1007/ BF01194620 (1930). 14.Khinchine, A. Y . Mathematical theory of stationary queues.Matematicheskii Sbornik39, 73–84 (1932). In Russian
1930
-
[12]
Kingman, J. F. C. The single server queue in heavy traffic.Math. Proc. Camb. Philos. Soc.57, 902–904, DOI: https://doi.org/10.1017/S0305004100036094 (1961)
-
[13]
A single-server queue with feedback.The Bell Syst
Takács, L. A single-server queue with feedback.The Bell Syst. Tech. J.42, 505–519, DOI: 10.1002/j.1538-7305.1963. tb00512.x (1963)
-
[14]
Disney, R. L., McNickle, D. C. & Simon, B. The M/G/1 queue with instantaneous Bernoulli feedback.Nav. Res. Logist. Q. 27, 635–644, DOI: 10.1002/nav.3800270411 (1980)
-
[15]
Disney, R. L. A note on sojourn times in M/G/1 queues with instantaneous, Bernoulli feedback.Nav. Res. Logist. Q.28, 679–684, DOI: 10.1002/nav.3800280415 (1981)
-
[16]
Choi, B. D., Kim, B. & Choi, S. H. On the M/G/1 Bernoulli feedback queue with multi-class customers.Comput. & Oper. Res.27, 269–286, DOI: 10.1016/S0305-0548(99)00036-2 (2000). 20.Jackson, J. R. Networks of waiting lines.Oper. Res.5, 518–521, DOI: 10.1287/opre.5.4.518 (1957). 21.Kleinrock, L.Queueing Systems, Volume 2: Computer Applications(Wiley, New York, 1976)
-
[17]
Yang, Y ., Jiao, L. & Xu, Y . A queueing theoretic perspective on low-latency LLM inference with variable token length. InProceedings of the 22nd International Symposium on Modeling and Optimization in Mobile, Ad Hoc, and Wireless Networks, WiOpt, 273–280 (2024)
2024
-
[18]
Ozbas, E. & Bastopcu, M. Queueing-aware optimization of reasoning tokens for accuracy-latency trade-offs in LLM servers.arXiv preprint arXiv:2601.10274(2026). 2601.10274. 19/20
-
[19]
Mitzenmacher, M. & Shahout, R. Queueing, predictions, and large language models: Challenges and open problems. Stoch. Syst.15, 195–219, DOI: 10.1287/stsy.2025.0106 (2025)
-
[20]
Kusumegi, K.et al.Scientific production in the era of large language models.Science390, 1240–1243, DOI: 10.1126/ science.adw3000 (2025)
2025
-
[21]
Lindsay, G. F. & Bishop, A. B. Allocation of screening inspection effort—a dynamic-programming approach.Manag. Sci. 10, 342–352, DOI: 10.1287/mnsc.10.2.342 (1964). 27.Tapiero, C. S.The Management of Quality and its Control(Chapman & Hall, 1996)
-
[22]
Parasuraman, R. & Manzey, D. H. Complacency and bias in human use of automation: An attentional integration.Hum. Factors52, 381–410, DOI: 10.1177/0018720810376055 (2010). 29.Bainbridge, L. Ironies of automation.Automatica19, 775–779, DOI: 10.1016/0005-1098(83)90046-8 (1983)
-
[23]
A sampling inspection problem in arms control agreements: A game-theoretic analysis
Dresher, M. A sampling inspection problem in arms control agreements: A game-theoretic analysis. Tech. Rep. RM-2972- ARPA, RAND Corporation, Santa Monica, CA (1962)
1962
-
[24]
Avenhaus, R., von Stengel, B. & Zamir, S. Inspection games. In Aumann, R. J. & Hart, S. (eds.)Handbook of Game Theory with Economic Applications, vol. 3, 1947–1987, DOI: 10.1016/S1574-0005(02)03014-X (Elsevier, 2002)
-
[25]
Kiekintveld, C.et al.Computing optimal randomized resource allocations for massive security games. InProceedings of the 8th International Joint Conference on Autonomous Agents and Multiagent Systems, AAMAS ’09, 689–696 (International Foundation for Autonomous Agents and Multiagent Systems, 2009). 33.Cohen, J. W.The Single Server Queue(North-Holland, 1969)...
2009
-
[26]
Neuts, M. F. Generalizations of the Pollaczek–Khinchin integral equation in the theory of queues.Adv. Appl. Probab.18, 952–990, DOI: 10.2307/1427258 (1986)
-
[27]
Abate, J., Choudhury, G. L. & Whitt, W.An Introduction to Numerical Transform Inversion and Its Application to Probability Models, 257–323 (Springer US, Boston, MA, 2000). 38.Wolff, R. W. Poisson arrivals see time averages.Oper. Res.30, 223–231, DOI: 10.1287/opre.30.2.223 (1982). 39.Little, J. D. C. A proof for the queuing formula:L=λW.Oper. Res.9, 383–38...
-
[28]
Kingman, J. F. C. On queues in heavy traffic.J. Royal Stat. Soc. B24, 383–392, DOI: 10.1111/j.2517-6161.1962.tb00465.x (1962)
-
[29]
Kingman, J. F. C. Inequalities in the theory of queues.J. Royal Stat. Soc. B32, 102–110, DOI: 10.1111/j.2517-6161.1970. tb00819.x (1970)
-
[30]
The queueing network analyzer.Bell Syst
Whitt, W. The queueing network analyzer.Bell Syst. Tech. J.62, 2779–2815, DOI: 10.1002/j.1538-7305.1983.tb03204.x (1983)
-
[31]
Hopp, W. J. & Spearman, M. L.Factory Physics: Foundations of Manufacturing Management(McGraw-Hill/Irwin, Boston, 2001), 2nd edn. 20/20
2001
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.