The Value of Temporary Control for the M/M/1 Queue
Pith reviewed 2026-06-28 21:17 UTC · model grok-4.3
The pith
A one-off temporary control option with two service rates during an exponential period approximates expected cost savings in the M/M/1 queue via value iteration.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The expected total saved cost from exercising the one-off temporary service rate control option in the M/M/1 queue can be approximated by the Value Iteration algorithm with M-uniform geometric recurrence, supplying concrete methods for both the total and the discounted saved-cost criteria; the same analysis establishes structural properties of optimal policies and strong Blackwell optimality.
What carries the argument
Value Iteration algorithm with M-uniform geometric recurrence, which iteratively approximates the value function of the finite-horizon control problem embedded in the infinite-horizon queue.
If this is right
- The approximation supplies numerical estimates of expected saved future cost for any chosen starting state or initial distribution.
- The approximation supplies numerical estimates of expected saved discounted future cost for any chosen discount factor.
- Optimal policies during the temporary control window possess identifiable structural properties.
- The derived policies are strongly Blackwell optimal.
Where Pith is reading between the lines
- The same recurrence technique could be tested on other birth-death processes that admit a temporary action window.
- Numerical values obtained for different parameter regimes indicate thresholds on queue length above which the control option becomes worthwhile.
- The structural policy results may carry over to models with phase-type service or arrival processes once the uniform recurrence condition is verified.
Load-bearing premise
The Value Iteration algorithm with M-uniform geometric recurrence produces sufficiently accurate approximations to the true expected saved costs of the temporary control option.
What would settle it
For a truncated finite-state version of the model with small buffer size, compute the exact optimal saved cost by solving the linear program and compare the numerical difference to the value-iteration output on the same instance.
Figures
read the original abstract
In this article, a one-off option for temporary service rate control for the M/M/1 queue is considered. After taking this option, during a single exponentially distributed period, two service rates are available for use. Once service rate control is lost, the system continues with a fixed service rate $\mu$. The objective is to minimise the sum of holding costs and service costs. We approximate the expected total saved cost by taking the one-off option, depending on the starting state or starting distribution. Using the Value Iteration algorithm with $M$-uniform geometric recurrence, we present methods to approximate the expected total saved future cost, as well as the expected total saved discounted future cost. Furthermore, we obtain theoretical results on the structure of optimal policies and strong Blackwell optimality. The paper is concluded by numerically applying the methods to various instances of the model.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript models a one-off temporary service-rate control option for the M/M/1 queue: after exercising the option, an exponentially distributed interval begins during which the controller may switch between two service rates before the system reverts to a fixed rate μ. The objective is to minimize the sum of holding and service costs. The expected total saved cost (undiscounted and discounted) is approximated via Value Iteration under the M-uniform geometric recurrence condition, both from a fixed initial state and from an initial distribution. Structural results on optimal policies and strong Blackwell optimality are derived, and the methods are illustrated numerically on several parameter instances.
Significance. If the recurrence condition holds and the value-iteration approximations converge to the true value functions, the paper supplies a concrete computational approach for valuing temporary control options in birth-death processes together with monotonicity properties of the optimal policy. Such results would be useful for designing flexible service contracts in queueing systems. The explicit treatment of both average-cost and discounted criteria, plus the numerical examples, strengthens the contribution provided the foundational assumptions are verified.
major comments (2)
- [Methods / Value Iteration section] The central approximation procedure relies on Value Iteration with M-uniform geometric recurrence, yet the manuscript provides no verification that the temporary-control MDP (with its finite-duration exponential control window and the given cost and rate parameters) satisfies the required recurrence condition. Without this verification the claimed convergence of the iterates and the numerical values of saved cost rest on an unestablished foundation.
- [Theoretical results section] The structural results on optimal policies and strong Blackwell optimality are stated to follow from the value-function properties obtained via the same recurrence-based iteration; if the recurrence fails to hold, these structural claims lose their supporting derivation.
minor comments (2)
- [Model formulation] Notation for the two service rates available during the temporary interval and for the exponential duration parameter should be introduced consistently in the model section before being used in the value-function equations.
- [Numerical experiments] The numerical section would benefit from an explicit statement of the truncation level M chosen for each instance and a brief check that increasing M does not materially change the reported saved-cost values.
Simulated Author's Rebuttal
We thank the referee for the careful reading and for highlighting the need to verify the foundational recurrence condition. We address each major comment below.
read point-by-point responses
-
Referee: [Methods / Value Iteration section] The central approximation procedure relies on Value Iteration with M-uniform geometric recurrence, yet the manuscript provides no verification that the temporary-control MDP (with its finite-duration exponential control window and the given cost and rate parameters) satisfies the required recurrence condition. Without this verification the claimed convergence of the iterates and the numerical values of saved cost rest on an unestablished foundation.
Authors: We agree that the manuscript must explicitly confirm the M-uniform geometric recurrence condition for the augmented MDP (queue length together with the binary control-active state). In the revision we will add a short appendix verifying the condition via standard drift arguments for controlled birth-death processes: the service rates are bounded above and below, the holding-cost function is linear, and the exponential control window is memoryless, which together ensure a uniform geometric drift outside a finite set for any fixed cost and rate parameters in the numerical examples. This will rigorously justify both the convergence of value iteration and the reported saved-cost values. revision: yes
-
Referee: [Theoretical results section] The structural results on optimal policies and strong Blackwell optimality are stated to follow from the value-function properties obtained via the same recurrence-based iteration; if the recurrence fails to hold, these structural claims lose their supporting derivation.
Authors: The monotonicity and Blackwell-optimality claims are derived from the value-function properties that the recurrence condition guarantees. Once the recurrence verification is supplied, the derivations remain valid. In the revision we will add an explicit statement that all structural results hold under the verified recurrence condition and will note the (mild) parameter restrictions under which the drift argument applies. revision: yes
Circularity Check
No significant circularity; derivation applies standard VI to new model
full rationale
The paper applies the Value Iteration algorithm under the M-uniform geometric recurrence condition to compute approximations for expected saved costs (undiscounted and discounted) in the temporary-control MDP, then derives structural results on optimal policies and Blackwell optimality. No quoted step reduces a claimed prediction or value function to a fitted parameter or input by the paper's own equations. No self-citation chain is load-bearing for the central claims, and the recurrence condition is invoked as a prerequisite for the algorithm rather than defined circularly. The derivation remains self-contained as an application of known MDP methods to the one-off control model.
Axiom & Free-Parameter Ledger
Reference graph
Works this paper leans on
-
[1]
K.M. Adusumilli and J.J. Hasenbein,Dynamic Admission and Service uRate Control of a Queue, Queueing Syst.66(2010), 131–154, available athttps://doi.org/10.1007/s11134-010-9192-z
-
[2]
Altman, A
E. Altman, A. Hordijk, and F.M. Spieksma,Contraction Conditions for Average andα-Discount Optimality in Countable State Markov Games with Unbounded Rewards, Math. Oper. Res.22(1997), 588–618
1997
-
[3]
B. Ata and S. Shneorson,Dynamic Control of an M/M/1 Service System with Adjustable Arrival and Service Rates, MNSC52 (2006), 1778–1791, available athttps://doi.org/10.1287/mnsc.1060.0587
-
[4]
Badian-Pessot, M.E
P. Badian-Pessot, M.E. Lewis, and D.G. Down,Optimal Control Policies for an M/M/1 Queue With a Removable Server and Dynamic Service Rates, Probab. Eng. Inf. Sci.35(2021), 189–209
2021
-
[5]
Blackwell,Discrete Dynamic Programming, The Annals of Mathematical Statistics33(1962), 719–726
D. Blackwell,Discrete Dynamic Programming, The Annals of Mathematical Statistics33(1962), 719–726
1962
-
[6]
Blok and F.M
H. Blok and F.M. Spieksma,Structures of Optimal Policies in MDPs with Unbounded Jumps: The State of Our Art, Markov Decision Processes in Practice, 2017, pp. 131–186
2017
-
[7]
Borkar,Topics in Controlled Markov Chains, Pitman Res
V .S. Borkar,Topics in Controlled Markov Chains, Pitman Res. Notes Math. Ser., vol. 240, Longman Scientific & Technical, New York, 1991
1991
-
[8]
Crabill,Optimal Control of a Service Facility with Variable Exponential Service Times and Constant Arrival Rate, MNSC18(1972), 560–566
T.B. Crabill,Optimal Control of a Service Facility with Variable Exponential Service Times and Constant Arrival Rate, MNSC18(1972), 560–566
1972
-
[9]
D. Ertiningsih, S. Bhulai, and F.M. Spieksma,A novel use of value iteration for deriving bounds for threshold and switching curve optimal policies, NRL65(2018), 638–659, available athttps://onlinelibrary.wiley.com/doi/pdf/ 10.1002/nav.21824
-
[10]
J.M. George and J.M. Harrison,Dynamic Control of a Queue with Adjustable Service Rate, Oper. Res.49(2001), 720–731, available athttps://doi.org/10.1287/opre.49.5.720.10605
-
[11]
Gnedenko and I.A
B. Gnedenko and I.A. Ushakov,Probabilistic Reliability Engineering, John Wiley & Sons, New York, 1995
1995
-
[12]
Groenevelt, L
H. Groenevelt, L. Pintelon, and A. Seidmann,Production Lot Sizing with Machine Breakdown, MNSC38(1992), 104–123
1992
-
[13]
Gryna and J.M
F.M. Gryna and J.M. Juran,Quality and Costs, McGraw-Hill, New York, 1999
1999
-
[14]
Kaliski Jr.,Subexponential Time, Encyclopedia of Cryptography and Security, 2011, pp
B.S. Kaliski Jr.,Subexponential Time, Encyclopedia of Cryptography and Security, 2011, pp. 1267–1267
2011
-
[15]
Teghem Jr.,Control of the service process in a queueing system, Eur
J. Teghem Jr.,Control of the service process in a queueing system, Eur. J. Oper. Res.23(1986), 141–158
1986
-
[16]
Stidham Jr
S. Stidham Jr. and R.R. Weber,Monotonic and Insensitive Optimal Policies for Control of Queues with Undiscounted Costs, Oper. Res.37(1989), 611–625. 29
1989
-
[17]
Kallenberg,Finite state and action MDPs, Handbook of Markov Decision Processes, 2002, pp
L.C.M. Kallenberg,Finite state and action MDPs, Handbook of Markov Decision Processes, 2002, pp. 21–87
2002
-
[18]
Kapur and L.R
K.C. Kapur and L.R. Lamberson,Reliability in Engineering Design, Wiley, New York, 1977
1977
-
[19]
Kitaev and V .V
M.Y . Kitaev and V .V . Rykov,Controlled Queueing Systems, CRC press, Boca Raton, 1995
1995
-
[20]
Koole,Structural results for the control of queueing systems using event-based dynamic programming, Queueing Syst
G.M. Koole,Structural results for the control of queueing systems using event-based dynamic programming, Queueing Syst. 30(1998), 323–339
1998
-
[21]
Koole,Monotonicity in Markov Reward and Decision Chains: Theory and Applications, FnT: Stochastic Systems1 (2006), 1–76
G.M. Koole,Monotonicity in Markov Reward and Decision Chains: Theory and Applications, FnT: Stochastic Systems1 (2006), 1–76
2006
-
[22]
R. Kumar, M.E. Lewis, and H. Topaloglu,Dynamic Service Rate Control for a Single-Server Queue with Markov-Modulated Arrivals, NRL60(2013), 661–677, available athttps://onlinelibrary.wiley.com/doi/pdf/10.1002/nav. 21560
work page doi:10.1002/nav 2013
-
[23]
Lee and V .G
N. Lee and V .G. Kulkarni,Optimal arrival rate and service rate control of multi-server queues, Queueing Syst.76(2014)
2014
-
[24]
Lippman,Applying a New Device in the Optimization of Exponential Queuing Systems, Oper
S.A. Lippman,Applying a New Device in the Optimization of Exponential Queuing Systems, Oper. Res.23(1975), 687–710
1975
-
[25]
Olver,Numerical Solution of Second-Order Linear Difference Equations, J
F.W.J. Olver,Numerical Solution of Second-Order Linear Difference Equations, J. Res. Nat. Bur. Standards Sect. B71(1967), 111–129
1967
-
[26]
Rust, A.J
R.T. Rust, A.J. Zahorik, and T.L. Keiningham,Return on Quality (ROQ): Making Service Quality Financially Accountable, J. Mark.59(1995), 58–70
1995
-
[27]
Sennott,Stochastic Dynamic Programming and the Control of Queueing Systems, Ser
L.I. Sennott,Stochastic Dynamic Programming and the Control of Queueing Systems, Ser. Probab. Math. Stat., Wiley, New York, 1999
1999
-
[28]
Serfozo,Technical Note–An Equivalence Between Continuous and Discrete Time Markov Decision Processes, Oper
R.F. Serfozo,Technical Note–An Equivalence Between Continuous and Discrete Time Markov Decision Processes, Oper. Res. 27(1979), 616–620
1979
-
[29]
Shapley,Stochastic Games, PNAS39(1953), 1095 –1100
L.S. Shapley,Stochastic Games, PNAS39(1953), 1095 –1100
1953
-
[30]
Spieksma,Geometrically ergodic Markov Chains and the optimal Control of Queues, Ph.D
F.M. Spieksma,Geometrically ergodic Markov Chains and the optimal Control of Queues, Ph.D. Thesis, 1990
1990
-
[31]
Spieksma,A recurrence type characterisation of mu-exponential ergodicity for Markov processes, 1992
F.M. Spieksma,A recurrence type characterisation of mu-exponential ergodicity for Markov processes, 1992. Accepted by Applied Probability. Available at:https://pub.math.leidenuniv.nl/~spieksmafm/publicaties.html
1992
-
[32]
Weber and S
R.R. Weber and S. Stidham Jr.,Optimal Control of Service Rates in Networks of Queues, Adv. Appl. Probab.19(1987), 202– 218
1987
-
[33]
Z ∞ 0 c(X(t))−c(Y(t)) dt # = lim α↓0 Eϕ p0×p1
J. Wessels,Markov Programming by Successive Approximations with Respect to Weighted Supremum Norms, J. Math. Anal. Appl.58(1977), 326–335. A Equality of total saved cost for the continuous and discrete time models In this section we show the equivalence of Equations (1.1) and (2.1) for zero discounts. As there is equiva- lence for non-zero discounts, we o...
1977
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.