Recognition: 1 theorem link · Lean Theorem
Sure-almost-sure and Sure-limit-sure Window Mean Payoff in Markov Decision Processes
Pith reviewed 2026-05-13 04:31 UTC · model grok-4.3
The pith
Sure-almost-sure and sure-limit-sure problems for window mean-payoff in MDPs are in P for fixed windows and in NP ∩ coNP for bounded windows.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We solve the sure-almost-sure and sure-limit-sure problems for window mean-payoff objectives. We show that the sure-almost-sure problem and the sure-limit-sure problem are both in P for the fixed variant (if ℓ is given in unary) and are both in NP ∩ coNP for the bounded variant, matching the computational complexity of sure satisfaction and almost-sure satisfaction when considered separately for these objectives. We also give bounds for the memory requirement of winning strategies for all considered problems.
What carries the argument
The window mean-payoff objective, which strengthens standard mean payoff by requiring that the average payoff of a finite window sliding over the run exceed a given threshold (α for the sure condition, β for the almost-sure condition), studied in its fixed-length variant (ℓ given) and its bounded-length variant (ℓ not given but required to be bounded throughout the run).
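To make the objective concrete, here is a minimal sketch (not from the paper) of the fixed-variant condition on a finite prefix of a run, in the standard "good window" formulation: from every position, some window of length at most ℓ must have mean payoff at or above the threshold. The function names, and the choice to check only positions that have ℓ successors in the prefix, are illustrative assumptions.

```python
def good_window(payoffs, i, ell, threshold):
    """True if some window of length 1..ell starting at position i
    has mean payoff >= threshold (the 'good window' condition)."""
    total = 0.0
    for j in range(1, ell + 1):
        if i + j > len(payoffs):
            return False  # window would run off the prefix
        total += payoffs[i + j - 1]
        if total / j >= threshold:
            return True
    return False

def fixed_window_ok(payoffs, ell, threshold):
    """Check the fixed window mean-payoff condition on a finite prefix:
    every position with ell successors must open a good window."""
    n = len(payoffs)
    return all(good_window(payoffs, i, ell, threshold)
               for i in range(n - ell + 1))
```

For instance, the prefix `[1, -1, 1, -1]` satisfies the condition with ℓ = 2 and threshold 0 (every position opens a window averaging at least 0), but not with threshold 0.5.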
Load-bearing premise
The Markov decision processes are finite with rational probabilities and payoffs, and the window length ℓ is given in unary for the fixed variant.
What would settle it
A concrete finite MDP with rational transition probabilities, payoffs, and a unary window length ℓ for which no polynomial-time algorithm correctly decides whether a strategy exists that meets both the sure-α and almost-sure-β window mean-payoff conditions.
Original abstract
Given rationals $\alpha$ and $\beta$, the sure-almost-sure problem for a quantitative objective $\varphi$ in a Markov decision process (MDP) asks if one can simultaneously ensure that all outcomes of the MDP have $\varphi$-value at least $\alpha$ (i.e. sure $\alpha$ satisfaction) and with probability $1$ the outcome has $\varphi$-value at least $\beta$ (i.e. almost-sure $\beta$ satisfaction). The sure-limit-sure problem asks if for all $\varepsilon > 0$ one can simultaneously ensure that all outcomes have $\varphi$-value at least $\alpha$ and with probability at least $1 - \varepsilon$ the outcome has $\varphi$-value at least $\beta$. Moreover, if simultaneous satisfaction of objectives is possible, then one would also like to construct a strategy (for sure-almost-sure) or a family of strategies (for sure-limit-sure) that achieves this. In this paper, we solve the sure-almost-sure and sure-limit-sure problems for window mean-payoff objectives. The window mean-payoff objective strengthens the standard mean-payoff objective by requiring that the average payoff of a finite window that slides over an infinite run be greater than a given threshold. We study two variants of window mean payoff: in the fixed variant, the window length $\ell$ is given, while in the bounded variant, the length is not given but is required to be bounded throughout the run. We show that the sure-almost-sure problem and the sure-limit-sure problem are both in P for the fixed variant (if $\ell$ is given in unary) and are both in NP $\cap$ coNP for the bounded variant, matching the computational complexity of sure satisfaction and almost-sure satisfaction when considered separately for these objectives. We also give bounds for the memory requirement of winning strategies for all considered problems.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper solves the sure-almost-sure and sure-limit-sure problems for window mean-payoff objectives in finite MDPs with rational probabilities and payoffs. For the fixed variant (window length ℓ given in unary), both problems are in P; for the bounded variant, both are in NP ∩ coNP. These match the complexities of the separate sure and almost-sure satisfaction problems. The paper also supplies explicit memory bounds for the winning strategies (or families of strategies).
Significance. If the results hold, the work is significant for providing tight complexity classifications for combined quantitative objectives in MDPs. It demonstrates that the conjunction of sure and almost-sure (or limit-sure) requirements does not raise the complexity beyond the individual problems for window mean-payoff, which strengthens standard mean-payoff. The explicit memory bounds for strategies are a concrete strength that supports synthesis applications.
minor comments (3)
- [Abstract] The abstract and introduction should explicitly reference the theorem numbers that establish the P and NP ∩ coNP memberships, so readers can directly locate the reductions or algorithms that preserve these classes.
- [Preliminaries] A small illustrative example of a window mean-payoff run (showing the sliding window averages) would clarify the objective definition before the complexity results.
- [Main results section] The memory bounds for strategies are stated but would benefit from a compact table or corollary that lists the exact size (e.g., polynomial in |M| and ℓ) for each of the four problem variants.
Simulated Author's Rebuttal
We thank the referee for their positive summary of our work and for recommending minor revision. No specific major comments were raised in the report.
Circularity Check
No significant circularity in derivation chain
full rationale
The paper establishes new complexity results (P for fixed unary window length; NP ∩ coNP for bounded) for combined sure-almost-sure and sure-limit-sure window mean-payoff objectives by reductions to standard MDP decision procedures for sure and almost-sure satisfaction separately. These reductions preserve the target complexity classes without introducing self-definitional loops, fitted parameters renamed as predictions, or load-bearing self-citations whose validity depends on the present work. All assumptions (finite MDPs, rational probabilities/payoffs, unary encoding of ℓ) are external and standard; no ansatz is smuggled via prior author work, and no known empirical pattern is merely renamed. The derivation is therefore self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption Markov decision processes are finite-state, finite-action with rational transition probabilities and payoffs.
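As a concrete rendering of this assumption (illustrative only; the state and action names are invented), a finite MDP with exact rational transition probabilities and integer payoffs can be written down with Python's `fractions` module:

```python
from fractions import Fraction

# A minimal finite MDP, matching the domain assumption:
# state -> action -> list of (successor, probability, payoff).
mdp = {
    "s0": {"a": [("s0", Fraction(1, 2), 1), ("s1", Fraction(1, 2), -1)],
           "b": [("s1", Fraction(1, 1), 0)]},
    "s1": {"a": [("s0", Fraction(1, 1), 2)]},
}

# Sanity check: outgoing probabilities of each (state, action) sum to 1.
for state, actions in mdp.items():
    for action, edges in actions.items():
        assert sum(p for _, p, _ in edges) == 1
```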
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · match: unclear · "We show that the sure-almost-sure problem and the sure-limit-sure problem are both in P for the fixed variant... matching the computational complexity of sure satisfaction and almost-sure satisfaction"
Reference graph
Works this paper leans on
- [1] K. R. Apt and E. Grädel. Lectures in Game Theory for Computer Scientists. Cambridge University Press, 2011. https://doi.org/10.1017/CBO9780511973468
- [2] C. Baier and J.-P. Katoen. Principles of Model Checking. MIT Press, 2008.
- [3] R. Berthon, S. Guha, and J.-F. Raskin. Mixing Probabilistic and Non-Probabilistic Objectives in Markov Decision Processes. In LICS, pages 195–208. ACM, 2020. https://doi.org/10.1145/3373718.3394805
- [4] R. Berthon, J.-P. Katoen, and T. Winkler. Markov Decision Processes with Sure Parity and Multiple Reachability Objectives. In RP, volume 15050 of Lecture Notes in Computer Science, pages 203–220. Springer, 2024. https://doi.org/10.1007/978-3-031-72621-7_14
- [5] R. Berthon, M. Randour, and J.-F. Raskin. Threshold Constraints with Guarantees for Parity Objectives in Markov Decision Processes. In ICALP, volume 80 of LIPIcs, pages 121:1–121:15, 2017. https://doi.org/10.4230/LIPIcs.ICALP.2017.121
- [6] P. Billingsley. Probability and Measure. John Wiley and Sons, second edition, 1986.
- [7] B. Bordais, S. Guha, and J.-F. Raskin. Expected Window Mean-Payoff. In FSTTCS, volume 150 of LIPIcs, pages 32:1–32:15, 2019. https://doi.org/10.4230/LIPIcs.FSTTCS.2019.32
- [8] T. Brihaye, F. Delgrange, Y. Oualhadj, and M. Randour. Life is Random, Time is Not: Markov Decision Processes with Window Objectives. Logical Methods in Computer Science, 16(4), 2020. https://doi.org/10.23638/LMCS-16(4:13)2020
- [9] V. Bruyère, E. Filiot, M. Randour, and J.-F. Raskin. Meet Your Expectations with Guarantees: Beyond Worst-Case Synthesis in Quantitative Games. Information and Computation, 254:259–295, 2017. https://doi.org/10.1016/j.ic.2016.10.011
- [10] K. Chatterjee, L. de Alfaro, and T. A. Henzinger. The Complexity of Stochastic Rabin and Streett Games. In ICALP, pages 878–890. Springer, 2005. https://doi.org/10.1007/11523468_71
- [11] K. Chatterjee, L. Doyen, M. Randour, and J.-F. Raskin. Looking at Mean-Payoff and Total-Payoff through Windows. Information and Computation, 242:25–52, 2015. https://doi.org/10.1016/j.ic.2015.03.010
- [12] K. Chatterjee and M. Henzinger. Efficient and Dynamic Algorithms for Alternating Büchi Games and Maximal End-Component Decomposition. Journal of the ACM, 61(3):15:1–15:40, 2014. https://doi.org/10.1145/2597631
- [13] K. Chatterjee, T. A. Henzinger, and F. Horn. Stochastic Games with Finitary Objectives. In MFCS, pages 34–54. Springer, 2009. https://doi.org/10.1007/978-3-642-03816-7_4
- [14] K. Chatterjee, Z. Křetínská, and J. Křetínský. Unifying Two Views on Multiple Mean-Payoff Objectives in Markov Decision Processes. Logical Methods in Computer Science, 13(2), 2017. https://doi.org/10.23638/LMCS-13(2:15)2017
- [15] K. Chatterjee and N. Piterman. Combinations of Qualitative Winning for Stochastic Parity Games. In CONCUR, volume 140 of LIPIcs, pages 6:1–6:17, 2019. https://doi.org/10.4230/LIPIcs.CONCUR.2019.6
- [16] L. Clemente and J.-F. Raskin. Multidimensional Beyond Worst-Case and Almost-Sure Problems for Mean-Payoff Objectives. In LICS, pages 257–268. IEEE, 2015. https://doi.org/10.1109/LICS.2015.33
- [17] L. Doyen, P. Gaba, and S. Guha. Expectation in Stochastic Games with Prefix-Independent Objectives. In CONCUR, volume 348 of LIPIcs, pages 16:1–16:19, 2025. https://doi.org/10.4230/LIPIcs.CONCUR.2025.16
- [18] L. Doyen, P. Gaba, and S. Guha. Stochastic Window Mean-Payoff Games. Logical Methods in Computer Science, 21(2), 2025. https://doi.org/10.46298/lmcs-21(2:19)2025
- [19] L. Doyen and S. Guha. Algorithm and Strategy Construction for Sure-Almost-Sure Stochastic Parity Games. In STACS, pages 34:1–34:20, 2026. https://doi.org/10.4230/LIPICS.STACS.2026.34
- [20] A. Ehrenfeucht and J. Mycielski. Positional Strategies for Mean Payoff Games. International Journal of Game Theory, 8(2):109–113, 1979. https://doi.org/10.1007/BF01768705
- [21] P. Gaba and S. Guha. Optimising Expectation with Guarantees for Window Mean Payoff in Markov Decision Processes. In AAMAS, pages 820–828, 2025. https://doi.org/10.5555/3709347.3743600
- [22] E. Grädel, W. Thomas, and T. Wilke, editors. Automata, Logics, and Infinite Games: A Guide to Current Research, volume 2500 of Lecture Notes in Computer Science. Springer, 2002. https://doi.org/10.1007/3-540-36387-4
- [23] D. A. Martin. The Determinacy of Blackwell Games. The Journal of Symbolic Logic, 63(4):1565–1581, 1998. https://doi.org/10.2307/2586667
- [24] M. L. Puterman. Markov Decision Processes: Discrete Stochastic Dynamic Programming. John Wiley and Sons, 1994.
- [25] U. Zwick and M. Paterson. The Complexity of Mean Payoff Games on Graphs. Theoretical Computer Science, 158(1–2):343–359, 1996. https://doi.org/10.1016/0304-3975(95)00188-3