Security-aware synthesis of human-UAV protocols
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.AI (1). Years: 2021 (1). Verdicts: UNVERDICTED (1).
Representative citing papers:
Learning Optimal Strategies for Temporal Tasks in Stochastic Games
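Model-free reinforcement learning (RL) learns optimal strategies in stochastic games for linear temporal logic (LTL) specifications by constructing a product of the game with a deterministic parity automaton (DPA) and deriving rewards and discount factors from the automaton's acceptance condition.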