**Uri ZwickTel Aviv University**Simple Stochastic GamesMean Payoff GamesParity Games TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.: AAAA**Simple Stochastic Games**Mean Payoff Games Parity Games Randomizedsubexponential algorithm for SSG Deterministicsubexponential algorithm for PG**Simple Stochastic Games**Mean Payoff Games Parity Games**R**R R R A simple Simple Stochastic Game**min-sink**MAX-sink Simple Stochastic game (SSGs)Reachability version[Condon (1992)] R min MAX RAND Two Players: MAX and min Objective:MAX/min the probability of getting to the MAX-sink**Simple Stochastic games (SSGs)Strategies**A generalstrategy may be randomized and history dependent A positional strategy is deterministicand history independent Positionalstrategy for MAX: choice of an outgoing edge from each MAX vertex**Simple Stochastic games (SSGs)Values**Every vertex i in the game has a valuevi general positional general positional Both players have positionaloptimal strategies There are strategies that are optimal for every starting position**Simple Stochastic game (SSGs)[Condon (1992)]**Terminating binary games The outdegrees of all non-sinks are 2 All probabilities are ½. The game terminates with prob. 1 Easy reduction from general gamesto terminating binary games**“Solving” terminating binary SSGs**The values vi of the vertices of a game are the unique solution of the following equations: The values are rational numbersrequiring only a linear number of bits Corollary: Decision version in NP co-NP**Value iteration (for binary SSGs)**Iterate the operator: Converges to the unique solution But, may require an exponentialnumber of iterations to get close**Simple Stochastic game (SSGs)Payoff version[Shapley (1953)]**R min MAX RAND Limiting average version Discounted version**Markov Decision Processes (MDPs)**R min MAX RAND Theorem:[Epenoux (1964)] Values and optimal strategies of a MDP can be found by solving an LP**SSG NP co-NP – Another proof**Deciding whether the value of a game isat least (at most) v is in NP co-NP To show that value v,guess an optimal strategy for MAX Find an optimal counter-strategy for min by solving the resulting MDP. Is the problem in P ?**Mean Payoff Games (MPGs)[Ehrenfeucht, Mycielski (1979)]**R min MAX RAND Non-terminating version Discounted version ReachabilitySSGs PayoffSSGs MPGs Pseudo-polynomial algorithm (PZ’96)**Mean Payoff Games (MPGs)[Ehrenfeucht, Mycielski (1979)]**Again, both players have optimal positional strategies. Value(σ,) – average of cycle formed**Selecting the second largest element with only four storage**locations [PZ’96]**Parity Games (PGs) A simple example**Priorities 2 3 2 1 4 1 EVEN wins if largest priorityseen infinitely often is even**8**3 ODD EVEN Parity Games (PGs) EVEN wins if largest priorityseen infinitely often is even Equivalent to many interesting problemsin automata and verification: Non-emptyness of -tree automata modal -calculus model checking**8**3 ODD EVEN Parity Games (PGs) Mean Payoff Games (MPGs) [Stirling (1993)] [Puri (1995)] Replace priority k by payoff (n)k Move payoffs to outgoing edges**Switches**…**Strategy/Policy Iteration**Start with some strategy σ (of MAX) While there are improving switches, perform some of them As each step is strictly improving and as there is a finite number of strategies, the algorithm must end with an optimal strategy SSG PLS (Polynomial Local Search)**Strategy/Policy IterationComplexity?**Performing only one switch at a time may lead to exponentially many improvements,even for MDPs [Condon (1992)] What happens if we perform all profitable switches[Hoffman-Karp (1966)] ??? Not known to be polynomialO(2n/n) [Mansour-Singh (1999)] No non-linear examples2n-O(1) [Madani (2002)]**A randomized subexponential algorithm for simple stochastic**games**Arandomizedsubexponentialalgorithm for binary SSGs[Ludwig**(1995)][Kalai (1992)] [Matousek-Sharir-Welzl (1992)] Start with an arbitrary strategy for MAX Choose a random vertex iVMAX Find the optimal strategy ’ for MAX in the gamein which the only outgoing edge of i is (i,(i)) If switching ’ at i is not profitable, then ’ is optimal Otherwise, let (’)i and repeat**Arandomizedsubexponentialalgorithm for binary SSGs[Ludwig**(1995)][Kalai (1992)] [Matousek-Sharir-Welzl (1992)] MAX vertices All correct ! Would never be switched ! There is a hidden order of MAX vertices under which the optimal strategy returned by the first recursive call correctly fixes the strategy of MAX at vertices 1,2,…,i**The hidden order**ui(σ)- the maximum sum of values of a strategy of MAX that agrees with σ on i**The hidden order**Order the vertices such that Positions 1,..,iwere switchedand would neverbe switched again**SSGs are LP-type problems[Halman (2002)]**General (non-binary) SSGs can be solved in time Independently observed by[Björklund-Sandberg-Vorobyov (2005)] AUSO – Acyclic Unique Sink Orientations**SSGs GPLCP[Gärtner-Rüst**(2005)][Björklund-Svensson-Vorobyov (2005)] GPLCPGeneralized Linear ComplementaryProblem with a P-matrix**A deterministic subexponential algorithm for parity games**Mike PatersonMarcin JurdzinskiUri Zwick**Parity Games (PGs) A simple example**Priorities 2 3 2 1 4 1 EVEN wins if largest priorityseen infinitely often is even**8**3 ODD EVEN Parity Games (PGs) Mean Payoff Games (MPGs) [Stirling (1993)] [Puri (1995)] Replace priority k by payoff (n)k Move payoffs to outgoing edges**Exponential algorithm for PGs[McNaughton (1993)] [Zielonka**(1998)] Vertices of highest priority(even) Firstrecursivecall Vertices from whichEVEN can force thegame to enter A Lemma: (i) (ii)**Exponential algorithm for PGs[McNaughton (1993)] [Zielonka**(1998)] Second recursivecall In the worst case, both recursive calls are on games of size n1**Deterministic subexponential alg for PGsJurdzinski,**Paterson, Z (2006) Idea:Look for small dominions! Second recursivecall Dominions of size s can be found in O(ns) time Dominion Dominion: A (small) set from which one of the players can win without the play ever leaving this set**Open problems**• Polynomial algorithms? • Is the Policy Improvement algorithm polynomial? • Faster subexponential algorithmsfor parity games? • Deterministic subexponential algorithmsfor MPGs and SSGs? • Faster pseudo-polynomial algorithmsfor MPGs?