Description

Lecture 6: Adversarial Search & Games. Reading: Ch. 6, AIMA. Adversarial search. So far, single-agent search – no opponents or collaborators. Multi-agent search: playing a game with an opponent: adversarial search.

Transcripts

Lecture 6: Adversarial Search & Games Reading: Ch. 6, AIMA Rutgers CS440, Fall 2003

Adversarial search. So far, single-agent search – no opponents or collaborators. Multi-agent search: playing a game with an opponent: adversarial search. Economies: much more complex, societies of cooperative and non-cooperative agents. Game playing and AI: games can be complex and (?) require human intelligence; they must be played "in real time"; they are well-defined problems of limited scope.

Games and AI

Games and search. Classical search: a single agent searches for its own success, unobstructed. Games: search against an opponent. Consider a two-player board game, e.g. chess, checkers, tic-tac-toe; a board configuration is a unique arrangement of "pieces". Representing board games as a search problem: states: board configurations; operators: legal moves; initial state: the current board configuration; goal state: a winning/terminal board configuration.

X O X O X (tic-tac-toe board diagram). Wrong representation: we want to maximize our (agent's) objective, hence build a search tree out of our possible moves/actions only. Problem: this ignores the opponent.

X O X O X O X O X O (game-tree diagram over tic-tac-toe boards). Better representation: a game search tree. Include the opponent's actions as well: agent move, opponent move, agent move, …; an agent move plus the opponent's reply makes one full move. Utilities (assigned to goal nodes), e.g. 5, 10, 1.

Game search trees. What is the size of a game search tree? O(b^d). Tic-tac-toe: 9! leaves (max depth = 9). Chess: ~35 legal moves per position, average "depth" 100, so b^d ≈ 35^100 ≈ 10^154 states, with "only" ~10^40 legal states. Too deep for exhaustive search!
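The slide's chess estimate can be checked with a quick computation (Python is used here purely for illustration):

```python
import math

# Chess estimate from the slide: branching factor b ~ 35, game "depth" d ~ 100.
b, d = 35, 100

# log10(b**d) = d * log10(b); this avoids computing the huge number directly.
exponent = d * math.log10(b)
print(round(exponent))  # -> 154, so 35**100 is roughly 10**154 states
```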

Utilities in search trees. Assign a utility to each (terminal) state, describing how much it is worth to the agent: high utility – good for the agent; low utility – good for the opponent. Example tree: the computer's possible moves from A lead to B, C, D, E; the opponent's possible moves from those lead to the terminal states F –7, G –5, H 3, I 9, J –6, K 0, L 2, M 1, N 3, O 2 – board evaluations from the agent's perspective. (Labels A 9; B –5, C 9, D 2, E 3 mark the naive maxima at each node.)

Search strategy. Worst-case scenario: assume the opponent always makes their best move (i.e., the worst move for us). Minimax search: maximize the utility for our agent while assuming the opponent plays their best moves: a high-utility move favors the agent => choose the move with maximal utility; a low-utility move favors the opponent => assume the opponent makes the move with minimal utility. In the example tree (computer's possible moves from A; opponent's possible moves lead to the terminal states F –7, G –5, H 3, I 9, J –6, K 0, L 2, M 1, N 3, O 2), the resulting values are B –7, C –6, D 0, E 1, and A 1.

Minimax algorithm. Start with the utilities of the terminal nodes and propagate them back to the root by applying the minimax rule at each level: at the opponent's (min) level, B = min(–7, –5) = –7, C = min(3, 9, –6) = –6, D = min(0, 2) = 0, E = min(1, 3, 2) = 1; at the agent's (max) level, A = max(–7, –6, 0, 1) = 1.
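The propagation rule above can be sketched in a few lines of Python (the tuple encoding and function name are my own, not from the lecture); the example tree groups the slide's leaves as B = {F, G}, C = {H, I, J}, D = {K, L}, E = {M, N, O}:

```python
def minimax(node, maximizing):
    """Return the minimax value of a game tree.
    A leaf is a number (its utility); an internal node is a tuple of children."""
    if isinstance(node, (int, float)):   # terminal node: utility is known
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

# The slide's tree: A (max) over B, C, D, E (min), each over terminal utilities.
tree = ((-7, -5),     # B = min -> -7
        (3, 9, -6),   # C = min -> -6
        (0, 2),       # D = min ->  0
        (1, 3, 2))    # E = min ->  1
print(minimax(tree, True))  # -> 1, i.e. A = max(-7, -6, 0, 1)
```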

Complexity of the minimax algorithm. Utilities propagate up in a recursive manner: DFS. Space complexity: O(bd). Time complexity: O(b^d). Problem: the time complexity – it's a game, with limited time to make a move.

Reducing the complexity of minimax (1). Don't search to full depth d; terminate early. Prune bad paths. Problem: we don't have utilities for non-terminal nodes. Estimate the utility of non-terminal nodes: a static board evaluation (SBE) function is a heuristic that assigns a utility to non-terminal nodes; it reflects the computer's chances of winning from that node; it must be easy to compute from the board configuration. For example, chess: SBE = α * materialBalance + β * centerControl + γ * …, where material balance = value of white pieces - value of black pieces; pawn = 1, rook = 5, queen = 9, etc.
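The chess SBE above can be sketched as follows; the knight/bishop values, the weights, and the `center_control` feature value are illustrative assumptions, not from the lecture:

```python
# Piece values from the slide: pawn = 1, rook = 5, queen = 9
# (knight and bishop values are the conventional 3, added here as an assumption).
PIECE_VALUE = {"pawn": 1, "knight": 3, "bishop": 3, "rook": 5, "queen": 9}

def material_balance(white, black):
    """Value of white pieces minus value of black pieces."""
    return sum(PIECE_VALUE[p] for p in white) - sum(PIECE_VALUE[p] for p in black)

def sbe(white, black, center_control, alpha=1.0, beta=0.5):
    """Static board evaluation: a weighted sum of board features."""
    return alpha * material_balance(white, black) + beta * center_control

# Example: white has queen + pawn, black has a rook, white holds 2 center squares.
print(sbe(["queen", "pawn"], ["rook"], center_control=2))  # -> 6.0
```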

Minimax with evaluation functions. Same as general minimax, except it only goes down to depth m and estimates utilities there using the SBE function. How might this algorithm perform at chess? If it could look ahead ~4 pairs of moves (i.e., 8 ply), it would be consistently beaten by average players; if it could look ahead ~8 pairs, as done on a typical PC, it is as good as a human master.

Reducing the complexity of minimax (2). Some branches of the tree will never be taken if the opponent plays cleverly. Can we detect them early? Prune off paths that do not need to be explored: alpha-beta pruning. Keep track of, while doing DFS of the game tree: at a maximizing level: alpha – the highest value seen so far, a lower bound on the node's evaluation/score; at a minimizing level: beta – the lowest value seen so far, an upper bound on the node's evaluation/score.
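The alpha/beta bookkeeping just described can be written down directly; this is a sketch over the same tuple-encoded tree used earlier (naming is my own):

```python
def alphabeta(node, alpha, beta, maximizing):
    """Minimax with alpha-beta pruning over a tuple-encoded game tree."""
    if isinstance(node, (int, float)):        # terminal node
        return node
    if maximizing:
        value = float("-inf")
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False))
            alpha = max(alpha, value)         # raise the lower bound
            if alpha >= beta:                 # cut-off: the min ancestor
                break                         # would never allow this node
        return value
    else:
        value = float("inf")
        for child in node:
            value = min(value, alphabeta(child, alpha, beta, True))
            beta = min(beta, value)           # lower the upper bound
            if beta <= alpha:                 # cut-off: the max ancestor
                break                         # would never allow this node
        return value

tree = ((-7, -5), (3, 9, -6), (0, 2), (1, 3, 2))
print(alphabeta(tree, float("-inf"), float("inf"), True))  # -> 1, same as minimax
```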

Alpha-Beta example: minimax(A, 0, 4). Call stack: A. Max node A, α = –∞. (Example game tree with terminal values D 0, G –5, H 3, I 8, L 2, N 4, P 9, Q –6, R 0, S 3, T 5, U –7, V –9, W –3, X –5.)

Alpha-Beta example: minimax(B, 1, 4). Call stack: B, A. Min node B, β = +∞.

Alpha-Beta example: minimax(F, 2, 4). Call stack: F, B, A. Max node F, α = –∞.

Alpha-Beta example: minimax(N, 3, 4). Call stack: N, F, B, A. N = 4 (blue: terminal state).

Alpha-Beta example: minimax(F, 2, 4) resumes with α = 4, the maximum seen so far.

Alpha-Beta example: minimax(O, 3, 4). Call stack: O, F, B, A. Min node O, β = +∞.

Alpha-Beta example: minimax(W, 4, 4). Call stack: W, O, F, B, A. W = –3 (blue: terminal state at the depth limit).

Alpha-Beta example: minimax(O, 3, 4) resumes with β = –3, the minimum seen so far.

Alpha-Beta example: minimax(O, 3, 4) returns. O's β ≤ F's α, so stop expanding O (alpha cut-off); X (–5) is never examined.

Alpha-Beta example. Why? A smart opponent will pick W or worse, so O's upper bound is –3; the computer should not pick O (–3), since N (4) is better.

Alpha-Beta example: minimax(F, 2, 4) returns. α is not changed (maximizing level), so F's value stays 4.

Alpha-Beta example: minimax(B, 1, 4) resumes with β = 4, the minimum seen so far.

Effectiveness of Alpha-Beta Search. Effectiveness depends on the order in which successors are examined; it is more effective if the best successors are examined first. Worst case: successors ordered so that no pruning takes place – no improvement over exhaustive search. Best case: each player's best move is evaluated first (left-most). In practice, performance is closer to the best case than to the worst case.

Effectiveness of Alpha-Beta Search. In practice we often get O(b^(d/2)) rather than O(b^d) – the same as having a branching factor of √b, since (√b)^d = b^(d/2). For example, chess goes from b ≈ 35 to b ≈ 6, which allows a much deeper search in the same time and makes computer chess competitive with humans.
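The effective-branching-factor claim can be checked numerically:

```python
import math

# Best-case alpha-beta examines O(b**(d/2)) nodes:
# an effective branching factor of sqrt(b).
b = 35
effective_b = math.sqrt(b)
print(round(effective_b))  # -> 6, the slide's figure for chess

# Sanity check of the identity (sqrt(b))**d == b**(d/2) at a small depth.
d = 10
print(math.isclose(effective_b ** d, b ** (d / 2)))  # -> True
```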

Dealing with Limited Time. In real games there is usually a time limit T on making a move. How do we cope? We can't stop alpha-beta midway and expect to use its results with any confidence. We could set a conservative depth limit that guarantees we find a move in time < T, but then the search may finish early and the opportunity to do more search is wasted.

Dealing with Limited Time. In practice, iterative deepening search (IDS) is used: run alpha-beta search with an increasing depth limit; when the clock runs out, use the solution found by the last completed alpha-beta search (i.e., the deepest search that was completed).
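A sketch of the time-limited deepening loop (the depth-limited search and the `evaluate` heuristic below are simple stand-ins; a real engine would run alpha-beta with an SBE inside, and would abort a pass mid-search rather than let it overrun):

```python
import time

def depth_limited(node, depth, maximizing, evaluate):
    """Minimax to a fixed depth; `evaluate` estimates non-terminal nodes (SBE)."""
    if isinstance(node, (int, float)):
        return node
    if depth == 0:
        return evaluate(node)
    values = [depth_limited(c, depth - 1, not maximizing, evaluate) for c in node]
    return max(values) if maximizing else min(values)

def iterative_deepening(root, time_limit, evaluate, max_depth=32):
    """Deepen until the clock runs out; keep the last fully completed result."""
    deadline = time.monotonic() + time_limit
    best = None
    for depth in range(1, max_depth + 1):
        value = depth_limited(root, depth, True, evaluate)
        if time.monotonic() > deadline:
            break            # this pass overran the deadline: discard its value
        best = value         # pass completed in time: remember its answer
    return best

tree = ((-7, -5), (3, 9, -6), (0, 2), (1, 3, 2))
print(iterative_deepening(tree, 1.0, evaluate=lambda n: 0))  # -> 1
```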

The Horizon Effect. Sometimes disaster lurks just beyond the search depth: the computer captures a queen, but a few moves later the opponent checkmates (i.e., wins). The computer has a limited horizon; it cannot see that this significant event could happen. How do you avoid catastrophic losses due to "short-sightedness"? Quiescence search; secondary search.

The Horizon Effect (board diagram).