Description

The game of Go. A 4,000-year-old board game from China. Standard size ... the game becomes progressively more complex, at least for the first 100 plies ...

Transcripts

Monte Carlo Go Has a Way to Go. Adapted from the slides presented at AAAI 2006. Haruhiro Yoshimoto (*1) Kazuki Yoshizoe (*1) Tomoyuki Kaneko (*1) Akihiro Kishimoto (*2) Kenjiro Taura (*1) (*1) University of Tokyo (*2) Future University Hakodate

Games in AI. An ideal proving ground for AI research: clear results, clear motivation, a good testbed. Search-based approaches have succeeded in chess (1997, Deep Blue) and other games, but not in the game of Go. "Go is to Chess as Poetry is to double-entry bookkeeping." It goes to the core of artificial intelligence, which involves the study of learning and decision-making, strategic thinking, knowledge representation, pattern recognition and, perhaps most intriguingly, intuition.

The game of Go. A 4,000-year-old board game from China. Standard size 19 × 19. Two players, Black and White, place stones in turns. Stones cannot be moved, but they can be captured and taken off the board. The larger territory wins.

Terminology of Go

Playing Strength. $1.2M was offered for beating a professional with no handicap (the offer has expired!). Handtalk in 1997 claimed $7,700 for winning an 11-stone handicap match against an 8-9-year-old professional.

Difficulties in Computer Go: large search space. The game becomes progressively more complex, at least for the first 100 plies.

Difficulties in Computer Go: lack of a good evaluation function. A material advantage does not mean a straightforward path to victory, and may simply mean that short-term gain has been given priority. Legal moves number around 150-250; usually fewer than 50 are acceptable (sometimes fewer than 10), yet computers have serious difficulty finding them. A high level of pattern recognition is required to play well at a human level.

Why Monte Carlo Go? Replace the evaluation function by random sampling [Brugmann 1993, Bouzy 2003]. Success in other domains: Bridge [Ginsberg 1999], Poker [Billings et al. 2002]. Reasonable position evaluation based on sampling reduces the search space from O(b^d) to O(N·b·d). Easy to parallelize. Can win against search-based approaches: Crazy Stone won the 11th Computer Olympiad in 9x9 Go; MoGo won the 19th and 20th KGS 9x9 tournaments and is rated highest on CGOS.

Basic idea of Monte Carlo Go. Generate next moves by 1-ply search. Play a number of random games and compute the average score. Choose the move with the maximal score. The only domain-dependent knowledge is the eye.

Terminal Position of Go. The larger territory wins. Territory = surrounded area + stones. ▲ Black's territory is 36 points × White's territory is 45 points: White wins by 9 points.

Play many sample games. Each player plays randomly. Compute the average points for each candidate move and select the move with the highest average. Example: play the rest of the game randomly twice after move A, giving a 5-point win for Black and a 9-point win for Black; move A scores (5 + 9) / 2 = 7 points.
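The move-selection loop described above can be sketched as follows. The playout interface (`play_random_game`) is a hypothetical stand-in for the 1-ply search plus random playouts, not the authors' actual code; the toy check below just reproduces the slide's (5 + 9) / 2 = 7 arithmetic.

```python
def monte_carlo_move(position, legal_moves, play_random_game, n_samples):
    """Pick the move with the highest average score over random playouts.

    `play_random_game(position, move)` is an assumed interface: it plays
    `move`, finishes the game with random moves, and returns the final
    score from the mover's point of view.
    """
    best_move, best_avg = None, float("-inf")
    for move in legal_moves:
        total = sum(play_random_game(position, move) for _ in range(n_samples))
        avg = total / n_samples
        if avg > best_avg:
            best_move, best_avg = move, avg
    return best_move, best_avg

# Toy playout results mirroring the slide: two playouts after move A
# score 5 and 9 points, so A averages 7; move B averages only 3.
_playouts = {"A": iter([5, 9]), "B": iter([2, 4])}
def fake_playout(position, move):
    return next(_playouts[move])

move, avg = monte_carlo_move(None, ["A", "B"], fake_playout, 2)
```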

Monte Carlo Go and Sample Size. Additional samples reduce statistical error: sampling error ~ 1/√N (N: # of random games), so Monte Carlo with 1000 sample games is stronger than Monte Carlo with 100 sample games. The relationship between sample size and strength is not yet explored, but diminishing returns must appear.
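The 1/√N behaviour behind the diminishing returns can be checked with a small simulation. The ±1 "playout" below is a toy stand-in for a random game's score, not anything from the talk: quadrupling the sample count only roughly halves the error of the Monte Carlo estimate.

```python
import random
import statistics

def sampling_error(n_samples, n_trials=2000, rng=random.Random(0)):
    """Standard deviation of a Monte Carlo mean estimate over repeated trials.

    Each toy "playout" scores +1 or -1 with equal probability (true mean 0).
    The spread of the estimated means shrinks roughly as 1 / sqrt(n_samples).
    """
    estimates = [
        sum(rng.choice([-1, 1]) for _ in range(n_samples)) / n_samples
        for _ in range(n_trials)
    ]
    return statistics.pstdev(estimates)

err_100 = sampling_error(100)
err_400 = sampling_error(400)
ratio = err_100 / err_400  # expect roughly sqrt(400 / 100) = 2
```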

Our Monte Carlo Go. Implementation of basic Monte Carlo Go. Atari-50 enhancement: use of simple Go knowledge in move selection. Progressive pruning [Bouzy 2003]: statistical move pruning in simulations.

Atari-50 Enhancement. Basic Monte Carlo assigns uniform probability to every move in a sample game (no eye filling). Atari-50 assigns a higher probability (50%) to capture moves, since a capture is "mostly" a good move. Example: Move A captures Black stones.
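One plausible reading of the Atari-50 bias, sketched below. The exact probability scheme is an assumption (play a capture with probability 0.5 when one exists, otherwise choose uniformly), and `is_capture` is a hypothetical predicate standing in for the engine's capture detection.

```python
import random

def pick_playout_move(moves, is_capture, rng=random.Random(42)):
    """Atari-50-style playout move choice (a sketch, not the authors' code):
    with probability 0.5 play a capture if any exists; otherwise fall back
    to a uniformly random move (captures included)."""
    captures = [m for m in moves if is_capture(m)]
    if captures and rng.random() < 0.5:
        return rng.choice(captures)
    return rng.choice(moves)

# With one capture among four moves, the capture should be chosen about
# 0.5 + 0.5 * 0.25 = 62.5% of the time, far above the uniform 25%.
moves = ["capture", "x", "y", "z"]
picks = [pick_playout_move(moves, lambda m: m == "capture")
         for _ in range(10000)]
capture_rate = picks.count("capture") / len(picks)
```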

Progressive Pruning [Bouzy 2003]. Start sampling with a smaller sample size and prune statistically inferior moves, so that more sample games can be assigned to the promising moves.
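A sketch of the pruning test, under assumed constants: a move is dropped when its mean plus a confidence margin still falls below the best move's mean minus the same margin. The margin width `k` and the use of the standard error are illustrative choices, not taken from Bouzy's paper.

```python
import statistics

def prune_inferior(samples_by_move, k=2.0):
    """Keep only moves not statistically inferior to the current best.

    A move is pruned when its mean + k * (standard error) is below the
    best lower bound, i.e. max over moves of mean - k * (standard error).
    """
    stats = {}
    for move, scores in samples_by_move.items():
        mean = statistics.fmean(scores)
        stderr = statistics.pstdev(scores) / len(scores) ** 0.5
        stats[move] = (mean, stderr)
    best_lower = max(mean - k * stderr for mean, stderr in stats.values())
    return [m for m, (mean, stderr) in stats.items()
            if mean + k * stderr >= best_lower]

samples = {
    "A": [9, 10, 11, 10, 10, 9, 11, 10],  # clearly strong
    "B": [9, 11, 8, 12, 10, 9, 11, 10],   # overlaps A: keep sampling
    "C": [1, 2, 1, 3, 2, 1, 2, 1],        # statistically inferior: prune
}
kept = prune_inferior(samples)
```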

Experimental Design. Machine: Intel Xeon dual-CPU at 2.40 GHz with 2 GB memory; 64 PCs (128 processors) connected by a 1 GB/s network. Three versions of the program: BASIC (basic Monte Carlo Go), ATARI (BASIC + Atari-50 enhancement), ATARIPP (ATARI + progressive pruning). Experiments: 200 self-play games, plus analysis of decision quality on 58 professional games.

Diminishing Returns: 4N samples versus N samples per move.

Additional Enhancements and Winning Percentage

Decision Quality of Each Move. The "oracle" (64 million sample games) assigns an evaluation score to each point:

    a   b   c
1  20  17  10
2  25  30  15
3  12  21   7

Over 10 runs, the 100-sample-game Monte Carlo player selected 2b (score 30, the best move) 9 times and 2c (score 15) 1 time. The average error of one move is ((30 - 30) × 9 + (30 - 15) × 1) / 10 = 1.5 points.
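The average-error computation above is simple enough to state directly in code; the move names and score dictionary below just restate the slide's example.

```python
def average_error(oracle_scores, selections):
    """Average per-move error against the oracle: each selection costs
    (best oracle score) - (oracle score of the chosen move), weighted by
    how often that move was chosen."""
    best = max(oracle_scores.values())
    total = sum(count * (best - oracle_scores[move])
                for move, count in selections.items())
    return total / sum(selections.values())

# Slide example: oracle rates 2b at 30 (best) and 2c at 15; the
# 100-sample player chose 2b nine times and 2c once.
oracle = {"2a": 25, "2b": 30, "2c": 15}
error = average_error(oracle, {"2b": 9, "2c": 1})
# ((30 - 30) * 9 + (30 - 15) * 1) / 10 = 1.5
```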

Decision Quality of Each Move (Basic)

Decision Quality of Each Move (with Atari-50 Enhancement)

Summary of Experimental Results. Additional enhancements improve the quality of Monte Carlo Go. Diminishing returns appear eventually, and the enhancements reach diminishing returns faster. More samples need to be gathered in the early stage of a 9x9 Go game.

Conclusions and Future Work. Conclusions: additional samples achieve only small gains, unlike search algorithms (e.g., in chess). Good at strategy, not tactics; blunders come from the lack of domain knowledge. Easy to evaluate, easy to parallelize. The way for Monte Carlo Go to go: small sample games with many enhancements are the promising direction. Future work: adjust playout probabilities with pattern matching; learning; search + Monte Carlo Go, as in MoGo (exploration-exploitation in the search tree using UCT); scale to 19 × 19.

Questions? References: Go wiki http://en.wikipedia.org/wiki/Go_(board_game) · GNU Go http://www.gnu.org/software/gnugo/ · KGS Go Server http://www.gokgs.com · CGOS 9x9 Computer Go Server http://cgos.boardspace.net