AlphaGo Zero learns a complicated game without any human help

23 Oct 2017

DeepMind’s AlphaGo Zero can now beat the best Go players without any training or help from human players. This AI program is on top of its game. (Photo from DeepMind)

Google’s AlphaGo turned heads earlier this year when the DeepMind AI beat Go world champion Ke Jie, making it the world’s best Go player. It had previously beaten two of the game’s biggest champions in under a year. Now the AI is even more advanced in its latest iteration, AlphaGo Zero. So, what makes this version different from its predecessors? It learned to defeat the best Go players without any help from humans.

Previous versions of AlphaGo learned how to play the game by training on thousands of games played by champions. Once it had learned the game, the program played against different versions of itself to learn from its mistakes and figure out what it needed to do to win. AlphaGo Zero, by contrast, skipped the human games entirely and learned by playing against itself millions of times over. It found the best ways to win through reinforcement learning: moves that led to a win were rewarded, while moves that brought it closer to losing were discouraged.
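To make the self-play idea concrete, here is a minimal sketch on a toy game rather than Go. It is not DeepMind's method (AlphaGo Zero pairs a deep neural network with tree search); it only illustrates learning from wins and losses with no human examples. The game (take 1 to 3 stones from a pile, whoever takes the last stone wins), the lookup table `q`, and every function name and parameter below are assumptions made for illustration.

```python
import random
from collections import defaultdict


def legal_moves(stones):
    # A player may take 1, 2, or 3 stones, but never more than remain.
    return [m for m in (1, 2, 3) if m <= stones]


def best_move(q, stones):
    # Pick the move with the highest learned value (the first one wins ties).
    return max(legal_moves(stones), key=lambda m: q[(stones, m)])


def train(games=5000, max_stones=10, epsilon=0.2, lr=0.1, seed=0):
    """Learn move values purely from self-play (hypothetical toy setup)."""
    rng = random.Random(seed)
    q = defaultdict(float)  # (stones, move) -> estimated outcome in [-1, 1]
    for _ in range(games):
        stones = rng.randint(1, max_stones)
        player, history = 0, []
        while stones > 0:
            if rng.random() < epsilon:
                move = rng.choice(legal_moves(stones))  # explore
            else:
                move = best_move(q, stones)  # exploit what was learned
            history.append((player, stones, move))
            stones -= move
            winner = player  # whoever moved last took the final stone
            player = 1 - player
        # Reward every move made by the winner, penalize the loser's moves.
        for who, state, move in history:
            reward = 1.0 if who == winner else -1.0
            q[(state, move)] += lr * (reward - q[(state, move)])
    return q
```

After training, `best_move(train(), 3)` should pick 3, taking the whole pile for an immediate win. The same program plays both sides, so it only ever improves by beating its own earlier habits.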

After playing roughly five million games against itself, the updated AI program could defeat human players and the original AlphaGo. After 40 days of training, it had even surpassed AlphaGo Master. The program depends on a group of software neurons that are connected to form an artificial neural network. During each turn, the network examines the positions of the stones on the Go board and determines which moves might be made next and the chances of each leading to a win. The network updates itself after every game to make itself stronger for the next match.
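The two outputs described above, a set of candidate moves and a win estimate, can be sketched in a few lines. This is only an illustration under loud assumptions: the weights are random and untrained, each output is a single linear layer rather than a deep network, the board is a hypothetical 9-point toy instead of Go's 19x19 grid, and "legal move" simply means "empty point," which ignores real Go rules. None of the names below come from DeepMind.

```python
import math
import random

BOARD_POINTS = 9  # hypothetical 3x3 toy board; real Go has 19x19 = 361 points


def make_network(seed=0):
    # Random, untrained weights: one row per candidate move, plus a value row.
    rng = random.Random(seed)
    policy_w = [[rng.uniform(-1, 1) for _ in range(BOARD_POINTS)]
                for _ in range(BOARD_POINTS)]
    value_w = [rng.uniform(-1, 1) for _ in range(BOARD_POINTS)]
    return policy_w, value_w


def evaluate(network, board):
    """board: list of ints, +1 own stone, -1 opponent stone, 0 empty.

    Assumes at least one empty point remains.
    """
    policy_w, value_w = network
    scores = [sum(w * x for w, x in zip(row, board)) for row in policy_w]
    empty = [i for i, x in enumerate(board) if x == 0]
    # Softmax over empty points only, so occupied points get no probability.
    top = max(scores[i] for i in empty)
    exps = {i: math.exp(scores[i] - top) for i in empty}
    total = sum(exps.values())
    move_probs = {i: e / total for i, e in exps.items()}
    # tanh squashes the outcome estimate into [-1, 1]: -1 loss, +1 win.
    value = math.tanh(sum(w * x for w, x in zip(value_w, board)))
    return move_probs, value
```

Calling `evaluate(make_network(), [1, -1, 0, 0, 0, 0, 0, 0, 0])` returns a probability for each of the seven empty points and a single number estimating how the position will turn out. Training would consist of nudging the weights after each game so that both outputs better match what actually happened.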

Along with being more advanced, AlphaGo Zero is a simpler program. It was able to learn the game faster even though it trained on less data and runs on a smaller computer. While the program is only capable of kicking some major butt in Go, its creators see this as a milestone for general-purpose AIs. Most AIs aren’t capable of doing much aside from one specific task, like recognizing faces or translating languages. DeepMind believes that, with more work, AlphaGo Zero could help solve a number of real-world problems. Currently, the program is working out how proteins fold, which could greatly improve drug discovery.

But it’ll be a while before we see AlphaGo Zero apply itself to other issues. For now, it’ll continue to be the world’s best Go player. The human champions need to step up their game if they want to keep up.

Have a story tip? Message me at: cabe(at)element14(dot)com

http://twitter.com/Cabe_Atwell