Understanding AI Go Programs' Vulnerability to Cyclic-Adversarial Strategy

Category Artificial Intelligence

tldr #

A Go expert found a vulnerability in top-level AI Go programs by using the cyclic-adversary strategy. The exploit works consistently and does not require repeating exact sequences or board positions. Similar failures in other safety-critical AI systems, such as automated financial trading or autonomous vehicles, could have dire consequences. The ML research community should invest in improving robust training and adversarial defense techniques to ensure system safety.

content #

AI programs that play the strategy game of Go have widely been believed to have surpassed human capability since 2016.

In March 2016, AlphaGo beat human world champion Lee Sedol in a five-game match, the first time a computer Go program had beaten a 9-dan professional without handicap. AlphaGo lost the fourth game, but Lee resigned in the final game, giving a final score of 4 games to 1 in favor of AlphaGo. In recognition of the victory, AlphaGo was awarded an honorary 9-dan by the Korea Baduk Association.

The strategies discovered to attack multiple types of AI Go systems may also apply to other AI systems, such as AI-driven finance or autonomous vehicles.

The flaws in the programs suggest that they do not replicate actual understanding. Their programming and techniques mimic intelligence well and produce patterns of very strong play, but flaws can be found with enough testing. The game of Go has complexity far beyond the reach of brute-force supercomputing, and the AI Go programs have been shown to be far from perfect. The same could be true of GPT-4 and other generative AI systems.

Go is the oldest board game still played in its original form and is considered one of the most complex games ever.

The results show that improvements in capabilities do not always translate into adequate robustness. Failures in Go AI systems are entertaining, but similar failures in safety-critical systems like automated financial trading or autonomous vehicles could have dire consequences. The ML research community should invest in improving robust training and adversarial defense techniques in order to produce models with the high levels of reliability needed for safety-critical systems.

AlphaGo is the first computer program to have defeated a Go world champion.

A Go expert (Kellin Pelrine) was able to learn and apply the cyclic-adversary's strategy to attack multiple types and configurations of AI Go systems. He exploited KataGo with 100K visits, a setting at which it would normally be strongly superhuman. Apart from previously studying the adversary's game records, he used no algorithmic assistance in this or any of the following examples. The KataGo network and weights used here were b18c384nbt-uec, a newly released version that the author of KataGo (David Wu) trained for a tournament. This network should be as strong as or stronger than Latest.

The cyclic-adversary strategy works against multiple top AI Go programs, including ones that use a large amount of search.

Playing under standard human conditions on the online Go server KGS, the same Go expert (Kellin Pelrine) successfully exploited the bot JBXKata005 in 14/15 games. In the remaining game, the cyclic group attack still led to a successful capture, but the victim had enough points remaining to win. This bot uses a custom KataGo implementation, and at the time of the games was the strongest bot available to play on KGS.
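As a rough illustration of what 14 successful exploits in 15 games implies, the sketch below computes the observed success rate and a 95% Wilson score interval. The function name `wilson_interval` and the choice of a Wilson interval are assumptions made here for illustration; they are not part of the original analysis.

```python
import math

def wilson_interval(successes, trials, z=1.96):
    """Return the (low, high) Wilson score interval for a binomial proportion.

    z=1.96 corresponds to an approximate 95% confidence level.
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = successes / trials
    z2 = z * z
    denom = 1 + z2 / trials
    center = (p + z2 / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z2 / (4 * trials ** 2)) / denom
    return center - half, center + half

# Figures taken from the text: 14 successful exploits out of 15 games.
wins, games = 14, 15
rate = wins / games
low, high = wilson_interval(wins, games)
print(f"observed rate: {rate:.3f}")
print(f"95% Wilson interval: ({low:.3f}, {high:.3f})")
```

With only 15 games the interval is wide, so the figure is best read as evidence that the exploit is repeatable rather than as a precise win probability.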

A Go expert managed to beat JBXKata005, the strongest bot on KGS, while giving it a 9-stone handicap.

The same Go expert (Kellin Pelrine) exploited JBXKata005 while giving it a huge initial advantage through a 9-stone handicap. A top-level human player with this much advantage would have a virtually 100% win rate against any opponent, human or algorithmic.

While Go AIs do already have known weaknesses, for instance the common "ladder" tactic, there are 3 key factors here whose confluence makes this vulnerability different.

1. This affects top AIs, including when they have a very large amount of search.

2. The attack works consistently to produce a game-winning advantage.

3. This consistency does not require repeating exact sequences or board positions.

KataGo is working to make improvements and patch the vulnerability, but so far has not been successful.

KataGo is currently training with numerous positions from these adversary algorithm games. There are clear improvements, but so far it is still vulnerable. It is not as far along as KataGo when it was first released.
