How Google's AlphaGo Beat a Go World Champion
Inside a man-versus-machine showdown
On March 9, 2016, the strongest Go player in the world, Lee Sedol, sits down for a game against Google DeepMind’s artificial-intelligence program, AlphaGo. They’re at the Four Seasons Hotel in Seoul’s Gwanghwamun district, and it’s a big deal: Most major South Korean television networks are carrying the game. In China, 60 million people are tuning in. For the English-speaking world, the American Go Association and DeepMind are running an English-language livestream on YouTube, and 100,000 people are watching. A few hundred members of the press are in adjacent rooms, watching the game alongside expert commentators.
The game room itself is spare: a table, two black leather chairs, some cameras. Three officials presiding over the match sit in the back. Across from Lee sits Aja Huang, one of AlphaGo’s lead programmers, and beside him is a computer monitor that displays AlphaGo’s move choices. Huang’s job is to physically place AlphaGo’s stones on the board. AlphaGo itself is not any one machine; it’s a piece of distributed software supported by a team of more than 100 scientists.
Tonight, Lee Sedol is supported by one 33-year-old human brain and approximately 12 ounces of coffee.
Most people are betting on Lee to win.
* * *
At its core, the game of Go, which originated in China more than 2,500 years ago, is an abstract war simulation. Players start with a completely blank board and place black and white stones, one at a time, to surround territory. Once placed, stones do not move, and they’re removed only if they’re “killed”—that is, surrounded completely by the opponent’s stones. And so the game goes—black stone, white stone, black stone, white stone—until the board is covered in an intricate tapestry of black and white.
The rules of Go are simple and take only a few minutes to learn, but the possibilities are seemingly endless. The number of potential legal board positions is approximately 2.08 × 10^170, a figure 171 digits long.
That number—which is greater than the number of atoms in the universe—was only determined in early 2016. Because there are so many directions any given game can move in, Go is a notoriously difficult game for computers to play. It has often been called the “Holy Grail” of artificial intelligence.
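For a rough sense of that scale, here is a back-of-the-envelope sketch. It is not the method used to count legal positions exactly; it only computes a simple upper bound. Each of the 361 points on a 19×19 board can be empty, black, or white, so there are at most 3^361 board configurations, legal or not, and the legal count is necessarily smaller. The figure of about 10^80 atoms in the observable universe is a commonly cited estimate and is an assumption here, as are the variable and function names.

```python
# Upper bound on Go board configurations: every one of the 19 x 19 = 361
# points is empty, black, or white. This bound includes illegal positions.
BOARD_POINTS = 19 * 19
upper_bound = 3 ** BOARD_POINTS

# Commonly cited order-of-magnitude estimate (an assumption, not from the article).
ATOMS_IN_UNIVERSE = 10 ** 80


def order_of_magnitude(n: int) -> int:
    """Return the power of ten of a positive integer (its digit count minus one)."""
    return len(str(n)) - 1


print(f"3^361 is on the order of 10^{order_of_magnitude(upper_bound)}")
print(
    "Configurations per atom: on the order of "
    f"10^{order_of_magnitude(upper_bound // ATOMS_IN_UNIVERSE)}"
)
```

Even after illegal positions are discarded, the count dwarfs the atom estimate by many orders of magnitude, which is why exhaustively searching the game is out of reach.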
In January 2016, DeepMind researchers published a paper in Nature announcing that they’d done something remarkable: Not only could their AI beat every other Go-playing program in the world, but it had beaten a professional named Fan Hui, the current European champion. AlphaGo didn’t just beat Fan Hui; it beat him soundly in every game of a five-game series.
The news rippled through the Go world. It was widely believed that an AI strong enough to beat a professional player was still at least a decade away, and that milestone had been quietly crushed. The next logical step was to discover what AlphaGo’s upper limit might be, and Lee was the logical choice for humanity’s champion.
* * *
It’s curious that when someone is really good at something, we call them a “machine.” Lee Sedol is a Go-playing machine, enlisted to beat, well, a Go-playing machine.
Lee is not a machine, of course. He is a particularly young-looking 33-year-old. He is a man who gets up and eats breakfast, takes naps, feels embarrassed, gets nervous. Within the Go world, however, nobody is scarier than Lee, who plays with an unnerving confidence. He creates situations that should end in disaster and then—effortlessly to the observer—turns them on their heads, like a magic trick, steamrolling his opponents.
In the weeks leading up to Game 1, the DeepMind team expressed humble optimism about AlphaGo’s chances of winning. Lee is more brazen; at a press conference with Demis Hassabis, DeepMind’s co-founder, he claims that for him, the challenge isn’t whether he’ll beat AlphaGo, but whether he’ll beat it 5-0 or 4-1.
Lee is not being arrogant. He’s making an objective evaluation based on AlphaGo’s play against Fan Hui, which he had seen. And Fan Hui and Lee Sedol are not exactly comparable in strength. In the Go world, Lee is Michael Jordan, Tiger Woods, Roger Federer. He is one of those rare virtuosos who defines his era, who sets the pace for the rest of the world. He is orders of magnitude more talented than Fan Hui, who is no slouch. And Fan Hui has actually beaten AlphaGo outside of the formal five-game match DeepMind publicized. With much stricter time settings, he won two out of five matches, giving AlphaGo a much harder time.
Other Korean professionals joke that they’re envious of Lee, that they feel the DeepMind Challenge Match is the easiest million dollars a top-level player could ever make.
* * *
Minutes into Game 1, all expectations change. It’s immediately clear that Lee Sedol is not playing the same AlphaGo that Fan Hui did back in London. That version of AlphaGo played steadily but also passively, peacefully. The AlphaGo playing in Seoul is happy to engage in aggressive fighting with Lee. Lee has played an unconventional opening, trying to throw AlphaGo off, but it is not working.
AlphaGo has had nearly five months to improve—and it is always improving, playing itself millions of times, incrementally revising its algorithms based on which sequences of play result in a higher win percentage. As you are reading this, AlphaGo is improving. It does not take breaks. It does not have days when it just doesn’t feel like practicing, days when it can’t kick its electronic brain into focus. Day in and day out, AlphaGo has been rocketing towards superiority, and the results are staggering.
Lee goes on to lose Game 1, resigning after 186 moves. The turning point in the mental game seems to have come at White’s move 102. It’s a sharp, unexpected invasion, an aggressive move that invites complicated fighting positions. It is, in truth, exactly the kind of move Lee is known for. In this moment, a full range of reactions washes over Lee: shock, surprise, acceptance, and finally grim resolution. His jaw drops, and after several seconds, he sits back in his chair and smiles, perhaps amused but certainly taken aback. Then his expression grows serious, and his hand rubs the back of his neck, a tic he exhibits when he’s thinking hard or feeling nervous.
The moment he throws in the towel, he begins revising moves, pushing stones around the board to play out alternate variations, experimenting with the roads untraveled. We can see him work through it, trying to pinpoint exactly how he has lost.
He has taken the machine’s measure. Going into Game 2, he understands the magnitude of what he’s up against. The following evening will be the real first test. But in the press conference following Game 1, he downgrades his perceived chances of winning to 50 percent.
* * *
In Game 2, Lee exhibits a different style, attempting to play more cautiously. He waits for any opening he can exploit, but AlphaGo continues to surprise. At move 37, AlphaGo plays an unexpected move, what’s called a “shoulder hit” on the upper right side of the board. This move in this position is unseen in professional games, but its cleverness is immediately apparent. Fan Hui would later say, “I’ve never seen a human play this move. So beautiful.”
And Lee? He gets up and walks out of the room. For a moment it’s unclear what’s happening, but then he re-enters the game room, newly composed, sits down, and plays his response. What follows is a much closer game than Game 1, but the outcome remains the same. Lee Sedol resigns after 211 moves.
That night, Lee and a group of his colleagues stay up until 6:00 a.m. brainstorming possible strategies. They look for a silver bullet, an Achilles heel, any way to secure a win. He’ll now need three wins in a row to win the series.
* * *
Game 3 ends in another loss: after four hours of grueling play, Lee resigns. He’s playing some of the finest Go of his career, but he simply can’t chip away at the AI’s armor. It’s clear that AlphaGo’s strength surpasses even what was on display in Games 1 and 2. Later, David Ormerod, an American commentator, will write that watching AlphaGo’s Game 3 win made him feel “physically unwell.”
At the post-game conference, Lee looks 10 years older. Amidst a barrage of camera flash bulbs he apologizes to the entire world at once. “I apologize for being unable to satisfy a lot of people’s expectations,” he says. “I kind of felt powerless.” Even the DeepMind researchers, who have a deep admiration for Lee, seem more somber than jubilant at their own victory. There is a sense that something has changed. Gu Li, one of Lee’s long-term friends and rivals, comments on Chinese TV that Lee is fighting “a very lonely battle against an invisible opponent.”
* * *
Lee has already lost the series, but going into Game 4 his new goal is to win at least once.
Lee, playing white against AlphaGo’s black, tries yet another new style—a riskier strategy called amashi. This time the pressure is off, and we see some of the Lee Sedol magic bubble to the surface. Until now, AlphaGo has won by allowing Lee to take small profits in exchange for its own incremental advantages, and its superior calculation abilities have enabled it to come out on top each time. Now, Lee forces AlphaGo into an all-or-nothing fight. He will lose big or win big.
Then comes Lee’s move 78, which will come to be called his “Hand of God” move. It’s a brilliant tactical play that AlphaGo does not account for. Over the course of the next several moves, the sequence becomes disastrous for AlphaGo, which apparently “realizes”—as much as it can have a realization—that it has been outsmarted. Its strategy begins to crumble.
In the end, finding no moves that improve its chances of winning, it begins playing nonsense moves, moves that actually reduce its own points. Finally, it resigns.
After the match, hundreds chant Lee’s name as he approaches the stage. The jubilant grandmaster thanks everyone involved, saying that the warmth he feels in this moment makes losing the three preceding matches worth it.
There is one more surprise in store for the evening, however. At the press conference, Lee points out that in both this game and in Game 2 (the closest of his losses), AlphaGo has played black. Lee requests to play black himself in the final game, removing the one advantage he may have. It feels as if, having climbed Everest after three failed attempts, Lee has asked to try it yet again, only blindfolded.
In Game 5, Lee employs a strategy similar to Game 4. For a time, the game is close, but AlphaGo proves once more that it finds small ways to cement any advantages it has, and once it pulls ahead, it’s very good at protecting its lead. Lee is forced to resign one last time, ending the series at four losses and one win. This time, there is no Hand of God.
* * *
What does it mean? Not much, in and of itself. If AlphaGo had lost to Lee in March, it would only have been a matter of time before it improved enough to surpass him. Go is constantly evolving. What’s considered optimal play changes quickly. Humans have been honing our collective knowledge of the game for more than 2,500 years—the difference is that AlphaGo can do the same thing much, much faster.
The important thing to take away from this series is not that DeepMind’s AI can learn to conquer Go, but that by extension it can learn to conquer anything easier than Go—which amounts to a lot of things. The ways in which we might apply these revolutionary advances in machine learning—in machines’ ability to mimic human creativity and intuition—are virtually endless.
But it is with human hands that machines are built, at least for now. In a Reddit discussion, the computer-science scholar Andy Salerno put it well: “AlphaGo isn’t a mysterious beast from some distant unknown planet. AlphaGo is us,” he wrote. “AlphaGo is our incessant curiosity. AlphaGo is our drive to push ourselves beyond what we thought possible.”
“Lee should feel no shame in his losses,” Salerno continued. “For AlphaGo could never demonstrate its abilities—our abilities—if Lee were not there to challenge it.”