SciTech

Two-player limit Texas hold’em poker solved

Michael Bowling, a Carnegie Mellon University School of Computer Science alumnus, helped develop a computer program to solve a version of poker. (credit: Michael Bowling) Michael Bowling, a Carnegie Mellon University School of Computer Science alumnus, helped develop a computer program to solve a version of poker. (credit: Michael Bowling) Michael Bowling, a Carnegie Mellon University School of Computer Science alumnus, helped develop a computer program to solve a version of poker. (credit: Michael Bowling) Michael Bowling, a Carnegie Mellon University School of Computer Science alumnus, helped develop a computer program to solve a version of poker. (credit: Michael Bowling) Tuomas Sandholm, a professor of computer science at Carnegie Mellon, is helped create a computer program to solve a version of poker. (credit: Tuomas Sandholm) Tuomas Sandholm, a professor of computer science at Carnegie Mellon, is helped create a computer program to solve a version of poker. (credit: Tuomas Sandholm)

There’s a new player in the world of poker, and he doesn’t wear a baseball cap or sunglasses. This new player is Cepheus, a computer program developed by Carnegie Mellon University School of Computer Science alumnus Michael Bowling. According to a paper published in the Jan. 9 issue of Science, Cepheus has essentially solved heads-up limit Texas hold’em.

This version of poker is a two-player game in which all bets are a fixed size, creating a game which is made up of approximately 1014 decision points — unique possible positions a player can find himself in where he is forced to make an action — as opposed to the 10161 decision points more complicated versions of the game entail. “It’s the smallest game that humans play in poker,” Bowling explained. “But it’s also the first imperfect information game [competitively played by humans] to be solved.”

Imperfect information games, as the name suggests, are those in which the entire state of the game is not known to a player when he makes a decision. While large games such as checkers have already been solved, the perfect information nature of checkers makes it much easier to solve than games like poker.

In order to solve heads-up limit Texas hold’em, Bowling’s group started Cepheus off by having it play against itself randomly. After each play, it computed how much money it could have made if it had played differently; in other words, it calculated its “regret” at each decision point and then updated its strategy accordingly, getting smarter and smarter with each cycle. This method is called counterfactual regret minimization. “If you do that strategy update in a particular mathematical way, you can prove that its regret has to go to zero over time…and that means its strategy is approaching a perfect strategy,” Bowling said.

Bowling’s group ran Cepheus for two months, during which time they had 48 hundred central processing units (CPUs) working on the strategy, with each CPU processing about 6 billion poker hands every second. By the end of this time, the strategy was statistically indistinguishable from a perfect strategy.

So what can we do with an undefeatable poker genius? Tuomas Sandholm, professor of computer science at Carnegie Mellon University and a member of Bowling’s Ph.D. committee while he was a graduate student, explained that poker is not just a game. “Poker is a benchmark… which we can use to measure progress in the field of solving imperfect information games. And the real applications that we’re targeting are things like security applications, cybersecurity, auctions, negotiations, medical applications, and so forth.” In fact, the Association for the Advancement of Artificial Intelligence (AAAI) has been holding annual poker competitions since 2006, where researchers from around the world compete with each other in various forms of poker and discuss their algorithms.

In the field of cybersecurity, Sandholm explained how imperfect information algorithms can be useful in wireless jamming, when a communicator needs to decide when to communicate and with what power level, and a jammer needs to decide when to jam and with what power level. He and his collaborators have published on this topic.

These algorithms also have various different medical applications. “One is on the population level,” Sandholm said. “So think about a pandemic coming through and you have to decide what segments of the population, defined by geography or some other aspect of the people, you should vaccinate or quarantine, and when to do it.”

This type of strategy can also be used for sequential treatment plans, such as for patients of HIV.

A third medical application is in drug design. “A problem in drug design is that the disease that the drug is intended to battle will also adapt to develop a resistance to the drug. So here the idea is that we can have models of the disease response and then we can actually bundle into the drug something that will already hit the evolved version,” Sandholm said.

There is still a lot of progress to be made in the field of imperfect information games, such as developing artificial intelligence that can efficiently play poker with more than just two players or with no-limit bets.

While it will be very difficult to completely solve poker, computer scientists are getting closer to solving other versions of the game. “I think we’re not far off,” Bowling said. “I think as a community we’re a few years away from being able to exceed the best in the world at heads-up no limit Texas hold’em.”

Sandholm and his students currently have the best program for this harder form of poker and won the Annual Computer Poker Competition in 2014.