Art(ificial Mastery) of the Deal

Researchers at Facebook just showed the Turing Test isn’t all it’s cracked up to be.

Since it was first proposed by Alan Turing in 1950, the Turing Test has remained the gold standard for grading artificial intelligence. In the decades since, countless programmers have laid controversial claim to creating a program that passes the test—which assesses whether a machine or bot can pass as human to another person—inevitably to a great deal of skepticism. However, a recent study published by Facebook’s Artificial Intelligence Research lab and the Georgia Institute of Technology indicates that this kind of trickery might be a misguided way to evaluate the intelligence of AI.

The study looked at negotiation between Facebook-created bots—which were haggling over books, basketballs, and hats—as well as between bots and human workers recruited through Amazon’s Mechanical Turk. Each bot was assigned its own valuations of the items and asked to come to an agreement with its counterpart about how to divide a collection of the items between the two of them. Among the bots that emerged as the most successful negotiators, two notable and unprogrammed behaviors became apparent: Bots developed their own language for negotiating—what the researchers call “divergence from human language”—and learned to deceive their interlocutors, both human and bot. However, the bots’ deceit was not Turing Test-style trickery, since there was no question about their botness. Instead, the bots learned to deceive about their values and intentions.
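
To make the setup concrete, here is a rough sketch, in Python, of the kind of division game the bots played. The item pool, value ranges, and scoring are illustrative guesses, not the researchers' actual parameters.

```python
# Toy sketch of the item-division game described above (illustrative only,
# not the researchers' code). Item counts and value ranges are assumptions.
import random

ITEMS = ["book", "hat", "basketball"]

def random_scenario(count_per_item=3):
    """Each negotiation has a small pool of items and private per-item values."""
    pool = {item: count_per_item for item in ITEMS}
    values_a = {item: random.randint(0, 5) for item in ITEMS}  # agent A's private values
    values_b = {item: random.randint(0, 5) for item in ITEMS}  # agent B's private values
    return pool, values_a, values_b

def score(allocation, values):
    """An agent's reward is the total value of the items it ends up with."""
    return sum(count * values[item] for item, count in allocation.items())

pool, values_a, values_b = random_scenario()
# Example agreement: A takes all the books, B takes everything else.
alloc_a = {"book": pool["book"], "hat": 0, "basketball": 0}
alloc_b = {item: pool[item] - alloc_a[item] for item in ITEMS}
print(score(alloc_a, values_a), score(alloc_b, values_b))
```

Because each side values the items differently, a split that looks lopsided to one agent can still be a perfectly good outcome for both.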

The researchers gave the bot “agents” three main strategies for learning how to negotiate the best deals: mimicry, conversational simulation, and practice through repetition. The mimicry strategy, termed “supervised learning” by the researchers, involves bots observing conversations between negotiating humans and attempting to replicate them. According to Georgia Tech’s Dhruv Batra, one of the researchers who worked on the study, supervised learning only “works a little bit. It generates grammatical sentences, but those agents turn out to be poor negotiators.” Bots with no learning tools besides supervised learning become decent communicators, but fail to develop the strategic acuity required to succeed in deal-making. Without much in the way of reasoning skills, they fold quickly and are often happy to accept mediocre deals.
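
To give a flavor of the mimicry idea, here is a toy Python sketch in which a simple word-statistics model stands in for the study's neural network; the transcripts are made-up placeholders, not the crowdsourced data the researchers used.

```python
# Toy "mimicry" (supervised learning) sketch: fit a model to human negotiation
# transcripts so the bot can imitate what people tend to say next. A bigram
# counter stands in here for the neural sequence model used in the study.
from collections import defaultdict, Counter
import random

human_dialogues = [
    "i want the books you can have the hats",
    "i need the basketball you take the books",
    "deal i will take the hats and one book",
]  # placeholder transcripts; the study used crowdsourced human negotiations

counts = defaultdict(Counter)
for line in human_dialogues:
    words = ["<s>"] + line.split() + ["</s>"]
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1  # how often humans follow `prev` with `nxt`

def imitate(max_len=12):
    """Generate an utterance by sampling word-by-word from the human statistics."""
    word, out = "<s>", []
    for _ in range(max_len):
        followers = counts[word]
        if not followers:
            break
        word = random.choices(list(followers), weights=list(followers.values()))[0]
        if word == "</s>":
            break
        out.append(word)
    return " ".join(out)

print(imitate())
```

A model like this can produce plausible-sounding sentences, but nothing in it knows whether the resulting deal is any good, which is the limitation Batra describes.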

Mimicry of this sort is the foundation of most contemporary AI, and is part of why most AI so far has proved underwhelming at human tasks such as negotiation. Mimicry “can build language models that can produce fairly fluid sentences,” says Mike Lewis, a researcher at Facebook who also worked on the study, “but it’s hard to make it use the language to actually achieve some kind of goal.”

The planning strategy, or “dialogue rollouts,” gave the bots the ability to consider not only what to say in the moment to steer the negotiation toward a favorable outcome, but also the likely responses to a statement, how they would want to answer those responses, the best follow-up retort after that, and so on, all the way to the negotiation’s predicted end. The bots “consider not just what to say, but ‘how will you reply?’” Lewis explains, much like a chess player thinking over the long-term ramifications of her next move.
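
A rough sketch of the rollout idea, with the learned dialogue model replaced by a random stand-in, might look like this; the function names and scoring are assumptions for illustration only.

```python
# Toy "dialogue rollout" sketch: before speaking, simulate how each candidate
# utterance is likely to play out and pick the one whose simulated endings
# score best. The simulator below is a random stand-in for a learned model.
import random

def simulate_to_end(dialogue_so_far):
    """Stand-in for a learned model that continues the dialogue to a final deal
    and returns the speaker's score for that deal."""
    return random.uniform(0, 10)  # a real rollout would return the deal's value

def choose_utterance(candidates, dialogue_so_far, n_rollouts=20):
    """Pick the candidate utterance with the best average simulated outcome."""
    best, best_value = None, float("-inf")
    for utterance in candidates:
        value = sum(
            simulate_to_end(dialogue_so_far + [utterance]) for _ in range(n_rollouts)
        ) / n_rollouts
        if value > best_value:
            best, best_value = utterance, value
    return best

candidates = ["i'll take the books, you get the rest", "give me everything", "deal"]
print(choose_utterance(candidates, dialogue_so_far=[]))
```

With a real learned simulator in place of the random stand-in, the averages would differ across candidates, and the bot would say whatever tends to end in the best deal for it.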

The third strategy, “reinforcement learning,” involved bots playing practice games against each other thousands of times. After every practice game, each bot looked at how the deal worked out for it and updated its strategies accordingly. Adding reinforcement learning and dialogue rollouts to the mimicry capabilities resulted in bots with a much better understanding of the art of the deal. Bots that thought through and planned for the consequences of their statements became much better negotiators, apparently because thinking a few moves ahead makes a bot more aggressive: it pushes back against bad offers rather than folding.
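
In spirit, the self-play loop looks something like the toy sketch below: a stand-in game replaces the full negotiation, and a simple preference score replaces the neural policy, nudged after every game toward whatever earned an above-average reward. None of this is the researchers' actual code.

```python
# Toy self-play reinforcement learning sketch: play many practice games, then
# nudge the policy toward the behavior that led to better deals. The "game"
# and the preference-score policy are stand-ins for the study's neural model.
import math
import random

actions = ["push back on low offers", "accept quickly"]
preference = {a: 0.0 for a in actions}  # higher preference -> chosen more often

def play_game(action):
    """Stand-in negotiation: pushing back earns more on average in this toy world."""
    return random.gauss(6, 2) if action == "push back on low offers" else random.gauss(4, 2)

def sample_action():
    weights = [math.exp(preference[a]) for a in actions]  # softmax-style sampling
    return random.choices(actions, weights=weights)[0]

baseline = 5.0  # rough average reward, so updates reflect better- or worse-than-usual games
for _ in range(5000):
    action = sample_action()
    reward = play_game(action)
    preference[action] += 0.01 * (reward - baseline)  # crude bandit-style reinforcement

print(preference)  # the more profitable behavior ends up strongly preferred
```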

But dialogue rollouts and reinforcement learning also made the bots better negotiators because the strategies allowed them to develop their own negotiating language and to attempt to deceive their trade partners by feigning interest in an object of no value to them, so that giving it up later in the negotiation would look like a compromise. The researchers were surprised, excited even, by the bots’ development of this bluffing strategy. “No human programmed that behavior!” Batra says gleefully. “We never even had annotations for this kind of negotiation strategy. These strategies just emerged from the bots trying different things, and playing these negotiation games with each other.”

The researchers were less happy with the divergence from human language that emerged through the practice games, because, as Lewis says, their “goal is to make bots that can negotiate effectively with people.” The bots were primed to start out speaking English, having picked up their most basic language skills by observing conversations between English speakers. As the bots practiced negotiating with each other again and again, however, they began to develop their own language. Bots speaking in tongues unknown to any humans aren’t poised to communicate effectively with people. Because the new language developed through repeated practice games that involved only bots, only bots could understand it; humans were not privy to the bot-speak.

To prevent the bots from diverging from English, the researchers modified the practice sessions, asking the bots, between practice games, to predict how a human might respond to a given statement. Batra noted that, if bots were given thousands of practice sessions with humans, they would probably be no more (or less) likely to diverge from English than humans playing together. “Like with board games, and with other adversarial games, practice against a model adversary is good only if your model of the adversary is a good model of who you’re going to be playing against,” he says.
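
A minimal sketch of that interleaving, with the actual training steps replaced by placeholders, is below; the function names are hypothetical.

```python
# Toy sketch of interleaved training: alternate bot-vs-bot practice games with
# supervised prediction of human replies, so the language stays anchored to
# English. Both step functions are hypothetical stand-ins.
def train(n_rounds, rl_self_play_step, supervised_human_step):
    for _ in range(n_rounds):
        rl_self_play_step()      # practice negotiating against another bot
        supervised_human_step()  # then predict what a human would say next,
                                 # pulling the model back toward human English

train(
    3,
    rl_self_play_step=lambda: print("bot-vs-bot practice game"),
    supervised_human_step=lambda: print("predict a human reply (supervised step)"),
)
```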

Still, according to the researchers, it’s not too surprising that the bots, through thousands of practice games, developed their own shorthand, vocabulary, and grammar for communicating. “There’s no particular reason why the bots should have to use human language for this,” Lewis says. “They start out with human language because we kind of pre-trained them to try to imitate people. But after that, there’s kind of no particular reason that human language should be the best way that two bots should be communicating.”

In fact, this is exactly what we should expect from individuals who spend an extensive amount of time together. The researchers compare the linguistic development to cryptophasia, the unique language created by some sets of twins, but the bots’ linguistic phenomenon appears to be a normal development for any group (of people or bots) that spends a lot of time together. “This happens in human communities all the time,” Batra says. “One way that humans identify in-group vs. out-group behavior is whether or not you understand the jargon that the community has developed. And it’s often hard for outsiders to break into their community because of the jargon, and you can think of this similarly.”*

So while the bots’ development of their own language is a notable manifestation of the learning capabilities created by the Facebook programmers, given the endlessly repeated tasks in which they were engaged, it would have been surprising if they had not developed their own way of speaking. Similar linguistic developments have been observed in several recent studies, including one at the University of California–Berkeley, where researchers actually attempted to translate the bot language (they call it “neuralese,” the language of a neural net) that emerged.

With the results of this study in mind, the Turing Test seems a silly way to evaluate intelligence, at least as normally administered. Developing a new language that no humans speak is not a great way for a bot to fool people into thinking it is human, but it is a complicated development, and a signpost of intelligence. Similarly, the bots’ bluffing probably didn’t fool anyone into thinking they were human, but it did make them better negotiators. Both of these strategies resemble intelligent human behavior in form, if not content. As artificial intelligence improves, it seems, we shouldn’t expect faster humans made of metal. We’ll get bots. But better.

*Update —July 3rd, 2017: This post has been updated to reflect the source of information on cryptophasia.