Google's New Chatbot Won't Shut Up—And That's a Good Thing

Google's new chatbot, the Google search assistant, offers a preview of how you'll talk to the virtual assistant as it rolls out across the rest of Google's products.

Once, not too long ago, humans couldn’t talk with their computers. You could talk at them---or, really, type at them---and they’d respond like computers. Like machines. They didn't exactly converse. And that was fine. You didn't expect them to.

Today things are different. On Wednesday at Google I/O, its blockbuster annual conference, Google unveiled two new artificially intelligent products---a messaging app called Allo and an Amazon Echo-like device called Google Home---that rely on a "conversational user interface." You talk, they talk back... and they do what you tell them, and maybe more.

Conversational user interfaces aren't a new idea; computer scientists have been experimenting with the technology for decades. But the approach has found new life in virtual assistants like Apple's Siri and in chatbot-inhabited messaging apps like Facebook Messenger. When these interfaces work well, they can answer questions, schedule meetings, check your bank balance, and even pay your rent. The potential is enormous. But if Siri has ever given you directions to someplace you didn't ask for, or the Echo has told you it couldn't answer your question, you're familiar with the challenges. Google's engineers and designers are on a mission to confront those challenges and fulfill the potential of the conversational interface.

One of the biggest problems users have with these increasingly popular interfaces is figuring out what they do---what designers call "discoverability." “If you want people to gradually adopt a new interface, it has to be really easy for it to work the first time," says Alan Black, a computer scientist at Carnegie Mellon University's Language Technologies Institute. "And that’s hard, because if people don’t know what the system can do, they don’t know how to speak to it."

Google's chatbot, called the Google search assistant, will show up in Allo when it comes out later this year, and will eventually live across a range of products. It handles the discoverability problem by being super-chatty. With any incoming message, it suggests replies, links, or actions based on the context of your conversation. Google calls them "suggestion chips." If, from inside Allo, you ask the Google search assistant how to roast a chicken, it not only answers your question, it might also serve up suggestion chips you can tap to browse recipes or complementary dishes.

According to Rebecca Michael, head of marketing for Google's communication products, suggestion chips are designed to be personal, conversational, and to "keep the conversation flowing." But there's a lot more going on here. For starters, suggestion chips can dramatically reduce the number of steps it takes to get things done on your phone, whether that's typing out a response or booking a reservation. They also help you discover what the Google search assistant is capable of. Most importantly, they give you a sense of how you might talk to the virtual assistant as it rolls out to the rest of Google. "It will be in the context of a user's daily life," Google CEO Sundar Pichai said at I/O. "It will be on their phones, devices they wear, in their cars, and even in their living rooms."

That's the big goal, really: to have a bot that can be everywhere, while adapting to specific devices and contexts. “The idea is that assistant should really be bound to you and not to a device and it should really transcend the hardware and follow you around,” says Vlad Sejnoha, chief technology officer of Nuance, a leading developer of voice interface technology.

Without a graphical component like a suggestion chip, the challenges of conversational interactions are magnified. Consider the voice-only interface of Google Home. Ask the Google search assistant, in Allo, for a list of nearby restaurants, and it can show them to you. Google Home can't. It could list them aloud, but most people find it difficult to parse that kind of information in their heads. This is what makes graphical user interfaces so useful; they reduce the load on your working memory by putting information in front of you, instead of in your brain. The absence of a GUI, says Mark Rolston, co-founder and chief creative of Argodesign, a design studio exploring similar conversational UI problems, is why "we've restricted voice systems to some really simple, handy things."

For now, that's fine. It's enough to ask Alexa to order more paper towels, and leave it at that. Eventually, though, users will expect more. Sentences will get longer, requests more complex. Rolston says that as the capabilities of voice-controlled products increase, so too will the complexity of those interactions. That'll require bots to have not only a deeper grasp of natural language, but also a better sense of the limits of human cognition. Answering even a simple question like "what are the closest coffee shops to me?" becomes a challenging interface problem when the answer is delivered aloud. One solution might be to pair the answers with a nearby screen. Indeed, Google envisions Google Home integrating with televisions around the house and borrowing their screens on an ad hoc basis. It's not hard to imagine how virtual assistants might soon piggyback on the displays of any number of devices that surround us, enlisting the help of our phones, tablets, computers, or wearable devices as the situation dictates.

In the longer term, the conversational interface will have to disappear entirely. Or rather, it'll have to be so sophisticated that you don't notice it. It should be so intuitive, so human-like, that people don't need to learn how to interact with it. As Hector Ouilhet, a senior staff designer at Google, once told me: "The thing that excites me the most is building a future that my daughter can just use---without learning how to use it." That's a lofty goal, and one that Google probably won't achieve any time soon. But that's the beauty of being able to talk to our machines---someday, learning to use them will feel a lot less like reading a manual and a lot more like talking to a really smart friend.