Voice is the primordial human medium. Newborns recognize their mother’s voice the moment they’re born, having heard a muffled version of it in utero. In extremis, we scream or cry for help or joy. Even our most abstractly textual or computerized communications are framed as “conversations,” mimicking the kind of face-to-face dialogue—rich with body language, subtext, emotional warmth, and innuendo—whose increasing absence has spawned a hundred virtual substitutes. And now that our digital platforms are finally sophisticated enough to turn vocal interactions—listening and/or speaking—into yet another Internet-scale, monetizable platform, voice could soon emerge as one of the most important content and commerce platforms in the world.
Three separate epiphanies got me thinking about this pivot to voice, and while they are highly personal, it turns out there are real numbers behind the anecdata.
Epiphany one: When I wrote a book in 2016 about my early work at Facebook, I was contractually obliged to be trotted out at launch as a salesman. First stop was the glitzy CBS studios in Midtown Manhattan and a highly stressful five-minute interview in front of millions of TV viewers. With the naiveté of the first-time author, I rushed to Twitter the moment I left the studio to check my mentions, where all of a couple of tweets, by people with two-digit follower counts, appeared. TV was the firework that didn’t pop.
Months later, I accepted an invitation to be interviewed by a tech-focused podcast I’d never heard of: Note to Self, produced by WNYC Studios and hosted by Manoush Zomorodi. The online uptick following that show was considerable and long-lasting, and it triggered downstream media coverage from several journalists who evidently don’t watch morning TV. Granted, the subject matter of my book probably had a stronger appeal to this particular podcast’s audience (which likely skewed young, techie, and early adopter) than a CBS morning show’s. But it was also a much better interview. TV anchors seem genetically incapable of replicating the intimacy and engagement that draws more and more people (and, consequently, advertisers) to podcasts every year.
The industry numbers bear out the medium’s rise. In the US, more people now listen to podcasts every month (nearly 70 million¹ and counting) than use Twitter regularly, and the numbers are only rising. Moneywise, total advertising revenue from podcasts ($220 million in 2017) is doubling every year. The podcast marketing space is crowding up with ad networks, tracking and targeting software, advertiser-facing buying interfaces, and tools for crafting ad creative. Most importantly to potential advertisers, users are engaged: The ad networks claim episode completion rates are around 90 percent, meaning most ads are being heard. Also, and here’s the real test, the market is paying an astonishing $30 CPMs for some of these podcast slots, which is something like five times Facebook’s average CPMs. (CPM is cost per mille: that is, the cost per thousand appearances of an ad, or what advertisers are willing to pay to reach the audience.) This is a very elevated starting point for a budding medium. As someone who’s played a small role in building that same armature in the digital and mobile spaces, I find the whole thing redolent of a certain heady déjà vu.
Eventually, podcasting is going to do to radio what cable TV did to network TV (and what Netflix is now doing to cable TV): It’ll become the showcase for the premier storytelling in that medium. Even if podcasting only manages to take radio’s ad budgets, that’s a good $20 billion a year and a roughly hundredfold increase over the current status quo.
That’s for relatively short-form storytelling, whose audio (and textual) competition is journalism. Which brings me to …
Epiphany two: Before I wrote a book, books on tape seemed to me like something only long-haul truck drivers, or maybe literary-minded marathon runners, would buy. Then I noticed I had five times as many reviews on Audible as on Amazon, and about half the people I’d meet who’d read the book (yes, including some strangers on the street) had “listened” to it.
Again, industry stats support the anecdata. Publishers are reporting declining ebook sales but growing audiobook revenues, with audio filling the digital revenue gap that ebooks left.
What’s really happening here?
Strip away the technological marvels that make the on-demand nature of streaming audio possible, and just focus on the human experience. We have 21st-century consumers flocking to hear a human voice, often that of the very author, tell a long and complex story, just like the ancient Greeks who gathered around a fire to hear their local bards recite what we now call The Odyssey (and whose authorship we’ve amalgamated into a legendary Homer).
But what about the audience—do listeners ever get to speak in this voice-driven world? Yes, into a soon-to-be omnipresent smart speaker, which brings me to …
Epiphany three: I spent an afternoon in a quasi Her-style romance with my Amazon Echo, shopping for items, organizing my calendar, messaging friends, and, rather less usefully, trying to get Alexa to say something obscene or witty (and only partially succeeding). Fast-forward four hours. I’m in my car, when one of those things I forgot to either buy or search for on Amazon pops into my head.
“Alexa!” I imperiously shouted into the empty interior of my car, ready to have the global brain do my bidding. The wave of felt stupidity and embarrassment that hit me after was almost as strong as the realization that something had just snapped in my relationship with computing.
Using a keyboard and mouse to manipulate a computer after successfully using voice feels about the same as using a command-line interface on an old UNIX machine after using a graphical interface. In a word, it’s starting to feel a little barbaric, and furthermore, has a certain never-going-back-to-that-crap quality to it. Amazon’s Echo sales have shattered all analyst estimates, Apple is rushing to catch up via its new HomePod, and Facebook(!) just announced its own smart speakers, slated to appear this summer. Everyone will soon be having the WTF, I-want-to-talk-to-the-Internet-now tantrum I had inside my car.
Prediction: Between touchscreens and voice, most people in the future won’t even know how to touch-type, and typing will go back to being a specialist practitioner’s skill, limited to long-form authors, programmers, and (perhaps) antiquarian hipsters who also own fixies and roast their own coffee. My 2-year-old daughter will likely never learn how to drive (and every pedal-to-the-metal, “flooring it” driving analogy will be lost on her), instead issuing voice commands to her self-driving car. And she’ll also not know what QWERTY is, or have her left pinkie wired to the mental notion of the letter “Q” as I do, so subconsciously that I reach for it without even thinking. Instead, she’ll speak into an empty room and expect the global hive-mind, along with its AI handmaidens, to answer.
The data-for-money alchemy that pays for the Internet will no longer only be turning Google queries and Facebook actions into fortunes. Rather, the new data inputs of value will be her spoken requests to the ambient and ubiquitous smart speakers, which will follow her seamlessly like a disembodied servant from home to transit to work. Dynamically generated targeted ads, based on those spoken queries, will fill the gaps in her ever-present stream of music, podcasts, and books. Perhaps they’ll even be synthesized to sound like Ira Glass or Joe Rogan or some other favorite host (since so-called ‘host-read’ ads outperform random human voices).
Computer keyboards will then join typewriters in the history museum displays, and that complicated larynx, unique among primates, that first set us down the road to sophisticated social intelligence will once again be central to how we navigate the world.
1 CORRECTION, FEB. 28, 10PM: This article previously stated an incorrect number of monthly podcast listeners.
Photograph by WIRED/Getty Images