Ron Jaworski

Smart audio experiences: the next big thing in content strategy

Updated: Feb 10, 2022

Audio experiences are getting an upgrade.

Like the technology that drives much of the change, content is evolving.

The ongoing audio revolution has significantly transformed how we engage with and consume content. For millions of listeners across the globe, having content screen-free and ready for consumption on the go has become the norm.

Following this still-growing interest, the market is seeing more audio innovation aimed at delighting listeners.

Here is where smart audio experiences come into play.

Audio has found its footing in the content world, effectively raising the bar of what can be offered to audiences. The change in users’ media habits means that in a matter of months, not years, it’ll be virtually inconceivable for publishers, brands, and content creators to offer a reading-only experience.

I reckon that in a few more years, it won’t be enough to simply be part of this audio renaissance. As a publisher, content creator, or brand, you’ll have to be active in your role to meet expanding user expectations.

The age of audio content is here, and smart audio experiences can be divided into three layers:

  1. Content creation

  2. Content aggregation and recommendation

  3. Content distribution

Quick side note before I delve into the crux of the matter:

You should perceive adding an audio experience as a step forward to increasing the accessibility of your website.

At a mainstream level, people are multitasking. At a social level, people with visual impairment and disabilities, as well as the illiterate, comprise a huge global demographic.

With audio content, you are removing accessibility barriers as much as possible and directly improving their user experience. I don’t mean that in the legal sense. I mean giving those who can’t read your content, for whatever reason, the ability to listen to it, in the simplest form.

Now, let’s get down to business.

1. Content creation

Offering a listening experience is the future of content, period.

Any content you can audiofy and make accessible and portable means you’re getting maximum value from it.

One really cool thing about audio content is that it scales easily if you opt for a tech-based approach via audio AI.

These days, the tech under the hood is affordable and takes only a few minutes to get the job done. A text-to-speech audio player can be easily adapted to natively align with the site’s overall look without hurting the website user experience whatsoever. Its loading time is optimized for both latency and resource consumption so that the footprint is minimal.

The smart part of the experience is that there is an abundance of customization options.

A listener can change the gender of the voice, switch to a different language (in some cases even the accent), set the playback speed, and more. The playback continues in the background while the user is away from the website, allowing them to go through other content while they continue listening.

AI-generated content solves a particular challenge for news organizations and anyone who primarily focuses on written content. Typically, they don’t have readily available audio content that can be repurposed, such as news briefings recorded multiple times a day to keep content fresh. With AI-generated audio, content can be updated regularly without additional costs and delays and, with the multi-language option, can cater to a broader demographic.

Fine-tuning content to perfection

Text-to-speech isn’t perfect as there are some words, abbreviations, and symbols it tends to struggle with. However, integrations with the site’s CMS allow the option to work directly on the resulting audio content so every word sounds as it should.

It’s possible to tweak almost every aspect of the listening experience and customize it to the tiniest of details.

You can set a custom pronunciation for a specific word or phrase, have numerical text interpreted as a cardinal number, ordinal number, fraction, or measurement, assign a different voice to different paragraphs, give each voice its own speech pattern, pick a voice style in line with the type of your content, and more.

The possibilities for personalizing content for a specific listener group are endless, making it feel like it is just for them.
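To make this kind of fine-tuning more concrete, here is a minimal sketch in Python that builds SSML (Speech Synthesis Markup Language), the W3C markup many text-to-speech engines accept. It is an illustration, not a description of any particular product: the helper names `sub`, `say_as`, and `speak` are made up for this example, and the set of supported `interpret-as` values differs between engines, so check your provider's documentation.

```python
from xml.sax.saxutils import escape, quoteattr


def sub(text: str, alias: str) -> str:
    """Custom pronunciation: the engine speaks `alias` where `text` is written."""
    return f"<sub alias={quoteattr(alias)}>{escape(text)}</sub>"


def say_as(text: str, interpret_as: str) -> str:
    """Hint how a value should be read, e.g. 'cardinal' or 'ordinal'."""
    return f"<say-as interpret-as={quoteattr(interpret_as)}>{escape(text)}</say-as>"


def speak(*parts: str) -> str:
    """Wrap parts in the SSML root element.

    Plain-text parts are inserted as-is, so they must not contain
    unescaped '<' or '&' characters.
    """
    return "<speak>" + "".join(parts) + "</speak>"


ssml = speak(
    "The ",
    sub("CMS", "content management system"),
    " published its ",
    say_as("3", "ordinal"),
    " briefing today.",
)
print(ssml)
```

The resulting string can be sent to an SSML-capable engine in place of the raw article text, so "CMS" is read out in full and "3" is read as "third".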

Not every audio strategy revolves around synthesized speech as there is also the option of human narration by employing professional voice talent.

Recording live voice, just like in podcasts, offers a genuine feel and nuance to the content as it’s an actual person talking, particularly if it’s someone familiar to you. For a brand, it also provides a certain level of uniqueness, particularly if paired with sonic branding.

However, the problem with human narration is that it is expensive and difficult to scale. After all, it’s an actual person talking, so it’s hard to automate the process. Generally speaking, this is an option worth exploring for large brands or publishers who are looking to create a brand signature voice, but no one says you can’t have a specific segment of news or stories curated by a human.

2. Content aggregation and recommendation

Having your content in audio is just the beginning.

Next, it’s a matter of maximizing its value by providing more of it based on the listener’s behavior, contextual analysis, popularity, and more.

With AI-generated audio content, you’ll quickly accumulate a library of content you can leverage to target listener segments. By aggregating various forms of audio content such as audio articles, sound bites, podcast episodes, radio shows, and similar across your individual ecosystem and the Internet, you can create targeted playlists.

For example, let’s say you have trending content in technology.

You can compile all the relevant content you published and create a playlist for the ‘Tech’ section. The same can be done for any other category or subcategory. It’s a story-like way to engage with your audience across different desktop and mobile devices while they are outside, working, driving, commuting, or simply not able to look at a screen at the moment.

In addition, you can mix and match between various types of audio content to provide a broader understanding of related topics, or simply provide your listeners with updates.
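As a rough sketch of how category playlists could be assembled, the Python snippet below groups audio items by category and orders each group by popularity. The `AudioItem` fields, the `build_playlists` function, and the sample data are all hypothetical; a real system would likely also weigh listener behavior and context.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class AudioItem:
    title: str
    category: str  # e.g. "Tech"
    kind: str      # e.g. "article", "podcast", "sound bite"
    plays: int     # simple popularity signal


def build_playlists(items, max_len=10):
    """Group items by category, most-played first, capped at max_len each."""
    by_category = defaultdict(list)
    for item in items:
        by_category[item.category].append(item)
    return {
        category: sorted(group, key=lambda i: i.plays, reverse=True)[:max_len]
        for category, group in by_category.items()
    }


items = [
    AudioItem("Chip shortage explained", "Tech", "article", 120),
    AudioItem("Weekly gadget podcast", "Tech", "podcast", 300),
    AudioItem("Election recap", "Politics", "article", 90),
]
playlists = build_playlists(items)
tech_titles = [item.title for item in playlists["Tech"]]
print(tech_titles)
```

Because the grouping ignores `kind`, articles and podcast episodes land in the same playlist, which is the mix-and-match idea described above.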

Through content aggregation and recommendation, audio content is utilized in more ways. Thanks to cost-effective smart tech solutions, you can effortlessly expand the engagement of audiences in a familiar and natural way. They stick around longer and explore more content, which you can then monetize any way you see fit – more on that further below.

3. Content distribution

In what can be labeled as phase three of providing smart audio experiences, there are two key elements of content distribution that increase the reach of your content:

  1. Mainstream audio platforms

  2. Unique content channels via smart assistants

The point with mainstream audio platforms as another link in the smart audio experience is to try and leverage the scope of engagement they offer.

Some use it to listen to music, some tune in to digital radio, while others go for on-demand audio content such as full podcast episodes, audiobooks, and standalone stories and clips. Adding your content will position it closer to the ears of wider audiences while also helping you navigate around the problem of discovery.

Another thing I’ve noticed is that more publishers, brands, and content creators are treating these platforms like social media, basically.

In looking for places to reach their audience and connect, they clearly see the likes of Spotify, iHeartRadio, Pandora, and others as important places to be present due to their potential for organic distribution and awareness. This can be especially beneficial when trying to engage listeners and increase organic reach with user-generated or branded playlists that act as audible representations of a certain company, its product or service.

As for smart assistants, their omnipresence is a major opportunity to also be omnipresent through your content.

The idea is to release your audio content to be consumed within shouting distance of a voice-enabled device through a customized set of voice skills or actions via Amazon’s Alexa, Google Assistant, Siri, Bixby, and other smart assistants. Depending on your goals, some will make more sense than others.

For example, our advanced hybrid content creation and distribution solution automatically distributes content in a flash briefing format. Users with a smartphone, smart speaker, or any other voice-enabled device that features Alexa, Google Assistant, or Siri can invoke flash briefings and listen. They can also go to their favorite streaming platform and play the content from there.
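For readers curious what a flash briefing looks like under the hood, here is a hedged Python sketch that builds one feed item as JSON. The field names (`uid`, `updateDate`, `titleText`, `mainText`, `streamUrl`, `redirectionUrl`) follow the Alexa flash briefing feed format as I understand it, so verify them against Amazon's current documentation. The function name, the URLs, and the sample content are placeholders, and this is not a description of how the solution mentioned above works.

```python
import json
from datetime import datetime, timezone


def flash_briefing_item(uid, title, text, published, stream_url=None, link=None):
    """Build one feed item; `published` must be a timezone-aware datetime."""
    item = {
        "uid": uid,
        "updateDate": published.astimezone(timezone.utc).strftime(
            "%Y-%m-%dT%H:%M:%S.0Z"
        ),
        "titleText": title,
        "mainText": text,  # read aloud when no audio stream is supplied
    }
    if stream_url:
        item["streamUrl"] = stream_url  # pre-rendered audio, e.g. an MP3
    if link:
        item["redirectionUrl"] = link   # where the listener can read more
    return item


item = flash_briefing_item(
    uid="briefing-2022-02-10",
    title="Morning tech briefing",
    text="Here are today's top technology stories.",
    published=datetime(2022, 2, 10, 8, 30, tzinfo=timezone.utc),
    stream_url="https://example.com/audio/briefing.mp3",  # placeholder URL
)
print(json.dumps([item], indent=2))
```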

Here’s an example from the Miami Herald of what we call a Splash page: an easily exported and implemented web page that promotes multi-channel audio content.

[Image: McClatchy Newsroom - Splash]

Listeners can subscribe to or simply choose between multiple channels to instantly consume audio content.

For example, authors and journalists can create their own news feeds featuring topical information. If someone is writing about local developments, they can create a feed and update it regularly with relevant news. A blogger can have a daily briefing on current trends or a weekly summary of their best content. Brands can go for the story-like vibe and have updates on their latest events, products, sales, and so on.

With smart displays such as Amazon Echo Show and Google Nest Hub, there is also the option of visual response in addition to a spoken response.

In short, there are all kinds of possibilities to create portable, personalized streams of information and further build smart experiences designed for eardrums.

A new monetization option

There is plenty of room for experimenting with your own content and delivery for monetization purposes. You can experiment with different content formats and their distribution until you find the winning formula. Audio exclusives and full articles can feature ads or be behind a paywall, audio snippets derived straight from the content for immediate use can be unlocked via micropayments, you can do sponsorships for short stories, and so on.

Thanks to technology, it’s now possible to integrate text-to-speech software with the site’s CMS and tap into ad servers that dynamically insert ads. These servers can stitch custom targeted audio ads into the audio stream, based on the user’s listening behavior and a multitude of data and insights.
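The idea of stitching targeted ads into a stream can be sketched in a few lines of Python. This is a toy model under stated assumptions: real ad servers work on audio streams and far richer data, while here segments are just file names, targeting is a simple tag overlap, and `pick_ad` and `stitch` are hypothetical helpers.

```python
def pick_ad(ads, interests):
    """Return the ad sharing the most tags with the listener, or None."""
    best = max(ads, key=lambda ad: len(ad["tags"] & interests), default=None)
    if best is None or not (best["tags"] & interests):
        return None
    return best


def stitch(segments, ad, every=2):
    """Insert the ad after every `every` segments, never at the very end."""
    stream = []
    for position, segment in enumerate(segments, start=1):
        stream.append(segment)
        if ad and position % every == 0 and position < len(segments):
            stream.append(ad["audio"])
    return stream


ads = [
    {"audio": "ad_cars.mp3", "tags": {"auto"}},
    {"audio": "ad_headphones.mp3", "tags": {"tech", "music"}},
]
segments = ["intro.mp3", "part1.mp3", "part2.mp3", "outro.mp3"]

ad = pick_ad(ads, interests={"tech", "news"})
stream = stitch(segments, ad)
print(stream)
```

A listener with no matching interests gets `None` from `pick_ad`, and `stitch` then returns the segments untouched.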

If you don’t fancy relying on ad revenue, you can always promote audio content as a subscription feature. In a nutshell, it’s a huge opportunity to increase ROI by offering users a more engaging, tailored, and intuitive experience. Arguably, it’s one they’ve already come to expect.

The benefit of a subscription model is that there are various kinds of paywalls you can leverage to gain sustainable financing. Hard paywalls require a subscription to access any content, while a metered paywall offers a finite number of articles to be accessed for free before the user needs to register or subscribe.

At the moment, the freemium model is among the most widespread forms of paywall: a large amount of free content draws users in, while exclusive content is available only to paying subscribers.
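The difference between these paywall types comes down to a small access rule, sketched below in Python. The function name, the model labels, and the default limit of five free reads are illustrative assumptions; "freemium" here means the model where free content sits alongside subscriber-only content.

```python
def can_access(model, *, subscribed, premium=False, free_reads_used=0, free_limit=5):
    """Decide whether a user may open a piece of content.

    model: "hard", "metered", or "freemium"
    premium: True if the piece is marked subscriber-only (freemium model)
    """
    if subscribed:
        return True  # paying subscribers can access everything
    if model == "hard":
        return False  # nothing is free
    if model == "metered":
        return free_reads_used < free_limit  # free until the meter runs out
    if model == "freemium":
        return not premium  # free content is open, exclusives are locked
    raise ValueError(f"unknown paywall model: {model}")


print(can_access("metered", subscribed=False, free_reads_used=4))
print(can_access("metered", subscribed=False, free_reads_used=5))
print(can_access("freemium", subscribed=False, premium=True))
```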

Having learned from the data created by paywalls, publishers can then grow in scale and shift to a propensity model, a hi-tech hybrid model that analyzes audience behavior and creates personalized offers based on users’ actions on the site.

As you can see, there is no shortage of options. The key is to invest more in multi-channel and multi-format content to add to the value of a subscription.

Final thoughts

When it comes to your target audience, having an audio option is both convenient and necessary.

It is in line with the usually busy nature of modern life, with different formats and types of content suitable for almost any occasion. For people used to doing things on the go, this rings particularly true.

The beauty of smart audio experiences is that there is content available in almost every format, topic, and length, making for a portable and more accessible solution than text or even video.

For a short burst of information, there’s a flash briefing skill. A long story with developed narratives is what podcasts excel at. If you want news updates and daily posts, audio articles have got you covered.

There are no restrictions as to how someone can consume audio content. Your audience can listen passively, with no active involvement. They can stream it, download it for later use, consume it all at once or in multiple takes. These are all significant reasons why audio content has skyrocketed in popularity.

Smart audio experiences are not perfect and there are still challenges to overcome. However, there is so much flexibility, both in terms of how you want to offer it and how your audience can consume it.

You know your audience best so my advice is to test it out. If the feedback is positive, slowly build on that foundation by gradually increasing the pace and investment.

Digital audio is where audiences are these days, most notably readers, and they’ll stay there for a long time.


