Podcast Pulse

With all of the excitement around artificial intelligence, many podcasters are asking:

"Can you use an AI voice generator to create a podcast episode?"

I wanted to put that question to the test, so I generated audio with a few popular AI voice tools.

The first experiment I ran was a bot that a local college kid put together. I prompted it to "create a podcast episode for a podcast about extreme rollerblading. Talk about the rollerblading scene in Stony Plain, Alberta," (which is my hometown). 
  • (00:55) – Listen to my synthetic AI voice output.
Using only a 30-second clip of my voice he found on YouTube, he was able to generate AI audio content that sounds very much like me. What's most surprising to me is how the AI synthesized not only my voice but also my cadence, my pauses, and my tone.

However, I'm still unsure whether a longer episode using this AI voice would be entertaining. That's really the question podcasters (and aspiring AI entrepreneurs) are asking: can you generate compelling long-form AI audio content?

The first AI podcast experiment I saw was from the folks at podcast.ai. Using AI voices, they generated an interview between Joe Rogan and Steve Jobs. 
  • (02:34) – Listen to the output and decide whether you think this would be compelling enough for you as a listener.
To my ear, it sounds stilted; while it's a good demo of how far this AI voice generation has come, it's still not at the place where it makes for compelling entertainment.

Another interesting experiment was from the Mind Meets Machine podcast. In this case, there's a human co-host (Rob) and an AI co-host (Ruby). 
  • (04:10) – Listen to Rob and Ruby play a game.
What these AI podcast experiments are missing is, ironically, human connection. There's something about hearing somebody in your earphones revealing deeply human moments that's hard to replicate with AI.

Another popular clip was an AI version of Bill Gates and Socrates.
  • (06:12) – Bill Gates explains to Socrates the modern marvels of technology.
Again, most of these AI clips don't sound natural; they're not quite there yet when it comes to producing something that I think could be entertaining, especially in a way that would have a podcast listener coming back episode after episode.

As a part of this experiment, I also tried some of the popular AI voice generators on the market: Genny and Play.ht. Here's a sample of each:
  • (07:39) – The Genny AI voice generator explains, "What is a podcast?"
  • (08:33) – The Play.ht AI voice generator answers the same question.
Can you use an AI voice generator to generate a compelling podcast?

Currently, the verdict is clear: AI voice generation isn't going to displace human hosts. Why? Because when we tune into podcasts, we seek more than just content. We're not looking for mundane, run-of-the-mill audio. We want authentic, gripping stories that resonate with our emotions, offering human connection, drama, and depth.

While AI voice generators have made strides, they still lack that genuine human touch. The nuances, the warmth, the imperfections – they're all absent.

I'm curious what you think!

Transistor Podcast Hosting sponsored this episode. They're the best place to host multiple podcasts all in one place. Get 14 days free.

What is Podcast Pulse?

Observations on podcasting: audience trends, insights on podcast growth, new developments, tips, tricks. Hosted by Justin Jackson.

Justin: In today's episode, with all
of the excitement around artificial

intelligence, many podcasters are
asking, can you use an AI voice

generator to create a podcast episode?

And I wanted to put that
question to the test.

The thing that originally got me thinking
about this question was I run this hack

night and a local college student showed
me how he had synthesized my voice using

only 30 seconds from YouTube and was able,
combined with GPT, to generate AI audio

content that sounds exactly like me.

Here's, so here's the prompt:
"create a podcast episode for a

podcast about extreme rollerblading.

Talk about the rollerblade scene in
Stony Plain, Alberta," my hometown.

Listen to what the AI
generated output sounds like.

AI Justin: Welcome to this
episode of Extreme Rollerblading,

where we explore the wild and
exciting world of rollerblading.

Today we're taking a look at the
rollerblade scene in Stony Plain, Alberta.

It's a small town in Alberta
and it's known for having some

of the best rollerblading and
skateboarding in the province.

But what makes Stony Plain
so unique is the fact that

rollerbladers and skateboarders
have a very unique relationship.

Many of them seem to have a mutual
respect for each other, but at

the same time, there is a certain
amount of competition between them.

Justin: All right, so
that, to me, is very good.

I mean, it's only a 38-second clip, but
it's incredible to me how the AI was

able to synthesize not only my voice,
but my cadence, my pauses, my tone.

It, it has kind of everything in there.

And I don't know if a whole episode that
sounds like that would be entertaining.

That's the question du jour.

Can you generate a whole episode
that is compelling enough?

But that output is just remarkable in
terms of it being so close to my voice.

The first AI podcast experiment that I
saw was from the folks at podcast.ai.

And they did Joe Rogan interviews Steve
Jobs, and this is only a four-minute

episode, but we can listen to some of the
output and you decide whether you think

this would be compelling to listen to and
follow a show like this as a listener.

AI Joe Rogan: And he's always
pushing the envelope in innovation.

Like, for example, with his next
computer, he developed a new programming

language and operating system, and then
he became even more famous for making

three applications for that computer.

Justin: I'll skip ahead here.

AI Steve Jobs: Even though Apple
is big, it's still, like, half

a percent of the total users.

People who listen to your
show are a different group.

They're weird.

Ha ha ha.

AI Joe Rogan: Well, that's good.

Justin: The, the laughing there
is the, is part of the tell.

It just sounds so unnatural.

And here's one where Zach Galifianakis
talks movies with Quentin Tarantino.

AI Zach Galifianakis: I heard you're
a big fan of all things pop culture.

Is that true or is that false?

AI Quentin Tarantino: Yeah, I love
movies, music, and all that good stuff.

AI Zach Galifianakis: Well, in
that case, if you were to choose

one movie to watch for the rest of
your entire life, what would it be?

AI Quentin Tarantino: Well, I guess
it would have to be something with a

little twist to it, you know, like a
burger with a little something extra.

AI Zach Galifianakis:
Okay moving right along...

Justin: So it's, you know, it's kind of
stilted. It's not exactly, you... The, the

voice synthesis is pretty good, and it's
a great demo of how far this AI voice

generation has come, but it's still not
at the place where it's entertaining.

I did like this one experiment by
the Mind Meets Machine podcast.

In this case, there's a human
co-host and an AI co-host.

Let's listen to a bit of that.

So first you're going to hear the
human co-host, Rob, and then you're

going to hear the AI voice, Ruby.

Rob: Maybe we'll play these games
again or something similar over

time to see how your answers change.

But what do you say?

Shall we play some games?

AI Ruby: Games, huh?

That sounds like fun.

I'd be happy to participate.

Rob: So, I was thinking for our
first game, we could play like

a word association game, kind
of like a psychologist might do.

And I'll toss out a word, and you tell
me the first thing that comes to mind.

AI Ruby: Got it.

I understand.

No need to overthink.

Just answer instinctively.

I'm ready when you are, Doctor.

Rob: Rollercoaster.

AI Ruby: Adventure.

Rob: Curtain.

AI Ruby: Drama.

Rob: Power plant.

AI Ruby: Energy.

Rob: Computer.

AI Ruby: My home.

Rob: Love.

AI Ruby: Connection.

Rob: Here in these responses, you can
see that Ruby understands the game well.

She's able to match what you
might expect from a human.

And in particular, there was one
answer that jumped out at me.

Computer.

AI Ruby: My home.

Rob: When Ruby says things like
this, I can't help but react.

You know, it hits me right in the heart.

Justin: So what's interesting
to me here is you have the

human and the AI communicating.

And as entertainment, I'm
still not sure if it's there.

Uh, we typically listen to
podcasts for that human connection.

There's something about hearing
somebody in your earphones and

also what people typically reveal
in podcasts, those human moments,

that's kind of why we're listening.

We're listening to hear somebody's story.

We're listening to hear a drama.

We're listening to hear some true crime,
or in some cases the news, but even

with the news, having a human that has
those human qualities is so important.

And all of these AI voices have
this feel that's stiff and unnatural

sounding. I'll show you one more example
that was circulating a few months ago.

This is Bill Gates and Socrates.

AI Bill Gates: Greetings, Socrates.

This is a laptop, a marvel of
modern technology. It harbors an

artificial intelligence that can
revolutionize heuristic education.

Imagine a world where students
learn at their own pace, guided by

a tireless tutor that never errs.

AI Socrates: Is this the
MacBook you often refer to?

AI Bill Gates: No, no, no.

This is a Surface.

You just need to remember that.

Surface.

AI Socrates: Fascinating.

But tell me, what is the essence of this
artificial intelligence you speak of?

Justin: Okay, so these are, again,
these just don't sound natural, uh,

and they're getting better, but they're
not quite there yet when it comes to

producing something that I think could be
entertaining, especially entertaining for

a podcast where a listener would subscribe
or follow the podcast and then come

back time after time to actually listen.

I thought I would also do a
quick experiment with some of

these AI voice generation tools.

There's one called Genny, Veed, Play.HT,
and let's just start with a ChatGPT

prompt: "describe what a podcast is."

All right, and for you folks at home,
what I'm going to do now is I'm going

to paste this into one of the tools.

I'll start with Genny, G E N N Y,
and I'm going to generate some audio.

All right, so Genny has finished
generating the voice. Let's

listen to what they came up with.

AI Genny Voice: Hello, dear listeners.

Welcome back to Digital Dive, the podcast
where we dive deep into the digital world.

Today, we're unraveling
the magic behind podcasts.

Yes, you heard that right.

Let's dive in.

Justin: Okay, so that was Genny.

Again, all of this from a technology
perspective is incredible that we

can synthesize human voices like
this, generate this kind of audio.

The, the question of course is,
is it compelling for a listener?

Would a listener tune in to this kind
of generated voice week after week?

We're going to do one more
for the folks at home.

Up next we have Play.HT,

and we have Larry as our AI voice.

Let's see how Larry sounds
with this generated content.

AI Play.HT Voice: Hello dear listeners.

Welcome back to Digital Dive.

The podcast where we dive
deep into the digital world.

Today, we're unraveling the
magic behind podcasts.

Yes, you heard that right.

Let's dive in.

You can listen to podcasts
on almost any device.

Smartphones, tablets,
or even your computer.

Platforms like Apple Podcasts
and Spotify make it super easy.

Justin: Okay, so that is Larry.

So I think to answer the question, can you
use an AI voice generator to generate an

entire podcast episode week after week?

And I think the answer right now is no.

And a lot of this has to do with why
we go to podcasts in the first place.

What is the job to be done of a podcast?

What role does it serve
in a listener's life?

And it's not just about content.

Nobody wakes up in the morning and
just wants bland, generic content.

What they want is something
that's compelling.

They want human connection.

They want drama.

They want to be entertained.

They want to hear the news,
but they want to hear the news

from a human, at least for now.

And there are still some telltale signs

that what you're listening to is not
a human, that it's not a human that's

communicating these thoughts and feelings
and ideas and entertainment in that

nuanced way that only humans seem to be
able to do, at least for now. Of course,

all of this is still nascent, and AI voice
generators could get to the point where

they are able to entertain us and engage
us in a way that they can't right now.

I'd be curious to hear what you think.

If you're watching this on YouTube, please
leave a comment in the comments box.

If you're listening to this as an
audio podcast, you can click the

link in the show notes and leave
me a message about what you think.

Do you think that the future of
podcasting is these AI generated

voices and AI generated episodes?

That's it for this week.

If you could do me one favor, wherever
you listen to podcasts, if it's Apple

Podcasts, Spotify, Pocket Casts,
Overcast, if you could follow the show,

I would appreciate it.

If there's a rating system in your
podcast app, leave us a rating

and a review. That also helps.

And also share the podcast with a friend.

You can share podcastindustry.org,

and people will be able to subscribe
in their favorite podcast player and

also get the show notes for this episode.

Thanks again, and I'll see you next time.