MIT research reveals 95% of AI pilots fail to deliver revenue acceleration. Tom breaks down why this isn't a technology problem but a scaling failure, and provides three critical questions to identify which pilots deserve investment.

Show Notes

Key Statistics

95% of generative AI pilots fail to achieve rapid revenue acceleration (MIT, 2025)
8 in 10 companies have deployed Gen AI but report no material earnings impact
Only 25% of AI initiatives deliver expected ROI
Just 16% scale enterprise-wide
Only 6% achieve payback in under a year
30% of GenAI projects predicted to be abandoned by end of 2025

Core Problem: Horizontal vs. Vertical Deployments

Horizontal: Enterprise-wide copilots, chatbots, general productivity tools
- Scale quickly but deliver diffuse, hard-to-measure gains
Vertical: Function-specific applications that transform actual work
- 90% remain stuck in pilot mode

Three Critical Evaluation Questions

Does this pilot solve a problem we pay to fix?
Can we measure impact in terms the CFO cares about?
Does it require process redesign or just tool adoption?

Success Factors

Empower line managers, not just central AI labs
Select tools that integrate deeply and adapt over time
Consider purchasing solutions over custom builds
Be willing to retire failing pilots

This Week's Action Items

Inventory current AI pilots
Categorize as: scaling successfully, stalled but salvageable, or stalled and unlikely to recover
Apply the three evaluation questions
Identify specific barriers for salvageable pilots

Chapters

0:00 - The 95% Problem: Why AI Pilots Aren't Becoming Products
0:24 - The Research: MIT, McKinsey, and IBM Findings on AI Failure Rates
1:49 - Why Pilots Stall: Horizontal vs. Vertical Deployments
3:07 - What Successful Scaling Actually Looks Like
4:11 - Three Critical Questions to Evaluate Your AI Pilots
5:40 - The Permission to Stop: When to Retire Failing Pilots
6:45 - Action Steps: What to Do This Week

What is The AI Briefing?

The AI Briefing is your 5-minute daily intelligence report on AI in the workplace. Designed for busy corporate leaders, we distill the latest news, emerging agentic tools, and strategic insights into a quick, actionable briefing. No fluff, no jargon overload—just the AI knowledge you need to lead confidently in an automated world.

Hi folks, welcome to today's AI briefing.

My name is Tom, and in the AI Briefing podcast we talk about tips and support for
executives and people trying to cut through the noise and better understand what's going
on in the AI landscape.

Today, we're going to talk about the 95% problem and why your AI pilot isn't becoming a
product.

In August 2025, MIT published research analysing 150 executive interviews, a survey of
350 employees, and 300 public AI deployments.

They found that approximately 95% of generative AI pilots fail to achieve rapid revenue
acceleration.

The vast majority stall, delivering little to no profit-and-loss impact.

Now, the thing is, this isn't a technology failure.

Of course, sometimes it may be.

But largely, it is not a technology failure.

It's a scaling failure.

McKinsey's 2025 workplace research found nearly eight in 10 companies have deployed Gen AI
in some form, be it, you know, like we were talking about yesterday, stuff inside Outlook
or larger help desk tools, those types of things.

Yet the same percentage report no material impact on earnings.

An IBM study of CEOs in May 2025 found that only 25% of AI initiatives have delivered
expected ROI and just 16% have scaled enterprise-wide.

In Deloitte's 2025 survey, most organizations report achieving satisfactory ROI on a
typical AI use case within two to four years, far longer than the 7 to 12 month payback
typically expected for technology investments.

Only 6% reported payback in under a year.

Now, why do these pilots stall?

So McKinsey identified the core issue as an imbalance between horizontal and vertical use
cases.

What does this mean, of course?

Horizontal deployments are enterprise-wide copilots, chatbots, general productivity tools
that scale quickly because they're easy to deploy, but they deliver diffuse,
hard-to-measure gains.

How do you know what your improvement in performance is when you're using a chatbot to
ask questions versus if you just went and figured it out yourself?

Vertical deployments, though, are function-specific applications that transform how actual
work gets done, and about 90% remain stuck in pilot mode.

The pattern is that organizations bolt on AI to existing workflows rather than redesigning
workflows around AI capabilities.

Sound familiar?

Gartner predicts at least 30% of Gen AI projects will be abandoned after proof of concept
by the end of 2025, due to poor data quality, inadequate risk controls, escalating costs,
or unclear business value.

We should hear from Gartner whether or not that came true, so we'll find out.

Because everybody needs a Gen AI solution, and it may not actually fit.

So what does successful scaling look like?

MIT's research found purchased solutions delivered more reliable results than custom-built
tools, yet almost everywhere researchers went, enterprises were trying to build their own
because, you know, it's their IP, they can do the thing.

Obviously they know how to deploy it better than everyone else.

Sometimes it's easier just to buy the thing off the shelf.

Key success factors identified: empowering line managers, not just central AI labs, to
drive adoption, and selecting tools that integrate deeply and adapt over time.

Like we were saying yesterday, there is no one-size-fits-all in all this. If you can
find the thing that works for the thing that you're trying to do, that makes sense.

Harvard Business Review reports only 26% of companies have developed working AI products
and only 4% have achieved significant returns.

The difference between the 4% and the rest isn't about technology.

It's a clear connection between the AI initiative and a business problem someone would
pay to solve.

So here are three questions that will hopefully help guide you when identifying which
pilots deserve investment.

Question number one, does this pilot solve a problem that we pay to fix?

Many pilots start with "what can AI do?" rather than "what problem costs us money?"

If the pilot disappeared tomorrow, would anyone outside the AI team actually notice?

The pilots worth scaling address workflow pain that existed before AI was an option, not
just, you know, magic from AI that does things better for us.

Question two is can we measure impact in terms the CFO cares about?

Productivity improvements and time saved are notoriously hard to convert into financial
returns.

IBM research distinguishes between hard ROI (direct profitability impact) and soft ROI
(employee satisfaction, better decisions, those types of things).

Both matter, but you need to know which one you're actually measuring.

And if your success metrics require a paragraph of explanation, the pilot probably isn't
ready to scale.

Question number three is, does it require process redesign or just tool adoption?

Because tool adoption is faster to deliver, but delivers incremental gains.

Process redesign is harder to deliver, but delivers transformative gains.

Most stalled pilots sit in an uncomfortable middle where they need process redesign to
deliver value, but they were sold as tool adoption.

Be honest about which category a pilot falls into, and results...

Of course, what you need out of this, the reason that this supposedly five minute podcast
is now into its sixth minute, is the permission to stop.

Sunk-cost thinking keeps failing pilots alive for too long.

Deloitte found that despite unclear ROI, most organizations are not holding back.

Investment continues to grow, driven by a fear of falling behind.

And we see it all over the place, in the AI companies themselves, but also out in the real
world.

One exec was quoted:

"Everyone is asking their organization to adopt AI even if they don't know what their
output is.

There's so much hype that I think companies are expecting it to just magically solve
everything."

Which we see time and time again.

And of course, in some places it does magically solve things, but that has to be well
defined, and they need to know what they're actually wanting the AI to do, not just
sprinkle magic fairy dust over everything.

Retiring a pilot that isn't working frees resources, attention, and credibility for
initiatives that actually might work.

And you can also apply the failings from that pilot into the new one, and that might also
be AI driven as well.

You don't know yet.

The MIT research noted that companies are often hesitant to share failure results, which
suggests that most know they have pilots that should have been wound down.

So what to do this week?

Inventory your current AI pilots and categorize them:

Scaling successfully?

Stalled but salvageable?

Or stalled and unlikely to recover?

For stalled pilots, apply the three questions.

Be honest about whether the answers suggest continued investment or graceful retirement.

For pilots worth saving, identify the specific barrier.

Is it data quality?

Integration complexity?

Unclear ownership?

Lack of process redesign?

And consider whether your organization is trying to build what it should buy.

MIT's finding that purchased solutions outperform custom builds is worth examining.

What can you just get off the shelf?

What can you go and pay a license for?

And it might just do a better job, because those guys focus on doing that one thing
really well.

So closing thought: the 95% failure rate isn't destiny. It's a reflection of how most
organizations have approached AI so far.

Companies in the successful 5% aren't smarter or better funded, they're more disciplined
about connecting AI investments to business problems and more willing to stop what isn't
working.

The question isn't whether you're experimenting with AI, it's whether your experiments are
designed to become products or just designed to demonstrate activity.

So there we go.

This has been the AI briefing.

I hope this has been useful.

If it has, feel free to share it with any other engineering leaders and executives that
are struggling with where to use AI in their ecosystem.

Ask questions in the comments below, in whatever podcast environment you consume this in,
so that I can help tailor these podcast episodes to be better suited to the audience out
there.

Thank you very much for tuning in.

I'll be back tomorrow with another AI briefing.

Bye for now.