The Experimentation Edge | Inside The Home Depot's experimentation at a $25B scale

Summary
What does experimentation look like inside a $150 billion retailer? In this episode of The Experimentation Edge, host Ashley Stirrup talks with Kim Ting Li, Senior Manager of Experimentation at The Home Depot, where one centralized team tests every major change to a $25 billion online business. Kim explains how 40 people serve 40–50 business teams, why executives join test readouts and ping analysts directly, how every result since 2020 lives in a searchable library, and why scaling beyond hundreds of experiments per year depends on server-side testing capabilities more than AI. For product, data, and engineering leaders building or scaling experimentation programs.

Chapters
00:00 Intro
00:45 From neuroscience research to Home Depot
01:45 A $150B enterprise, a $25B online business
02:45 The centralized experimentation model
03:45 Inside the 40-person team
04:30 Readouts, blast emails, and the experiment library
05:40 Executive visibility and the golden rule
06:15 "If you won't act on a bad result, don't run the test"
11:15 Learning from losing tests
12:30 Scaling up: AI, server-side testing, and what's next

Takeaways

One centralized team of about 40 people tests every major change to Home Depot's $25B online business, serving 40–50 business teams with consistent hypothesis and analysis standards.
Executive engagement is real at Home Depot: leaders join 30-minute readouts, search the experiment library, and ping analysts directly because they treat A/B testing as the golden rule for measuring incrementality.
Institutional memory is infrastructure — every test result since 2020 lives in a centralized, searchable archive so no one re-runs a question the company already answered.
Kim's stakeholder filter: if you wouldn't do anything differently after a bad result, don't run the test.
Scaling past low hundreds of experiments per year is a capabilities problem before it's an AI problem — Home Depot is moving from client-side to server-side testing so winners release quickly, end to end.

Connect with the Guest
LinkedIn: https://www.linkedin.com/in/kimtingli
Website: https://www.homedepot.com

Sponsor
GrowthBook helps you ship features with confidence by bringing experimentation and feature flagging into one open-source platform. No more guessing whether that new checkout flow actually moved the needle, waiting weeks for data team bandwidth, or flying blind on rollouts.

GrowthBook gives you a single place to run A/B tests, manage feature flags, and analyze results against your existing data warehouse.

With powerful stats built in, it takes the complexity out of experimentation, helps you catch regressions before they hit every user, and makes it easy to test ideas that keep your product improving and your metrics moving in the right direction.

See a demo at https://www.growthbook.io/

What is The Experimentation Edge?

How do product teams decide what to build and what not to? The Experimentation Edge is the podcast where product, growth, and engineering leaders share how A/B testing, feature flags, and experimentation drive real business outcomes — backed by named companies and real numbers. From DoorDash's 12,000 A/B tests a year to Atlassian's experimentation-led product win to UPS's $500M experimentation team, each episode goes deep with operators running experimentation programs at scale.

Hosted by Ashley Stirrup, CMO at GrowthBook and a 25-year executive in data and experimentation. For product managers, engineers, data scientists, and growth leaders at B2B tech companies who care about experimentation culture, statistical rigor, and shipping with confidence. No marketing speak. Just operators explaining what they shipped, what moved the needle, and how experimentation reshaped their teams.

Topics: A/B testing, experimentation, growth experimentation, product experimentation, tech experimentation, feature flags, experimentation culture, statistical significance, marketplace experimentation, conversion rate optimization, experimentation at scale.

Kim Li
===

Kim Li: [00:00:00] We have an annual revenue of $150 billion as the enterprise. And then for the online org that I'm working with, we just surpassed $25 billion.

Ashley Stirrup: $25 billion.

Welcome to The Experimentation Edge, where product managers, data scientists, and engineers talk about how they make smarter decisions. I'm Ashley Stirrup, the chief marketing officer for GrowthBook, and in each episode, I'll sit down with an executive to unpack how they use experimentation and A/B testing to make better decisions.

This show is sponsored by GrowthBook, the open source experimentation platform leader. Now let's jump in and get started with our next guest.

Ashley Stirrup: Welcome to today's episode. I'm excited to have Kim Li, senior manager of online experimentation at Home Depot, with us. Welcome, Kim.

Kim Li: Hey, Ashley. Thank you for having me.

Ashley Stirrup: So Kim, you bring a very interesting background. You started your career as a neuroscience researcher. Could you tell us a little bit [00:01:00] about your career and kind of journey to Home Depot?

Kim Li: Yeah. So my background was in neuroscience and statistics. I started off as a neuroscience researcher running clinical trials and doing experiments on human subjects. And then I decided to pursue my career as a data professional in the industry. So I switched gears into the industry and did a bunch of years of consulting work until I landed at The Home Depot.

I've been at Home Depot for over four years at this point. I'm now leading the experimentation team for online. Yeah.

Ashley Stirrup: And that's a pretty big business, right?

Kim Li: It is. Many people don't know this, but Home Depot has an annual revenue of $150 billion as the enterprise. And then for the online org that I'm working with, we just surpassed $25 billion.

Ashley Stirrup: Those are both pretty massive numbers. Of course, Home Depot's a big business, but you don't realize just how big it is. So that's pretty interesting. I'm sure that makes experimentation [00:02:00] that much more important there.

Kim Li: Very fertile ground for experimentation because we're serving millions of customers across the United States and online as well. Yeah, it's just very good ground for doing large-scale experimentation.

Ashley Stirrup: Personally, I've been a very heavy shopper both online and in store and used your app a lot. Yeah, I think you've done a lot of great things there.

Kim Li: Glad to hear that.

Ashley Stirrup: Could you tell us a little bit about the experimentation model at Home Depot? Like how much of it is centralized versus decentralized?

Kim Li: Yeah. So for us it's a very centralized model. Basically, any major change you want to make to the website, homedepot.com, whether it's a content swap, a feature update, or a model change in a search or recommendation algorithm, will be tested by my team.

Ashley Stirrup: Got it. And how many different product teams are there at Home Depot? I'd imagine there's a lot.

Kim Li: Yeah, that is a pretty broad... We serve [00:03:00] probably 40 to 50 different business teams. Not only product, but we also have customers from data science and UX. If you have an idea, we will support you in testing it.

Ashley Stirrup: Got it. Got it. That's great. And, how many experiments are you running per year?

Kim Li: We're in the low hundreds right now.

Ashley Stirrup: That's great. And how do you see the program evolving over time?

Kim Li: Yeah, we're really pushing for this experimentation practice to be drastically scaled up this year and next year. I think the leadership's really looking forward to us really expanding.

Ashley Stirrup: Got it. And you've got a pretty big team working on all this, right?

Kim Li: Yeah, so I have 10 associates who are doing the analytics: setting up experimentation, making sure it's got the proper hypothesis, and doing proper analysis on it. And then we also have a sizable developer team and also automation, because we're churning out all these reports [00:04:00] for different experiments, and a lot of it has to be automated so we're not draining everybody's hours doing this one by one.

So yeah, overall, probably about 40 people on the team.

Ashley Stirrup: Yeah. And then with all those different product teams, I would imagine that it's quite a challenge just identifying good learnings, figuring out how to share those across teams, and getting visibility with leadership. How does that all work there?

Kim Li: Yeah. So each of my analysts has a very specialized area. They get to really learn the business, whether it's the top-of-the-funnel search function, the data science function, or bottom-of-the-funnel differences. So they will focus deeply on that. And for every test, they work closely with the product managers to understand what they need to get out of the test.

We help them set a proper hypothesis and make sure we're evaluating it properly. And then, at the end of the test, we'll have a readout session with all the stakeholders, [00:05:00] 'cause sometimes you also have other parties involved. Potentially marketing has an interest in this feature, or UXers have a particular say.

So they will all be invited to this public readout for about 30 minutes. And then at the end of it, we all align on our understanding. We'll send out a blast email with the final results to share across the stakeholders. But those results are also all stored in a centralized place, so anybody can go back to tests that ran, maybe in 2020, to get the learnings and see if something has already been done in the past.

Ashley Stirrup: And do you get much visibility with the executive team there?

Kim Li: Yes. And that's simply because all the executives and leaders love to learn about the tests. They truly understand that A/B testing is the golden rule for understanding incrementality, the real impact. So there's a lot of support, but also a lot of visibility, and [00:06:00] with it comes a responsibility to do good analytics.

But also, yeah, these leaders will jump into these review sessions with you and ask questions directly. They will check the library and sometimes ping our associates directly to ask about a specific test.

Ashley Stirrup: Oh, that's great. It sounds like it's a relatively humble culture where people are open to being wrong and wanting to learn and things like that. Is that safe to say?

Kim Li: We definitely try very hard to do that. We really cultivate this culture of test and learn. We test because we don't know for sure. If you know for sure... and I sometimes ask my stakeholders this question: are you gonna do something differently if the test result comes out bad?

If you're not, then let's not waste time, right? So I think it's really just a collaborative effort to try to get to what's best for our customer and the customer experience. [00:07:00]

Ashley Stirrup: Got it. And one thing I think that's interesting about Home Depot, just thinking about it as a customer, is you must have very different personas that you're trying to create a great experience for. For example, the contractor versus the DIY person, or maybe somebody who's focusing on a bathroom versus building a new fence, or whatever it might be, right?

Like very different kinds of projects and skill sets and all that. Like how do you think about that when it comes to creating great online experiences?

Kim Li: That is a great point, because we do serve both the DIYers and the pros, and they shop very differently. If you're a DIYer, you might be deciding on what lights you wanna use, and so on and so forth. But pros are very goal-oriented, project-oriented. So we do have a pro website as well. So depending on the feature, they could be tested separately as different audiences.

Ashley Stirrup: Yeah. Yeah. It is [00:08:00] interesting though, 'cause Home Depot has such an advantage because you've got so many stores and the inventory at those stores is really different. I know from my own personal shopping, oh, if I'm willing to drive another 20 minutes, they've got three other kinds of wood than the type of wood I was looking for, something like that. So in general, how do you help your teams learn from losing experiments? Particularly if you've already got some conviction that, okay, this is an area that should be a winner, and it was a loser, how do we iterate on it? How do we learn from it?

Kim Li: Yeah, that's a good question. So I think for learning, the really important thing is why this happened. And of course, the test could be set up wrong or something like that. This is the first thing to check, right? Do we have tagging in place to measure what we want to measure, all the way to when the result actually came out?

So if everything else checks out, now it's really a deep dive into the customer journey. Where did people fall off? Where did this gap [00:09:00] start to widen? And when we narrow it down to the why, then we really get to how we improve it. So that's always the most important part when a solid test goes wrong: how do we get to why, and what's the next step?

Ashley Stirrup: Got it. That makes a lot of sense. How does Home Depot think about kinda North Star metrics and guardrails and all that? Is it pretty consistent from experiment to experiment, or does it vary a lot?

Kim Li: It varies quite a bit, especially since, as a centralized team, we service so many different verticals. For a test that's trying to drive, maybe, add to cart, that will be their metric. If it's something like a banner that's supposed to attract more visitors, that could be impressions or click-through.

So it really depends on which team is doing it and what is the goal of that test.

Ashley Stirrup: And do you track, like, total ROI from experimentation and losses avoided and things like that?

Kim Li: [00:10:00] Yes. We do consider that part of our team's KPIs as well: how much learning we helped generate and how many dollars we have helped save.

Ashley Stirrup: And I know you mentioned earlier that you definitely wanna ramp up the number of experiments you're running, but how do you think about that? Is AI a big factor? Is enabling more self-service a big factor?

Kim Li: I think AI is very helpful in terms of the development speed, right? The IT org's definitely going through a lot of evolution in how they code and how fast they can get the product features ready. So that would ultimately speed up this team as well. But also in insights generation, for example, or learnings, AI can play a role there as well.

But I think for us it's a lot about unlocking capabilities. So right now we're still very much doing client-side testing, and we are developing server-side testing capabilities so everything moves faster, end to end. Once you do have a learning that [00:11:00] says this is a winner, you should be able to release it very quickly, things like that.

All of this would ultimately contribute to the better testing experience, but also faster and more numbers.

Ashley Stirrup: Terrific. You covered a lot of ground very quickly. This was a great episode. I feel like we got a great window into experimentation at Home Depot. I really appreciate you coming on the show.

Kim Li: Thank you.

Ashley Stirrup: Thank you so much.