How do product teams decide what to build and what not to? The Experimentation Edge is the podcast where product, growth, and engineering leaders share how A/B testing, feature flags, and experimentation drive real business outcomes — backed by named companies and real numbers. From DoorDash's 12,000 A/B tests a year to Atlassian's experimentation-led product win to UPS's $500M experimentation team, each episode goes deep with operators running experimentation programs at scale.
Hosted by Ashley Stirrup, CMO at GrowthBook and a 25-year executive in data and experimentation. For product managers, engineers, data scientists, and growth leaders at B2B tech companies who care about experimentation culture, statistical rigor, and shipping with confidence. No marketing speak. Just operators explaining what they shipped, what moved the needle, and how experimentation reshaped their teams.
Topics: A/B testing, experimentation, growth experimentation, product experimentation, tech experimentation, feature flags, experimentation culture, statistical significance, marketplace experimentation, conversion rate optimization, experimentation at scale.
===
[00:00:00] Welcome to the Experimentation Edge, where product managers, data scientists, and engineers talk about how they make smarter decisions. I'm Ashley Stirrup, the chief marketing officer for GrowthBook, and in each episode, I'll sit down with an executive to unpack how they use experimentation and A/B testing to make better decisions.
This show is sponsored by GrowthBook, the open source experimentation platform leader. Now let's jump in and get started with our next guest.
Ashley Stirrup - Host: Welcome, Craig Kistler, Vice President of Experience Design, Personalization, and Experimentation at Signet Jewelers. So happy to have you on the show.
Craig Kistler - Guest: Yeah. It's awesome. Thanks for having me
Ashley Stirrup - Host: You've been at Signet a long time, 14 years. Maybe we can start with you just telling us a little bit about Signet as a business.
Craig Kistler - Guest: Yeah. So Signet Jewelers is the parent company for the mall retailers in the United States. So Kay Jewelers, [00:01:00] Jared Jewelers, Zales, Peoples out of Canada, Banter, a couple others. So it's the parent company of all these brands that you see all over your TV commercials.
Ashley Stirrup - Host: Got it. I had no idea they were all under the same umbrella. That's really interesting.
Craig Kistler - Guest: Yeah
Ashley Stirrup - Host: And so that makes experimentation interesting 'cause you have a hybrid online and in-person model.
Craig Kistler - Guest: Correct. Yep
Ashley Stirrup - Host: Yeah. Before we get into all that maybe we talk a little bit about you and your background.
I love the variety of backgrounds that our guests have, and you started on the design side, so that's pretty interesting.
Craig Kistler - Guest: Yeah. Yeah, I went to art school and ended up in experimentation
Ashley Stirrup - Host: Yeah, and then you've been at Signet for 14 years now
Craig Kistler - Guest: Yep. Yeah. So I started at Signet 14 years ago as the first and only UX designer there. When they called, I hadn't heard of Signet. I thought it was just a mom-and-pop jeweler, and I'm like, "Nah, I'm not into it." And then they said who they were, and [00:02:00] it was like, oh, this is interesting. But even 14 years ago, user experience design, that was a thing, and I was the first one coming in there. So I built that team up. And then from there we brought in A/B testing, and I got my hands on that, and that felt interesting. It felt like an addition to what we were doing with creating a better experience. And then I just kinda took that over, and we've been running that standpoint.
It went from experimentation to personalization through some changes in the org. User experience and XD left my purview. It has recently come back, so now it's back where I started. So now it's user experimentation and personalization.
Ashley Stirrup - Host: Got it. And was there a particular trigger that caused you to start doing A/B testing, or was it just a natural evolution of how do you keep upping your game?
Craig Kistler - Guest: Yeah, a little bit of both. So we brought in a tool, and there was a team using it, and I kinda started dabbling with [00:03:00] it. And I guess maybe I'm just more anal or hands-on. I wasn't really a fan of what we were testing or what we were doing. It just felt frivolous. So I started playing around with it and taking it on, and then it just became a natural evolution because it fit so well together. Like, we were solving experience problems through usability testing and understanding what customers needed and designing a solution for it. And we'd get feedback, but we got feedback from, you know, the typical eight to 10 people. But I had millions of people coming to my site, so layering that into A/B testing, we could really validate: sure, it's usable, but do people care about it? And does it make a difference? So it just became a natural evolution, and then after we saw the power that it was able to bring to the business, that's kinda where my path... I just focused on that.
Ashley Stirrup - Host: And do you feel like your background in design helped you think differently about what should you be testing?
Craig Kistler - Guest: I do. I've said it a lot. I think user [00:04:00] experience and personalization and experimentation dovetail really well together because ultimately both of those things should be trying to solve the customer's challenges and problems. For us, we've got tens of thousands of jewelry products.
How do we make it easy and relevant for them to find something that they're gonna fall in love with? And the design perspective was, all right, how do you make it easy to find the perfect necklace out of 10,000? And that's taxonomy, that's findability, that's all your typical user experience tools. Bringing in A/B testing allowed us to actually test one against the other and see what made the difference. So it was actually tying data to those experiences.
Ashley Stirrup - Host: Yeah. Yeah, it makes a ton of sense. Yeah, I've always thought that there's an entire value chain, and it really should all be connected, everything from the product management to design to personalization, all of it.
Craig Kistler - Guest: For
Ashley Stirrup - Host: It makes a lot of sense. So can you tell us a little bit about how experimentation is organized at your company?
Is it all [00:05:00] centralized? Do you have, kind of-- Is it more of a distributed model?
Craig Kistler - Guest: So we are centralized. Across the board, as I mentioned, Kay, Jared, Zales, Peoples, et cetera, they all have their kind of business teams, their brand and marketing teams. They're focused on their kind of P&L for the site. Experimentation and personalization sits underneath the product org, but we're central.
So we're looking at experimentation and personalization across all those different sites, which allows us the scalability of understanding customers at broader scale. So I can take a look and see behaviors that are very similar, somebody shopping on Kay versus somebody shopping on Jared. But then where it becomes fun and interesting is the marketing, and the things that each of those shoppers needs become different. So now you can start playing around not only with a better experience, but tying in marketing and psychology and intent on those different sites. So while it might be the same [00:06:00] experience, you're tweaking it here and there based off of the needs of those individual groups.
Ashley Stirrup - Host: Makes total sense. And roughly how many experiments are you running a quarter?
Craig Kistler - Guest: It varies quite a bit. When I was starting this seven, eight years ago, we were trading in the volume game, and we would run sometimes 40, sometimes 50 experiments in a quarter, which was fine. They were smaller. They were the "Should we change the button color?
What about this headline? What about this placement on the homepage?" And what we started to see was they didn't make really big impacts. They were useful, they were helpful, but once you understood that, the velocity of those kind of slowed down. Now I would say we're somewhere in the realm of 15 to 25, depending. Not because we intentionally hit the brakes, but because we're doing more complex and, what I would say, more value-driven experiments as opposed to just the best practice of this button [00:07:00] color versus that button color.
Ashley Stirrup - Host: Can you give an example of something that's more value-driven versus something that's a little more tactical?
Craig Kistler - Guest: Yeah. Yeah, so a big-scale one would be a PDP redesign. So instead of taking, like, the add to cart button and moving it around, or adding social proof, or changing one or two things, which we've done, sometimes you need to take a look at the entire journey, the entire setup.
And we're like, "You know what? Let's just start over. If we had no rules, no restrictions, what could this page be?"
We'll design that out, and then we'll test that new version against the old version.
Ashley Stirrup - Host: Got it. That's a great example. And you've got something like 10 million visitors to your properties, is that right?
Craig Kistler - Guest: Yeah. Kay alone typically has about 10-ish million every month. Obviously that ebbs and flows.
Holidays are crazy.
Even some of our smaller sites have a few million a month. So we've definitely got volume so we can test a lot of [00:08:00] things with pretty decent certainty
Ashley Stirrup - Host: Yeah, that's great. Can you tell us about an experiment where you had a lot of learnings?
Craig Kistler - Guest: Yeah. I guess that's kind of where a lot of our experiments are trying to drive now. So I mentioned the smaller experiments. Those are great. You don't get a lot of learnings for those. The PDPs will get a lot of learnings just because there's a lot of things on the page.
But the one that-
Ashley Stirrup - Host: Can you explain that acronym for the audience?
Craig Kistler - Guest: Product detail page.
Ashley Stirrup - Host: Product detail page. Got it.
Craig Kistler - Guest: Yeah, so it's basically your product page.
So there's a lot of elements on that page that you could look at with a large experiment. But there's one that's relatively simple and straightforward, and I enjoy it because it debunks some of these best practices. We have a PLP, or a product listing page, category page, and some of the top viewed pages are the view alls.
So somebody's shopping, they're like, "I wanna view all rings. Show me all the rings or show me all the necklaces." Which in theory sounds great. It's, "Yeah, just give me everything." [00:09:00] And our site's large, we've got thousands of products. Nobody wants to shop thousands of products, but they're on that page and they're scrolling through there, and especially on mobile there's not a lot of facet usage, so people aren't narrowing down these choices. So the hypothesis we had was, all right, we know somebody's coming in and they're looking for some type of ring. They don't need all rings, but they're on the view all rings page, so what can we do to help them narrow that down? And we actually added friction. So when somebody came in and said, "View all rings," I didn't show them a single product. I blocked off that page and forced them to choose: what style of ring are you looking for? Engagement, a band, something more fashion-focused. So all the facets, I basically surfaced them, highly visible, increasing the number of clicks before they got to product. So somebody would click in there, and going from thousands, they would go to hundreds very quickly. And that was [00:10:00] interesting because the myth of three clicks and all this stuff goes out the window as long as that click is adding value.
Ashley Stirrup - Host: Yeah
Craig Kistler - Guest: So that was kinda something where we were able to talk about that through the organization, and then based on that, we were able to keep iterating on that and introducing these ideas of how do we ask questions?
How do we understand what they're looking for and help them based on what they need? Much like you would when you walk into a store. And we saw a lot of value in doing that.
Ashley Stirrup - Host: Yeah. Were you able to measure the impact of doing all that?
Craig Kistler - Guest: Absolutely. Yeah, everything we do is always measuring value. Oddly enough, we didn't see exit rates increase because of that extra click. We saw people going deeper, looking at more product, ultimately increasing conversion rates and revenue.
Because they weren't sorting through thousands of products where that first page of product main... If somebody was looking for a very specific type of [00:11:00] anniversary band to give as a gift, and they go to all rings, and we're typically surfacing popular or new, and we've got very popular engagement rings and very new fashion rings, those are all gonna bubble to the top. So on that view all page, they might feel there's nothing relevant to them. So they'd scroll through 20, 30 products and be like, "This... I guess I'm out. I'm gonna go someplace else."
Ashley Stirrup - Host: Yeah. Yeah I love that story 'cause, I think with so many of the types of experimentation people can do, it's really understanding what's the buyer's journey and how do we create a great buyer's journey? And you'd never have in the store just somebody go, "Oh, here they all are," right? They'd ask a few questions and they'd really tailor it down.
And the user might naturally be like, "Just show me everything." But that's actually not the best answer. It's what's your budget, and what's the style? And all those types of things are super important to then giving them relevant options. So that's really interesting that you [00:12:00] added friction and it ended up leading to a huge increase in revenue for the business.
Craig Kistler - Guest: Yeah. Yeah, absolutely. And like I said, it was something that we could roll into other things. We like to talk about insights a lot. It became an insight, a foundational-level insight that we can then start tacking onto other things to figure out, like, where can we go with this?
Ashley Stirrup - Host: Yeah. Yeah, that's really powerful when you can kinda take a concept and keep reusing it in
Craig Kistler - Guest: Yeah, 100%.
Ashley Stirrup - Host: Yeah. And when you talked about doing bigger designs, how do you do things differently when you're doing a major redesign versus just tweaking something?
Craig Kistler - Guest: So there's a lot of similarities, right? You still have your hypothesis, you still have the questions that you're looking to answer, your metrics are all set up. But on redesign of a page or even a navigation structure, there's so many little elements that you first have to look at, did this help me overall?
So did this major change of everything work at least as well and hopefully better than what [00:13:00] was there previously? So that's foundational for us: hey, if we're gonna go and dev this out and do all this extra work, are we at least gonna cover our bases? But then on a page like the PDP, where it's trying to do a lot of things, not just for the visitor but for the business, there's a bunch of smaller measurements that we have to take into consideration. So for us, we've got a couple different groups that have elements on that page. So obviously you've got the product, you're trying to sell the product. But we've got a financing group, and we've got our own internal credit cards. There's a business tied to that. So immediately if we're changing something, their concern is, are our credit apps going to slip? Because that impacts the business. We've got another group that is responsible for our service plans and our warranties, and obviously they've got business stakes in that. And then other teams like our virtual JCs, our chat group, all that are designed to [00:14:00] help the customer find the right product, they all have impacts as well. So immediately their red flags are like, "Hey, you're changing my thing, so how do I measure it?" So we have to then come back and say, "Okay, let's take a look at this. Let's take financing. How did financing signups or startups change in the default versus the variant?" Again, the goal is to get all the KPIs moving forward, but sometimes they don't.
And then you have to have those conversations. And then, for example, we did reduce our chat usage by a little bit. Then you have to have the conversation to say, "Okay, overall business was higher. These KPIs were higher. Let's say your change was less than 5%," and then you put that out into revenue: "This is what it means in terms of revenue." 'Cause you're always trying to balance that revenue. Yes, you want stakeholders to be happy, but ultimately we're trying to sell product and make customers happy.
Ashley Stirrup - Host: Yeah. Yeah, it's interesting. You wouldn't naturally think of a [00:15:00] business like Signet's as having all those different dimensions, but once you bring it up, it's really obvious that when you run an experiment, you're gonna have your key KPI, which I assume is, did they buy a ring?
But then you've got so many additional metrics that you need to be looking at with each experiment.
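Editor's sketch: the default-versus-variant review Craig describes, where each stakeholder KPI is compared across arms and the overall result is then expressed in revenue, might look roughly like the Python below. Every name and number here is a hypothetical stand-in chosen for illustration (the KPI names, the arm totals, and the monthly traffic figure are assumptions); none of it comes from Signet's data or tooling.

```python
def relative_change(default_value: float, variant_value: float) -> float:
    """Relative change of the variant against the default (0.05 means +5%)."""
    return (variant_value - default_value) / default_value


def scorecard(default: dict, variant: dict) -> dict:
    """Per-visitor relative change for every KPI tracked in the default arm."""
    changes = {}
    for kpi in default["totals"]:
        default_rate = default["totals"][kpi] / default["visitors"]
        variant_rate = variant["totals"][kpi] / variant["visitors"]
        changes[kpi] = relative_change(default_rate, variant_rate)
    return changes


# Hypothetical totals for a page redesign test; these are not real figures.
default_arm = {
    "visitors": 50_000,
    "totals": {"revenue": 400_000.0, "credit_apps": 500, "chat_sessions": 1_000},
}
variant_arm = {
    "visitors": 50_000,
    "totals": {"revenue": 420_000.0, "credit_apps": 510, "chat_sessions": 960},
}

changes = scorecard(default_arm, variant_arm)

# KPIs that moved backward are the ones that need a stakeholder conversation.
slipped = [kpi for kpi, change in changes.items() if change < 0]

for kpi, change in changes.items():
    print(f"{kpi:14s} {change:+.1%}")
print("needs a stakeholder conversation:", slipped)

# Express the overall result in revenue: the gain in revenue per visitor,
# scaled to an assumed monthly traffic figure.
ASSUMED_MONTHLY_VISITORS = 1_000_000
rpv_gain = (
    variant_arm["totals"]["revenue"] / variant_arm["visitors"]
    - default_arm["totals"]["revenue"] / default_arm["visitors"]
)
print(f"estimated monthly revenue impact: ${rpv_gain * ASSUMED_MONTHLY_VISITORS:,.0f}")
```

A real analysis would also need significance testing on each metric before acting on these differences; the sketch only shows the bookkeeping behind the conversation.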
And I'm betting you even start to look at slices of those depending on how you segment your customers.
Craig Kistler - Guest: 100%.
100%.
Ashley Stirrup - Host: When experiments lose, how do you try to extract the most amount of learnings possible from them?
Craig Kistler - Guest: Yeah, so I don't wanna say I enjoy when experiments lose. That just sounds like, who wants their thing to lose? But it does give you an opportunity to kinda dissect it and say, "Okay, where in this process did it go wrong?" Was the hypothesis just off the mark? Was the visitor issue that we were trying to solve not really a visitor issue?
We'll see that a lot of times where stakeholders or product teams will come in and say, "X thing needs to change because of Y reason." [00:16:00] And sometimes we won't do our due diligence and tease that out and truly figure out with data is this a problem? And we'll just say, "Yeah, okay, cool.
You know your stuff. We'll do it." Those typically fail because product had something in their mind that they assumed was the problem, and it actually wasn't the problem. So it becomes an education opportunity to say, "Okay, here's what you said was happening. Here's the data of what is actually happening, and here's why this experiment didn't work." It allows us to readjust and come back and say, "Okay, knowing the problem, here's a couple different ways we can design a solution to fix that." So we always try to tease that out. Sometimes we'll run experiments knowing or having an idea that they will lose, but depending on the level of person asking for it, you end up doing it, and again, it becomes an educational session to say, "Okay, we did this. We assumed it wasn't going to work. Here's why," and then validate that.
Ashley Stirrup - Host: Yeah. Hopefully as many of the experiments as possible [00:17:00] are helping you to learn about that customer journey and what really matters to them and what doesn't, and hopefully you're extracting insights along the way that can then inform how you think about the next feature or the next problem you're trying to address.
Craig Kistler - Guest: Yeah, 100%. We have a backlog, a library, and I think anybody in experimentation understands that trying to keep this stuff straight is a challenge in and of itself. But the teams work really hard at putting together a library that kind of pulls out, like, here's what we were trying to test, here's what we were hoping to learn, here were the results, here are the next steps. And then it becomes like a flywheel where we can always go back and look at what happened, the iterations that we've run after that, what are we gonna do next. And it's not just walled off to our experimentation team. We open that up to everyone so that there is this insight, there is this knowledge. And the team does a really good job of kind of pulling those things out as opposed to just saying, "Here's the data," and then letting [00:18:00] somebody else interpret it. It's, "Oh, here's the data and this is what it actually means." So yeah, we try really hard to educate on what we're doing and why we're doing it.
Ashley Stirrup - Host: Yeah, that makes a ton of sense. And so obviously you're looking at a ton of different metrics whenever you're running an experiment, and I'm sure just the "did they buy or not" is probably your north star. But how do you think about all the other metrics you're looking at and which ones rank more highly for you?
Craig Kistler - Guest: I would say our north star metric isn't conversion, it's revenue. So I'm looking at revenue per visitor. Conversion's there, right? So we're always gonna look at conversion. Did they order? Others, like the add to carts and everything, we look at those, but conversion and revenue to me are the two most critical, with revenue being most important. Mainly because I can change conversion rate relatively easily. I use this as an example all the time: I can guarantee you 100% conversion rate, [00:19:00] and I guarantee you'll be out of business in a week, because I'm gonna make everything free. There's no reason for somebody not to convert, right?
Ashley Stirrup - Host: Right.
Craig Kistler - Guest: Conversion's able to be manipulated. Or just stop looking at bad traffic. Social typically bounces a lot, so if I just take that all out, my conversion is gonna double. That doesn't mean anything if my revenue number's not going up, right? So if I can get revenue per visit going up and I have one metric to look at, that's what I'm gonna look at.
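Editor's sketch: the gap between conversion rate and revenue per visitor that Craig describes can be worked through with a few lines of Python. The three arms below are hypothetical numbers chosen only to illustrate the point (a control, a deep-discount variant, and the "make everything free" extreme); they are not results from any real experiment.

```python
from dataclasses import dataclass


@dataclass
class Arm:
    """Aggregate results for one experiment arm."""
    visitors: int
    orders: int
    revenue: float

    @property
    def conversion_rate(self) -> float:
        return self.orders / self.visitors

    @property
    def revenue_per_visitor(self) -> float:
        return self.revenue / self.visitors


def relative_lift(variant: float, control: float) -> float:
    """Relative change of the variant against the control (0.5 means +50%)."""
    return (variant - control) / control


# Hypothetical arms, each with the same amount of traffic.
control = Arm(visitors=100_000, orders=2_000, revenue=800_000.0)
discounted = Arm(visitors=100_000, orders=3_000, revenue=600_000.0)
everything_free = Arm(visitors=100_000, orders=100_000, revenue=0.0)

for name, arm in [("control", control),
                  ("discounted", discounted),
                  ("everything free", everything_free)]:
    print(f"{name:16s} conversion={arm.conversion_rate:.1%} "
          f"revenue per visitor=${arm.revenue_per_visitor:.2f}")

# The discounted arm wins on conversion and loses on revenue per visitor.
cr_lift = relative_lift(discounted.conversion_rate, control.conversion_rate)
rpv_lift = relative_lift(discounted.revenue_per_visitor, control.revenue_per_visitor)
print(f"discounted vs control: conversion {cr_lift:+.0%}, revenue per visitor {rpv_lift:+.0%}")
```

With these made-up inputs, the variant that looks best on conversion is the one that earns the least per visitor, which is the reason given in the conversation for ranking revenue above conversion.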
Ashley Stirrup - Host: Yeah. And are there kind of key ways in which you're segmenting the business? Repeat customers versus new customers, things like that?
Craig Kistler - Guest: Short answer, yes. Repeat versus new, eh. It's fine. I don't necessarily look at it too much. For us, when we started this whole personalization idea and testing into personalization, looking at the site, there were two distinct groups: those who are shopping for engagement rings [00:20:00] and those who aren't.
And we started with that at a very high macro level and started to run these experiments to say, "If I know somebody's shopping for an engagement ring, let's make the experience all about engagement. And if I know they're already married and they don't need an engagement ring, don't show them the engagement messages." And, I don't think surprising to anybody, those two messages geared towards the right audience made a ton of difference.
Ashley Stirrup - Host: Yeah
Craig Kistler - Guest: But as a business, we're like, "It's engagement season, so we're gonna push engagement." Which makes sense, like that's the business initiative, but it doesn't make sense for somebody who's been married for 25 years. They're like, "I don't need this." So just making those two things
Ashley Stirrup - Host: Yeah, is there an engagement season?
Craig Kistler - Guest: It depends. Marketing thinks there is. So there's two times, it's usually like spring and then around the holidays, where you're more focused on talking about engagement, 'cause that's when either the weddings are happening or you're getting ready to propose. Those are big marketing initiatives. [00:21:00] But to me, engagement season is much like birthdays. It's happening every single day.
Ashley Stirrup - Host: Yeah, right. Uh, how do you see experimentation evolving at your company?
Craig Kistler - Guest: For us we're definitely leaning more into the idea of intent-based personalization and then testing that against non-intent based. So I just gave the example of engagement versus fashion. Two big macro buckets. But if we think about those, there's so much nuance in each one of them that it gives you a lot of opportunity to explore different intents.
For example, say somebody's coming and they're just starting their engagement journey. From when they start to when they purchase could be six months. They're educating themselves: "This is a major purchase. Do I wanna get engaged? I don't know. What type of ring?" So forth and so on. So pushing a sale message or something right at the beginning doesn't make sense. Compare that to somebody who's buying something for themselves. They know what they like. It's gonna be a faster shop. So leaning into some of these things [00:22:00] where they're showing different signs of intent, like what level of intent are they at? And then even when they have high intent, yeah, they're gonna buy an engagement ring, they're gonna buy a gift or something for themselves, understanding how they're traveling through the site to change the experience.
Again, going back to the overall journey, the experience that customer's having. If they're showing signs of struggling, like maybe they fell on that view all page and they can't find the right style of engagement ring. How do we then test ways to intercept them when they're struggling and give them a prompt to say, "Hey, we can help you.
What do you... what type of metal or what style?" Or maybe they get a quiz or one of those things. The journey might be really long and very different between fashion and engagement, but if we understand the moments when they're starting to struggle or moments when they're like, "Hey, I know what I want, just get everything outta my way," testing those types of ideas and those types of levers we've seen become really powerful, and there's just so much [00:23:00] room and white space in there to play around with.
Ashley Stirrup - Host: Yeah, that sounds really interesting. How do you identify the people that are like, "Get out of my way, I know what I want"?
Craig Kistler - Guest: Some of it comes from channel. Some of it comes from the behaviors, like their paths are very clean, very straight. We actually use a tool that helps us with that, Made With Intent, which allows us in session to pull in additional data and make some predictions. So it kinda puts our intent levels on steroids. But there's plenty of ways to do that where you're just understanding what customers are doing or what they're not doing. If they're pogo-sticking, right? If they're going back and forth and back and forth. Some of that's gonna be needed and necessary, especially if you're building your consideration set. But then sometimes, if they're on a single PDP and they're scrolling up and down, they're looking for something, these are signals that they're struggling to find some sort of answer. And then the challenge is figuring out what they're asking themselves and what they need answers to, and then [00:24:00] pulling that into some sort of experience.
Ashley Stirrup - Host: Yeah. And to kind of wrap up the episode, one of the things that you can imagine with a business like yours is just, like, how do you bring everybody along? You've got all these different stakeholders and different business units caring about different things, and it can be hard to unlock all the insight that you're gathering in experimentation and share that with the company.
So how do you try to get that across?
Craig Kistler - Guest: So we do it a few different ways because we have different levels. So we'll start talking from CMO level down, and then we'll start talking from the person making the changes on the website up. Hopefully, we all meet in the middle. But yeah, it's a constant education. What we do, anytime we run an experiment, we send out an experiment plan that outlines exactly what we're doing, what the hypothesis is, what we're measuring, what questions we're looking to answer, with the visuals. So this goes out to the entire digital group, [00:25:00] and then we close that loop after the experiment completes and we've run our analysis. Same group gets the results. Win, lose, or draw, they get the results with exactly what the metrics said, answers to the questions we asked, and then next steps. So that's kinda that ground level.
So we'll send that out, and then that goes into our experiment library. And then once a month at the executive level, we'll have kind of touch bases. So obviously at that level you can't get down into the nitty-gritty. So I'll bubble all these up into kinda these three or four key insights for the month, outline them, and then all the insights have dollars and cents tied to them. Knowing that at that level they care about, are we making money? Yeah, the insights are great. I trust the teams to do those, but for that to have meaning at that level, I tie dollars and cents to everything that we're doing so that they can say, "Oh, I get it. [00:26:00] This is now an incremental bump by doing this key tactic. Let's keep pushing on that tactic." So they're gonna push it down through their teams.
Ashley Stirrup - Host: Yeah. I think that's such a great summary of a strategy for making sure you're bringing everybody along. And I love the fact that you're tying revenue to the different ideas, 'cause so often, you might get a win somewhere, but is that really gonna move the business or not?
And so if you're able to connect it to revenue, that's really powerful.
Craig Kistler - Guest: Yeah. Yeah, it's the only way that we've seen. 'Cause like you said, everything else, while it has data, their question is like, "Okay, so what? So do these other 12 priorities. Why do I care about yours?"
Ashley Stirrup - Host: Yeah. How broad is that particular insight? Is it gonna affect all our customers or just a small percentage?
Craig Kistler - Guest: Yep. Yeah, totally
Ashley Stirrup - Host: Thank you so much for joining today's episode. I feel like we learned a ton, and it's such an interesting business. Really appreciate you coming on the show.
Craig Kistler - Guest: Yeah, totally appreciate it. This was fun
Ashley Stirrup - Host: Thank you
[00:27:00]