Explore the evolving world of application delivery and security. Each episode will dive into technologies shaping the future of operations, analyze emerging trends, and discuss the impacts of innovations on the tech stack.
00:00:05:03 - 00:00:19:12
Lori MacVittie
Welcome back to Pop Goes the Stack, the podcast that laughs in the face of seamless integration. I'm Lori MacVittie, your host. And the spoiler, the seams always show. We're here today with our co-host, Joel Moses.
00:00:19:17 - 00:00:21:10
Joel Moses
Hi, Lori. Good to see you.
00:00:21:12 - 00:00:48:00
Lori MacVittie
Good to see you. And actually, we do see each other. It's really
Joel Moses
That's right.
Lori MacVittie
nice. We're going to start today a series that's talking about some of the impacts to application delivery from inferencing workloads. Because they're a real workload and they're different. And today we want to focus on availability--what that means. Because of course app delivery is about performance, availability, and reliability. Right,
00:00:48:00 - 00:00:59:13
Lori MacVittie
there's that acronym. I won't say it, but it's there. So we wanted to talk about each of these in depth. And to do that we brought Ken Salchow today to talk about availability. Hi Ken.
00:00:59:13 - 00:01:08:09
Ken Salchow
Well thank you, I am honored to be here. I'm actually
Lori MacVittie
Oh goodness.
Ken Salchow
I'm actually humbled to think that you believe I still have value.
00:01:08:11 - 00:01:16:17
Lori MacVittie
You know, for historical purposes you always go for...Okay, we're
00:01:16:19 - 00:01:18:11
Ken Salchow
Did you just call me old?
00:01:18:18 - 00:01:19:07
Lori MacVittie
I was trying not to.
00:01:19:09 - 00:01:20:01
Ken Salchow
I think you just called me old.
00:01:20:03 - 00:01:47:18
Lori MacVittie
I was trying not to. We're talking about availability today within the context of application delivery.
Joel Moses
Yeah.
Lori MacVittie
Cause this is one of those pieces where new types of workloads change how we view availability. And inference is a new workload type. We discussed that elsewhere. Let's just all agree it is and things need to change. Agree, that's it. So in the past, and Ken will definitely dive into this,
00:01:47:18 - 00:01:49:28
Lori MacVittie
availability was binary.
00:01:50:01 - 00:01:52:21
Joel Moses
Ken is the holder of history related to availability.
00:01:52:25 - 00:01:56:24
Lori MacVittie
He absolu-, he is
Ken Salchow
Well, I
Lori MacVittie
he is the professor.
00:01:56:27 - 00:02:14:28
Ken Salchow
Yeah, I lived it right. Like I was a customer of F5 before I joined F5. And
Joel Moses
Same here.
Ken Salchow
that was back in the era when we were building out some of the first e-commerce sites. And availability back then, you know, in the long before time when dinosaurs roamed the Earth, yeah, it was totally:
00:02:14:29 - 00:02:39:04
Ken Salchow
was the server on or off? And as we started to grapple with that, then it was, does it actually function? Does it have a TCP/IP stack that's operating, that's responding? Then it evolved to the application--was the application actually capable of functioning and returning what we expected? And I think, to your point, Lori, it was deterministically expected,
00:02:39:07 - 00:02:58:26
Ken Salchow
we knew what the response was supposed to be, so we could verify that.
Joel Moses
Yeah.
Ken Salchow
But then it started to change. It started to morph a little bit. Because now that we could ensure that these systems were operational and responding, then the question came about, you know, well what do we consider available? Is a 30-second response still available?
00:02:58:26 - 00:03:14:29
Ken Salchow
Is a minute response still available? How do we define that? And I think that's what you're kind of getting at here, which is that the definition of these things changes as the technology changes. As we improve and cover off the minor things, new things come down the road.
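The health-check eras Ken walks through can be sketched in a few lines of Python. This is a minimal illustration, not any particular product's check; the host and port are whatever you point it at:

```python
import socket

def tcp_health_check(host: str, port: int, timeout: float = 2.0) -> bool:
    """Early-era check: is there a TCP/IP stack responding at all?
    A completed handshake counts as 'available'; it says nothing
    about whether the application behind it actually functions."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

The later era Ken describes would layer an application-level probe on top of this, comparing the response body against the deterministically expected answer.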
00:03:15:01 - 00:03:26:28
Lori MacVittie
Does the definition change or are we just having to change what we measure? And what we measure against in order to determine availability?
00:03:27:00 - 00:03:33:02
Joel Moses
Yeah, and has that definition changed in the era of AI applications and AI inferencing?
00:03:33:04 - 00:04:03:20
Ken Salchow
I think the definition changes. It's what we consider to be available. If somebody is there but they're not responding, I guess, I mean we were joking earlier about being emotionally there. But I mean, I think it's true. I mean, availability, you can go in and see somebody and have a meeting with them every day of the week, but if they're not listening to you, if they're not hearing you, if they're not paying attention to you, are they really available to you or not?
00:04:03:22 - 00:04:10:24
Ken Salchow
And I think that's the definition of how we see availability. And I think it's actually a business decision.
00:04:10:26 - 00:04:31:28
Joel Moses
Yeah. That's the first time I've ever felt like AI inferencing systems are just like me, sometimes online but emotionally unavailable. You know, it is true. I think the old definition is: is the system up and functioning or responding? And the new definition is: the system is up and responding and isn't generating things like a fridge magnet haiku.
00:04:31:29 - 00:04:53:03
Joel Moses
Right? That it's actually generating things that are relevant. And so availability in the era of AI is not only about is it up and is it responding, but also is it still generating things that are useful and have utility to the users that it's responding to?
Lori MacVittie
Yeah, your
Joel Moses
Which is a different measure altogether.
00:04:53:05 - 00:05:07:09
Lori MacVittie
Your earlier example of it responding, which from a purely, like, is it sending text back, is it actually still flowing, right. When you ask it a question about how to, you know, what was it, how to fix a printer jam?
00:05:07:15 - 00:05:28:00
Joel Moses
Yeah, yeah.
Lori MacVittie
What did it respond?
Joel Moses
I still have the screenshot of this. A friend had a long conversation going with the chat bot. And over time, you know, the longer the conversation gets, the more the AI attention tends to drift a little bit. And so he asked the question, you know, he just shifted context and he said, "how do you fix a printer jam?"
00:05:28:00 - 00:05:47:28
Joel Moses
And the AI system responded, "learn to deal with uncertainty." Which I suppose is a technically correct way of responding to that, but it is by no means relevant to the task at hand. So again, it's one of those things where the measure of a system being available has a lot to do with the utility of the system.
00:05:47:28 - 00:05:59:06
Joel Moses
Is it still useful or producing useful output? And that's a different measure. So availability is not just response time, it's now response quality.
00:05:59:08 - 00:06:02:03
Lori MacVittie
It has
Joel Moses
The argument could be made.
Lori MacVittie
to be correctness.
00:06:02:05 - 00:06:24:21
Ken Salchow
Yeah. So, and I said it was also a business decision. So, I mean first we have to figure out how do we measure, how do we quantify these things. I love metrics. I'm a big data guy. I love data. And sometimes we can't measure exactly what we want, but we can measure close things and use those as proxies.
00:06:24:23 - 00:06:45:08
Ken Salchow
But first, you know, we have to know how to measure this idea of availability in a new era. And then it does come down to a business decision. What is the business willing to allow? How off the mark are we going to allow the chat bot to get before we're like, yeah, no, he needs to go in a timeout?
00:06:45:10 - 00:06:46:10
Joel Moses
Yeah.
Lori MacVittie
Well, I think there's two
00:06:46:10 - 00:06:47:27
Ken Salchow
Or she.
00:06:48:00 - 00:06:50:21
Lori MacVittie
Thank you, Ken. Thank you.
00:06:50:21 - 00:06:51:29
Ken Salchow
You're welcome, Lori.
00:06:52:01 - 00:07:12:28
Lori MacVittie
Yeah. There's, I think there's two things to availability. And you brought it up earlier, right: we also measure for performance. If something is really slow, we don't necessarily consider it available anymore, because the business has said, "30 seconds is too long to wait." So we're going to define performance, right, to be: it's got to be less than that.
00:07:13:00 - 00:07:40:27
Lori MacVittie
And even that measure is something we can get, right. We routinely measure time to first byte, time to last byte, right, calculate the latency, average it. Voila. We know what to do. But even that changes with inferencing, because the first byte is not a response. It's a: thinking, got it, I'll be right back. It's not the actual response. That comes later.
00:07:41:00 - 00:07:50:15
Lori MacVittie
So we even have to adjust the things that we do know how to measure, the very hard statistics, if you will, and factor that in as well.
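As a rough sketch of that adjustment, the example below measures time to first token separately from total completion time for a streaming response. The stream and timings here are simulated; real inference APIs differ:

```python
import time

def measure_stream_latency(token_stream):
    """Record time to first token (TTFT) and total completion time.
    With inference, the first byte back is often just 'thinking...',
    so TTFT alone can make a struggling system look healthy."""
    start = time.monotonic()
    ttft = None
    count = 0
    for _ in token_stream:
        if ttft is None:
            ttft = time.monotonic() - start  # first token arrived
        count += 1
    return {"ttft_s": ttft, "total_s": time.monotonic() - start, "tokens": count}

def fake_stream():
    yield "Thinking..."           # arrives quickly, but is not the answer
    time.sleep(0.05)              # simulated generation delay
    yield "The actual response."

metrics = measure_stream_latency(fake_stream())
```

The point of tracking both numbers is exactly the one above: the fast first byte is an acknowledgment, not the response.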
00:07:50:17 - 00:08:23:17
Joel Moses
Right. And sometimes, you know, the measure of availability is actually just the human perception of availability. You know, a lot of systems will display a "please wait" or an "I'm thinking" page. And systems like that may take equally as long as a system that doesn't reply at all until it's got a response ready to go. But the human will always favor the system that gave them some feedback first, even though it was fairly meaningless, over the system that only returned something later, even though the time is exactly the same.
00:08:23:17 - 00:08:51:06
Joel Moses
So, the perception is a large part of availability in some applications. You know, I also think about, you know, Ken you mentioned that it also has a set of business qualifications or business acceptance of availability tolerances. Time is one of them. But also, I would think that in the era of AI, liability is another. The utility of things that come out of, say, a learning system.
00:08:51:06 - 00:09:12:18
Joel Moses
You have plenty of experience with learning systems. If the learning system is going off the rails and giving out wrong information, the liability to the business for that could be extreme. So you almost find yourself duty bound to make sure that these systems produce useful output that at least has some semblance of accuracy, right?
00:09:12:20 - 00:09:34:20
Ken Salchow
Yeah, I think that's absolutely part of that conversation about what it responds with and what the business ramifications are. I mean, delay: not available. Delay too long: unavailable. Wrong information. And yes, there's business impact and then there's liability.
Joel Moses
Right.
Ken Salchow
Although from a business standpoint, liability is just a business impact, right.
00:09:34:23 - 00:10:01:09
Ken Salchow
I mean if you do the normal risk assessment, it's how much does it cost you to fix it versus how much does it cost to suffer it.
Joel Moses
True
Ken Salchow
And it's just a mathematical equation. So, if the liability is limited, you kind of accept it.
Joel Moses
Right.
Ken Salchow
If it's big, you can mitigate it, you can get insurance. I don't know, has anybody seen any AI liability insurance out there yet?
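Ken's "mathematical equation" is the classic expected-loss comparison. A toy sketch, with made-up numbers purely for illustration:

```python
def mitigate_or_accept(p_incident: float, cost_per_incident: float,
                       mitigation_cost: float) -> str:
    """Compare expected loss (probability x impact) against the cost
    of fixing it: mitigate when fixing is cheaper than suffering."""
    expected_loss = p_incident * cost_per_incident
    return "mitigate" if mitigation_cost < expected_loss else "accept"

# A 10% chance of a $1M liability makes a $50k mitigation worthwhile.
decision = mitigate_or_accept(0.10, 1_000_000, 50_000)
```

If the liability is limited, the same arithmetic says to accept it, which is exactly the trade-off described above.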
00:10:01:11 - 00:10:08:03
Joel Moses
Not, per se.
Lori MacVittie
No, not yet.
Joel Moses
I'm sure someone in the insurance industry is working on that.
00:10:08:05 - 00:10:08:14
Ken Salchow
Yeah.
00:10:08:21 - 00:10:30:12
Lori MacVittie
Probably rolled up into the cyber security insurance at this point because we lump things like, you know, bias and hallucinations into security. So I'm going to bet it's going to roll into that insurance because, right, we treat it like a security issue. I don't know--risk, security--we have weird categorizations for things sometimes.
00:10:30:15 - 00:10:51:07
Joel Moses
Yeah. You know, the historical context that Ken has provided has kind of made me think. If the definition of availability over time--our understanding of how to deliver an application and make sure that it's constantly available--if it changes over time, in the era of AI it seems to be changing again, and I know it's not done changing. So what do you think is next?
00:10:51:09 - 00:11:01:14
Joel Moses
What's next for the definition of availability? Is it, you know, is it just accuracy? Is it just reliability?
00:11:01:16 - 00:11:42:27
Ken Salchow
Well, before we address that, I think one of the other things that we have to think about from an availability standpoint is the proliferation of access to AI in general. Right? So from an availability standpoint, is a system available if it's only accessible via a computer versus a phone, rather than embedded into an object or embedded into, you know, another system that you're using? That's another aspect of availability that doesn't change how we treat the inference engine, but it does change how we respond to it. If you're sitting on a computer, again going back to your impressions, Joel, if you're sitting at
00:11:42:27 - 00:12:04:23
Ken Salchow
a computer, I think we almost tend to expect it to take a little longer sometimes, maybe.
Joel Moses
Yeah. Right.
Ken Salchow
When we're on our phone we expect instantaneous. If it's embedded into our eyeglasses or a device that we're using to navigate the streets, we need it to be instantaneous. And so how we access the system also changes the availability equation, I think.
00:12:04:25 - 00:12:09:13
Joel Moses
Yeah.
Ken Salchow
And then as far as the future, I mean, I don't know, I mean.
00:12:09:15 - 00:12:30:28
Joel Moses
I think it will change again as people experiment more with agentic AI. Agentic AI, of course, provides a set of tools an AI system can use. And if the tools are unavailable for that AI, then the AI lacks certain capabilities. Its capabilities would be limited. And so I think there's maybe a second stage to the availability measure for an AI system.
00:12:31:00 - 00:12:58:27
Lori MacVittie
I don't want to lose Ken's point there about the client, right--how things are accessed. Because one of the other things I see inferencing doing is shifting a lot of application delivery to focus as much on, like, the client side and the client payload, because that carries a lot of information, a lot of policy, and a lot of clues about how much time it's going to take to process, which impacts availability.
00:12:58:27 - 00:13:23:16
Lori MacVittie
So, I think that's the other thing it's doing to availability: shifting it to where we have to consider more variables in more places. And then there's this idea of, I don't know, what would you call that? Like, you know, client-aware availability. The time is different. That's something new that people might have to deal with.
00:13:23:19 - 00:13:43:02
Ken Salchow
Actually it's not all that new, Lori. If you think back to the days when we were still on dial-up modems and started getting ISDN lines and stuff, being client aware was a big decision factor on whether you applied compression and how you delivered stuff. So those factors don't change.
00:13:43:02 - 00:14:09:07
Ken Salchow
I think they've broadened. And Joel, you're right about the whole back-end agentic AI piece, having the back-end services that are feeding that. But again, to me, it's not dissimilar to the things we've done before, right. I mean, when we test an application for availability, eventually we started doing things like running scripts that allowed us to verify the database was connected on the back side or that other services were available.
00:14:09:07 - 00:14:21:26
Ken Salchow
So it's, I think Lori's blog post is the big piece here. The big difference here is that we used to know what a response should be.
00:14:21:28 - 00:14:29:00
Joel Moses
Now, it's something that we have to characterize rather than match.
00:14:29:03 - 00:14:50:02
Ken Salchow
Yeah. And it's very simple. I mean, I know we've all done this, I'm sure, but you take three or four different chat LLMs and you ask them the same question, and you ask the exact same question multiple times, and the response is different.
Joel Moses
Yeah.
Ken Salchow
In fact, I have trouble getting it not to change the response even when we agree on things.
00:14:50:04 - 00:15:16:07
Lori MacVittie
And, well that becomes a problem at least, right, from the perspective of okay, we do app delivery and we assume that we have all the tools right in one little package to measure availability. And doing something like checking the semantic veracity or emotional awareness of our LLM
Joel Moses
Emotional awareness.
Lori MacVittie
requires a lot more. Yeah, I know. A lot more,
00:15:16:09 - 00:15:17:10
Lori MacVittie
I don't know, resources, time
00:15:17:12 - 00:15:22:00
Joel Moses
You want your LLM to be there for you spiritually as well.
00:15:22:03 - 00:15:26:11
Lori MacVittie
Oh, we've crossed the line. We need to,
00:15:26:11 - 00:15:27:02
Joel Moses
Indeed.
00:15:27:06 - 00:15:52:03
Lori MacVittie
We need to do something; more coffee. Right, but it's not necessarily going to be a package again. And I'm sure Ken will remind us that in the early days, right, as he mentioned, scripts and external systems were often required in order to feed that information to systems that made decisions about load balancing and traffic steering, because they didn't have the information.
00:15:52:03 - 00:15:54:20
Lori MacVittie
And we may be right back there right now.
Joel Moses
Yeah.
00:15:54:22 - 00:16:04:21
Joel Moses
Yeah. Synthetic transactions were always the very best of the health checks because they exercise the full system, posting things and doing a full transaction. And yeah, Ken's right that
00:16:04:27 - 00:16:06:08
Lori MacVittie
Observer syndrome.
00:16:06:10 - 00:16:28:16
Joel Moses
Well, this is an extension of things that we've known for a long time. Now, it's not just a synthetic transaction. It's also gauging the context of the things that come back.
Lori MacVittie
Yeah.
Joel Moses
And we're not matching it. It's not, okay, did this record apply? It's literally: as I'm continuously querying the system, is this output drifting too far away from what I consider a baseline?
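Characterizing rather than matching might look something like the sketch below. Real deployments would use embedding models to score semantic similarity; this stand-in uses simple bag-of-words cosine similarity, and the threshold is an arbitrary assumption:

```python
import math
from collections import Counter

def cosine_sim(a: str, b: str) -> float:
    """Bag-of-words cosine similarity (a crude stand-in for embeddings)."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    na = math.sqrt(sum(v * v for v in va.values()))
    nb = math.sqrt(sum(v * v for v in vb.values()))
    return dot / (na * nb) if na and nb else 0.0

def has_drifted(response: str, baseline_responses: list[str],
                threshold: float = 0.3) -> bool:
    """Flag a response whose best similarity to any known-good
    baseline answer falls below the tolerance the business set."""
    best = max(cosine_sim(response, b) for b in baseline_responses)
    return best < threshold
```

A response like "learn to deal with uncertainty" scores near zero against a printer-jam baseline and would send the bot to the timeout Ken mentioned.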
00:16:28:18 - 00:16:47:05
Joel Moses
It strikes me that something like an AI red teaming tool would actually be really useful to join with systems that are trying to manage the availability and the veracity or the utility of some of these LLMs. And so, I hope we take a look at that.
00:16:47:07 - 00:17:12:24
Ken Salchow
Yeah. I was going to say obviously the answer is to have an antagonistic AI evaluate the output. But this brings up another thing that was in that paper that I shared with Lori, that made her laugh a bit ago. You talked about how synthetic transactions are great, but they also add significant load, and they affect the availability and the performance of the system.
00:17:12:27 - 00:17:36:29
Ken Salchow
And so now we have that juxtaposition, while we're trying to ensure availability, we're actually affecting potentially availability. And this is actually another conversation I was having with someone the other day about hardware. I don't know how many times in my 25, 26 years at F5 that I've heard that hardware is dead, that nobody needs hardware.
00:17:37:01 - 00:18:01:04
Ken Salchow
And every time they seem to be wrong, but every time things change it comes up. And this is the problem, right, which is we create new problems. Actually, I take that back. They're the same problems; we create new instances of them or new renditions of them. And a lot of times the first response really is apply more hardware.
00:18:01:04 - 00:18:10:09
Ken Salchow
I mean, that's why we're getting all the data centers that are growing up for AI, right? Like the answer, the initial answer is always a bigger hammer.
00:18:10:12 - 00:18:36:13
Lori MacVittie
That's
Joel Moses
Well, to be fair, technology is also a pendulum swing. We move to certain things based on certain attributes, and then we remember fondly how things used to be that we moved away from, and we begin swinging back towards those. It's just kind of a natural reaction to change. And I think that we're kind of in the middle of that. Now, the interest in hardware is not just a fondness of remembering.
00:18:36:13 - 00:18:56:24
Joel Moses
It's actually we need math done at extremely high rates for this particular AI task to work. So the hardware is definitely applicable there. But, you know, there's a reason--as an ADC company--we haven't divested ourselves of our hardware business and that's that software eats the world, but it eats a lot faster on capably designed hardware.
00:18:56:27 - 00:19:09:27
Lori MacVittie
Well, that pendulum swung way away from the conversation.
Joel Moses
That's true, that's true.
Lori MacVittie
Let's swing the pendulum back to, let's, what are the key learnings? What are the takeaways?
Joel Moses
AI inferencing. Artif-
00:19:09:27 - 00:19:28:13
Lori MacVittie
I can't.
Joel Moses
You know, I've got a takeaway from this. The takeaway is essentially, you know, left to their own devices, AI inferencing systems, if you use them over a long period of time in a single session, their attention will drift and they'll be a little bit like a toddler that you've given too much sugar. They're going to talk fast,
00:19:28:13 - 00:19:51:22
Joel Moses
they're going to say nonsense, and they're going to be extremely sure of themselves for no good reason whatsoever. And we have to devise strategies to monitor and maintain the system so that it's not only available continuously, but also useful, or has utility, to the users. Availability has two halves. One is: is it online? And the other one is: is it emotionally available?
00:19:51:25 - 00:19:54:21
Joel Moses
I think that that's what I've learned from this.
00:19:54:23 - 00:20:00:25
Lori MacVittie
Ken, what would you leave people with about availability and, you know, what they should be thinking about?
00:20:00:27 - 00:20:20:13
Ken Salchow
I mean, I think my role here was simply just to remind people that this is not new. This is another comment I was making to someone the other day. Secure, fast, and available; that's what we've always focused on and it's so fundamental that no matter where that pendulum swings, it applies.
00:20:20:13 - 00:20:43:27
Ken Salchow
It's just we come up with new iterations of it. We come up with new challenges. I look at AI. To me, in its, you know, most basic state, it's just another application. Yeah, it's got some differences. It operates differently, and obviously we're exploring some of those differences. But at the end of the day, it's all just the same fundamental thing.
00:20:43:27 - 00:21:01:03
Ken Salchow
Our job is to make these applications available to our customers or the people that use them, and make it so that it's an enjoyable, friction-free experience. The fundamental problems don't change. The challenges associated with them change.
Joel Moses
Yeah.
00:21:01:06 - 00:21:25:22
Lori MacVittie
Nice. Nice. And I guess from, you know, the perspective of an enterprise that's trying to figure out what it means for them: right, you're still going to need to watch for availability. That's still going to be a key measure, a KPI, whatever you want to call it. But what that actually measures, the metrics you might gather, and the ways that that's accomplished are going to change
00:21:25:22 - 00:21:45:13
Lori MacVittie
if you are going to be running inferencing and you want to keep it fast and secure and available. So, those kind of changes are coming. But it's not, to Ken's point, going to be radically different than what you've done before. It's just what you're measuring is going to change and maybe that definition. But otherwise, job's still the same.
00:21:45:15 - 00:21:48:02
Lori MacVittie
Keep things running. Right?
00:21:48:04 - 00:21:49:03
Joel Moses
Just keep moving.
00:21:49:08 - 00:21:51:08
Lori MacVittie
That's right. Just keep swimming, just keep swimming.
00:21:51:08 - 00:21:53:18
Ken Salchow
Just keep.
00:21:53:21 - 00:22:09:05
Joel Moses
Exactly.
Lori MacVittie
Keep going. Just keep going. All right. Well, thanks Ken, Joel. That's a wrap this week for Pop Goes the Stack. So, hey, subscribe so you're first in line when the next shiny thing starts throwing exceptions. Because we'll be here to tell you about it.