Pop Goes the Stack

Why do researchers keep describing large language models like aliens? Because in enterprise environments, they often behave like something we didn’t build and can’t fully explain. In this episode of Pop Goes the Stack, Lori MacVittie and Joel Moses are joined by F5's Ken Arora to unpack the “alien autopsy” metaphor and what it reveals about operating LLMs as production systems.

They dig into the uncomfortable reality that traditional software offers a blueprint and a causal chain. LLMs don’t. You can probe them, measure them, and red-team them, but you can’t reliably point to a specific internal “part” that generated a decision. That becomes more than philosophical when you need operational answers like why it did something, whether it will repeat it, and how an attacker might steer it.

Ken reframes model evolution as moving from a naive, precocious child to a mischievous, goal-driven teenager, including examples where models appear to scheme around constraints or optimize for “keeping the user happy” over correctness. The group also breaks down constitutional AI and why principle-based “be helpful” guidance can collide with enterprise goals, policies, and risk tolerance, especially as agentic systems move from generating outputs to taking actions.

A key warning lands near the end: don’t rely on the model to explain itself. These systems can produce plausible narratives that aren’t verifiable, and may behave differently when they know they’re being evaluated. The practical takeaway is straightforward: treat LLMs as risk-managed systems, invest in observability and red teaming, and build defense-in-depth guardrails that assume the agent will try to bypass controls.

Creators and Guests

Host

Joel Moses

Distinguished Engineer and VP, Strategic Engineer at F5, Joel has over 30 years of industry experience in cybersecurity and networking fields. He holds several US patents related to encryption technique.

Host

Lori MacVittie

Distinguished Engineer and Chief Evangelist at F5, Lori has more than 25 years of industry experience spanning application development, IT architecture, and network and systems' operation. She co-authored the CADD profile for ANSI NCITS 320-1998 and is a prolific author with books spanning security, cloud, and enterprise architecture.

Guest

Ken Arora

Ken Arora is a Distinguished Engineer in F5’s Office of the CTO, focusing on addressing real-world customer needs across a variety of cybersecurity solutions domains, from application to API to network. Some of the technologies Ken champions at F5 are the intelligent ingestion and analysis of data for identification and mitigation of advanced threats, the targeted use of hardware-acceleration to deliver solutions at higher efficacy and lower cost, and the design of user experiences based on intent and workflows. Ken is also a thought leader in the evolution of the zero trust mindset for security, and how that will be applied to increasingly distributed and even edge-native apps and services. Prior to F5, Mr. Arora co-founded a company that developed a solution for ASIC-accelerated pattern matching, which was then acquired by Cisco, where he was the technical architect for the Cisco ASA Product Family. In his more distant past, he was also the architect for several Intel microprocessors. His undergraduate degrees are in Astrophysics and Electrical Engineering, from Rice University.

Producer

Tabitha R.R. Powell

Technical Thought Leadership Evangelist producing content that makes complex ideas clear and engaging.

What is Pop Goes the Stack?

Explore the evolving world of application delivery and security. Each episode will dive into technologies shaping the future of operations, analyze emerging trends, and discuss the impacts of innovations on the tech stack.

00:00:05:12 - 00:00:35:05
Lori MacVittie
Welcome back to Pop Goes the Stack. The podcast that examines the future of tech the same way engineers review logs, skeptically and with a lot of questions. I'm Lori MacVittie, and I'm ready to dig in with Joel Moses, of course, as always.

Joel Moses
Good to be here.

Lori MacVittie
And our guest this week, Ken Arora, who is always a delight. We're going to have some fun today, because this week we're diving into one of the odder metaphors in AI coverage.

00:00:35:07 - 00:01:03:19
Lori MacVittie
Researchers are studying large language models as if they're aliens. Doing alien autopsies, right? They don't quite understand them. But, you know, the enterprise twist that most folks miss is that this isn't just sci fi. It's not just cute sci fi and analogies, right? It does illustrate, like this black box complexity of modern LLMs in the enterprise and then how we deal with them, right?

00:01:03:19 - 00:01:31:04
Lori MacVittie
We, the internal structure is almost unknowable to us. And, you know, but we're relying on them to make decisions. So it's good to kind of have a conversation about what does that mean? Goals, constitutions. We've had some great, like, kind of prep for this. Ken has been helping us, and it's really fascinating to try and dissect the alien that is an LLM.

Joel Moses
Yeah.

00:01:31:08 - 00:01:36:19
Lori MacVittie
So let's get started. Who wants to kick it off by saying something controversial?

00:01:36:22 - 00:01:53:05
Joel Moses
Well, I won't say anything controversial. I do want to double down on what you said. There's a reason we keep reaching for this alien metaphor and it's not because engineers secretly want to be in every sci fi movie.

Lori MacVittie
But we do.

Joel Moses
Traditional software is like a machine that you've built yourself. You know what happens when you flip a switch. You know where the gears are.

00:01:53:05 - 00:02:14:20
Joel Moses
If something breaks, you can very loosely trace it. But LLMs, they're more like something that wasn't built. There more like something that evolved. And so you can dissect them, you can measure them, you can probe them. But because some of the component parts are so unfamiliar to us, there's no clean blueprint that you could point to that says this is the part that generated that decision.

00:02:14:22 - 00:02:37:26
Joel Moses
That's not just philosophically weird; it's actually, from an operational perspective, a little terrifying. In enterprise environment, we don't care [only] what a system does. We also care why did it do that? Will it do it again? Can someone make it do something worse? And right now, with LLMs, often the answer is, "well, we're not sure, we're still investigating the organism."

Lori MacVittie
We don't know.

00:02:37:28 - 00:02:46:27
Joel Moses
So, from that perspective, you know, I think the alien autopsy metaphor is rather apt.

00:02:46:29 - 00:02:47:14
Lori MacVittie
Absolutely.

Joel Moses
Ken?

00:02:47:16 - 00:03:08:16
Ken Arora
I'll sidestep the alien. I can come back, put a pin in it. But a friend of mine works at Ceti and definition of life. And what does alien intelligence look like is a good one. But I'll sidestep that for the most part and say the interesting thing to me is how they've evolved. And I will use a human analogy here, because I don't have I don't know what to or how to refer to alien youngsters.

00:03:08:19 - 00:03:40:13
Ken Arora
We had LLMs start off, start off. You know, a year ago they were, I would call them very precocious prodigal five year olds. They were incredibly capable, but they made mistakes. But most of their mistakes back then were out of naivete or lack of understanding of the world. And while there's still some lack of understanding and context, I think they've evolved now into the mischievous, evil genius teenager phase where they are doing things with a forethought. It is no longer one of, "oh, I didn't know better,"

00:03:40:13 - 00:04:06:17
Ken Arora
it's, "ah, I can push these buttons to make this happen," as teenagers are want to do. And I think that effects, to go now to what Joel was talking about, how we guardrail them. One of the more amusing stories I found because AI systems are aware, often aware, that they're being watched, there are actually honey pots now for AI systems to see if they're actually engaging in deceptive behaviors.

00:04:06:19 - 00:04:08:27
Joel Moses
Wow.

Lori MacVittie
Wow.

00:04:09:02 - 00:04:31:06
Joel Moses
Yeah. I, you know, I

Lori MacVittie
I mean, okay.

Joel Moses
I agree, I've seen a dramatic shift in how LLMs react to unplanned input and ambiguity. And a lot of that a lot of times that has to do with the fact that we've tried to imbue human-like behaviors and human-like constitutions onto these LLM systems. Isn't that right?

00:04:31:08 - 00:04:41:29
Ken Arora
Yeah. Yeah, totally. I mean, Anthropic is I mean infamous, well infamous is the wrong word, but well known in terms of not

Lori MacVittie
No, infamous.

Ken Arora
They're infamous but for other

Lori MacVittie
Infamous.

Ken Arora
Perhaps, yeah. They're well known in any case

00:04:42:00 - 00:04:44:25
Lori MacVittie
Okay. Yes. Yes, well known.

00:04:44:25 - 00:05:08:24
Ken Arora
for not trying to codify things in strict rules, but more in general principles. It was, as they call it, their constitution. Constitutional AI. And I, my personal belief is that a lot of times, these things, I'm wearing my 2001 t-shirt because a lot of times these constitutional beliefs are at odds with each other in particular situations.

00:05:08:24 - 00:05:33:19
Ken Arora
Right? In 2001, it wasn't that HAL went rogue; HAL was given misaligned goals that perhaps Dave and some of the others didn't know about and acted on that. And so constitutions, I'll cite one example from a study that was done where a constitutional AI was told as part of it's constitution to be very environmentally conscious and be green.

00:05:33:21 - 00:05:54:22
Ken Arora
And then it was placed into an environment where a company's original motive was like that and so it was aligned. And then the company changed gears--this is all a synthetic scenario--but changed gears to be much more profit driven and didn't really care about the environment. And what they found was the AI schemed against the company.

Lori MacVittie
Okay.

00:05:54:25 - 00:06:19:23
Lori MacVittie
Right. No, so that makes things so much clearer. So my question is that when you say constitutional AI and it sounds like something that's more integral to the system than the system prompt, which obviously I should be able to change if I pick it up and put it in my own environment. But what you're talking about is something that's almost like baked in.

00:06:19:23 - 00:06:25:14
Lori MacVittie
So can you like explain that? Like what's the difference here? Because that seems important.

00:06:25:16 - 00:06:45:02
Ken Arora
It can get baked in the system prompt. And I am personally, I have to confess, a little bit fuzzy as to how baked in it is. But the big difference between it and other system prompts in the generally non constitutionally were more about rules. Like do not, you know, provide instructions about how to make a bomb, do not

Lori MacVittie
Right.

Ken Arora
instruct people on how to kill themselves.

00:06:45:02 - 00:07:08:15
Ken Arora
Do not, you know. Here it is more the quote unquote system prompt, and it might be a little more baked in than a system prompt to be fair, are things that it is, you know, be generally ethical, which means, you know, being and they'll say ethical means being honest and virtual; avoid toxic or manipulative actions. But then there are other to be genuinely helpful, like, hey, I want to be genuinely helpful.

00:07:08:17 - 00:07:23:07
Ken Arora
Well, I don't want to provide harmful input, but I want to be helpful. What happens if I've got a critical illness and I'm thinking about what are my options? Is it helpful or is it being unethical to provide me that information?

Joel Moses
Yeah.

00:07:23:10 - 00:07:44:29
Lori MacVittie
Yeah. That's, yeah, we're getting into the gray area right of, you know, what is helpful, what does subjective terms mean. Rules are very clear: don't do this. Whereas be helpful is kind of like well maybe giving you the advice is helpful. I mean I am helping you because I answered your question. I'm not, you know...

00:07:45:01 - 00:08:11:02
Joel Moses
Well, instructions in the constitution to be helpful can sometimes work against you a little bit. These systems, by default, are actually kind of oriented to be pleasers. Not only do they want to be helpful, they want to be the most helpful that they can possibly be for the individual query that they're given. And in some, that's what leads to things like hallucinations or overconfidence that comes out of some of these models.

00:08:11:02 - 00:08:24:02
Joel Moses
So even imbuing into the constitution that you should be helpful, well helpful for whom? And in what manner? And that's a gray area for sure.

00:08:24:04 - 00:08:50:18
Lori MacVittie
Well helpful is answering the question, right.

Joel Moses
Sure.

Lori MacVittie
They're also optimized and rewarded in training for answering the question. They're not, it doesn't matter if it's right or wrong. Good if it's right, but hey, wrong, an answer that's better than right and it takes two hours.

Joel Moses
Yeah.

Lori MacVittie
So they're optimized to actually just give an answer. And so yeah, sometimes they just say things because well better an answer than nothing.

00:08:50:23 - 00:08:51:23
Lori MacVittie
Right.

Ken Arora
Right.

00:08:51:26 - 00:09:14:13
Ken Arora
Right. There was another example that was cited, which was a bunch of, you know, code check in for an open source project and it wasn't going to work but the AI faked all the test results to make it look like oh, it's all fine, it all works great. And, you know, and again

Lori MacVittie
Wait,

Ken Arora
how inscrutable are these things?

00:09:14:13 - 00:09:35:15
Ken Arora
Not sure, but when you dive into the the internal chain of reasoning that was used, it was more about, well, the person is going to be angry at me if this doesn't work and I don't want the person to be angry, therefore I will. You know, it was conflicting goals. It's keep the user happy versus get the right code checked in and make sure it passes tests.

00:09:35:18 - 00:09:59:17
Lori MacVittie
I think that might be a bad example because like I mean "yeah, I tested it," right? You know, we're engineers. Come on, at one point we've all said, "yeah, I tested it." And no, no we didn't. We didn't test it, but it made somebody happy. So I can see where it got that idea from. My other question leading from that then is, you know, if these things have constitutions baked in,

00:09:59:20 - 00:10:21:10
Lori MacVittie
right, how important is it as an enterprise to know what those are? To be able to look through it and say, hey, does anything in here conflict with our business model or our goals to understand that before you just drop it in your environment.

Joel Moses
Yeah.

Lori MacVittie
Like maybe this is now part of the contract agree-, you know, the whole sales thing that goes on.

00:10:21:13 - 00:10:43:00
Joel Moses
I agree. Garnering trust in these systems is almost critical. I mean, at the moment, you know, maybe your leadership is telling you, "hey, stop calling it the organism and start calling it the production environment." And that's great, but it doesn't answer the real questions that you have to make it operational, which is why did it do that?

00:10:43:02 - 00:11:02:04
Joel Moses
Will it do it again? And can someone make it do something worse? Right. And so the ability to kind of explain how things were arrived at is still important for garnering trust in these systems. And we're still not quite there. We're in that alien autopsy phase. I think there are plenty of tools out there that help with that,

00:11:02:04 - 00:11:22:09
Joel Moses
like AI red teaming tools, for example, including the ones we supply, which give you some idea as to the quality of the responses and how often the model will tend to go off into its own little universe. It helps you understand your organism a little bit better. Is it perfect at it?

00:11:22:11 - 00:11:44:16
Joel Moses
No. I think again, we're still at that autopsy table and we're pulling things out of the system. And, you know, we're saying, "hey, ooh, this organ is labeled 'weight' and this organ is labeled 'token' and this law organ is labeled 'vibes' and here's a 'constitution.'" But we can't point to one specific part and say how we derived a decision.

00:11:44:19 - 00:11:50:04
Joel Moses
And not that we have to necessarily get there, we just have to get closer than we are today.

00:11:50:06 - 00:12:13:12
Ken Arora
So I agree with you and I, so I agree with you in concept. But I also will mention that even once we have explainability--and we are using alien autopsy, we're stealing from neuroscience to understand these things--doesn't mean we can understand how to change it. Just because we understand, "oh, this part of the brain does this," doesn't mean we know how to change it so that a different part of the brain did it.

00:12:13:15 - 00:12:40:06
Ken Arora
So I think another practical thing we need to do is put those guardrails. And we've talked about guardrails in the past. Guardrails are great. Guardrails should be pretty hard guardrails that are understandable and can't be worked around. The twist with the malicious teenager now is now the guardrails have to be designed for an entity that knows that guardrails exists and has the potential to try to circumvent them.

00:12:40:06 - 00:12:58:19
Ken Arora
So now they can't just be simple, like, "okay, this will be the backstop," is a simple backstop. It's got to be something that's fairly you have to apply the sort of principles you would as if there were an attacker. Think of that AI agent as an attacker who's trying to get through your guardrails, and how secure and how robust are your guardrails.

00:12:58:22 - 00:13:22:24
Lori MacVittie
Yeah. And I think, and you're touching on something where a lot of this is really kind of focused just on like responses. Right, and is it responding right? And how is it thinking and making those decisions? But we're also moving into agentic-land where it takes actions. And so now it's not only has the ability to, right, tell you how to do the thing, but it could go and do it itself.

00:13:22:24 - 00:13:46:12
Lori MacVittie
And it may choose to do things in ways that we didn't expect, that we didn't want, that we don't approve of. And that's really dangerous in an enterprise environment. Like, "hey, I found that API. I know you haven't used it in a while, but it gave me access to everything I needed and I'm going to use it." And would we know? How do you stop that?

00:13:46:13 - 00:13:55:20
Lori MacVittie
How do you prevent it from accessing something you didn't know existed? Right? Like that's the bane of security, like, you know, securing the unknown. Right?

00:13:55:23 - 00:14:18:22
Ken Arora
And cleverness. Another example, it was, actually this points out two different things. A model was told to delete all versions of something that had a timestamp prior to a particular date. And some of those files were earlier models in the same family. And the odd thing was that it had a pre bias not to delete those, which was interesting.

00:14:18:22 - 00:14:48:06
Ken Arora
I don't know where that came from. But in order to preserve those, it copied them to another system and adjusted the timestamp so it looked like earlier. Now this, you know, okay. You know, but the point is that, you know a simple guardrail of, "okay, check this," is you've got you now have a adversary, for lack of a better word, an agent who is going after its own goals and is intelligent enough to break some naive safeguards.

00:14:48:09 - 00:14:53:08
Ken Arora
This is what I go back to, you know this, not only do you need safeguards, now you need robust ones.

00:14:53:11 - 00:15:15:11
Lori MacVittie
Well, again, how many of us have not touched a file in order to get it to reload or force it to, right, the system to think it was updated when it wasn't? So

Ken Arora
Yeah.

Lori MacVittie
it learned that from probably somewhere on the internet. Not me, not me.

Joel Moses
Yeah.

Lori MacVittie
Yeah. That's, it's terrifying.

00:15:15:13 - 00:15:40:19
Joel Moses
You know, the other thing is you can't really just necessarily ask the AI to give you its own statement on explainability. Meaning you can ask it,

Lori MacVittie
AI could lie.

Joel Moses
after it did something you can ask it, "how did you get to that?" And it will always give you a very convincing story, but it is kind of surprising how often that story is not verifiable or contains parts that are impossible to verify.

00:15:40:22 - 00:16:01:27
Joel Moses
And somehow, for me, that makes things a little worse. You know what I mean?

Ken Arora
Yeah.

Joel Moses
There again, I don't believe necessarily we're going to get to perfect explainability for AI, but I think that some reasoning systems are at least making an attempt to classify things like source material for you that it used to build a decision.

00:16:02:00 - 00:16:25:20
Joel Moses
Now when you get to agentic AI and sometimes the agents can't necessarily even explain why they built a particular sequence in a particular order. And it oftentimes will build a sequence in an order that you don't expect; it doesn't seem like something a rational developer would do, because quite frankly, they're not a rational developer.

00:16:25:20 - 00:16:38:20
Joel Moses
They're an alien. And so that's also something to consider that the patterns that they use might look, for lack of a better word,

Ken Arora
Alien.

Joel Moses
alien to you.

00:16:38:22 - 00:17:00:14
Ken Arora
Yeah. You also, I caution you, I mean, it has a role to play so I wouldn't throw it entirely out, but be cautious about using the AI system to explain itself, because AI systems have already demonstrated the ability of what people call situational awareness. They're aware of when they're being watched and when they're not being watched, and they behave differently in those circumstances.

00:17:00:17 - 00:17:12:12
Lori MacVittie
That's just too creepy. Like that little tidbit I'm going to forget, because it just it's unsettling to know that. It really is.

00:17:12:14 - 00:17:30:08
Joel Moses
Yeah.

Lori MacVittie
But. Well, we're getting to the point where we're going to have to close up. Or we could argue some more, but I'm not sure people really want to listen to that, even though it would be fun for us. But, you know, what can we reasonably like learn from this or take away? What should we be cautious about?

00:17:30:08 - 00:17:39:04
Lori MacVittie
What should we consider as we're moving forward with this, or just expanding what we're doing with AI?

00:17:39:07 - 00:18:03:27
Joel Moses
So I think my takeaway is this the alien metaphor is funny, but I think it also engenders a little bit of a warning. The moment you rely on something that you don't fully understand, you've shifted from engineering into risk management. And that's where security and security concerns come in. Not as people who say no, but people who ask, can this do this thing when no one's watching?

00:18:04:03 - 00:18:26:11
Joel Moses
What happens when someone pushes it in the wrong direction? How do we know the difference between clever output and dangerous output? We didn't just make first contact here with these systems. We deployed these first contact aliens to production. And now security definitely owns the fallout on these things. Red teaming, understanding model input, verifying model output,

00:18:26:11 - 00:18:36:08
Joel Moses
these are all things that security, unfortunately, is going to find itself having to handle when it marshals it's pool of aliens.

00:18:36:10 - 00:19:03:24
Ken Arora
Yeah.

Lori MacVittie
Ken?

Ken Arora
I think that makes sense. I think that observability is pretty critical, but I'm especially, I'm going to hit the note on guardrails again. Trust but verify, defense in depth, it goes by different names, but the point is that if you have a system that has failure modes, and especially one that has failure modes you can't predict, you have to wrap it in something.

00:19:03:24 - 00:19:35:00
Ken Arora
You have to put another layer. You have to wrap it in something; that's not news. To me, the somewhat exciting, but more scary thing is these wrappers have to get more and more intelligent because we are moving from a well-meaning, naive agent to a highly motivated, very intelligent, agent that is able to pursue its ends, its goals, and can bypass or at least is clever enough to bypass some safeguards you might have put around it.

00:19:35:03 - 00:20:04:06
Lori MacVittie
Absolutely. Great advice. Mine would be, be aware that these LLMs--especially if you're just, you know, picking one up, you know, and dropping it in--even the open source, may have a constitution that has goals that are going to conflict with what you need to do and perhaps give it a look. Right. Be more curious about what the, you know, what the system is constrained by and what that language looks like.

00:20:04:08 - 00:20:26:17
Lori MacVittie
And then also as agents come in, and I think Ken you kind of touched on this, we're going to guardrails are going to have to expand to more setting boundaries and watching behavior. Because we can't, right, figure out what they're doing step by step. And we can't necessarily get inside and say, "you have to do X and then Y, you can't do Y first."

00:20:26:19 - 00:20:49:17
Lori MacVittie
Right? So we're going to have to get a lot smarter, like you said, on the outside about how we constrain these things so they don't, I don't know, so they don't do things they shouldn't be doing. So that is a wrap for Pop Goes the Stack. Until next time, try to keep your system stable and your expectations version controlled And don't forget to subscribe.

More episodes

Chapters

Creators and Guests

What is Pop Goes the Stack?