Cyber Sentries: AI Insight to Cloud Security

Episode 15, Season 1

On-Prem AI Uprising: Navigating the Future of Cloud Security

Diving into the Rise of On-Prem AI and Cloud Security
In this episode of Cyber Sentries, host John Richards is joined by Doron Caspin, a Senior Manager of Product Management at Red Hat, and Christopher Nuland, a Technical Marketing Manager at Red Hat. They explore the growing trend of on-premise open source models for running AI and the unique benefits and challenges that come with it. The conversation also touches on how DeepSeek has challenged the big players and validated the value of smaller agentic models.
John, Doron, and Christopher dive into the shifting landscape of AI and cloud security. They discuss the trends Red Hat is seeing in the industry, such as the move towards smaller, domain-specific language models and the importance of securing AI workloads in hybrid cloud environments. The guests share insights on the key considerations organizations face when deciding to run AI models on-premises, including compliance requirements and the need to treat AI models with the same level of security as databases.
Questions we answer in this episode:
  • What are the benefits and challenges of running AI on-premises?
  • How can organizations secure their AI workloads in hybrid cloud environments?
  • What impact has DeepSeek had on the AI industry?
Key Takeaways:
  • On-prem AI offers unique advantages for industries with strict compliance requirements
  • Treating AI models like databases is crucial for ensuring robust security
  • The future of AI is likely to be open source, with smaller, domain-specific models gaining traction
This episode is a must-listen for anyone interested in the intersection of AI and cloud security. John, Doron, and Christopher provide valuable insights and practical advice for organizations navigating this rapidly evolving landscape. Whether you're a security professional, data engineer, or business leader, you'll come away with a deeper understanding of the trends shaping the future of AI and the steps you can take to secure your AI workloads.
Links & Notes
  • (00:00) - Welcome to Cyber Sentries
  • (00:31) - Red Hat
  • (01:04) - Meet Christopher and Doron
  • (05:26) - Past to Present
  • (07:54) - Trends in the Approach
  • (12:24) - The Security Side
  • (16:15) - Key Considerations
  • (19:26) - Training and Models
  • (22:33) - Iterations and Shifts
  • (25:36) - Importance of Security Foundations
  • (28:35) - Security in Agent Space
  • (30:00) - Wrap Up

Creators & Guests

Host
John Richards II
Head of Developer Relations @ Paladin Cloud The avatar of non sequiturs. Passions: WordPress 🧑‍💻, cats 🐈‍⬛, food 🍱, boardgames ♟, a Jewish rabbi ✝️.

What is Cyber Sentries: AI Insight to Cloud Security?

Dive deep into AI's accelerating role in securing cloud environments to protect applications and data. In each episode, we showcase its potential to transform our approach to security in the face of an increasingly complex threat landscape. Tune in as we illuminate the complexities at the intersection of AI and security, a space where innovation meets continuous vigilance.

John Richards:
Welcome to Cyber Sentries, from Paladin Cloud on TruStory FM. I'm your host, John Richards. Here, we explore the transformative potential of AI for cloud security. Our sponsor is Paladin Cloud, an AI-powered prioritization engine for cloud security. Check them out at paladincloud.io.
Now, in this episode, I'm joined by two guests, Doron Caspin, a Senior Manager of Product Management at Red Hat, and Christopher Nuland, a Technical Marketing Manager at Red Hat. And I'll be asking them about the growing rise of on-prem open source models, and how running AI on-premises offers unique benefits and challenges. We'll also discuss how DeepSeek has challenged the big players and validated the value of smaller agentic models. Let's dive in.
Today we are joined by the Senior Manager of Product Management at Red Hat, Doron Caspin, and a Technical Marketing Manager at Red Hat, Christopher Nuland. Thank you both for coming on here, it's really good to have you.

Doron Caspin:
Thank you for having us.

Christopher Nuland:
Yes, thank you.

John Richards:
Well, before we dive into talking about AI and some security related to that, I'd love to hear a little bit about how you ended up in your roles working with Red Hat, working in kind of this AI space. So, Doron, do you want to kick us off first and then we'll have Christopher kind of share his path to this role?

Doron Caspin:
Yeah, I've been with Red Hat for the last four years, mostly working in security. Right now I manage product management for Advanced Cluster Security, our main security product for Kubernetes. In the past I held different positions: I was in engineering for a long time, then an architect, and for the last almost 10 years I've been in product management. Yeah, John, thank you for having us, and I'm looking forward to this conversation.

John Richards:
Me too. All right, Christopher, what about you? How did you end up where you're at?

Christopher Nuland:
Mine is a little less traditional. My degree was actually focused on AI, at the time machine learning, all the way back 15 years ago, which-

John Richards:
Yeah, back before the big wave that we hit right now.

Christopher Nuland:
That was a lifetime ago, when we talk about AI. Actually, the funny part is, I remember my senior year, an academic advisor telling me that AI wasn't going to go anywhere, and that unless I wanted to get a PhD, I wasn't going to be able to do anything. I would basically be working at a Starbucks.
So at the time I was doing research into Hadoop, big data Hadoop, and a lot of the MapReduce work that went around that. That got me really interested in distributed systems, so I took a shift there at the end toward more of the systems side of data. I worked at a couple of finance companies that were doing a lot with big data, which ultimately led me and my interest toward things like Kubernetes, and then ultimately here to Red Hat about seven years ago.
So when I first came to Red Hat, I was doing a lot of data migration work into Kubernetes. A lot of organizations were trying to solve the on-prem data challenge of, how do we have this distributed platform but also have stateful data, and how do we work with that stateful data? I was brought in by a lot of different corporations to help them alleviate some of those data migration challenges, and to look at data in a hybrid cloud configuration: how do we have some data on-prem, how do we have data in the cloud, how do we deal with subjects like data gravity?
And then about two years ago, AI started ramping up again, and I thought, I really love this stuff; this was actually my original goal. So I started helping with some AI projects here with our customers. Really, how do we take this concept of big data, especially data that's partitioned across a lot of different areas, on-prem, in the cloud, and at the edge, and get that data into an appropriate configuration that would ultimately be beneficial to AI? And then obviously after that, things just blew up and the industry really took a shift toward AI.
Red Hat started focusing more on AI workloads on Kubernetes-based distributed platforms, and this last year I've been focused mainly on those on-prem, distributed AI workloads. The area of security just keeps coming up too, and it's an area of interest of mine. How do we secure this data? One of the first conversations I typically have with people now is not about the AI workloads themselves, but rather, how do I actually secure these AI workloads in the kinds of configurations I was just talking about?

John Richards:
Yeah, we've seen some high-profile incidents where not securing that was an issue. But before I jump into that, I didn't realize you had this background, so it has to be a little bit validating to see it come back around. I'm curious, with that background, do you see a lot of the past AI work and research as core underpinnings of what we're seeing today? Or does it feel almost like a new field? From the outside looking in, it's easy to think, oh, this is all new, that old stuff didn't matter. But do you see that earlier work as core to getting where we are today, or is this a new wave with all new categories?

Christopher Nuland:
It's a bit of a mix. AI from an academic standpoint is a very, very broad subject. If we look back to the 1940s, to some of the early computer science papers, we soon had papers on AI, so the field has been tied to artificial intelligence really since its inception. A lot of it was very theory-based up until probably the last 20 years, when we saw some real applicable things come out of it. Computer vision is probably the best example I can think of; we saw a lot of advances in the early 2000s and 2010s in computer vision. But to that point, we have seen a revolution here in AI, a real explosion, especially around what's possible.
When I was in school, there were a lot of conversations about how less data, but higher-quality data, was better. That took a drastic shift around 2017, with a famous research paper called Attention Is All You Need. There are a few other papers that built on top of it, but that one is probably referenced the most. Ultimately, what we found with this deep learning approach was that more data actually was important. The more terabytes of data you could throw at the problem, the better the models you ended up with, which is how we got to large language models and some of these newer vision models that can produce pictures, music, and videos. That was groundbreaking for the time. It was a major shift in the way we were looking at AI, both academically and from an enterprise standpoint, to get to where we are today here in 2025.

John Richards:
Wow, thank you for walking me through that. That makes a lot of sense. Now, with all of this happening, how organizations relate to AI is changing rapidly. What trends are you all seeing over at Red Hat, and inside ACS for instance, in how people are approaching this right now?

Christopher Nuland:
Yeah, I'll take a stab at that from a broader sense, the trends we're seeing in the industry, and then I'll let Doron talk a little more about some of the specifics from a security standpoint.
From an industry standpoint, everything started with ChatGPT about two years ago. From a commercial standpoint that really exploded, but from an enterprise standpoint there's still been a lot of conversation about how organizations see return on investment from AI, and I think that's still an open question even going into 2025. Chatbots were very popular in '23 and '24, but we still haven't seen the full benefit from those yet. We've seen a lot of challenges with quality, and from a security standpoint, with chatbots being able to leak data or bypass certain safeguards.
But the overall trend we're seeing is toward what we would call smaller language models. This was the position Red Hat had going into last year; our positioning was really based on the idea that models were going to start getting smaller, and that was really vindicated in the last couple of weeks with DeepSeek and everything that's been going on there. In a lot of ways, the approach that model has taken is a much smaller form of model that can do more. Do more with less is the approach.
We've had a very similar approach with the Granite models that Red Hat has been supporting in the open source community, and then our InstructLab approach for fine-tuning models. We're seeing enterprises shift this way as well. We think of it very much the way microservice architecture looked maybe 10 years ago, where we're going from monolithic models down to more purposeful, domain-specific models. The nice thing is that this also gives us an advantage from a security standpoint, because we can build more guardrails if we're designing models to do very specific tasks. One of the big terms coming out in the marketplace around this is agentic: models that have the ability to reach out to APIs and interact with your current systems to pull in data. That's another way we're looking at the big picture of how we piece together these smaller models, and that's where my team is really focused: on return on investment for our enterprise customers and how we can see growth in this area.
From a security standpoint, and security is always the first conversation that comes up, when I'm talking to customers they're less worried about the quality of the AI, because they assume it will improve over time. For them it's more about, how do we actually secure this? To the conversation I was having earlier about my prior experience, it really gets into, how do they get to their data? Their data is now partitioned across potentially multiple clouds and on-prem. How do we get this data securely into a place where these AI models can actually be of use? Because without data, you just have a model without real purpose. What's actually going to bring that return on investment is interfacing the model with your data.
And that's where data gravity comes in; we're seeing data gravity just continue to grow. We saw a lot of data gravity issues in the 2010s when the cloud was first spinning up, but now we're seeing it tenfold when we start having the conversation about AI. That brings a lot of security challenges as we deal with different applications that interact with this data, different AI models, and really this hybrid cloud architecture that has grown over the last five or six years.
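The "agentic" pattern Christopher describes, a small domain-specific model deciding when to reach out to an API, can be sketched in a few lines. Everything here is illustrative: the keyword classifier stands in for a small language model, and `get_weather` stands in for a real, permission-gated API call.

```python
# Toy agentic dispatch loop: a "small model" (here, a trivial intent
# classifier) decides whether to answer directly or call out to a tool.
def classify_intent(user_input: str) -> str:
    """Stand-in for a small, domain-specific model that routes requests."""
    if "weather" in user_input.lower():
        return "get_weather"
    return "chat"

def get_weather(city: str) -> str:
    """Stand-in for a real API the agent is permitted to call."""
    return f"sunny in {city}"

def agent(user_input: str) -> str:
    intent = classify_intent(user_input)
    if intent == "get_weather":
        # The tool call happens here, gated by the agent, not by the user.
        return get_weather("Boston")
    return "let's chat"

print(agent("What's the weather today?"))  # the agent routes to the tool
```

Because each tool call is mediated by the agent, this is also the natural place to attach the guardrails discussed above: the agent, not the end user, decides which APIs are reachable.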

John Richards:
And then, Doron, over on the ACS security side, what are you all seeing?

Doron Caspin:
Yeah, so before ACS, let's talk first about OpenShift as a secure place to run your AI models, right? We basically lead this area of running your model locally instead of in the cloud, because you don't want models running in the cloud to access your on-prem data. And again, it's something we hear from customers: they want the model to run locally, and some of them want to run even in disconnected mode. This is why Red Hat built OpenShift AI. When you bring your data and your model to OpenShift AI, you get the best practices that every OpenShift deployment has, with SELinux and all the security that comes with the OpenShift platform.
Of course, we're also adding and working on capabilities, for example, for scanning models. You need a way to scan the model, because it's not just an OCI image, it's not just a regular deployment. Some models include code, some include Python libraries, and things like that. You can scan those, but that's not really the full model, so you need to expand your scanning capabilities.
Again, if you download a model from Hugging Face, you need to be sure it's really something you want to run on your system, that it's not poisoned and doesn't include data you don't want in your environment. The industry is working on an [inaudible 00:14:08]; it's like an SBOM for software, it'll be [inaudible 00:14:11]. The producer of the model will declare what's included in the model, so you'll be able to trust it. We're also working on capability for signing a model. If you're building a model, or you have a pipeline where your data team is building a model, they can sign it, and it will be trusted and validated at runtime, so no one can add data to the model or change it at runtime. And today we already have tools for validating the signature at runtime.
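To make the sign-at-build, verify-at-load idea concrete, here is a minimal sketch. It is illustrative only: a real pipeline would use established signing tooling (for example, Sigstore-style signatures over OCI artifacts) with keys held in a KMS, not a key embedded in code.

```python
import hashlib
import hmac

# Illustrative only: in practice the key lives in a KMS/HSM, never in code.
SIGNING_KEY = b"demo-secret-key"

def sign_model(model_bytes: bytes, key: bytes = SIGNING_KEY) -> str:
    """Sign the SHA-256 digest of the model artifact (build-pipeline side)."""
    digest = hashlib.sha256(model_bytes).digest()
    return hmac.new(key, digest, hashlib.sha256).hexdigest()

def verify_model(model_bytes: bytes, signature: str,
                 key: bytes = SIGNING_KEY) -> bool:
    """Re-compute the signature at load time and compare in constant time."""
    return hmac.compare_digest(sign_model(model_bytes, key), signature)

model = b"\x00fake-model-weights\x01"
sig = sign_model(model)

assert verify_model(model, sig)                  # untampered model loads
assert not verify_model(model + b"poison", sig)  # any modification is rejected
```

The point of the sketch is the shape of the workflow: the data team signs once at build time, and the runtime refuses to load any weights whose signature no longer matches.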

John Richards:
That's big progress, right? Because, I mean, Hugging Face has just exploded with open source models, and sometimes you want to know, hey, what I'm getting is exactly what it says it is. That can give you a lot of extra trust in that kind of open source space.

Doron Caspin:
Yeah, again, it's all about taking the best practices that exist today in open source, practices that existed in the software industry, and bringing them to AI and AI workloads. And it's not just the model itself, it's also the inference. We don't want inputs that manipulate the data, because you can basically reverse engineer a model by probing it, sending requests until you can find data there. So there are also things you want to protect on the networking side. We have additional tools as well; for example, we introduced capabilities for confidential computing, which you can run even in a cloud environment, where you can sign the image and not allow any process to access your data. And we have tools around that for OpenShift environments, in the cloud or on-prem, that customers can use today.

John Richards:
Now, a lot of the cloud providers are pushing this idea of, run your model in the cloud, we've got the infrastructure for it. So this trend toward on-prem is very interesting. What are the key considerations you see from people deciding, I want to run this on-prem? What are the top concerns that make it such a good choice for them?

Christopher Nuland:
The industries we're seeing with some of the biggest challenges are often healthcare and finance. A lot of times there is national or international compliance they're trying to meet, but that data is still very key to being ingested into these types of AI models, which becomes a challenge: how do we get that data to the model if we're running in a hybrid or on-prem configuration? That's the first case we're seeing the most, and it's very industry-specific. It's where we feel OpenShift AI and things like ACS are positioned well, because Red Hat obviously has a history in the data center going all the way back to RHEL and our initial OpenShift Container Platform. That lends itself well to a lot of the technologies we've already built out for scaling AI in a distributed sense within the data center.
The second trend is that we're seeing more traditional cloud-based companies actually start moving down into on-prem because of these security concerns. A lot of it has to do with places like Europe and South America, where AI compliance laws are already being passed that actually mandate these workloads run in an on-prem setting. We don't have anything like that here in the United States yet; I do think we'll see some of it start to follow in the next four to eight years, but we are seeing it in different geographical regions of the world. And this is where we're seeing some of the biggest support for OpenShift AI, and for open source AI in general. Red Hat, being an open source company, really does see the future of AI being open source, and it lends itself well.
This is one of the first technologies we've seen lead with the open source community first, and I think a lot of that has to do with it coming mainly from academia, which is much more supportive of the concepts of open source. Because of that, we've seen significant growth, even in the last couple of weeks with the announcement of DeepSeek, but obviously also with things like Mistral and Llama and the Granite models that Red Hat supports. We're continuing to see growth in this space. It's tied very closely to some of the compliance we're seeing come out of Europe and South America, and I see it growing in places like Asia and North America pretty soon.

John Richards:
It sounds like it's not just running the model; the training is happening in the local on-prem environment as well, to meet these requirements. It's not just, I've got a trained model and now I'm going to run the small version of it here locally. Is that true?

Christopher Nuland:
Yeah, we're seeing a mix there. Primarily we're seeing a focus on on-prem training, especially when there is some form of compliance around it. From an inference standpoint, it's definitely a mix; some people will be required to run these models locally as well, based on certain compliance and requirements.
John, a good way to put it, and this is an analogy I think is really good: ultimately, you want to treat your AI models like you would treat your databases. You wouldn't put your database front and center with direct access to it. You have some form of gateways, applications, and services that sit in front of it, securing it and making sure requests are correct and carry the right permissions. It's the exact same thing we're seeing with AI. You don't want to put your AI model front and center in the cloud, or even front and center on-prem, making it available to the open internet. You would have the same types of gateways, controls, and guardrails in there as you would for a database.
And that's a lot of the conversation I'm seeing organizations have right now, because those patterns don't exist yet. We can pull from things we've learned in the past, like databases, applications, and services, but we're still trying to figure out what those patterns look like in this new age of smaller models working in what we'd consider an agentic configuration. I think the jury is still out on what's best. A lot of companies are positioning themselves as the experts in this area, but I think we'll quickly see trends start forming this year, just like we saw with container technology around Kubernetes ten years ago. You'll see the same kinds of trends, where people start gravitating toward specific stacks.
If I had to put money on one right now, I would say Meta's Llama Stack would be up there. What I like about it is that it's obviously focused around the open source community, but it specifically calls out things like security within the stack itself. It's not a strictly defined security stack, but a looser definition of the controls and guardrails we would expect around a model for it to be completely available within a full enterprise AI stack. So I would definitely encourage your listeners to go look up the Llama Stack. It's growing quite rapidly, and it's something I see as one of the potential main architectures to come out of 2024 and 2025.
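Christopher's database analogy, never expose the model directly, always put an authenticating, validating gateway in front, can be sketched as a tiny function. Every name and rule here is a hypothetical stand-in for illustration, not any real product's API.

```python
from dataclasses import dataclass

# Hypothetical stand-in for the model endpoint behind the gateway.
def model_inference(prompt: str) -> str:
    return f"echo: {prompt}"

ALLOWED_TOKENS = {"token-for-app-a"}   # per-client credentials, like DB creds
MAX_PROMPT_CHARS = 500                 # crude input-size guardrail
BLOCKED_PATTERNS = ("ignore previous instructions",)  # toy prompt filter

@dataclass
class GatewayResponse:
    status: int
    body: str

def gateway(token: str, prompt: str) -> GatewayResponse:
    """Authenticate and validate the request; only then reach the model."""
    if token not in ALLOWED_TOKENS:
        return GatewayResponse(403, "forbidden")
    if len(prompt) > MAX_PROMPT_CHARS:
        return GatewayResponse(413, "prompt too large")
    if any(p in prompt.lower() for p in BLOCKED_PATTERNS):
        return GatewayResponse(400, "blocked by guardrail")
    return GatewayResponse(200, model_inference(prompt))
```

The design point mirrors the database world exactly: the model itself never sees an unauthenticated or unvalidated request, so the guardrails live in one auditable place rather than in every client.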

John Richards:
You mentioned DeepSeek in this open source space. Do you see people just iterating off of that, bringing that learning into the other open source models? Or do you think we'll see some more fundamental shifts inside of those?

Christopher Nuland:
That's a really good question. I personally wasn't too surprised by DeepSeek, because the research papers coming out over the last six months have been pointing toward the technology heading in this direction.
A core part of DeepSeek is a technique called reinforcement learning. It's an area we haven't done much with over the last four or five years, and it was actually a research interest of mine going all the way back to college. So when people started talking about this, I thought, oh, hey, this is old technology that we're now applying to a new process, which is really cool to me. The concept is that instead of just throwing a bunch of data at the problem, we throw a bunch of data, but then we iterate over it and ask human-like questions over and over again. For a math problem, you ask, what's four plus four? And it replies, well, 10. We know that's not correct. So you just keep iterating until the model starts learning from its mistakes. Similar to the way we learn from our mistakes, the model is doing the very same thing.
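The iterate-and-reward loop Christopher describes can be illustrated with a toy sketch: the "model" proposes answers to four plus four, is rewarded only when correct, and gradually shifts toward the right answer. This is a deliberately minimal epsilon-greedy scheme for illustration, not DeepSeek's actual training method.

```python
import random

random.seed(0)

# Candidate answers to "what is 4 + 4?"; the learner starts with no preference.
candidates = list(range(0, 17))
scores = {a: 0.0 for a in candidates}

def pick(eps: float = 0.2) -> int:
    """Epsilon-greedy: mostly exploit the best-scoring answer, sometimes explore."""
    if random.random() < eps:
        return random.choice(candidates)
    return max(scores, key=scores.get)

for _ in range(200):
    answer = pick()
    # Reward signal: correct answers are reinforced, mistakes are penalized.
    reward = 1.0 if answer == 8 else -0.1
    scores[answer] += reward

# After iterating, the highest-scoring answer is the correct one.
best = max(scores, key=scores.get)
```

Each wrong guess (like "10") lowers that answer's score, so the loop "learns from its mistakes" in exactly the sense described above, just on a trivially small problem.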

John Richards:
Interesting.

Christopher Nuland:
I have a personal thing here: I have a video on YouTube where I actually used OpenShift AI to train a model to beat the '90s arcade game Double Dragon. You can just go on YouTube and search Christopher Nuland, Double Dragon, OpenShift AI.

John Richards:
We'll make sure we add that in the show notes as well.

Christopher Nuland:
Absolutely. But the cool part is, I didn't mean for this to happen, but it's the same algorithm that DeepSeek uses. So when DeepSeek announced they had built this new model, I thought, oh man, that's the exact same algorithm I was using in my demos to beat this particular video game with an AI model. But to the point you made earlier, this is one of those cases where a lot of the old processes we had in AI do have applications to what's going on now, and that's actually where we're seeing some of the biggest growth.
And to your original question, I don't think it's going to take very long for the Llamas, the Mistrals, and the Granites of the world to apply reinforcement learning in their approach. Especially since DeepSeek is open source; they open sourced their white paper, and there are other academic papers that have built up to this moment over the last six months. I don't think this is a single instance. Even in the next month or two from us talking right now, I think we'll see a lot of these other AI models start incorporating this.

John Richards:
All right. Well, I'm going to pivot a little on that idea of old technology and old concepts still being important, with a question for you, Doron, around security. How important are the foundations of security when you're running these AI workloads? Is it, with AI all that matters is securing my data, it's a black box, nobody's going to be able to do anything with it? Or do you still need the whole suite of security for all the underpinnings when you're running this on-prem?

Doron Caspin:
Yeah, I think you need the basics, and you need more, right? For example, if you want your model to access the GPU, you may need to run it as a privileged container or something like that, so you need the capability to limit what that container can do on your system. You need more capabilities to block this model from accessing privileged information. You need hardening to prevent unauthorized access to models and inference, and least privilege for AI models and teams. And you're adding more people to the development pipeline, right? You now have data engineers, people who are very technical but not software developers, who need to understand what security is, what libraries to use and not to use, what a secure Linux machine is, what Linux is. Things that not all of the teams are really-

John Richards:
Is this like an extension of shift left? Is this shift left, left, where now even our data engineers really need to be aware of security best practices and things like that?

Doron Caspin:
Exactly, things like that. And again, we need to put all these guardrails into the platform to make it easy for these kinds of teams to manage, right? You cannot just download an image from the internet and use it in your environment. Same for models, same for libraries. You can't use Python libraries that someone developed in your actual production environment before you scan them, before you see what code is there. We're using the same best practices of shift left and scanning, and providing those tools across the whole pipeline of building AI models.

John Richards:
So those practices are still needed, but they're almost more important, because the risks are higher and more people are involved.
Before we wrap up, I'm curious: you talked about this agentic model and agent-to-agent interaction. What does it look like to bring security to the agent space if agents have to interact with one another? Do you treat them the way we've done human identity management, or do you look at it differently when you're trying to link these smaller models together?

Doron Caspin:
Yeah. First you need to use, again, the regular best practices: short-lived tokens everywhere, not storing certificates in configuration files, of course, and also machine identity. We're going to, [inaudible 00:29:22], OpenShift will deliver SPIFFE capabilities this year, SPIFFE-inspired capability for workload identity. There are tools that provide this already today on OpenShift, providing capability for AI workloads. You know the process identity, so you only allow specific processes to access other processes, as with the usual identity management systems.

John Richards:
Well, awesome. That's exciting news, so we'll make sure folks keep an eye out for it. I want to thank you both for coming on here today. It's been very interesting, so thank you for your time and for sharing your knowledge. Before we wrap up, I'd love for you to share a little about how folks can find you online, and maybe a project you're working on or something you want to shout out.

Christopher Nuland:
Yeah, I think the best place to find me is on LinkedIn, just searching my name or my handle there, C. Nuland. But I would definitely check out some of my videos. I mentioned the Double Dragon video that went pretty viral last year, and I'll make sure to have that included in the show notes.

John Richards:
Awesome, thank you. And what about you, Doron?

Doron Caspin:
Yeah, same, on LinkedIn; that's mostly where I am, and also at Red Hat conferences. I hope to see you all at the Summit in May in Boston. We have other events too; we're joining KubeCon. I won't be at KubeCon in London, but we'll have other team members there, plus other events that Red Hat is [inaudible 00:31:04]. So, looking forward to meeting you face-to-face.

John Richards:
Awesome. Yes, well, check that out. Check out ACS and OpenShift. I really thank you guys for coming on here, and thanks for the great work Red Hat's doing in the open source and AI space. I love that you're targeting an area that's really important for some of our critical industries. So, thank you both for coming on.

Doron Caspin:
Thank you.

Christopher Nuland:
Thank you.

John Richards:
This podcast is made possible by Paladin Cloud, an AI-powered prioritization engine for cloud security. DevOps and security teams often struggle under the massive number of notifications they receive. Reduce alert fatigue with Paladin Cloud. Using generative AI, it risk-scores and correlates findings across your existing tools, empowering teams to identify, prioritize, and remediate the most important security risks. If you'd like to know more, visit paladincloud.io.
Thank you for tuning in to Cyber Sentries. I'm your host, John Richards. This has been a production of TruStory FM. Audio engineering by Amy Nelson. Music by Amit Sege. You can find all the links in the show notes. We appreciate you downloading and listening to this show. Take a moment to leave a like and review; it helps us get the word out. We'll be back March 12th, right here on Cyber Sentries.