Threat Talks is your cybersecurity knowledge hub. Unpack the latest threats and explore industry trends with top experts as they break down the complexities of cyber threats.
We make complex cybersecurity topics accessible and engaging for everyone, from IT professionals to everyday internet users, by providing in-depth and first-hand experiences from leading cybersecurity professionals.
Join us for monthly deep dives into the dynamic world of cybersecurity, so you can stay informed, and stay secure!
Zero Trust step five:
monitor and maintain.
Welcome to Threat Talks.
My name is Lieuwe Jan Koning.
And here from headquarters at ON2IT
we bring you the next episode of
Threat Talks and the subject of today
is the monitor part of
step five of Zero Trust.
Let's get on to it.
Welcome to Threat Talks.
Let's delve deep into the dynamic world
of cybersecurity.
In this series about Zero Trust
I'm joined again by my
dear colleague, Rob Maas.
He is Field CTO of ON2IT, and his job
is to make sure that Zero Trust
is everywhere in a nutshell.
Right, Rob? Welcome.
Thank you.
We've been talking about
the five steps of Zero Trust.
Let's briefly recap those
five steps, the five steps
that you need to iteratively go
through if you want to go towards
a Zero Trust environment.
First step was...
Define your protect surface.
What does that mean?
We look at our assets
that we have running,
and we define what's really
important to protect.
And then we define them as a protect
surface, because they need protection.
And we also gather some
metadata with them.
So who's the owner of
this protect surface?
How important is it to the business?
And we will use that, all that
metadata we will use in step five.
So today we're going to discuss
how is this helpful. Yeah, step one
was an important step for step five.
Yeah. Okay.
So we now, we know what
to protect after step one.
And then we go to step two,
that's map the transaction flows.
What do we do? Correct.
So once we have a few protect
surfaces, we are going to define
which protect surfaces need to
communicate with each other.
So we make kind of a table or a matrix
where we show, hey, this protect
surface needs to communicate,
with the other protect surface.
And if we are allowing it or not. Okay.
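The table Rob describes can be sketched in a few lines of Python. This is only an illustration: the protect surface names and the `is_flow_allowed` helper are hypothetical, and treating every unlisted flow as denied is an assumption in the spirit of Zero Trust rather than something stated in the episode.

```python
# Hypothetical transaction flow matrix: (source, destination) -> allowed?
# Protect surface names are made up for illustration.
ALLOWED_FLOWS = {
    ("workstations", "webmail"): True,
    ("workstations", "crm"): True,
    ("guest_network", "crm"): False,
}


def is_flow_allowed(source: str, destination: str) -> bool:
    """Return True only for flows explicitly marked as allowed.

    Assumption: any flow missing from the matrix is denied.
    """
    return ALLOWED_FLOWS.get((source, destination), False)


if __name__ == "__main__":
    print(is_flow_allowed("workstations", "crm"))
    print(is_flow_allowed("crm", "guest_network"))
```

Keying the matrix on ordered pairs keeps direction explicit: allowing workstations to reach the CRM says nothing about the CRM reaching workstations.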
And once we know those two things,
then we go to step three,
which is architect the Zero Trust environment.
Yeah. Correct.
That's also a really important step
because then we bring
Zero Trust really into the operation.
And we're going to define what
security controls should be implemented.
Then we will of course implement
them to make it effective.
And that's what we do in step three.
The nice thing here is that it's
also the allocation of the budget.
So if you have a security budget,
once we have protect surfaces and
we know how important they are,
we also know where we can put the most
controls in place and spend our budget.
Okay.
Spend your budget wisely by applying it
to those protect surfaces that you ...
Yeah.
So if you have a really important protect
surface, you will probably spend more
than, for example, on your guest
network, which can also be seen
as a protect surface, which most of
the time only needs isolation.
Yeah. So then we have the diagrams,
the architecture,
we have the firewalls,
we have the endpoint protection
everything and then ... step four.
Which we talked about the last two episodes,
because we had to split that as well,
because it's a big subject.
What's that about?
Yeah. Then we create the Zero Trust policy.
So we have all these controls.
But now we need to
create a policy around them.
How should they behave and who or what needs,
or gets access to those protect surfaces?
So that’s what we do in step four.
And we mainly do that by the Kipling
method, which we discussed last time.
So to make very granular policies
on who or what is allowed,
but also the configuration
on, for example, your endpoint
solution. The step where we
really operationalize Zero Trust.
Exactly.
This is where the actual deny rules are being
created or the allow rule is not created.
Yeah. Okay.
We're done because everything
is set up. Then why is there still
step five, monitor and maintain?
Yeah.
So that's unfortunately,
what a lot of people think.
So we have implemented the
controls and then we're done.
But step five, monitor and maintain,
it starts with the monitoring part,
is really important because the
more controls we put in place
and the better our policy will
become, the more logs we’ll get,
and the logs will tell us
what's going on in the environment
and where we need to give some attention
to our protect surface to improve it.
Or maybe that there is already someone
inside the network, and we can detect it.
Yeah. So validate what's going on.
So we're talking about the
monitoring part right now.
Let's elaborate a little bit
more on that, next time
we're going to talk about the
maintain, how we keep everything
as good as it already is. Right?
Yeah, correct. Now you say
many people forget step five.
We also sometimes see the opposite.
That organizations start with step five,
MDR services, for example.
So I think, at least, that's my assumption,
step five, if you really look at it from
a distance, it sounds very easy.
Just collect all the logs
and then see if you need
to take action somewhere.
And that's also what we see
a lot with MDR solutions.
So we throw everything in the SIEM,
for example, or in a data lake.
And then we hope to find
something in that, that might be
malicious and that we can take action upon,
but it's very cumbersome.
You get a lot of logs.
It doesn't necessarily tell you anything
that's going on in the environment,
and it has nothing to do
with protection, at all.
While with Zero Trust we want to have protection.
We want to prevent bad
things from happening.
So there's the difference.
We already put protection in place,
and now we're going to, with monitor,
we’re going to check if everything is
still healthy, if we need to take action.
Yeah.
So step five, a large portion
of it is MDR, or
well can be implemented by MDR today,
at least in today's technology,
we'll see how that evolves later,
because the beauty of Zero Trust
is it's been around for over a
decade and still it’s the same.
But the way we implement every step is different.
Yeah, the strategy is still the same,
but you get new security controls, new
security measures, new ideas around it.
So that's evolving.
But I think the strategy is really strong
because it's just telling you,
hey, this is Zero Trust.
With the five steps
we have a very pragmatic
approach to implement it.
And I think that will stay
around for much longer.
But starting with MDR,
does that ... you just
said you gain no protection,
I mean, right?
And I’ve talked to CISOs and they said,
yeah, well, we at some point we suffered
from a breach and then we were attacked.
We had no clue what happened.
No, it's always reactive.
And then they implemented MDR
and then it got breached again
and they said, yeah, yeah;
now we know exactly where it happened,
but we still got breached.
What is the effect of MDR?
There's no P in it for
protection or prevention.
It's really detection and response.
So you're always late to the game. Yeah.
So MDR, everything we talk about
on step five right now,
really should be in collaboration
with all kinds of preventative measures.
Yeah.
You should really do the first
four steps and then go to step five.
But it might be a nice starting point
if you already have like firewalls
and you need to make sense out of log
files, you can... Yeah of course. Yes.
So MDR is not completely useless,
you still have detection and response,
but it gets way more valuable
if you really go with Zero Trust.
Okay.
One of the mantras, of Zero Trust,
John Kindervag,
I sometimes wake up in the
middle of the night and then I,
and it's like, “inspect
and log all traffic”. Yeah.
Tell us about it. Yeah.
So that's one of the four principles of Zero Trust.
So that's also indicating that
monitoring is really, really key.
Because you need to know
what's going on in your network.
And once we create these
smaller protect surfaces we can
tell, or we get a lot of information,
a lot of logs, about every protect surface:
Hey, I see this going on, that going on.
So your security controls
will really inform you.
Hey, this is going on.
And that's because we inspect
everything that's going into
a protect surface or what's
running on a machine.
So we get a lot of value.
And talking about logging,
what kind of logs
are we talking about?
It can be traffic logs.
If you implement your
security controls correctly,
then you also get a lot of events,
because that's what our security controls
are good at. Events like IPS or a malware
detection, SQL injection [ ]
Yeah, maybe blocked a brute force attack.
Downloaded malware, which can indicate
there's a patient zero in your network.
You've suddenly logged in
[from] a different country.
Yeah, or the user executed an executable
that's normally not being executed
by that user, or they allow a script or visit
a certain website, or did a large file upload or download,
all kinds of indicators that there
might be something going on.
Does it make sense to simply log
everything as much as possible?
And I understand that SIEM vendors will want
you to do so, because they will sell more hard drives.
Yes. But is it really a good idea?
Well, you need to log everything, depends
also on how long do you want to keep it?
I think that there is a big difference.
You don't have to keep all your logging for
seven years, which some people might argue with,
but your events you definitely
want to take a look at. Events
like the security-connotated events,
you mean, then.
Yeah. Yeah.
So your logging is more for,
if there is an event
and you want to do a root cause,
then that logging is helpful.
But if you have that lo-
You should have that event immediately
when it happened or near real time.
So then you're logging-
You're talking about logs and events,
can you clarify that a little bit?
So a log is just ‘this is everything
I've seen’ and an event is ‘I detected some-’
Could be a process was started
or a button was pressed on the website,
those are logs, right?
Yeah.
Or IP address A accessed IP address B,
because a firewall only knows IP addresses.
And then when does it become an event?
If in that communication, in
that session, for example,
a brute force was detected,
then the firewall says,
hey, I saw a brute force
attack, that's an event.
But the logging is just traffic
from A to B. Okay.
So what you're saying is logging
is there, it’s great for
investigations and all that,
but it's impossible to,
traffic logs are impossible
to store forever.
But events, where we actually see
actual security, a security control
has made that information
into something bigger,
more rich, that you can keep for longer.
Yeah.
There’s one side path here, that's
what we're going to discuss later.
Logging can help once we do the maintenance,
when we go to the maintain part,
but on the monitoring part,
we should really focus on events.
So you make sure that your security
controls are implemented correctly,
and then you get a lot of events,
and you should judge every single event.
Now, almost everybody will
tell you, that is so much data,
we're never going to respond to every.
Yeah it is.
But the benefit here is-
Alert fatigue.
Yeah. They call it alert fatigue, so
there are so many events that the SOC
doesn't know what to do anymore.
So that's basically, they are
getting a fatigue and don't know
how to handle these events.
But in Zero Trust, we have the benefit that in
step one, we defined a lot of metadata,
like the owner, how important it is,
what kind of protect
surface it is, what data is involved,
all that information.
And that's now really helpful.
So if you get an event, for example,
a firewall event will only
tell you the IP addresses.
But once you translate those IP addresses
to what protect surface they belong to,
you really know, hey, now I have
an event coming from
Active Directory going
to the CRM system.
So you get all that context that we created.
So the event itself is content.
It's just: hey, I saw, for example,
a brute force attack going from IP address
one to IP address two.
That in itself could be really bad.
We could be [ ], right,
and there's not much
you can do against it. Exactly.
So we don't know yet.
Even if it is blocked, you want to find out,
do I need to pay attention to it or not?
And then adding context that
we defined in step one by
adding the correct protect surfaces to
the event, we get a lot of information.
So we know now how important it is,
if it was the guest network,
maybe we can simply ignore it.
But if it was your CRM system
or any other important system
you probably want to investigate:
Hey, what's going on here?
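The translation from IP addresses to protect surfaces can be sketched as follows. This is a minimal illustration, not ON2IT's implementation: the inventory, the subnets, the field names, and the `enrich` helper are all assumptions, and addresses outside every known subnet are simply labelled "external".

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ProtectSurface:
    name: str
    owner: str
    criticality: str  # assumed scale: "low" or "high"
    networks: Tuple[str, ...]


# Hypothetical step-one inventory; subnets and owners are made up.
SURFACES = (
    ProtectSurface("active_directory", "identity team", "high", ("10.0.1.0/24",)),
    ProtectSurface("crm", "sales operations", "high", ("10.0.2.0/24",)),
    ProtectSurface("guest_network", "facilities", "low", ("10.0.99.0/24",)),
)


def surface_for(ip: str) -> Optional[ProtectSurface]:
    """Find the protect surface whose subnet contains the address."""
    addr = ipaddress.ip_address(ip)
    for surface in SURFACES:
        if any(addr in ipaddress.ip_network(net) for net in surface.networks):
            return surface
    return None


def enrich(event: dict) -> dict:
    """Add protect surface context to an event that only has IP addresses."""
    src = surface_for(event["src_ip"])
    dst = surface_for(event["dst_ip"])
    return {
        **event,
        "src_surface": src.name if src else "external",
        "dst_surface": dst.name if dst else "external",
        "dst_owner": dst.owner if dst else None,
        "dst_criticality": dst.criticality if dst else None,
    }


if __name__ == "__main__":
    raw = {"type": "brute_force", "src_ip": "10.0.1.5", "dst_ip": "10.0.2.7"}
    print(enrich(raw))
```

The raw event is the content; the added fields are the context from step one that lets a SOC decide whether the guest network or the CRM system is involved.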
But your example of the
brute force attack, can you
name an example of when it
is actually something that you shouldn't
really pay attention to, and another example
where you really want to.
So the same event happens in different
context, but your response is different,
that's what you're saying. Yeah.
So do you have an example for that? Yeah.
So the firewall says I saw a brute force
attack from IP address one to IP address
two, if you map the protect surface to it
and IP address one was the internet,
for example, going to, let's say
your web mail,
because there’s a username, password
before you can access your email.
It happens all the time.
Everybody's trying to get in.
If you put something on the internet,
people will scan for it,
will do a brute force attack.
So in this case, you could say
this is an event.
It was blocked, so the brute
force attack was blocked.
I can safely ignore it,
because the firewall did its job.
We can safely ignore this.
However, if we map a different
protect surface to it.
So the same event, but it turns out
this was the Active Directory
going to the web mail,
that we might want to investigate.
That’s really weird. Yeah.
This is weird, even if it is blocked,
we want to investigate, because
this should not happen. Yeah.
And then I think you have two outcomes.
One is there is already
someone in your network,
trying to move laterally around and
see what other systems can I access.
That means you want to
take action immediately.
It still can also be a false positive.
That, for example, a service
account that's needed to let
that mail environment run
properly, has an expired password.
And by logging in constantly
because it’s a service
account used by a process,
it looks like a brute force attack.
And that's a false positive,
which you also want to fix,
but there's no harm there.
Okay. Yeah. Clear.
I think this actually shows
how lateral movement can be
stopped, by segmenting
and putting multiple barriers,
because in your example, it's
not necessarily the web portal that's
being brute forced, that’s at risk
there, well that's the target,
but the real problem is in
AD in your example, right.
So that's, when we missed it
everywhere, then we can see,
see it as a source of the next step.
And putting the step one context
in place, helps you do so.
Yeah.
Now we've been doing this
at ON2IT, of course, for 20 years.
And we build a lot of automation
to do so, I mean, because
we have a concept like IoG
for example, Indicator of Good,
as opposed to Indicator of Compromise,
I think that plays a role here.
Can you explain a little bit?
Yeah.
So, what I told earlier about the attack
from the internet to your web mail,
we call that an Indicator of Good
if it is blocked, because the firewall,
the security measure in this case,
did a good job.
It blocked the attack.
Then the Indicator of Compromise would be,
hey, we know definitely there’s something going on
and we are absolutely sure that it is wrong.
Maybe a malware download on a system
that was detected by an EDR system.
If that was on an important protect
surface like CRM or HR, ERP, etc.
then we know we need
to take immediate action.
That's really easy an IoC,
because we know this is bad.
We need to take action.
Then in between is everything
that we don't know for sure.
So like the example with the brute force
attack coming from
Active Directory in our previous example,
to the web mail environment,
we cannot know for sure based upon
the content and the context we have,
if it is an actual attacker or a wrongly
configured or an expired password
on a service account. [ ] you want to know.
We want to investigate
and that's what we call,
typically an unknown event.
And that really needs to be
addressed by a SOC engineer.
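The three outcomes Rob describes (Indicator of Good, Indicator of Compromise, unknown) can be sketched as a small triage function. The event fields, the event type names, and the `INTERNET_FACING` set are assumptions made for illustration; real triage rules would be far richer than these two.

```python
# Hypothetical set of protect surfaces that are expected to be attacked
# from the internet all the time.
INTERNET_FACING = {"webmail"}


def triage(event: dict) -> str:
    """Classify an enriched event as "IoC", "IoG" or "unknown".

    Assumed fields: type, blocked, src_surface, dst_surface.
    """
    # Detected malware is known to be bad: Indicator of Compromise.
    if event["type"] == "malware_detected":
        return "IoC"
    # A blocked brute force from the internet against an internet-facing
    # surface means the control did its job: Indicator of Good.
    if (
        event["type"] == "brute_force"
        and event.get("blocked", False)
        and event.get("src_surface") == "external"
        and event.get("dst_surface") in INTERNET_FACING
    ):
        return "IoG"
    # Everything else cannot be decided from content and context alone
    # and goes to a SOC engineer.
    return "unknown"


if __name__ == "__main__":
    from_ad = {
        "type": "brute_force",
        "blocked": True,
        "src_surface": "active_directory",
        "dst_surface": "webmail",
    }
    print(triage(from_ad))
```

Note that the blocked brute force coming from Active Directory falls through to "unknown" even though it was blocked, which matches the example in the conversation.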
There aren't many technologies
that use this philosophy.
Actually, the reason we have this,
I mean, we're now very happy with it,
so we can put it everywhere, is because
there was nothing in the market.
So we ended up building it ourselves.
But it's still not common practice to-
No, that’s because most
people will still focus on the content
they get from the security solution.
And then it's really hard to
make good, informed decisions.
So what you typically see
if there is already any automation,
there are very basic rules saying,
okay, everything that's blocked,
we will just throw away and ignore.
But still, blocked
events can tell you a lot on
what's going on in the environment.
Yeah, yeah.
And the latest development
of course, is to put AI in the mix.
And then the job of AI is to filter out
the right events, because what you're
essentially saying is that if we have
brute force attacks in your example,
most of the, like 99.9% or so
of those events are useless
or expected Indicators of Good
and expected to happen. Right.
But searching for it is one thing.
And by looking at each and every one
of them, every time I guess then
you filter out the stuff that
you look, the Indicators of Good,
and then what remains is the stuff
you want to look at.
AIs do the opposite.
They try to somehow find an algorithm
to find the bad stuff out of it.
Yeah. Yeah, that's really hard,
I think, if there's no context.
So the context is really key here,
if you have no context, then this,
I think it's impossible because systems
work on basis of IP addresses,
maybe - AI can help, but it’s certainly
not the silver bullet without that data.
Not at this moment, no.
Yeah.
Who knows when we have
a general superintelligence.
If we find a way to feed the AI
with all the protect surface
information, maybe.. Us humans
could figure it out, eventually
AIs also will. Exactly.
We'll see.
The future is very interesting.
Anything else on automation here?
That we should do?
Well, on automation.
Yeah.
So you should have automation
because the sheer amount of events
you get- Should you inspect every
event with automation? Yes.
Yeah. You should.
You’re saying this, but no, almost
nobody is doing this. No, a lot of,
I would say traditional solutions will say,
okay, we’ll have all the events
or all the logs in one big box,
and then I'm going to filter out
what looks interesting.
Yeah. I think you should flip that around.
You should inspect every event, because
the security controls are there for a reason.
They are security controls. So you might,
you might expect from them
that they will throw up good events
that you can work with.
And if not, then maybe you have the
wrong tool or wrongly configured.
But you should inspect every event
that's being created
because it tells you something
that's going on in your environment.
This is a big task for many people
because the SIEMs don't do this.
Even AI powered.
Typically they don't do it.
So you need to have automation here,
and you need to have the context
that helps the automation.
Without the context,
you're doomed to fail here.
Yeah.
Okay.
There's also a thing
called Rules of Engagement.
It’s part of automation.
Can you explain a little bit about that?
Yeah.
So if you get these kinds of information
and you have this context,
then you can also decide where you want
to have an automated response.
So we can, the first part of
automation is hey, this ticket
needs to be investigated by a SOC engineer
or not, it's an IoG, IoC or an unknown.
But the other part is, hey, I know this
is an IoC, what should I do as a response?
And you can then also make rules on
protect surfaces, how important they are,
for example, and then let's say
you find a process that's
known as ransomware that’s being launched
in a protect surface, maybe your
ERP system. Then you might decide, okay,
if something is launched,
and it's that critical,
then I want to isolate the whole
protect surface from the rest of the network
so that I can reduce the blast radius.
A kill switch. It’s kind of a kill switch
yeah, and then automated
by Rules of Engagement, you can
also find a middle ground there,
you say, okay, if that happens, then
the automation will prepare everything
and will send the SOC engineer
a kind of button or a link:
‘Hey, I found this, I want to do this.
Can you please agree?’ But then,
you need to act fast, because
ransomware spreads.
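Rules of Engagement as described here can be sketched as a per-protect-surface lookup that decides how far the automation may go. Everything in this sketch is hypothetical: the mode names, the action strings, and which surface gets which mode are illustrative assumptions, not a description of an actual product.

```python
from enum import Enum


class Mode(Enum):
    MANUAL = "manual"        # a SOC engineer decides and acts
    APPROVAL = "approval"    # automation prepares, a human confirms
    AUTOMATIC = "automatic"  # automation isolates immediately


# Hypothetical Rules of Engagement per protect surface.
RULES_OF_ENGAGEMENT = {
    "erp": Mode.AUTOMATIC,
    "crm": Mode.APPROVAL,
}


def respond(verdict: str, surface: str) -> str:
    """Pick a response for a triaged event on a given protect surface."""
    if verdict == "IoG":
        return "close"
    if verdict == "unknown":
        return "assign_to_soc_engineer"
    # Verdict is an IoC: consult the rule for this protect surface.
    # Assumption: surfaces without a rule fall back to manual handling.
    mode = RULES_OF_ENGAGEMENT.get(surface, Mode.MANUAL)
    if mode is Mode.AUTOMATIC:
        return "isolate_protect_surface"
    if mode is Mode.APPROVAL:
        return "request_isolation_approval"
    return "assign_to_soc_engineer"


if __name__ == "__main__":
    print(respond("IoC", "erp"))
    print(respond("IoC", "crm"))
```

The approval mode corresponds to the middle ground mentioned above, where the automation prepares the isolation and a SOC engineer only has to confirm it.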
Yeah, yeah, because we can typically.
I mean detection can be subsecond
even today, not every technology, but,
then and then, yeah, it would be
a pity then to wait for 15 minutes.
It depends a bit on the maturity
of your organization,
if you're ready to do that. [ ] building
a lot of kill switches for customers.
I know that for some reason.
What's the hesitation of an organization?
Because it's not, not everybody does this.
Why not?
I think the big... So the function in itself,
I think is a very good marketing tool,
to show why you should do Zero Trust.
Because you will find out
in a lot of organizations,
they haven't segmented
the network correctly.
And if you want to block something,
completely, you need to block
everything that it's able to reach.
And that can be a very big blast radius.
So you make a kill switch that blocks off
a segment, a whole protect surface, then
turns out that the protect surface
isn't really segmented
and therefore, by blocking it, you block half
the network, because it shares the network.
Exactly.
The impact can then be very, very big.
So you need to have a kind of maturity, to be
absolutely sure that it is segmented properly,
that your context that you get to
the event is right,
before you can do the full automation.
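The blast radius problem can be made concrete with a small calculation. This sketch assumes a hypothetical mapping from network segments to the protect surfaces that live in them; the segment and surface names are invented, and a real calculation would work from actual network and firewall data.

```python
# Hypothetical mapping of network segments to the protect surfaces in them.
SEGMENTS = {
    "vlan10": {"erp"},
    "vlan20": {"webshop", "file_server", "printers", "workstations"},
}


def collateral_damage(target: str) -> set:
    """Return the other protect surfaces cut off when the segment(s)
    hosting the target are isolated."""
    affected = set()
    for members in SEGMENTS.values():
        if target in members:
            affected |= members
    return affected - {target}


if __name__ == "__main__":
    # A properly segmented surface has no collateral damage.
    print(collateral_damage("erp"))
    # A surface sharing a flat segment takes its neighbours down with it.
    print(sorted(collateral_damage("webshop")))
```

An empty result is what makes a kill switch safe to automate; a large one shows that segmentation needs more work first.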
Yeah, yeah.
So it makes it very obvious to
organizations why they should do better in
step one, actually, or in step three.
Step three.
Where you segment....
Yeah, step one and three combined. Yeah.
And that's what you mean by marketing,
help the CISO or ...
It really shows them, if you
show hey, if you would
what always is nice to do is pick one
of the most important things they have.
So their core process, the core
business process and then say,
okay, on what IP does it run, or multiple-
[ ] or the webshop...
Exactly.
And then you type in the APM
and you do the killswitch,
the blast radius calculation,
and you show okay, but
it would block this part of the network
and then they are going, okay,
oh we really should do better.
Yeah. Exactly. Okay.
Is it easy to deploy, are organizations
comfortable automatically
blocking attacks like this?
Once you’re done.
It's easy to deploy, and I think,
organizations need to grow
into it to have the confidence
that the automation works
and does the right thing.
And also need to accept
that sometimes it's better
to have kind of a loss of productivity,
but prevent a whole ransomware attack
instead of, waiting a bit longer
and then being too late.
Yeah, because security equipment
will sometimes break.
I mean, let's face it, I mean,
if you don't have a firewall,
you will never suffer
from a firewall outage, right?
Yeah, nothing is perfect
I would say in this case.
No, but not having a firewall, obviously,
well, at least to us obviously is
much worse... It is normally
not a risk you want to take.
And it's the same for example,
with automation.
Do I want to take the risk by doing it
manually and maybe being too late,
but prevent automated downtime?
Or do I want to automate it
and then maybe have a slight chance
that it is being automated at the
wrong time or the wrong IP address?
Clear. Rules of Engagement,
you mentioned there's one system
that detects something,
for example, your log system.
And then there's an action in maybe
another control, might be that one
control sees the data
and another control ...
then you need Rules of Engagement.
Could you argue that Rules of Engagement are also
inherent in, so, many people are already using it,
for example, an AV solution or MDR solution,
when it detects malware,
it will immediately block that process.
So in a sense that is a Rule of
Engagement within the control itself,
maybe like a shortcut of a rule...
Yeah. Correct.
And also firewalls nowadays can do this.
And for endpoints I think we're
at the stage that people will trust it.
So they say okay, it's fine that
the endpoint solution will stop
the process, I'm perfectly fine with it, because
the impact, of course, is really, predictable.
It’s solely that endpoint.
But on firewalls, for example, where
we can do the same, I see a malware
download in a particular subnet
or even to a particular endpoint.
We don't often automate it, yet at least.
But the capabilities are there.
But we should. We should.
Yeah. We should.
Why are people hesitant in doing it?
Or is it just not knowing?
I think for firewalls
it's often not knowing.
I think a lot of people still see a firewall
as just network access control.
The last, I would even say,
ten years already, we gained
a lot of Layer 7 capabilities
and a lot of automation in the firewalls,
but that's not often used; still
I see a lot of customers
working with solely Layer 4,
as we call them.
So source IP address and the destination
and the ports, and not or barely
looking at the application
or what's actually in the data.
While most modern
firewalls are perfectly capable
of finding a lot of- And blocking.
And blocking.
The discussion should we enable or put the IPS
in blocking mode or in detection mode.
And to me, it's a no brainer
that it's in protection mode.
I mean, it's really bad stuff.
It should all be prevention first.
Block.
It’s the best Rule of Engagement. Yeah.
And the worst Rule of Engagement
is, oh, we got hacked.
Let's figure it out. Right. Yeah.
It should be prevention first.
And if it is then prevented and blocked,
you should be informed by an event
that you are going to inspect.
Very important step, this step number 5A,
the monitoring part of it.
Thank you very much
for all those insights, Rob.
And to you, thank you for tuning in
to Threat Talks this time.
This time was the first part of step five,
and we have a final part of step five.
It's about maintain and we'll do that
next time when we are on the subject.
Thank you very much for tuning in.
Like us, we would appreciate that.
And if you push the subscribe button
you will not miss that next episode.
Once again, thank you.
And from headquarters at ON2IT.
Bye bye.
Thank you for listening to Threat Talks,
a podcast by ON2IT cybersecurity and AMS-IX.
Did you like what you heard?
Do you want to learn more?
Follow Threat Talks to stay up to date
on the topic of cybersecurity.