1
00:00:04,368 --> 00:00:06,018
Welcome to DevOps and Docker talk.

2
00:00:06,078 --> 00:00:10,668
And in this episode, I'm taking
a clip from the live stream I

3
00:00:10,668 --> 00:00:12,398
did with my co-host Nirmal Mehta

4
00:00:12,628 --> 00:00:16,558
of AWS, and I bring him
up to speed on something

5
00:00:16,558 --> 00:00:20,228
I've spent weeks on, working with
Eric Smalling of Chainguard,

6
00:00:20,248 --> 00:00:24,328
that's the Zero CVE image security
company for Docker images.

7
00:00:24,328 --> 00:00:28,048
And Eric and I have been working
for a couple of weeks on trying

8
00:00:28,048 --> 00:00:30,568
to create a piece of training.

9
00:00:30,618 --> 00:00:34,518
I wouldn't call it new 'cause this isn't
a new issue, but something I'm calling

10
00:00:34,518 --> 00:00:40,878
silent rebuilds or more accurately
silent upstream base image rebuilds.

11
00:00:41,378 --> 00:00:42,608
That's a long name.

12
00:00:42,608 --> 00:00:43,628
I like silent rebuilds better.

13
00:00:43,628 --> 00:00:44,678
That has a better ring to it.

14
00:00:45,048 --> 00:00:50,118
But this video is gonna be about a problem
that I really want to call attention

15
00:00:50,118 --> 00:00:56,778
to on your path to basically having the
least amount of CVEs in production as

16
00:00:56,778 --> 00:01:02,508
possible without having to do
crazy amounts of work or custom home

17
00:01:02,508 --> 00:01:07,173
built images that have little parts
taken out of it or something like that.

18
00:01:07,413 --> 00:01:12,093
Like we all are probably using, a majority
of us at least, are using upstream base

19
00:01:12,093 --> 00:01:17,853
images from Docker Hub or from GitHub
or AWS's public or GitHub's public image

20
00:01:17,853 --> 00:01:22,803
catalog or Chainguard or any one of the
companies that are now providing not just

21
00:01:22,803 --> 00:01:24,873
base images, but hardened base images.

22
00:01:25,323 --> 00:01:31,173
These all are upstream controlled, meaning
that we don't build them ourselves.

23
00:01:31,173 --> 00:01:34,683
We typically rely on the vendor
to provide us that base image.

24
00:01:34,803 --> 00:01:40,834
It's usually a base OS of some sort,
whether that's Debian or Alpine or maybe

25
00:01:40,834 --> 00:01:44,884
even Ubuntu or Wolfie if you're from
Chainguard, which created

26
00:01:44,884 --> 00:01:48,634
their own custom base image, and there's
all these base images, and then there's

27
00:01:48,634 --> 00:01:51,664
usually something on top, like if you're
not just doing a generic base image,

28
00:01:51,664 --> 00:01:55,414
then you're probably doing something
like Python or Node.js or something else

29
00:01:55,414 --> 00:01:57,874
that's a programming language, which
has its own libraries and dependencies.

30
00:01:58,099 --> 00:02:00,484
And then maybe if you're doing
something on top of that, like you

31
00:02:00,484 --> 00:02:05,134
might be doing a WordPress site or a
Ghost blog or Drupal, and so there's

32
00:02:05,134 --> 00:02:06,544
another application layer on top.

33
00:02:06,544 --> 00:02:09,809
Or maybe you're building your own custom
images regardless of what you're doing.

34
00:02:10,019 --> 00:02:14,879
That base image is the cause for
years and years of us talking

35
00:02:14,879 --> 00:02:16,979
about how to reduce image size.

36
00:02:16,979 --> 00:02:19,859
But more importantly, CVE count.

37
00:02:20,009 --> 00:02:25,344
And CVEs in production are really all
that matters at the end of the day for

38
00:02:25,344 --> 00:02:30,114
us DevOps people because something in
test or staging is obviously important

39
00:02:30,114 --> 00:02:34,734
too, but really what really matters is
making sure that production has the least

40
00:02:34,734 --> 00:02:37,224
number of vulnerabilities as possible.

41
00:02:37,524 --> 00:02:41,514
And we typically only have
vulnerabilities in production even when

42
00:02:41,949 --> 00:02:44,229
we're thinking about them and
trying to reduce the number.

43
00:02:44,439 --> 00:02:49,379
We're only able to get our production
total, like all the servers or

44
00:02:49,379 --> 00:02:53,399
the entire Kubernetes cluster or
however you consider production.

45
00:02:53,699 --> 00:02:56,579
We're only able to get that
number down to so little.

46
00:02:56,579 --> 00:03:01,229
Like it's really, really hard and
challenging and difficult to get

47
00:03:01,229 --> 00:03:03,209
zero CVEs everywhere all the time.

48
00:03:03,209 --> 00:03:04,859
Like I actually have never seen it.

49
00:03:04,889 --> 00:03:08,669
I've never, at all the government agencies
I've worked with, all the big companies,

50
00:03:08,669 --> 00:03:11,489
the small startups, everyone I've
worked with, there's always something.

51
00:03:11,989 --> 00:03:15,649
So you have to do a lot of analysis
and there's all this work involved.

52
00:03:15,649 --> 00:03:19,879
Well, regardless of all that, the
base upstream images that you're using

53
00:03:19,879 --> 00:03:25,189
today, probably if they didn't come
with CVEs on day one when you first

54
00:03:25,189 --> 00:03:31,064
started using them, the longer they sit
somewhere, on any server running, the

55
00:03:31,064 --> 00:03:33,044
more likely they're gonna have more CVEs.

56
00:03:33,544 --> 00:03:36,784
And that is the topic
of this conversation.

57
00:03:36,784 --> 00:03:42,554
And I wanted to call attention to a
particular style of CVEs, sneaking in

58
00:03:43,344 --> 00:03:48,594
to your images after you've deployed
them, because CVEs are discovered

59
00:03:48,594 --> 00:03:50,184
all the time for existing code.

60
00:03:50,184 --> 00:03:51,684
That's how CVEs happen.

61
00:03:51,684 --> 00:03:54,534
That's how we find the vulnerabilities,
and then eventually patch them

62
00:03:54,534 --> 00:03:55,974
and eventually roll out the patch.

63
00:03:55,974 --> 00:03:56,334
Right?

64
00:03:56,694 --> 00:03:58,734
So there's a strategy to all this.

65
00:03:58,734 --> 00:04:02,754
There's multiple ways to
update your images, which are

66
00:04:02,754 --> 00:04:04,464
really the production artifact.

67
00:04:05,124 --> 00:04:10,254
And in this conversation, I'm essentially
explaining the work we've been doing on

68
00:04:10,254 --> 00:04:18,164
how to provide you an open source guide
to the tooling and basically like a

69
00:04:18,164 --> 00:04:20,144
prescription for using GitHub Actions.

70
00:04:20,144 --> 00:04:21,014
'cause that's my favorite.

71
00:04:21,164 --> 00:04:22,964
Hey, did you know I'm making
a course on GitHub Actions?

72
00:04:23,144 --> 00:04:24,134
I'm making a course.

73
00:04:24,224 --> 00:04:25,904
Uh, you can sign up below in the notes.

74
00:04:25,994 --> 00:04:29,654
Uh, there's a link somewhere in this
video and you can go check that out.

75
00:04:29,654 --> 00:04:30,734
But as a part of that,

76
00:04:30,944 --> 00:04:34,364
we were making this piece of
content around Chainguard images

77
00:04:34,364 --> 00:04:38,954
specifically, and I started to imagine
a bigger solution to this problem.

78
00:04:39,454 --> 00:04:41,734
And we are creating that for you.

79
00:04:41,744 --> 00:04:45,404
I don't really have a name for the
solution other than a series of GitHub

80
00:04:45,404 --> 00:04:51,404
workflows and automations that are going
to ensure from multiple vectors that

81
00:04:51,404 --> 00:04:57,374
your container images actually have the
least CVEs that you can currently get.

82
00:04:57,854 --> 00:05:03,059
And that might be in the form of
a Dependabot or a Renovate tool,

83
00:05:03,209 --> 00:05:08,189
and that might be in a newer tool
from Chainguard called Digestabot.

84
00:05:08,489 --> 00:05:12,359
And it's not a very popular or
well-known project, so I want to call

85
00:05:12,359 --> 00:05:16,019
attention to it and we talk about what
it is and the, the problem and the

86
00:05:16,019 --> 00:05:18,449
solution. So let's reduce that down.

87
00:05:18,539 --> 00:05:19,979
Let's create a strategy

88
00:05:20,194 --> 00:05:23,244
that people can implement. So I hope
you enjoy this episode with Nirmal

89
00:05:23,244 --> 00:05:26,104
and I talking about silent rebuilds.

91
00:05:28,163 --> 00:05:28,313
Hi.

92
00:05:28,628 --> 00:05:29,138
Hey man.

93
00:05:29,228 --> 00:05:29,648
Hey.

94
00:05:30,019 --> 00:05:30,859
I am excited.

95
00:05:30,972 --> 00:05:34,362
You and I have been talking about
image Docker builds and Docker

96
00:05:34,362 --> 00:05:40,992
images and security and slim versus
Alpine and CVE scanning and all that.

97
00:05:40,992 --> 00:05:43,542
We have been talking
about that for a decade,

98
00:05:43,587 --> 00:05:44,077
Correct.

99
00:05:44,472 --> 00:05:48,012
and it still feels like I find new
things and it still feels like I

100
00:05:48,012 --> 00:05:54,772
learn things that I realize are
way more important than I thought

101
00:05:54,772 --> 00:05:58,912
and that I've not been considering
and that I should be solving for my

102
00:05:58,912 --> 00:06:00,742
clients, my students and my courses.

103
00:06:00,792 --> 00:06:06,312
We all know that Docker Hub has official
images, over 200 images at this point,

104
00:06:06,312 --> 00:06:08,642
I believe, that are all very popular

105
00:06:08,642 --> 00:06:13,762
open source projects, languages,
frameworks, and these things, even

106
00:06:13,762 --> 00:06:17,572
in the Alpine or the slim images
come with an OS underneath them,

107
00:06:17,812 --> 00:06:20,342
it's either Debian or Alpine.

108
00:06:20,732 --> 00:06:26,722
And then, we've had other attempts
at like base OS container standards.

109
00:06:27,062 --> 00:06:30,942
Over the years we've tried to create
like container OSes. Chainguard created

110
00:06:30,942 --> 00:06:34,382
Wolfie, which is essentially their
container OS, and we've had them on

111
00:06:35,342 --> 00:06:38,552
the show multiple times to talk about
like what Wolfie is and how that

112
00:06:38,552 --> 00:06:41,012
helps them create zero CVE images.

113
00:06:41,512 --> 00:06:46,222
We don't yet have that for
Debian, for Ubuntu, for Alpine.

114
00:06:46,532 --> 00:06:49,712
I'm pretty sure that if I pulled
down any image, running Debian

115
00:06:49,712 --> 00:06:52,382
underneath that, there's gonna
be some vulnerability probably.

116
00:06:52,382 --> 00:06:54,589
Like, it's gonna be pretty
rare when it's actually zero.

117
00:06:55,089 --> 00:06:58,839
And that's just kinda like
the nature of Linux right now.

118
00:06:58,839 --> 00:07:03,299
We can't update and guarantee and
fix and then update and then roll out

119
00:07:03,299 --> 00:07:04,769
updates of everything all the time.

120
00:07:04,769 --> 00:07:09,499
So there's always, everyone's always
managing like somewhere between one

121
00:07:09,799 --> 00:07:12,259
and infinity of CVEs in production,

122
00:07:12,614 --> 00:07:13,514
At any given moment

123
00:07:13,879 --> 00:07:14,719
any given moment.

124
00:07:14,719 --> 00:07:18,109
And the goal is to get that
thing down as tight as possible.

125
00:07:18,469 --> 00:07:19,609
on a rolling basis,

126
00:07:19,709 --> 00:07:22,199
And the reality is,
it's all a moment in time.

127
00:07:22,699 --> 00:07:27,739
Like the minute we scan something,
it's only whatever, you're, you

128
00:07:27,739 --> 00:07:29,899
have to trust a scanner because
they all, they're all different.

129
00:07:30,199 --> 00:07:32,275
Some scanners are better than
others at finding things.

130
00:07:32,275 --> 00:07:36,535
At that moment in time, you have that
security stance of how many CVEs you have.

131
00:07:37,035 --> 00:07:38,835
A minute later, you're now outta date.

132
00:07:39,195 --> 00:07:41,955
So you, you, you do your best.

133
00:07:41,955 --> 00:07:44,925
You take snapshots, your security
team maybe scans once a day or once

134
00:07:44,925 --> 00:07:47,985
a week or once a month, or not at
all, but maybe someday they'll scan.

135
00:07:48,315 --> 00:07:50,865
Or you scan only in CI and
you kind of ignore it once

136
00:07:50,865 --> 00:07:52,155
it goes into production.

137
00:07:52,395 --> 00:07:54,890
So there's all these variations
and everybody's trying all

138
00:07:54,890 --> 00:07:55,640
these different things.

139
00:07:56,140 --> 00:08:00,490
But I think when we all talk about
containers, we all tend to agree that

140
00:08:00,880 --> 00:08:04,030
you build a container, you want the
smallest image you can get away with.

141
00:08:04,060 --> 00:08:05,710
If you can go distroless, great.

142
00:08:05,710 --> 00:08:09,930
If you can, do something like Chainguard
or a, a paid hardened image where you

143
00:08:09,930 --> 00:08:11,850
have support and guarantees, great.

144
00:08:12,350 --> 00:08:16,250
But that image is only great on
the day that you last looked at it.

145
00:08:16,550 --> 00:08:21,460
Like when you last certified it or last
scanned it and approved it every day

146
00:08:21,460 --> 00:08:23,200
since then, it's gotten worse, right?

147
00:08:23,200 --> 00:08:25,930
And the only way you're gonna
know how much worse is to re-scan it.

148
00:08:26,430 --> 00:08:26,880
So.

149
00:08:27,380 --> 00:08:31,030
We, we now have these production
tools, like you can use Trivy,

150
00:08:31,120 --> 00:08:34,660
an open source tool for scanning
for vulnerabilities in images

151
00:08:34,750 --> 00:08:36,340
amongst other things, but in images.

152
00:08:36,400 --> 00:08:39,460
And then it can now scan
a Kubernetes cluster.

153
00:08:39,910 --> 00:08:45,544
So you can get a posture of what's my
images throughout the cluster and what are

154
00:08:45,904 --> 00:08:48,304
the CVE counts in production in real time.

155
00:08:48,334 --> 00:08:51,364
'cause that's probably different than
what your CI is telling you or what

156
00:08:51,364 --> 00:08:53,074
your, you know, your local machine.

157
00:08:53,344 --> 00:08:55,834
So, so again, I think
these are all known things.

158
00:08:55,834 --> 00:08:58,594
I think most people that are
running Kubernetes today understand

159
00:08:58,594 --> 00:09:00,214
that, at least to some degree,

160
00:09:00,709 --> 00:09:01,099
Yeah.

161
00:09:01,564 --> 00:09:05,094
and they know that if I pin,
which everybody tells you to pin,

162
00:09:05,094 --> 00:09:12,444
don't pin to latest; pin your
images to Python 3.13.3

163
00:09:12,654 --> 00:09:16,154
so there's some determinism, like
you're trying to achieve some

164
00:09:16,154 --> 00:09:17,984
determinism with the versioning.

165
00:09:18,014 --> 00:09:22,064
Like what you can kind of control
what's inside your container in that

166
00:09:22,274 --> 00:09:22,604
Yeah.

167
00:09:22,634 --> 00:09:27,824
If you came to me, as a boss might come to
their dev engineer, their DevOps engineer

168
00:09:27,824 --> 00:09:32,704
and say, I want you to give me strategies
so that we can implement projects

169
00:09:32,704 --> 00:09:38,889
that will reduce the CVE count across
our production infrastructure by 90%.

170
00:09:39,339 --> 00:09:41,789
Like my target, my stretch goal is 90%.

171
00:09:41,789 --> 00:09:43,259
My goal is 50%.

172
00:09:43,759 --> 00:09:46,339
Please implement, give me a strategy.

173
00:09:46,519 --> 00:09:50,289
And so you're gonna end up with
these strategy plans around,

174
00:09:50,739 --> 00:09:53,499
obviously there's the host os
and that's a whole other thing.

175
00:09:53,499 --> 00:09:57,529
I'm really just gonna talk about container
images, but that's not enough because

176
00:09:57,529 --> 00:10:03,399
minification reduces the amount of blast
radius, but you still have to have things

177
00:10:03,399 --> 00:10:05,769
like the application layer dependencies.

178
00:10:05,829 --> 00:10:08,319
Like if your developers aren't
updating their dependencies in

179
00:10:08,319 --> 00:10:11,899
their app, their gems, their jars,
like they're not updating all these

180
00:10:11,899 --> 00:10:16,159
things, then I, as the infrastructure
manager can only do so much.

181
00:10:16,159 --> 00:10:20,959
So Dependabot and renovate largely are
the automation tools we all use to solve

182
00:10:20,959 --> 00:10:22,339
that problem for application developers.

183
00:10:22,699 --> 00:10:26,299
For US infrastructure people, it's
all about minimal base images.

184
00:10:26,299 --> 00:10:30,569
Like that's the best we can add to
that workload and say like, yeah.

185
00:10:30,569 --> 00:10:33,369
So like where a lot of people move
to Alpine or they buy Chainguard

186
00:10:33,989 --> 00:10:37,349
yeah, so, or they go to, at least
they go to Slim and they learn about

187
00:10:37,349 --> 00:10:42,029
slim versus like no normal Python
or normal ruby images or whatever.

188
00:10:42,539 --> 00:10:48,149
That again, only gets you so far because
the minute you move to that slim or that

189
00:10:48,149 --> 00:10:52,829
minimal, it's really only about how fast
you update that when a new one comes out.

190
00:10:53,329 --> 00:10:54,949
That becomes the next big problem.

191
00:10:54,949 --> 00:10:59,179
If you update today and you're on
this minimal image, and you're super

192
00:10:59,179 --> 00:11:01,909
lean on your slim images, great.

193
00:11:02,059 --> 00:11:06,349
But tomorrow your CVE count probably
went up in production or the day after,

194
00:11:06,349 --> 00:11:08,089
it'll go up. One of these days here,

195
00:11:08,089 --> 00:11:09,469
pretty soon, you will have more.

196
00:11:09,829 --> 00:11:14,299
And so your job then is,
okay, how do I update faster?

197
00:11:14,449 --> 00:11:18,919
Like if that lower image, if that Alpine
base image 3.12 or whatever

198
00:11:19,249 --> 00:11:24,999
is in all my images as what we call
the base image, then maybe my job is

199
00:11:24,999 --> 00:11:26,859
now to make sure that's always fresh.

200
00:11:26,859 --> 00:11:28,629
That's always the latest version.

201
00:11:28,734 --> 00:11:28,954
Yep.

202
00:11:29,394 --> 00:11:32,379
Dependabot and Renovate
can also help with that.

203
00:11:32,599 --> 00:11:35,509
They now can manage Helm
chart updates automatically.

204
00:11:35,539 --> 00:11:41,009
Kubernetes, Compose, Dockerfiles,
Kustomize, you know, like basically

205
00:11:41,009 --> 00:11:43,829
any way you wanna deliver a
container and infrastructure code.

206
00:11:44,329 --> 00:11:47,509
Those two tools will tell
you about a new version.

207
00:11:48,009 --> 00:11:48,309
Right

208
00:11:48,459 --> 00:11:52,989
So when you go from Python
3.13.0 to Python

209
00:11:53,319 --> 00:11:59,079
3.13.1, presumably that
0.1 was a CVE fix or a bug fix.

210
00:11:59,109 --> 00:12:01,689
Maybe not a security bug, but a bug fix.

211
00:12:01,789 --> 00:12:02,239
something.

212
00:12:03,049 --> 00:12:03,469
Something is fixed.

213
00:12:03,649 --> 00:12:05,929
Hopefully nothing will break,
but something is fixed.

214
00:12:06,229 --> 00:12:09,619
And so generally, a lot of teams
either have the stance of, we're

215
00:12:09,619 --> 00:12:13,969
gonna pin to Python 3.13 and
we're gonna have Dependabot and

216
00:12:13,969 --> 00:12:15,769
Renovate just kind of run every day.

217
00:12:15,769 --> 00:12:17,209
They tend to run only once a day.

218
00:12:17,629 --> 00:12:23,659
So the best I can get is the day that
Alpine releases a new base image,

219
00:12:23,659 --> 00:12:28,159
or Ubuntu, or Debian or the new, the
latest Postgres, like whatever that base

220
00:12:28,159 --> 00:12:29,569
image gets updated to a new version.

221
00:12:30,259 --> 00:12:33,559
Those two tools, one of those two tools
will tell me. They kinda do the same thing.

222
00:12:33,689 --> 00:12:36,839
That will gimme a PR, and then it's
all about how fast can I click the

223
00:12:36,839 --> 00:12:38,279
button and then how fast can I deploy.

224
00:12:38,779 --> 00:12:44,839
So you're shortening the window between
CVE detected and CVE fixed in production?

225
00:12:45,339 --> 00:12:45,689
Right.

226
00:12:46,189 --> 00:12:49,879
that is like half of my talk is
to say, this is where we were.

227
00:12:50,379 --> 00:12:51,579
Now the line has moved.

228
00:12:51,789 --> 00:12:53,049
It was always moved.

229
00:12:53,439 --> 00:12:55,089
I just didn't always know it.

230
00:12:55,589 --> 00:12:56,549
It wasn't until about

231
00:12:56,804 --> 00:12:57,349
do you, wait a minute.

232
00:12:57,719 --> 00:12:58,199
Okay.

233
00:12:58,559 --> 00:13:03,349
Because what you just described, even
for organizations today would be,

234
00:13:03,849 --> 00:13:03,939
a

235
00:13:04,149 --> 00:13:06,409
would be great, would be sophisticated.

236
00:13:06,589 --> 00:13:08,029
it, but it's not the pinnacle.

237
00:13:08,029 --> 00:13:08,989
It's not enough.

238
00:13:09,469 --> 00:13:13,069
There are things happening and
I'm calling them silent rebuilds.

239
00:13:13,669 --> 00:13:14,779
This has always happened.

240
00:13:15,279 --> 00:13:19,779
at some point in every container
engineer's career, they realize this

241
00:13:19,779 --> 00:13:23,679
is happening because it's not shouted
from the rooftops. I don't think Claude

242
00:13:23,679 --> 00:13:27,459
and ChatGPT, neither one of them could
find significant discussions about

243
00:13:27,459 --> 00:13:32,509
this problem anywhere on the internet,
except for one tool from one company.

244
00:13:32,759 --> 00:13:34,629
Chainguard made Digestabot.

245
00:13:35,199 --> 00:13:39,939
Digestabot is like Dependabot and
Renovate, but it only does one thing

246
00:13:40,439 --> 00:13:40,829
Okay.

247
00:13:41,369 --> 00:13:44,399
Python 3.13.1

248
00:13:44,899 --> 00:13:45,169
Okay.

249
00:13:45,469 --> 00:13:46,129
rebuilt

250
00:13:46,629 --> 00:13:48,969
By the person that owns that.

251
00:13:49,329 --> 00:13:49,509
Yeah.

252
00:13:49,579 --> 00:13:53,739
by Docker Hub, that image
doesn't stay the same.

253
00:13:54,239 --> 00:13:58,889
If you were to go and download
Python 3.13.1 today.

254
00:13:59,129 --> 00:13:59,549
Mm-hmm.

255
00:14:00,049 --> 00:14:04,729
And you then downloaded that image in a
month, would it be the identical image

256
00:14:05,229 --> 00:14:07,719
With just that tag, not like the SHA

257
00:14:07,779 --> 00:14:10,059
the most specific tag you can get?

258
00:14:10,559 --> 00:14:10,919
okay.

259
00:14:11,419 --> 00:14:11,539
I,

260
00:14:12,039 --> 00:14:13,449
I'm not trying to put you on pressure.

261
00:14:13,569 --> 00:14:14,049
I'm not trying.

262
00:14:14,049 --> 00:14:14,529
No, no, no.

263
00:14:14,529 --> 00:14:15,454
I know, I understand.

264
00:14:15,454 --> 00:14:17,879
Let me, I want to make sure that
we're explaining this properly.

265
00:14:17,909 --> 00:14:22,339
'cause there's a gigantic amount of nuance
about what you're about to talk about.

266
00:14:22,369 --> 00:14:26,529
So this is what happens when
we don't pre-discuss our show.

267
00:14:26,649 --> 00:14:26,949
Okay.

268
00:14:27,449 --> 00:14:27,959
So.

269
00:14:28,459 --> 00:14:31,629
I think, I know, I think I'm
not gonna spoil the ending here.

270
00:14:31,929 --> 00:14:34,929
I think I understand where this is
headed, but just to reiterate to

271
00:14:34,929 --> 00:14:40,179
our audience, you're talking about,
like, if you go to Docker Hub, right.

272
00:14:40,179 --> 00:14:43,149
and Python's a perfect example
because there's so much

273
00:14:43,149 --> 00:14:45,509
machine learning stuff going on

274
00:14:45,614 --> 00:14:46,514
I am picking on it.

275
00:14:46,564 --> 00:14:51,894
those workloads are insanely
sensitive to the versions of like

276
00:14:51,894 --> 00:14:54,474
minor versions of a bunch of libraries.

277
00:14:54,474 --> 00:14:54,864
Right.

278
00:14:55,364 --> 00:15:01,884
So I've definitely been in this
position where I need a very specific

279
00:15:01,884 --> 00:15:04,224
version of Python in a container.

280
00:15:04,584 --> 00:15:08,604
So I go and I say FROM, you
know, Python, what did you say?

281
00:15:08,994 --> 00:15:10,254
3

282
00:15:10,449 --> 00:15:11,619
13 point something.

283
00:15:11,619 --> 00:15:11,859
Yeah.

284
00:15:12,039 --> 00:15:12,880
3.13.3.

285
00:15:13,380 --> 00:15:17,210
And so, you know, that's
what I put as my base image.

286
00:15:17,210 --> 00:15:21,410
And then I do QEMU, CUDA and all this
other stuff that I need to put into like

287
00:15:21,410 --> 00:15:23,470
the notebook or whatever, container image.

288
00:15:23,970 --> 00:15:27,780
So what you're saying is I go
ahead and build my image with

289
00:15:27,780 --> 00:15:30,110
that FROM, that base image.

290
00:15:30,610 --> 00:15:36,050
I am assuming if I change something
in my application code, but I'm not

291
00:15:36,050 --> 00:15:41,590
changing that FROM base image, but I
don't have it in my local registry.

292
00:15:41,590 --> 00:15:42,530
I don't have it cached.

293
00:15:42,550 --> 00:15:43,650
I do a no cached.

294
00:15:44,160 --> 00:15:46,260
A week later I rebuild.

295
00:15:46,260 --> 00:15:49,440
In theory, it should be the
exact same thing, except for

296
00:15:49,440 --> 00:15:50,880
the application code changes.

297
00:15:51,210 --> 00:15:53,940
Let's let, actually, let's just
assume I don't even touch anything

298
00:15:53,940 --> 00:15:55,560
in the application code, right?

299
00:15:55,560 --> 00:15:56,880
Like all that's the same.

300
00:15:57,380 --> 00:16:01,050
You are saying that if I, the
next day or the two days later

301
00:16:01,050 --> 00:16:03,690
or a week later and I redo that.

302
00:16:04,095 --> 00:16:05,325
It's non-deterministic.

303
00:16:05,325 --> 00:16:09,735
That build is not the same, even
though I've done all the best

304
00:16:09,735 --> 00:16:15,005
practices to ensure that I'm thinking,
making this as deterministic as

305
00:16:15,005 --> 00:16:16,985
possible, that's what you're saying.

306
00:16:17,375 --> 00:16:17,735
Right.

307
00:16:17,945 --> 00:16:20,465
I will find out that there's
differences for some reason.

308
00:16:20,825 --> 00:16:24,755
Yeah, if you were to manually hash the
image, it would be a different hash.

309
00:16:25,255 --> 00:16:26,185
There's a reason,

310
00:16:26,685 --> 00:16:26,975
Okay.

311
00:16:27,050 --> 00:16:30,680
and that is because a little,
a very, I think a very

312
00:16:30,680 --> 00:16:32,270
underrated, little known fact.

313
00:16:32,640 --> 00:16:35,490
We know that tags are mutable.

314
00:16:35,830 --> 00:16:37,150
All official images.

315
00:16:37,150 --> 00:16:40,210
As far as I know, all official
Docker Hub images are mutable.

316
00:16:40,710 --> 00:16:46,440
Docker Hub, as well as Harbor, both have
the option for you as an individual in

317
00:16:46,440 --> 00:16:48,960
your images to make the tags immutable.

318
00:16:48,990 --> 00:16:52,110
Which means the minute I make that
tag for any image, I can never reuse

319
00:16:52,110 --> 00:16:53,970
that tag for any other image built.

320
00:16:54,460 --> 00:16:56,010
that's a new option in Docker Hub.

321
00:16:56,010 --> 00:16:58,330
You can go in and say for my
organization and make them immutable,

322
00:16:58,570 --> 00:17:00,430
but official images are mutable.

323
00:17:00,930 --> 00:17:02,940
I think that's also in ECR.

324
00:17:03,340 --> 00:17:04,360
It's also in ECR.

325
00:17:04,390 --> 00:17:05,200
So anyways, keep going

326
00:17:05,485 --> 00:17:05,695
Yeah.

327
00:17:05,695 --> 00:17:11,615
It's probably gotta be, I mean,
because Python runs on something in

328
00:17:11,615 --> 00:17:16,375
that image, whether it's Debian or
Wolfie or Alpine, Python changes at a

329
00:17:16,375 --> 00:17:19,425
different speed than the underlying.

330
00:17:19,925 --> 00:17:20,275
Right?

331
00:17:20,375 --> 00:17:25,535
don't tag, and in most cases, there are
some that we do, some of the official

332
00:17:25,535 --> 00:17:30,665
images in Docker Hub are tagged where you
can see the version of the, the app that's

333
00:17:30,665 --> 00:17:35,935
running on it, like Postgres, and then the
version of Alpine underneath it, at least

334
00:17:35,965 --> 00:17:37,375
two minor version, not patch version.

335
00:17:37,615 --> 00:17:37,945
Okay.

336
00:17:38,215 --> 00:17:42,625
So you could then pin to, and then
when we have Debian, a lot of times

337
00:17:42,625 --> 00:17:44,875
what you'll see is like we're, we
got bookmark and then we're new.

338
00:17:45,055 --> 00:17:46,825
That's the version of Debian underneath.

339
00:17:46,915 --> 00:17:48,985
And then there's a new version of Debian.

340
00:17:49,285 --> 00:17:52,435
And for some reason we don't use
the SemVer tag versions of Debian.

341
00:17:52,435 --> 00:17:53,245
We always use it.

342
00:17:53,245 --> 00:17:55,505
It seems like at Docker Hub, we
always use the, what do we call those?

343
00:17:55,505 --> 00:17:58,085
The friendly names, the
fun names, the code names.

344
00:17:58,085 --> 00:17:58,805
I don't know what you wanna call 'em.

345
00:17:58,805 --> 00:17:59,345
The handle

346
00:17:59,650 --> 00:18:00,980
product names, I dunno,

347
00:18:01,055 --> 00:18:03,895
So we've had bookworm, yeah,
we've had a lot of other ones.

348
00:18:03,995 --> 00:18:05,955
And those two move at different speeds.

349
00:18:05,955 --> 00:18:11,525
So you could then try to pin to a tag
that is both the version of Postgres that

350
00:18:11,525 --> 00:18:14,015
you might need to use and the version
of Alpine that you wanna stick to.

351
00:18:14,515 --> 00:18:18,745
Because you want to be as deterministic
as possible, but even that's not

352
00:18:18,745 --> 00:18:24,225
enough, because underneath each one
of those tags, even though they're

353
00:18:24,225 --> 00:18:31,255
pinned to both, it's still mutable,
meaning that Docker rebuilds the image

354
00:18:31,525 --> 00:18:36,265
when the underlying os part changes
and they don't tell you about it.

355
00:18:36,475 --> 00:18:37,735
This isn't a conspiracy.

356
00:18:37,735 --> 00:18:41,155
I'm not, I'm not saying they
just don't have a good way

357
00:18:41,620 --> 00:18:43,030
There's no UX for it.

358
00:18:43,510 --> 00:18:46,720
Right, the UX is
staring at digest lists.

359
00:18:46,750 --> 00:18:52,330
That's the UX: looking at SHA hashes
and timestamps, and knowing when you last

360
00:18:52,330 --> 00:18:55,030
pulled it versus what the one is today,

361
00:18:55,150 --> 00:18:55,370
Mm.

362
00:18:55,400 --> 00:18:55,760
again.

363
00:18:56,210 --> 00:18:56,630
Okay.

364
00:18:57,130 --> 00:19:00,800
And just before the show, I'm gonna
shout out to Eric Smalling at Chainguard.

365
00:19:00,820 --> 00:19:04,410
He let me know that their particular
tool, chainctl, keeps, they

366
00:19:04,410 --> 00:19:10,070
have this archive of tracking
every digest for every tag.

367
00:19:10,570 --> 00:19:14,110
There's a limit to
what they do, but they track this.

368
00:19:14,440 --> 00:19:14,800
Okay.

369
00:19:15,190 --> 00:19:17,200
So I will just prove this theory.

370
00:19:17,450 --> 00:19:23,180
Their chainctl tool, that's going to
pull their history of them essentially

371
00:19:23,180 --> 00:19:26,990
pulling, I don't know exactly how they
get this data, but my guess is if I

372
00:19:26,990 --> 00:19:31,520
actually spent the week building a tool,
I was using Claude Code to build a tool.

373
00:19:31,520 --> 00:19:34,750
I'm calling Tag Tracker,
which will do this for me.

374
00:19:35,250 --> 00:19:38,460
So it's going to run every day and
it's gonna download all my favorite

375
00:19:38,460 --> 00:19:42,830
images and all different variations
of their tag, whether it's Python

376
00:19:42,830 --> 00:19:46,050
3, Python 3.13, Python 3.13.8,
like, it's gonna get 'em all.

377
00:19:46,290 --> 00:19:48,000
And then it's going to track the digest.

378
00:19:48,000 --> 00:19:50,040
And if the digest changes,
it makes a log entry.

379
00:19:50,070 --> 00:19:50,100
Okay?

380
00:19:50,600 --> 00:19:52,820
So they have this tool
and they already do this.

381
00:19:52,820 --> 00:19:54,110
They've been doing this all along.

382
00:19:54,610 --> 00:20:00,390
And so the Python 3.13.8 image,
the most specific pinned image you

383
00:20:00,390 --> 00:20:04,890
can have, has had four builds
in the time that it's existed.

384
00:20:05,070 --> 00:20:07,400
So it was built 10/7,

385
00:20:07,865 --> 00:20:10,175
10/8, 10/9, and 10/13.

386
00:20:10,675 --> 00:20:12,025
So which one are you running

387
00:20:12,525 --> 00:20:12,765
No

388
00:20:12,780 --> 00:20:14,130
if you decided to pull Python

389
00:20:14,130 --> 00:20:15,720
3.13.8 in production?

390
00:20:15,855 --> 00:20:17,440
You have to look at the digest and compare.

391
00:20:17,760 --> 00:20:21,430
Yeah. Now, the good news is
Kubernetes and Swarm,

392
00:20:21,670 --> 00:20:23,580
the two orchestrators that I care about,

393
00:20:24,080 --> 00:20:28,455
they are smart enough at least to
resolve that tag to the digest.

394
00:20:28,455 --> 00:20:29,685
And the digest is the, you know...

395
00:20:29,955 --> 00:20:35,025
For those not aware, the digest is
a content guarantee that you will

396
00:20:35,025 --> 00:20:41,145
always... if you use the digest, it's,
in theory, nearly impossible

397
00:20:41,145 --> 00:20:42,685
to have a collision of that name.

398
00:20:42,685 --> 00:20:45,375
And so it is a unique, content-addressable

399
00:20:45,875 --> 00:20:46,505
string.

400
00:20:47,005 --> 00:20:51,225
That was the whole premise and
basis for how we build Docker

401
00:20:51,225 --> 00:20:53,505
images and how we store them in
registries and how we run them.

402
00:20:53,685 --> 00:20:56,835
So the cool thing is Kubernetes
and Swarm both will, if you say,

403
00:20:56,865 --> 00:21:00,735
give me this version of Python,
they will at least ensure that the

404
00:21:00,735 --> 00:21:03,075
exact same digest is on every node.

405
00:21:03,315 --> 00:21:03,555
Right.

406
00:21:03,555 --> 00:21:06,015
So that like when you
run it, you're getting

407
00:21:06,235 --> 00:21:09,615
I didn't even think about that because
you could have had, like if you

408
00:21:09,615 --> 00:21:12,200
were running, if you were spinning.

409
00:21:12,275 --> 00:21:12,965
by the way, didn't

410
00:21:12,965 --> 00:21:17,385
If you were spinning up a new node,
like you had a node with this Python

411
00:21:17,385 --> 00:21:21,885
app on 10/08, and then in
your EKS cluster or whatever, you

412
00:21:21,885 --> 00:21:24,315
spun up a new node the next day,

413
00:21:24,815 --> 00:21:25,295
yeah.

414
00:21:25,395 --> 00:21:26,645
it could have, it could

415
00:21:26,825 --> 00:21:28,985
And this used to be able to happen.

416
00:21:29,045 --> 00:21:32,885
There was a day where the orchestration
was not always resolving the

417
00:21:32,885 --> 00:21:36,335
digest before it issued commands

418
00:21:37,505 --> 00:21:39,875
to each node to download an image, right?

419
00:21:39,875 --> 00:21:44,365
Like it might have been resolving
individually but at some point you have

420
00:21:44,365 --> 00:21:47,835
to resolve this inside of containerd
and dockerd, and, like, the low-level

421
00:21:47,835 --> 00:21:51,045
tooling always does this because
it has to pull down these tarballs

422
00:21:51,045 --> 00:21:52,545
and it has to get 'em from a source.

423
00:21:52,595 --> 00:21:55,880
but somewhere there's a human that
types in a number and then the

424
00:21:55,880 --> 00:21:59,275
computer converts it into the digest
at some point, in everything we use.

425
00:21:59,485 --> 00:22:03,645
So then what happens is when someone
learns this, they go, oh, well,

426
00:22:03,645 --> 00:22:05,445
I was told to pin the digests.

427
00:22:05,775 --> 00:22:12,245
So they can use tools to go in their
Dockerfiles and then pin to the digest.

428
00:22:12,245 --> 00:22:15,205
And the way you can do
this, I actually made a blog

429
00:22:15,260 --> 00:22:20,620
the challenge there is we want
human readable version tags, right?

430
00:22:20,620 --> 00:22:24,960
Like we don't communicate, like,
I don't say, oh, you need to, you

431
00:22:24,965 --> 00:22:29,340
know, the prerequisite for this is
Python Digest, blah, blah, blah,

432
00:22:29,340 --> 00:22:31,350
blah, blah, F1 three or something.

433
00:22:31,740 --> 00:22:37,190
Like, that's not how we communicate
about versions or what we need, right?

434
00:22:37,475 --> 00:22:38,135
yes.

435
00:22:38,225 --> 00:22:41,675
And that is the problem. The way
you get around that problem, at least

436
00:22:41,675 --> 00:22:46,285
how I do it: some people don't
realize that in the FROM line you can

437
00:22:46,285 --> 00:22:52,085
specify the tag and the digest together.

438
00:22:52,585 --> 00:22:56,545
Now, when you do it this way, the
tag becomes useless to the machine.

439
00:22:56,545 --> 00:22:58,195
The machine does not care
about the tag anymore.

440
00:22:58,195 --> 00:23:01,905
you could put whatever tag you want in
there, but the tag is like documentation.

441
00:23:01,905 --> 00:23:08,055
It's like a comment for the humans to know
that this is the tag that I intended and

442
00:23:08,055 --> 00:23:10,305
the digest was resolved by the computer.
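To make the "tag becomes a comment" point concrete, here is a small sketch that splits a reference of the form name:tag@digest and shows what is actually requested. The helper names `parse_image_ref` and `what_gets_pulled` are hypothetical and the digest values are placeholders; this is a simplification, not the parser Docker itself uses.

```python
from typing import NamedTuple, Optional


class ImageRef(NamedTuple):
    name: str
    tag: Optional[str]
    digest: Optional[str]


def parse_image_ref(ref: str) -> ImageRef:
    """Split 'name[:tag][@digest]' into its parts."""
    name_tag, _, digest = ref.partition("@")
    # A colon only marks a tag if it comes after the last slash;
    # otherwise it is a registry port, e.g. localhost:5000/app.
    last_slash = name_tag.rfind("/")
    colon = name_tag.rfind(":")
    if colon > last_slash:
        name, tag = name_tag[:colon], name_tag[colon + 1:]
    else:
        name, tag = name_tag, None
    return ImageRef(name, tag, digest or None)


def what_gets_pulled(ref: str) -> str:
    """Return the part of the reference that selects the content."""
    parsed = parse_image_ref(ref)
    if parsed.digest:
        # With a digest present, the tag is documentation for humans only.
        return f"{parsed.name}@{parsed.digest}"
    return f"{parsed.name}:{parsed.tag or 'latest'}"


if __name__ == "__main__":
    print(what_gets_pulled("python:3.13.8@sha256:abc"))
    print(what_gets_pulled("python:whatever@sha256:abc"))
```

Both calls in the demo resolve to the same content, which is why the tag can say anything once a digest is attached.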

443
00:23:10,485 --> 00:23:16,145
Now there's tooling that can give you this
SHA hash, and then will update it: when the

444
00:23:16,145 --> 00:23:20,495
Python version updates, it will make sure
that the SHA hash is the correct one. But

445
00:23:20,845 --> 00:23:26,685
none of these tools we've mentioned
so far will... on 10/7, you have

446
00:23:26,685 --> 00:23:29,805
this digest. Like, if I'm running Arm on
my servers, let's say I'm really cool

447
00:23:29,805 --> 00:23:30,855
and I'm running the latest Arm image.

448
00:23:30,855 --> 00:23:36,255
So then, so when I created my cluster
and I deployed, I was on this one.

449
00:23:36,645 --> 00:23:42,865
But then, you know, a week later I
realize, oh, there's this new version.

450
00:23:43,325 --> 00:23:48,445
You need to be sure that you're on the
latest build of every image, even

451
00:23:48,445 --> 00:23:51,055
though the tag hasn't changed.

452
00:23:51,555 --> 00:23:52,635
you need to be aware of this.

453
00:23:52,635 --> 00:23:54,735
You need to have PRs that are
automated for this, and you

454
00:23:54,735 --> 00:23:55,695
need to be redeploying these.

455
00:23:56,200 --> 00:23:58,480
Okay, so if you don't know about
High Fivers, this is a little ad.

456
00:23:58,710 --> 00:24:02,250
High Fivers is a group of DevOps
professionals and we all meet once a

457
00:24:02,250 --> 00:24:07,380
month in Discord, and we have a video
call and we complain about our jobs.

458
00:24:07,880 --> 00:24:10,940
talk about DevOps and the problems
we're having and the solutions.

459
00:24:10,940 --> 00:24:12,530
And we talk about
Kubernetes and containers.

460
00:24:12,740 --> 00:24:16,040
So if you're interested in High Fivers,
you can become a member in this channel

461
00:24:16,295 --> 00:24:17,345
and pick High Fivers.

462
00:24:17,825 --> 00:24:18,995
You can join us once a month.

463
00:24:18,995 --> 00:24:20,375
It costs about as much as a cup of coffee.

464
00:24:20,625 --> 00:24:22,175
you get to join and learn.

465
00:24:22,175 --> 00:24:25,395
And that's the whole point of that
group. I'm thinking of renaming it to the

466
00:24:25,400 --> 00:24:28,965
Agentic DevOps Guild to make it sound
like we're some sort of superheroes.

467
00:24:29,295 --> 00:24:30,575
But that's a thing we do.

468
00:24:30,575 --> 00:24:35,425
So yesterday, we did that for this
month and I was listening to one of our

469
00:24:35,425 --> 00:24:40,555
regulars, and Brandon was talking about
having this exact same problem at work,

470
00:24:40,555 --> 00:24:45,565
but not realizing the core problem of
it and the actual best solution of it.

471
00:24:45,715 --> 00:24:49,075
And so what they had was they
were going microservices.

472
00:24:49,575 --> 00:24:54,250
When they go microservices, the effect
of that sometimes is... well,

473
00:24:54,250 --> 00:24:57,880
they were going so specific with their
microservices that they

474
00:24:57,880 --> 00:25:01,360
were making a microservice per verb,
not just per endpoint, but per verb.

475
00:25:01,735 --> 00:25:01,955
Wow.

476
00:25:02,455 --> 00:25:04,935
Yeah, I mean, Carpe diem, so.

477
00:25:05,445 --> 00:25:09,585
that is not a wrong decision in the
right positions, in the right state.

478
00:25:09,615 --> 00:25:13,095
But what happens is, is that means
that you're writing code that

479
00:25:13,095 --> 00:25:14,835
doesn't need to change very often.

480
00:25:15,335 --> 00:25:18,275
So you're pinning to whatever, let's
say it's a Golang thing, right?

481
00:25:18,275 --> 00:25:20,895
You're pinning Golang,
or you're pinning, Java.

482
00:25:20,925 --> 00:25:25,575
And so you have that image, but
the code's kind of done, like

483
00:25:25,605 --> 00:25:28,095
we've got a thousand lines of code
or whatever we're kind of done.

484
00:25:28,395 --> 00:25:31,425
And so it sits there, it
becomes stale on the servers.

485
00:25:31,575 --> 00:25:35,235
For lots of us that are deploying cloud
native apps, the developers are constantly

486
00:25:35,235 --> 00:25:39,015
developing, so the images get refreshed
pretty quickly all the time in production.

487
00:25:39,015 --> 00:25:40,455
So they don't age.

488
00:25:40,955 --> 00:25:44,315
But what about your stat... your
DaemonSets, your Postgres?

489
00:25:44,315 --> 00:25:44,705
Yeah.

490
00:25:44,735 --> 00:25:46,970
Like everything, there's
like all this other stuff

491
00:25:47,300 --> 00:25:48,890
There's this other stuff
that just sits there.

492
00:25:48,890 --> 00:25:51,140
every day it sits there running.

493
00:25:51,640 --> 00:25:53,470
It's getting more CVEs over time.

494
00:25:53,470 --> 00:25:54,760
there's no way to avoid it.

495
00:25:54,940 --> 00:25:58,510
I don't care what you're running,
it's going to have CVEs eventually.

496
00:25:59,020 --> 00:26:03,840
So, so we get back to this core
problem of this thing is being rebuilt

497
00:26:04,340 --> 00:26:07,930
for the good of the internet, and no
one knows that it's being rebuilt.

498
00:26:07,930 --> 00:26:10,060
That's why I call it Silent Rebuilds.

499
00:26:12,485 --> 00:26:12,765
Interesting.

500
00:26:13,265 --> 00:26:15,365
I'm clearly on a soapbox moment, right?

501
00:26:15,365 --> 00:26:20,065
Because I have known this for five
years and I haven't cared enough.

502
00:26:20,565 --> 00:26:23,915
And it wasn't until Eric and I started
hanging out and talking about what are

503
00:26:23,915 --> 00:26:26,885
we gonna do for this piece of content
that we're gonna create together?

504
00:26:26,885 --> 00:26:32,425
And I started to care because I realized
that I could buy all the Chainguard images

505
00:26:32,925 --> 00:26:37,035
and spend however much money I wanna
spend, unlimited amounts of money just to

506
00:26:37,035 --> 00:26:38,775
buy every image I could possibly imagine.

507
00:26:38,925 --> 00:26:42,795
And then I could deploy those into
production on day one with zero CVEs.

508
00:26:42,795 --> 00:26:44,805
And on day two, they now have CVEs.

509
00:26:45,305 --> 00:26:46,355
What do I do then?

510
00:26:46,565 --> 00:26:47,315
What's my plan?

511
00:26:47,685 --> 00:26:47,975
Yeah.

512
00:26:48,095 --> 00:26:51,575
And Chainguard, their answer to that
is they literally rebuild every day.

513
00:26:51,575 --> 00:26:54,215
Like according to Eric,
I'll put 'em on the spot.

514
00:26:54,455 --> 00:26:55,505
They rebuild every day.

515
00:26:55,745 --> 00:26:57,305
They are just constantly rebuilding.

516
00:26:57,805 --> 00:27:02,825
So, if this doesn't seem interesting
to you and you're operating container

517
00:27:02,825 --> 00:27:05,695
environments, this is super important.

518
00:27:06,195 --> 00:27:08,615
'Cause this should be part of
that strategy that you started

519
00:27:08,615 --> 00:27:09,935
this conversation with, right?

520
00:27:09,935 --> 00:27:15,865
Like, how do you answer that question
if you're being goaled on being in

521
00:27:15,865 --> 00:27:20,485
a better security posture, which part
of that would be removing as many

522
00:27:20,485 --> 00:27:25,155
CVEs as possible in whatever window
of time you're tracking it in.

523
00:27:25,390 --> 00:27:25,680
yeah.

524
00:27:25,800 --> 00:27:26,000
Yeah.

525
00:27:26,115 --> 00:27:30,895
And if you wanna learn more info,
because I think you just barely

526
00:27:30,895 --> 00:27:34,825
scratched the surface on this topic
with me today, so I appreciate that.

527
00:27:35,001 --> 00:27:37,881
to me it feels like it's a
very useful tool, solves a

528
00:27:37,881 --> 00:27:39,501
very specific niche problem.

529
00:27:39,501 --> 00:27:42,391
It is very much a Unix type style,
you know, solve one problem,

530
00:27:42,391 --> 00:27:43,801
do it great kind of thing.

531
00:27:44,161 --> 00:27:51,661
And, but I feel like in the quest
for us moving to zero CVE, like the

532
00:27:51,661 --> 00:27:55,321
eternal quest that will never, shall
probably never be fully realized.

533
00:27:55,821 --> 00:27:57,141
The north, the North star.

534
00:27:57,236 --> 00:27:57,476
Yeah.

535
00:27:57,476 --> 00:27:58,406
The North Star,

536
00:27:58,906 --> 00:27:59,086
yeah.

537
00:27:59,281 --> 00:27:59,491
yeah.

538
00:27:59,491 --> 00:28:03,321
CVE and CVE publication is random.

539
00:28:03,501 --> 00:28:08,411
CVE fixes are random. Upstream
happens at random times.

540
00:28:08,411 --> 00:28:11,831
So at any minute of any day, there
are many things in the works.

541
00:28:11,921 --> 00:28:16,271
And eventually it arrives at
the base image that you are not

542
00:28:16,271 --> 00:28:19,661
building, but you are using
from some base image provider.

543
00:28:19,721 --> 00:28:20,801
And that moment

544
00:28:21,301 --> 00:28:24,991
starts a race. When I
imagine this, it's like, I

545
00:28:24,991 --> 00:28:26,251
watched the F1 movie this summer.

546
00:28:26,251 --> 00:28:28,561
It was a great Brad Pitt movie.

547
00:28:28,891 --> 00:28:29,611
Loved it.

548
00:28:29,671 --> 00:28:32,191
Of course, you gotta be a Brad
Pitt fan, but it was, I think

549
00:28:32,191 --> 00:28:32,971
it was the movie of the summer.

550
00:28:32,971 --> 00:28:33,601
It was fantastic.

551
00:28:33,881 --> 00:28:38,261
And so I'm thinking of a Formula One
scenario of the minute my upstream

552
00:28:38,261 --> 00:28:43,001
image is rebuilt to remove that
CVE, it is a clock that is now my

553
00:28:43,001 --> 00:28:49,211
clock of how fast can I detect it,
PR it, and push it to production.

554
00:28:49,331 --> 00:28:49,721
Right?

555
00:28:50,171 --> 00:28:53,971
And that for some teams is
never, it never happens.

556
00:28:54,471 --> 00:28:59,241
but as the maturity of your team improves
and you get more automated and more

557
00:28:59,241 --> 00:29:04,911
sophisticated, and you become aware even
that some of these mechanisms exist,

558
00:29:05,061 --> 00:29:09,551
you start shortening that window to
the point where ideally we're gonna get

559
00:29:09,551 --> 00:29:12,851
to someday where things aren't polling
anymore; everything's webhooked.

560
00:29:12,851 --> 00:29:15,251
Everything is, is chained.

561
00:29:15,751 --> 00:29:16,291
And.

562
00:29:16,791 --> 00:29:20,121
The way I understand, there's a great
video on Chainguard about how they build

563
00:29:20,121 --> 00:29:22,131
out their entire internal infrastructure.

564
00:29:22,131 --> 00:29:25,371
And it's largely based on GitHub
Actions, a little bit of Argo,

565
00:29:25,885 --> 00:29:29,525
CI, but it's largely GitHub
Actions, and they

566
00:29:29,525 --> 00:29:30,965
build everything from source.

567
00:29:31,175 --> 00:29:33,155
They build up everything
deterministically.

568
00:29:33,425 --> 00:29:37,295
They can truly do reproducible builds
unlike a lot of what I see out there.

569
00:29:37,295 --> 00:29:43,285
and it's all very sophisticated,
like, top-tier, expert-level stuff.

570
00:29:43,525 --> 00:29:44,965
The rest of us aren't that.

571
00:29:45,235 --> 00:29:45,505
Right.

572
00:29:45,505 --> 00:29:46,945
The rest of us aren't like that.

573
00:29:47,035 --> 00:29:50,635
And they have other competitors,
like, obviously, Docker has Hardened Images.

574
00:29:50,635 --> 00:29:53,025
So there's a lot of smart
people, but you and I are not

575
00:29:53,025 --> 00:29:54,765
the smart people necessarily.

576
00:29:55,005 --> 00:29:55,995
We don't have to be that smart.

577
00:29:56,205 --> 00:30:00,035
My point is that they're doing
that hard work, but because they're all

578
00:30:00,035 --> 00:30:04,595
providing us these updated digests because
they're rebuilding it like it's now on us.

579
00:30:04,595 --> 00:30:08,735
now that I've told you, it's on us
because we now have the knowledge that

580
00:30:08,735 --> 00:30:13,825
it is our job now to detect that
difference and then do something about it.

581
00:30:14,205 --> 00:30:14,865
Do something about it.

582
00:30:15,245 --> 00:30:24,975
This is like the day the tag is
invented, and it's 3.18.9,

583
00:30:25,305 --> 00:30:27,165
the day that it's invented, the epoch.

584
00:30:27,495 --> 00:30:35,840
And then there are events that happen,
downstream and there's a day where, that

585
00:30:35,840 --> 00:30:41,260
new digest happens, and then there's
a day where you actually deploy it.

586
00:30:41,680 --> 00:30:46,200
And I don't think this happens in a lot
of organizations at all, because there's

587
00:30:46,200 --> 00:30:52,390
never an opportunity for them to update
the digest of the same tag because

588
00:30:52,390 --> 00:30:57,340
we've been yelling from the rooftops,
"Pin to the SHA hash, pin to the digest."

589
00:30:57,760 --> 00:31:02,020
But when you do that, you've now actually
created a new problem because you're

590
00:31:02,020 --> 00:31:06,150
forcing yourself to a moment in time
with a set of binaries that every day

591
00:31:06,150 --> 00:31:08,340
after get worse in terms of security.

592
00:31:08,840 --> 00:31:10,820
And so they're not aging like fine wine.

593
00:31:11,210 --> 00:31:13,310
They're aging like spoiled, spoiled

594
00:31:13,415 --> 00:31:15,635
think there was good intent behind that,

595
00:31:15,950 --> 00:31:16,160
Oh, yeah.

596
00:31:16,160 --> 00:31:16,370
Yeah.

597
00:31:16,370 --> 00:31:17,360
you have to still do it.

598
00:31:17,855 --> 00:31:21,445
removing non-determinism is a good intent

599
00:31:21,945 --> 00:31:22,275
Right.

600
00:31:22,325 --> 00:31:25,265
I get what you mean, but pinning
is like half the problem.

601
00:31:25,535 --> 00:31:27,845
That's what you're kind of
saying, or half of the solution,

602
00:31:27,975 --> 00:31:32,925
It's like you unintentionally create a
side effect when you pin to a digest, and

603
00:31:32,925 --> 00:31:37,905
that is there's no opportunity for your
existing tools to deploy something newer.

604
00:31:38,405 --> 00:31:40,020
So, like, if you went to, like, 3.18.

605
00:31:40,520 --> 00:31:43,640
'Cause a lot of teams, especially,
I see, Node teams, especially

606
00:31:43,640 --> 00:31:47,330
teams that are just using Node to
build a front end system, they're not

607
00:31:47,330 --> 00:31:49,400
actually running Node on a server.

608
00:31:49,400 --> 00:31:52,910
They just use it in CI, they
pin to the minor version.

609
00:31:52,910 --> 00:31:55,400
They might just pin to the major
version on a node version because

610
00:31:55,450 --> 00:31:59,890
they only need it to do JavaScript
compilation and CSS-ification

611
00:31:59,890 --> 00:32:00,820
and like, stuff like that, right?

612
00:32:01,180 --> 00:32:06,070
So they don't really care as
much as long as the original open

613
00:32:06,070 --> 00:32:09,160
source team obeys SemVer
rules and all that stuff, right?

614
00:32:09,210 --> 00:32:10,140
they don't break anything.

615
00:32:10,470 --> 00:32:15,760
The problem is, when you pin to that
thing, where you do this and then

616
00:32:16,260 --> 00:32:21,350
it looks like that, is that Docker
and Chainguard and GitHub and everyone

617
00:32:21,350 --> 00:32:25,930
who, you know, and AWS's public
images, like I bet you they're all

618
00:32:25,960 --> 00:32:28,930
doing the same thing that Docker Hub
originally did, which is, well, we've

619
00:32:28,930 --> 00:32:30,640
got a new version of the underlying OS.

620
00:32:30,640 --> 00:32:34,480
Let's go ahead and reduce the CVE
count, let's do the right thing and

621
00:32:34,480 --> 00:32:36,880
update the tags with a better CVE count.

622
00:32:37,100 --> 00:32:37,940
that's the right thing.

623
00:32:38,210 --> 00:32:42,280
But there is very little indication
in the UI that that ever happened,

624
00:32:42,460 --> 00:32:42,910
right?

625
00:32:42,910 --> 00:32:47,010
It's not communicated with
the right language about the

626
00:32:47,010 --> 00:32:48,990
deterministic nature of that build

627
00:32:49,490 --> 00:32:49,820
right?

628
00:32:50,320 --> 00:32:52,570
or the reproducible nature of that build

629
00:32:52,900 --> 00:32:55,195
Because you can see, when you
look at all the tags, you see

630
00:32:55,195 --> 00:32:56,695
dates, you're like rebuilt.

631
00:32:56,695 --> 00:32:58,675
You can see all these built
yesterday, built yesterday.

632
00:32:59,175 --> 00:33:03,195
but there's nothing that indicates
this is the third variant or the

633
00:33:03,195 --> 00:33:07,315
third rebuild of this image. Imagine
if it said, like, every time it rebuilt,

634
00:33:07,315 --> 00:33:08,580
it added something to the tag.

635
00:33:09,080 --> 00:33:10,790
Like build one, build two,

636
00:33:10,840 --> 00:33:11,950
Build three, build five.

637
00:33:11,950 --> 00:33:12,250
Yeah.

638
00:33:12,640 --> 00:33:12,880
Yeah,

639
00:33:13,015 --> 00:33:13,855
which they just don't do.

640
00:33:13,855 --> 00:33:15,175
And maybe they don't need to.

641
00:33:15,175 --> 00:33:17,935
Maybe that's not something we want
to get into because again, that

642
00:33:17,935 --> 00:33:19,705
doesn't really solve the problem.

643
00:33:19,825 --> 00:33:21,145
It just makes you more aware.

644
00:33:21,315 --> 00:33:26,835
Like when OS package managers rebuild
an app, as far as I know, like in

645
00:33:26,835 --> 00:33:33,215
apt, when curl is changed in some
fundamental way, it gets a new version.

646
00:33:33,245 --> 00:33:36,095
'Cause if you look at, like,
apt tags, it's like the curl

647
00:33:36,095 --> 00:33:38,885
version and then something after,
which is kind of like the apt

648
00:33:39,365 --> 00:33:40,055
variant.

649
00:33:40,555 --> 00:33:42,355
I, I think, and at least
is how I understand it.

650
00:33:42,355 --> 00:33:45,145
So I actually think that this is a
pretty unique problem to container

651
00:33:45,145 --> 00:33:50,455
images because they include app
code and dependencies and OS code

652
00:33:50,455 --> 00:33:52,225
dependencies or OS package managers.

653
00:33:52,225 --> 00:33:53,515
I think it's because of the two

654
00:33:53,669 --> 00:33:56,520
you could be the most secure person
in the world and if you don't do this

655
00:33:56,520 --> 00:34:00,960
particular step, you're still gonna have
more CVEs in production than you should.

656
00:34:01,210 --> 00:34:02,920
it's actually very
little work to fix this.

657
00:34:02,920 --> 00:34:06,990
It's just knowing about it that's the
actual, I think, problem for everyone.

658
00:34:07,490 --> 00:34:08,870
Well thank you so much, Brett.

659
00:34:09,370 --> 00:34:11,110
Now I know what silent rebuilds are

660
00:34:11,530 --> 00:34:12,820
Silent rebuilds.

661
00:34:12,970 --> 00:34:14,290
and to everyone that's

662
00:34:14,790 --> 00:34:16,110
I'm gonna, we're gonna see if that sticks.

663
00:34:16,610 --> 00:34:17,060
okay.

664
00:34:17,360 --> 00:34:18,560
That was the stream we did.

665
00:34:18,560 --> 00:34:22,100
But since then I've had some updates that
I wanted to tell you about real quick.

666
00:34:22,430 --> 00:34:28,040
First, I've released a video walkthrough,
a repo, and a long blog post breaking

667
00:34:28,040 --> 00:34:31,700
down the problem, and a detailed series
of solutions using either Renovate

668
00:34:31,760 --> 00:34:36,080
or Dependabot to ensure you're
getting PRs for all the silent rebuilds.

669
00:34:36,450 --> 00:34:41,190
Second, I did more research and confirmed
again that the official Docker Hub images

670
00:34:41,190 --> 00:34:45,930
are rebuilt to a different digest for
the same image tag on a random schedule.

671
00:34:46,260 --> 00:34:51,135
Remember, official images of open source
software are usually volunteers in open

672
00:34:51,135 --> 00:34:55,185
source repos making these changes, and
each of these official images has its

673
00:34:55,185 --> 00:35:00,105
own repo dedicated to the Dockerfile that
builds and pushes the image to Docker Hub,

674
00:35:00,195 --> 00:35:02,475
and those images are silently rebuilt,

675
00:35:02,565 --> 00:35:05,355
typically, like any GitHub Actions
workflow, whenever there's a

676
00:35:05,355 --> 00:35:07,215
commit to the release branches.

677
00:35:07,455 --> 00:35:11,895
Now those commits might be something that
could reduce CVEs or it could just be

678
00:35:11,895 --> 00:35:15,765
an update to a read me file or any other
reason you might make a commit to a repo.

679
00:35:16,105 --> 00:35:20,005
The real point here is that we don't
know why these images are silently

680
00:35:20,005 --> 00:35:24,585
rebuilt and there are no tools currently
to help with that, which means that we

681
00:35:24,585 --> 00:35:29,235
have to assume that every rebuild of
an image tag with a new digest is for

682
00:35:29,235 --> 00:35:34,065
a good reason and that we should deploy
it with the same mindset that we deploy

683
00:35:34,065 --> 00:35:36,455
updates that do change the SemVer tag.

684
00:35:36,805 --> 00:35:38,235
We just don't know why,

685
00:35:38,715 --> 00:35:42,255
unlike the SemVer changes.
Again, the solution isn't hard.

686
00:35:42,435 --> 00:35:46,185
Just implement Renovate or Dependabot
with the right settings, which

687
00:35:46,185 --> 00:35:51,105
I show off in the repo, and they will
both ensure that your base images are

688
00:35:51,105 --> 00:35:55,815
using digest pinning and that they're
checking every time you run 'em

689
00:35:56,235 --> 00:36:01,865
(recommended daily) for silent rebuilds
of that tag to a different digest.
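As a rough illustration of what such a digest-update change amounts to, the sketch below rewrites pinned FROM lines to whatever digest the tag currently resolves to, leaving the tag in place. It is not how Renovate or Dependabot are implemented; the `repin` function and the `resolve_digest` callable are hypothetical, and only the simple "FROM name:tag@sha256:..." form is handled.

```python
import re
from typing import Callable

# Matches "FROM name:tag@sha256:<hex>" and captures the "name:tag" part.
# --platform flags and ARG-based references are out of scope for this sketch.
PINNED_FROM = re.compile(
    r"^(FROM\s+)(\S+?:[^\s@]+)@(sha256:[0-9a-f]+)", re.MULTILINE
)


def repin(dockerfile: str, resolve_digest: Callable[[str], str]) -> str:
    """Replace each pinned digest with the tag's current digest.

    `resolve_digest` is a hypothetical lookup that takes "name:tag"
    and returns the digest the registry serves for it today.
    """

    def swap(match: re.Match) -> str:
        prefix, name_tag = match.group(1), match.group(2)
        return f"{prefix}{name_tag}@{resolve_digest(name_tag)}"

    return PINNED_FROM.sub(swap, dockerfile)


if __name__ == "__main__":
    # Placeholder digests for illustration only.
    before = "FROM python:3.13.8@sha256:aaa AS build\nRUN true\n"
    after = repin(before, lambda name_tag: "sha256:bbb")
    print(after, end="")
```

If the output differs from the input, the tag was silently rebuilt and the diff is what a pull request would contain; if it is identical, there is nothing to deploy.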

690
00:36:02,388 --> 00:36:06,888
I like these options because they bring
your container dependency updates into

691
00:36:06,888 --> 00:36:10,968
the same tool and process that you check
for application level dependency updates.

692
00:36:11,418 --> 00:36:15,498
In my example repo for this solution,
I have the settings you need to add to

693
00:36:15,498 --> 00:36:19,998
each repo, and I have a bunch of sample
PRs that show off the various ways

694
00:36:19,998 --> 00:36:24,598
these tools check for image updates,
including the silent rebuilds. All the

695
00:36:24,598 --> 00:36:28,138
resources are in the show notes and
if you have questions, feel free to

696
00:36:28,138 --> 00:36:33,580
start a discussion in the repo or in my
Discord server, and thanks for watching.

697
00:36:33,760 --> 00:36:34,840
I'll see you in the next episode.