1
00:00:04,333 --> 00:00:08,053
This is DevOps and Docker Talk,
and I am your host, Bret Fisher.

2
00:00:08,413 --> 00:00:10,483
This was a fun episode this week.

3
00:00:10,483 --> 00:00:14,743
Nirmal Mehta is back from
AWS and our guest this week

4
00:00:14,773 --> 00:00:17,443
was Michael Irwin of Docker.

5
00:00:17,443 --> 00:00:22,183
He is a recurring guest if you've been
listening to this podcast for any length

6
00:00:22,183 --> 00:00:25,743
of time, I think we had him on earlier
this year, and he's been a friend for a

7
00:00:25,743 --> 00:00:30,693
decade, former Docker captain, now Docker
employee advocating for Docker everywhere.

8
00:00:30,693 --> 00:00:34,233
He's all over the place, and I
always have him on because he

9
00:00:34,233 --> 00:00:39,109
breaks stuff down to help us all
understand what's going on at Docker.

10
00:00:39,199 --> 00:00:45,109
And there is a giant list of things we had
to talk about this week, all AI related,

11
00:00:45,139 --> 00:00:49,789
all separate pieces of the puzzle that
can be used independently or together.

12
00:00:49,969 --> 00:00:54,259
So in this episode, we will be covering
the Docker Model Runner, which I've

13
00:00:54,259 --> 00:00:58,609
talked about a lot on this show over
the last four months, for running open

14
00:00:58,669 --> 00:01:03,109
or free models locally or remotely on
servers or on your machine, wherever.

15
00:01:03,229 --> 00:01:05,599
Basically you get to run your own
models and Docker will host them for

16
00:01:05,599 --> 00:01:10,289
you, Docker will encapsulate and allow
you to use the Docker CLI to manage it.

17
00:01:10,589 --> 00:01:15,539
Then we have the hub model catalogs,
so you can pick one of the dozens of

18
00:01:15,539 --> 00:01:19,649
models I guess, that are available in
Docker Hub and we talk about Gordon

19
00:01:19,649 --> 00:01:24,509
AI, which is their chatbot built into
Docker Desktop and the Docker CLI.

20
00:01:25,059 --> 00:01:31,509
We then get into the MCP Toolkit and the
Hub's MCP catalog, and how to bring all

21
00:01:31,509 --> 00:01:37,504
our tools into our local AI, or, if
you're using some other AI and you're

22
00:01:37,504 --> 00:01:39,124
just wanting to use your own MCP tools.

23
00:01:39,334 --> 00:01:43,204
We talk about how Docker manages
all that and it uses the MCP gateway

24
00:01:43,204 --> 00:01:47,764
that they recently open sourced
to front end all those tools and

25
00:01:47,764 --> 00:01:49,534
to help you manage your MCP tools.

26
00:01:50,074 --> 00:01:56,554
We also then get into compose and how
you add models and the MCP gateway plus

27
00:01:56,554 --> 00:02:00,049
MCP tools, all into your Compose files.

28
00:02:00,319 --> 00:02:03,979
Then we talk about how to use that
for building agents a little bit.

29
00:02:04,089 --> 00:02:10,209
And then Offload, which allows
you to build, run containers,

30
00:02:10,209 --> 00:02:15,999
run Docker models, all in Docker
Cloud, and they call it Offload

31
00:02:15,999 --> 00:02:18,264
It actually is a pretty unique name.

32
00:02:18,264 --> 00:02:21,564
Most other people just called it Docker
Cloud, but they call it Docker Offload.

33
00:02:21,834 --> 00:02:22,464
Great name.

34
00:02:22,464 --> 00:02:25,254
You just toggle inside your
Docker UI and you're good to go.

35
00:02:25,254 --> 00:02:26,064
Everything's remote.

36
00:02:26,154 --> 00:02:31,764
And then at the very end we talked
about how Compose now works, at least

37
00:02:31,764 --> 00:02:35,364
the YAML files of Compose, now
work inside of Google Cloud Run.

38
00:02:35,704 --> 00:02:37,744
Probably coming to other places soon.

39
00:02:37,924 --> 00:02:43,504
So we go for almost 90 minutes on
this show, for good reason: we

40
00:02:43,534 --> 00:02:49,904
talk about the use cases of all these
different various AI parts of the puzzle.

41
00:02:50,264 --> 00:02:55,814
And I'm excited because it paints,
I hope, a complete picture for you

42
00:02:56,114 --> 00:03:00,579
of what Docker has released, how it
all works together, when you would

43
00:03:00,579 --> 00:03:04,149
choose each part for solving
different problems, and all that.

44
00:03:04,359 --> 00:03:08,709
So please enjoy this episode with
Nirmal and Michael Irwin of Docker.

45
00:03:10,863 --> 00:03:11,533
Hi!

46
00:03:12,020 --> 00:03:13,071
Hey. Awesome.

47
00:03:13,444 --> 00:03:14,144
I'm doing all right.

48
00:03:14,174 --> 00:03:17,255
We have a special guest,
Michael, another Virginian.

49
00:03:17,265 --> 00:03:17,955
Hello.

50
00:03:18,699 --> 00:03:19,179
Hello,

51
00:03:19,472 --> 00:03:21,222
We have all known each other for a decade.

52
00:03:21,242 --> 00:03:22,592
So Michael, tell us who you are.

53
00:03:23,052 --> 00:03:25,302
So, I'm Michael Irwin, I work at Docker.

54
00:03:25,302 --> 00:03:29,896
I'm in our developer success team,
and do a lot of teaching, training,

55
00:03:29,896 --> 00:03:34,966
education, breaking down the stuff,
trying to make it somewhat understandable.

56
00:03:35,056 --> 00:03:38,936
I've been with Docker for almost
three and a half years now, so it's

57
00:03:38,936 --> 00:03:41,726
gone pretty quickly, but, it's fun
to spend time with the community

58
00:03:41,776 --> 00:03:43,436
and just help developers out.

59
00:03:43,836 --> 00:03:47,006
Learn about this stuff, but also
how do you actually take advantage

60
00:03:47,006 --> 00:03:49,086
and do cool stuff with it?

61
00:03:49,385 --> 00:03:50,115
That's awesome.

62
00:03:50,145 --> 00:03:53,705
I mean, this is like a new wave
of, essentially development

63
00:03:53,715 --> 00:03:56,515
tooling and, it's pretty exciting.

64
00:03:56,915 --> 00:03:57,275
Yeah.

65
00:03:57,275 --> 00:03:58,265
We've got a list, people.

66
00:03:58,365 --> 00:04:01,765
I only make lists on this
show a couple of times a year.

67
00:04:02,015 --> 00:04:05,495
Yeah, it does seem like every time I'm
on the show, there's a list involved.

68
00:04:06,065 --> 00:04:07,685
Yeah, that's a pretty decent list.

69
00:04:07,965 --> 00:04:10,965
Let's break it down because we got
a lot to get through and we realize

70
00:04:10,975 --> 00:04:13,895
that on Docker's channel and on
this channel, they've been talking

71
00:04:13,895 --> 00:04:15,175
about all these fun features.

72
00:04:15,205 --> 00:04:17,425
I've talked at length
about Docker Model Runner.

73
00:04:17,835 --> 00:04:23,155
The Hub Catalog. I don't have a bunch of
videos on MCP or the MCP tools, but

74
00:04:23,725 --> 00:04:25,955
actually we've had streams on it before.

75
00:04:26,005 --> 00:04:30,605
But I think the MCP Toolkit and the
MCP Gateway are what I'm most excited about.

76
00:04:30,605 --> 00:04:31,935
So we're going to get
to that in a little bit.

77
00:04:32,329 --> 00:04:35,789
I think, technically, the first
thing out of the gate was Gordon AI.

78
00:04:36,794 --> 00:04:38,704
This, to me, is...

79
00:04:39,439 --> 00:04:44,509
From a user's perspective, it's a Docker
focused or maybe a developer focused

80
00:04:44,579 --> 00:04:52,219
AI chatbot, essentially similar to
ChatGPT, but it's in my Docker interface.

81
00:04:52,279 --> 00:04:56,329
If I'm staring at the Docker Desktop
interface, there's an Ask Gordon button.

82
00:04:56,929 --> 00:05:00,499
And if you've never clicked that,
or if you clicked it once a couple

83
00:05:00,499 --> 00:05:03,789
years ago, it has changed a lot.

84
00:05:03,849 --> 00:05:09,685
There have been enhancements, so now
we have memory and threads, and it saves

85
00:05:09,685 --> 00:05:14,355
me from having to go to ChatGPT when
I'm working, specifically around Docker

86
00:05:14,355 --> 00:05:18,505
stuff, but it feels like I can
ask it anything developer related.

87
00:05:18,795 --> 00:05:21,275
Can you tell me now, is
this free for everyone?

88
00:05:21,275 --> 00:05:24,675
How does this play out in
terms of who can use this?

89
00:05:24,675 --> 00:05:26,655
Yeah, so everybody can
access it right now.

90
00:05:26,655 --> 00:05:29,665
The only limitations may be
some of our business orgs.

91
00:05:29,705 --> 00:05:32,065
Those orgs have to enable it.

92
00:05:32,535 --> 00:05:35,825
We're just not going to roll out all
the AI stuff as most organizations

93
00:05:35,825 --> 00:05:37,155
are pretty cautious about that.

94
00:05:37,442 --> 00:05:39,522
but yeah, Gordon's available to everybody.

95
00:05:39,572 --> 00:05:43,562
It started off actually mostly as a
documentation helper, of just, help me

96
00:05:43,572 --> 00:05:46,802
answer questions and keep up with
new features and that kind of stuff.

97
00:05:47,292 --> 00:05:51,472
But, as you've noted, we've added new
capabilities and new tools along the way.

98
00:05:51,937 --> 00:05:55,961
I was doing some experiments just
the other day of, Hey Gordon, help

99
00:05:55,961 --> 00:06:00,601
me convert this Dockerfile to use
the new Docker Hardened Images.

100
00:06:00,611 --> 00:06:05,511
And it would do the conversion, find the
images that my organization has that match

101
00:06:05,511 --> 00:06:07,481
the one I'm using in this Dockerfile.

102
00:06:07,531 --> 00:06:10,981
You're starting to see more and more
capabilities built into it as well. It's

103
00:06:10,981 --> 00:06:12,681
a pretty fun little assistant there.

104
00:06:13,057 --> 00:06:13,567
Yeah,

105
00:06:13,694 --> 00:06:18,714
I highly recommend, for folks that are
listening: if you've not touched Docker and

106
00:06:18,714 --> 00:06:23,714
you just open it up, check out Gordon
and ask those questions that you probably

107
00:06:23,724 --> 00:06:27,104
have. Compose all the things? You could
probably put that in there and Gordon will

108
00:06:27,104 --> 00:06:28,804
probably try to compose all the things.

109
00:06:28,857 --> 00:06:32,887
You know what, one of my common uses for
this is I need to translate a docker

110
00:06:32,887 --> 00:06:36,477
run command into a Compose file or
back and forth. Like, please give me the

111
00:06:36,477 --> 00:06:40,807
docker run equivalent of this Compose
file, or please turn these docker run

112
00:06:40,807 --> 00:06:46,277
commands or this docker build command
into a Compose file. And it saves me.

113
00:06:46,677 --> 00:06:49,967
It's not, I mean, I've been doing this
stuff almost every day for a decade,

114
00:06:50,207 --> 00:06:55,697
so it's not like I needed it to do it for
me, but it's still faster than a pro

115
00:06:55,747 --> 00:06:58,847
typing it in from memory. I'm at the
point now where I rarely need to refer

116
00:06:58,897 --> 00:07:03,067
to the docs, but it's still faster
than me at writing a Compose file

117
00:07:03,067 --> 00:07:04,467
out and saving it to the hard drive.

118
00:07:04,867 --> 00:07:08,757
You know, because sometimes we debate
around whether AI tools are useful for

119
00:07:08,767 --> 00:07:13,887
senior versus junior and blah, blah, blah,
and I don't know, as a senior, it saves

120
00:07:13,887 --> 00:07:17,727
me keystrokes, it saves me time. Unlike
when it first came out a couple years

121
00:07:17,727 --> 00:07:23,562
ago, I have tracked it, given feedback to
the team, and it no longer makes Compose

122
00:07:23,562 --> 00:07:26,262
files with old Compose information.

123
00:07:26,262 --> 00:07:27,532
Like it, at least in the

124
00:07:27,721 --> 00:07:28,591
the version tag.

125
00:07:28,772 --> 00:07:29,192
Yeah.

126
00:07:29,192 --> 00:07:30,632
Cause it, the version tag.

127
00:07:30,642 --> 00:07:31,042
Yeah.

128
00:07:31,142 --> 00:07:33,772
I was a big proponent of Hey, this
hasn't been in there for five years.

129
00:07:33,772 --> 00:07:34,802
Why is it recommending it?

130
00:07:35,192 --> 00:07:36,122
And they fixed that.

131
00:07:36,132 --> 00:07:38,072
So, I'm very appreciative of it.

132
00:07:38,492 --> 00:07:41,042
We're going to save it, we're going
to talk about MCP tools, and then

133
00:07:41,042 --> 00:07:43,342
we're going to come back to this,
because it gets, it's gotten better,

134
00:07:43,592 --> 00:07:45,672
and one of the reasons it's gotten
better is it can talk to tools.

135
00:07:45,922 --> 00:07:49,382
So I don't want to really spoil that.
But yeah, I love the suggestions, because

136
00:07:49,382 --> 00:07:53,625
sometimes when you're staring at a blank
chat box, it feels kind of like being a

137
00:07:53,625 --> 00:07:55,625
writer and staring at a blank document.

138
00:07:55,685 --> 00:07:57,845
It's like, I know this is
supposed to be a cool tool, but

139
00:07:57,845 --> 00:07:59,095
I don't know what to do with it.

140
00:07:59,125 --> 00:07:59,775
Where do I start?

141
00:07:59,885 --> 00:08:00,255
Yeah.

142
00:08:00,271 --> 00:08:00,951
analogy

143
00:08:01,375 --> 00:08:04,148
There's even some cool spots, like if
you go to the Containers tab, you'll

144
00:08:04,148 --> 00:08:08,428
see on the right side a little like
magic icon there, and then you can

145
00:08:08,438 --> 00:08:10,338
ask questions about that container.

146
00:08:10,338 --> 00:08:13,348
So if you have a container that's
failing to start or whatever, you can

147
00:08:13,538 --> 00:08:17,668
basically start a thread around the
context of why is this thing not starting.

148
00:08:17,906 --> 00:08:18,386
Interesting.

149
00:08:18,386 --> 00:08:19,616
I have not tried that.

150
00:08:20,256 --> 00:08:23,506
It actually reminds me of the little
star we get for Gemini that's in every

151
00:08:23,506 --> 00:08:25,066
single Google app that I'm opening.

152
00:08:25,066 --> 00:08:29,896
I guess we're all slowly converging
on the starlight AI icons now. I

153
00:08:29,896 --> 00:08:32,336
would have thought it would have
been a robot face, but we chose this

154
00:08:32,346 --> 00:08:34,666
vague collection of stars and pluses.

155
00:08:34,666 --> 00:08:35,646
And that is pretty cool.

156
00:08:35,656 --> 00:08:38,336
I, so is there more, is
it like on images as well?

157
00:08:38,396 --> 00:08:40,746
So yeah, anywhere you see
that icon, you can start a

158
00:08:40,746 --> 00:08:42,636
conversation around that thing.

159
00:08:42,997 --> 00:08:45,137
I'm going to start clicking on that
more often to see what it tells me.

160
00:08:45,137 --> 00:08:47,307
Like, hey, what's the, what data?

161
00:08:47,652 --> 00:08:49,402
Ooh, how do I back up this volume?

162
00:08:49,432 --> 00:08:50,332
That's, that is pretty cool.

163
00:08:50,332 --> 00:08:50,542
So they're

164
00:08:50,778 --> 00:08:52,568
That is the number one question.

165
00:08:52,736 --> 00:08:54,386
I love that they're context aware.

166
00:08:54,386 --> 00:08:57,446
Like they know that this is a
volume, or at least the prompts.

167
00:08:57,956 --> 00:09:00,912
It's like prompt suggestions,
which, this feels a little meta.

168
00:09:00,912 --> 00:09:03,162
Is the AI suggesting prompts for the AI?

169
00:09:03,242 --> 00:09:06,152
It's not in this case,
but it certainly could.

170
00:09:08,053 --> 00:09:08,433
All right.

171
00:09:08,973 --> 00:09:10,573
So that's Ask Gordon.

172
00:09:10,673 --> 00:09:10,763
Woo!

173
00:09:11,283 --> 00:09:11,693
Alright.

174
00:09:11,976 --> 00:09:12,966
And that is what you said.

175
00:09:13,164 --> 00:09:15,614
That is available to everyone
running Docker Desktop.

176
00:09:15,804 --> 00:09:19,114
Is that, to be clear, that's
not available for people running

177
00:09:19,124 --> 00:09:21,404
Docker Engine on Linux, right?

178
00:09:21,424 --> 00:09:25,494
That is not. There is a
CLI to it, isn't there?

179
00:09:25,594 --> 00:09:26,484
There's Docker AI.

180
00:09:26,705 --> 00:09:29,595
But that's part of the
Docker Desktop installation.

181
00:09:29,595 --> 00:09:32,435
That's a CLI plugin, and it's going
to talk to the components that are

182
00:09:32,435 --> 00:09:34,115
bundled in Docker Desktop there.

183
00:09:34,165 --> 00:09:34,465
Yeah.

184
00:09:34,505 --> 00:09:37,895
So we have our, this is an AI
that's outsourced to Docker.

185
00:09:37,895 --> 00:09:38,795
It's not running locally.

186
00:09:38,825 --> 00:09:41,265
It's just calling APIs from
the Docker Desktop interface.

187
00:09:41,315 --> 00:09:47,165
So then if I don't want to use Gordon
AI, but I want to run my own AI models

188
00:09:47,175 --> 00:09:52,295
locally, which I feel like this is
pretty niche because a lot of people I

189
00:09:52,295 --> 00:09:58,645
talk to, their GPU is underwhelming, and
they don't have an M4 Mac or an NVIDIA

190
00:09:58,655 --> 00:10:02,035
desktop tower with a giant GPU in it.

191
00:10:02,065 --> 00:10:05,185
Granted, I guess there are
models that run on CPUs, but I

192
00:10:05,185 --> 00:10:07,195
am not the model expert here.

193
00:10:07,405 --> 00:10:11,075
I've been trying to catch up this year
because I actually do have a decent

194
00:10:11,075 --> 00:10:15,345
laptop now with an M4 in it so I can
run at least some of the Apple models.

195
00:10:15,675 --> 00:10:18,415
But this Docker Model Runner...

196
00:10:18,818 --> 00:10:22,318
It's a pretty cool feature, but
you can pick your models, so

197
00:10:22,318 --> 00:10:23,458
you can pick a really small one.

198
00:10:24,458 --> 00:10:28,178
On Windows, it has to be an
NVIDIA GPU to work.

199
00:10:28,188 --> 00:10:28,828
Is that right?

200
00:10:29,272 --> 00:10:32,542
So it supports both NVIDIA and
Adreno, so if you've got a

201
00:10:32,572 --> 00:10:35,298
Qualcomm chip, it'll work there.

202
00:10:35,668 --> 00:10:39,238
Okay, and on Mac, it just uses system
memory because that's how Macs work.

203
00:10:39,238 --> 00:10:41,768
They have the unified memory, right?

204
00:10:42,188 --> 00:10:45,588
And I mean, for me, it's been
great because I have a brand

205
00:10:45,588 --> 00:10:46,908
new machine with 48 gig of RAM.

206
00:10:46,948 --> 00:10:48,378
I know that is not normal.

207
00:10:48,808 --> 00:10:54,395
But these models on a Mac, it can be a little
problematic because it's like, if I have

208
00:10:54,395 --> 00:10:57,475
a bunch of browser tabs open, I can't
run the full size model that I want to

209
00:10:57,475 --> 00:11:00,545
run, because that's one of the problems
is like, it's all using the same thing.

210
00:11:00,545 --> 00:11:03,635
So I have to do like we did back with
VMs, I have to shut all my apps down

211
00:11:03,975 --> 00:11:06,555
and then run, because I always want
to run the biggest model possible.

212
00:11:06,555 --> 00:11:09,390
So I have this, like, 46 gig model.

213
00:11:09,390 --> 00:11:11,820
I think it's Devstral, the new
model that's supposed to be

214
00:11:11,820 --> 00:11:13,110
great for local development.

215
00:11:13,570 --> 00:11:16,550
So for those of you listening, if you
want to go in depth, we're not going to

216
00:11:16,580 --> 00:11:20,440
have time for that. But Docker
Model Runner lets you run models locally.

217
00:11:20,820 --> 00:11:23,820
You can technically run this now
on Docker infrastructure, right?

218
00:11:23,820 --> 00:11:25,070
Which we're going to
get to in a little bit.

219
00:11:25,759 --> 00:11:27,089
dun dun.

220
00:11:27,130 --> 00:11:27,450
Okay.

221
00:11:27,600 --> 00:11:29,290
Yeah, there is this thing called Offload.

222
00:11:29,380 --> 00:11:31,010
Michael's going to know
way more about that.

223
00:11:31,010 --> 00:11:33,850
'Cause that's a really new feature,
but you don't have to actually

224
00:11:33,850 --> 00:11:34,970
run these models locally now.

225
00:11:34,970 --> 00:11:36,290
Like you could offload them.

226
00:11:36,290 --> 00:11:36,530
Right.

227
00:11:36,930 --> 00:11:37,240
Okay.

228
00:11:37,540 --> 00:11:38,530
What are you seeing?

229
00:11:38,530 --> 00:11:42,206
Are people coming to Docker, like trying
to run the biggest models possible?

230
00:11:42,206 --> 00:11:45,446
They got like multiple GPUs or are you
just seeing people kind of tinkering around?

231
00:11:45,546 --> 00:11:46,556
What's some analogies there?

232
00:11:46,995 --> 00:11:48,655
There's a little bit of everything.

233
00:11:48,835 --> 00:11:52,785
I'd say a lot of folks are still exploring
the space and figuring out what are the

234
00:11:52,785 --> 00:11:54,765
right ways of doing things as well.

235
00:11:54,765 --> 00:11:58,205
So, one of the interesting things is,
and one of the things to keep in mind

236
00:11:58,775 --> 00:12:02,585
is this: let's break out of the AI space.

237
00:12:02,585 --> 00:12:04,595
When you say, for example, a database.

238
00:12:04,625 --> 00:12:04,805
Okay.

239
00:12:04,805 --> 00:12:05,945
I'm using a Postgres database.

240
00:12:06,365 --> 00:12:10,585
If I'm using a Postgres database, a
managed offering out in the cloud.

241
00:12:10,645 --> 00:12:10,795
Okay.

242
00:12:10,795 --> 00:12:11,965
Let's just say RDS.

243
00:12:12,115 --> 00:12:12,355
Okay.

244
00:12:13,155 --> 00:12:17,085
During development, I can run a Postgres
container and yeah, it's a smaller

245
00:12:17,085 --> 00:12:18,795
version and it's not a managed offering.

246
00:12:19,075 --> 00:12:21,595
And that works because all the
binary protocols are the same.

247
00:12:21,595 --> 00:12:22,915
I can basically just swap out

248
00:12:23,120 --> 00:12:24,870
my connection URL, and it just works.

249
00:12:25,360 --> 00:12:26,690
But models are a little different.

250
00:12:27,130 --> 00:12:30,430
And so, I've seen some folks that
have been like, I'm going to use a

251
00:12:30,450 --> 00:12:34,810
smaller version of a model during local
development, but then when I deploy,

252
00:12:34,840 --> 00:12:39,637
I'm going to use the larger hosted
model that my cloud provider provides.

253
00:12:39,727 --> 00:12:43,937
Okay, so maybe it's a parameter
change, I'm using fewer parameters

254
00:12:43,937 --> 00:12:46,347
so it can actually run locally,
and then use the larger version.

255
00:12:47,977 --> 00:12:52,497
The analogy of, okay, me running a
local PostgreSQL container and me

256
00:12:52,497 --> 00:12:58,117
running a local model, if it's a
different model, it's a different model.

257
00:12:58,707 --> 00:13:02,347
And the results that you get from that
model are going to vary quite a bit.

258
00:13:02,627 --> 00:13:05,077
And so that's one of the things that
we have to kind of keep reminding

259
00:13:05,077 --> 00:13:08,562
people: if you use, for example, a four
billion parameter model so that you can

260
00:13:08,562 --> 00:13:09,838
use this model during
local development —

261
00:13:09,838 --> 00:13:11,206
Yes, it can fit on your machine, but
then if you deploy and you're using

262
00:13:11,206 --> 00:13:16,047
the 32 billion parameter version in
production, those are very different

263
00:13:16,057 --> 00:13:18,627
models and you're going to get very
different interactions and different

264
00:13:18,637 --> 00:13:20,177
outputs from the models there.

265
00:13:20,177 --> 00:13:23,457
So, it's something to keep in
mind as folks are looking at

266
00:13:23,717 --> 00:13:24,707
building their applications.

267
00:13:26,157 --> 00:13:28,617
So, where are we seeing folks use this?

268
00:13:28,657 --> 00:13:33,132
Well, of course, if you can use
the same model across your entire

269
00:13:33,282 --> 00:13:35,822
software development life cycle,
then that works out pretty well.

270
00:13:36,322 --> 00:13:40,742
But we're also starting to see a little
bit of a rise of using basically

271
00:13:40,752 --> 00:13:45,092
fine tuned use case specific models
or folks training their own models.

272
00:13:45,422 --> 00:13:48,442
And using those for the
specific application.

273
00:13:48,832 --> 00:13:53,422
And again, that tends to be a little
bit smaller, more use case specific.

274
00:13:53,652 --> 00:13:55,892
And then, yes, that
makes sense to use there.

275
00:13:56,242 --> 00:13:58,312
Okay, I need to be able
to run that on my own.

276
00:13:58,722 --> 00:13:59,392
Et cetera.

277
00:13:59,442 --> 00:14:02,152
Again, I think a lot of folks are
still feeling out the space and

278
00:14:02,152 --> 00:14:05,362
figuring out exactly, how should I
think about this, how should I use this?

279
00:14:05,702 --> 00:14:08,532
And of course, the tooling has to
exist kind of before you can actually

280
00:14:08,532 --> 00:14:10,102
do a lot of those experiments.

281
00:14:10,102 --> 00:14:13,012
So, in many ways, we've been
building out that tooling to help

282
00:14:13,112 --> 00:14:14,582
support that experimentation.

283
00:14:15,052 --> 00:14:17,792
But I think in many ways, folks are still
figuring out exactly what this is going

284
00:14:17,792 --> 00:14:19,242
to look like for them going forward.

285
00:14:19,647 --> 00:14:20,047
Yeah.

286
00:14:20,361 --> 00:14:23,531
I'm trying to imagine what enterprises are
doing and building out and I'm imagining

287
00:14:23,531 --> 00:14:29,421
it's not like this, but one of my
imaginations is reminding me of 20 years

288
00:14:29,421 --> 00:14:33,581
ago, buying a Google box, which I don't
remember the name they called it, but

289
00:14:33,581 --> 00:14:37,331
it was this appliance you would put in
your data center that was, it was yellow.

290
00:14:37,381 --> 00:14:37,891
Racked

291
00:14:38,107 --> 00:14:39,067
search appliance.

292
00:14:39,491 --> 00:14:39,771
There you go.

293
00:14:39,771 --> 00:14:40,071
Yeah.

294
00:14:40,351 --> 00:14:43,381
And I don't know if Nirmal, if you had
any customers back then with, if you

295
00:14:43,381 --> 00:14:44,811
were a consultant back then, but that,

296
00:14:45,047 --> 00:14:46,507
I can't name those customers.

297
00:14:46,831 --> 00:14:50,921
So I can. I was in the city of
Virginia Beach, running the IT and

298
00:14:50,921 --> 00:14:53,131
the data center there, or at least
running the engineering groups.

299
00:14:53,461 --> 00:14:57,771
And those were back in the days
where we didn't want Google indexing

300
00:14:57,771 --> 00:14:59,481
our internal infrastructure.

301
00:14:59,631 --> 00:15:02,481
You don't want your internal data to be

302
00:15:02,784 --> 00:15:07,154
accessed or used potentially
by this big IT conglomerate.

303
00:15:07,434 --> 00:15:07,674
Yeah.

304
00:15:07,674 --> 00:15:11,724
And so they sell you
an on-prem box, and you put it in your

305
00:15:11,724 --> 00:15:14,894
data center, and it would scan and have
access to everything you could give

306
00:15:14,894 --> 00:15:18,384
it, back when apps didn't really have
their own search, Google was providing

307
00:15:18,384 --> 00:15:22,014
that for us, and they would index
our email and index our file servers

308
00:15:22,014 --> 00:15:24,724
and databases, if we wanted to give
it access to them. I'm not sure that's

309
00:15:24,734 --> 00:15:29,914
around anymore, but at the time, Google
wasn't going to give away their software

310
00:15:29,914 --> 00:15:31,344
and we didn't all know how to run it.

311
00:15:31,694 --> 00:15:33,434
And, I'm sure it was running on Linux.

312
00:15:33,434 --> 00:15:37,864
And at the time, in the mid-2000s,
we weren't yet on Linux at the city.

313
00:15:38,624 --> 00:15:39,684
For different reasons.

314
00:15:39,684 --> 00:15:41,934
This kind of feels like the same
moment for enterprise where

315
00:15:42,254 --> 00:15:44,814
they're going to have to buy GPUs,
probably for the first time, if they're

316
00:15:44,814 --> 00:15:47,314
going to run it on prem, they're
going to want to keep it separate.

317
00:15:47,654 --> 00:15:50,114
They're going to either buy something
for the first time, probably GPUs, if they're

318
00:15:50,114 --> 00:15:53,484
not training models, they just want to
run things internally to access their

319
00:15:53,484 --> 00:15:58,004
internal data, or maybe they're doing
it in the cloud and Nirmal's company,

320
00:15:58,034 --> 00:16:04,194
AWS, is providing them the GPUs and
then presumably because they're getting

321
00:16:04,194 --> 00:16:07,614
dedicated hardware, they won't have
to worry about OpenAI or Anthropic

322
00:16:07,624 --> 00:16:08,894
having access to all their stuff.

323
00:16:08,914 --> 00:16:12,624
So with respect to Model Runner, and
again, just a reminder, these are

324
00:16:12,624 --> 00:16:16,074
my own opinions and not those of
my employer, Amazon Web Services.

325
00:16:16,375 --> 00:16:18,065
There's still a lot of use cases.

326
00:16:18,075 --> 00:16:21,055
What Michael was talking about with
respect to choosing lots of different

327
00:16:21,065 --> 00:16:26,625
models for different types of tasks, I
think there's probably a hybrid model

328
00:16:26,625 --> 00:16:32,150
at some point where folks are using
different fine tuned niche models

329
00:16:32,530 --> 00:16:37,430
for specific tasks that they're doing
locally, and then as hardware improves,

330
00:16:37,430 --> 00:16:43,260
and hopefully, you know, maybe your 3
year old developer laptop that you get

331
00:16:43,290 --> 00:16:47,980
at your corporation has enough, or the
models get, you know, optimized

332
00:16:47,980 --> 00:16:52,760
enough that they can run on CPU or the
GPUs that you have on a corporate laptop,

333
00:16:52,770 --> 00:16:57,230
but there'll be some tooling, probably
embedded into the development tooling

334
00:16:57,550 --> 00:16:59,690
and, or you can choose your own models.

335
00:17:00,200 --> 00:17:04,190
And then there'll be other models
where you need to reach out to the

336
00:17:04,240 --> 00:17:08,840
hyperscalers, for those where you're just
not going to get the depth of reasoning

337
00:17:08,850 --> 00:17:13,340
you're not going to get the depth of
knowledge that you will from something

338
00:17:13,340 --> 00:17:18,640
like Claude in the cloud versus something

339
00:17:18,640 --> 00:17:24,700
like a quantized DeepSeek running on your
Mac. But again, this is all changing very,
very fast. And right

340
00:17:24,700 --> 00:17:30,740
now, we're in a different state where
folks have to spend money to access

341
00:17:30,790 --> 00:17:35,620
those larger models, which hasn't been
the pattern with respect to software

342
00:17:35,620 --> 00:17:37,990
development in a long time, right?

343
00:17:38,210 --> 00:17:39,900
Not everyone has that advantage, right?

344
00:17:40,070 --> 00:17:45,280
I mean, if you want to use Claude Code
and, not hit any major limits, then

345
00:17:45,280 --> 00:17:50,480
you got to pay $200 a month, and not
everyone's going to be able to afford

346
00:17:50,480 --> 00:17:54,990
$2,400 a year to do software development.
There's also edge use cases, right?

347
00:17:54,990 --> 00:17:57,930
So IoT devices, just
trying to figure it out.

348
00:17:57,930 --> 00:17:59,020
Also just kicking the tires.

349
00:17:59,020 --> 00:18:01,880
I think that's probably the main,
like Michael said, I think that's

350
00:18:01,890 --> 00:18:04,950
just, everyone is just trying to kick
the tires as cheaply as possible.

351
00:18:05,156 --> 00:18:08,386
Model Runner feels like a gateway, like
it feels like a gateway drug to get

352
00:18:08,396 --> 00:18:13,616
me hooked on the idea, 'cause, I
mean, I wasn't paying attention, I

353
00:18:13,616 --> 00:18:18,236
wasn't an ML person, and I wasn't
building AI infrastructure,

354
00:18:18,256 --> 00:18:21,846
but Docker Model Runner and, you
know, to a lesser extent, Ollama,

355
00:18:22,166 --> 00:18:25,056
Ollama always felt like it was more
for those people that were doing that,

356
00:18:25,056 --> 00:18:28,916
but bringing this capability into a
tool that I'm already using actually

357
00:18:28,936 --> 00:18:34,136
felt like, okay, this is meant for me
now. Like, I don't have to be,

358
00:18:34,226 --> 00:18:39,736
I don't have to understand weights and
all the intricacies of how models are

359
00:18:39,746 --> 00:18:43,776
built and work. I don't
have to understand

360
00:18:43,786 --> 00:18:46,976
the different file format and whether
that works on my particular thing.

361
00:18:47,026 --> 00:18:47,976
it just all kind of works.

362
00:18:47,976 --> 00:18:49,656
It's Docker easy at that point for me.

363
00:18:51,355 --> 00:18:55,695
I think that's the key point here of
just, let's try to increase the access

364
00:18:55,695 --> 00:18:59,935
to these capabilities and let folks
start to experiment. And I think, again,

365
00:18:59,935 --> 00:19:03,615
we, as an industry, are trying to still
figure out exactly how to use a lot

366
00:19:03,615 --> 00:19:08,545
of these tools and, okay, what is the
right size model for the job at hand?

367
00:19:08,965 --> 00:19:12,225
and so, again, in order to be able
to do that experimentation, you

368
00:19:12,775 --> 00:19:14,185
have to increase the access to it.

369
00:19:14,565 --> 00:19:17,305
But that's still kind of,
you know, where we are in many ways.

370
00:19:17,632 --> 00:19:19,622
So did we check off
another thing on the list?

371
00:19:20,381 --> 00:19:23,751
We're about to. I'm gonna have to keep
saying this: we're not a news podcast.

372
00:19:23,771 --> 00:19:28,591
We don't talk about the latest things
in tech, in general, but, you

373
00:19:28,591 --> 00:19:31,232
know, I've got to give a shout out to
Fireship, because between that

374
00:19:31,232 --> 00:19:34,082
and a few other channels, I've really
learned a lot this year about models.

375
00:19:34,392 --> 00:19:38,152
And there's this little
diagram of sort of the state of a lot

376
00:19:38,152 --> 00:19:41,602
of the open-weight, or free,
I'm just going to say free models.

377
00:19:41,852 --> 00:19:44,562
They're free in some capacity,
that you can download.

378
00:19:44,592 --> 00:19:45,912
And, I was using Devstral.

379
00:19:45,932 --> 00:19:48,422
We talked about it a couple of
times on this show already, which

380
00:19:48,422 --> 00:19:49,802
is a model that came out in May.

381
00:19:49,842 --> 00:19:51,122
I actually had a newsletter on that.

382
00:19:51,122 --> 00:19:52,302
Shout out to Bret.

383
00:19:52,302 --> 00:19:54,702
news. There's this guy, Bret,
he makes a newsletter, Bret.

384
00:19:54,852 --> 00:19:56,002
news. You can go check that out.

385
00:19:56,282 --> 00:20:01,372
and I talked about this, that maybe this
was the sweet spot because it was small

386
00:20:01,372 --> 00:20:07,212
enough, you could run it with a modern
GPU or modern Mac, and it wasn't the

387
00:20:07,212 --> 00:20:11,632
worst, nothing like the frontier models
that we get with OpenAI and Anthropic,

388
00:20:11,632 --> 00:20:13,682
but it was something that was better.

389
00:20:15,342 --> 00:20:16,792
And, that's called Devstral.

390
00:20:17,302 --> 00:20:19,962
And then we had Qwen 3
just come out, I don't know, a

391
00:20:19,962 --> 00:20:22,872
week ago or something that is

392
00:20:23,198 --> 00:20:24,478
I think it was earlier this week.

393
00:20:24,892 --> 00:20:27,222
Oh, see, time warp of AI,

394
00:20:27,378 --> 00:20:27,778
I think it

395
00:20:27,922 --> 00:20:29,262
three days is like three weeks.

396
00:20:29,746 --> 00:20:30,926
It was two days ago.

397
00:20:31,198 --> 00:20:31,648
Yeah,

398
00:20:31,912 --> 00:20:32,752
Oh gosh.

399
00:20:33,242 --> 00:20:35,972
I always assume that I'm seeing
Fireship videos late, but, yeah,

400
00:20:35,972 --> 00:20:37,432
one day ago, so yeah, that's true.

401
00:20:37,432 --> 00:20:41,387
One day ago, there's a newer model
coming out from Alibaba, right?

402
00:20:41,467 --> 00:20:45,687
And it is even better, although
it does take more GPU memory, I

403
00:20:45,918 --> 00:20:48,438
think that's hard to run, like, on a laptop.

404
00:20:48,598 --> 00:20:54,901
I don't think you can. So that's a perfect
encapsulation of why Docker Model

405
00:20:54,901 --> 00:20:59,951
Runner is there, but also why it's going
to be a mix of models going forward.

406
00:21:00,421 --> 00:21:05,751
So we have an AI that helps
you use Docker and get started.

407
00:21:05,871 --> 00:21:11,121
We have a tool that helps you run
models locally, and understand that.

408
00:21:11,481 --> 00:21:13,611
What's the next step in this journey?

409
00:21:14,088 --> 00:21:17,378
by the way, you go to Docker Hub
to look for models, or you can

410
00:21:17,378 --> 00:21:20,388
do it in Docker Desktop, or you
can look at things in the CLI.

411
00:21:20,398 --> 00:21:22,148
there are many ways to see the models.

412
00:21:22,508 --> 00:21:24,128
you can pull things from Hugging Face now.

413
00:21:24,351 --> 00:21:24,981
That's pretty sweet.

414
00:21:25,090 --> 00:21:26,660
Can we build, can we make our own yet?

415
00:21:26,660 --> 00:21:27,500
Or do we have that?

416
00:21:28,480 --> 00:21:28,730
We can

417
00:21:28,874 --> 00:21:32,464
So, you can package, but you would
have to already have the GGUF

418
00:21:32,484 --> 00:21:33,914
file and all that kind of stuff.

419
00:21:33,914 --> 00:21:36,174
So we don't have a lot of
the tooling there to help you

420
00:21:36,174 --> 00:21:37,814
actually create the model itself.

421
00:21:38,004 --> 00:21:41,554
Although a lot of folks will use
container based environments to do that.

422
00:21:41,914 --> 00:21:44,264
we don't have any specific
tooling around that ourselves.

423
00:21:44,275 --> 00:21:46,385
So is there, you're saying there's
a doc, oh, there, there is.

424
00:21:46,385 --> 00:21:47,375
Oh, I didn't realize.

425
00:21:47,415 --> 00:21:52,685
So there is now a Docker model
package CLI for creating, basically

426
00:21:52,685 --> 00:21:59,615
wrapping the GGUF or, other models
or whatever, into the Docker OCI

427
00:21:59,625 --> 00:22:04,895
standard format for shipping and
pulling and pushing, essentially, Docker

428
00:22:05,505 --> 00:22:08,535
models that are in Docker Hub,
in the Docker format.

429
00:22:09,034 --> 00:22:10,004
OCI format.

430
00:22:10,044 --> 00:22:10,434
Yep.

431
00:22:11,484 --> 00:22:15,324
So when we look at agentic applications,
step one is, you need models.

432
00:22:15,324 --> 00:22:17,174
It's kind of the brains of the operation.

433
00:22:17,469 --> 00:22:19,029
and then you need tools.

434
00:22:19,029 --> 00:22:21,439
And so you've got highlighted
here the MCP Toolkit.

435
00:22:21,439 --> 00:22:25,299
That was kind of the first
adventure into the MCP space.

436
00:22:25,299 --> 00:22:29,419
And that one was focused a little bit
more on how do we provide tools to the

437
00:22:29,429 --> 00:22:32,539
other agentic applications that you
are already running on your machine.

438
00:22:32,539 --> 00:22:39,929
So Claude Desktop or using VS Code
Copilot on my machine or Cursor, etc.

439
00:22:40,579 --> 00:22:44,709
How do we provide those MCP servers?

440
00:22:45,599 --> 00:22:49,079
In containerized ways, with secure
credential injection, etc.

441
00:22:49,259 --> 00:22:52,179
Basically manage the life
cycle of those MCP servers.

442
00:22:52,559 --> 00:22:57,809
Again, in the use case of connecting
them to your other agentic applications.

443
00:22:58,449 --> 00:23:00,700
And so, again, this is kind of
where we started our MCP journey.

444
00:23:01,080 --> 00:23:04,520
if you see flipping through a couple
of these, actually we just released a

445
00:23:04,520 --> 00:23:09,030
Docker Hub MCP server that allows you
to search for images on Hub or, you

446
00:23:09,030 --> 00:23:13,070
know, those within your organization,
which is super helpful for like maybe

447
00:23:13,460 --> 00:23:15,520
a "write me a Dockerfile that does X."

448
00:23:15,550 --> 00:23:16,130
Well, cool.

449
00:23:16,130 --> 00:23:18,740
Let's go find the right image
that should be used for that.

450
00:23:19,140 --> 00:23:22,060
so again, starts to open up some of
these, additional capabilities here.

451
00:23:22,826 --> 00:23:23,286
Yeah.

452
00:23:23,286 --> 00:23:28,956
And inside the Docker Desktop UI,
there is now a beta tab, essentially,

453
00:23:29,346 --> 00:23:30,846
that's called MCP Toolkit.

454
00:23:31,346 --> 00:23:39,796
And it is a GUI that allows me to explore
and enable one of 141 different tools

455
00:23:39,836 --> 00:23:41,636
and growing that Docker has added.

456
00:23:41,966 --> 00:23:47,676
So, like a lot of other places on the
internet that either host models,

457
00:23:47,676 --> 00:23:53,346
like Anthropic or OpenAI, or are a
place where you can create AI applications,

458
00:23:53,626 --> 00:23:58,036
and all those places have started to
create their own little portals for

459
00:23:58,036 --> 00:24:01,516
finding tools. And most
of them have now settled on

460
00:24:01,516 --> 00:24:05,286
MCP, but before we really had MCP as
the protocol standard, they were already

461
00:24:05,286 --> 00:24:09,386
doing this, like OpenAI was doing
before, but it was proprietary.

462
00:24:09,386 --> 00:24:14,116
You don't know how Evernote or
Notion showed up as a tool

463
00:24:14,126 --> 00:24:16,556
feature in ChatGPT, but it did.

464
00:24:17,371 --> 00:24:21,021
But we just assumed that was their
custom integration and now we have this

465
00:24:21,021 --> 00:24:24,531
standard called MCP so that everything
can interact with everything

466
00:24:24,551 --> 00:24:26,941
properly, the way that it should.

467
00:24:27,631 --> 00:24:29,491
At least right now it's
the one that's winning.

468
00:24:29,491 --> 00:24:31,531
We don't know whether we'll
still be talking about MCP in

469
00:24:31,531 --> 00:24:33,321
five years, but it's here now.

470
00:24:33,631 --> 00:24:34,751
It's what we're talking about now.

471
00:24:35,091 --> 00:24:39,034
And this lights up a lot of capabilities.

472
00:24:39,544 --> 00:24:42,334
In other words, you turn on an MCP tool.

473
00:24:43,334 --> 00:24:46,184
And that sits behind
something called MCP Gateway.

474
00:24:46,434 --> 00:24:48,304
So tell me, what is MCP Gateway?

475
00:24:48,749 --> 00:24:49,219
Yeah.

476
00:24:49,219 --> 00:24:53,719
So at the end of the day, the toolkit is
a combination of several different things.

477
00:24:53,749 --> 00:24:57,199
the MCP gateway is actually
a component that we just open-

478
00:24:57,199 --> 00:24:58,539
sourced at WeAreDevelopers.

479
00:24:58,539 --> 00:25:00,054
So you can actually run
this gateway directly,

480
00:25:00,344 --> 00:25:01,844
in a container, completely on its own.

481
00:25:02,314 --> 00:25:05,724
And the MCP gateway is what's
actually responsible for managing

482
00:25:05,734 --> 00:25:07,554
the lifecycle of these MCP servers.

483
00:25:07,711 --> 00:25:10,471
it itself is an MCP server.

484
00:25:10,511 --> 00:25:12,531
think of it more like an MCP proxy.

485
00:25:12,731 --> 00:25:17,351
it exposes itself as an MCP server that
then can connect to your applications.

486
00:25:17,661 --> 00:25:20,596
But when you ask that server,
Hey, what tools do you have?

487
00:25:21,056 --> 00:25:25,146
It's really delegating, or, I mean,
it's using cached versions of, okay,

488
00:25:25,146 --> 00:25:27,546
what are the downstream MCP servers?

489
00:25:28,316 --> 00:25:31,336
And so it's acting as
basically a proxy here.

490
00:25:31,946 --> 00:25:37,141
So when requests come in and say, hey,
you know, from the agentic app, if

491
00:25:37,141 --> 00:25:42,233
I want to execute this tool, go do a
search on DuckDuckGo. At that point, the

492
00:25:42,233 --> 00:25:47,353
MCP gateway will actually spin up the
container, that DuckDuckGo MCP server,

493
00:25:47,643 --> 00:25:52,013
and then delegate the request to that
container, which then does the search, and

494
00:25:52,013 --> 00:25:54,043
then the MCP gateway returns the results.

495
00:25:54,043 --> 00:25:58,023
So kind of think of it as a proxy
that's managing the lifecycle of all

496
00:25:58,023 --> 00:26:01,803
those containers, but also, you know,
injecting the credentials, configuration.

497
00:26:02,143 --> 00:26:05,203
it also does other things like actually
looking at what's going in and out of

498
00:26:05,203 --> 00:26:09,233
the prompts going through the proxy to
make sure, you know, secrets aren't being

499
00:26:09,233 --> 00:26:11,243
leaked, or that kind of stuff, as well.

500
00:26:11,293 --> 00:26:14,743
and we're even starting to do some
further explorations of what are other

501
00:26:14,743 --> 00:26:16,883
ways to kind of secure those MCP servers?

502
00:26:16,933 --> 00:26:20,283
You know, for example, a file system
one should never have network access.

503
00:26:20,483 --> 00:26:24,473
Cool, so let's, when that container
starts, you get no network access,

504
00:26:24,493 --> 00:26:27,923
or, the GitHub MCP server, you know,
it's talking to the GitHub APIs.

505
00:26:29,178 --> 00:26:32,898
Let's only authorize those host
names that it can communicate with.

506
00:26:32,908 --> 00:26:35,988
So, you know, start to do a little
bit more of a kind of permissioning

507
00:26:35,998 --> 00:26:40,238
model around these MCP servers, which
is where a lot of people are kind of

508
00:26:40,238 --> 00:26:44,038
most cautious and nervous about MCP
servers, because they're completely

509
00:26:44,038 --> 00:26:48,458
autonomous for the most part, and you
have to trust what's going on there.

510
00:26:48,478 --> 00:26:52,338
This exact feature is both
necessary and also Solomon Hykes's

511
00:26:52,358 --> 00:26:54,128
prediction from a month ago.

512
00:26:54,128 --> 00:26:57,408
He was on this show and was saying
that we're going to see all these

513
00:26:57,408 --> 00:27:00,398
infrastructure companies and all
these tooling companies that are

514
00:27:00,398 --> 00:27:01,908
going to offer to lock this shit down.

515
00:27:01,908 --> 00:27:03,788
I think it's the quote
I have to get from him.

516
00:27:04,217 --> 00:27:04,777
that sounds right.

517
00:27:04,928 --> 00:27:08,578
he compared the origin of containers
and how it started with developers.

518
00:27:08,578 --> 00:27:12,268
And then eventually IT took it and sort
of managed it in the infrastructure

519
00:27:12,268 --> 00:27:16,708
layer and provided all the restrictions
and security and limitations and

520
00:27:16,738 --> 00:27:17,858
configuration and all this stuff.

521
00:27:18,268 --> 00:27:20,618
And the same thing's happening to MCP.

522
00:27:20,828 --> 00:27:25,348
Where it started out as a developer tool
to empower developers to do all these

523
00:27:25,348 --> 00:27:29,328
cool things with AI that they couldn't do
and let the AI actually do stuff for us.

524
00:27:29,588 --> 00:27:34,038
And now very quickly in a matter of
months, IT is coming in and saying,

525
00:27:34,038 --> 00:27:35,218
okay, we're going to lock this down.

526
00:27:35,218 --> 00:27:35,898
It's crazy.

527
00:27:35,908 --> 00:27:39,088
You know, your prompts can
delete your, drop your databases,

528
00:27:39,088 --> 00:27:42,748
as we just saw happen on
the internet this week.

529
00:27:43,038 --> 00:27:48,598
I want to, on this gateway topic
though, it can sound complicated.

530
00:27:49,013 --> 00:27:52,403
And maybe the internals are a
little complicated, and there was obviously

531
00:27:52,403 --> 00:27:54,003
code built into this program.

532
00:27:54,223 --> 00:27:58,353
But for those of us that aren't
maybe building agents yet, or

533
00:27:58,353 --> 00:28:03,593
like really getting into building
apps that use AI in the app, this

534
00:28:03,593 --> 00:28:05,733
just appears as kind of magic.

535
00:28:05,773 --> 00:28:11,263
Like, you go into the Docker Desktop
UI, I enable, I go through the toolkit,

536
00:28:11,273 --> 00:28:15,788
there's all these suggestions, everything
from the MCP server for GitHub itself to

537
00:28:15,998 --> 00:28:20,418
an MCP server that could give me access
to Grafana data to accessing the Heroku

538
00:28:20,418 --> 00:28:24,248
API and you're looking at all these things
and you're just like, I'm enabling them.

539
00:28:24,258 --> 00:28:25,518
it's like a kid in a candy store.

540
00:28:25,518 --> 00:28:26,348
I'm just going, check, check, check.

541
00:28:26,348 --> 00:28:27,778
Yeah, I want Notion.

542
00:28:27,778 --> 00:28:28,538
I want Stripe.

543
00:28:28,588 --> 00:28:31,518
I get them in a list, they're
enabled, which means they're

544
00:28:31,518 --> 00:28:32,968
not actually running, right?

545
00:28:32,968 --> 00:28:37,238
they're waiting to be called before
the gateway runs them in memory.

546
00:28:37,518 --> 00:28:40,448
That's all transparent to me, I
don't realize that's happening.

547
00:28:40,838 --> 00:28:47,108
And if I choose to use this toolkit
with Gordon, if I just go into Gordon,

548
00:28:47,158 --> 00:28:52,668
And in the Gordon AI, if I don't want
to run a local model myself, or I'm

549
00:28:52,848 --> 00:28:57,338
not using Claude Desktop or something
that gives me the ability to enable MCP

550
00:28:57,338 --> 00:29:02,318
tools, I can just go in here and say,
enable all my MCP tools, all 34 of them.

551
00:29:02,418 --> 00:29:08,188
I've got Git ones, and
so now what that means is the Gordon

552
00:29:08,188 --> 00:29:13,618
AI can now use these tools, which
makes this free AI even smarter.

553
00:29:14,008 --> 00:29:20,592
And I can say, is there a
Docker Hub image for NGINX?

554
00:29:20,712 --> 00:29:21,812
I don't know if there is.

555
00:29:22,407 --> 00:29:22,787
Let's see.

556
00:29:22,817 --> 00:29:23,817
I've never even tested this.

557
00:29:23,817 --> 00:29:25,947
So it's kind of a, what could go wrong?

558
00:29:25,977 --> 00:29:28,137
let's use Google live on the
internet and see what happens.

559
00:29:28,137 --> 00:29:32,037
yeah, I was just saying, look at it,
going out and checking, using the

560
00:29:32,037 --> 00:29:37,297
Docker, the newly created Docker Hub
MCP tool that you had just released.

561
00:29:38,107 --> 00:29:46,027
So is this going through the MCP gateway,
or is this not with the MCP gateway yet?

562
00:29:46,245 --> 00:29:53,165
When you, in Gordon AI, flip the switch to
say, yes, I want to use the MCP toolkit,

563
00:29:53,165 --> 00:29:59,319
basically what that's doing is, in the
Gordon AI application here, it's enrolling

564
00:29:59,529 --> 00:30:01,729
the MCP toolkit as an MCP server.

565
00:30:02,118 --> 00:30:05,796
And so then it's going to ask
the MCP toolkit, hey, what

566
00:30:05,796 --> 00:30:06,586
tools do you have available?

567
00:30:06,586 --> 00:30:10,086
And so when you saw that list of tools,
that's again coming from the gateway.

568
00:30:10,536 --> 00:30:15,961
Gordon is just simply treating the
MCP toolkit as an MCP server, which in

569
00:30:15,961 --> 00:30:18,141
itself is going to launch MCP servers.

570
00:30:18,141 --> 00:30:21,301
So that's kind of why I mentioned, it's
kind of thinking of it like a proxy there.

571
00:30:21,327 --> 00:30:22,707
Yeah, it does feel like one.

572
00:30:22,707 --> 00:30:23,197
Yeah.

573
00:30:23,307 --> 00:30:27,387
And this, for those watching, like it
didn't actually work because I don't

574
00:30:27,397 --> 00:30:30,377
actually have access to hardened images,
but, I just wanted to see what it would,

575
00:30:30,377 --> 00:30:32,117
what it'd say, but the UI

576
00:30:32,318 --> 00:30:33,648
Which is the right answer?

577
00:30:33,708 --> 00:30:34,788
Which is the right answer for

578
00:30:35,027 --> 00:30:36,317
did the right thing.

579
00:30:36,367 --> 00:30:39,607
it didn't expose a
vulnerability in the MCP server.

580
00:30:39,657 --> 00:30:43,047
But yeah, so it basically,
I can give Gordon AI more.

581
00:30:43,467 --> 00:30:46,407
And I get a lot more functionality,
more abilities to do things without

582
00:30:46,417 --> 00:30:49,497
having to run my own model, without
having to figure out Claude Desktop.

583
00:30:49,767 --> 00:30:54,927
But I will say, because I'm in love with
this toolkit so much, because I love

584
00:30:54,927 --> 00:31:00,237
this idea of one place for my MCP tools,
for me to enter in the API secrets so it

585
00:31:00,237 --> 00:31:03,987
can access my Notion and my Gmail, but I
don't want to have to do that in Claude

586
00:31:04,007 --> 00:31:08,057
Desktop, and then in Warp, and then in
VS Code, and then in Docker, and then in

587
00:31:08,057 --> 00:31:10,637
Ollama, and, like, every place I might run

588
00:31:11,367 --> 00:31:14,127
a tool that needs MCP
tools or access to an LLM.

589
00:31:14,417 --> 00:31:16,807
So I did it in Docker Desktop.

590
00:31:16,827 --> 00:31:18,907
I enabled the ones and set
them up the way I wanted.

591
00:31:19,347 --> 00:31:25,537
And then inside of the tools around
my computer that all support MCP,

592
00:31:26,537 --> 00:31:31,787
they all have now added MCP client
functionality that lets me talk to

593
00:31:31,847 --> 00:31:36,364
another MCP, any MCP server that
speaks proper MCP through their API.

594
00:31:36,694 --> 00:31:40,454
And in this case, what I've done in the
warp terminal, because it does support

595
00:31:40,464 --> 00:31:46,494
MCP, is I just tell it, the command it's
going to run is docker mcp gateway run.

596
00:31:46,904 --> 00:31:50,124
And it uses the standard in and
standard out, which is one of

597
00:31:50,124 --> 00:31:51,714
the ways that you can use MCP.

598
00:31:52,124 --> 00:31:57,084
And then I suddenly have all 34 tools
that we enabled in my Docker Desktop,

599
00:31:57,514 --> 00:32:01,254
available in Warp, just as long as Docker
Desktop's running. That's all I gotta do.

600
00:32:01,874 --> 00:32:06,864
And then, because Warp is using Claude
Sonnet 4, or whatever I told Warp to

601
00:32:06,864 --> 00:32:10,204
do, Docker doesn't care about that,
because I'm not asking it to use,

602
00:32:10,614 --> 00:32:12,704
I think, we talked about this, it's
called bring your own key, I guess

603
00:32:12,704 --> 00:32:13,664
that's what everybody's talking about.

604
00:32:14,024 --> 00:32:19,334
Uh, bring your own key is when you want
to bring your own model to whatever

605
00:32:19,384 --> 00:32:19,854
to access,

606
00:32:20,134 --> 00:32:21,464
your own key to access the model.

607
00:32:21,464 --> 00:32:25,234
Yeah, but in Warp in particular,
like, this is nuanced, but in

608
00:32:25,234 --> 00:32:28,434
Warp, you can't yet
access your own models.

609
00:32:28,444 --> 00:32:30,394
I think they're going to make that
an enterprise feature or something.

610
00:32:30,744 --> 00:32:32,754
But if I open up VS Code,
I could do the same thing.

611
00:32:32,754 --> 00:32:36,304
If I opened up a ChatGPT desktop,
I could do the same thing.

612
00:32:36,664 --> 00:32:41,494
And Aider, or, like, any of the CLI tools;
anything that accepts MCP

613
00:32:41,524 --> 00:32:44,314
so far, I've gotten to work this way.

614
00:32:44,444 --> 00:32:49,914
And it's been awesome, because all these
different IDEs and AI tools

615
00:32:50,154 --> 00:32:51,324
all set up a little bit different.

616
00:32:51,324 --> 00:32:54,374
Goose you set up differently
than Claude Desktop.

617
00:32:54,374 --> 00:32:54,854
So

618
00:32:54,865 --> 00:32:55,565
Client.

619
00:32:56,225 --> 00:33:00,936
They're all, all of
them have different knobs and ways of

620
00:33:00,936 --> 00:33:05,976
controlling MCP servers and at varying
degrees of control and flexibility.

621
00:33:06,416 --> 00:33:09,926
So this is really nice because
then you can also just have

622
00:33:09,936 --> 00:33:11,706
all those tools running if you

623
00:33:11,815 --> 00:33:12,265
right.

624
00:33:12,335 --> 00:33:15,295
They look like one giant MCP server.

625
00:33:15,295 --> 00:33:19,875
Yeah, because normally I would have to
add each MCP tool as its own server,

626
00:33:20,021 --> 00:33:22,811
I think, it feels like some of the
tools are all standardizing on Claude

627
00:33:22,831 --> 00:33:25,451
Desktop as, that config file, which I

628
00:33:25,563 --> 00:33:25,713
the mcp.

629
00:33:25,713 --> 00:33:27,003
json,

630
00:33:27,191 --> 00:33:31,771
Yeah, it feels like that, like everyone's
settling on just using that one file.

631
00:33:32,131 --> 00:33:35,731
Which, I guess, feels
kind of hacky, but I guess it's fine.

632
00:33:36,211 --> 00:33:38,731
it feels like every editor
using my Vim settings.

633
00:33:38,731 --> 00:33:39,351
It's like, no, no, no, no.

634
00:33:39,381 --> 00:33:39,851
Calm down.

635
00:33:39,851 --> 00:33:42,041
I don't necessarily want you
all to use the same file.

636
00:33:42,261 --> 00:33:45,031
I don't want you all overwriting each
other and changing the same file.

637
00:33:45,081 --> 00:33:49,901
so you're bringing up a really important
thing, which is, since we're new to this,

638
00:33:50,001 --> 00:33:55,451
the number of MCP servers is not too many
yet, even though it does feel like there's

639
00:33:55,452 --> 00:33:56,571
probably like a lot of MCP servers.

640
00:33:56,572 --> 00:33:59,541
I've been using 10 new MCP servers
since we've started this conversation.

641
00:34:00,201 --> 00:34:04,451
But it's still like a number of tools
that we can still rationalize about.

642
00:34:04,881 --> 00:34:10,319
But probably in another month or two at
this rate, there is a limit to how many

643
00:34:10,329 --> 00:34:17,434
MCP tools one single client, I guess
you could say, or one instantiation of

644
00:34:17,434 --> 00:34:23,824
a task that you're using like Claude
Code or Cline, its context window can

645
00:34:23,844 --> 00:34:25,924
only use a certain number of tools.

646
00:34:26,354 --> 00:34:33,424
And so, are there some ideas about breaking
up the MCP gateway, like having maybe

647
00:34:33,424 --> 00:34:37,934
sets of tools that have, like, specific
supersets of tools or something like that?

648
00:34:38,014 --> 00:34:38,284
Yeah.

649
00:34:38,284 --> 00:34:38,914
Good question.

650
00:34:38,934 --> 00:34:40,314
And so that's a good call out.

651
00:34:40,314 --> 00:34:43,774
And so I actually want to, zoom in on
that just a tiny bit there, because

652
00:34:43,844 --> 00:34:46,584
for folks that may be new to this,
they may not quite understand that

653
00:34:47,154 --> 00:34:51,584
the way that tools work is basically,
it's taking all the tool descriptions.

654
00:34:51,904 --> 00:34:53,034
Okay, here's a tool name.

655
00:34:53,034 --> 00:34:54,704
Here's when I'm going to use this tool.

656
00:34:54,704 --> 00:34:57,214
Here's the parameters that are
needed to invoke this tool.

657
00:34:57,594 --> 00:35:00,584
And it's sending that to
the model on every request.

658
00:35:01,754 --> 00:35:05,364
And so the model's having to read
all that and basically say, hey,

659
00:35:05,364 --> 00:35:07,764
based on this conversation, hey,
here's a toolbox of stuff that

660
00:35:07,764 --> 00:35:09,564
I may or may not be able to use.

661
00:35:10,234 --> 00:35:15,889
But as Nirmal just pointed out, like
that takes context window there and

662
00:35:16,619 --> 00:35:21,099
granted, yes, some of the newer models
have incredibly huge context windows, but

663
00:35:21,099 --> 00:35:25,589
depending on the use case, it's going
to affect your speed, it's going to

664
00:35:25,589 --> 00:35:29,559
affect the quality, and so yeah, you do
want to be careful of like, okay, I'm

665
00:35:29,559 --> 00:35:32,619
not just going to go in there and just
flip the box on all the MCP servers.

666
00:35:32,629 --> 00:35:33,739
Now you have access to everything.

667
00:35:33,739 --> 00:35:36,619
Like, you do want to be a little
conscious of that as well.

668
00:35:36,929 --> 00:35:40,599
In fact, I found it funny, I
was playing in Cursor not long ago,

669
00:35:40,599 --> 00:35:44,119
and you know, they have even just
YOLO mode, just go crazy with it.

670
00:35:44,529 --> 00:35:48,339
But even they have a warning once
you, I think it's after you enable

671
00:35:48,339 --> 00:35:54,009
the 31st tool of like, hey, heads
up, you're getting a little crazy.

672
00:35:54,499 --> 00:35:58,599
So like, I'm like, for the one that
has YOLO mode to call me out for being

673
00:35:58,599 --> 00:36:02,569
crazy for too many tools, like it was,
it's again, kind of a reminder of just,

674
00:36:03,029 --> 00:36:05,899
okay, you do want to be conscious of
the number of tools that you're using.

675
00:36:06,369 --> 00:36:07,619
so to actually answer the question.

676
00:36:07,619 --> 00:36:07,739
Yeah.

677
00:36:07,739 --> 00:36:12,099
It's been something that we've been
exploring and kind of waiting to just see

678
00:36:12,099 --> 00:36:14,169
what the feedback is gonna be on that.

679
00:36:14,229 --> 00:36:17,349
Are there separate tool sets
that clients can connect to?

680
00:36:18,139 --> 00:36:21,929
you know, that's certainly a possibility
as well, since this MCP gateway is an

681
00:36:21,929 --> 00:36:26,779
open source container, when you run
this for your application, not only can

682
00:36:26,779 --> 00:36:30,229
you say, these are the servers I want,
but then you can even further filter

683
00:36:30,239 --> 00:36:33,559
through, these are the tools from those
servers that I actually want to expose.

684
00:36:33,559 --> 00:36:36,069
So, for example, I think the
GitHub official one is up to

685
00:36:36,069 --> 00:36:38,109
72 tools now or something.

686
00:36:38,349 --> 00:36:39,829
It's a crazy number.

687
00:36:40,179 --> 00:36:41,949
but most of the time, I only
need maybe three or four of them.

688
00:36:42,269 --> 00:36:43,559
So, I want to filter that.

689
00:36:43,559 --> 00:36:46,529
And that's why you see, Claude and
VS Code and many of these others.

690
00:36:46,539 --> 00:36:50,309
Even though you're pulling in
these MCP servers, many of those

691
00:36:50,329 --> 00:36:53,249
provide client side functionality
to kind of filter that list as well,

692
00:36:53,249 --> 00:36:53,629
Yeah.
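The client-side filtering being described can be sketched in a few lines. This is an illustrative toy, not Docker's gateway or any MCP SDK code; the tool names and counts are made up:

```python
# Illustrative sketch: filtering a large MCP tool list down to the few
# tools a client actually needs, before the tool definitions get
# serialized into the model's context window.

from dataclasses import dataclass

@dataclass
class Tool:
    name: str
    description: str

def filter_tools(tools, allowed_names):
    """Keep only the tools explicitly allow-listed for this client."""
    allowed = set(allowed_names)
    return [t for t in tools if t.name in allowed]

# Pretend one MCP server exposes dozens of tools (72 placeholders)...
catalog = [Tool(f"github_tool_{i}", "...") for i in range(72)]
catalog += [Tool("create_issue", "Open a GitHub issue"),
            Tool("search_code", "Search a repository")]

# ...but this client only ever needs two of them.
exposed = filter_tools(catalog, ["create_issue", "search_code"])
print(len(catalog), "->", len(exposed))
```

Every tool that survives the filter still costs context, so the fewer that reach the model, the better.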

693
00:36:54,290 --> 00:36:57,730
I wonder if we get to a state because
the MCP, this is getting a little bit

694
00:36:57,730 --> 00:37:01,760
meta, but everything when you talk about
agentic AI gets meta really quickly.

695
00:37:02,450 --> 00:37:07,910
So, I wonder if the MCP gateway itself
is an MCP server, so it can rationalize

696
00:37:07,920 --> 00:37:12,700
about itself, I wonder if we get into
the pattern of, okay, there's this new

697
00:37:12,700 --> 00:37:17,850
task that I want this agent to do, and
the first thing, after it comes up with

698
00:37:17,850 --> 00:37:22,210
its task list, the steps it wants to
take, is go through that list and then,

699
00:37:22,690 --> 00:37:28,505
ask the MCP gateway to reconfigure
itself on each task and turn on only

700
00:37:28,505 --> 00:37:32,505
the ones that it identified as likely
the ones that it needs for that task.

701
00:37:32,815 --> 00:37:36,265
And just dynamically, at any
given time, don't have anything

702
00:37:36,265 --> 00:37:37,485
more than five running.

703
00:37:37,865 --> 00:37:38,905
So figure it out.

704
00:37:39,005 --> 00:37:41,555
You can choose whatever five
you want, but only have five.

705
00:37:41,618 --> 00:37:45,548
We've done some experiments with that,
not quite to that full dynamicness,

706
00:37:45,578 --> 00:37:49,548
but I've even done some where it's,
okay, here's a tool to enable other

707
00:37:49,548 --> 00:37:51,398
tools, is basically what it is.

708
00:37:51,848 --> 00:37:55,368
And, okay, and give me parameters
of, okay, do you need, GitHub?

709
00:37:55,378 --> 00:37:56,648
Do you need, Slack?

710
00:37:56,648 --> 00:37:59,768
You know, tell me what it is
that you need, and then I'll

711
00:37:59,768 --> 00:38:01,288
enable those specific things.

712
00:38:01,288 --> 00:38:05,378
And then what's cool then is,
as part of the MCP protocol,

713
00:38:05,378 --> 00:38:06,488
there's also notifications.

714
00:38:06,488 --> 00:38:10,393
So the MCP server can then notify
the client: hey, there's a new

715
00:38:10,393 --> 00:38:13,573
list of tools available, and then
the next API request to the model

716
00:38:13,863 --> 00:38:15,213
then has this new set of tools.
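The enable-other-tools pattern plus the list-changed notification could be sketched like this. The registry is hypothetical, not the MCP SDK; only the notification name comes from the MCP protocol:

```python
# Hypothetical sketch of a "tool that enables other tools" meta-tool,
# with a stand-in callback for MCP's tools/list_changed notification.

class ToolRegistry:
    def __init__(self, all_tools):
        self.all_tools = dict(all_tools)   # name -> description
        self.enabled = {"enable_tools"}    # only the meta-tool to start
        self.listeners = []                # called when the list changes

    def list_tools(self):
        return sorted(self.enabled)

    def enable_tools(self, names):
        """The meta-tool: the model says which capabilities it needs
        (GitHub? Slack?), we flip those on, then tell clients the
        available tool list changed."""
        self.enabled.update(n for n in names if n in self.all_tools)
        for notify in self.listeners:
            notify("notifications/tools/list_changed")

registry = ToolRegistry({"github_create_issue": "...",
                         "slack_post_message": "...",
                         "notion_search": "..."})
events = []
registry.listeners.append(events.append)

registry.enable_tools(["github_create_issue"])  # model asked for GitHub
print(registry.list_tools())  # the next model request sees the new tool
```

The next API request to the model then carries only the newly enabled tools, as described above.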

717
00:38:15,985 --> 00:38:16,055
I

718
00:38:16,055 --> 00:38:16,635
think we're almost

719
00:38:16,763 --> 00:38:18,353
the capability is there, but,

720
00:38:19,055 --> 00:38:20,385
I think that's likely the next step.

721
00:38:21,423 --> 00:38:26,793
but it's also kind of like a,
yeah, how do you safeguard that?

722
00:38:26,853 --> 00:38:31,398
So it's, yeah, it's an
interesting time period, for sure.

723
00:38:31,975 --> 00:38:35,445
we got an interesting question: is
MCP Gateway's intent to replace an

724
00:38:35,465 --> 00:38:37,545
API Gateway or in parallel to it?

725
00:38:37,935 --> 00:38:38,645
Great question.

726
00:38:38,645 --> 00:38:39,715
Michael, you want to take that one?

727
00:38:39,858 --> 00:38:40,638
yeah, great question.

728
00:38:40,688 --> 00:38:44,672
I'd say in some ways that there's
similar functionality, but they

729
00:38:44,732 --> 00:38:45,932
serve very different purposes.

730
00:38:45,932 --> 00:38:49,822
So an API gateway, I'll just take
the most basic example, but I know

731
00:38:49,822 --> 00:38:53,932
there's lots of different ones. An API
gateway, single endpoint, and I may

732
00:38:53,932 --> 00:38:55,652
have lots of different microservices.

733
00:38:55,672 --> 00:38:57,092
Let's just pick a catalog.

734
00:38:57,132 --> 00:39:00,452
Okay, so for product related ones,
it's going to go to this microservice.

735
00:39:00,642 --> 00:39:02,182
Users, it's going to go to this other one.

736
00:39:02,182 --> 00:39:03,862
Cart, another service, whatever.

737
00:39:04,202 --> 00:39:08,112
And the API gateway is routing all
those different requests and rate

738
00:39:08,112 --> 00:39:14,751
limiting, etc. In many ways, like this
MCP gateway serves in a similar fashion

739
00:39:15,371 --> 00:39:18,921
in which it's going to be routing
to the right MCP server to actually

740
00:39:18,941 --> 00:39:20,521
handle the tool execution and whatnot.

741
00:39:20,931 --> 00:39:24,231
But again, it's only for the MCP protocol.

742
00:39:24,531 --> 00:39:27,811
So it's not going to be replacing an
API gateway because it's not doing

743
00:39:27,811 --> 00:39:33,451
normal API requests, etc. It's only
for MCP related workloads and requests.

744
00:39:34,381 --> 00:39:35,931
different protocols at play here.

745
00:39:36,737 --> 00:39:38,607
I think that's probably the
best way to describe it.

746
00:39:38,657 --> 00:39:44,357
Otherwise, you could also say that
the MCP gateway and API gateway are likely

747
00:39:44,357 --> 00:39:46,407
going to be running in parallel.

748
00:39:46,767 --> 00:39:50,757
and so probably what I would see would
be, I have an API gateway that routes

749
00:39:50,757 --> 00:39:55,807
a request to an endpoint, and then that
particular application, let's just say

750
00:39:55,807 --> 00:40:01,977
it's an agentic application, can then have
its own MCP gateway to satisfy whatever

751
00:40:01,977 --> 00:40:03,997
agentic flow it needs to use there.

752
00:40:03,997 --> 00:40:07,467
I wanted to, while you guys were having
an awesome conversation, I was trying

753
00:40:07,467 --> 00:40:13,947
to draw up, just a visualization to
try to represent, okay, so just so

754
00:40:13,947 --> 00:40:16,907
people understand, because this MCP,
we could make a whole show on MCP

755
00:40:16,907 --> 00:40:20,107
tools, honestly, from an infrastructure
perspective, how do these things talk?

756
00:40:20,117 --> 00:40:20,927
How do they integrate?

757
00:40:21,347 --> 00:40:23,367
The fact that you're talking about
that they're just really adding to

758
00:40:23,367 --> 00:40:26,898
the context window is a fantastic
fact that a lot of people could go

759
00:40:27,088 --> 00:40:31,498
months or years using MCP tools day
to day and never know that, right?

760
00:40:31,558 --> 00:40:35,458
a normal non-engineer could use
MCP tools, not understand how

761
00:40:35,458 --> 00:40:36,478
these things are all working.

762
00:40:36,738 --> 00:40:40,288
for those that are into this, are
playing around with MCP tools elsewhere

763
00:40:40,698 --> 00:40:44,538
and understanding a little bit of MCP
server functionality and client versus

764
00:40:44,538 --> 00:40:46,138
server versus host and all that stuff.

765
00:40:46,458 --> 00:40:51,908
Before Docker's gateway, the MCP gateway,
you would have like an MCP client

766
00:40:51,978 --> 00:40:55,748
that, whether it's your IDE, your terminal,
or AI chat desktop, or whatever you've

767
00:40:55,748 --> 00:40:58,068
got, that is acting as an MCP client.

768
00:40:58,358 --> 00:41:01,518
Assuming it supports MCP servers,
you can add them one at a time.

769
00:41:01,748 --> 00:41:05,978
So I would add GitHub's MCP server, then
I would add DuckDuckGo's MCP server.

770
00:41:06,228 --> 00:41:09,058
I might add Notion's MCP server,
since I'm a big Notion fan.

771
00:41:09,338 --> 00:41:14,368
And each one of those servers
has one to infinity tools, which

772
00:41:14,368 --> 00:41:16,538
are, I look at as like API routes.

773
00:41:16,918 --> 00:41:20,308
and each one has its
own very niche purpose.

774
00:41:20,774 --> 00:41:23,414
depending on the tool, and this is part
of the frustration with the ecosystem

775
00:41:23,414 --> 00:41:26,414
right now is we're only months into this,
but it's amazing that all these tools are

776
00:41:26,414 --> 00:41:30,024
all starting to support each other, tools
have different ways where you manage this.

777
00:41:30,034 --> 00:41:33,244
Some of them you can disable
and enable specific servers.

778
00:41:33,454 --> 00:41:36,914
Some, you can actually choose
the tools individually, which

779
00:41:36,914 --> 00:41:38,494
is like choosing API routes.

780
00:41:38,944 --> 00:41:41,994
And to me, it's you're always trying
to get down to the smallest amount of

781
00:41:41,994 --> 00:41:44,134
tools that you need to prevent confusion.

782
00:41:44,144 --> 00:41:47,854
Cause I'm, my biggest problem is
I enable all the tools because

783
00:41:47,854 --> 00:41:49,514
I get tired of managing them.

784
00:41:49,674 --> 00:41:50,398
I just want them to work.

785
00:41:50,759 --> 00:41:52,829
I just want them all to
work when they need to work.

786
00:41:53,189 --> 00:41:54,589
And then I, so I enable them all.

787
00:41:55,379 --> 00:41:56,939
I end up with 50 plus tools.

788
00:41:57,239 --> 00:42:01,419
And then when I'm asking AI to do things,
it chooses the wrong tool because I

789
00:42:01,419 --> 00:42:06,699
wasn't precise enough in my ask to
trigger the right words that are written

790
00:42:06,699 --> 00:42:09,419
in the system prompt of that MCP server.

791
00:42:09,469 --> 00:42:14,769
So actually, maybe an easier update might
be to put another layer on top of the

792
00:42:14,769 --> 00:42:18,029
MCP server, kind of an in-between.

793
00:42:18,109 --> 00:42:23,929
so I'm connecting the MCP gateway
now to multiple other MCP servers.

794
00:42:24,189 --> 00:42:25,049
So I get, yeah, you're right.

795
00:42:25,049 --> 00:42:26,009
I need another layer here.

796
00:42:26,009 --> 00:42:27,379
That's actually MCP servers.

797
00:42:27,739 --> 00:42:33,829
so, there's now this gateway in the
middle, and it, the only negative of this

798
00:42:33,829 --> 00:42:39,119
approach is for right now, because we
don't have this futuristic utopia yet,

799
00:42:39,539 --> 00:42:47,579
is that to my terminal, or my IDE, it
all looks like one giant list of tools.

800
00:42:47,859 --> 00:42:51,589
All in one MCP server, which is
just the nature of a proxy, right?

801
00:42:51,639 --> 00:42:54,379
But behind one IP address is a
whole bunch of websites, like

802
00:42:54,379 --> 00:42:55,349
you don't realize it, right?

803
00:42:55,719 --> 00:42:58,129
So it is, the analogy still
works, I believe, there.

804
00:42:58,349 --> 00:43:04,289
But in this case, because it's connecting
all of them together into one proxy,

805
00:43:04,309 --> 00:43:07,929
and the nice thing is, I can see
it in the memory usage and the containers.

806
00:43:07,929 --> 00:43:12,229
In fact, when Michael was on weeks
ago, we saw the MCP gateway spinning up

807
00:43:12,229 --> 00:43:15,619
servers dynamically and then shutting
them down and you could see the container

808
00:43:15,619 --> 00:43:19,279
launch run, you know, run the curl
command or whatever, and then close.

809
00:43:19,399 --> 00:43:22,919
And it was so quick, we couldn't
capture it and swap toggle windows

810
00:43:23,129 --> 00:43:24,269
to see the tools launching.

811
00:43:24,269 --> 00:43:25,379
And, I mean, it's beautiful.

812
00:43:25,379 --> 00:43:26,819
It's exactly what containers were for.

813
00:43:26,894 --> 00:43:28,654
It's, it's ephemeral, it's wonderful.

814
00:43:29,379 --> 00:43:34,639
But, if your IDE or if your chat desktop
or whatever is acting as your MCP client,

815
00:43:34,639 --> 00:43:39,729
the agent thing, if that doesn't let
you choose individual tools, then this

816
00:43:39,729 --> 00:43:44,989
approach is a little hard because the
only way from my IDE that I can, I have to

817
00:43:44,989 --> 00:43:47,169
turn off all of Docker or none of Docker.

818
00:43:47,169 --> 00:43:47,762
I think.

819
00:43:48,312 --> 00:43:51,342
This gets us to a conversation
of eventually we will have this.

820
00:43:51,532 --> 00:43:55,162
I'm thinking of it as like the model,
the plan model before the model that will

821
00:43:55,162 --> 00:43:57,112
go, okay, you used all these keywords.

822
00:43:57,132 --> 00:43:59,512
I'm going to pick out the right
tools and I'm going to hand those

823
00:43:59,512 --> 00:44:01,782
off to the next model, which
is going to do the actual work.

824
00:44:02,352 --> 00:44:03,912
That's probably already here.

825
00:44:04,022 --> 00:44:05,632
Solomon predicted it a month ago.

826
00:44:05,822 --> 00:44:06,422
Yeah, I'm sorry.

827
00:44:06,422 --> 00:44:06,682
What?

828
00:44:07,023 --> 00:44:09,213
so that's what Michael and I,
while you were drawing this

829
00:44:09,402 --> 00:44:10,562
Oh, is that what you
were just talking about?

830
00:44:10,763 --> 00:44:12,533
that's what Michael and
I, we were talking about.

831
00:44:12,734 --> 00:44:18,600
So the gateway itself has its own
MCP server that controls itself.

832
00:44:19,180 --> 00:44:24,030
And so we're a few months away from
exactly what you were just talking about.

833
00:44:24,290 --> 00:44:28,200
Bret, because of context windows,
because there's too many tools, because

834
00:44:28,200 --> 00:44:31,690
of all the things that you did, all the
challenges you just mentioned, Bret.

835
00:44:32,120 --> 00:44:37,300
the first step might be the client
going to the MCP gateway, MCP server

836
00:44:37,300 --> 00:44:41,500
first and saying, hey, these are
the things I'm about to go do.

837
00:44:41,500 --> 00:44:46,560
Out of the list, check your MCP
gateway and tell me the list of MCP

838
00:44:46,560 --> 00:44:49,010
tools that I actually need for that.

839
00:44:49,480 --> 00:44:52,690
And then only turn those
on for the next task.

840
00:44:53,228 --> 00:44:53,698
Yeah.

841
00:44:54,250 --> 00:44:57,130
and then it'll just
repeat that cycle again.

842
00:44:57,240 --> 00:45:03,640
and then winnow down that list of
MCP tools to the only things that

843
00:45:03,640 --> 00:45:05,290
are needed for that task at hand.

844
00:45:05,740 --> 00:45:09,220
So there's another layer here,
which we're, and Michael and I,

845
00:45:09,220 --> 00:45:11,520
we were discussing while you were
building that beautiful diagram.

846
00:45:12,110 --> 00:45:14,530
It's, people are experimenting with that.

847
00:45:14,580 --> 00:45:19,460
All the pieces are in place, but this
pattern isn't quite there just yet, but

848
00:45:19,460 --> 00:45:23,200
it will likely be, I'm pretty sure this
is what we're going to be doing pretty soon.

849
00:45:23,253 --> 00:45:26,823
Nobody wants to go manually choose
every MCP server that they're going

850
00:45:26,823 --> 00:45:28,403
to need before every AI request.

851
00:45:28,953 --> 00:45:31,673
almost feels like it takes away
the speed advantage of using the

852
00:45:31,673 --> 00:45:33,373
MCP tool to go get the data for me.

853
00:45:33,373 --> 00:45:38,388
if I have to do all this work in each
tool independently. Because I often

854
00:45:38,388 --> 00:45:43,648
will have an IDE accessing AI, acting
as the MCP client, and then I'll have

855
00:45:43,648 --> 00:45:45,318
a terminal acting as an MCP client.

856
00:45:45,548 --> 00:45:49,728
At the same time, I've got ChatGPT
desktop running over here, also

857
00:45:49,728 --> 00:45:52,928
while VS Code, I think a lot of us,
eventually evolve to the point where

858
00:45:52,988 --> 00:45:57,418
we've got two or three tools all at
the same time managing MCP tools.

859
00:45:57,448 --> 00:45:59,828
We've got, I guess we have
multiple IDEs, I should say.

860
00:46:00,238 --> 00:46:04,848
and trying to understand how all this
comes together is only interesting right

861
00:46:04,848 --> 00:46:08,538
now, but in six months, we're not going
to want to be messing with all this stuff.

862
00:46:08,538 --> 00:46:11,598
We're just going to want this part to
work so we can work on building agents,

863
00:46:11,648 --> 00:46:11,958
All right.

864
00:46:11,958 --> 00:46:16,348
So Compose, my favorite tool, a lot of
people's favorite Docker tool, other

865
00:46:16,348 --> 00:46:17,588
than the fact that Docker exists.

866
00:46:17,838 --> 00:46:21,828
You announced at WeAreDevelopers
that Compose is getting more.

867
00:46:22,138 --> 00:46:24,878
There's functionality in the YAML
specifically, where I guess we're talking

868
00:46:24,878 --> 00:46:30,248
about the YAML configuration that drives
the Compose command line, that in just

869
00:46:30,248 --> 00:46:34,458
three months ago, you were adding model
support, and that was like an early

870
00:46:34,458 --> 00:46:40,078
alpha idea of what if I could specify
the model I wanted Docker Model Runner

871
00:46:40,078 --> 00:46:47,588
to run when I launch my app that maybe
needs a model, a local model, and I

872
00:46:47,588 --> 00:46:51,893
use the example, and I have an actual
demo over on a GitHub Gist that people can

873
00:46:51,893 --> 00:46:57,693
pick up, that you simply write
your Compose file, you use something

874
00:46:57,693 --> 00:47:00,548
called open web, open, what is it?

875
00:47:00,598 --> 00:47:03,398
Open web, web, Open WebUI, I think.

876
00:47:03,618 --> 00:47:04,908
Yeah, horrible name.

877
00:47:05,448 --> 00:47:10,308
Extremely generic name for what
is a ChatGPT clone, essentially.

878
00:47:10,368 --> 00:47:14,428
the open source variant, which can
use any models or more than one model.

879
00:47:14,438 --> 00:47:16,448
It actually lets you
choose it in the interface.

880
00:47:16,908 --> 00:47:21,608
And all you need is a
little bit of Compose file.

881
00:47:23,528 --> 00:47:27,368
So, I created 29 lines, and it
probably needs to be updated because

882
00:47:27,368 --> 00:47:34,978
it's probably outdated, but, 29 lines of
Compose that's half comments that allows

883
00:47:34,978 --> 00:47:40,958
me to spin up an Open WebUI container
while also spinning up the models or

884
00:47:40,968 --> 00:47:44,768
making sure, basically, that I have the
models locally that I need to run it.

885
00:47:44,998 --> 00:47:49,138
And this gives me a ChatGPT
experience without ChatGPT.

886
00:47:49,138 --> 00:47:50,238
Thank you.

887
00:47:50,318 --> 00:47:52,098
And you guys, you enable this.

888
00:47:52,098 --> 00:47:53,928
Now you're not creating the models.

889
00:47:54,088 --> 00:47:55,938
You're not creating Open WebUI.

890
00:47:56,138 --> 00:48:00,358
you're simply providing the
glue for it to all come together

891
00:48:00,358 --> 00:48:02,208
in a very easy way locally.

892
00:48:02,548 --> 00:48:02,788
Yeah.

893
00:48:02,788 --> 00:48:06,258
as we say, agentic apps need three things.

894
00:48:06,258 --> 00:48:09,578
They need models, they need tools, and
then the code that glues it all together.

895
00:48:09,978 --> 00:48:13,398
What the Compose file lets us
do now is define all three of

896
00:48:13,398 --> 00:48:15,788
those in a single document.

897
00:48:15,978 --> 00:48:21,658
here's the models that my app is going to
need, the MCP gateway that I'm just going

898
00:48:21,658 --> 00:48:23,378
to run as another containerized service.

899
00:48:23,708 --> 00:48:27,408
And then the code, the custom code,
can be really any agentic framework.

900
00:48:27,418 --> 00:48:32,308
this example is Open WebUI. But in that
Compose snippet, what we've done is

901
00:48:32,308 --> 00:48:37,728
we've evolved the specification: now
models are a top-level element in the

902
00:48:37,728 --> 00:48:39,153
Compose file, which is pretty cool.

903
00:48:39,693 --> 00:48:42,453
This just dropped in the last couple
of weeks, so this is brand new.

904
00:48:42,819 --> 00:48:43,749
Gotta update my gist.

905
00:48:44,213 --> 00:48:47,713
yep, and so where before, yeah,
you had to use this provider

906
00:48:47,713 --> 00:48:49,613
syntax, and that still works.

907
00:48:49,933 --> 00:48:52,123
now it's actually part
of the specification.

908
00:48:52,543 --> 00:48:55,203
Defining a model, this is
going to pull from Docker Hub.

909
00:48:55,213 --> 00:48:58,318
again, you can have your own models
and your own container registry.

910
00:48:58,318 --> 00:48:59,608
It's just an OCI artifact.

911
00:48:59,608 --> 00:49:01,058
You can specify that anywhere.

912
00:49:01,698 --> 00:49:03,108
then we've got the services,

913
00:49:03,378 --> 00:49:04,908
And then the app itself.

914
00:49:05,228 --> 00:49:08,958
What's cool about the model now is
with the specification evolution,

915
00:49:09,278 --> 00:49:13,658
you can now specify, hey, this is the
environment variable I want you to

916
00:49:13,658 --> 00:49:15,658
basically inject into my container.

917
00:49:16,038 --> 00:49:19,868
to specify what's the endpoint,
where's the base URL that I

918
00:49:19,878 --> 00:49:22,108
should use to access this model.

919
00:49:22,488 --> 00:49:24,098
And then what's the model name as well.

920
00:49:24,398 --> 00:49:29,718
So the cool thing then is I can
go back up to the top level model

921
00:49:29,718 --> 00:49:33,908
specification, I can swap that out
and the environment variables will be

922
00:49:33,918 --> 00:49:37,858
automatically updated, and assuming
that my app is using those environment

923
00:49:37,858 --> 00:49:39,848
variables, everything just works.

924
00:49:40,308 --> 00:49:44,188
So again, think of Compose as, it's
the glue that's making sure that

925
00:49:44,188 --> 00:49:48,218
everything is there for the application
to actually be able to leverage it.
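For reference, a hedged sketch of what such a Compose file can look like with the new top-level models element. The attribute names (endpoint_var, model_var), the model tag, and the image are my best understanding of the evolving spec and of Open WebUI, not taken from the demo Gist; check the current Compose docs before relying on them:

```yaml
# Sketch only: verify attribute names against current Compose docs.
models:
  llm:
    model: ai/smollm2          # an OCI artifact pulled from Docker Hub

services:
  webui:
    image: ghcr.io/open-webui/open-webui:main
    ports:
      - "3000:8080"
    models:
      llm:
        # Compose injects these env vars so the app can find the model:
        endpoint_var: OPENAI_API_BASE_URL
        model_var: OPENAI_MODEL
```

Swapping the top-level model reference is then enough; the injected variables update to match, exactly as described above.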

926
00:49:49,029 --> 00:49:49,479
Yeah.

927
00:49:49,539 --> 00:49:51,729
the gateway part here
was pretty cool to me.

928
00:49:51,729 --> 00:49:58,429
That I can add in my tools, my
MCP tools inside of the YAML file.

929
00:49:58,449 --> 00:50:00,039
when I saw that part, I was like, yes.

930
00:50:00,089 --> 00:50:05,329
that is like my vision, my dream is
that I can pass a Compose file to

931
00:50:05,329 --> 00:50:09,159
someone else and it'll use their keys.

932
00:50:09,689 --> 00:50:10,469
Presuming my

933
00:50:10,934 --> 00:50:15,354
team is all using the same provider,
that we would have the same.

934
00:50:15,364 --> 00:50:19,544
Because, open, well, OPENAI_BASE_URL,
OPENAI_MODEL, and

935
00:50:19,544 --> 00:50:21,704
then OPENAI_API_KEY or whatever.

936
00:50:21,734 --> 00:50:24,364
if you're going to use ones in the
SaaS, like those are all pretty generic.

937
00:50:24,514 --> 00:50:28,154
Even if you're not using OpenAI, they're
all pretty generic environment variables.

938
00:50:28,164 --> 00:50:30,794
So I guess this would work
across teams or across people

939
00:50:30,838 --> 00:50:32,388
Well, and that's a good point to call out.

940
00:50:32,398 --> 00:50:36,528
One of the things that OpenAI did when
they released their APIs was basically,

941
00:50:36,528 --> 00:50:40,008
hey, here's a specification on how to
interact with models that pretty much,

942
00:50:40,058 --> 00:50:42,168
everybody else has adopted and used.

943
00:50:42,548 --> 00:50:47,068
and so Docker Model Runner
exposes an OpenAI compatible API.

944
00:50:47,068 --> 00:50:49,428
And so that's why you see these
environment variables kind

945
00:50:49,428 --> 00:50:52,288
of using the OpenAI prefix.

946
00:50:52,718 --> 00:50:56,168
Because again, I can use now any
agentic application that can talk to

947
00:50:56,168 --> 00:50:58,278
OpenAI or use the OpenAI libraries.

948
00:50:58,278 --> 00:51:00,328
And it's just a configuration
change at this point.
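A minimal sketch of what that configuration-only change looks like from the app's side. The default URL and model name below are placeholders, not guaranteed Model Runner defaults; the point is that only the generic OpenAI-style settings vary:

```python
# Sketch: an app written against the OpenAI-compatible API needs only
# these env vars changed to point at Docker Model Runner (or any other
# compatible endpoint). The defaults here are illustrative placeholders.
import os

def chat_request(prompt: str) -> dict:
    base_url = os.environ.get("OPENAI_BASE_URL", "http://localhost:12434/v1")
    model = os.environ.get("OPENAI_MODEL", "ai/smollm2")
    # Plain OpenAI chat-completions shape; any OpenAI client library
    # could equally be pointed at base_url instead of api.openai.com.
    return {
        "url": f"{base_url}/chat/completions",
        "body": {"model": model,
                 "messages": [{"role": "user", "content": prompt}]},
    }

req = chat_request("Hello!")
print(req["url"])
```

Swap the environment variables, and the same code talks to a local model, a remote one, or OpenAI itself.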

949
00:51:00,433 --> 00:51:00,813
All right.

950
00:51:00,833 --> 00:51:01,923
Now, the coup de grâce.

951
00:51:02,363 --> 00:51:03,753
Pièce de résistance.

952
00:51:04,463 --> 00:51:08,393
I can't even do my pretend French. All
this stuff has been running locally.

953
00:51:08,393 --> 00:51:10,893
Like when we think of Docker Desktop,
we think of everything locally.

954
00:51:10,893 --> 00:51:17,153
And then a year or two ago, Docker
launched Docker Build Cloud, which was

955
00:51:17,163 --> 00:51:19,583
like getting back to Docker's roots.

956
00:51:19,583 --> 00:51:23,903
I almost feel like, of providing more of
a SaaS service that essentially is

957
00:51:24,678 --> 00:51:26,528
doing something in a container for me.

958
00:51:26,528 --> 00:51:30,228
And in that case, it was just building
containers using an outsourced BuildKit.

959
00:51:30,238 --> 00:51:34,538
So it was better for parallelization
and multi architecture.

960
00:51:34,588 --> 00:51:35,168
it was sweet.

961
00:51:35,178 --> 00:51:39,878
And I love it for when I need to build
like enterprise tools or big business

962
00:51:39,968 --> 00:51:41,838
things that take 20 minutes to build.

963
00:51:41,838 --> 00:51:45,598
None of my sample little examples do
that, but anything in the real world

964
00:51:45,598 --> 00:51:48,718
takes that long and you need to build
multi architecture and generally it's

965
00:51:48,758 --> 00:51:50,368
going to be faster in a cloud environment.

966
00:51:50,368 --> 00:51:51,278
So you provided that.

967
00:51:51,658 --> 00:51:58,428
Now it feels like you've upgraded, like
it's beyond just building, and it does any

968
00:51:58,428 --> 00:52:04,088
image or any container I want to run, any
model I want to run, I guess, not maybe

969
00:52:04,098 --> 00:52:07,748
any, I don't know if there's a limitation
there, but, bigger models than maybe I

970
00:52:07,748 --> 00:52:11,238
can run locally, and then, also builds.

971
00:52:11,653 --> 00:52:15,443
So it can do building the image,
hosting the container, running the

972
00:52:15,443 --> 00:52:17,903
model endpoint for a Qwen3 or whatever.

973
00:52:18,423 --> 00:52:20,953
I can now do all that in
something called Offload.

974
00:52:21,303 --> 00:52:22,243
So tell me about that.

975
00:52:22,548 --> 00:52:26,588
Docker Offload, the way I explain it to
people is, hey, you need more resources?

976
00:52:26,768 --> 00:52:27,808
Burst into the cloud.

977
00:52:28,628 --> 00:52:31,838
And so it's basically, I'm going
to offload this into the cloud,

978
00:52:32,178 --> 00:52:35,548
but yet it's still, everything
still works as if it were local.

979
00:52:35,548 --> 00:52:38,058
So if I've got bind mounts, okay,
great, we're going to automatically

980
00:52:38,058 --> 00:52:39,518
set up the synchronized file shares.

981
00:52:39,528 --> 00:52:43,448
And so all that's going to work that way,
using Mutagen and some of the other tools

982
00:52:43,448 --> 00:52:44,948
behind the scenes to make that work.

983
00:52:45,278 --> 00:52:48,008
Port publishing that still
works as you would expect it to.

984
00:52:48,008 --> 00:52:54,798
So again, it gives that local
experience, but using remote resources.

985
00:52:54,848 --> 00:52:57,758
I'm just offloading this to
the cloud, but yet it's still.

986
00:52:58,393 --> 00:52:59,153
my environment.

987
00:52:59,553 --> 00:53:02,553
and so, yeah, to make it clear, like,
this is not a production runtime

988
00:53:02,553 --> 00:53:05,863
environment, I can't share this
environment out, or, I can't, you

989
00:53:05,863 --> 00:53:08,573
know, create a URL and say, hey, check
this out, colleague, or whatever,

990
00:53:08,573 --> 00:53:10,253
it's still for your personal use.

991
00:53:10,423 --> 00:53:13,853
Now, of course, can you make a
Cloudflare tunnel, and I'm going

992
00:53:13,853 --> 00:53:17,013
to make it production, sure, but I

993
00:53:17,053 --> 00:53:17,243
wouldn't

994
00:53:17,304 --> 00:53:17,614
could hack

995
00:53:17,973 --> 00:53:18,223
that.

996
00:53:18,444 --> 00:53:18,884
but yeah.

997
00:53:18,945 --> 00:53:19,555
Yeah.

998
00:53:19,795 --> 00:53:20,755
So what is the intent?

999
00:53:20,765 --> 00:53:23,095
so what is the use case?

1000
00:53:23,935 --> 00:53:24,205
the big

1001
00:53:24,267 --> 00:53:26,147
what should I use Docker Offload for first?

1002
00:53:26,450 --> 00:53:27,620
Yeah, so, okay, great.

1003
00:53:27,630 --> 00:53:31,160
You're wanting to play around with these
agentic apps and, you know, we

1004
00:53:31,170 --> 00:53:35,450
were talking about not everybody
has access to high end GPUs or,

1005
00:53:35,450 --> 00:53:37,930
you know, M4 machines and whatnot.

1006
00:53:37,970 --> 00:53:41,800
Great, with the flip of a switch, and
you had it there in Docker Desktop,

1007
00:53:41,800 --> 00:53:45,440
but at the top you just flip a
switch, and now you're using offload.

1008
00:53:45,810 --> 00:53:51,990
and so now you've got access to a pretty
significant NVIDIA GPU, and additional

1009
00:53:51,990 --> 00:53:57,120
resources, and so yeah, as you're, we
see the use case, especially more for

1010
00:53:57,120 --> 00:54:00,640
the agent applications, because that's
where those resources are needed.

1011
00:54:01,580 --> 00:54:07,070
It does open up some interesting doors
for, maybe I'm just on a super lightweight

1012
00:54:07,100 --> 00:54:09,930
laptop that I'm using for school and
I don't have the ability to even run

1013
00:54:09,930 --> 00:54:11,640
a lot of my containerized workloads.

1014
00:54:12,220 --> 00:54:15,480
Great, I can use that for offload,
you know, offload that to the cloud.

1015
00:54:15,530 --> 00:54:19,460
it does open up some interesting
opportunities for use cases beyond

1016
00:54:19,460 --> 00:54:23,190
agentic apps, but that's kind of
where the big focus is right now.

1017
00:54:23,415 --> 00:54:26,765
So if you're like a Docker insider, or if
you're someone who's used Docker a while,

1018
00:54:27,420 --> 00:54:32,470
it's the Docker context command that we've
had forever, augmenting or changing the

1019
00:54:32,470 --> 00:54:36,100
environment variable DOCKER_HOST,
which we've had since almost the

1020
00:54:36,100 --> 00:54:43,475
beginning, and it allows you, from your
local Docker CLI, and even the GUI works

1021
00:54:43,475 --> 00:54:48,785
this way too, because I could always
set up a Docker remote engine and then

1022
00:54:49,075 --> 00:54:51,545
create a new context in the Docker CLI.

1023
00:54:52,025 --> 00:54:57,665
That would use SSH tunneling to go to that
server, and then I could run my Docker

1024
00:54:57,665 --> 00:55:02,015
CLI locally, my Compose CLI locally, and
it would technically be accessing and

1025
00:55:02,015 --> 00:55:06,615
running against, the remote host that I
had set up, but that was never really a

1026
00:55:06,615 --> 00:55:11,695
cloud service, like it was never, no one
provides Docker API access as a service

1027
00:55:11,695 --> 00:55:16,205
that I'm aware of, and, the context
command, while it's easy to use, and you

1028
00:55:16,205 --> 00:55:19,195
can actually use it on any command, you
can use docker run --context,

1029
00:55:19,195 --> 00:55:22,815
I believe, or docker --context
run, I can't remember the order, but,

1030
00:55:22,965 --> 00:55:24,125
you can change that on any command.
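To make the mechanics Bret is describing concrete, here's a sketch of the plain docker context workflow against a remote engine. The host and context names are placeholders, and the commands assume SSH access to a machine running Docker, so treat them as illustration rather than something to paste verbatim:

```shell
# Create a context that points the CLI at a remote engine over SSH
docker context create my-remote --docker "host=ssh://user@remote-host"

# Switch every subsequent command to that engine...
docker context use my-remote

# ...or target it for a single command; --context is a global flag,
# so it goes before the subcommand:
docker --context my-remote run --rm alpine echo hello

# The older mechanism: the DOCKER_HOST variable overrides the context
DOCKER_HOST=ssh://user@remote-host docker ps
```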

1031
00:55:24,275 --> 00:55:27,285
these are all things that existed,
but you made it, stupid easy.

1032
00:55:27,945 --> 00:55:30,445
It's just, like you said,
it's a toggle, it's so easy.

1033
00:55:30,885 --> 00:55:33,655
You just click that button
and then the UI changes, the

1034
00:55:34,015 --> 00:55:35,895
colors change, so now you know.

1035
00:55:36,365 --> 00:55:37,525
You're now remote.

1036
00:55:38,350 --> 00:55:41,830
Yeah, and so I'll go ahead and say too,
behind the scenes, it's using context.

1037
00:55:41,830 --> 00:55:43,250
It's using those exact things.

1038
00:55:43,610 --> 00:55:47,980
The tricky part is, because I've done
similar, development environments where,

1039
00:55:48,030 --> 00:55:52,280
I'm going to work against a Raspberry Pi
at home or, whatever else it might be.

1040
00:55:52,720 --> 00:55:56,050
the tricky part is when you want to
get into bind mounts, file sharing

1041
00:55:56,050 --> 00:55:58,620
kind of stuff, or port publishing,
and I want to be able to access

1042
00:55:58,620 --> 00:56:01,530
that port from my machine, like
automating all those different pieces.

1043
00:56:02,420 --> 00:56:03,730
That's not trivial.

1044
00:56:03,750 --> 00:56:04,900
I mean, it's possible.

1045
00:56:05,080 --> 00:56:07,810
With a separate tool, yeah, you gotta
download ngrok or something.

1046
00:56:08,560 --> 00:56:12,310
And so this brings all that together
into a single offering here.

1047
00:56:12,902 --> 00:56:13,802
That's pretty amazing.

1048
00:56:13,812 --> 00:56:18,132
Like there's a lot going on underneath
the hood; that switch is hiding

1049
00:56:18,142 --> 00:56:19,922
a lot of different functionality.

1050
00:56:19,952 --> 00:56:20,152
Like

1051
00:56:20,152 --> 00:56:20,442
it's.

1052
00:56:20,997 --> 00:56:22,587
To make that very transparent.

1053
00:56:23,250 --> 00:56:25,350
And this supports builds too, right?

1054
00:56:25,360 --> 00:56:25,890
So like.

1055
00:56:26,505 --> 00:56:30,475
When I toggle this in the
UI, or is there a CLI toggle?

1056
00:56:30,685 --> 00:56:31,305
Yeah, there is.
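For reference, the CLI side of the toggle looks roughly like this at the time of writing; the exact subcommands are worth verifying against `docker offload --help` on your install:

```shell
# Start an Offload session (the engine switches to the cloud)
docker offload start

# See whether you're currently running against Offload
docker offload status

# End the session and drop back to the local engine
docker offload stop
```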

1057
00:56:31,591 --> 00:56:31,991
okay.

1058
00:56:32,251 --> 00:56:35,681
So if I toggle this, it's, yeah, you're
like, you're saying it's a context

1059
00:56:35,681 --> 00:56:41,161
change, but it's UI aware, and it takes
in all the other little things that we

1060
00:56:41,161 --> 00:56:42,761
don't think about until they don't work.

1061
00:56:42,761 --> 00:56:45,761
And then we're like, oh, yeah, it's
not really running locally anymore.

1062
00:56:45,761 --> 00:56:47,231
So now I can't use localhost:port.

1063
00:56:47,381 --> 00:56:49,631
Well, that all just, I'm going to show
you how this works and you don't even

1064
00:56:49,641 --> 00:56:52,721
have to know, kind of, like, the rest of
the Docker-y stuff, because you don't really

1065
00:56:52,721 --> 00:56:54,231
have to know how it works underneath.

1066
00:56:54,591 --> 00:56:57,531
but if you think it's too much magic,
I like to break it down and say,

1067
00:56:57,701 --> 00:56:59,241
it's just really the Docker context.

1068
00:56:59,271 --> 00:57:01,221
I didn't actually look
at any of the code.

1069
00:57:01,241 --> 00:57:02,641
I don't know really how it's working.

1070
00:57:02,891 --> 00:57:06,391
But to me, when I went and checked,
it does change the context for me.

1071
00:57:06,391 --> 00:57:08,031
It actually injects it
and then removes it.

1072
00:57:08,311 --> 00:57:10,011
I did notice it from the CLI.

1073
00:57:10,211 --> 00:57:11,441
I could change context.

1074
00:57:11,867 --> 00:57:15,657
And it would retain the context, but
if I use the toggle button, it deletes

1075
00:57:15,657 --> 00:57:16,987
the context and then re-adds it.

1076
00:57:17,107 --> 00:57:20,347
Regardless, it is in the
background, it's doing cool things.

1077
00:57:20,647 --> 00:57:23,817
I think the immediate request from
the captains was, can I do both?

1078
00:57:24,017 --> 00:57:30,157
Can I have per-workload or per-service
offload so that just my model's remote

1079
00:57:30,157 --> 00:57:34,087
and maybe that really big database
server and then all my apps are local.

1080
00:57:34,397 --> 00:57:38,497
I don't know why I would care, but like,
that's something that people ask for.

1081
00:57:38,707 --> 00:57:40,927
I'm not sure that I
care to that level.

1082
00:57:40,927 --> 00:57:44,857
I think I'm fine with either or, but I can
understand that if I'm running some things

1083
00:57:44,857 --> 00:57:49,167
locally already and I just want to add on
something in addition, it would be neat

1084
00:57:49,167 --> 00:57:51,477
if I could just choose for one service.

1085
00:57:52,462 --> 00:57:52,682
Yeah.

1086
00:57:52,682 --> 00:57:54,612
So as of right now, it
is an all-or-nothing.

1087
00:57:54,622 --> 00:57:55,742
You're doing everything local,

1088
00:57:55,742 --> 00:57:57,232
or you're doing everything out in the cloud.

1089
00:57:57,232 --> 00:57:58,317
There's not a way to

1090
00:57:58,787 --> 00:58:00,257
split that up yet.

1091
00:58:00,307 --> 00:58:02,367
it's something that we've heard
from a couple of folks, but

1092
00:58:02,417 --> 00:58:05,237
again, it's that same thing of,
tell us more about the use cases.

1093
00:58:05,237 --> 00:58:08,447
So if that's a use case you have,
feel free to reach out to us and

1094
00:58:08,447 --> 00:58:13,907
help us better understand why
you might want to split runtime

1095
00:58:14,147 --> 00:58:16,977
hosting, split environment,
hybrid environment.

1096
00:58:17,237 --> 00:58:18,317
That's the correct term.

1097
00:58:18,476 --> 00:58:19,236
Why do you say it like that?

1098
00:58:19,436 --> 00:58:21,816
and just to be clear,
offload has its own cost.

1099
00:58:21,816 --> 00:58:24,416
like this isn't free forever for infinity.

1100
00:58:24,416 --> 00:58:26,246
You can't just take up a bunch of GPUs.

1101
00:58:26,506 --> 00:58:29,686
I was asking the team a little bit
and without getting too nerdy, it

1102
00:58:29,696 --> 00:58:33,826
sounds like it isolates, it spins up
a VM or there's maybe some hot VMs.

1103
00:58:33,826 --> 00:58:35,086
And I get a dedicated

1104
00:58:36,026 --> 00:58:39,476
OS, essentially, it sounds like, so
that I can get the GPU if I need it.

1105
00:58:39,476 --> 00:58:43,206
And you kind of get an option of: do
I want servers with GPUs or not?

1106
00:58:43,216 --> 00:58:45,486
Do I, am I going to run
GPU workloads or not?

1107
00:58:45,486 --> 00:58:48,476
And that affects pricing. Do we
get anything out of the box with

1108
00:58:48,476 --> 00:58:51,386
a Docker subscription, or is it
just a completely separate

1109
00:58:51,726 --> 00:58:56,153
So actually it's kind of a private
beta, but people can sign up

1110
00:58:56,163 --> 00:58:57,353
for it and that kind of stuff.

1111
00:58:57,683 --> 00:59:01,633
Folks will get 300 GPU minutes,
which isn't a ton, but it's enough to

1112
00:59:01,653 --> 00:59:03,143
experiment and play around with it.

1113
00:59:03,443 --> 00:59:05,233
and then start giving us feedback, etc.

1114
00:59:05,549 --> 00:59:08,169
Yeah, if you spin up the GPU
instance and then go to lunch, by

1115
00:59:08,169 --> 00:59:10,659
the time you get back, you'll have
probably used up your free minutes.

1116
00:59:10,738 --> 00:59:11,448
It's a long lunch,

1117
00:59:11,508 --> 00:59:12,668
Hey, that's my kind of lunch.

1118
00:59:14,458 --> 00:59:17,998
but yeah, so we went an hour,
and we barely scratched the

1119
00:59:17,998 --> 00:59:19,228
surface, did we cover it all?

1120
00:59:19,298 --> 00:59:22,978
Did we at least list all the
announcements of major features and tools?

1121
00:59:22,978 --> 00:59:25,148
I don't even want to say we've covered
all the features because there's

1122
00:59:25,148 --> 00:59:26,798
probably some stuff with MCP we missed.

1123
00:59:27,188 --> 00:59:31,338
So you open-sourced the MCP Gateway, but
we should point out you don't actually

1124
00:59:31,338 --> 00:59:36,488
have to know, like you can just use
Docker Desktop and MCP tools locally.

1125
00:59:37,028 --> 00:59:40,618
But the reason you provide the
MCP Gateway as open source is so

1126
00:59:40,618 --> 00:59:44,478
we could put it in the compose
file and then run it on servers.

1127
00:59:44,528 --> 00:59:45,498
think about it this way.

1128
00:59:45,518 --> 00:59:48,478
the MCP toolkit bundled with Docker
Desktop is going to be more for,

1129
00:59:48,728 --> 00:59:52,678
I'm consuming, I'm just wanting to
use MCP servers and connect them

1130
00:59:52,678 --> 00:59:54,558
to my other agentic applications.

1131
00:59:54,868 --> 00:59:57,888
And the MCP Gateway is going to
be more for, now I want to build

1132
00:59:57,888 --> 01:00:01,948
my own agentic applications and
connect those tools to those

1133
01:00:01,948 --> 01:00:03,508
applications that we're running there.
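To make the consume-versus-build split concrete, a Compose sketch of the "build" side might look like this. The gateway image name, port, and environment variable here are assumptions based on the open-source docker/mcp-gateway project, so check its README for the current syntax:

```yaml
services:
  agent:
    build: .                       # your agentic application
    environment:
      # hypothetical variable: wherever your app reads its MCP endpoint
      MCP_GATEWAY_URL: http://mcp-gateway:8811
    depends_on:
      - mcp-gateway

  mcp-gateway:
    image: docker/mcp-gateway      # assumption: published image name
    ports:
      - "8811:8811"
```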

1134
01:00:03,949 --> 01:00:04,389
Yeah.

1135
01:00:04,729 --> 01:00:07,229
Do you see people using
MCP Gateway in production?

1136
01:00:07,249 --> 01:00:10,819
Do you see that as like a... not that
you provide support or anything like

1137
01:00:10,819 --> 01:00:12,499
that, but is it designed so that

1138
01:00:13,499 --> 01:00:16,989
We've got a couple of folks that
are already starting to do so.

1139
01:00:17,059 --> 01:00:19,609
stay tuned for some use
case stories around that.

1140
01:00:19,659 --> 01:00:19,939
Yeah.

1141
01:00:20,569 --> 01:00:20,919
Awesome.

1142
01:00:21,419 --> 01:00:22,329
well, this is a lot.

1143
01:00:22,359 --> 01:00:28,139
I feel like I need to launch another
10 Docker YouTube uploads just

1144
01:00:28,149 --> 01:00:31,319
to cover each tool specifically,
each use case specifically.

1145
01:00:31,609 --> 01:00:35,314
There's a lot here, but
this is amazing work.

1146
01:00:35,314 --> 01:00:39,884
I mean, I don't know if you have a fleet
of AI robots working for you yet, but

1147
01:00:40,118 --> 01:00:44,348
certainly feels like a lot of different
products that are all coming together very

1148
01:00:44,348 --> 01:00:48,678
quickly that are all somehow related to
each other, but also independently usable.

1149
01:00:49,143 --> 01:00:54,103
And having you on the show, as
usual, is a great way to break it down

1150
01:00:54,133 --> 01:00:58,233
into the real usable bits. What do
we really care about, without all the

1151
01:00:58,233 --> 01:01:03,053
marketing hype, the general AI hype, which
is always a problem on the internet.

1152
01:01:03,053 --> 01:01:05,173
But this feels like really useful stuff.

1153
01:01:05,203 --> 01:01:05,793
Um,

1154
01:01:08,058 --> 01:01:09,268
Eivor, another podcast.

1155
01:01:09,268 --> 01:01:10,738
I don't know, Eivor, what's up?

1156
01:01:10,748 --> 01:01:13,038
Are you requesting yet another podcast?

1157
01:01:13,398 --> 01:01:14,028
Um,

1158
01:01:14,432 --> 01:01:15,132
a whole new show

1159
01:01:15,178 --> 01:01:17,668
about Compose provider services?

1160
01:01:18,088 --> 01:01:18,888
Oh, yes.

1161
01:01:18,928 --> 01:01:24,448
Also, you can now run Compose directly
from, well, you can use Compose YAML

1162
01:01:24,468 --> 01:01:27,618
directly inside of cloud tools.

1163
01:01:28,173 --> 01:01:29,923
The first one was Google Cloud Run.

1164
01:01:30,253 --> 01:01:33,373
So I could technically spin
up Google, which I love,

1165
01:01:33,383 --> 01:01:34,883
Google Cloud Run is fantastic.

1166
01:01:35,203 --> 01:01:38,403
Um, it would be my first choice
for running any containers in

1167
01:01:38,403 --> 01:01:39,673
Google if I was using Google.

1168
01:01:40,193 --> 01:01:45,902
Um, and so now they're accepting
the Compose YAML spec, essentially,

1169
01:01:46,422 --> 01:01:48,282
inside of their command line.
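As of the announcement, the Cloud Run side reportedly looks something like the following; the command is in preview and its syntax may change, so verify against Google's current docs before relying on it:

```shell
# Deploy a Compose file straight to Cloud Run (preview; syntax may differ)
gcloud run compose up compose.yaml
```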

1170
01:01:48,559 --> 01:01:51,039
So this is like, this feels like the
opposite of what Docker used to do.

1171
01:01:51,039 --> 01:01:54,339
Docker used to build in cloud
functionality into the Docker tooling.

1172
01:01:54,339 --> 01:01:57,769
But now we're saying, Hey, let's partner
with those tools, those companies,

1173
01:01:57,989 --> 01:02:03,489
and let them build the
Compose specification into their tool.

1174
01:02:03,849 --> 01:02:06,419
So we can have basically file reuse.

1175
01:02:06,479 --> 01:02:07,209
YAML reuse.

1176
01:02:07,209 --> 01:02:07,919
Is that right?

1177
01:02:08,549 --> 01:02:08,829
yeah.

1178
01:02:08,829 --> 01:02:13,249
So this is the first time, exactly, in which
it's not Docker tooling that's providing

1179
01:02:13,249 --> 01:02:15,959
the cloud support, but it's cloud native.

1180
01:02:16,216 --> 01:02:19,406
They're the ones building the tooling
and consuming the Compose file.

1181
01:02:19,590 --> 01:02:21,160
yeah, it's a big moment.

1182
01:02:21,170 --> 01:02:25,530
And as we work with Google Cloud
on this, yeah, you can deploy the

1183
01:02:25,530 --> 01:02:27,020
normal container workloads, etc.

1184
01:02:27,020 --> 01:02:30,890
But they already have support for Model
Runner to be able to run the models

1185
01:02:30,890 --> 01:02:34,460
there as well. It's pretty exciting.
And I know, the provider services,

1186
01:02:34,520 --> 01:02:37,410
this is how we started with models.

1187
01:02:37,815 --> 01:02:39,555
having support in Compose, where

1188
01:02:39,555 --> 01:02:44,964
that was another service
in which the service wasn't

1189
01:02:45,024 --> 01:02:46,704
backed by a normal container.

1190
01:02:47,174 --> 01:02:47,984
the old method.

1191
01:02:48,034 --> 01:02:52,894
Yes, but what's cool about this is, so
first off, these hooks still are in place.

1192
01:02:53,309 --> 01:02:57,259
So that a Compose file can basically
delegate off to this additional

1193
01:02:57,279 --> 01:03:00,899
provider plugin to say, hey, this is
how you're going to spin up a model.

1194
01:03:01,159 --> 01:03:04,029
But it starts to open up a whole
ecosystem where anybody can make a

1195
01:03:04,029 --> 01:03:08,779
provider or, okay, hey, I've got this
cloud-based database, just as an example.

1196
01:03:09,069 --> 01:03:12,429
And, okay, now I can still use
Compose and it's going to spin up

1197
01:03:12,429 --> 01:03:17,439
my containers, but also create this
cloud-based container and then inject

1198
01:03:17,629 --> 01:03:19,669
environment variables into my app.

1199
01:03:19,719 --> 01:03:22,039
again, it starts to open up
some pretty cool extensibility

1200
01:03:22,039 --> 01:03:23,589
capabilities of Compose as well.
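The hook Michael describes is the Compose "provider" service type. A minimal sketch, with the app image and model name as placeholder examples:

```yaml
services:
  app:
    image: my-chat-app             # hypothetical application image
    depends_on:
      - llm

  llm:
    # Not backed by a normal container: Compose delegates this service
    # to a provider plugin (here, Docker Model Runner), which injects
    # connection details into dependent services as environment variables.
    provider:
      type: model
      options:
        model: ai/smollm2          # example model from Docker Hub's ai/ namespace
```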

1201
01:03:23,639 --> 01:03:28,279
I think we, yeah, we need to bring Michael
back just to dig into that because, it's

1202
01:03:28,279 --> 01:03:31,359
essentially like extensions or plugins

1203
01:03:31,807 --> 01:03:32,197
Yeah.

1204
01:03:32,328 --> 01:03:32,478
for

1205
01:03:32,507 --> 01:03:35,467
Yeah, so Compose is about to
get a whole lot more love.

1206
01:03:35,477 --> 01:03:38,777
It feels like it's already, I
mean, it's been years since we've

1207
01:03:38,777 --> 01:03:41,007
added a root extension or like a,

1208
01:03:41,158 --> 01:03:41,738
top level,

1209
01:03:41,947 --> 01:03:42,857
top level build.

1210
01:03:43,497 --> 01:03:46,247
it's not every day that Docker
decides there's a whole new

1211
01:03:46,807 --> 01:03:48,617
type of thing that we deploy.

1212
01:03:48,617 --> 01:03:52,127
Now we have models, we'll see if
providers someday become something.

1213
01:03:52,427 --> 01:03:52,777
That'll be cool.

1214
01:03:53,237 --> 01:03:56,687
and this is all due to the Compose
spec, which now allows other

1215
01:03:56,687 --> 01:03:59,877
tools to use the Compose standard.

1216
01:03:59,887 --> 01:04:02,157
And that's just great for everybody,
because everybody uses Compose.

1217
01:04:02,167 --> 01:04:05,117
it's like the most universal
YAML out there, in my opinion.

1218
01:04:05,527 --> 01:04:05,997
Great.

1219
01:04:06,087 --> 01:04:07,677
Well, I think we've covered it all.

1220
01:04:07,917 --> 01:04:10,907
Nirmal and I need another
month to digest all this, and

1221
01:04:10,907 --> 01:04:11,997
then we'll invite you back on.

1222
01:04:12,367 --> 01:04:12,627
Do it.

1223
01:04:13,177 --> 01:04:16,487
but yeah, we've checked the
box of everything Docker, first

1224
01:04:16,487 --> 01:04:18,917
half of the year, stay tuned
for the second half of the year.

1225
01:04:19,137 --> 01:04:22,357
I actually sincerely hope you don't
have as busy of a second half,

1226
01:04:22,387 --> 01:04:25,027
just because, these are a lot
of videos I've got to make; you're

1227
01:04:25,027 --> 01:04:26,727
putting a lot of work into my inbox.

1228
01:04:26,787 --> 01:04:28,607
We're helping you have content to create.

1229
01:04:28,903 --> 01:04:32,563
I know, yeah, there's no shortage of
content to create right now with Docker.

1230
01:04:32,903 --> 01:04:34,813
I am very excited to play
with all these things.

1231
01:04:34,813 --> 01:04:37,253
I sound excited because I am excited.

1232
01:04:37,253 --> 01:04:42,123
This is real stuff that I think is
beneficial and largely free, largely,

1233
01:04:42,438 --> 01:04:45,678
like almost all of this stuff is
really just extra functionality that

1234
01:04:45,678 --> 01:04:49,218
already exists in our tooling,
without adding a whole bunch of SaaS

1235
01:04:49,218 --> 01:04:50,808
services we have to buy on top of it.

1236
01:04:51,168 --> 01:04:52,728
yeah, so congrats.

1237
01:04:53,728 --> 01:04:55,958
People can find out more at docker.com.

1238
01:04:56,298 --> 01:05:00,368
docs.docker.com. Docker's got videos
on YouTube now, they're putting up

1239
01:05:00,368 --> 01:05:02,618
YouTube videos, so check that out.

1240
01:05:02,768 --> 01:05:05,443
I saw Michael putting up some
videos recently on LinkedIn.

1241
01:05:06,443 --> 01:05:07,203
It's all over the place.

1242
01:05:07,203 --> 01:05:10,063
You can follow Michael Irwin on LinkedIn.

1243
01:05:10,113 --> 01:05:11,013
he's on BlueSky.

1244
01:05:11,013 --> 01:05:11,943
I think you're on BlueSky.

1245
01:05:12,516 --> 01:05:13,446
I think you're on BlueSky.

1246
01:05:14,256 --> 01:05:16,416
Um, or, or, where, yeah,

1247
01:05:16,441 --> 01:05:17,451
figured out where I'm hanging out.

1248
01:05:17,883 --> 01:05:18,753
Thanks so much for being here.

1249
01:05:18,934 --> 01:05:19,474
Thank you, Michael.

1250
01:05:19,474 --> 01:05:21,694
Thank you, Nirmal, for
joining and staying so long.

1251
01:05:22,087 --> 01:05:22,867
I'll see you in the next one.