1
00:00:13,770 --> 00:00:15,562
If you enjoy this content and want to

2
00:00:15,562 --> 00:00:17,125
support it, go to

3
00:00:17,125 --> 00:00:20,291
makeitwork.tv, join as a member,

4
00:00:20,708 --> 00:00:21,520
and watch the full

5
00:00:21,520 --> 00:00:23,437
conversation as a 4K movie.

6
00:00:24,312 --> 00:00:26,562
You can stream it straight from the CDN

7
00:00:26,562 --> 00:00:29,500
or from the Jellyfin media server.

8
00:00:33,333 --> 00:00:38,020
It was summer, 5pm on a Saturday, and I

9
00:00:38,020 --> 00:00:39,645
sent the following email to support

10
00:00:40,187 --> 00:00:42,187
at namespace.so.

11
00:00:42,187 --> 00:00:43,458
Hi, I would like to

12
00:00:43,458 --> 00:00:45,687
debug a GitHub Actions workflow locally.

13
00:00:46,083 --> 00:00:47,875
Is it possible to run the Namespace

14
00:00:47,875 --> 00:00:50,958
managed Ubuntu container in Docker?

15
00:00:51,041 --> 00:00:53,125
And 12 minutes later I received a reply.

16
00:00:54,166 --> 00:00:56,104
Hi Gerhard, unfortunately we don't have

17
00:00:56,104 --> 00:00:57,083
that possibility yet,

18
00:00:57,520 --> 00:00:59,104
but it is something that we are working

19
00:00:59,104 --> 00:01:01,729
on. What we often suggest to folks who

20
00:01:01,729 --> 00:01:04,812
want to debug image related issues is to

21
00:01:04,812 --> 00:01:07,562
rely on breakpoint action, which allows

22
00:01:07,562 --> 00:01:09,458
you to stop the execution of a workflow

23
00:01:09,458 --> 00:01:11,416
for debugging purposes. So where does

25
00:01:12,270 --> 00:01:15,229
replying to customer support requests on

26
00:01:15,229 --> 00:01:18,416
a weekend fit in your CEO role?

27
00:01:18,416 --> 00:01:20,625
It's a great question and thanks for

28
00:01:20,625 --> 00:01:24,041
actually reaching out because we love

29
00:01:24,041 --> 00:01:25,916
working with developers and I think that

30
00:01:25,916 --> 00:01:27,520
just boils down to that.

31
00:01:29,166 --> 00:01:31,979
We care so much about offering great

32
00:01:31,979 --> 00:01:36,104
support and we are engineers, developers

33
00:01:36,104 --> 00:01:39,062
ourselves and many of these starting

34
00:01:39,062 --> 00:01:42,291
points of projects also started at the

35
00:01:42,291 --> 00:01:44,645
weekend. In fact, that's how Namespace

36
00:01:44,645 --> 00:01:46,666
itself started. It was a weekend project.

37
00:01:46,895 --> 00:01:49,395
So I have a lot of kind of a connection

38
00:01:49,562 --> 00:01:52,708
with that and balancing it out with

39
00:01:52,833 --> 00:01:54,708
regular life, but whenever we see a

40
00:01:54,708 --> 00:01:56,791
request coming in that could

41
00:01:56,791 --> 00:01:58,937
benefit from being unblocked, we try to

42
00:01:58,937 --> 00:01:59,750
do that very quickly

43
00:01:59,750 --> 00:02:01,666
because we care deeply about

44
00:02:01,666 --> 00:02:03,895
offering great support and kind of from

45
00:02:03,895 --> 00:02:06,125
an engineer to an

46
00:02:06,125 --> 00:02:07,458
engineer. And that goes to

47
00:02:07,458 --> 00:02:09,750
everyone in the company. It happened to

48
00:02:09,750 --> 00:02:11,562
be me replying that

49
00:02:11,562 --> 00:02:12,875
time, but it could have been

50
00:02:12,875 --> 00:02:15,187
someone else in the team as well. That's

51
00:02:15,187 --> 00:02:18,395
something that we try to embed as much as

52
00:02:18,395 --> 00:02:19,750
possible in our company culture.

53
00:02:20,708 --> 00:02:23,729
As first experiences go, that was a great one.

54
00:02:23,729 --> 00:02:26,729
So thank you very much. And it set the

55
00:02:26,729 --> 00:02:27,979
tone, I have to say,

56
00:02:27,979 --> 00:02:29,833
and to this day, all my

57
00:02:29,833 --> 00:02:32,208
interactions with Namespace have been like this.

58
00:02:32,208 --> 00:02:34,645
Whenever there's a problem, I am confident

59
00:02:34,645 --> 00:02:36,645
there's someone on the other end and I

60
00:02:36,645 --> 00:02:37,520
will get the help that I

61
00:02:37,520 --> 00:02:39,458
need. Oftentimes something that

62
00:02:39,458 --> 00:02:40,770
I didn't know. So always there's

63
00:02:40,770 --> 00:02:41,604
something to learn. I know

64
00:02:41,604 --> 00:02:42,750
about the breakpoint action,

65
00:02:42,750 --> 00:02:44,625
for example. This is a very useful one.

66
00:02:45,000 --> 00:02:45,791
And since then,

67
00:02:45,791 --> 00:02:47,166
obviously, I've learned about the

68
00:02:47,166 --> 00:02:50,291
NSC CLI and a couple of other things, but

69
00:02:50,291 --> 00:02:51,729
it's all there and we all

70
00:02:51,729 --> 00:02:53,020
meet as people and we're

71
00:02:53,020 --> 00:02:54,687
all passionate about this thing because

72
00:02:54,687 --> 00:02:57,833
who else, or how else could

73
00:02:57,833 --> 00:02:59,666
you get this sort of interaction

74
00:02:59,666 --> 00:03:03,020
5pm on a Saturday in the middle of the

75
00:03:03,020 --> 00:03:04,479
summer when maybe you would be out and

76
00:03:04,479 --> 00:03:05,395
about and doing things.

77
00:03:06,062 --> 00:03:07,750
You know, I don't remember anymore, but

78
00:03:07,750 --> 00:03:09,895
perhaps I was out and about.

79
00:03:09,895 --> 00:03:11,375
It's possible, yes, you're on your phone

80
00:03:11,375 --> 00:03:12,916
somewhere. And the

81
00:03:12,916 --> 00:03:14,291
request came in and you're

82
00:03:14,291 --> 00:03:17,291
the first one to pick it. Okay, so I know

83
00:03:17,291 --> 00:03:18,312
that you have a deep love

84
00:03:18,312 --> 00:03:19,666
for all things infrastructure.

85
00:03:19,666 --> 00:03:21,166
And this is something that I've learned

86
00:03:21,166 --> 00:03:22,729
over the months that

87
00:03:22,729 --> 00:03:26,104
we've been in contact, and I have

88
00:03:26,104 --> 00:03:29,000
two related questions. How did this love

89
00:03:29,000 --> 00:03:30,187
for infrastructure start

90
00:03:30,187 --> 00:03:32,187
and how does it translate

91
00:03:32,187 --> 00:03:34,166
to your day to day?

92
00:03:38,333 --> 00:03:43,437
I've always been fascinated by how things work

93
00:03:43,437 --> 00:03:46,520
and it's very hard to put my

94
00:03:46,541 --> 00:03:48,750
finger on when did that start. But I

95
00:03:48,750 --> 00:03:51,604
think it started early.

96
00:03:51,604 --> 00:03:54,479
I think the earliest memory that I have

97
00:03:54,479 --> 00:03:58,770
was in some very distant early

98
00:03:58,770 --> 00:04:02,416
years. Christmas, I got a

99
00:04:02,416 --> 00:04:05,958
present, a remote-controlled car.

100
00:04:05,958 --> 00:04:07,833
And I'm old, so it's like very

101
00:04:07,833 --> 00:04:10,250
rickety, early stages, remote-controlled

102
00:04:10,250 --> 00:04:13,895
car. And one of the first

103
00:04:13,895 --> 00:04:18,645
things that I did was open it up and see

104
00:04:18,645 --> 00:04:21,020
how it was inside. So I

105
00:04:21,020 --> 00:04:23,020
think it's just something

106
00:04:23,020 --> 00:04:27,500
wired in my head that I have just some

107
00:04:27,500 --> 00:04:29,645
curiosity to how things work.

108
00:04:29,645 --> 00:04:32,291
And then over time, also how

109
00:04:32,291 --> 00:04:34,562
complex things work and how they are the

110
00:04:34,562 --> 00:04:36,916
sum of simple things composed

111
00:04:36,916 --> 00:04:38,958
together to work in concert.

112
00:04:38,958 --> 00:04:42,125
And then over time, not just technology,

113
00:04:42,125 --> 00:04:44,000
but also people.

114
00:04:44,000 --> 00:04:47,104
They also have their own sets of

115
00:04:47,104 --> 00:04:49,958
complexities and they're also systems

116
00:04:49,958 --> 00:04:55,020
at work. So I think it came down from

117
00:04:55,020 --> 00:04:57,687
just a natural curiosity. I got involved

118
00:04:57,687 --> 00:05:00,125
with technology a

119
00:05:00,125 --> 00:05:01,604
little bit accidentally.

120
00:05:03,104 --> 00:05:04,520
Maybe another,

121
00:05:04,520 --> 00:05:06,687
Actually, another interesting story. I

122
00:05:06,687 --> 00:05:08,208
had my tonsils removed very

123
00:05:08,208 --> 00:05:11,708
early on. I was four or five

124
00:05:11,708 --> 00:05:16,041
years old and I have this distinct memory

125
00:05:16,041 --> 00:05:17,479
of turning to the side

126
00:05:17,479 --> 00:05:20,916
and seeing a screen which

127
00:05:20,916 --> 00:05:23,833
was probably plotting either my heart

128
00:05:23,833 --> 00:05:25,395
rate or plotting something.

129
00:05:27,041 --> 00:05:28,770
And I started asking about

130
00:05:28,770 --> 00:05:31,645
that screen because it's just in the haze

131
00:05:31,645 --> 00:05:33,687
of going under for

132
00:05:33,687 --> 00:05:36,625
surgery. It probably was the

133
00:05:36,625 --> 00:05:38,354
thing that came to mind as a small kid.

134
00:05:40,000 --> 00:05:41,708
And the nurse said,

135
00:05:41,708 --> 00:05:44,062
"Don't worry, just calm down

136
00:05:44,062 --> 00:05:46,250
and we'll show you everything about this

137
00:05:46,250 --> 00:05:48,812
screen afterwards."

138
00:05:48,812 --> 00:05:51,604
And they never did, but that was my

139
00:05:51,604 --> 00:05:56,604
first connection with computers and the

140
00:05:56,604 --> 00:05:58,812
idea of screens and things

141
00:05:58,812 --> 00:06:00,104
that show up on the screen.

142
00:06:02,083 --> 00:06:06,958
And a few years later, started at a time

143
00:06:06,958 --> 00:06:07,979
where you would still buy

144
00:06:07,979 --> 00:06:12,895
magazines that had printouts of code.

145
00:06:12,895 --> 00:06:16,229
And then it's how I got introduced

146
00:06:16,229 --> 00:06:18,000
into the idea that you can actually

147
00:06:18,000 --> 00:06:19,791
program these machines.

148
00:06:19,791 --> 00:06:22,229
And then later on, I was lucky enough

150
00:06:23,520 --> 00:06:27,416
when I was 12, I got my first

151
00:06:27,416 --> 00:06:29,208
computer. But a couple years

152
00:06:29,208 --> 00:06:32,041
before that, I had access to a school

154
00:06:32,520 --> 00:06:33,979
where there was a computer

155
00:06:33,979 --> 00:06:35,895
I could use. So I started

156
00:06:35,895 --> 00:06:38,145
kind of playing around by myself. But

157
00:06:38,145 --> 00:06:40,125
then when I got my first computer,

159
00:06:40,187 --> 00:06:43,000
you start to explore, navigate,

160
00:06:43,000 --> 00:06:45,020
eventually the internet becomes a thing.

161
00:06:45,416 --> 00:06:47,104
My first connection to the internet was

162
00:06:47,104 --> 00:06:48,958
actually with a dial-up modem.

164
00:06:49,125 --> 00:06:51,604
But where I lived, you didn't

165
00:06:51,604 --> 00:06:54,645
have an RJ11 plug. So we had like older

167
00:06:55,645 --> 00:06:58,270
plugs with three prongs. I

168
00:06:58,270 --> 00:06:59,520
actually don't even know what

169
00:06:59,520 --> 00:07:04,062
kind of plug it was. And I was 14, and I

170
00:07:04,062 --> 00:07:05,333
got the modem for

171
00:07:05,333 --> 00:07:09,583
Christmas. And it came with this RJ11

172
00:07:09,583 --> 00:07:12,500
on one side. I really want to use this,

173
00:07:12,500 --> 00:07:13,458
but I don't have anywhere

174
00:07:13,458 --> 00:07:15,416
to plug it. And I thought,

175
00:07:15,416 --> 00:07:17,729
"Well, there must just be electricity."

176
00:07:17,729 --> 00:07:21,750
So I unplugged this

177
00:07:21,750 --> 00:07:24,416
old-school plug from the wall at

178
00:07:24,437 --> 00:07:27,166
my place. And I kind of tear apart the RJ11.

179
00:07:27,166 --> 00:07:28,666
And I start trying

180
00:07:28,666 --> 00:07:30,916
different combinations of cables,

181
00:07:30,916 --> 00:07:33,625
which probably I shouldn't have. But

182
00:07:33,625 --> 00:07:35,395
eventually, I got a

183
00:07:35,395 --> 00:07:38,645
dial tone. And this magical

184
00:07:38,645 --> 00:07:39,791
(mimics dial-up tone)

185
00:07:39,791 --> 00:07:42,750
of the modem starting to dial out, which

186
00:07:42,750 --> 00:07:45,166
I had heard before. And I

187
00:07:45,166 --> 00:07:46,875
was like, "Wow, this is the

188
00:07:46,875 --> 00:07:48,833
beginning of something." And the

189
00:07:48,833 --> 00:07:51,645
internet... Yeah, that was the thing. Did

190
00:07:51,645 --> 00:07:53,145
it ever happen for you to

191
00:07:53,145 --> 00:07:54,979
receive a phone call while you were

192
00:07:54,979 --> 00:07:57,145
messing with wires? Well, not messing

193
00:07:57,145 --> 00:07:58,375
with the wires, because...

194
00:07:59,125 --> 00:08:00,791
It's that moment when you're plugging the

195
00:08:00,791 --> 00:08:03,562
wires in, because that was

196
00:08:03,562 --> 00:08:05,229
my moment when I realized

197
00:08:05,229 --> 00:08:06,854
I shouldn't be doing that. I did exactly

198
00:08:06,854 --> 00:08:08,229
the same thing. And there

199
00:08:08,229 --> 00:08:09,416
was a phone call coming in,

200
00:08:09,416 --> 00:08:11,187
so you get a little bit of a shock. Not

201
00:08:11,187 --> 00:08:13,791
too much. But I wasn't much older than

202
00:08:13,791 --> 00:08:14,958
you. And I tried the

203
00:08:14,958 --> 00:08:16,270
same thing. And I remember, "Okay, so

204
00:08:16,270 --> 00:08:18,062
that's why you don't mess with wires

205
00:08:18,062 --> 00:08:19,229
because they're live."

206
00:08:19,229 --> 00:08:21,687
And, yeah, I mean, it's not like the

207
00:08:21,687 --> 00:08:22,916
voltage is very low. I forget

208
00:08:22,916 --> 00:08:24,083
exactly how much it is. It's

209
00:08:24,083 --> 00:08:26,145
enough to feel it, to feel the phone

210
00:08:26,145 --> 00:08:28,937
call. But that was my moment when I...

211
00:08:29,604 --> 00:08:31,437
Same approach. Let's

212
00:08:31,437 --> 00:08:32,854
figure this thing out. Let's wire them

213
00:08:32,854 --> 00:08:34,250
together. And at the same

214
00:08:34,250 --> 00:08:35,895
time as I was wiring them, there

215
00:08:35,895 --> 00:08:37,312
was a phone call coming in. So it got a

216
00:08:37,312 --> 00:08:38,583
bit of a shock. But

217
00:08:38,583 --> 00:08:40,312
nothing happened apart from that.

218
00:08:41,312 --> 00:08:42,291
You were shocked twice.

219
00:08:42,875 --> 00:08:44,833
I was shocked twice, yes. Once for real?

220
00:08:44,833 --> 00:08:46,604
Wow, okay, I shouldn't be doing this.

221
00:08:46,604 --> 00:08:48,312
So, yeah, it was no big deal.

222
00:08:49,041 --> 00:08:53,145
I wasn't lucky enough to be shocked. But

223
00:08:53,145 --> 00:08:55,750
it was very common that

224
00:08:55,750 --> 00:08:59,000
either my mom would want to

225
00:08:59,000 --> 00:09:01,479
dial out or someone would be dialing in

226
00:09:01,479 --> 00:09:04,645
and it would interfere with

227
00:09:04,645 --> 00:09:06,187
a connection. And that was

228
00:09:06,187 --> 00:09:09,958
definitely a lot of drama around the fact

229
00:09:09,958 --> 00:09:13,854
that you cannot really utilize the line.

230
00:09:13,875 --> 00:09:16,500
The beginning was two phone lines. So, I

231
00:09:16,500 --> 00:09:18,291
had one friend that had

232
00:09:18,291 --> 00:09:19,458
two phone lines for this very

233
00:09:19,458 --> 00:09:21,458
purpose. I was like, "Oh, wow, he is

234
00:09:21,458 --> 00:09:24,520
living the dream." Two phone lines. One

235
00:09:24,520 --> 00:09:25,375
for internet and one

236
00:09:25,375 --> 00:09:28,395
for like, you know, regular phone. Yeah.

237
00:09:28,645 --> 00:09:31,833
And I had a friend that had ISDN at home.

238
00:09:32,250 --> 00:09:34,145
Oh, that was just...

239
00:09:34,395 --> 00:09:35,916
He was rich. He was one of

240
00:09:35,916 --> 00:09:37,208
the rich kids. I can tell.

241
00:09:38,187 --> 00:09:40,416
That's like he lived in a part of the

242
00:09:40,416 --> 00:09:43,645
city where people could afford ISDN.

243
00:09:44,187 --> 00:09:46,437
Now to this day, we were talking about

244
00:09:46,437 --> 00:09:48,291
this yesterday, your

245
00:09:48,291 --> 00:09:50,083
connection is unheard of.

246
00:09:50,291 --> 00:09:52,458
I think even for most people, like what

247
00:09:52,458 --> 00:09:54,666
they have at home, can you tell us a

248
00:09:54,666 --> 00:09:55,416
little bit about it,

249
00:09:55,416 --> 00:09:56,020
about the connection

250
00:09:56,020 --> 00:09:57,041
that you have currently?

251
00:09:57,479 --> 00:09:58,875
So I live in Switzerland. And there's

252
00:09:58,875 --> 00:10:02,770
this fantastic ISP here called Init7.

253
00:10:02,770 --> 00:10:03,833
And they don't pay me

254
00:10:03,833 --> 00:10:04,750
to say this. They're really

255
00:10:04,750 --> 00:10:06,812
fantastic. So, they actually started many

256
00:10:06,812 --> 00:10:07,833
years back when I

257
00:10:07,833 --> 00:10:09,041
moved here 12 years ago.

259
00:10:10,020 --> 00:10:11,333
When I moved here, I already had one

260
00:10:11,333 --> 00:10:13,437
gigabit symmetric. So, up and

261
00:10:13,437 --> 00:10:15,875
down. And fiber to the home.

262
00:10:16,875 --> 00:10:21,062
But nowadays, they have 25 gigabit

263
00:10:21,062 --> 00:10:24,770
symmetric to the home.

264
00:10:24,770 --> 00:10:28,645
They're nerds as well. And it's a great

265
00:10:28,645 --> 00:10:31,645
company for other nerds. Obviously, I

266
00:10:31,645 --> 00:10:33,437
don't utilize the full 25

267
00:10:33,437 --> 00:10:36,250
gigabit per second because

268
00:10:36,250 --> 00:10:38,645
it's kind of unearthing more so than

269
00:10:38,645 --> 00:10:40,375
anything else. In the

270
00:10:40,375 --> 00:10:42,000
office, we also have Init7. And

271
00:10:42,000 --> 00:10:44,375
we do have 25. And there, sometimes we do

272
00:10:44,375 --> 00:10:46,458
exercise the full 25.

273
00:10:46,458 --> 00:10:48,416
But it's great. It's great.

274
00:10:48,416 --> 00:10:49,604
Cannot complain.

275
00:10:49,604 --> 00:10:51,145
That's amazing.

276
00:10:51,145 --> 00:10:52,875
I can show you a quick demo if you want.

277
00:10:52,875 --> 00:10:54,937
Yes, please. Let's see it. We have a

278
00:10:54,937 --> 00:10:57,104
Chrome window here. Let's go to fast.com.

279
00:10:59,229 --> 00:11:03,312
Oh, wow. That's not real. It's a bit

280
00:11:03,312 --> 00:11:04,729
slow. Yeah, it is a bit

281
00:11:04,729 --> 00:11:06,583
slow. No, that can't be right.

282
00:11:07,187 --> 00:11:09,375
Can you try speedtest.net?

283
00:11:09,375 --> 00:11:11,583
I can't believe fast.com.

284
00:11:12,729 --> 00:11:20,458
Oh, wow. 3.5 gigabits

285
00:11:20,458 --> 00:11:24,312
per second. Yeah. Oh, wow.

286
00:11:24,312 --> 00:11:25,291
So, a couple of things are

287
00:11:25,291 --> 00:11:27,062
happening here. So, this is a

288
00:11:27,062 --> 00:11:28,708
I'm on my Mac Studio. It has

289
00:11:28,708 --> 00:11:30,479
a 10 gig ethernet connection. Then it

291
00:11:32,145 --> 00:11:34,000
goes over to an ethernet 10

292
00:11:34,000 --> 00:11:36,375
gig switch as well. And then I go

293
00:11:36,375 --> 00:11:41,312
to our router that has a 25 gig port. But

294
00:11:41,312 --> 00:11:45,645
I actually, because I've done a few

295
00:11:45,645 --> 00:11:46,937
changes, I used to have

296
00:11:48,500 --> 00:11:50,375
So, a part of my infrastructure at home is

297
00:11:50,375 --> 00:11:52,145
fiber, set up by me,

298
00:11:52,145 --> 00:11:55,541
and I used to have it like

299
00:11:56,270 --> 00:11:59,520
I had a couple of racks downstairs and I

300
00:11:59,520 --> 00:12:00,979
had it kind of connected

301
00:12:00,979 --> 00:12:02,875
down with fiber. And I think I

302
00:12:02,875 --> 00:12:05,125
damaged one of the fibers. So, I think

303
00:12:05,125 --> 00:12:06,645
there is some loss. I haven't

304
00:12:06,645 --> 00:12:09,145
measured it because this used

305
00:12:09,145 --> 00:12:12,666
to be I used to be able to go to eight on

306
00:12:12,666 --> 00:12:13,708
my Mac. But so, I think

307
00:12:13,708 --> 00:12:14,854
there's actually a constraint

308
00:12:15,312 --> 00:12:17,166
now that the signal is not as good. And I

309
00:12:17,166 --> 00:12:18,083
haven't checked this.

310
00:12:18,083 --> 00:12:20,166
Yeah, it's too slow. Right. 3.5 gigs

311
00:12:20,208 --> 00:12:24,250
is too slow. I love that. Like only a

312
00:12:24,250 --> 00:12:25,270
nerd would say that like,

313
00:12:25,270 --> 00:12:27,020
Hey, I'm like pushing almost four

314
00:12:27,062 --> 00:12:29,937
gigs per second, both up and down, but

315
00:12:29,937 --> 00:12:31,541
it's too slow. This could

316
00:12:31,541 --> 00:12:33,895
go faster. That's amazing.

317
00:12:33,895 --> 00:12:35,354
Sometimes you want to upload something

318
00:12:35,354 --> 00:12:37,645
and it's... Right. Well, I

319
00:12:37,645 --> 00:12:38,875
think the problem that you will

320
00:12:38,875 --> 00:12:41,187
see with this, and I'm sure you have hit

321
00:12:41,187 --> 00:12:42,854
it a couple of times, the

322
00:12:42,854 --> 00:12:43,833
whatever you're wherever

323
00:12:43,833 --> 00:12:45,895
you're uploading to, sometimes they can't

324
00:12:45,895 --> 00:12:47,062
accept more than one

325
00:12:47,062 --> 00:12:48,666
gigabit per second. So, sometimes

326
00:12:48,770 --> 00:12:50,687
they're limited, you know, on their end,

327
00:12:50,687 --> 00:12:51,562
because they don't expect

328
00:12:51,562 --> 00:12:53,229
users to have this type of

329
00:12:53,229 --> 00:12:55,645
setup. But that's very nice. Very, very

330
00:12:55,645 --> 00:12:58,687
nice. Okay. Where I really have seen so

331
00:12:58,687 --> 00:12:59,750
nowadays, I don't have

332
00:13:01,062 --> 00:13:03,437
for many years, I haven't really played

333
00:13:03,437 --> 00:13:05,020
any games.

334
00:13:05,020 --> 00:13:06,604
Many years ago, I used to play

335
00:13:06,604 --> 00:13:07,458
quite a bit of Blizzard

336
00:13:07,458 --> 00:13:09,520
games, so World of Warcraft,

337
00:13:09,520 --> 00:13:12,229
StarCraft, and they have an installer

338
00:13:12,229 --> 00:13:14,854
that internally uses,

339
00:13:14,854 --> 00:13:16,166
it might even be BitTorrent,

340
00:13:16,166 --> 00:13:18,395
but something like BitTorrent, at least.

341
00:13:18,395 --> 00:13:19,416
So you can really get like

342
00:13:19,416 --> 00:13:22,041
multi stream. Yeah. And that's

343
00:13:22,041 --> 00:13:24,666
just incredible. Like you can easily use

344
00:13:24,666 --> 00:13:25,937
your whole link because

345
00:13:25,937 --> 00:13:27,395
you're just able to pull from

346
00:13:27,395 --> 00:13:30,104
multiple sources. So for things like

347
00:13:30,104 --> 00:13:31,833
that, it's, it's really you

348
00:13:31,833 --> 00:13:33,000
can really tell a difference.

349
00:13:33,479 --> 00:13:35,437
So how does like all this love that you

350
00:13:35,437 --> 00:13:37,187
have for infrastructure

351
00:13:37,187 --> 00:13:40,083
for networks for, you know,

352
00:13:40,083 --> 00:13:44,208
fast things translate to Namespace? We

353
00:13:44,208 --> 00:13:46,479
first and foremost, build

354
00:13:46,479 --> 00:13:48,583
something for ourselves. Well,

355
00:13:48,583 --> 00:13:50,437
the origin of Namespace Labs, and the

356
00:13:50,437 --> 00:13:51,937
name is that we were going

357
00:13:51,937 --> 00:13:53,562
to build an infrastructure

358
00:13:53,562 --> 00:13:57,312
company that focuses on software defined

359
00:13:57,312 --> 00:13:59,833
storage, because it was kind

360
00:13:59,833 --> 00:14:01,812
of a big thing that both me

361
00:14:01,812 --> 00:14:04,791
and another person that is not here, HDR,

362
00:14:04,791 --> 00:14:06,479
have a passion for. But

363
00:14:06,479 --> 00:14:08,104
as we were building it out,

364
00:14:08,104 --> 00:14:09,604
we kind of found a few

365
00:14:09,604 --> 00:14:10,791
challenges along the way.

366
00:14:10,791 --> 00:14:13,583
And then we moved over to build

367
00:14:13,583 --> 00:14:16,250
an application platform. And as we were

368
00:14:16,250 --> 00:14:18,895
doing that, we wanted to

369
00:14:18,895 --> 00:14:22,000
run a lot of tests in parallel

370
00:14:22,020 --> 00:14:23,687
very quickly, because we didn't want to

371
00:14:23,687 --> 00:14:25,916
wait minutes for an EKS

372
00:14:25,916 --> 00:14:28,020
cluster to be created, or even

373
00:14:28,020 --> 00:14:30,312
a GKE cluster to be created, a

374
00:14:30,312 --> 00:14:31,250
Kubernetes cluster.

375
00:14:31,250 --> 00:14:31,854
I put together

376
00:14:31,854 --> 00:14:34,270
something that kind of cut through all

377
00:14:34,270 --> 00:14:36,625
the layers and just focused on the

378
00:14:36,625 --> 00:14:38,375
essentials to start a

379
00:14:38,375 --> 00:14:40,020
Kubernetes cluster really, really

380
00:14:40,020 --> 00:14:42,541
quickly, because we wanted to run many of

381
00:14:42,541 --> 00:14:44,187
them in full isolation

382
00:14:44,187 --> 00:14:46,250
to test Foundation, to test this

383
00:14:46,250 --> 00:14:48,854
application platform. So

384
00:14:48,854 --> 00:14:50,270
that was the genesis. And it was

385
00:14:50,270 --> 00:14:52,875
really for us, because we are developers

386
00:14:52,875 --> 00:14:53,979
ourselves, actually,

387
00:14:53,979 --> 00:14:55,625
majority of the company

388
00:14:55,625 --> 00:14:59,062
is engineers, and we have an appreciation

389
00:14:59,062 --> 00:15:01,479
for infrastructure

390
00:15:01,479 --> 00:15:03,229
that works well, that is

391
00:15:03,229 --> 00:15:06,145
understandable, and that is fast.

392
00:15:07,395 --> 00:15:08,854
So that's something that we try to

393
00:15:08,854 --> 00:15:09,854
project into the products

394
00:15:09,854 --> 00:15:12,520
that we build. And many of the

395
00:15:12,520 --> 00:15:16,354
things that we do at Namespace, one of

396
00:15:16,354 --> 00:15:17,541
our product principles

397
00:15:17,541 --> 00:15:21,812
is "fast is a feature." So we try to spend

398
00:15:21,812 --> 00:15:23,437
quite a bit of energy on

399
00:15:23,437 --> 00:15:27,000
making things as fast as possible.

400
00:15:27,625 --> 00:15:30,812
Yeah. When you say Kubernetes clusters

401
00:15:30,812 --> 00:15:33,562
that spin up fast, or very fast, what

402
00:15:33,562 --> 00:15:35,458
does that mean to you, very fast?

403
00:15:35,458 --> 00:15:37,895
It had to be seconds, like that's what

404
00:15:37,895 --> 00:15:39,500
made sense. But it wasn't

405
00:15:39,500 --> 00:15:42,958
just a bullish, a we need to,

406
00:15:42,958 --> 00:15:45,208
this should be seconds, but it came from

407
00:15:45,208 --> 00:15:47,958
the source of, can we know

408
00:15:47,958 --> 00:15:50,354
how things work? So we know

409
00:15:50,354 --> 00:15:54,395
how long Linux takes to boot up like the

410
00:15:54,395 --> 00:15:55,479
kernel to start, we know

411
00:15:55,479 --> 00:15:57,833
how long it takes to scan

412
00:15:57,833 --> 00:15:59,770
devices, we know how long it takes to

413
00:15:59,770 --> 00:16:01,854
mount a file system, we know how long it

414
00:16:01,854 --> 00:16:02,979
takes to start a process.

415
00:16:02,979 --> 00:16:05,479
We know, you know, if you kind of add all

416
00:16:05,479 --> 00:16:07,541
of those things up, you get to a point

417
00:16:07,541 --> 00:16:09,125
where you start

418
00:16:09,125 --> 00:16:11,625
questioning, why does it take minutes?

419
00:16:11,625 --> 00:16:12,416
And so there's kind of

420
00:16:12,416 --> 00:16:14,750
inefficiencies in the system.

421
00:16:14,750 --> 00:16:17,645
And even today in like the Kubernetes API

422
00:16:17,645 --> 00:16:19,354
server is fantastic, but

423
00:16:19,354 --> 00:16:22,145
it has a few things built in

424
00:16:22,187 --> 00:16:25,875
that are not kind of level-triggered. So

425
00:16:25,875 --> 00:16:26,666
there's kind of waiting

426
00:16:26,666 --> 00:16:30,020
periods, even we wanted to make

427
00:16:30,020 --> 00:16:31,958
it faster. But even to make it even

428
00:16:31,958 --> 00:16:33,166
further fast, we would

429
00:16:33,166 --> 00:16:35,062
have to go and change the

430
00:16:35,062 --> 00:16:37,166
implementation. So it really came down to

431
00:16:37,166 --> 00:16:39,416
how fast we think this

432
00:16:39,416 --> 00:16:42,020
should be. And I did some kind of

433
00:16:42,020 --> 00:16:44,854
back of the envelope kind of calculations.

434
00:16:44,854 --> 00:16:45,645
And I said it shouldn't

435
00:16:45,645 --> 00:16:47,187
take more than 10 seconds to

436
00:16:47,187 --> 00:16:52,437
start a single node Kubernetes cluster.

437
00:16:52,437 --> 00:16:54,770
So that was the starting point.

438
00:16:54,770 --> 00:16:57,375
Creating Kubernetes clusters

439
00:16:57,375 --> 00:17:01,416
from scratch, fully isolated, so not, you

440
00:17:01,416 --> 00:17:02,979
know, a pod running in

441
00:17:02,979 --> 00:17:04,541
another Kubernetes cluster,

442
00:17:04,541 --> 00:17:06,645
but rather like a virtual machine where

443
00:17:06,645 --> 00:17:08,562
you have access to, to the

444
00:17:08,562 --> 00:17:11,437
kernel, you, your own kernel,

445
00:17:11,437 --> 00:17:14,270
you can, you have your own rootfs, so you

446
00:17:14,270 --> 00:17:15,208
can decide what gets

447
00:17:15,208 --> 00:17:17,187
packaged into it. And then that

448
00:17:17,187 --> 00:17:19,229
allowed us to start kind of running more

449
00:17:19,229 --> 00:17:23,479
tests and both faster, but the

450
00:17:23,479 --> 00:17:24,541
main thing was the fan

451
00:17:24,541 --> 00:17:26,687
out, we wanted to run many in parallel.

452
00:17:26,687 --> 00:17:28,541
Yeah. So just to have a better

453
00:17:28,541 --> 00:17:30,208
understanding of the scale

454
00:17:30,208 --> 00:17:32,083
that we're talking about, and I'm just

455
00:17:32,083 --> 00:17:33,333
looking for a magnitude,

456
00:17:33,333 --> 00:17:35,041
are we talking thousands of

457
00:17:35,041 --> 00:17:37,916
Kubernetes clusters? Are we talking 10s

458
00:17:37,916 --> 00:17:40,541
of 1000s, hundreds of 1000s?

459
00:17:40,541 --> 00:17:41,916
Like, how much are we talking

460
00:17:41,916 --> 00:17:43,687
about in like, what period of time as

461
00:17:43,687 --> 00:17:45,937
well, just for listeners to have an

462
00:17:45,937 --> 00:17:48,291
appreciation of the scale

463
00:17:48,291 --> 00:17:49,812
that this operates at?

464
00:17:49,812 --> 00:17:52,125
Thinking about Namespace, we do many

465
00:17:52,125 --> 00:17:54,625
millions of runs over

466
00:17:54,625 --> 00:17:58,395
a short period of time. So that's kind of

467
00:17:58,395 --> 00:17:59,750
the scale that we're operating.

468
00:17:59,750 --> 00:18:00,916
And every single

469
00:18:00,916 --> 00:18:03,312
instance is fully unique. So it's

470
00:18:03,312 --> 00:18:04,854
completely new virtual machine.

471
00:18:05,708 --> 00:18:07,395
Everything gets started from

472
00:18:07,395 --> 00:18:09,375
scratch, the network gets programmed

473
00:18:09,375 --> 00:18:10,375
dynamically for that

474
00:18:10,375 --> 00:18:11,479
instance, there's like

475
00:18:11,479 --> 00:18:13,604
everything is from scratch. But when we

476
00:18:13,604 --> 00:18:15,270
started, like our target

477
00:18:15,270 --> 00:18:18,145
was to run like 100 Kubernetes

478
00:18:18,145 --> 00:18:22,041
clusters in parallel. So you can see that

479
00:18:22,041 --> 00:18:24,687
we had humble starts, and

480
00:18:24,687 --> 00:18:27,166
now we have customers that

482
00:18:27,791 --> 00:18:29,166
start a high magnitude of

483
00:18:29,166 --> 00:18:30,375
concurrent jobs. And

484
00:18:30,375 --> 00:18:31,416
that's what that's even

485
00:18:31,416 --> 00:18:34,020
one of our biggest challenges nowadays is

486
00:18:34,020 --> 00:18:37,729
supporting that type

487
00:18:37,729 --> 00:18:39,520
of performance. So very

488
00:18:39,520 --> 00:18:42,416
low latency creation at a very high

489
00:18:42,416 --> 00:18:45,729
concurrency, we have tenants. So that's

490
00:18:45,729 --> 00:18:46,875
kind of the unit in

491
00:18:46,875 --> 00:18:50,437
our system, that create 1000s of

492
00:18:50,437 --> 00:18:53,062
jobs in an extremely small

493
00:18:53,062 --> 00:18:55,562
period of time. And those

494
00:18:55,562 --> 00:18:58,333
run over many, many, many machines. But

495
00:18:58,333 --> 00:18:59,916
even today, if you go to a

496
00:18:59,916 --> 00:19:01,062
Kubernetes cluster, and you

497
00:19:01,062 --> 00:19:04,437
start the 1000 pods, which is kind of the

498
00:19:04,437 --> 00:19:06,520
quick, if you define some

499
00:19:06,520 --> 00:19:07,770
kind of equivalence with

500
00:19:07,770 --> 00:19:10,395
another system, you'll see how long it

501
00:19:10,395 --> 00:19:12,562
takes for those pods to be created.

502
00:19:12,562 --> 00:19:14,937
Because first, you need

503
00:19:14,937 --> 00:19:17,520
to commit the state, you need to kind of

504
00:19:17,520 --> 00:19:18,833
the scheduler needs to

505
00:19:18,833 --> 00:19:19,875
decide on which machine they're

506
00:19:19,875 --> 00:19:22,750
going to run, then the machine needs, you

507
00:19:22,750 --> 00:19:23,750
need to have IPAM. So you

508
00:19:23,750 --> 00:19:24,916
need to have like an IP address

509
00:19:24,916 --> 00:19:26,979
assigned to the pod. So there's kind of

510
00:19:26,979 --> 00:19:28,125
many things that need to

511
00:19:28,125 --> 00:19:30,437
happen. And as you scale out

512
00:19:30,437 --> 00:19:33,312
the concurrency, you hit serialization

513
00:19:33,312 --> 00:19:34,708
limits, because some of

514
00:19:34,708 --> 00:19:36,854
these they need to be, you need

515
00:19:36,854 --> 00:19:38,187
to have like a consistent view of the

516
00:19:38,187 --> 00:19:39,479
universe to be able to make

517
00:19:39,479 --> 00:19:40,687
a decision, like you cannot

518
00:19:40,687 --> 00:19:43,187
assign the same IP address to two pods.

519
00:19:43,187 --> 00:19:44,270
So you need to have some sort of

520
00:19:44,270 --> 00:19:45,729
serialization. Yeah.

521
00:19:46,062 --> 00:19:47,937
And so that's kind of the types of

522
00:19:47,937 --> 00:19:50,166
challenges that we're tackling today.

523
00:19:50,937 --> 00:19:51,916
Because when you have

524
00:19:52,833 --> 00:19:54,666
a little bit of an aside,

525
00:19:54,666 --> 00:19:56,187
but when you have a natural

526
00:19:58,104 --> 00:20:00,291
partitioning scheme, like two customers,

527
00:20:00,291 --> 00:20:02,354
for example, scaling

528
00:20:02,354 --> 00:20:05,020
across customers is a little bit easier,

529
00:20:05,020 --> 00:20:05,875
because you can

530
00:20:05,875 --> 00:20:07,083
partition your infrastructure.

531
00:20:08,000 --> 00:20:11,125
But when you go inside one customer, then

532
00:20:11,125 --> 00:20:13,583
things start to become a

533
00:20:13,583 --> 00:20:15,083
little bit more challenging.

534
00:20:15,083 --> 00:20:16,916
And that's those types of kind of scaling

535
00:20:16,916 --> 00:20:18,416
challenges that we have today.

536
00:20:18,416 --> 00:20:19,187
Yeah, especially the big

537
00:20:19,187 --> 00:20:20,583
customers that you mentioned that

538
00:20:22,000 --> 00:20:25,333
start a lot of jobs at once. And a job,

539
00:20:25,333 --> 00:20:26,354
what does a job mean?

540
00:20:26,354 --> 00:20:27,625
Like, what does it translate to

541
00:20:27,625 --> 00:20:29,270
in infrastructure terms? Are we talking

542
00:20:29,270 --> 00:20:32,625
containers, virtual machines, how many

543
00:20:32,625 --> 00:20:33,916
CPUs, how much memory,

544
00:20:33,916 --> 00:20:35,770
like, what does the job look like? The

545
00:20:35,770 --> 00:20:38,625
unit of compute in our

546
00:20:38,625 --> 00:20:41,687
world is an instance. But that

547
00:20:41,708 --> 00:20:44,958
instance is a combination of a virtual

548
00:20:44,958 --> 00:20:48,645
machine. So you get full access to the

549
00:20:48,645 --> 00:20:50,687
kernel and everything

550
00:20:50,687 --> 00:20:52,645
in that virtual machine. But it's an

552
00:20:52,645 --> 00:20:54,854
environment that is designed to run

553
00:20:54,854 --> 00:20:57,041
containers. So it's not

554
00:20:57,041 --> 00:20:59,187
you don't get an Ubuntu virtual machine,

555
00:20:59,187 --> 00:21:00,750
and then you go and you

556
00:21:00,750 --> 00:21:03,500
know, deploy systemd units,

557
00:21:03,500 --> 00:21:05,312
that's not how we think about the

558
00:21:05,312 --> 00:21:07,458
problem. We approach it

559
00:21:07,458 --> 00:21:10,625
from we use containers as a

560
00:21:10,625 --> 00:21:13,104
distribution mechanism. So you define

561
00:21:13,104 --> 00:21:14,458
your application,

562
00:21:14,458 --> 00:21:16,395
whatever you want to run, you

563
00:21:16,395 --> 00:21:18,437
encapsulate that in a container because

564
00:21:18,437 --> 00:21:19,583
it has all of the software

565
00:21:19,583 --> 00:21:21,166
that you need. It also tells

566
00:21:21,166 --> 00:21:22,979
us how to start it, has a few other

568
00:21:22,979 --> 00:21:25,333
properties. And we place that

570
00:21:25,333 --> 00:21:27,875
container or multiple containers

571
00:21:27,875 --> 00:21:30,666
in a virtual machine that can use

573
00:21:30,666 --> 00:21:32,062
an arbitrary set of

574
00:21:32,062 --> 00:21:33,770
resources. So you can decide

575
00:21:33,770 --> 00:21:40,125
whether it uses two or 16 CPUs, or

576
00:21:40,125 --> 00:21:43,541
whether it uses, you know, two or 256

577
00:21:43,541 --> 00:21:45,520
gigs of RAM. So you have like

578
00:21:45,520 --> 00:21:49,520
full flexibility on that. And then also

580
00:21:49,520 --> 00:21:50,104
from a network

581
00:21:50,104 --> 00:21:52,520
perspective, like if you want to

582
00:21:52,520 --> 00:21:55,145
interact with whatever is running in that

583
00:21:55,145 --> 00:21:56,770
instance, you get a few

584
00:21:56,770 --> 00:21:58,208
management properties out

585
00:21:58,208 --> 00:22:00,979
of the box, like you can SSH in, you

586
00:22:00,979 --> 00:22:02,458
don't need to configure anything.

587
00:22:02,458 --> 00:22:05,020
But if you want to access

588
00:22:05,020 --> 00:22:07,875
the service that you have, then you also

589
00:22:07,875 --> 00:22:10,708
have primitives for that, to

590
00:22:10,708 --> 00:22:12,250
kind of program our ingress.

592
00:22:12,250 --> 00:22:14,333
We say jobs, because we kind

593
00:22:14,770 --> 00:22:16,750
of approach the problem in a layered way.

594
00:22:17,812 --> 00:22:20,645
We think of the compute platform,

595
00:22:20,645 --> 00:22:21,958
which is a little bit more

596
00:22:21,958 --> 00:22:24,479
generic, as one thing, and then

597
00:22:24,479 --> 00:22:28,666
applications built on top as

598
00:22:28,666 --> 00:22:29,458
something separate. And

599
00:22:29,458 --> 00:22:31,020
a lot of our customers,

600
00:22:31,020 --> 00:22:35,104
they use Namespace to run jobs. And so

601
00:22:35,104 --> 00:22:37,729
those jobs are usually something that

602
00:22:37,729 --> 00:22:39,145
starts, has a purpose,

603
00:22:39,145 --> 00:22:41,041
wants to go really fast, that's usually

604
00:22:41,041 --> 00:22:42,875
the case. And then it ends.

605
00:22:42,875 --> 00:22:45,062
And it could be a GitHub job,

606
00:22:45,062 --> 00:22:47,104
it could be a Buildkite job, it could

607
00:22:47,104 --> 00:22:49,875
be a GitLab job, it could be a CircleCI

608
00:22:49,875 --> 00:22:51,062
job. But it can also

610
00:22:51,062 --> 00:22:54,145
be your custom job, like that you want to

612
00:22:54,145 --> 00:22:56,562
run a system test. So for

613
00:22:56,562 --> 00:22:58,291
example, we have customers that

614
00:22:58,291 --> 00:23:01,041
deploy system tests on instances.

615
00:23:01,041 --> 00:23:02,750
And they can rely

616
00:23:02,750 --> 00:23:04,979
on something that scales out without

617
00:23:04,979 --> 00:23:06,291
being constrained by

618
00:23:06,291 --> 00:23:08,479
whatever resources that they

619
00:23:08,479 --> 00:23:10,854
have available in the job where they

620
00:23:10,854 --> 00:23:13,020
started. I think

621
00:23:13,020 --> 00:23:17,041
how people deal with adversity

622
00:23:17,041 --> 00:23:21,791
is very telling. When something fails,

623
00:23:21,791 --> 00:23:23,666
especially when it fails,

624
00:23:23,666 --> 00:23:25,125
how you handle that tells

625
00:23:25,125 --> 00:23:28,520
everything about you at many levels, as

626
00:23:28,520 --> 00:23:29,854
an individual, as a

627
00:23:29,854 --> 00:23:31,312
team, as a company.

628
00:23:31,312 --> 00:23:32,958
And the reason why I say that is because I know

629
00:23:32,958 --> 00:23:34,187
that you had a major outage

630
00:23:34,187 --> 00:23:37,500
this year. And it was one of

631
00:23:37,541 --> 00:23:40,312
the things that you don't expect will

632
00:23:40,312 --> 00:23:42,958
happen. You prepare for it. And when it

633
00:23:42,958 --> 00:23:43,937
happens, you're like,

634
00:23:44,083 --> 00:23:46,583
wow, I'm so glad we had some

635
00:23:46,583 --> 00:23:48,333
preparations. But it's very difficult to

636
00:23:48,333 --> 00:23:49,395
simulate that. It's very

637
00:23:49,395 --> 00:23:52,812
difficult to fire drill that.

638
00:23:52,812 --> 00:23:55,229
It's really, really hard. So can

639
00:23:55,229 --> 00:23:56,312
you tell us more about what

640
00:23:56,312 --> 00:23:58,937
happened? And how did you respond?

641
00:23:58,937 --> 00:24:01,062
We've been running Namespace now

642
00:24:01,062 --> 00:24:02,895
for some time, so for

643
00:24:03,125 --> 00:24:08,166
close to two years. And we've had our

644
00:24:08,166 --> 00:24:10,000
challenges along the way. But

645
00:24:10,000 --> 00:24:13,208
nothing as big as this more

646
00:24:13,208 --> 00:24:17,270
recent outage. A couple interesting

647
00:24:17,270 --> 00:24:21,125
things there. Like, we had two issues

648
00:24:21,125 --> 00:24:24,375
that happened at the same time.

649
00:24:24,375 --> 00:24:27,000
And I'm lucky that our team is

650
00:24:27,000 --> 00:24:30,375
experienced, and we've operated and kind

652
00:24:30,375 --> 00:24:32,458
of supported and built

653
00:24:32,458 --> 00:24:33,958
large scale systems over

654
00:24:33,958 --> 00:24:35,541
our years before

655
00:24:35,541 --> 00:24:37,354
Namespace. So that gives us a little

656
00:24:37,354 --> 00:24:39,041
bit of preparation, like

657
00:24:39,041 --> 00:24:40,812
how can things fail? That's

658
00:24:40,812 --> 00:24:42,458
very often when we approach

659
00:24:42,458 --> 00:24:44,895
building something. It's not just a

660
00:24:44,895 --> 00:24:46,833
functionality that it has, but

661
00:24:46,833 --> 00:24:48,083
something that is part of our

663
00:24:48,083 --> 00:24:49,937
conversation is what are the failure

664
00:24:49,937 --> 00:24:52,229
modes? What if you have an

665
00:24:52,229 --> 00:24:53,583
application, it's stateless,

666
00:24:53,583 --> 00:24:55,687
but it pushes some state into some

667
00:24:55,687 --> 00:24:57,166
database? Well, what happens

668
00:24:57,166 --> 00:24:58,416
if you don't have access to that

669
00:24:58,416 --> 00:25:00,833
database? What happens if you have

670
00:25:00,833 --> 00:25:01,979
multiple requests going

671
00:25:01,979 --> 00:25:05,104
concurrently? And you compromise

672
00:25:05,104 --> 00:25:07,500
on the serializability of your

673
00:25:07,500 --> 00:25:09,770
transactions? Like, how does your

674
00:25:09,770 --> 00:25:11,583
application react to

675
00:25:11,583 --> 00:25:14,000
potential inconsistent states that you

676
00:25:14,000 --> 00:25:15,208
had to do for other reasons?

677
00:25:15,375 --> 00:25:17,000
So we try to incorporate as

678
00:25:17,000 --> 00:25:20,145
much as possible, like a failure mode

679
00:25:20,145 --> 00:25:21,416
into how we approach

680
00:25:21,416 --> 00:25:24,145
features. This big outage that we had,

681
00:25:24,145 --> 00:25:26,750
it was kind of a combination of two

682
00:25:26,750 --> 00:25:29,270
things. Namespace, when

683
00:25:29,270 --> 00:25:32,854
we started, we used

684
00:25:32,854 --> 00:25:37,479
exclusively hardware provided by others.

686
00:25:37,479 --> 00:25:38,395
So actually, we started

687
00:25:38,395 --> 00:25:40,604
with bare metal in AWS,

689
00:25:40,604 --> 00:25:42,833
and then we switched over to Equinix

690
00:25:42,833 --> 00:25:44,208
Metal, or Packet. And then

693
00:25:44,208 --> 00:25:45,291
we kind of worked with other

694
00:25:45,291 --> 00:25:50,458
providers over time. And fairly early on,

695
00:25:50,458 --> 00:25:51,645
it became obvious to us

696
00:25:51,645 --> 00:25:53,687
that in order to offer

697
00:25:53,687 --> 00:25:56,666
a great product that had an emphasis on

698
00:25:56,666 --> 00:25:58,666
performance, we had to

699
00:25:58,666 --> 00:26:00,250
have a lot more control

700
00:26:00,250 --> 00:26:02,854
over the hardware, not just individual

701
00:26:02,854 --> 00:26:04,541
servers, but also the

702
00:26:04,541 --> 00:26:07,145
layout of the rack. So how much

703
00:26:07,145 --> 00:26:09,208
network capacity there is? Do we know

704
00:26:09,208 --> 00:26:11,062
that one compute node is

705
00:26:11,062 --> 00:26:12,458
next to another compute node?

706
00:26:12,458 --> 00:26:15,479
Is it in the same switch or not? So all

707
00:26:15,479 --> 00:26:16,375
of those things started to

708
00:26:16,375 --> 00:26:18,187
play a role in how we approached

710
00:26:19,229 --> 00:26:22,604
our development. And we couldn't find a good

711
00:26:22,604 --> 00:26:25,520
mix that would give us both the global

712
00:26:25,520 --> 00:26:26,791
reach that we needed,

713
00:26:26,791 --> 00:26:28,520
because we have some customers that want

714
00:26:28,520 --> 00:26:30,937
to run workloads in North

715
00:26:30,937 --> 00:26:32,958
America. We have customers that

716
00:26:32,958 --> 00:26:36,458
run workloads in Europe. And we realized,

717
00:26:36,458 --> 00:26:37,208
well, we have to do it

718
00:26:37,208 --> 00:26:40,020
ourselves. So, Namespace deploys

719
00:26:40,020 --> 00:26:42,729
its own hardware, and software stack on

721
00:26:42,729 --> 00:26:44,187
top of that hardware. So

722
00:26:44,187 --> 00:26:46,145
that means we decide everything

723
00:26:46,187 --> 00:26:49,041
from CPU, RAM, how much storage, how much

724
00:26:49,041 --> 00:26:50,958
networking, what's

725
00:26:50,958 --> 00:26:52,375
the layout of the rack,

726
00:26:53,333 --> 00:26:57,250
how do our racks kind of the spine-leaf

727
00:26:57,250 --> 00:26:59,458
setup, how that is, so all

728
00:26:59,458 --> 00:27:00,583
of that is done internally.

729
00:27:01,729 --> 00:27:03,979
And we set ourselves on a journey to kind

730
00:27:03,979 --> 00:27:05,645
of move completely to our

731
00:27:05,645 --> 00:27:07,416
own hardware. And we've been on

732
00:27:07,416 --> 00:27:09,541
a catch up for some time.

733
00:27:09,541 --> 00:27:10,812
We had

734
00:27:10,812 --> 00:27:15,770
in October, a major expansion of one of

735
00:27:15,770 --> 00:27:18,583
our sites coming in, where the

736
00:27:18,583 --> 00:27:20,145
distributor that we work with,

737
00:27:20,729 --> 00:27:24,812
they made a mistake in their order, and

738
00:27:24,812 --> 00:27:26,229
they ordered the wrong

739
00:27:26,229 --> 00:27:28,979
DIMMs for those servers.

740
00:27:30,166 --> 00:27:32,145
And it's a lot of DIMMs. It's not just,

741
00:27:32,145 --> 00:27:34,520
you know, 20 DIMMs or 30 DIMMs that you

742
00:27:34,520 --> 00:27:36,395
can go to a shop and

743
00:27:36,395 --> 00:27:39,416
get. It's actually so many that they had

744
00:27:39,416 --> 00:27:40,958
to go and order directly

745
00:27:40,958 --> 00:27:42,729
from the source. And that added

746
00:27:42,729 --> 00:27:48,708
three weeks more to that delivery. We

747
00:27:48,708 --> 00:27:49,687
were counting on that

748
00:27:49,687 --> 00:27:51,791
hardware, because we knew that

749
00:27:51,791 --> 00:27:54,833
we were already running quite hot. So

750
00:27:54,833 --> 00:27:57,458
quite hot, as in like our utilization is

751
00:27:57,458 --> 00:27:59,583
high. So part of the

752
00:27:59,583 --> 00:28:00,895
reason why we were okay with that is

753
00:28:00,895 --> 00:28:02,000
because we have tools that

754
00:28:02,000 --> 00:28:05,083
allow us to manage utilization

755
00:28:05,083 --> 00:28:08,833
across sites. We can run

756
00:28:08,833 --> 00:28:09,770
continuous optimizations

757
00:28:09,770 --> 00:28:13,291
where we try to maintain

758
00:28:13,291 --> 00:28:15,958
each site kind of hot enough, but not

759
00:28:15,958 --> 00:28:17,666
more than that. So we can

761
00:28:17,666 --> 00:28:19,708
kind of move things around.

762
00:28:19,708 --> 00:28:22,645
But globally, because of that missed

763
00:28:22,645 --> 00:28:25,937
delivery, we were running quite hot. At

764
00:28:25,937 --> 00:28:28,687
the same time, one of our

765
00:28:28,687 --> 00:28:31,833
existing deployments in a company that

766
00:28:31,833 --> 00:28:33,187
offers kind of bare metal

767
00:28:33,187 --> 00:28:37,083
that we used, they started having

768
00:28:37,083 --> 00:28:40,479
an issue in their network product, which

769
00:28:40,479 --> 00:28:42,458
we use to connect multiple servers

770
00:28:42,458 --> 00:28:43,895
together into a single

771
00:28:43,895 --> 00:28:47,104
layer two segment, where it led to

772
00:28:47,104 --> 00:28:53,000
sporadic packet loss. And at first, well,

773
00:28:53,000 --> 00:28:54,333
the internet is built on

774
00:28:54,333 --> 00:28:56,833
sporadic packet loss, so things just kind

775
00:28:56,833 --> 00:28:59,312
of work. But as that became

776
00:28:59,312 --> 00:29:01,145
worse over time, it was so bad

777
00:29:01,166 --> 00:29:03,958
that it had a real impact on our

778
00:29:03,958 --> 00:29:07,875
customers. And we interacted with that

780
00:29:07,875 --> 00:29:10,458
vendor and for various

781
00:29:10,458 --> 00:29:12,541
reasons, they kind of acknowledged, but

782
00:29:12,541 --> 00:29:14,791
they didn't react quickly enough to the

783
00:29:14,791 --> 00:29:16,708
problem. So we decided

784
00:29:16,750 --> 00:29:21,229
that that wasn't acceptable, the level of

785
00:29:21,229 --> 00:29:22,625
service that we're offering to our

786
00:29:22,625 --> 00:29:23,895
customers, the fact that

787
00:29:23,895 --> 00:29:28,875
we were a source of flakes, because of

788
00:29:28,875 --> 00:29:30,354
that kind of random packet

789
00:29:30,354 --> 00:29:32,520
loss, it was not acceptable. So

790
00:29:32,520 --> 00:29:36,333
we strategized and we made a decision on

791
00:29:36,770 --> 00:29:38,166
changing our network setup

792
00:29:38,166 --> 00:29:39,729
so that we wouldn't depend on

793
00:29:39,729 --> 00:29:42,354
that particular feature. That meant

794
00:29:42,354 --> 00:29:46,708
though, that we had a dip in our

795
00:29:46,708 --> 00:29:49,062
capacity, because we had to

796
00:29:49,062 --> 00:29:52,125
redeploy that part of our infrastructure.

797
00:29:52,125 --> 00:29:53,770
That has to do with the

798
00:29:53,770 --> 00:29:55,979
fact that we run an immutable,

799
00:29:55,979 --> 00:29:58,083
we try to be very immutable. So as

800
00:29:58,083 --> 00:30:00,854
machines move from one setup to another

801
00:30:00,854 --> 00:30:01,854
setup, they need to be

802
00:30:01,854 --> 00:30:06,312
reset up to get new keys. So there's kind

803
00:30:06,312 --> 00:30:08,479
of something else that

804
00:30:08,479 --> 00:30:09,770
kind of plays a role there.

805
00:30:09,770 --> 00:30:12,791
And that took some time, we had practiced

806
00:30:12,791 --> 00:30:14,541
that, but it took longer

807
00:30:14,541 --> 00:30:16,666
than we anticipated. One of

808
00:30:16,666 --> 00:30:19,312
the challenges was we rely a lot on state

809
00:30:19,312 --> 00:30:20,687
that lives on individual

810
00:30:20,687 --> 00:30:22,604
machines and not on the

811
00:30:22,604 --> 00:30:24,562
networks to enable fast performance,

813
00:30:24,562 --> 00:30:27,062
bootups, etc. And distributing that

814
00:30:27,062 --> 00:30:28,729
state, because we had a much

815
00:30:28,729 --> 00:30:31,395
bigger fleet versus what we had done

816
00:30:31,395 --> 00:30:33,937
before, for that particular region took

817
00:30:33,937 --> 00:30:35,187
longer than we expected.

818
00:30:35,187 --> 00:30:37,541
So it's kind of distributing all of the

819
00:30:37,541 --> 00:30:39,187
state across all machines, it

820
00:30:39,187 --> 00:30:41,000
highlighted a few bottlenecks

821
00:30:41,000 --> 00:30:43,812
that we had. And that build up took some

822
00:30:43,812 --> 00:30:45,375
time. So we were running

823
00:30:45,375 --> 00:30:49,104
really, really hot for some time,

824
00:30:49,104 --> 00:30:53,312
we had part of the team just trying to

825
00:30:53,312 --> 00:30:54,583
support our customers,

826
00:30:54,583 --> 00:30:56,270
making decisions on, okay, we're

827
00:30:56,270 --> 00:30:58,354
going to move this customer to this part,

828
00:30:58,354 --> 00:31:01,250
because now it's actually their peak

829
00:31:01,250 --> 00:31:03,166
time. And we want to

831
00:31:03,166 --> 00:31:04,333
make sure that they get as good an

832
00:31:04,333 --> 00:31:06,000
experience as possible. So

833
00:31:06,000 --> 00:31:07,270
there was kind of part of the team

834
00:31:07,270 --> 00:31:10,500
that was just trying to offer as good of

835
00:31:10,500 --> 00:31:12,104
support to our customers as possible,

836
00:31:12,104 --> 00:31:12,791
where the other part

837
00:31:12,791 --> 00:31:15,395
of the team was just kind of rebuilding

838
00:31:15,395 --> 00:31:20,229
the region. And we did it,

839
00:31:20,229 --> 00:31:22,479
but it was extremely taxing.

840
00:31:22,729 --> 00:31:25,312
It's primarily because we feel such a

841
00:31:25,312 --> 00:31:26,625
strong commitment to the

842
00:31:26,625 --> 00:31:28,770
services that we offer,

843
00:31:28,770 --> 00:31:31,125
because then we've experienced,

844
00:31:31,125 --> 00:31:32,416
there's something that we

845
00:31:32,416 --> 00:31:34,229
depend on as a developer,

846
00:31:34,229 --> 00:31:37,250
and then it's not working. It's just the

847
00:31:37,250 --> 00:31:39,229
worst, right? I cannot do my job.

848
00:31:40,729 --> 00:31:43,500
So I think emotionally, it was extremely

849
00:31:43,500 --> 00:31:45,791
taxing. I look at it a

850
00:31:45,791 --> 00:31:49,312
lot from the human side,

851
00:31:49,312 --> 00:31:51,958
like you're trying to do something that

852
00:31:51,958 --> 00:31:53,854
is, you're trying to do a

853
00:31:53,854 --> 00:31:55,375
great job to your, you're trying

854
00:31:55,375 --> 00:31:56,562
to provide a great service to our

855
00:31:56,562 --> 00:32:01,041
customers, but then we let them down in

856
00:32:01,041 --> 00:32:02,354
that particular moment.

857
00:32:02,354 --> 00:32:03,958
And we tried to be transparent about it.

858
00:32:03,958 --> 00:32:06,041
We wrote a postmortem.

859
00:32:06,041 --> 00:32:08,187
The things that I mentioned

860
00:32:08,187 --> 00:32:12,875
and more are there. We learned a lot in

861
00:32:12,875 --> 00:32:16,875
that experience. And to

862
00:32:16,875 --> 00:32:18,770
be honest with you, we were

863
00:32:18,770 --> 00:32:21,416
expecting that some customers would come

864
00:32:21,416 --> 00:32:22,916
to us and say that this is unacceptable

865
00:32:22,916 --> 00:32:25,541
and we're moving on.

866
00:32:25,541 --> 00:32:30,145
But not a single customer left due to

867
00:32:30,145 --> 00:32:31,041
that outage. And we

868
00:32:31,041 --> 00:32:32,041
actually got a lot of support,

869
00:32:32,041 --> 00:32:34,729
and I have a big appreciation for our

870
00:32:34,729 --> 00:32:36,437
customers. I met one of our

871
00:32:36,437 --> 00:32:38,291
customers in San Francisco

872
00:32:38,291 --> 00:32:40,708
a few weeks after the outage, and they

873
00:32:40,708 --> 00:32:42,583
said, "Yeah, it was 3

874
00:32:42,583 --> 00:32:43,770
p.m., and we decided, 'Okay,

875
00:32:43,770 --> 00:32:47,375
we're going to call it a day now, because

876
00:32:47,375 --> 00:32:50,520
it seems like our jobs are not running.'

877
00:32:51,770 --> 00:32:55,604
But we're so happy with the service that

878
00:32:55,604 --> 00:32:57,104
you folks usually provide

879
00:32:57,104 --> 00:32:59,166
to us that it was one day,

880
00:32:59,166 --> 00:33:01,854
and that's okay." But yeah, it felt very bad.

881
00:33:01,854 --> 00:33:04,375
It wasn't a complete outage, right? It was a

882
00:33:04,375 --> 00:33:06,250
degradation, a significant degradation,

883
00:33:06,250 --> 00:33:08,166
but not every customer was

884
00:33:08,166 --> 00:33:10,104
impacted. So this was limited

885
00:33:10,104 --> 00:33:11,875
to one region. That was the blast radius,

886
00:33:11,875 --> 00:33:12,666
and you have multiple

887
00:33:12,666 --> 00:33:15,020
regions. So that's one. The second

888
00:33:15,020 --> 00:33:16,750
one is that not all customers were

889
00:33:16,750 --> 00:33:19,833
impacted the same amount, right? Because

890
00:33:19,833 --> 00:33:21,750
as this was happening,

891
00:33:21,750 --> 00:33:23,604
you're also moving customers off, which

892
00:33:23,604 --> 00:33:25,333
I, you know, that's

893
00:33:25,333 --> 00:33:26,770
something which I missed. And I will

894
00:33:26,770 --> 00:33:28,541
go back to the post-mortem, by the way, I

895
00:33:28,541 --> 00:33:29,750
will add a link to the show notes.

896
00:33:30,083 --> 00:33:33,166
The way you handled something failing, and

897
00:33:33,166 --> 00:33:35,333
something failing in a very

898
00:33:35,333 --> 00:33:37,375
significant way, right? The whole region

899
00:33:37,375 --> 00:33:40,812
going away, or being unusable.

900
00:33:40,937 --> 00:33:41,770
You were able to

901
00:33:41,770 --> 00:33:44,583
be hands-on, you understood how

902
00:33:44,583 --> 00:33:45,979
all the pieces fit together, which means

903
00:33:45,979 --> 00:33:47,604
that you were able to do

904
00:33:47,604 --> 00:33:49,291
something about it, rather than

905
00:33:49,291 --> 00:33:50,770
putting your hands up in the air and

906
00:33:50,770 --> 00:33:52,437
saying, "Hey, it's the provider, we can't

907
00:33:52,437 --> 00:33:53,250
do anything about that."

908
00:33:53,833 --> 00:33:55,333
Think about what happens when you're in

909
00:33:55,333 --> 00:33:58,479
AWS, or GCP, or Azure, or,

910
00:33:58,479 --> 00:33:59,312
you know, one of those big

911
00:33:59,312 --> 00:34:01,833
providers, what can you do? And you say,

912
00:34:01,833 --> 00:34:03,500
"Well, I'm going to move things off it."

913
00:34:03,500 --> 00:34:04,291
You may have so much

914
00:34:04,291 --> 00:34:06,312
stuff to move off that you can't move it

915
00:34:06,312 --> 00:34:07,520
off, not to mention that if

916
00:34:07,520 --> 00:34:08,937
there's an outage, how are you

917
00:34:08,937 --> 00:34:11,104
going to move stuff off? Especially if

918
00:34:11,104 --> 00:34:12,604
the DNS is there, you can't get to the

919
00:34:12,604 --> 00:34:13,791
DNS, you can't update

920
00:34:13,791 --> 00:34:16,520
it. And this happens, you know, for many

921
00:34:16,520 --> 00:34:18,541
companies, and many, many

922
00:34:18,541 --> 00:34:20,291
businesses. And at the end of the

923
00:34:20,291 --> 00:34:22,875
day, we are humans. The internet does go

924
00:34:22,875 --> 00:34:24,000
down, or at least half of

925
00:34:24,000 --> 00:34:26,166
it. I remember when Fastly went

926
00:34:26,166 --> 00:34:27,916
down, they had an outage, or Cloudflare

927
00:34:27,916 --> 00:34:29,604
went down, or Facebook

928
00:34:29,604 --> 00:34:31,937
starts, you know, the BGP routes get

929
00:34:31,958 --> 00:34:34,229
all messed up. Now that's bad. But in all

930
00:34:34,229 --> 00:34:35,458
of this, there's always

931
00:34:35,458 --> 00:34:36,812
something to learn. There's

932
00:34:36,812 --> 00:34:39,708
always something to improve. And the best

933
00:34:39,708 --> 00:34:41,562
approach is to get

934
00:34:41,562 --> 00:34:42,500
better on the other side.

935
00:34:43,020 --> 00:34:44,583
You mentioned something that I think is

936
00:34:44,583 --> 00:34:48,541
really important that there's a

937
00:34:48,541 --> 00:34:49,562
particular company that

938
00:34:49,625 --> 00:34:52,958
we work with, and they played, like one

939
00:34:52,958 --> 00:34:53,916
of their products wasn't

940
00:34:53,916 --> 00:34:56,645
working to spec. But from my

941
00:34:56,666 --> 00:35:00,354
perspective, that's on us. We decide who

942
00:35:00,354 --> 00:35:01,479
are the companies that we

943
00:35:01,479 --> 00:35:03,437
work with. And I don't even want

944
00:35:03,437 --> 00:35:05,562
to throw them under the bus. Like they I

945
00:35:05,562 --> 00:35:07,333
think, perhaps other

946
00:35:07,333 --> 00:35:08,458
customers didn't have the same

947
00:35:08,458 --> 00:35:09,875
choice. Perhaps it was the way that we

948
00:35:09,875 --> 00:35:10,750
were using their

949
00:35:10,750 --> 00:35:12,708
infrastructure that led to that. And it's

950
00:35:12,708 --> 00:35:15,187
really on us. And obviously, when we work

952
00:35:15,187 --> 00:35:16,166
with someone, and they

953
00:35:16,166 --> 00:35:17,416
can provide us great support

954
00:35:17,416 --> 00:35:19,166
that helps us get the resolution faster.

955
00:35:19,916 --> 00:35:22,166
Great. But in that case,

956
00:35:22,166 --> 00:35:24,229
it was I felt, and the whole

957
00:35:24,229 --> 00:35:26,791
team felt like a commitment to our

958
00:35:26,791 --> 00:35:27,687
customers. And it doesn't

959
00:35:27,687 --> 00:35:29,645
matter if it's, if it was like a

960
00:35:29,645 --> 00:35:32,416
delayed delivery, or if it was a

961
00:35:32,416 --> 00:35:34,187
particular upstream, or if it's a

962
00:35:34,187 --> 00:35:35,625
particular provider,

963
00:35:35,625 --> 00:35:39,020
that's really on us. And that's we felt

964
00:35:39,020 --> 00:35:40,791
that, okay, we need to do something,

965
00:35:40,791 --> 00:35:42,333
we're not just going to say,

966
00:35:42,333 --> 00:35:44,958
you know, this is not usable. We

967
00:35:44,958 --> 00:35:47,020
need to do something to get back to

968
00:35:47,020 --> 00:35:49,562
service to our customers.

969
00:35:49,562 --> 00:35:51,833
I think a lot of props should go to the

970
00:35:51,833 --> 00:35:52,541
team. Obviously, I kind of

971
00:35:52,541 --> 00:35:54,666
team. But I think our

972
00:35:54,666 --> 00:35:56,125
ability to handle some of these

973
00:35:56,125 --> 00:35:58,979
situations is, if you

974
00:35:58,979 --> 00:36:01,166
find yourself in a situation and it's

975
00:36:01,166 --> 00:36:02,625
new, and you're not prepared,

976
00:36:02,625 --> 00:36:03,895
it's going to be much harder.

977
00:36:04,520 --> 00:36:07,895
So the more you prepare both, hey, this

978
00:36:07,895 --> 00:36:09,708
disaster scenario is

979
00:36:09,708 --> 00:36:13,083
possible. And maybe even just maybe

980
00:36:13,083 --> 00:36:14,729
you don't even run an exercise, maybe you

981
00:36:14,729 --> 00:36:16,416
just talk about, here's

982
00:36:16,416 --> 00:36:17,770
what we would do so that you

983
00:36:17,770 --> 00:36:20,770
just have a shared understanding of what

984
00:36:20,770 --> 00:36:21,854
are the tools that would be

985
00:36:21,854 --> 00:36:23,229
available to us if something

986
00:36:23,229 --> 00:36:26,333
like that happens. I think that's already

987
00:36:26,333 --> 00:36:27,104
the first level of

988
00:36:27,104 --> 00:36:28,916
preparation. And then it goes

989
00:36:28,916 --> 00:36:32,479
to the architecture as well. We try to

991
00:36:32,479 --> 00:36:36,562
present a global service to

992
00:36:36,562 --> 00:36:38,229
you so that you don't have to

993
00:36:38,229 --> 00:36:42,208
think about regions and capacity and all

994
00:36:42,208 --> 00:36:43,520
of that. But behind the

995
00:36:43,520 --> 00:36:46,708
scenes, there is partitioning,

996
00:36:46,708 --> 00:36:50,541
both for performance reasons, but also

997
00:36:50,541 --> 00:36:54,041
for reliability reasons. And that

998
00:36:56,229 --> 00:36:59,625
design principle also allowed us to

999
00:36:59,625 --> 00:37:03,145
continue to serve our customers, even at

1000
00:37:03,145 --> 00:37:05,104
a very degraded state

1001
00:37:05,104 --> 00:37:08,541
while recovery was going on. Because from

1002
00:37:08,541 --> 00:37:10,145
their perspective, things

1003
00:37:10,145 --> 00:37:11,416
got slower, because we just

1004
00:37:11,416 --> 00:37:14,104
didn't have enough capacity for all

1005
00:37:14,104 --> 00:37:15,083
of their jobs. So how does

1006
00:37:15,708 --> 00:37:18,354
Namespace come together for a

1007
00:37:18,354 --> 00:37:21,479
user like myself, or for regular users,

1008
00:37:21,479 --> 00:37:22,583
I'm very keen to basically

1009
00:37:22,583 --> 00:37:24,687
see what it means when we use

1010
00:37:24,687 --> 00:37:27,291
Namespace on the command line. How does

1011
00:37:27,291 --> 00:37:28,187
it compare with the local

1012
00:37:28,187 --> 00:37:29,145
stuff that you may have running

1013
00:37:29,145 --> 00:37:31,270
locally? Because that's also like,

1014
00:37:31,270 --> 00:37:34,000
sure, run things locally. But not

1015
00:37:34,000 --> 00:37:35,395
everyone has 25 gigabits

1016
00:37:35,395 --> 00:37:38,333
at home. But even then, when you do, you

1017
00:37:38,333 --> 00:37:39,729
want the... we want the

1018
00:37:39,729 --> 00:37:42,208
resilience. So I'm wondering if

1019
00:37:42,208 --> 00:37:43,916
there's something that we can screen

1020
00:37:43,916 --> 00:37:45,854
share. There's something we can look at

1021
00:37:45,854 --> 00:37:47,083
just to see how this

1022
00:37:47,083 --> 00:37:49,770
comes together in practice. Yeah, I'm

1023
00:37:49,770 --> 00:37:54,041
happy to show you a couple things. I

1024
00:37:54,041 --> 00:37:56,187
would preface though with

1025
00:37:56,187 --> 00:37:59,687
we're very pragmatic. There are certain

1027
00:37:59,687 --> 00:38:02,229
parts of your developer workflow or

1029
00:38:02,229 --> 00:38:04,208
certain things that you want to do

1030
00:38:04,208 --> 00:38:05,458
where running

1031
00:38:05,458 --> 00:38:06,770
them locally will always

1032
00:38:06,770 --> 00:38:08,687
be kind of the right thing.

1033
00:38:09,375 --> 00:38:11,750
We really like to think about what's the

1034
00:38:11,750 --> 00:38:13,520
right tool for a particular

1036
00:38:13,520 --> 00:38:16,354
job. Where we try to excel

1037
00:38:16,354 --> 00:38:21,937
is scale-out. So you can get things

1038
00:38:21,937 --> 00:38:22,937
running really well on your

1039
00:38:22,937 --> 00:38:24,708
machine. But now you want to

1040
00:38:24,708 --> 00:38:27,708
run 1000 of them. And we could try to

1041
00:38:27,708 --> 00:38:30,416
find 1000 machines to run them at home.

1042
00:38:30,416 --> 00:38:31,354
But that's probably

1043
00:38:31,354 --> 00:38:34,187
not the best, right? So we

1044
00:38:34,187 --> 00:38:35,437
try to apply things

1045
00:38:35,437 --> 00:38:37,416
where we can provide a

1047
00:38:37,416 --> 00:38:40,312
non-trivial amount of value, where it kind

1048
00:38:40,312 --> 00:38:41,625
of makes sense to kind of

1050
00:38:41,625 --> 00:38:43,270
move over from whatever else that

1051
00:38:43,270 --> 00:38:44,312
you're doing, whether it's local

1052
00:38:44,312 --> 00:38:46,000
development or something else.

1053
00:38:46,000 --> 00:38:48,791
Then Namespace, there's,

1054
00:38:48,791 --> 00:38:50,604
there's different ways to approach the

1055
00:38:50,604 --> 00:38:52,312
product. So we're both an

1056
00:38:52,312 --> 00:38:53,895
infrastructure provider. So that's

1057
00:38:53,895 --> 00:38:57,416
where the nsc CLI comes in. But

1058
00:38:57,416 --> 00:38:58,875
we're also a service

1059
00:38:58,875 --> 00:39:00,916
provider. And I would say actually,

1060
00:39:00,916 --> 00:39:03,604
majority of our customers, they use our

1061
00:39:03,604 --> 00:39:05,833
prepackaged solutions.

1062
00:39:05,833 --> 00:39:08,916
So they, they want to do

1063
00:39:08,916 --> 00:39:10,541
Docker builds, they want their Docker

1064
00:39:10,541 --> 00:39:11,666
builds to be as fast as

1065
00:39:11,666 --> 00:39:13,479
possible. So we have kind of a

1066
00:39:13,479 --> 00:39:15,979
prepackaged Docker build product, they

1067
00:39:15,979 --> 00:39:16,937
want to run a Kubernetes

1068
00:39:16,937 --> 00:39:19,812
cluster really fast or 100 or 1000

1069
00:39:19,812 --> 00:39:21,708
or 10,000 of them, we have a prepackaged

1070
00:39:21,708 --> 00:39:23,416
product for that. They want

1071
00:39:23,416 --> 00:39:25,750
their CI runs to go really

1072
00:39:25,750 --> 00:39:28,895
fast, or in a very cost effective way,

1073
00:39:28,895 --> 00:39:30,229
actually, we start hearing a

1074
00:39:30,229 --> 00:39:32,791
lot more around kind of cost

1075
00:39:32,791 --> 00:39:34,979
management. So we have products for that

1076
00:39:34,979 --> 00:39:37,291
as well. We'll focus today a

1077
00:39:37,291 --> 00:39:38,541
bit more on the infrastructure.

1078
00:39:38,541 --> 00:39:42,520
So kind of under the covers. But if

1079
00:39:42,520 --> 00:39:43,979
you're if you want your CI

1080
00:39:43,979 --> 00:39:45,270
to go fast, you don't actually

1081
00:39:45,270 --> 00:39:48,687
have to run the nsc CLI; there are

1082
00:39:48,687 --> 00:39:50,166
products that make that super

1083
00:39:50,166 --> 00:39:51,208
easy for you.

1084
00:39:51,208 --> 00:39:54,437
I was thinking of starting at the origin.

1085
00:39:54,833 --> 00:39:56,770
So, things started in kind of

1086
00:39:56,770 --> 00:39:58,125
building this application

1087
00:39:58,125 --> 00:39:59,812
framework. And this application

1089
00:39:59,812 --> 00:40:02,062
framework, we leaned on something

1091
00:40:02,062 --> 00:40:03,895
akin to what we had built

1092
00:40:03,895 --> 00:40:06,145
at Google, which is a platform called

1093
00:40:06,145 --> 00:40:09,812
Boq, where it tackles how

1094
00:40:09,812 --> 00:40:11,833
you write services, how do

1095
00:40:11,833 --> 00:40:13,833
services talk with each other? How do you

1096
00:40:13,833 --> 00:40:15,062
build services? How you test

1097
00:40:15,062 --> 00:40:16,270
services? How do you deploy

1098
00:40:16,270 --> 00:40:19,541
services? How you observe services in production?

1099
00:40:25,229 --> 00:40:27,020
To see how Namespace tests Foundation,

1100
00:40:27,583 --> 00:40:29,395
the open source application platform

1101
00:40:29,395 --> 00:40:30,833
inspired by Google's Boq,

1102
00:40:31,416 --> 00:40:32,520
find the YouTube video

1103
00:40:32,520 --> 00:40:33,875
link in the show notes.

1104
00:40:35,062 --> 00:40:37,500
After Hugo's demo, we look into how a

1105
00:40:37,500 --> 00:40:39,500
remote Docker build can

1106
00:40:39,500 --> 00:40:41,625
be faster than a local one.

1107
00:40:42,354 --> 00:40:44,062
That is a separate YouTube video,

1108
00:40:44,062 --> 00:40:45,416
link in the show notes.

1109
00:40:46,208 --> 00:40:48,416
OK, let's start wrapping this episode up.

1110
00:40:53,833 --> 00:40:57,791
We see more and more use cases around

1111
00:40:57,791 --> 00:41:00,750
kind of complex scenarios with

1112
00:41:00,750 --> 00:41:03,604
previews. This is an area that has also

1114
00:41:03,604 --> 00:41:04,833
been a pain point for us.

1115
00:41:05,916 --> 00:41:08,062
And we want to do better.

1116
00:41:08,062 --> 00:41:11,645
Another thing is that instances are fully isolated right now.

1117
00:41:11,645 --> 00:41:15,875
So two instances, they don't share any networking.

1118
00:41:15,875 --> 00:41:17,812
But we...

1119
00:41:18,000 --> 00:41:21,270
We have a POC internally that uses Tailscale

1120
00:41:21,270 --> 00:41:22,875
where you can connect multiple

1121
00:41:22,875 --> 00:41:24,854
instances. But we're also

1122
00:41:24,854 --> 00:41:28,729
thinking of just adding a tagged mode where you tag an

1123
00:41:28,729 --> 00:41:32,229
instance with kind of a network. And then

1124
00:41:32,229 --> 00:41:34,708
instances that are tagged the same in the

1125
00:41:34,708 --> 00:41:35,979
same network, they can

1126
00:41:35,979 --> 00:41:37,437
communicate with each

1127
00:41:37,437 --> 00:41:39,208
other. So you could have like a front end

1128
00:41:39,208 --> 00:41:40,770
that calls a back end.

1129
00:41:41,020 --> 00:41:43,895
Our goal is not to kind of cover all of the possible,

1130
00:41:43,895 --> 00:41:46,145
you know, compute use

1131
00:41:46,145 --> 00:41:48,291
cases, but just things that are

1132
00:41:48,291 --> 00:41:52,395
helpful and ideally kind of easy to use

1133
00:41:52,395 --> 00:41:55,770
to achieve what you want to achieve.

1134
00:41:55,770 --> 00:41:58,666
Typically creating a preview, you can go

1135
00:41:58,666 --> 00:41:59,937
all the way to a PaaS,

1136
00:41:59,937 --> 00:42:02,229
like go to a solution that

1137
00:42:02,229 --> 00:42:03,895
just packages everything and then you

1138
00:42:03,895 --> 00:42:07,020
have very little flexibility, or you can go, "Okay,

1139
00:42:07,020 --> 00:42:08,833
I need to do everything from scratch."

1140
00:42:08,833 --> 00:42:10,583
And we try to be somewhere in the middle

1141
00:42:10,583 --> 00:42:12,416
where you have kind of

1142
00:42:12,416 --> 00:42:14,187
building blocks that are

1143
00:42:14,187 --> 00:42:15,250
helpful but you still

1144
00:42:15,250 --> 00:42:18,083
can make it your own.

1145
00:42:18,083 --> 00:42:20,750
You can still decide what goes inside of my container?

1146
00:42:20,750 --> 00:42:22,312
Is it multiple containers?

1147
00:42:24,187 --> 00:42:26,895
Whether I want authentication or not.

1148
00:42:26,895 --> 00:42:28,375
So all of that, it's kind

1149
00:42:28,375 --> 00:42:30,208
of more our mental model,

1150
00:42:30,208 --> 00:42:32,229
kind of our design principle of being

1151
00:42:32,229 --> 00:42:33,479
somewhere in the middle with

1152
00:42:33,479 --> 00:42:35,458
not fully packaged but also not

1153
00:42:35,458 --> 00:42:36,687
completely done from scratch.

1154
00:42:37,791 --> 00:42:39,020
As we prepare to wrap up,

1155
00:42:39,020 --> 00:42:40,000
one last thought,

1156
00:42:40,812 --> 00:42:43,104
one last takeaway from our conversation

1157
00:42:43,104 --> 00:42:44,958
for people that stuck with us to the end.

1158
00:42:45,437 --> 00:42:47,083
What would you like them to take away

1159
00:42:47,083 --> 00:42:48,729
from our conversation?

1160
00:42:49,541 --> 00:42:52,000
I was asked recently,

1161
00:42:53,791 --> 00:42:57,416
how does one become good at something?

1162
00:42:58,020 --> 00:43:00,458
And I've worked with so many engineers

1163
00:43:00,458 --> 00:43:03,541
that are extremely good.

1164
00:43:03,541 --> 00:43:07,145
And I've been looking for patterns,

1165
00:43:07,145 --> 00:43:08,812
like what are the things that are common

1166
00:43:08,812 --> 00:43:10,937
across these engineers?

1167
00:43:10,937 --> 00:43:13,895
And I find that it's usually some kind of

1168
00:43:13,895 --> 00:43:17,937
unrelenting curiosity

1169
00:43:17,937 --> 00:43:22,354
that really propels people beyond just

1170
00:43:22,354 --> 00:43:25,583
being good to being excellent.

1171
00:43:26,583 --> 00:43:28,541
And I think that kind of comes back to

1172
00:43:28,541 --> 00:43:30,208
when we approach how

1173
00:43:30,208 --> 00:43:31,229
we build our products

1174
00:43:31,229 --> 00:43:33,354
is with that same level

1175
00:43:33,354 --> 00:43:36,958
of unrelenting curiosity

1176
00:43:36,958 --> 00:43:39,708
and willingness to break

1177
00:43:39,708 --> 00:43:42,333
through and change things

1178
00:43:42,333 --> 00:43:45,520
that may help us build a better product.

1179
00:43:45,520 --> 00:43:46,562
And I think having that

1180
00:43:46,562 --> 00:43:48,291
courage has been helpful for us,

1182
00:43:48,291 --> 00:43:51,229
but when we bring people in, we try to

1184
00:43:51,229 --> 00:43:52,854
instill that same spirit of

1185
00:43:52,854 --> 00:43:54,687
just go deep, read the

1186
00:43:54,687 --> 00:43:57,208
code, try different things,

1187
00:43:57,208 --> 00:44:00,750
see how it works, that just really helps

1188
00:44:00,750 --> 00:44:03,500
propel us to just do better.

1189
00:44:05,645 --> 00:44:07,687
Well, on that note, thank you very much

1190
00:44:07,687 --> 00:44:09,000
for joining us today, Hugo.

1191
00:44:09,687 --> 00:44:11,583
I look forward to all the improvements

1192
00:44:11,583 --> 00:44:13,333
you'll be driving in Namespace.

1193
00:44:13,937 --> 00:44:15,479
I think you're on to something here.

1194
00:44:15,479 --> 00:44:18,000
I really like the speed, I really like

1195
00:44:18,000 --> 00:44:19,645
the simplicity in many ways,

1196
00:44:19,645 --> 00:44:21,854
and I know that behind it, there's a lot

1197
00:44:21,854 --> 00:44:23,520
of complexity that you need to handle

1198
00:44:23,520 --> 00:44:25,895
to make things this simple and this fast.

1199
00:44:27,104 --> 00:44:28,437
Thank you very much.

1200
00:44:28,437 --> 00:44:28,791
Thank you.

1201
00:44:28,791 --> 00:44:30,020
And I look forward to the next one.

1202
00:44:30,020 --> 00:44:31,625
It was a pleasure to be here. Thank you.