1
00:00:00,140 --> 00:00:00,720
Nikolay: Hello, hello.

2
00:00:00,720 --> 00:00:02,220
This is Postgres.FM.

3
00:00:02,220 --> 00:00:05,840
My name is Nikolay, Postgres.AI,
and as usual, co-host

4
00:00:05,900 --> 00:00:07,160
is Michael, pgMustard.

5
00:00:08,460 --> 00:00:09,240
Hi, Michael.

6
00:00:09,920 --> 00:00:10,780
Michael: Hello, Nikolay.

7
00:00:11,420 --> 00:00:14,880
Nikolay: And we have a super interesting
guest today, Lev Kokotov.

8
00:00:15,100 --> 00:00:16,600
Hi, thank you for coming.

9
00:00:17,140 --> 00:00:17,380
Lev: Thanks.

10
00:00:17,380 --> 00:00:18,140
Hi, Nikolay.

11
00:00:18,580 --> 00:00:19,640
Glad to be here.

12
00:00:19,960 --> 00:00:25,700
Nikolay: Yeah, and the reason we
invited you is when I saw PgCat,

13
00:00:26,260 --> 00:00:27,180
it was interesting.

14
00:00:28,080 --> 00:00:31,680
At that time, I had a spike of
interest in sharding solutions,

15
00:00:32,600 --> 00:00:36,920
and I saw PgCat was started as
a connection pooler alternative

16
00:00:37,060 --> 00:00:37,700
to PgBouncer.

17
00:00:39,160 --> 00:00:43,740
Actually, at the same time, a few
other teams started some new

18
00:00:43,740 --> 00:00:44,240
projects.

19
00:00:44,340 --> 00:00:47,900
But then sharding was, oh, okay,
sharding was there.

20
00:00:47,900 --> 00:00:50,060
It's quite a straightforward way,
just via comments.

21
00:00:50,060 --> 00:00:53,100
It was interesting, but I'm not
a cat guy.

22
00:00:53,100 --> 00:00:54,440
I'm a dog guy.

23
00:00:54,760 --> 00:00:58,600
And once I saw PgDog started,
we focused on sharding.

24
00:00:58,780 --> 00:01:01,540
It obviously attracted much more
attention from me.

25
00:01:01,800 --> 00:01:04,220
So that's why we invited you, actually.

26
00:01:06,180 --> 00:01:06,680
Awesome.

27
00:01:07,040 --> 00:01:07,540
Yeah.

28
00:01:07,940 --> 00:01:11,600
Let's talk about, I don't know,
like maybe sharding itself and

29
00:01:11,600 --> 00:01:15,040
probably let's start with the idea
that not everyone needs it

30
00:01:15,860 --> 00:01:21,040
Because on a single Postgres cluster,
you can grow and reach

31
00:01:21,420 --> 00:01:25,580
multi-billion valuation, go to
IPO, I have examples.

32
00:01:26,400 --> 00:01:28,880
But obviously, sometimes it's really
tough, right?

33
00:01:29,280 --> 00:01:31,740
Tough job to maintain a huge monolith.

34
00:01:32,500 --> 00:01:37,540
So what's your take in general
about the idea that you don't

35
00:01:37,540 --> 00:01:38,180
need sharding?

36
00:01:39,320 --> 00:01:41,100
Lev: Oh yeah, well that's a good
1.

37
00:01:41,320 --> 00:01:43,000
I agree in principle.

38
00:01:43,480 --> 00:01:46,640
So you know, Postgres can be pushed
quite far and we pushed it

39
00:01:46,640 --> 00:01:50,700
very, very far at Instacart, pre-IPO
I should add.

40
00:01:51,040 --> 00:01:55,940
But we IPO'd as a massively sharded
database.

41
00:01:56,480 --> 00:02:00,200
And we absolutely had to because
a lot of things started to run

42
00:02:00,200 --> 00:02:00,940
quite slow.

43
00:02:01,920 --> 00:02:03,140
We had a lot of writes.

44
00:02:03,560 --> 00:02:08,420
Instacart was, you think most apps
are 90% read, 10% write.

45
00:02:08,420 --> 00:02:09,840
I don't know if that was the case
for us.

46
00:02:09,840 --> 00:02:12,440
I think we're a little bit more
like maybe like 80-20, maybe

47
00:02:12,440 --> 00:02:15,060
70-30, but we were doing a lot
of writes.

48
00:02:15,540 --> 00:02:16,200
Nikolay: Hold on.

49
00:02:16,580 --> 00:02:22,220
80-20 means 80 writes or reads,
as usual reads, right?

50
00:02:22,660 --> 00:02:23,420
Lev: Yeah, yeah, reads.

51
00:02:23,420 --> 00:02:23,980
Yeah, yeah.

52
00:02:23,980 --> 00:02:27,900
I'm thinking of typical workloads: 90% reads, 10% writes.

53
00:02:27,900 --> 00:02:29,670
Nikolay: Or even more, or even more sometimes.

54
00:02:29,670 --> 00:02:32,220
Social media is definitely more reads, right?

55
00:02:32,220 --> 00:02:33,460
But here's the thing.

56
00:02:34,020 --> 00:02:36,360
Lev: Yeah, here's what it was a little bit different, not by

57
00:02:36,360 --> 00:02:39,780
much, but even that that small percentage, like 10%, 15% writes

58
00:02:39,780 --> 00:02:45,660
was enough to push our r5.24xlarge, I don't know, like 192 cores,

59
00:02:45,660 --> 00:02:47,820
almost a terabyte of RAM over the edge.

60
00:02:48,080 --> 00:02:49,180
Nikolay: Sounds like RDS.

61
00:02:49,840 --> 00:02:51,180
Actually, I'm super curious.

62
00:02:51,180 --> 00:02:55,020
It's off topic, but I'm super curious about so-called Instacart

63
00:02:55,460 --> 00:02:58,760
zero-downtime upgrade approach because we believe, at Postgres.AI,

64
00:02:58,860 --> 00:03:01,160
we believe it leads to corruption.

65
00:03:01,160 --> 00:03:03,900
But I hope we will discuss at a different time.

66
00:03:04,660 --> 00:03:05,880
Let's hold on to this.

67
00:03:09,320 --> 00:03:10,900
Lev: Yeah, happy to talk about it on another podcast.

68
00:03:10,900 --> 00:03:12,900
I was 1 of the guys who worked on it.

69
00:03:14,440 --> 00:03:15,580
Nikolay: Consider you invited.

70
00:03:18,420 --> 00:03:22,340
zero-downtime upgrades made on managed Postgres, which doesn't

71
00:03:22,340 --> 00:03:25,340
allow changing recovery_target_lsn.

72
00:03:26,260 --> 00:03:26,680
Lev: Yep.

73
00:03:26,680 --> 00:03:27,660
I did it myself.

74
00:03:27,700 --> 00:03:28,700
We had zero-downtime.

75
00:03:28,700 --> 00:03:29,200
Nikolay: Yeah.

76
00:03:29,680 --> 00:03:31,120
It's super interesting topic.

77
00:03:31,120 --> 00:03:33,960
Maybe in a few weeks, I'm really looking forward.

78
00:03:35,140 --> 00:03:40,480
So yeah, I understand that writes, we can scale reads easily

79
00:03:40,480 --> 00:03:43,780
because it's just a replica until some point, right?

80
00:03:43,780 --> 00:03:46,780
But writes, we have only the primary.

81
00:03:47,300 --> 00:03:51,000
And you need to go either to services, microservices, or sharding

82
00:03:51,020 --> 00:03:52,520
or combination of them.

83
00:03:53,740 --> 00:03:57,420
This is several routes, actually 2 major routes here.

84
00:03:57,700 --> 00:04:00,320
Lev: Yeah, Instacart actually sharded what we call functionally

85
00:04:00,320 --> 00:04:00,720
sharded.

86
00:04:00,720 --> 00:04:03,700
I don't know if that's a real term in the industry, but we just

87
00:04:03,700 --> 00:04:07,760
took tables out of our main database and put it into a different

88
00:04:07,760 --> 00:04:09,880
database, you know, functional sharding air quotes.

89
00:04:10,240 --> 00:04:11,620
Nikolay: Vertical split.

90
00:04:12,080 --> 00:04:13,280
Vertical decomposition.

91
00:04:14,340 --> 00:04:14,840
Exactly.

92
00:04:15,600 --> 00:04:18,540
Lev: And that happened even before I joined the company.

93
00:04:18,540 --> 00:04:22,800
So that was way before, you know, IPO talks and all that stuff.

94
00:04:22,800 --> 00:04:26,780
So you can run out of capacity on a single machine quite quickly

95
00:04:26,800 --> 00:04:28,140
if you use it a lot.

96
00:04:28,140 --> 00:04:30,580
Sorry, that's a little bit of a tautology, but that's just the

97
00:04:30,580 --> 00:04:31,080
case.

98
00:04:31,150 --> 00:04:32,840
Like, it depends on your workload.

99
00:04:32,840 --> 00:04:34,200
We did a lot of machine learning.

100
00:04:34,200 --> 00:04:37,100
So we wrote a lot of bulk data, a lot, like daily.

101
00:04:37,120 --> 00:04:39,640
We would have like hundreds of
gigabytes of data that were completely

102
00:04:39,640 --> 00:04:41,180
new data every day, basically.

103
00:04:41,200 --> 00:04:45,920
So a lot of writes were happening
and sharding was the only way,

104
00:04:45,920 --> 00:04:46,400
basically.

105
00:04:46,400 --> 00:04:47,720
We were running out of...

106
00:04:47,720 --> 00:04:50,740
Our vacuum was always behind, and
when vacuum is behind, everyone

107
00:04:50,740 --> 00:04:51,800
gets a little bit scared.

108
00:04:52,640 --> 00:04:54,620
Performance is worse as well.

109
00:04:54,920 --> 00:04:57,620
We were getting a lot of lock contention
on the WAL.

110
00:04:58,100 --> 00:04:58,940
That happens a lot.

111
00:04:58,940 --> 00:05:02,160
The WAL is single-threaded to
this day, and that's totally fine,

112
00:05:02,160 --> 00:05:05,340
but you end up, you know, when
you write like hundreds of thousands

113
00:05:05,340 --> 00:05:06,420
transactions per second.

114
00:05:06,420 --> 00:05:09,440
Nikolay: Well, it's single-threaded
if you use WAL writer, but

115
00:05:09,440 --> 00:05:14,600
backends can write to WAL themselves
if you turn it off.

116
00:05:14,600 --> 00:05:16,080
Like this setting, I forgot.

117
00:05:16,160 --> 00:05:18,180
Lev: Yeah, No, you're right.

118
00:05:18,180 --> 00:05:19,300
You're absolutely right.

119
00:05:19,300 --> 00:05:20,280
But there's still a lock.

120
00:05:20,280 --> 00:05:21,040
There's still a lock.

121
00:05:21,040 --> 00:05:21,180
Yeah.

122
00:05:21,180 --> 00:05:22,440
You have to grab a lock.

123
00:05:22,800 --> 00:05:24,820
Nikolay: We see it in Performance
Insights.

124
00:05:24,840 --> 00:05:30,560
If it's on RDS, like queries are
spending time on LWLock.

125
00:05:31,020 --> 00:05:31,780
Lev: That's the guy.

126
00:05:31,780 --> 00:05:35,740
And that's the guy who takes your
website offline on Sunday,

127
00:05:35,740 --> 00:05:37,360
every morning, like a clockwork.

128
00:05:37,360 --> 00:05:38,820
Nikolay: On the one-night-defensive
foreclose.

129
00:05:40,020 --> 00:05:43,980
And I would say traditional optimization
would be, let's write

130
00:05:43,980 --> 00:05:44,480
less.

131
00:05:44,760 --> 00:05:50,460
But I assume only in Postgres 13,
WAL-related metrics were added

132
00:05:50,460 --> 00:05:52,840
to pg_stat_statements and EXPLAIN.

133
00:05:53,860 --> 00:05:57,260
It was very difficult to understand
which queries, well, with

134
00:05:57,260 --> 00:06:00,360
Performance Insights you can use
wait event analysis

135
00:06:01,160 --> 00:06:06,040
to identify queries which are like
WAL write intensive, right?

136
00:06:06,040 --> 00:06:10,080
And maybe optimize them, but it's
still like limited approach,

137
00:06:10,080 --> 00:06:10,580
right?

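Since Postgres 13, the WAL metrics Nikolay refers to are exposed as columns in pg_stat_statements, so the wait-event picture can be cross-checked per query. A minimal sketch — the column names are the real pg_stat_statements ones; building the SQL as a string keeps the snippet runnable anywhere, and running it against a live server is left to psql:

```python
# Sketch: rank queries by WAL produced, using the columns added to
# pg_stat_statements in Postgres 13 (wal_records, wal_fpi, wal_bytes).
TOP_WAL_QUERIES = """
SELECT queryid,
       left(query, 60)           AS query,
       wal_records,
       wal_fpi,                  -- full-page images, often the bulk of WAL
       pg_size_pretty(wal_bytes) AS wal_written
FROM pg_stat_statements
ORDER BY wal_bytes DESC
LIMIT 10;
"""

print(TOP_WAL_QUERIES)
```

Paste the printed SQL into psql on a server with pg_stat_statements enabled.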
138
00:06:11,200 --> 00:06:15,400
Lev: Well it's limited in its success
in a company because when

139
00:06:15,400 --> 00:06:18,080
you tell somebody like, hey, can
you write less to your database?

140
00:06:18,080 --> 00:06:21,800
People are like, I'm sorry, but
is that your job?

141
00:06:22,280 --> 00:06:23,580
You can write less,

142
00:06:23,760 --> 00:06:27,600
Nikolay: but we have, well, WAL
writes are problem for a query,

143
00:06:27,600 --> 00:06:31,720
consider a query which writes to
a table and 2 situations, This

144
00:06:31,720 --> 00:06:36,780
table has a couple of indexes versus
20 indexes and when we insert

145
00:06:36,840 --> 00:06:41,600
something, every index gets some
insert and it amplifies the

146
00:06:41,600 --> 00:06:44,660
amount of WAL written for the
same insert.

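A back-of-the-envelope model of the amplification described here — deliberately simplified (one WAL record per heap insert plus roughly one per index; real WAL also carries full-page images and page splits, which often dominate):

```python
def wal_records_per_insert(n_indexes: int) -> int:
    # Toy model: one heap WAL record for the row itself, plus roughly one
    # btree record per index that receives the new entry. Full-page images
    # and page splits are deliberately ignored.
    return 1 + n_indexes

# The same logical insert on a 20-index table emits ~7x the records
# of a 2-index table in this model.
print(wal_records_per_insert(2))   # 3
print(wal_records_per_insert(20))  # 21
```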
147
00:06:46,500 --> 00:06:48,840
Lev: It's really important to know
that every single index has

148
00:06:48,840 --> 00:06:50,580
an important business use case.

149
00:06:50,740 --> 00:06:52,900
And that's why it was put there
in the first place.

150
00:06:52,900 --> 00:06:55,360
So like all of these are invariants.

151
00:06:55,460 --> 00:06:58,440
Like when you say like I have 20
indexes on a table and it's

152
00:06:58,440 --> 00:07:00,260
write-heavy, like that's by design.

153
00:07:00,480 --> 00:07:03,000
And when you want to write more
data to that table, because your

154
00:07:03,000 --> 00:07:06,340
business is growing, like us as
a database engineering team,

155
00:07:06,340 --> 00:07:08,480
we're like, all right, we're going
to make it happen, because

156
00:07:08,480 --> 00:07:11,200
we can't go to like 25 individual
teams and tell them like, hey,

157
00:07:11,200 --> 00:07:13,940
can you fix your query, they're
going to be like, I don't have

158
00:07:13,940 --> 00:07:17,480
the time, I'm trying to sell groceries
here.

159
00:07:17,660 --> 00:07:19,840
You know, like I have other concerns.

160
00:07:19,940 --> 00:07:23,100
Nikolay: But anyway, I'm trying
to say I agree with you that

161
00:07:23,320 --> 00:07:27,440
without sharding, we are limited
in, in like, we need to squeeze,

162
00:07:27,440 --> 00:07:29,280
squeeze, squeeze, but it's very
limited.

163
00:07:29,380 --> 00:07:33,980
And at some point, you cannot say
we can grow 2X anymore, right?

164
00:07:34,440 --> 00:07:35,180
Lev: That's right.

165
00:07:36,040 --> 00:07:38,660
And that 2X is actually on the
low bound.

166
00:07:38,860 --> 00:07:42,480
What most engineering leaders expect
is like 3, 5, 6X.

167
00:07:43,080 --> 00:07:43,580
10.

168
00:07:44,140 --> 00:07:46,260
Yeah, 10 to have that runway.

169
00:07:46,320 --> 00:07:49,040
If you don't have that runway in
your system, it's a red flag

170
00:07:49,900 --> 00:07:50,580
for most.

171
00:07:50,580 --> 00:07:52,700
Nikolay: And it goes to CTO level,
basically.

172
00:07:52,840 --> 00:07:57,340
The decision is like, should we
migrate out of Postgres to, I

173
00:07:57,340 --> 00:07:58,540
don't know where, right?

174
00:07:58,860 --> 00:07:59,360
Lev: Exactly.

175
00:07:59,440 --> 00:08:03,160
Nikolay: I've seen this so many times,
including sitting in the CTO

176
00:08:03,160 --> 00:08:05,080
position myself, like 15 years
ago.

177
00:08:05,080 --> 00:08:07,580
Like I know this pain of sharding.

178
00:08:07,580 --> 00:08:08,080
Yeah.

179
00:08:08,320 --> 00:08:09,140
Lev: It's really funny.

180
00:08:09,140 --> 00:08:10,920
You're just kind of sitting there
and be like, Hey guys, I need

181
00:08:10,920 --> 00:08:11,700
10X capacity.

182
00:08:11,760 --> 00:08:13,620
And your database engineer is like,
well, it's the WAL.

183
00:08:13,620 --> 00:08:14,440
There's nothing I can do.

184
00:08:14,440 --> 00:08:14,940
It's the WAL.

185
00:08:14,940 --> 00:08:17,940
It's like, it's just like a disconnect
between these, like, you

186
00:08:17,940 --> 00:08:20,280
know, these 2 people, you need
to like, you just need to make

187
00:08:20,280 --> 00:08:21,000
this thing scale.

188
00:08:21,000 --> 00:08:21,740
Like that's the end.

189
00:08:21,740 --> 00:08:23,400
And a lot of these are out of control.

190
00:08:23,400 --> 00:08:23,900
So,

191
00:08:24,280 --> 00:08:24,520
Nikolay: yeah.

192
00:08:24,520 --> 00:08:28,520
Like, we can also enable
compression for full-page writes.

193
00:08:28,520 --> 00:08:32,980
We can also partition so writes
become more local and some, I

194
00:08:32,980 --> 00:08:35,180
don't know, like you're skeptical,
right?

195
00:08:35,320 --> 00:08:38,300
Lev: I am, because all of that,
partitioning is great.

196
00:08:38,440 --> 00:08:41,880
I would never speak bad against
partitioning, but...

197
00:08:42,340 --> 00:08:44,140
Nikolay: Not every partitioning,
sorry for interrupting, not

198
00:08:44,140 --> 00:08:44,880
every partitioning.

199
00:08:44,960 --> 00:08:48,000
If, for example, people say, okay,
we are going to spread all

200
00:08:48,000 --> 00:08:50,960
customers evenly distributed among
partitions.

201
00:08:51,140 --> 00:08:53,980
This is not going to help us in
terms of WAL writes.

202
00:08:54,020 --> 00:08:59,860
But if you say, we will have hot
partition, like receiving most

203
00:08:59,860 --> 00:09:04,120
writes, and kind of archive partitions
where the vacuum and everything

204
00:09:04,120 --> 00:09:05,640
has, like, stabilized.

205
00:09:06,020 --> 00:09:10,020
In this case, it's super beneficial
for WAL writes and write

206
00:09:10,240 --> 00:09:12,160
intensive workload optimization,
right?

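The hot/archive layout Nikolay sketches is, in Postgres, declarative range partitioning, where the server does the routing itself from the RANGE bounds. A tiny illustrative router in Python — the table names are hypothetical; the point is only which rows stay hot:

```python
from datetime import date

def partition_for(event_day: date, today: date) -> str:
    # Route writes to one hot monthly partition; everything older lands in
    # cold archive partitions that vacuum rarely needs to touch, so WAL and
    # full-page-write pressure stay concentrated on a small working set.
    if (event_day.year, event_day.month) == (today.year, today.month):
        return f"events_hot_{event_day:%Y_%m}"
    return f"events_archive_{event_day:%Y_%m}"

today = date(2024, 5, 20)
print(partition_for(date(2024, 5, 19), today))   # events_hot_2024_05
print(partition_for(date(2023, 11, 2), today))   # events_archive_2023_11
```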
207
00:09:12,740 --> 00:09:15,020
Lev: Absolutely, but then you're
going to end up having to read

208
00:09:15,020 --> 00:09:16,400
those cold partitions anyway.

209
00:09:16,940 --> 00:09:19,300
It's a temporary band-aid, which
we did.

210
00:09:19,300 --> 00:09:21,940
Again, absolutely, we partitioned
a lot of tables and that helped

211
00:09:21,940 --> 00:09:23,000
us with a lot of things.

212
00:09:23,000 --> 00:09:26,260
But at the end of the day, you
just need more compute.

213
00:09:26,540 --> 00:09:28,780
That's the bitter lesson of database.

214
00:09:29,140 --> 00:09:31,460
I don't know if you're familiar
with the bitter lesson in AI,

215
00:09:31,800 --> 00:09:34,780
where you just need more machines
to solve the problem.

216
00:09:34,780 --> 00:09:36,240
It's the same thing in databases.

217
00:09:36,280 --> 00:09:39,280
You just need more compute to just
be able to write, read more

218
00:09:39,280 --> 00:09:40,280
stuff and write more stuff.

219
00:09:40,280 --> 00:09:44,480
Nikolay: Yeah, I'm still pulling
us back intentionally because

220
00:09:44,480 --> 00:09:49,700
we are having conversations about
like scaling problems.

221
00:09:50,190 --> 00:09:54,380
And interesting insight I've got
yesterday talking to a very

222
00:09:54,380 --> 00:09:59,340
experienced Postgres expert, that
partitioning not only helps

223
00:09:59,340 --> 00:10:03,520
with data locality, write locality,
and full-page writes, and

224
00:10:03,520 --> 00:10:06,980
WAL volume as well, but also
with backups, surprisingly, if

225
00:10:06,980 --> 00:10:13,600
we do incremental backups with
WAL-G, pgBackRest, with new incremental

226
00:10:13,980 --> 00:10:16,700
API, I still need to learn about
this more.

227
00:10:16,720 --> 00:10:21,020
Or snapshots, cloud snapshots of
EBS volumes and RDS also relies

228
00:10:21,020 --> 00:10:22,280
on it, as I understand.

229
00:10:23,600 --> 00:10:27,080
Imagine writes are spread out everywhere,
and you constantly

230
00:10:27,080 --> 00:10:32,120
change a lot of blocks, versus
you change only specific blocks,

231
00:10:32,120 --> 00:10:33,540
and some blocks are not changed.

232
00:10:33,540 --> 00:10:36,840
And it's less pressure on backups
as well.

233
00:10:36,960 --> 00:10:40,580
It's interesting and it affects
DR and RPO, RTO.

234
00:10:42,340 --> 00:10:45,920
But I just wanted to ask, like
my understanding, If we go to

235
00:10:45,920 --> 00:10:50,080
sharding blindly and miss some
optimizations, I remember I was

236
00:10:50,080 --> 00:10:54,440
dealing with Mongo and each node
of Mongo, it was maybe 10 plus

237
00:10:54,440 --> 00:11:00,340
years ago, I saw each node of Mongo
can handle much less workload

238
00:11:00,520 --> 00:11:02,620
than Postgres, 1 node of Postgres.

239
00:11:03,280 --> 00:11:08,540
Maybe like jumping to sharding
too early, it's premature optimization

240
00:11:08,620 --> 00:11:12,240
and you will be not in good shape
in terms of how much money

241
00:11:12,240 --> 00:11:16,220
you spend on compute nodes because
you missed a lot of optimization

242
00:11:16,320 --> 00:11:16,800
steps.

243
00:11:16,800 --> 00:11:17,700
What do you think?

244
00:11:17,870 --> 00:11:19,080
Lev: Absolutely.

245
00:11:19,540 --> 00:11:22,600
If your first step is like, my
database is broken, I need to

246
00:11:22,600 --> 00:11:24,180
shard, you missed a lot.

247
00:11:24,480 --> 00:11:26,680
Of course, you should look at all
the possible optimizations.

248
00:11:26,940 --> 00:11:29,560
And that's really important to
keep you on a single machine.

249
00:11:30,060 --> 00:11:32,840
I think the benchmark for me is
when you need to start thinking

250
00:11:32,840 --> 00:11:36,160
about sharding is when you try
like 3 or 4 or 5 different things

251
00:11:36,480 --> 00:11:39,360
and then you have like a person
or 2 working on it full-time

252
00:11:39,400 --> 00:11:42,260
and at the end of the day they're
like I ran out of ideas.

253
00:11:42,880 --> 00:11:47,120
Nikolay: Do you have some numbers
like for modern hardware like

254
00:11:47,120 --> 00:11:51,420
how much WAL to be written per
day, for example, like terabyte,

255
00:11:51,600 --> 00:11:55,900
5 terabytes, where it's already
an edge, right?

256
00:11:55,900 --> 00:11:56,600
Lev: Yeah, for sure.

257
00:11:56,600 --> 00:11:58,080
I can give you the exact number.

258
00:11:58,080 --> 00:12:01,400
We benchmarked Postgres in the
best possible scenario, actually.

259
00:12:01,560 --> 00:12:06,800
It was on EC2, it had NVMe drives,
with ext4 on a RAID 0.

260
00:12:06,840 --> 00:12:09,600
So we did not, we did not want
durability.

261
00:12:09,640 --> 00:12:10,640
We wanted performance.

262
00:12:10,720 --> 00:12:15,140
That thing could write 4 or 5 gigabytes
per second using Bonnie++.

263
00:12:15,860 --> 00:12:19,180
Now Postgres was able to write
about 300 megabytes per second

264
00:12:19,600 --> 00:12:20,780
of just like whatever.

265
00:12:20,840 --> 00:12:23,440
I think we're using copy, like
the fastest possible way to dump

266
00:12:23,440 --> 00:12:24,440
data into PG.

267
00:12:24,840 --> 00:12:27,040
And it's just, it's the nature
of the beast, right?

268
00:12:27,040 --> 00:12:30,040
You have to like process it, you
have to put it into a table.

269
00:12:30,040 --> 00:12:32,640
I'm sure there's a lot of like
checks and balances go in between

270
00:12:32,640 --> 00:12:35,460
and again, lock contention and
all of that stuff.

271
00:12:35,460 --> 00:12:40,900
So you can't necessarily squeeze
everything out of your machine

272
00:12:40,900 --> 00:12:42,600
even if you have the capacity.

273
00:12:42,600 --> 00:12:46,320
So ultimately you need to split
the processes themselves between

274
00:12:46,320 --> 00:12:48,540
multiple machines once you reach
that number.

275
00:12:48,640 --> 00:12:51,880
And then, you know, for everything
else, if you have, again,

276
00:12:51,880 --> 00:12:53,860
if you have write amplification,
things are getting a little

277
00:12:53,860 --> 00:12:54,660
bit more tricky.

278
00:12:54,780 --> 00:12:57,940
And then, so I think that's, that's
usually the number.

279
00:12:57,940 --> 00:13:01,340
Once you're writing like a, you
know, 200 megabytes per second

280
00:13:01,340 --> 00:13:02,040
of just the WAL.

281
00:13:02,040 --> 00:13:05,460
And you can see that in RDS there's
a graph for that just just

282
00:13:05,460 --> 00:13:06,140
WAL writes.

283
00:13:06,700 --> 00:13:07,780
I would even start early.

284
00:13:07,780 --> 00:13:08,540
200 megabytes

285
00:13:08,560 --> 00:13:13,280
Nikolay: of WAL per second. Each
WAL segment is 16 megabytes, but on

286
00:13:13,280 --> 00:13:14,360
RDS it's 64.

287
00:13:14,940 --> 00:13:16,020
They changed it.

288
00:13:16,240 --> 00:13:20,480
So how many WAL files is that?
It's a lot, right?

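To answer the question in numbers — a quick sketch using the figures from the conversation (200 MB/s of WAL; 16 MB default segments versus the 64 MB RDS configures):

```python
# WAL segment churn at 200 MB/s of WAL, for 16 MB (default) and
# 64 MB (RDS) segment sizes.
WAL_RATE_MB_S = 200

for segment_mb in (16, 64):
    print(f"{segment_mb} MB segments: {WAL_RATE_MB_S / segment_mb} files/second")
# 16 MB segments: 12.5 files/second
# 64 MB segments: 3.125 files/second
```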
289
00:13:21,000 --> 00:13:23,400
Lev: It's a lot, and that's
the scale you're looking

290
00:13:23,400 --> 00:13:23,600
at.

291
00:13:23,600 --> 00:13:25,520
Nikolay: 10 plus is already too
much.

292
00:13:25,520 --> 00:13:26,260
Yeah, I agree.

293
00:13:26,960 --> 00:13:27,460
Lev: Yeah.

294
00:13:28,780 --> 00:13:30,860
So that's the red line, right?

295
00:13:30,860 --> 00:13:33,420
If you're at the red line and you're
thinking about sharding,

296
00:13:33,780 --> 00:13:35,020
we're gonna have a bad time.

297
00:13:35,020 --> 00:13:35,660
And that's okay.

298
00:13:35,660 --> 00:13:36,960
It's okay to have a bad time.

299
00:13:36,960 --> 00:13:39,640
These lessons are learned when
you're having a bad time.

300
00:13:39,640 --> 00:13:41,080
I had a really bad time.

301
00:13:41,480 --> 00:13:46,640
Nikolay: Yeah, and I pulled up the bc
calculator just to check: 200 megs

302
00:13:46,640 --> 00:13:50,100
per second, it gives you 17 terabytes
per day.

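The bc math Nikolay mentions, spelled out (200 MB/s of WAL, decimal units):

```python
# 200 MB of WAL per second, expressed per day.
mb_per_second = 200
tb_per_day = mb_per_second * 60 * 60 * 24 / 1_000_000  # decimal terabytes
print(tb_per_day)  # 17.28
```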
303
00:13:50,820 --> 00:13:51,320
Yeah,

304
00:13:52,660 --> 00:13:55,440
Lev: that's mild, honestly, for
what we did at Instacart.

305
00:13:55,580 --> 00:13:59,280
We had like petabytes of WAL in
S3, like after like a week.

306
00:13:59,680 --> 00:14:02,140
And That's just, again, it's the
nature of the beast.

307
00:14:02,560 --> 00:14:03,340
Nikolay: Multiple cluster.

308
00:14:03,340 --> 00:14:04,180
Lev: Single cluster.

309
00:14:04,440 --> 00:14:06,520
Yeah, it's just, it is.

310
00:14:07,540 --> 00:14:09,400
It's called selling groceries.

311
00:14:09,520 --> 00:14:10,880
What can I tell you?

312
00:14:11,040 --> 00:14:14,440
We had like, you know, multi, hundreds
of gigabytes of machine

313
00:14:14,440 --> 00:14:17,160
learning data that would come in
every day because the model

314
00:14:17,160 --> 00:14:19,900
was retrained on a nightly basis
and all the embeddings have

315
00:14:19,900 --> 00:14:20,780
changed, right?

316
00:14:20,860 --> 00:14:25,080
Again, the use cases are almost
like, I mean, they're interesting,

317
00:14:25,080 --> 00:14:26,980
but they're kind of off topic,
I guess, because they're just

318
00:14:26,980 --> 00:14:28,380
like, you just write a lot of data.

319
00:14:28,380 --> 00:14:29,600
That's what you do for a living,
right?

320
00:14:29,600 --> 00:14:30,760
That's what Postgres is for.

321
00:14:30,760 --> 00:14:32,180
That's what databases are for.

322
00:14:32,840 --> 00:14:35,380
So the numbers are like, well,
this is the number, and that's

323
00:14:35,380 --> 00:14:36,960
when you should split up.

324
00:14:37,080 --> 00:14:43,860
Nikolay: Yeah, so let's probably
get back to more sane situations.

325
00:14:44,340 --> 00:14:48,080
Let's say 10 terabytes per day
or something of WAL data.

326
00:14:48,080 --> 00:14:49,260
It's already too much.

327
00:14:49,900 --> 00:14:55,760
If you approach that, quite soon
it's already a wall you will

328
00:14:55,760 --> 00:14:58,540
be hitting, several walls inside,
obviously you will be hitting.

329
00:14:58,780 --> 00:14:59,840
Including light.

330
00:15:00,480 --> 00:15:02,820
Yeah, yeah, several walls.

331
00:15:04,700 --> 00:15:09,520
Yeah, I was trying to say lightweight
lock, WAL-related lightweight

332
00:15:09,520 --> 00:15:11,140
locks, WALWrite, right?

333
00:15:11,600 --> 00:15:12,940
There are several of them.

334
00:15:13,260 --> 00:15:18,920
Yeah, but this is the
scale where it gets scary.

335
00:15:20,580 --> 00:15:23,040
Lev: It's hard to put a number
on it because you can push that

336
00:15:23,040 --> 00:15:25,200
further, and who knows what's going
to happen to your app.

337
00:15:25,200 --> 00:15:26,760
Maybe that's as high as you'll
go.

338
00:15:26,760 --> 00:15:31,240
Nikolay: And not everyone has local
NVMe drives because in cloud environments,

339
00:15:31,240 --> 00:15:31,980
they are ephemeral.

340
00:15:31,980 --> 00:15:34,460
It's kind of exotic and risky.

341
00:15:35,140 --> 00:15:35,660
Lev: It is.

342
00:15:35,660 --> 00:15:36,580
It's definitely risky.

343
00:15:36,580 --> 00:15:39,560
I love those because when you click
reboot on the machine, it

344
00:15:39,560 --> 00:15:42,180
wipes the encryption key and all
your data is gone.

345
00:15:42,180 --> 00:15:44,100
So you better be careful which
button you press.

346
00:15:44,100 --> 00:15:47,320
There's like a hot restart, which
is safe, and there's the actual

347
00:15:47,320 --> 00:15:49,160
restart, which will delete all
your data.

348
00:15:49,720 --> 00:15:52,380
So stay on RDS if you can.

349
00:15:53,460 --> 00:15:57,160
Nikolay: But no, I don't agree
here because you just you should

350
00:15:57,160 --> 00:16:01,000
just use Patroni and multiple replicas
at least at least 3 nodes

351
00:16:01,000 --> 00:16:03,380
and you'll be fine to lose 1 node.

352
00:16:04,540 --> 00:16:09,500
Lev: You lose 1, something went
bad, anyway, that's totally

353
00:16:09,760 --> 00:16:10,540
off topic.

354
00:16:11,320 --> 00:16:12,260
Nikolay: Okay, thank you.

355
00:16:12,260 --> 00:16:17,360
For me, it's enough to understand
that there are cases definitely

356
00:16:17,420 --> 00:16:21,460
when you do need to split vertically,
and splitting vertically

357
00:16:21,460 --> 00:16:22,700
usually is limited as well.

358
00:16:22,700 --> 00:16:25,620
It's hard sometimes, and at some
point we need sharding.

359
00:16:26,320 --> 00:16:30,540
Yeah, so it's enough to explain
that definitely there are cases

360
00:16:30,680 --> 00:16:35,780
where you need to scale Postgres
beyond 1 primary, it can be

361
00:16:35,780 --> 00:16:36,900
vertical or horizontal.

362
00:16:37,080 --> 00:16:38,900
The result is sharding, right?

363
00:16:39,280 --> 00:16:39,980
Lev: That's right.

364
00:16:40,520 --> 00:16:41,080
Nikolay: So, good.

365
00:16:41,920 --> 00:16:42,420
Yeah.

366
00:16:42,580 --> 00:16:44,160
This is great explanation.

367
00:16:44,720 --> 00:16:47,720
Sharding is definitely needed,
but not for everyone maybe, but

368
00:16:47,720 --> 00:16:48,460
for many.

369
00:16:48,840 --> 00:16:50,640
Lev: I mean, yeah, I hope not for
everyone.

370
00:16:51,220 --> 00:16:53,760
That would be, I mean, you know,
we solve the problem once and

371
00:16:53,760 --> 00:16:54,380
for all.

372
00:16:54,380 --> 00:16:54,880
Nikolay: Actually.

373
00:16:54,950 --> 00:16:56,652
Anyone has access to

374
00:16:56,652 --> 00:16:56,764
Lev: it, yeah.

375
00:16:56,764 --> 00:16:59,320
Nikolay: Last week, Michael and
I had discussion about snapshots

376
00:16:59,440 --> 00:17:05,920
and I said that Cloud SQL has a hard
limit, 64 terabytes, because

377
00:17:05,920 --> 00:17:10,160
this is a limit for GCP persistent
disks, including

378
00:17:10,520 --> 00:17:11,600
PD-SSD.

379
00:17:12,880 --> 00:17:17,220
And I said, RDS obviously allows
you to grow further,

380
00:17:17,300 --> 00:17:19,240
beyond 64, but it doesn't.

381
00:17:19,400 --> 00:17:25,020
After we recorded, I checked, and
the same limitation, 64 terabytes.

382
00:17:25,680 --> 00:17:31,560
It's another reason to have multiple
primaries and scale beyond

383
00:17:31,560 --> 00:17:32,300
1 cluster.

384
00:17:33,180 --> 00:17:36,060
Lev: Well, you'll hit that sooner
on Postgres because of the

385
00:17:36,060 --> 00:17:37,620
32 terabyte table limit.

386
00:17:37,900 --> 00:17:41,500
And actually I did hit that limit
once and that was a scary error.

387
00:17:42,440 --> 00:17:44,960
The only thing that it says is
like, there's no more bytes to

388
00:17:44,960 --> 00:17:48,480
be allocated or something like
that in some kind of database

389
00:17:48,480 --> 00:17:48,720
file.

390
00:17:48,720 --> 00:17:50,720
I'm like, what?

391
00:17:51,780 --> 00:17:54,000
You know, and then you're done.

392
00:17:54,000 --> 00:17:54,960
Like, there's nothing you can do.

393
00:17:54,960 --> 00:17:56,920
You have to move data out of
that table.

394
00:17:57,620 --> 00:17:58,580
You have to take the database offline,
obviously.

395
00:17:59,760 --> 00:18:00,860
Nikolay: It's a single table.

396
00:18:00,860 --> 00:18:01,100
Okay.

397
00:18:01,100 --> 00:18:01,360
Okay.

398
00:18:01,360 --> 00:18:01,680
Lev: Yeah.

399
00:18:01,680 --> 00:18:02,380
A single table.

400
00:18:02,380 --> 00:18:02,540
Yeah.

401
00:18:02,540 --> 00:18:04,260
Which is actually not uncommon.

402
00:18:06,580 --> 00:18:09,560
Nikolay: Well, it's still not super
common as well.

403
00:18:10,080 --> 00:18:10,580
Yeah.

404
00:18:11,720 --> 00:18:15,060
I see only cases like kind of 10
terabytes per table.

405
00:18:15,060 --> 00:18:19,460
And that's already when any DBA should
scream, where is partitioning?

406
00:18:20,140 --> 00:18:22,760
Lev: Well, DBAs like to scream,
but application engineers be

407
00:18:22,760 --> 00:18:26,300
like, I can't, aren't you the DBA?

408
00:18:26,980 --> 00:18:27,900
Like, what am I doing?

409
00:18:27,900 --> 00:18:28,820
Do something, right?

410
00:18:29,440 --> 00:18:31,780
Nikolay: But partitioning unfortunately
requires application

411
00:18:31,920 --> 00:18:33,000
code changes.

412
00:18:33,080 --> 00:18:34,240
Lev: Yeah, precisely.

413
00:18:34,540 --> 00:18:40,000
Nikolay: Yeah, so sharding is needed
for big projects, obviously.

414
00:18:40,580 --> 00:18:41,080
Yeah.

415
00:18:41,460 --> 00:18:41,960
Agreed.

416
00:18:42,340 --> 00:18:45,040
What's next?

417
00:18:45,480 --> 00:18:46,080
Michael: Can we go?

418
00:18:46,080 --> 00:18:50,680
I reckon it's time to go back to
the origin story of PgDog and

419
00:18:50,680 --> 00:18:51,960
obviously then PgCat.

420
00:18:51,960 --> 00:18:54,740
It would be great to hear like
a little bit of that story, Lev.

421
00:18:54,780 --> 00:18:55,460
Lev: Yeah, absolutely.

422
00:18:55,520 --> 00:18:56,540
I'm happy to.

423
00:18:56,720 --> 00:18:58,580
Do you guys care about the name
of the project?

424
00:18:58,580 --> 00:19:00,960
I don't know if, or do you want
to know what the inspiration

425
00:19:00,960 --> 00:19:01,720
for that came from?

426
00:19:01,720 --> 00:19:02,980
Nikolay: Sure, it's fun, yeah.

427
00:19:03,240 --> 00:19:06,340
Lev: Okay, well, when I started
PgCat, we just got a new cat

428
00:19:06,340 --> 00:19:08,180
and I really loved that cat.

429
00:19:08,500 --> 00:19:11,580
And I was working on Postgres,
so the 2 things came together

430
00:19:11,580 --> 00:19:12,080
naturally.

431
00:19:13,580 --> 00:19:16,360
You could probably guess the origin
story of PgDog now, I got

432
00:19:16,360 --> 00:19:19,500
a dog, and then I'm like, look,
I love this dog.

433
00:19:19,740 --> 00:19:21,420
What can I call my next project?

434
00:19:21,500 --> 00:19:22,540
Obviously PgDog.

435
00:19:23,300 --> 00:19:23,620
Yeah.

436
00:19:23,620 --> 00:19:25,740
So the naming issue solved.

437
00:19:26,480 --> 00:19:29,440
PgCat came from the idea that it
was really simple.

438
00:19:29,440 --> 00:19:31,120
Sharding was not even in
scope back then.

439
00:19:31,120 --> 00:19:33,740
It was just like, we ran PgBouncer.

440
00:19:33,740 --> 00:19:36,520
PgBouncer could only talk to 1
database at a time.

441
00:19:36,740 --> 00:19:39,360
It makes sense, you know, you were
pooling like just the Postgres

442
00:19:39,360 --> 00:19:39,720
instance.

443
00:19:39,720 --> 00:19:40,880
We had a bunch of replicas.

444
00:19:41,000 --> 00:19:42,280
We needed to load balance.

445
00:19:42,280 --> 00:19:44,700
And we needed a load balancing
algorithm that was smart.

446
00:19:44,700 --> 00:19:47,860
When a replica went offline because
of, again, hardware issues,

447
00:19:48,000 --> 00:19:51,140
scaling issues, whatever, we needed
to remove it from the rotation

448
00:19:51,340 --> 00:19:53,560
without affecting the app.

449
00:19:53,560 --> 00:19:55,640
So we would regularly lose the
replica.

450
00:19:55,760 --> 00:20:01,220
And then most of the site would
go offline because we had a Ruby

451
00:20:01,220 --> 00:20:03,960
gem that would randomize access
to those replicas, and when 1

452
00:20:03,960 --> 00:20:06,600
of them broke, it's just, you know,
it worked okay.

453
00:20:06,600 --> 00:20:09,820
But doing this in the application
code is really hard.

454
00:20:10,160 --> 00:20:14,160
Especially in Ruby, there's a way
to inject exceptions into Ruby,

455
00:20:14,160 --> 00:20:17,220
like Sidekiq, and basically that
breaks your state.

456
00:20:17,560 --> 00:20:20,660
So we had like multiple gems working
against each other and we

457
00:20:20,660 --> 00:20:22,900
just needed to do that in a place
where it made more sense.

458
00:20:22,900 --> 00:20:26,040
Like a load balancer is typically
outside the application because

459
00:20:26,040 --> 00:20:28,280
you have multiple applications
so it can handle all of them.

460
00:20:28,280 --> 00:20:30,480
So I just built a load balancer
basically.

461
00:20:30,480 --> 00:20:32,320
It was actually after I left Instacart.

462
00:20:32,320 --> 00:20:35,340
I was just doing it as a side project
just to keep my mind going.

463
00:20:35,660 --> 00:20:36,800
So I built it.

464
00:20:36,820 --> 00:20:37,900
It was really simple.

465
00:20:37,900 --> 00:20:41,420
Used banning logic, which was kind
of novel at the time for Postgres.

466
00:20:42,040 --> 00:20:44,540
If you receive 1 single error from
the database, it's removed

467
00:20:44,540 --> 00:20:45,060
from the rotation.

468
00:20:45,060 --> 00:20:48,820
It's very aggressive, but Postgres
never throws errors, like

469
00:20:48,820 --> 00:20:51,180
network-related errors, unless
there's a serious problem.

470
00:20:51,180 --> 00:20:52,860
So that actually worked pretty
well.

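A minimal sketch of that banning behavior, not PgCat's actual code: round-robin over replicas, pull a replica out of rotation on its first error, and (an assumed detail) let it back in after a timeout:

```python
import time

class ReplicaPool:
    # One error bans a replica; bans expire after BAN_SECONDS so the
    # replica can rejoin the rotation (the expiry is an assumed detail).
    BAN_SECONDS = 60.0

    def __init__(self, replicas):
        self.replicas = list(replicas)
        self.banned = {}  # replica -> time it was banned
        self.i = 0

    def pick(self):
        now = time.monotonic()
        # Expire old bans so replicas come back automatically.
        self.banned = {r: t for r, t in self.banned.items()
                       if now - t < self.BAN_SECONDS}
        live = [r for r in self.replicas if r not in self.banned]
        if not live:
            live = self.replicas  # everyone is banned: fail open
        self.i = (self.i + 1) % len(live)
        return live[self.i]

    def report_error(self, replica):
        # Aggressive on purpose: Postgres rarely throws network-level
        # errors unless something is seriously wrong.
        self.banned[replica] = time.monotonic()
```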
471
00:20:53,300 --> 00:20:54,720
I talked to a friend of mine at
Instacart.

472
00:20:54,720 --> 00:20:56,580
I was like, hey, look, I built
this on the side.

473
00:20:56,580 --> 00:20:57,500
That looks fun, right?

474
00:20:57,500 --> 00:20:58,380
Like, we thought about this.

475
00:20:58,380 --> 00:21:00,120
And he's like, yeah, all right.

476
00:21:01,380 --> 00:21:02,260
Didn't you quit?

477
00:21:02,660 --> 00:21:06,360
I'm like, yeah, but no, I have
some free time to work on this.

478
00:21:06,360 --> 00:21:06,640
Right.

479
00:21:06,640 --> 00:21:07,700
And he's like, okay.

480
00:21:07,700 --> 00:21:10,840
And then he took the code, added
a bunch of features that I didn't

481
00:21:10,840 --> 00:21:11,120
add.

482
00:21:11,120 --> 00:21:13,940
Cause obviously I didn't have a
use case anymore for it.

483
00:21:13,940 --> 00:21:14,820
And he's like, oh, great.

484
00:21:14,820 --> 00:21:15,620
We're going to deploy it.

485
00:21:15,620 --> 00:21:15,800
Right.

486
00:21:15,800 --> 00:21:18,560
And we tried it and they use it
and they're, I mean, they put

487
00:21:18,560 --> 00:21:19,540
so much work into it.

488
00:21:19,540 --> 00:21:20,760
They wrote a blog post about it.

489
00:21:20,760 --> 00:21:21,880
You probably know about this.

490
00:21:21,880 --> 00:21:25,280
So that went pretty well, you know,
so it's working in production.

491
00:21:25,280 --> 00:21:26,020
It's great.

492
00:21:26,040 --> 00:21:28,320
And then I'm like, all right, well,
you know, sharding is the

493
00:21:28,320 --> 00:21:28,860
next 1.

494
00:21:28,860 --> 00:21:33,340
They have a bunch of databases
that we sharded, and adding

495
00:21:33,340 --> 00:21:35,740
sharding routing to that would
be great because again, it was

496
00:21:35,740 --> 00:21:38,440
done in the application layer and
application layer routing,

497
00:21:38,640 --> 00:21:40,760
I think we'll all agree is a little
bit iffy, especially if you

498
00:21:40,760 --> 00:21:43,140
have more than 1 app written in
more than 1 language, like you

499
00:21:43,140 --> 00:21:46,520
have to repeat the same logic across
all apps.

500
00:21:47,280 --> 00:21:49,330
So I added just a comment system.

501
00:21:49,330 --> 00:21:51,600
I knew like there's 2 sharding
schemes at Instacart.

502
00:21:51,720 --> 00:21:54,380
1 uses, you know, actually
the hashing function from partitions

503
00:21:54,380 --> 00:21:54,960
in Postgres.

504
00:21:54,960 --> 00:21:58,920
I love that because you can actually
split data both at the client

505
00:21:59,440 --> 00:22:00,300
and in the server.

506
00:22:00,300 --> 00:22:03,480
So you have multiple ways to move
your data around and the other

507
00:22:03,480 --> 00:22:06,200
1 is just a custom 1 like we use
like SHA1 and take the last

508
00:22:06,200 --> 00:22:08,940
few bytes and then mod that. That's
just, you know, random,

509
00:22:08,940 --> 00:22:12,400
but it was available in multiple
systems as well. The data for

510
00:22:12,400 --> 00:22:15,360
that came from Snowflake so we
could actually shard the data

511
00:22:15,360 --> 00:22:18,940
in Snowflake and then ingest it
into the instances directly.

512
00:22:19,200 --> 00:22:23,100
And then on the routing layer in
Ruby, same hashing function,

513
00:22:23,140 --> 00:22:25,300
you know, the sharding key is always
available.

514
00:22:25,440 --> 00:22:26,760
So that was good.

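The custom scheme described here, SHA1, take the last few bytes, and mod by the shard count, can be sketched like this. The exact byte count and key encoding are assumptions for illustration; the point is that any system with a SHA1 implementation (Ruby, Snowflake, the pooler) computes the same shard for the same key:

```python
import hashlib

def shard_for_key(key: str, num_shards: int) -> int:
    # SHA1 the sharding key, take the last 4 bytes (the exact count is
    # an assumption), read them as an integer, and mod by the shard count.
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest[-4:], "big") % num_shards
```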
515
00:22:26,760 --> 00:22:29,160
So I added both of them and they're
like, great, that's great.

516
00:22:29,160 --> 00:22:32,960
And they tried, I think, the SHA1
function and I think it's working

517
00:22:32,960 --> 00:22:33,740
pretty well for them.

518
00:22:33,740 --> 00:22:34,700
So that was fun.

519
00:22:34,740 --> 00:22:37,280
Then I started another company
that had nothing to do with any

520
00:22:37,280 --> 00:22:38,000
of anything.

521
00:22:38,300 --> 00:22:39,960
PostgresML, you might've heard
about it.

522
00:22:39,960 --> 00:22:42,980
That came from, you know, the idea
that we shouldn't ingest,

523
00:22:42,980 --> 00:22:45,180
you know, hundreds of gigabytes
of machine learning data into

524
00:22:45,180 --> 00:22:49,120
Postgres; why not just ingest
a 3 megabyte model and run

525
00:22:49,120 --> 00:22:50,040
inference online.

526
00:22:50,140 --> 00:22:51,760
You know, it was okay.

527
00:22:51,940 --> 00:22:54,060
Stayed about 2 and a half
years there.

528
00:22:54,280 --> 00:22:54,960
Didn't work out.

529
00:22:54,960 --> 00:22:57,080
It happens to a lot of startups.
I left.

530
00:22:57,180 --> 00:22:59,640
And then I had some free time and
I'm like, well, what do I like

531
00:22:59,640 --> 00:23:01,320
to do in my free time, guys?

532
00:23:01,900 --> 00:23:03,500
Writing Postgres poolers.

533
00:23:03,780 --> 00:23:05,100
This is what I do.

534
00:23:05,140 --> 00:23:07,780
This is what I do to, you know,
rest on vacation.

535
00:23:07,820 --> 00:23:08,800
I write Postgres poolers.

536
00:23:08,800 --> 00:23:10,760
I'm like, well, let's do sharding
for real this time.

537
00:23:10,760 --> 00:23:13,940
Let's actually take what we built
at Instacart, make it into

538
00:23:13,940 --> 00:23:16,860
software. Because that's what we
do, right?

539
00:23:16,860 --> 00:23:20,240
We come up with an idea, well,
we find a problem, we find a solution,

540
00:23:20,240 --> 00:23:22,360
we write it in code, and we don't
have to solve it again every

541
00:23:22,360 --> 00:23:25,580
single time manually for like,
you know, hopefully hundreds,

542
00:23:25,580 --> 00:23:27,400
hopefully thousands of use cases
for this.

543
00:23:27,400 --> 00:23:28,240
You know, we'll see.

544
00:23:28,320 --> 00:23:29,680
I'm still doing my research.

545
00:23:29,680 --> 00:23:31,560
But yeah, so that's PgDog.

546
00:23:31,980 --> 00:23:34,740
Sharding is, you know, selling point
number 1.

547
00:23:34,740 --> 00:23:37,040
Everything else is, you know, there's
obviously, it's obviously

548
00:23:37,040 --> 00:23:38,680
a pooler, it's obviously a load
balancer.

549
00:23:38,680 --> 00:23:40,580
It has all the features that PgCat
has.

550
00:23:41,420 --> 00:23:43,440
Almost all of them, I'm adding
them as I go.

551
00:23:43,440 --> 00:23:44,900
It's a rewrite, it's brand new.

552
00:23:44,900 --> 00:23:45,820
I like new code.

553
00:23:46,100 --> 00:23:47,980
That's what everyone loves to hear.

554
00:23:47,980 --> 00:23:49,540
Hey, you rewrote it from scratch.

555
00:23:49,540 --> 00:23:50,040
Great.

556
00:23:50,200 --> 00:23:53,100
You know, that code that we battle
tested in production and serving

557
00:23:53,100 --> 00:23:54,840
like half a million transactions
per second.

558
00:23:54,840 --> 00:23:57,420
Well, that's obsolete now, I guess.

559
00:23:57,740 --> 00:23:58,260
Nikolay: You're just going to

560
00:23:58,260 --> 00:24:00,420
Lev: take this brand new code base
and check it out.

561
00:24:01,240 --> 00:24:01,740
Nikolay: Yeah.

562
00:24:02,380 --> 00:24:03,780
And it's written in Rust.

563
00:24:04,380 --> 00:24:05,360
Lev: Yeah, absolutely.

564
00:24:05,540 --> 00:24:09,880
Nikolay: Yeah, it's kind of a
dream of mine to find some spare time

565
00:24:10,080 --> 00:24:15,900
to learn and try it because, yeah,
it looks like many folks move

566
00:24:15,920 --> 00:24:16,660
to Rust.

567
00:24:17,360 --> 00:24:19,060
So what do you think?

568
00:24:20,600 --> 00:24:23,040
Lev: Well, it took me about a decade
to get good at it.

569
00:24:23,040 --> 00:24:26,760
So the sooner you start, the sooner
you'll get it.

570
00:24:26,760 --> 00:24:29,040
You know, it takes about 10 years
to be good at it.

571
00:24:29,040 --> 00:24:31,420
So at least that's what it took
me.

572
00:24:33,960 --> 00:24:34,740
That's okay.

573
00:24:34,960 --> 00:24:38,800
Again, if you start today, eventually
you'll get good at it and

574
00:24:38,800 --> 00:24:39,520
that's okay.

575
00:24:39,760 --> 00:24:42,540
It's a journey, you don't have
to learn it immediately.

576
00:24:42,720 --> 00:24:45,560
It's just, it's such a paradigm
shift in how things work.

577
00:24:45,580 --> 00:24:49,400
The compiler is very aggressive
about checking for things, especially

578
00:24:50,280 --> 00:24:54,480
concurrency errors, which is for
multithreaded asynchronous poolers,

579
00:24:54,480 --> 00:24:55,260
very important.

580
00:24:55,440 --> 00:24:59,240
I don't have concurrency bugs in
PgDog or PgCat because I'm using

581
00:24:59,240 --> 00:24:59,620
Rust.

582
00:24:59,620 --> 00:25:00,900
I don't have data races.

583
00:25:01,400 --> 00:25:02,980
And that's really important.

584
00:25:03,600 --> 00:25:06,700
And the number of bugs that I ship
is considerably lower because

585
00:25:06,700 --> 00:25:09,960
the compiler is like, hey, listen,
this variable, you're not

586
00:25:09,960 --> 00:25:10,580
using it.

587
00:25:10,580 --> 00:25:13,100
And I'm like, oh yeah, crap, I'm
actually importing the wrong

588
00:25:13,100 --> 00:25:13,660
variable here.

589
00:25:13,660 --> 00:25:14,840
I'm using the wrong variable.

590
00:25:14,920 --> 00:25:16,940
Good catch, because that was going
to be a bug.

591
00:25:17,040 --> 00:25:21,040
A lot of things that are just not
available in other languages,

592
00:25:21,220 --> 00:25:22,540
Rust makes really nice.

593
00:25:22,660 --> 00:25:26,460
So it's really worth it. Again, you
ask anyone who writes Rust,

594
00:25:26,460 --> 00:25:28,240
and they're like, that's the best thing
that ever happened to me since

595
00:25:28,240 --> 00:25:28,680
sliced bread.

596
00:25:28,680 --> 00:25:30,360
I'm like, that's true.

597
00:25:30,900 --> 00:25:33,040
I haven't thought about sliced
bread in a while, but Rust is

598
00:25:33,040 --> 00:25:33,540
great.

599
00:25:34,060 --> 00:25:34,560
Cool.

600
00:25:35,740 --> 00:25:38,980
Nikolay: Another side question
is license.

601
00:25:39,520 --> 00:25:40,460
I saw feedback.

602
00:25:41,380 --> 00:25:42,600
I agreed with that feedback.

603
00:25:43,140 --> 00:25:45,780
So, PgCat was, I think, on Apache
or MIT.

604
00:25:45,780 --> 00:25:49,400
I don't remember exactly, but kind
of permissive.

605
00:25:50,640 --> 00:25:53,980
And for PgDog, you chose AGPL,
right?

606
00:25:54,180 --> 00:25:57,220
Can you elaborate a little bit
why?

607
00:25:57,980 --> 00:25:59,360
Lev: Yeah, yeah, of course.

608
00:25:59,380 --> 00:26:00,360
Yeah, happy to.

609
00:26:00,620 --> 00:26:01,920
The answer is really simple.

610
00:26:02,940 --> 00:26:04,780
AGPL is actually pretty misunderstood.

611
00:26:05,380 --> 00:26:08,760
You can use that code and the application
anywhere you want and

612
00:26:08,760 --> 00:26:10,240
never have to share anything back.

613
00:26:10,240 --> 00:26:12,720
As long as you use it internally
and you don't provide, like,

614
00:26:12,720 --> 00:26:16,760
you don't use PgDog like publicly
as like a service for running

615
00:26:16,760 --> 00:26:17,260
PgDog.

616
00:26:17,440 --> 00:26:20,040
And even if you do that in that
case, all you have to do is just

617
00:26:20,080 --> 00:26:23,260
tell us like, what did you change
and send us patches, like what

618
00:26:23,260 --> 00:26:23,600
did you change?

619
00:26:23,600 --> 00:26:24,980
So like, it's pretty minimal.

620
00:26:25,420 --> 00:26:27,660
But it's a red flag for everyone
and that's okay.

621
00:26:27,660 --> 00:26:29,700
Yeah, I'm building a company around
it.

622
00:26:30,480 --> 00:26:34,920
Building a company around MIT code
base is, I think, probably

623
00:26:34,920 --> 00:26:35,420
possible.

624
00:26:35,460 --> 00:26:36,720
I've never done it successfully.

625
00:26:37,360 --> 00:26:40,780
Building a company around AGPL,
I think, has been done before.

626
00:26:41,160 --> 00:26:44,320
And I think it's probably fine.

627
00:26:45,040 --> 00:26:48,080
But if it becomes a hurdle, I'm
not like married to it.

628
00:26:48,080 --> 00:26:49,840
I just thought AGPL looks cool.

629
00:26:49,840 --> 00:26:51,380
I like the ideas behind it.

630
00:26:51,380 --> 00:26:53,260
I like free and open source code.

631
00:26:53,360 --> 00:26:57,660
I don't think MIT is necessarily
the original idea behind, you

632
00:26:57,660 --> 00:26:58,840
know, free and open source code.

633
00:26:58,840 --> 00:27:00,960
I like MIT because I don't have
to think about it.

634
00:27:01,060 --> 00:27:02,280
Nikolay: I'm checking PostgresML.

635
00:27:03,040 --> 00:27:07,260
PostgresML is MIT, right?

636
00:27:07,280 --> 00:27:07,780
Lev: Yeah.

637
00:27:08,040 --> 00:27:08,540
Nikolay: Codebase.

638
00:27:09,660 --> 00:27:10,760
Rust and MIT.

639
00:27:10,760 --> 00:27:13,420
So it's interesting how you decided
to change it.

640
00:27:14,920 --> 00:27:16,300
I agree it's misunderstood.

641
00:27:18,520 --> 00:27:20,380
But it's already so.

642
00:27:20,820 --> 00:27:23,940
The majority of people misunderstood
it and we cannot change

643
00:27:23,940 --> 00:27:25,700
it with a single project.

644
00:27:26,200 --> 00:27:27,540
So it's reality.

645
00:27:27,980 --> 00:27:29,380
Yeah, but you don't care if

646
00:27:29,380 --> 00:27:30,060
Lev: it's open.

647
00:27:30,060 --> 00:27:31,900
Nikolay: Okay, yeah, like you do
it.

648
00:27:32,420 --> 00:27:34,540
Lev: Yeah, yeah, because like if
somebody tells me like look

649
00:27:34,540 --> 00:27:37,360
I would love to use your code,
but AGPL is a deal-breaker, I'll

650
00:27:37,360 --> 00:27:39,100
be like well, we'll work something
out.

651
00:27:39,100 --> 00:27:40,900
You know, that's not a big deal.

652
00:27:40,900 --> 00:27:44,940
You know. But I think

653
00:27:45,060 --> 00:27:46,940
it's a good thing to have a good license.

654
00:27:47,040 --> 00:27:47,780
It's important.

655
00:27:48,600 --> 00:27:50,500
Michael: You mentioned starting a company around it.

656
00:27:50,500 --> 00:27:52,760
It strikes me it's going to be tough.

657
00:27:52,760 --> 00:27:57,220
Like, obviously, new codebase, the main use cases are at scale.

658
00:27:59,080 --> 00:28:01,720
But normally startups, like, the easiest way of getting started

659
00:28:01,720 --> 00:28:03,340
is serving smaller companies, right?

660
00:28:03,340 --> 00:28:05,860
Like it's harder, like going straight to the enterprise with

661
00:28:05,860 --> 00:28:09,720
something that's not yet, like what's the, you've got a plan

662
00:28:09,720 --> 00:28:11,960
though, it'd be great to hear like what's the plan?

663
00:28:11,960 --> 00:28:13,000
Lev: What's the plan?

664
00:28:13,140 --> 00:28:15,300
It's okay, don't freak out, it's gonna be okay.

665
00:28:16,780 --> 00:28:20,320
Yes, it's not actually that uncommon to have enterprise startup

666
00:28:20,320 --> 00:28:20,820
products.

667
00:28:21,620 --> 00:28:23,860
If the problem is interesting enough, there's always going to

668
00:28:23,860 --> 00:28:25,160
be somebody who's going to be like, oh, great.

669
00:28:25,160 --> 00:28:26,820
Somebody's working on it full time.

670
00:28:26,960 --> 00:28:27,900
That'll be amazing.

671
00:28:28,200 --> 00:28:31,060
At this early stage, how this works usually is I'm looking for

672
00:28:31,060 --> 00:28:31,840
design partners.

673
00:28:31,840 --> 00:28:34,720
So it's companies like Instacart who are like, hey, this is a

674
00:28:34,720 --> 00:28:35,460
great idea.

675
00:28:35,900 --> 00:28:36,880
We're gonna try it out.

676
00:28:36,880 --> 00:28:38,220
We're gonna develop it together.

677
00:28:38,860 --> 00:28:41,400
And at the end of the day, it's gonna be in production because

678
00:28:41,400 --> 00:28:42,460
we built it together.

679
00:28:42,740 --> 00:28:45,140
And that's actually good because you wanna build it with the

680
00:28:45,140 --> 00:28:45,400
users.

681
00:28:45,400 --> 00:28:47,720
Like you don't wanna build it by yourself like for several years

682
00:28:47,720 --> 00:28:49,640
and then show up and be like, hey, does anyone need Postgres

683
00:28:49,640 --> 00:28:50,140
sharding?

684
00:28:50,460 --> 00:28:54,820
And Nikolay's like, well, I don't know, maybe, maybe not, depends.

685
00:28:55,160 --> 00:28:58,320
So what I'm actively looking for right now, like codebase is

686
00:28:58,320 --> 00:28:58,820
okay.

687
00:28:59,240 --> 00:29:02,080
I'm sure there's bugs in it, performance issues, and that's totally

688
00:29:02,080 --> 00:29:02,580
fine.

689
00:29:02,860 --> 00:29:05,200
I'm just looking for people who'd be like, this is an interesting

690
00:29:05,200 --> 00:29:05,700
idea.

691
00:29:05,940 --> 00:29:07,800
I like the idea of Postgres sharding.

692
00:29:07,800 --> 00:29:09,600
I like the way it's done at the pooler.

693
00:29:09,600 --> 00:29:10,920
It's not done as an extension.

694
00:29:10,920 --> 00:29:13,780
It's not done as some kind of other thing that I can't even think

695
00:29:13,780 --> 00:29:14,100
of.

696
00:29:14,100 --> 00:29:15,060
I like the way it's done.

697
00:29:15,060 --> 00:29:18,280
So I'd like to try it out and help you finish the job.

698
00:29:18,340 --> 00:29:21,020
You know, by deploying it in production, by benchmarking it,

699
00:29:21,020 --> 00:29:23,760
by finding bugs, by reporting bugs, by even fixing bugs would

700
00:29:23,760 --> 00:29:25,620
be great, but not required.

701
00:29:26,400 --> 00:29:27,100
That's my job.

702
00:29:27,700 --> 00:29:31,780
Nikolay: I can confirm you're very
quick to react to requests.

703
00:29:32,500 --> 00:29:37,460
I remember looking at PgCat, I
had the idea that mirroring, let's

704
00:29:37,460 --> 00:29:41,200
have mirroring to have like a kind
of A-B testing, A-B performance

705
00:29:41,200 --> 00:29:42,840
testing, right in production.

706
00:29:43,520 --> 00:29:45,360
And you implemented it, it was
great.

707
00:29:45,700 --> 00:29:47,860
I think it was you, or no?

708
00:29:47,860 --> 00:29:50,380
Lev: Actually, to be perfectly
correct, it was actually Mostafa

709
00:29:50,380 --> 00:29:54,140
at Instacart who implemented it,
because he had a use case for

710
00:29:54,140 --> 00:29:54,440
it.

711
00:29:54,440 --> 00:29:56,260
He was my design partner.

712
00:29:56,280 --> 00:30:00,780
Nikolay: It's a very common request,
and the only problem, like,

713
00:30:00,860 --> 00:30:05,360
you need to have this pooler already
in production to use it.

714
00:30:05,500 --> 00:30:07,620
This is the trickiest part.

715
00:30:08,740 --> 00:30:09,660
Lev: It's 0 to 1.

716
00:30:09,660 --> 00:30:10,680
It's always tricky.

717
00:30:10,680 --> 00:30:14,020
New stuff, especially in the hot
path, is always going to be hard.

718
00:30:14,020 --> 00:30:17,180
But if the problem is there, and
if the problem is big enough,

719
00:30:18,400 --> 00:30:19,540
I'll find my champion.

720
00:30:19,540 --> 00:30:20,240
Yeah, exactly.

721
00:30:21,820 --> 00:30:24,680
Nikolay: So yeah, and you mentioned
PgDog.

722
00:30:25,080 --> 00:30:27,580
Maybe let's move back to technical
discussions.

723
00:30:28,080 --> 00:30:32,580
A little bit out of business and
license and so on.

724
00:30:32,980 --> 00:30:37,040
So first of all, it's not like
PgCat, it's not explicit sharding

725
00:30:37,040 --> 00:30:41,180
where you command with SQL comments
how to route.

726
00:30:42,340 --> 00:30:45,080
PgDog has automated routing,
right?

727
00:30:45,100 --> 00:30:51,600
And second thing, there's no Postgres
middleware for this.

728
00:30:51,660 --> 00:30:55,480
So it's just a pooler with routing.

729
00:30:56,400 --> 00:31:00,040
Can you explain architectural decisions
here and what do you

730
00:31:00,040 --> 00:31:03,360
use and what kind of components?

731
00:31:04,200 --> 00:31:08,600
You got a parser from Postgres
to understand queries.

732
00:31:09,140 --> 00:31:14,940
I'm very curious how you are going
to automatically route selects

733
00:31:15,220 --> 00:31:18,480
of functions which are writing,
for example, right?

734
00:31:19,120 --> 00:31:25,020
Or select for update, which you
cannot route to a physical standby,

735
00:31:25,320 --> 00:31:26,500
a replica, right?

736
00:31:26,920 --> 00:31:33,420
Or I don't know, something else,
like how you are going to, or

737
00:31:33,420 --> 00:31:36,240
you obviously will have some limitations,
already have some limitations,

738
00:31:36,280 --> 00:31:36,780
right?

739
00:31:37,060 --> 00:31:39,940
So can you talk about this a little
bit?

740
00:31:40,640 --> 00:31:41,400
Lev: Yeah, of course.

741
00:31:41,740 --> 00:31:43,260
Select for update is actually really
simple.

742
00:31:43,260 --> 00:31:45,360
That's a clear intent to write
something.

743
00:31:45,560 --> 00:31:47,720
So that's an easy 1, straight to
the primary.

744
00:31:48,260 --> 00:31:48,980
No problem.

745
00:31:49,020 --> 00:31:52,740
That's an easy 1, which I actually
should implement.

746
00:31:52,760 --> 00:31:55,080
Now that I'm thinking about it,
I'm routing it to the replica

747
00:31:55,080 --> 00:31:55,780
right now.

748
00:31:56,120 --> 00:31:57,140
Bug issue incoming.

749
00:31:57,360 --> 00:31:58,260
Thank you very much.

750
00:31:58,260 --> 00:31:59,080
Nikolay: You're welcome.

751
00:32:01,700 --> 00:32:03,920
Lev: The other 1 is the functions.

752
00:32:04,040 --> 00:32:06,820
Obviously it's impossible to know
if a function is writing or

753
00:32:06,820 --> 00:32:07,720
not by just looking at it.

754
00:32:07,720 --> 00:32:09,840
Even if you look at the code, you're
not going to know.

755
00:32:09,840 --> 00:32:13,060
So like static analysis, I don't
think is necessarily possible.

756
00:32:13,680 --> 00:32:15,920
So for that 1, I think it should
be pretty easy.

757
00:32:15,920 --> 00:32:18,240
You put it in the config, you have
a list of functions that actually

758
00:32:18,240 --> 00:32:18,440
write.

759
00:32:18,440 --> 00:32:20,180
Nikolay: This is what pgpool does,
right?

760
00:32:20,800 --> 00:32:21,740
Lev: I think so, yeah.

761
00:32:21,740 --> 00:32:22,620
I'm not sure.

762
00:32:24,520 --> 00:32:28,160
Nikolay: pgpool does everything,
so I'm sure they do it as well.

763
00:32:28,380 --> 00:32:28,880
Lev: Exactly.

764
00:32:29,340 --> 00:32:30,240
You have to be careful.

765
00:32:30,240 --> 00:32:31,240
You can't do everything.

766
00:32:31,600 --> 00:32:32,560
People don't believe.

767
00:32:34,540 --> 00:32:37,040
I know that approach is not perfect
because if you add a new

768
00:32:37,040 --> 00:32:39,140
function, you have to update the
config and you're always going

769
00:32:39,140 --> 00:32:41,340
to forget it, and you're always going
to have issues.

770
00:32:42,040 --> 00:32:43,940
So for that 1, I don't have a solution.

771
00:32:44,120 --> 00:32:46,360
My theory so far is that that is

772
00:32:46,500 --> 00:32:49,800
not as common as I'd like, but
I will probably be proven wrong

773
00:32:49,840 --> 00:32:51,920
and then we'll figure something
out probably.

774
00:32:52,120 --> 00:32:54,440
Some kind of migration process
that says, if you want to add

775
00:32:54,440 --> 00:32:57,180
a function that writes, send it
through the, like you should

776
00:32:57,180 --> 00:32:59,680
be writing migrations and sending
them through the

777
00:32:59,680 --> 00:33:01,720
pooler, not some kind of side channel.

778
00:33:01,720 --> 00:33:03,960
And you can probably mark that
function as like, hey, this function

779
00:33:03,960 --> 00:33:05,840
writes, like put in a comment or
something.

780
00:33:05,840 --> 00:33:08,260
And then PgDog is gonna be like,
great, good to know.

781
00:33:08,260 --> 00:33:10,460
You're gonna need persistent storage
for that kind of stuff,

782
00:33:10,460 --> 00:33:11,740
which you could probably implement.

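The routing rules discussed in this stretch, SELECT FOR UPDATE to the primary plus a configured list of write functions, could be sketched roughly like this. This is not PgDog's implementation; the substring checks stand in for a real SQL parser:

```python
def route(query: str, write_functions: set[str]) -> str:
    # Route a query to "primary" or "replica" using simple rules.
    q = query.strip().lower()
    if not q.startswith("select"):
        return "primary"   # INSERT/UPDATE/DELETE etc. must write
    if " for update" in q:
        return "primary"   # clear intent to write
    if any(fn + "(" in q for fn in write_functions):
        return "primary"   # function on the configured write list
    return "replica"
```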
783
00:33:11,760 --> 00:33:12,660
Nikolay: Yeah, I agree.

784
00:33:12,660 --> 00:33:14,740
Some dogs behave like cats sometimes.

785
00:33:15,800 --> 00:33:16,300
Lev: Exactly.

786
00:33:16,840 --> 00:33:17,660
Yeah, exactly.

787
00:33:17,660 --> 00:33:19,760
People will forget to put that
comment in and there's going to

788
00:33:19,760 --> 00:33:20,280
be issues.

789
00:33:20,280 --> 00:33:21,880
But that's just software, you know.

790
00:33:21,900 --> 00:33:24,260
When there's people involved, there's
always going to be...

791
00:33:24,920 --> 00:33:27,880
The more manual stuff you have
to do, the more problems there

792
00:33:27,880 --> 00:33:28,700
are going to be.

793
00:33:28,700 --> 00:33:31,320
But if you're writing a function
by hand in the first place,

794
00:33:31,800 --> 00:33:34,400
you know, that's just got to be
part of your review process.

795
00:33:35,800 --> 00:33:38,760
Michael: I was reading Hacker News
comments and somebody asked

796
00:33:38,760 --> 00:33:40,080
about aggregates as well.

797
00:33:40,080 --> 00:33:42,720
I think there's a limitation around
those at the moment.

798
00:33:42,720 --> 00:33:44,320
What's the story there?

799
00:33:44,700 --> 00:33:47,320
Lev: Well, when I posted the thing,
I didn't have support for

800
00:33:47,320 --> 00:33:50,140
aggregates at all, but then I'm
like, hey, you know what, let's

801
00:33:50,140 --> 00:33:50,580
add some.

802
00:33:50,580 --> 00:33:52,080
I've been thinking about it for months.

803
00:33:52,080 --> 00:33:53,940
I might as well just do a couple of simple ones.

804
00:33:53,940 --> 00:33:57,240
So I added, yeah, I added like count, simple one, you just sum

805
00:33:57,240 --> 00:33:58,680
the counts across all shards.

806
00:33:58,680 --> 00:34:00,660
Max/min, again, super simple.

807
00:34:00,660 --> 00:34:01,620
I added sum as well.

808
00:34:01,620 --> 00:34:03,360
Sum is really just sum everything.

809
00:34:03,820 --> 00:34:04,320
Nikolay: Yeah.

810
00:34:04,940 --> 00:34:05,440
Yeah.

811
00:34:05,600 --> 00:34:09,140
Some of the... Let's, let's make a small comment here.

812
00:34:09,140 --> 00:34:11,960
It's MapReduce basically, like an analogy for it.

813
00:34:11,960 --> 00:34:13,040
It's, it's cool, right?

814
00:34:13,040 --> 00:34:13,540
So.

815
00:34:13,940 --> 00:34:15,080
Lev: It is cool, right?

816
00:34:15,080 --> 00:34:16,040
This is cool.

817
00:34:16,100 --> 00:34:17,300
Like, you know, it's scale

818
00:34:17,300 --> 00:34:19,740
Nikolay: to billions, trillions and so on.

819
00:34:20,020 --> 00:34:20,520
Lev: Precisely.

820
00:34:20,580 --> 00:34:23,140
Yeah, this, like a MapReduce for Postgres, is phenomenal.

821
00:34:24,000 --> 00:34:26,820
I'm actually, okay, so I'm going to release a blog post in a

822
00:34:26,820 --> 00:34:27,320
couple of days.

823
00:34:27,320 --> 00:34:30,520
I'm actually sharding and doing MapReduce for pgvector.

824
00:34:30,520 --> 00:34:32,380
I don't know if you've heard about this one.

825
00:34:32,680 --> 00:34:33,260
Yeah, yeah.

826
00:34:33,260 --> 00:34:34,400
That one's really fun.

827
00:34:34,640 --> 00:34:35,740
pgvector is like a-

828
00:34:35,740 --> 00:34:39,060
Nikolay: No, no, pgvector we know, but what do you do with it?

829
00:34:40,080 --> 00:34:42,040
Lev: Well, you're going to have to wait till the blog post comes

830
00:34:42,040 --> 00:34:43,500
out, but it's really fun.

831
00:34:43,500 --> 00:34:46,400
I'm doing both MapReduce and like a machine learning algorithm

832
00:34:46,400 --> 00:34:48,080
to route queries in the cluster.

833
00:34:48,080 --> 00:34:50,800
Because like scaling, like searching vectors is a completely

834
00:34:50,800 --> 00:34:52,580
different problem than searching B-trees.

835
00:34:52,760 --> 00:34:57,180
So I don't know how many people will need that solution.

836
00:34:57,180 --> 00:35:01,300
Nikolay: Well, HNSW is terrible if you go beyond 1 million records.

837
00:35:01,520 --> 00:35:03,620
It's still a big problem, so...

838
00:35:04,200 --> 00:35:06,600
Lev: That's the number I had in my blog post as well.

839
00:35:06,600 --> 00:35:08,940
I don't know why 1 million just feels like the right number,

840
00:35:08,940 --> 00:35:09,960
but that's exactly what I said.

841
00:35:09,960 --> 00:35:12,500
I'd answer: over a million, you probably need to shard your

842
00:35:12,500 --> 00:35:13,380
pgvector index.

843
00:35:13,380 --> 00:35:16,040
Nikolay: Or use a different approach, yeah.

844
00:35:16,380 --> 00:35:16,880
Exactly.

845
00:35:17,040 --> 00:35:17,820
That's great.

846
00:35:18,080 --> 00:35:24,600
And MapReduce, someone said in the past that PL/Proxy by Skype,

847
00:35:24,860 --> 00:35:29,820
very old tech, also looked like MapReduce, but it required you

848
00:35:29,820 --> 00:35:33,880
to use only functions, which is a huge limitation, especially

849
00:35:33,900 --> 00:35:38,220
if you have ORM or GraphQL, it's a big showstopper.

850
00:35:38,940 --> 00:35:41,940
And also it was Postgres in the middle, right?

851
00:35:41,940 --> 00:35:43,680
For routing and for this MapReduce.

852
00:35:43,780 --> 00:35:48,360
But in your case, this is more
lightweight software in the middle,

853
00:35:48,580 --> 00:35:49,700
PgDog, right?

854
00:35:49,860 --> 00:35:54,180
And it does some simple, like,
arithmetic operations.

855
00:35:54,280 --> 00:35:54,780
Yeah.

856
00:35:55,080 --> 00:36:01,220
And do you plan to define some
interface for more advanced operations

857
00:36:01,980 --> 00:36:07,320
that the user could define, like,
beyond simple sum or count

858
00:36:07,320 --> 00:36:08,820
or other aggregates?

859
00:36:10,240 --> 00:36:14,880
Lev: I haven't worked much with
custom data types, UDFs and all

860
00:36:14,880 --> 00:36:17,240
that stuff, So that's going to
be a learning curve for me, I'm

861
00:36:17,240 --> 00:36:17,560
sure.

862
00:36:17,560 --> 00:36:20,280
I'm sure it's not that hard, but
like once you add custom functions,

863
00:36:20,280 --> 00:36:21,900
you need to add custom logic.

864
00:36:22,240 --> 00:36:24,960
I think that should be pretty straightforward
to implement if

865
00:36:24,960 --> 00:36:28,280
there's a synchronization between,
you know, the... this is working.

866
00:36:28,280 --> 00:36:31,760
Nikolay: This would give full-fledged
MapReduce capabilities

867
00:36:32,020 --> 00:36:33,480
to this, right?

868
00:36:33,660 --> 00:36:34,840
Lev: Yeah, absolutely.

869
00:36:34,840 --> 00:36:35,460
More open

870
00:36:35,460 --> 00:36:37,520
Nikolay: and interesting perspectives,
I suppose.

871
00:36:38,680 --> 00:36:39,380
Lev: Absolutely, yeah.

872
00:36:39,380 --> 00:36:43,140
If I find someone who thinks this
is cool as well, we could definitely

873
00:36:43,140 --> 00:36:43,940
build it together.

874
00:36:44,440 --> 00:36:47,120
Nikolay: Not only thinks it's cool,
but has some production to

875
00:36:47,120 --> 00:36:47,940
try, right?

876
00:36:47,980 --> 00:36:49,400
Because it's just- Absolutely.

877
00:36:49,600 --> 00:36:50,100
Exactly.

878
00:36:50,420 --> 00:36:51,900
Lev: Yes, absolutely, yeah.

879
00:36:51,900 --> 00:36:53,420
I think that would be pretty terrific.

880
00:36:54,120 --> 00:36:56,820
It will work pretty well, but yeah.

881
00:36:58,080 --> 00:37:01,100
So again, there's a lot of interesting
things about aggregates

882
00:37:01,120 --> 00:37:04,180
that, you know, for example, like
percentiles, like notoriously

883
00:37:04,760 --> 00:37:09,280
difficult, basically impossible
to solve, I think, at the sharding

884
00:37:09,280 --> 00:37:12,840
level, because you need to look
at the whole dataset to compute

885
00:37:12,840 --> 00:37:13,140
it.

886
00:37:13,140 --> 00:37:14,480
You can approximate it.

887
00:37:14,600 --> 00:37:18,340
Approximation functions should
be like a feature that we add

888
00:37:18,340 --> 00:37:20,740
to like PgDog that says like, you
know what, I don't care about

889
00:37:20,740 --> 00:37:25,120
the exact number, like average,
the simplest one, you could estimate

890
00:37:25,120 --> 00:37:25,460
it.

891
00:37:25,460 --> 00:37:25,960
I think

892
00:37:25,960 --> 00:37:29,860
Nikolay: for percentile, like
we could define some custom data

893
00:37:29,860 --> 00:37:35,220
type, right, to remember how many
members were analyzed and so

894
00:37:35,220 --> 00:37:38,800
on, like, I don't know, to bring
not just 1 number from each

895
00:37:38,800 --> 00:37:41,540
shard, but a couple of numbers
and then it would be possible

896
00:37:41,840 --> 00:37:44,280
to understand percentiles, maybe,
should be.

897
00:37:44,700 --> 00:37:45,180
Maybe.

898
00:37:45,180 --> 00:37:46,280
Not super difficult.

899
00:37:46,320 --> 00:37:50,520
Michael: Feels like a similar,
like HyperLogLog had some similar,

900
00:37:50,740 --> 00:37:54,280
like, I don't know how you do that
cross-shard, but it feels

901
00:37:54,280 --> 00:37:59,320
like there might be some secret
sauce in what they've done already

902
00:37:59,340 --> 00:38:01,120
that could be applied cross-shard.

903
00:38:01,980 --> 00:38:06,100
Lev: Yeah, HyperLogLog is like
a counter, basically, approximates

904
00:38:06,180 --> 00:38:07,860
how many members are in a set.

905
00:38:07,860 --> 00:38:10,080
Actually, there's an extension
for it in Postgres, which you

906
00:38:10,080 --> 00:38:10,668
can use.

907
00:38:10,668 --> 00:38:12,380
Yeah, it's pretty fun.

908
00:38:12,380 --> 00:38:16,080
But yeah, it's, I need a statistician,
like a data scientist

909
00:38:16,080 --> 00:38:16,400
Nikolay: to come

910
00:38:16,400 --> 00:38:19,260
Lev: and be like, all right, this
is how you approximate percentiles.

911
00:38:19,300 --> 00:38:20,520
And I'd be like, great.

912
00:38:20,840 --> 00:38:21,840
Do you know Rust?

913
00:38:22,000 --> 00:38:22,720
Good stuff.

914
00:38:22,720 --> 00:38:23,220
Yeah.

915
00:38:24,340 --> 00:38:25,580
Nikolay: What won't we do with Rust?

916
00:38:26,400 --> 00:38:27,680
Lev: Exactly, yeah.

917
00:38:28,280 --> 00:38:30,260
Michael: Well, so what is, like,
what's next?

918
00:38:30,260 --> 00:38:33,120
What are you looking, I guess it
depends on what people want,

919
00:38:33,120 --> 00:38:35,580
but what does tomorrow look like
or the next week?

920
00:38:35,860 --> 00:38:39,720
Lev: Well, you're gonna be shocked
to hear this, but I'm an engineer

921
00:38:39,720 --> 00:38:43,740
who does sales now. I literally
just, like... yeah, as I said, I'm

922
00:38:43,740 --> 00:38:46,620
building a company. So I'm literally
just sending as many, like,

923
00:38:46,620 --> 00:38:51,140
LinkedIn messages and emails to whoever
I can think of to find design

924
00:38:51,140 --> 00:38:54,060
partners for people to be like,
hey, I want to help.

925
00:38:54,620 --> 00:38:55,520
This problem exists.

926
00:38:55,520 --> 00:38:57,720
First of all, that's the first
feedback I need to get.

927
00:38:57,720 --> 00:39:00,040
Be like, I would like sharding,
I would like it solved, and I

928
00:39:00,040 --> 00:39:01,480
would like it to be solved your
way.

929
00:39:01,480 --> 00:39:02,360
That would be great.

930
00:39:02,360 --> 00:39:04,940
Or, you know, just tell me how
you'd like it to be solved.

931
00:39:04,940 --> 00:39:07,760
And if like there's an overlap,
and I think there should be overlap,

932
00:39:08,480 --> 00:39:09,240
solve it together.

933
00:39:09,240 --> 00:39:11,940
So that's what I'm doing mostly,
but you know I'm an engineer,

934
00:39:11,940 --> 00:39:14,800
so I need like a safe space
from all the social

935
00:39:14,800 --> 00:39:15,060
activities.

936
00:39:15,060 --> 00:39:16,080
So I still code.

937
00:39:16,080 --> 00:39:20,540
So that's why the pgvector sharding
is coming out because I

938
00:39:20,540 --> 00:39:22,960
needed to code a little bit and
I thought the idea would be cool.

939
00:39:22,960 --> 00:39:24,060
And I'm gonna keep doing that.

940
00:39:24,060 --> 00:39:27,820
I'm gonna keep adding these kind
of features, keep adding tests,

941
00:39:27,940 --> 00:39:32,780
benchmarks, fixing bugs, finding
more use cases, like SELECT

942
00:39:32,780 --> 00:39:34,260
FOR UPDATE that I forgot.

943
00:39:35,500 --> 00:39:37,160
But yeah, that's that's the plan.

944
00:39:37,540 --> 00:39:42,080
Nikolay: I have specific question
about rebalancing without downtime.

945
00:39:42,160 --> 00:39:42,660
Right.

946
00:39:42,740 --> 00:39:46,880
If one shard is too huge, others
are smaller.

947
00:39:46,880 --> 00:39:47,840
We need to rebalance.

948
00:39:47,840 --> 00:39:48,740
What do you think?

949
00:39:50,140 --> 00:39:55,140
Will this feature first of all
be inside open source core offering?

950
00:39:56,020 --> 00:40:00,560
Because we remember in Citus it
was not until Microsoft decision,

951
00:40:00,580 --> 00:40:04,400
as I understand, or like a Citus team

952
00:40:04,400 --> 00:40:06,660
decision to make everything open source.

953
00:40:06,860 --> 00:40:10,360
And a second question actually, triggered by this question:

954
00:40:10,360 --> 00:40:13,820
like, how do you compare PgDog to Citus?

955
00:40:14,140 --> 00:40:17,180
Lev: Yeah, the open source, I would like it to stay; I don't foresee

956
00:40:17,560 --> 00:40:19,340
myself writing closed source code.

957
00:40:19,640 --> 00:40:26,240
Maybe things around deployments and orchestrating the stuff in

958
00:40:26,240 --> 00:40:26,740
companies.

959
00:40:27,040 --> 00:40:31,480
My idea, we'll see if it's realistic, is to sell managed deployments

960
00:40:31,520 --> 00:40:34,160
of PgDog to on-prem deployments to companies.

961
00:40:34,160 --> 00:40:36,560
So probably the actual code that orchestrates that stuff

962
00:40:36,560 --> 00:40:39,840
will probably be proprietary, you know, mostly because I'm embarrassed

963
00:40:39,840 --> 00:40:41,100
how much bash I use.

964
00:40:41,260 --> 00:40:43,820
Nobody really wants to know how that sausage is made.

965
00:40:45,860 --> 00:40:48,260
Nikolay: Do you follow Google Bash code style?

966
00:40:49,120 --> 00:40:53,100
Lev: Yeah, somebody taught me to use curly braces for my bash

967
00:40:53,100 --> 00:40:55,360
variables, and ever since then I've been doing that religiously.

968
00:40:55,520 --> 00:40:57,760
So I learn new things every day.

969
00:40:58,180 --> 00:41:01,240
But I think the core will stay open source forever.

970
00:41:01,240 --> 00:41:02,120
I don't see...

971
00:41:02,560 --> 00:41:05,720
Even that data migration part, like that's a, that's a known,

972
00:41:05,920 --> 00:41:07,360
there's a known solution for it.

973
00:41:07,360 --> 00:41:09,640
There's no point of building a closed source solution that does

974
00:41:09,640 --> 00:41:10,240
the same thing.

975
00:41:10,240 --> 00:41:11,460
Like it's already been solved.

976
00:41:11,460 --> 00:41:12,540
So might as well just...

977
00:41:12,540 --> 00:41:14,020
Nikolay: It doesn't exist yet, right?

978
00:41:14,200 --> 00:41:16,220
This, this rebalancing feature.

979
00:41:16,780 --> 00:41:19,920
Lev: Well the rebalancing feature is basically, again, it depends

980
00:41:19,920 --> 00:41:20,800
on your sharding key.

981
00:41:20,800 --> 00:41:23,140
It depends how you store data on the shard.

982
00:41:23,320 --> 00:41:25,960
Like Instagram wrote that blog post a long time ago where they

983
00:41:25,960 --> 00:41:27,800
pre-partition everything into smaller shards.

984
00:41:27,800 --> 00:41:29,840
And that's how Citus does it underneath.

985
00:41:29,960 --> 00:41:31,920
Like, instead of, you see, if you say you want 3 shards,

986
00:41:31,920 --> 00:41:33,040
it's not going to build you 3 tables.

987
00:41:33,040 --> 00:41:35,689
It's going to build 128 tables and move them between the shards.

988
00:41:35,689 --> 00:41:37,740
And then when one shard gets- How?

989
00:41:37,740 --> 00:41:38,240
How?

990
00:41:39,340 --> 00:41:42,900
Well, logical replication became a thing in 10.

991
00:41:42,900 --> 00:41:45,740
So, Citus uses that to move things around.

992
00:41:45,840 --> 00:41:49,100
I think logical replication makes sense up to a point.

993
00:41:49,200 --> 00:41:51,140
You really have to catch the tables at the right time.

994
00:41:51,140 --> 00:41:53,160
Once they get a little bit too big, logical replication can't

995
00:41:53,160 --> 00:41:54,180
catch up anymore.

996
00:41:54,840 --> 00:41:57,940
So that's going to be an orchestration problem.

997
00:41:58,580 --> 00:41:59,440
I've seen logical kind of...

998
00:41:59,440 --> 00:41:59,860
You can

999
00:41:59,860 --> 00:42:04,620
Nikolay: partition it virtually, like what PeerDB did.

1000
00:42:04,700 --> 00:42:07,400
They implemented virtual partitioning, splitting.

1001
00:42:09,840 --> 00:42:12,840
If your primary key or partition
key, partition key in this case,

1002
00:42:13,280 --> 00:42:19,340
allows to define some ranges, you
can use multiple streams to

1003
00:42:19,340 --> 00:42:22,220
copy initially and then even to
have CDC.

1004
00:42:22,280 --> 00:42:24,500
So it's kind of interesting.

1005
00:42:24,520 --> 00:42:25,220
That's right.

1006
00:42:25,580 --> 00:42:26,380
So yeah.

1007
00:42:26,380 --> 00:42:26,880
Well,

1008
00:42:27,800 --> 00:42:30,400
Lev: it's funny that you mentioned
that because what is it, Postgres

1009
00:42:30,400 --> 00:42:33,400
16 allows us to now create logical
replication for

1010
00:42:33,400 --> 00:42:34,260
Nikolay: other tests.

1011
00:42:34,760 --> 00:42:35,900
Binary equals true.

1012
00:42:35,900 --> 00:42:36,840
Postgres 17.

1013
00:42:37,660 --> 00:42:38,160
17?

1014
00:42:38,300 --> 00:42:39,000
Or 16.

1015
00:42:39,520 --> 00:42:39,960
Maybe 16.

1016
00:42:39,960 --> 00:42:40,440
One of those.

1017
00:42:40,440 --> 00:42:41,400
Maybe you are right.

1018
00:42:41,400 --> 00:42:44,440
I just looked at the documentation
a few days ago and I already

1019
00:42:44,440 --> 00:42:44,940
forget.

1020
00:42:45,200 --> 00:42:50,300
Yeah, but still binary is good,
but it's the whole table.

1021
00:42:50,820 --> 00:42:51,820
Maybe we don't need it.

1022
00:42:51,820 --> 00:42:53,100
If we rebalance, we don't.

1023
00:42:53,100 --> 00:42:56,580
Well, in case you already split into
partitions, it's fine.

1024
00:42:56,580 --> 00:43:00,140
But if not, yeah, I'm very interested
to understand design decisions

1025
00:43:00,140 --> 00:43:00,640
here.

1026
00:43:01,680 --> 00:43:02,860
It's going to be interesting.

1027
00:43:02,860 --> 00:43:07,360
Because I think for sharding at
large scale, this is one of the

1028
00:43:07,360 --> 00:43:09,180
key features to understand this.

1029
00:43:09,960 --> 00:43:10,360
Lev: Yeah.

1030
00:43:10,360 --> 00:43:13,520
Well, so when you started from
like one big database to shard

1031
00:43:13,520 --> 00:43:18,420
it into like 12, what we did at
Instacart was we created a replication

1032
00:43:18,540 --> 00:43:22,200
slot, we snapshotted the database,
we restored it 12 different

1033
00:43:22,200 --> 00:43:26,020
times, deleted the data that's
not part of the shard, synchronized

1034
00:43:26,040 --> 00:43:28,480
it with logical replication and
launched.

1035
00:43:28,700 --> 00:43:31,880
So deleting data is faster than
writing it from-

1036
00:43:31,880 --> 00:43:34,780
Nikolay: Copy it as physical replica
first, right?

1037
00:43:36,580 --> 00:43:40,120
Or copy it logically, like provisioning
logical replica, basically

1038
00:43:40,120 --> 00:43:42,080
dump restore before binary.

1039
00:43:42,180 --> 00:43:43,120
Lev: No dump restore.

1040
00:43:43,260 --> 00:43:46,160
So RDS, we were on RDS, so EBS.

1041
00:43:46,160 --> 00:43:48,820
Nikolay: This leads us to this
discussion about upgrades, because

1042
00:43:48,820 --> 00:43:50,660
recovery_target_lsn doesn't exist.

1043
00:43:50,660 --> 00:43:53,320
How, like, let's follow up on this.

1044
00:43:53,320 --> 00:43:56,860
Like, this is good bridge to 0
downtime upgrades on RDS.

1045
00:43:57,880 --> 00:43:58,860
Lev: Yeah, those are fine.

1046
00:43:58,860 --> 00:44:01,500
Oh yeah, so the second question,
yeah, Citus.

1047
00:44:02,780 --> 00:44:06,000
So the difference is mostly philosophical
in the architecture.

1048
00:44:06,100 --> 00:44:09,380
Citus runs inside the database,
which limits it in 2 ways.

1049
00:44:09,380 --> 00:44:12,740
First of all, the database host
has to allow you to run Citus.

1050
00:44:15,720 --> 00:44:18,820
Maybe because of AGPL, maybe because
they just don't want you

1051
00:44:18,820 --> 00:44:20,460
competing with their internal products.

1052
00:44:20,540 --> 00:44:23,500
Again, you know, it's business,
it's all fair game.

1053
00:44:23,940 --> 00:44:26,260
And then the second one is performance.

1054
00:44:26,280 --> 00:44:28,820
When you run something inside Postgres
that needs to be massively

1055
00:44:28,820 --> 00:44:30,920
parallel, you know, you're limited
by the number of processes

1056
00:44:30,920 --> 00:44:33,800
you can spawn and by the number of
connections you can serve.

1057
00:44:33,800 --> 00:44:36,760
So PgDog is asynchronous, Tokio,
Rust, lightweight.

1058
00:44:36,820 --> 00:44:37,720
It's not even threaded.

1059
00:44:37,720 --> 00:44:39,740
I mean, it's multi-threaded, but
it's mostly like asynchronous,

1060
00:44:39,780 --> 00:44:41,020
like task-based runtime.

1061
00:44:41,140 --> 00:44:44,800
So you can connect, I mean, I'm
gonna pull up a big number here

1062
00:44:44,800 --> 00:44:47,200
just for, you know, for effect
that you could have like a million

1063
00:44:47,200 --> 00:44:50,380
connections going to PgDog from
a single machine and that technically

1064
00:44:50,380 --> 00:44:52,620
should work because it's using
epoll underneath.

1065
00:44:54,160 --> 00:44:56,760
But for Postgres, you could probably
do like, you know.

1066
00:44:56,760 --> 00:44:59,440
Nikolay: You need PgBouncer or
something in front of

1067
00:44:59,440 --> 00:44:59,940
Lev: it.

1068
00:45:00,100 --> 00:45:02,720
You need PgBouncer and you need
most of those connections to

1069
00:45:02,720 --> 00:45:05,660
be idle because concurrency-wise
Postgres can only do maybe like,

1070
00:45:05,660 --> 00:45:06,800
you know, 2 per core.

1071
00:45:07,280 --> 00:45:08,580
That's the myth at least.

1072
00:45:09,160 --> 00:45:13,520
So again, like Citus, single machine,
they have some kind of

1073
00:45:13,520 --> 00:45:17,500
support for multiple coordinators,
but I think the readme just

1074
00:45:17,500 --> 00:45:19,140
says like, please contact us.

1075
00:45:21,220 --> 00:45:24,340
Nikolay: In any case, you're in the
hands of Microsoft, you need

1076
00:45:24,340 --> 00:45:27,540
to go to Azure or you need to
self-host everything.

1077
00:45:27,740 --> 00:45:30,780
You cannot use it on RDS because
extensions are required.

1078
00:45:30,780 --> 00:45:32,740
In your case, no extensions required.

1079
00:45:32,960 --> 00:45:34,020
This is the key.

1080
00:45:35,140 --> 00:45:38,100
You can run it on RDS because the
extensions are not needed.

1081
00:45:39,520 --> 00:45:39,960
Exactly.

1082
00:45:39,960 --> 00:45:42,180
So you can run it on any Postgres.

1083
00:45:42,700 --> 00:45:48,680
I see a lot of guys trying
to develop an extension ecosystem.

1084
00:45:48,920 --> 00:45:57,340
Over time I became a big opponent
of the extensions idea because

1085
00:45:57,580 --> 00:46:00,560
we have a lot of managed services
and if you develop extension

1086
00:46:00,560 --> 00:46:04,620
it takes a lot of time to bring
that extension to a managed provider.

1087
00:46:05,580 --> 00:46:06,680
It takes years.

1088
00:46:07,080 --> 00:46:10,760
So if you can do something without
extensions, it might be better

1089
00:46:10,760 --> 00:46:11,580
in some cases.

1090
00:46:11,580 --> 00:46:14,180
And sharding maybe is such a case.

1091
00:46:16,160 --> 00:46:16,800
Lev: No, absolutely.

1092
00:46:16,800 --> 00:46:16,980
Yeah.

1093
00:46:16,980 --> 00:46:19,920
Because if you develop an extension
that gets installed by RDS,

1094
00:46:19,920 --> 00:46:22,420
like, I don't know if they're gonna
pay

1095
00:46:22,420 --> 00:46:22,900
Nikolay: you for that.

1096
00:46:22,900 --> 00:46:24,020
Approve it and support it.

1097
00:46:24,020 --> 00:46:26,620
Lev: Approve it or support it or
all that stuff.

1098
00:46:26,720 --> 00:46:29,040
RDS notoriously upgrades like once
a year.

1099
00:46:29,040 --> 00:46:31,780
Nikolay: We need to wrap up soon,
but like a few words about

1100
00:46:31,780 --> 00:46:32,280
encryption.

1101
00:46:32,780 --> 00:46:36,240
Does PgDog support encryption now
already?

1102
00:46:36,300 --> 00:46:38,040
Because it's super important for...

1103
00:46:38,040 --> 00:46:39,740
And it can be a bottleneck.

1104
00:46:39,780 --> 00:46:40,520
I know...

1105
00:46:42,500 --> 00:46:46,420
Odyssey connection pooler was created
because PgBouncer needed

1106
00:46:46,420 --> 00:46:49,820
2 layers of PgBouncers to handle
a lot of...

1107
00:46:49,820 --> 00:46:52,940
Yeah, this is what several companies
did in the past, two layers

1108
00:46:52,940 --> 00:46:56,020
of PgBouncers because of the TLS handshake.

1109
00:46:56,760 --> 00:46:57,840
So, okay.

1110
00:46:57,980 --> 00:46:58,820
Tell us about the handshake.

1111
00:46:59,540 --> 00:47:02,120
What, what, what's in this area
encryption?

1112
00:47:02,880 --> 00:47:03,140
Lev: Sure.

1113
00:47:03,140 --> 00:47:03,400
Yeah.

1114
00:47:03,400 --> 00:47:04,260
It supports TLS.

1115
00:47:04,260 --> 00:47:09,960
It's using a library, Tokio, it's
like one of the Rust libraries

1116
00:47:09,960 --> 00:47:11,060
that implements TLS.

1117
00:47:11,100 --> 00:47:12,900
It's completely fine, you can use
TLS.

1118
00:47:12,980 --> 00:47:16,560
And my personal favorite, I finally
found a library that implements

1119
00:47:16,560 --> 00:47:18,660
the SCRAM SHA-256 authentication.

1120
00:47:18,840 --> 00:47:20,420
So now that's finally supported.

1121
00:47:20,840 --> 00:47:24,160
Nikolay: Yeah, I saw MD5 is going
to be deprecated in the next

1122
00:47:24,160 --> 00:47:25,780
few years in Postgres, so...

1123
00:47:26,120 --> 00:47:27,340
Lev: It still comes up.

1124
00:47:27,340 --> 00:47:30,800
I mean, it's been deprecated for
10 years, and people still use

1125
00:47:30,800 --> 00:47:32,580
it just because it's really simple
to implement.

1126
00:47:32,800 --> 00:47:34,840
And SCRAM is really hard.

1127
00:47:35,460 --> 00:47:37,620
Nikolay: And one more small question
about...

1128
00:47:38,560 --> 00:47:40,220
Yes, about prepared statements.

1129
00:47:40,840 --> 00:47:43,280
I know PgCat supported them, right?

1130
00:47:43,660 --> 00:47:45,660
In transaction pool mode, right?

1131
00:47:46,160 --> 00:47:49,020
Lev: Yeah, the implementation wasn't
great, but it does support

1132
00:47:49,020 --> 00:47:49,520
it.

1133
00:47:49,540 --> 00:47:50,920
Yeah, PgDog supports them too.

1134
00:47:50,920 --> 00:47:52,660
Much better implementation this
time.

1135
00:47:52,660 --> 00:47:54,740
Nikolay: Okay, cool.

1136
00:47:54,740 --> 00:47:57,160
Well, I'm not out of questions.

1137
00:47:57,160 --> 00:47:58,840
I'm out of time.

1138
00:47:58,860 --> 00:47:59,360
Yes!

1139
00:47:59,380 --> 00:48:02,420
But it was absolutely interesting.

1140
00:48:02,620 --> 00:48:03,620
Thank you so much.

1141
00:48:04,020 --> 00:48:07,240
I'm definitely rooting and going
to follow the project.

1142
00:48:07,240 --> 00:48:09,440
Best of luck to you and your new company.

1143
00:48:10,680 --> 00:48:12,880
Maybe Michael has some questions
additionally.

1144
00:48:14,440 --> 00:48:17,720
I took the microphone for too long
this time.

1145
00:48:18,260 --> 00:48:18,580
Apologies.

1146
00:48:18,580 --> 00:48:21,100
Michael: Well, Lev, is there anything
we should have asked that

1147
00:48:21,100 --> 00:48:21,760
we didn't?

1148
00:48:21,900 --> 00:48:25,240
Lev: Oh, no you guys are, I think
you covered it.

1149
00:48:25,240 --> 00:48:27,980
Michael: Well, really nice to meet
you, thanks so much for joining

1150
00:48:27,980 --> 00:48:30,140
us, and yeah, catch you next week, Nikolay.

1151
00:48:30,240 --> 00:48:32,480
Nikolay: Thank you so much. Bye
bye.

1152
00:48:32,540 --> 00:48:33,220
Thank you.