1
00:00:00,060 --> 00:00:02,440
Michael: Hello and welcome to Postgres.FM, a weekly show about

2
00:00:02,440 --> 00:00:03,300
all things PostgreSQL.

3
00:00:03,480 --> 00:00:05,740
I am Michael, founder of pgMustard, and I'm joined as usual

4
00:00:05,740 --> 00:00:08,100
by my co-host Nikolay, founder of Postgres.AI.

5
00:00:08,100 --> 00:00:08,800
Hey Nikolay.

6
00:00:09,340 --> 00:00:10,460
Nikolay: Hi Michael, hi.

7
00:00:11,000 --> 00:00:14,360
Michael: And today we are joined by special guest Gwen Shapira,

8
00:00:14,380 --> 00:00:17,440
co-founder and chief product officer at Nile, to talk all things

9
00:00:17,440 --> 00:00:17,940
multi-tenancy.

10
00:00:18,600 --> 00:00:20,580
Hello Gwen, thank you for joining us.

11
00:00:21,020 --> 00:00:23,600
Gwen: Thank you for having me, it's very exciting for me to be

12
00:00:23,600 --> 00:00:24,480
in your show.

13
00:00:24,760 --> 00:00:26,940
Michael: Oh, we're excited to have you, thank you.

14
00:00:27,240 --> 00:00:30,860
So to start perhaps you could give us a little bit of background

15
00:00:30,860 --> 00:00:33,940
and what got you interested in this topic of multi-tenancy

16
00:00:34,040 --> 00:00:34,700
in general?

17
00:00:34,960 --> 00:00:40,720
Gwen: As many things it started with an incident but this time

18
00:00:40,720 --> 00:00:46,440
not actually one of mine, so when my co-founder and I started Nile

19
00:00:46,440 --> 00:00:48,660
we actually started with a very different idea.

20
00:00:49,440 --> 00:00:52,900
After about 9 months, we were like, this idea is not working

21
00:00:52,900 --> 00:00:53,460
very well.

22
00:00:53,460 --> 00:00:56,520
We developed some things, we're not finding the market we hoped

23
00:00:56,520 --> 00:00:57,180
to find.

24
00:00:57,560 --> 00:01:02,580
We were sitting in the Hacker Space over in Mountain View, and

25
00:01:02,580 --> 00:01:05,500
we're like, what did we learn?

26
00:01:05,500 --> 00:01:09,560
We talked to 200 companies all doing SaaS.

27
00:01:09,640 --> 00:01:12,240
What have we found out as a result?

28
00:01:13,520 --> 00:01:20,040
And the thing that called out to us is that very early on, basically

29
00:01:20,340 --> 00:01:24,020
when the first lines of code are getting written, you have to

30
00:01:24,020 --> 00:01:25,820
choose a multi-tenancy model.

31
00:01:26,820 --> 00:01:32,120
And then about 2, 3 or 4 years later, depending on how fast you're

32
00:01:32,120 --> 00:01:35,100
growing, you have to change it.

33
00:01:36,380 --> 00:01:41,760
And we heard a lot of stories on what caused people to change

34
00:01:41,760 --> 00:01:46,220
it and whether they regret earlier choices or they're like, we

35
00:01:46,220 --> 00:01:50,020
didn't know better and how things went for them.

36
00:01:50,020 --> 00:01:55,200
And then we started looking in different blogs and we found so

37
00:01:55,200 --> 00:02:02,800
many by very famous companies with either incidents where something

38
00:02:02,800 --> 00:02:07,540
that was done to a single tenant caused a whole chain of events

39
00:02:07,540 --> 00:02:11,260
that took down their entire system, sometimes for days.

40
00:02:13,320 --> 00:02:18,100
And also a lot of slightly better stories, how we sharded our

41
00:02:18,280 --> 00:02:19,780
highly multi-tenant database.

42
00:02:20,740 --> 00:02:26,600
And we found story after story after story on how people had

43
00:02:26,600 --> 00:02:29,620
to re-architect their entire database, which is, as you guys

44
00:02:29,620 --> 00:02:33,480
know, extremely painful to do after you're a successful company

45
00:02:33,480 --> 00:02:34,580
3 years in.

46
00:02:35,380 --> 00:02:37,780
And we're like, this is a good problem.

47
00:02:38,460 --> 00:02:39,640
So many people have it.

48
00:02:39,640 --> 00:02:40,740
It is so common.

49
00:02:41,040 --> 00:02:46,640
My past was in databases, not as much Postgres, more Oracle and

50
00:02:46,640 --> 00:02:47,140
MySQL.

51
00:02:47,900 --> 00:02:52,340
But I've seen this problem again
and again in all kinds of companies.

52
00:02:52,680 --> 00:02:56,280
My co-founder has seen this problem
again and again in all kinds

53
00:02:56,280 --> 00:02:57,240
of companies.

54
00:02:57,660 --> 00:02:59,040
This is such a good problem.

55
00:02:59,040 --> 00:03:01,220
Everyone has it and nobody's working
on it.

56
00:03:01,220 --> 00:03:03,140
Why is nobody working on it?

57
00:03:04,340 --> 00:03:07,220
And that's how we got into it.

58
00:03:07,540 --> 00:03:08,260
Michael: Yeah, nice.

59
00:03:08,300 --> 00:03:12,320
Should we go back then in terms
of how do you like to describe

60
00:03:12,360 --> 00:03:15,180
the different models or the different
options that people have

61
00:03:15,180 --> 00:03:17,240
in the early stages?

62
00:03:18,360 --> 00:03:23,420
Gwen: Yeah, so everyone basically
starts with 1 out of 2.

63
00:03:23,420 --> 00:03:28,080
And I'm using AWS terminology,
even though there is other terminology

64
00:03:28,200 --> 00:03:29,720
that people apply to it.

65
00:03:30,160 --> 00:03:36,680
AWS calls it the pooled model versus
isolated model.

66
00:03:37,720 --> 00:03:43,980
And in a pooled model, you basically
create your tables as normal,

67
00:03:44,240 --> 00:03:49,520
and then add a tenant ID Column
to each and every 1 of your tables,

68
00:03:49,540 --> 00:03:50,060
pretty much.

69
00:03:50,060 --> 00:03:53,720
Some, maybe not everyone, some
have shared data, but most of

70
00:03:53,720 --> 00:03:57,080
your tables are going to end up
with a tenant ID Column that

71
00:03:57,080 --> 00:04:01,100
tells you which tenant this Row
belongs to.

72
00:04:01,980 --> 00:04:07,040
Very easy when you start out and
all you have to do is sprinkle

73
00:04:07,060 --> 00:04:10,920
some WHERE clauses every now and
then and you're pretty much good

74
00:04:10,920 --> 00:04:11,320
to go.

75
00:04:11,320 --> 00:04:13,120
How hard can it possibly be?
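The pooled model Gwen describes is easy to sketch. A minimal illustration, using Python's built-in sqlite3 as a stand-in for Postgres — the table, column, and tenant names here are invented for the example:

```python
import sqlite3

# Pooled model: one shared table, every row tagged with its tenant.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE invoices (
        tenant_id TEXT NOT NULL,  -- every table carries the tenant ID
        id        INTEGER,
        amount    INTEGER
    )
""")
conn.executemany(
    "INSERT INTO invoices VALUES (?, ?, ?)",
    [("acme", 1, 100), ("acme", 2, 250), ("globex", 1, 75)],
)

# Every query must remember the WHERE tenant_id clause --
# forgetting it is exactly how data leaks between tenants.
rows = conn.execute(
    "SELECT id, amount FROM invoices WHERE tenant_id = ?", ("acme",)
).fetchall()
print(rows)
```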

76
00:04:13,700 --> 00:04:18,220
The places where it gets you is
that you have no isolation.

77
00:04:18,420 --> 00:04:20,820
Everyone is in 1 big pool.

78
00:04:21,600 --> 00:04:27,660
And if you have a problem tenant,
if someone grows really, really

79
00:04:27,660 --> 00:04:32,360
large and suddenly queries start
getting slow for them,

80
00:04:33,060 --> 00:04:36,920
If you need to do an upgrade and
1 Customer absolutely refuses

81
00:04:37,160 --> 00:04:41,100
to accept changes or needs their
own time window.

82
00:04:41,680 --> 00:04:45,040
There is a lot of different ways
you may discover that by putting

83
00:04:45,040 --> 00:04:48,900
all your Customers in 1 big pool,
you save a lot of effort, you

84
00:04:48,900 --> 00:04:49,920
save a lot of money.

85
00:04:49,920 --> 00:04:52,700
This is by far the cheapest option
you're going to have.

86
00:04:53,260 --> 00:04:55,300
It's shared resources in 1 Database.

87
00:04:56,640 --> 00:05:00,940
But you are not allowing yourself
to do anything specific for

88
00:05:00,940 --> 00:05:03,280
any 1 Customer should they need
it.

89
00:05:04,120 --> 00:05:07,120
The other approach is basically
the reverse.

90
00:05:07,900 --> 00:05:11,740
It gives every Customer its own
Database or sometimes its own

91
00:05:11,740 --> 00:05:12,240
schema.

92
00:05:12,380 --> 00:05:16,080
This still counts as isolated,
even though it's not all that

93
00:05:16,080 --> 00:05:16,580
isolated.

94
00:05:16,800 --> 00:05:19,500
You share quite a lot in that scenario.

95
00:05:19,960 --> 00:05:21,360
But the schema is separate.

96
00:05:22,100 --> 00:05:26,660
And it's quite a bit easier to
move them out if needed, if the

97
00:05:26,660 --> 00:05:27,680
schema is separated.
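The schema-per-tenant flavor of the isolated model can be sketched as generated DDL plus a per-connection search_path. The tenant_&lt;name&gt; naming convention below is made up for illustration:

```python
# Schema-per-tenant sketch: each tenant gets its own schema with its
# own copy of the tables; a connection points search_path at one tenant.
def tenant_schema_ddl(tenant: str) -> list[str]:
    schema = f"tenant_{tenant}"
    return [
        f"CREATE SCHEMA {schema}",
        f"CREATE TABLE {schema}.invoices (id int, amount int)",
    ]

def set_search_path(tenant: str) -> str:
    # Unqualified table names then resolve to this tenant's schema only.
    return f"SET search_path TO tenant_{tenant}"

print(tenant_schema_ddl("acme")[0])
print(set_search_path("acme"))
```

This also shows why moving one tenant out is easier here than in the pooled model: all of a tenant's objects live under one schema name.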

98
00:05:28,780 --> 00:05:32,680
And in this scenario, first of
all, there is a nice benefit that

99
00:05:32,680 --> 00:05:35,660
you get help from a lot of popular
frameworks.

100
00:05:36,160 --> 00:05:40,620
I think Ruby has an, I think it's
called apartment plugin for

101
00:05:40,800 --> 00:05:43,040
having this kind of multi-tenancy
model.

102
00:05:43,380 --> 00:05:44,440
Django has something.

103
00:05:44,440 --> 00:05:47,660
So a lot of very popular frameworks
have something that helps

104
00:05:47,660 --> 00:05:48,740
you in that model.

105
00:05:49,400 --> 00:05:54,460
But if you accidentally grow to
a large number of databases,

106
00:05:54,800 --> 00:05:56,260
it starts being very painful.

107
00:05:56,280 --> 00:06:00,280
Obviously, a database with a large
number of objects is no longer

108
00:06:00,280 --> 00:06:01,880
as easy to work with.

109
00:06:02,320 --> 00:06:06,480
Suddenly, you start learning how
much space and memory the catalog

110
00:06:06,480 --> 00:06:08,640
can really take when you have connections.

111
00:06:09,320 --> 00:06:12,860
Suddenly you start learning that
pg_dump can take a very long

112
00:06:12,860 --> 00:06:13,360
time.

113
00:06:13,940 --> 00:06:19,040
If you actually have a database
for each tenant, then doing any

114
00:06:19,040 --> 00:06:24,120
kind of maintenance on 100 databases
is already not fun.

115
00:06:24,200 --> 00:06:28,000
If you end up going into a thousand
databases, it's really not

116
00:06:28,000 --> 00:06:28,500
fun.

117
00:06:29,040 --> 00:06:33,300
And if you think about it, a lot
of SaaS have customers in the

118
00:06:33,380 --> 00:06:35,960
hundreds of thousands, not just
thousand.

119
00:06:36,580 --> 00:06:41,940
So it becomes very painful exactly
when you grow.

120
00:06:43,260 --> 00:06:43,520
Nikolay: Yeah.

121
00:06:43,520 --> 00:06:47,280
For example, upgrade includes dump/restore of the schema.

122
00:06:47,640 --> 00:06:52,620
And we've had cases with 250,000
tables, 2 million indexes.

123
00:06:52,960 --> 00:06:55,700
Well, indexes are not dumped there,
but tables are.

124
00:06:55,760 --> 00:06:56,260
Gwen: Exactly.

125
00:06:56,320 --> 00:06:58,300
This is worse than no dump, right?

126
00:06:58,580 --> 00:07:00,260
If only we could dump indexes.

127
00:07:00,660 --> 00:07:03,500
Nikolay: Yeah, and then you need
to update statistics after upgrade

128
00:07:03,580 --> 00:07:04,900
for all of those tables.

129
00:07:05,280 --> 00:07:06,480
It's a nightmare, honestly.

130
00:07:07,200 --> 00:07:07,580
Gwen: Yeah.

131
00:07:07,580 --> 00:07:10,640
On the other hand, at least you
get to do it to a customer at

132
00:07:10,640 --> 00:07:11,120
a time.

133
00:07:11,120 --> 00:07:14,020
Imagine that everyone is in 1 really
big database and now you

134
00:07:14,020 --> 00:07:15,900
have to upgrade all of them together.

135
00:07:16,700 --> 00:07:17,200
Nikolay: Yeah.

136
00:07:18,380 --> 00:07:18,880
Yeah.

137
00:07:19,200 --> 00:07:20,920
It's painful, all this.

138
00:07:21,860 --> 00:07:22,300
Gwen: Yeah.

139
00:07:22,300 --> 00:07:27,180
And usually, and then there is
the mixed model where I think

140
00:07:27,180 --> 00:07:32,660
pretty much everyone ends up with
where you basically have, you

141
00:07:32,660 --> 00:07:37,740
start with the pool model, and
as you grow, you shard it, but

142
00:07:37,960 --> 00:07:41,160
you're actually pretty smart, and
you realize that not all customers

143
00:07:41,160 --> 00:07:45,640
are of the same size, and you can
have some dedicated shards

144
00:07:45,660 --> 00:07:50,040
for your biggest or most sensitive
or most demanding customers.

145
00:07:51,460 --> 00:07:56,680
And this is, I think, if you look
5 years into the life of a

146
00:07:56,680 --> 00:08:00,560
company, I would say that this
is the dominant model.

147
00:08:00,920 --> 00:08:07,660
Some variation of we have shared
databases with pool and then

148
00:08:07,660 --> 00:08:13,720
some dedicated databases with specific
customers.

149
00:08:15,040 --> 00:08:16,560
Nikolay: What is the problem we're
trying to solve?

150
00:08:16,560 --> 00:08:18,860
Is it security or performance or
both?

151
00:08:19,280 --> 00:08:25,160
And if we go back to some customers,
they share this pool, it

152
00:08:25,160 --> 00:08:26,840
affects security goal, right?

153
00:08:26,840 --> 00:08:28,680
So we don't achieve it.

154
00:08:28,860 --> 00:08:29,360
Gwen: Absolutely.

155
00:08:29,380 --> 00:08:32,420
So this is 1, First of all, this
is 1 driver that people actually

156
00:08:32,420 --> 00:08:34,140
start with the isolated model.

157
00:08:34,280 --> 00:08:37,680
They know that they're going into
a sensitive area.

158
00:08:37,740 --> 00:08:42,480
They're focusing on SaaS for healthcare,
SaaS for finance.

159
00:08:43,660 --> 00:08:50,440
Those companies definitely start
with isolated model and try

160
00:08:50,440 --> 00:08:53,500
to figure out how to manage large
number of databases.

161
00:08:53,940 --> 00:08:57,320
A lot of times those companies
don't become huge.

162
00:08:57,440 --> 00:09:04,400
There are only so many hospitals in
the United States, but they still

163
00:09:04,400 --> 00:09:08,800
have to build all the tooling to
manage a large number of isolated

164
00:09:08,820 --> 00:09:09,320
databases.

165
00:09:10,840 --> 00:09:14,440
For other companies, it's more
complicated.

166
00:09:15,060 --> 00:09:19,700
I would say maybe 70% of the time,
the reason for eventually

167
00:09:19,920 --> 00:09:23,900
moving customers out and sharding
and isolating would be performance.

168
00:09:25,080 --> 00:09:29,200
It's amazing how many performance
problems can be solved by just

169
00:09:29,200 --> 00:09:31,260
having less data in each database.

170
00:09:34,300 --> 00:09:41,340
On the other side, there are the
stories where 2 years in, suddenly

171
00:09:41,520 --> 00:09:45,560
a very sensitive customer shows
up or you want to sell into,

172
00:09:46,080 --> 00:09:52,000
like you thought you were building
a normal CRM or some kind

173
00:09:52,080 --> 00:09:57,940
of a RAG database, but then a healthcare
company shows up, a

174
00:09:57,940 --> 00:10:02,840
bank shows up, or even worse, a
government shows up, and they

175
00:10:02,840 --> 00:10:09,620
show up with a list of demands
and since they usually have good

176
00:10:09,620 --> 00:10:13,300
amounts of money to back those
demands there is a lot of incentive

177
00:10:13,660 --> 00:10:16,220
to figure out a solution for them.

178
00:10:17,020 --> 00:10:17,520
Michael: Nice.

179
00:10:17,860 --> 00:10:22,920
We have 1 kind of ugly duckling
in the Postgres world that I'm

180
00:10:22,920 --> 00:10:24,960
not sure quite fits either of these
models.

181
00:10:24,960 --> 00:10:28,400
I wonder if it's worth discussing
row level security briefly,

182
00:10:28,580 --> 00:10:32,280
because if I was to bucket it based
on those definitions, it's

183
00:10:32,320 --> 00:10:35,640
kind of the pooled model in a way
because all of the data is

184
00:10:35,640 --> 00:10:38,340
together but there is some isolation

185
00:10:39,160 --> 00:10:40,420
Gwen: between tenants

186
00:10:40,940 --> 00:10:42,320
Michael: absolutely yeah

187
00:10:42,740 --> 00:10:46,120
Gwen: yeah I have a love-hate relationship
with RLS I think a

188
00:10:46,120 --> 00:10:47,300
lot of people do.

189
00:10:47,300 --> 00:10:50,400
Because you're right, on 1 hand,
it's absolutely a lifesaver

190
00:10:50,800 --> 00:10:51,920
in the pooled model.

191
00:10:52,120 --> 00:10:58,200
Developers make mistakes as joins
get, and conditions get more

192
00:10:58,200 --> 00:10:58,700
complicated.

193
00:10:58,940 --> 00:11:04,700
It's very easy to misplace a WHERE
clause and actually leak data

194
00:11:05,140 --> 00:11:06,560
that you don't want to.

195
00:11:07,260 --> 00:11:11,600
So RLS will prevent you from doing
it if you do it right.

196
00:11:12,440 --> 00:11:16,300
It turns out that a lot of times
the rules could get complicated

197
00:11:17,040 --> 00:11:18,620
and then it leads to bugs.

198
00:11:19,060 --> 00:11:22,340
It also turns out that a lot of
times the rules get complicated

199
00:11:22,360 --> 00:11:24,340
and it leads to terrible performance.

200
00:11:25,440 --> 00:11:29,760
And 1 thing that developers really
don't realize, I'd say almost

201
00:11:29,760 --> 00:11:33,340
no developer realizes it until
they run into it.

202
00:11:35,020 --> 00:11:40,680
The WHERE conditions that RLS introduces
are not optimized like

203
00:11:40,680 --> 00:11:42,420
the WHERE conditions that you introduce.

204
00:11:43,080 --> 00:11:48,700
Because Postgres, thankfully, is
very good about security, It

205
00:11:48,700 --> 00:11:51,600
treats the WHERE conditions in RLS
differently.

206
00:11:52,200 --> 00:11:56,180
I think they have, they call it
security conditions or something

207
00:11:56,180 --> 00:11:56,880
like that.

208
00:11:57,660 --> 00:12:02,020
And they are very, very conservative
on how they optimize it

209
00:12:02,020 --> 00:12:03,740
and how they plan for it.

210
00:12:04,180 --> 00:12:05,260
This has benefits.

211
00:12:05,540 --> 00:12:09,320
You get very strong security guarantees,
very few bugs as a result.

212
00:12:09,320 --> 00:12:10,140
This is fantastic.

213
00:12:10,580 --> 00:12:16,160
On the other hand, the plan would
be sometimes significantly

214
00:12:16,500 --> 00:12:20,020
worse than what you would come
up with if you were to look at

215
00:12:20,020 --> 00:12:23,000
it really hard and do it yourself.

216
00:12:23,740 --> 00:12:27,700
And with RLS, there is basically
no way to force the plan you

217
00:12:27,700 --> 00:12:28,200
want.

218
00:12:28,860 --> 00:12:36,800
You cannot set, enable, disable
different rules because, again,

219
00:12:36,820 --> 00:12:41,120
the main overriding rule is that
we're very conservative on how

220
00:12:41,120 --> 00:12:43,760
we optimize those RLS conditions.

221
00:12:44,380 --> 00:12:47,940
So Some people call RLS a performance
killer.

222
00:12:48,560 --> 00:12:52,960
I wouldn't necessarily go this
far, but you can definitely run

223
00:12:52,960 --> 00:12:57,100
into gotchas and you need to be
aware that these are not normal

224
00:12:57,440 --> 00:12:59,280
WHERE conditions that you're looking
at.

225
00:13:01,160 --> 00:13:01,560
Michael: Yeah.

226
00:13:01,560 --> 00:13:02,060
Nice.

227
00:13:02,700 --> 00:13:08,920
Nikolay: So what does Nile offer
today and what is the ideal

228
00:13:09,560 --> 00:13:13,220
solution to all this in Postgres
context of course?

229
00:13:13,780 --> 00:13:19,400
Gwen: Yeah, so basically we wanted
to do maybe 3 things.

230
00:13:19,640 --> 00:13:24,900
First of all, give isolation while
not degrading the developer

231
00:13:24,940 --> 00:13:25,440
experience.

232
00:13:26,180 --> 00:13:31,940
So for example, we partitioned
data by tenant out of the box

233
00:13:32,500 --> 00:13:38,440
for you, completely transparently,
because we know that a bit

234
00:13:38,440 --> 00:13:41,280
later on, you're going to want
it, and it's going to be a pain

235
00:13:41,280 --> 00:13:42,480
in the ass to add it.

236
00:13:44,060 --> 00:13:45,660
We shard it transparently.

237
00:13:46,160 --> 00:13:50,440
Basically, your database may be
spread across multiple different

238
00:13:50,440 --> 00:13:50,860
shards.

239
00:13:50,860 --> 00:13:55,740
We will route the queries for you
and make sure that they are

240
00:13:56,140 --> 00:13:57,100
working as they should.

241
00:13:57,100 --> 00:14:03,580
So, in a way, you get the model
you will have anyway in 4 to

242
00:14:03,580 --> 00:14:08,520
5 years, but you're getting it
from the get-go and without doing

243
00:14:08,520 --> 00:14:11,880
a lot of the work because we are
doing a lot of the management

244
00:14:11,980 --> 00:14:12,680
for you.

245
00:14:13,140 --> 00:14:18,780
The other thing is that we have
done some work to basically bypass

246
00:14:19,020 --> 00:14:21,640
RLS and still give you isolation.

247
00:14:22,280 --> 00:14:29,340
So the queries, you kind of do
the same RLS set tenant ID equal.

248
00:14:30,040 --> 00:14:32,540
We use that to actually direct
queries.

249
00:14:33,400 --> 00:14:37,200
We rewrite the queries immediately
to the partitions that we

250
00:14:37,200 --> 00:14:39,360
know has the data for that tenant.

251
00:14:39,860 --> 00:14:43,700
So we have a small extension that
kind of replaces table names

252
00:14:43,700 --> 00:14:45,820
with partition names in the query
itself.
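A toy version of the rewrite Gwen describes — swap a table name for the name of that tenant's partition before the planner sees the query. The partition naming convention is invented for the example; Nile's extension works inside Postgres, not on SQL text in Python:

```python
import re

# Tables that are partitioned by tenant in this hypothetical schema.
PARTITIONED_TABLES = {"invoices", "orders"}

def rewrite_for_tenant(sql: str, tenant_id: int) -> str:
    """Replace partitioned table names with tenant-specific partition names."""
    def repl(match: re.Match) -> str:
        table = match.group(0)
        if table in PARTITIONED_TABLES:
            return f"{table}_p{tenant_id}"
        return table
    # Only lowercase identifiers are candidates; keywords stay uppercase here.
    return re.sub(r"[a-z_]+", repl, sql)

print(rewrite_for_tenant("SELECT * FROM invoices WHERE amount > 10", 42))
```

Pointing the query directly at one partition is what sidesteps both the RLS security quals and partition-pruning work at plan time.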

253
00:14:47,380 --> 00:14:53,980
And this vastly improves performance
in the majority of cases.

254
00:14:54,140 --> 00:14:57,980
I mean, we've seen it in a bunch
of cases, especially if you

255
00:14:57,980 --> 00:15:02,940
have slightly weird indexes that
RLS may conservatively not use.

256
00:15:03,580 --> 00:15:10,080
The improvement is quite stark,
depending, obviously, on table

257
00:15:10,080 --> 00:15:10,580
sizes.

258
00:15:10,760 --> 00:15:13,820
You can get a benchmark that proves
anything, so I don't want

259
00:15:13,840 --> 00:15:15,640
to throw numbers out there.

260
00:15:16,980 --> 00:15:22,620
But obviously, if you break down
a table with a million rows

261
00:15:22,820 --> 00:15:27,500
into a thousand tenants with a thousand
rows each, then you can

262
00:15:27,500 --> 00:15:30,700
show quite a, you can see where
I'm going with that.

263
00:15:31,720 --> 00:15:35,020
The other thing we did, and that
was probably the most work,

264
00:15:35,020 --> 00:15:39,260
and this is still work in progress,
is allow moving tenants around.

265
00:15:39,440 --> 00:15:43,040
Because 1 of the biggest problems
is that the tenant gets large

266
00:15:43,320 --> 00:15:44,020
or noisy.

267
00:15:44,640 --> 00:15:49,940
And you want to give it its own
machine, moving it is usually

268
00:15:50,080 --> 00:15:51,040
a long downtime.

269
00:15:51,300 --> 00:15:54,100
If you catch it after the tenant
is already large.

270
00:15:54,480 --> 00:16:00,240
By doing the compute storage separation,
we can basically make

271
00:16:00,240 --> 00:16:00,980
it transparent.

272
00:16:01,360 --> 00:16:06,500
It's a latency spike while we're
holding off some queries, while

273
00:16:06,500 --> 00:16:11,980
we're moving things like setting
up sequence IDs, moving, pointing

274
00:16:12,040 --> 00:16:15,640
into a different compute into the
same part of the storage.

275
00:16:16,160 --> 00:16:20,240
But it's essentially a no downtime
operation.

276
00:16:21,000 --> 00:16:26,500
So we think it's a huge deal because
again, it's just a problem

277
00:16:26,500 --> 00:16:29,240
that we keep seeing again and again.

278
00:16:30,600 --> 00:16:33,980
Nikolay: Yeah, I'm curious, is
it all open source what you build

279
00:16:34,120 --> 00:16:35,740
or only parts of it?

280
00:16:35,740 --> 00:16:37,620
Gwen: Right now it's mostly hidden.

281
00:16:38,680 --> 00:16:45,600
We have started registering parts
of it under an Apache license

282
00:16:46,240 --> 00:16:46,740
v2.

283
00:16:47,300 --> 00:16:51,800
So yeah, the goal is to open source
it and we already publicly

284
00:16:52,020 --> 00:16:54,360
declared that it's going to be
open source.

285
00:16:54,520 --> 00:17:00,060
We have not a date, but a point
of completion where we plan to

286
00:17:00,060 --> 00:17:00,840
open it.

287
00:17:01,100 --> 00:17:01,360
Nikolay: Yeah.

288
00:17:01,360 --> 00:17:05,020
So I've heard about this, several
interesting things here.

289
00:17:05,020 --> 00:17:10,160
So 1 is extension for this, I guess
it's called, in your documentation,

290
00:17:10,160 --> 00:17:14,660
it's called RLS virtualization
or how's it called?

291
00:17:15,140 --> 00:17:16,980
Gwen: We call it tenant virtualization.

292
00:17:17,540 --> 00:17:21,480
The extension itself, I think we
called it Karnak.

293
00:17:21,960 --> 00:17:26,480
We call everything after stuff
in Egypt and Karnak is a famous

294
00:17:26,480 --> 00:17:26,980
temple.

295
00:17:28,320 --> 00:17:33,600
Nikolay: So data is stored in separate
tables and in the same

296
00:17:33,600 --> 00:17:38,540
database, but extension rewrites
queries to basically route query

297
00:17:38,760 --> 00:17:40,640
to proper table, right?

298
00:17:40,640 --> 00:17:42,300
Is this based on?

299
00:17:42,660 --> 00:17:48,240
Gwen: It's in separate partitions
And we rewrite queries to go

300
00:17:48,240 --> 00:17:49,580
to the correct partition.

301
00:17:50,660 --> 00:17:55,800
We basically bypass RLS, we bypass
the planner trying to make

302
00:17:55,800 --> 00:17:56,620
those calls.

303
00:17:56,880 --> 00:18:01,420
We found out that with a large
number of partitions, this is

304
00:18:01,780 --> 00:18:03,180
significantly more efficient.

305
00:18:03,180 --> 00:18:03,740
Nikolay: I see.

306
00:18:03,740 --> 00:18:05,280
So it's Postgres partitioning.

307
00:18:05,280 --> 00:18:05,960
I see.

308
00:18:06,060 --> 00:18:10,660
And you mentioned also another
thing you mentioned here is sharding,

309
00:18:11,200 --> 00:18:11,700
right?

310
00:18:12,180 --> 00:18:12,680
Gwen: Yeah.

311
00:18:12,720 --> 00:18:16,140
So we use foreign data wrappers
to allow...

312
00:18:17,380 --> 00:18:20,140
So we have 2 things in the architecture.

313
00:18:20,460 --> 00:18:24,180
First of all, we have a proxy,
it's a routing proxy.

314
00:18:24,240 --> 00:18:29,500
So it keeps track of every connection,
which tenant is the current

315
00:18:29,500 --> 00:18:35,860
tenant and it routes it to the
shard that has the correct tenant

316
00:18:35,860 --> 00:18:36,600
in it.
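The routing-proxy idea can be sketched as a small lookup: track which tenant each connection has set, and send every statement to the shard that holds that tenant. The shard layout and tenant names are invented for the example:

```python
# Hypothetical placement map: which shard holds which tenant.
TENANT_TO_SHARD = {"acme": "shard-1", "globex": "shard-2"}

class RoutingProxy:
    """Tracks the current tenant per connection and routes to its shard."""

    def __init__(self):
        self.current_tenant = {}  # connection id -> tenant

    def set_tenant(self, conn_id: int, tenant: str) -> None:
        # Corresponds to the client setting the tenant on its session.
        self.current_tenant[conn_id] = tenant

    def route(self, conn_id: int) -> str:
        # Every subsequent statement on this connection goes to that shard.
        return TENANT_TO_SHARD[self.current_tenant[conn_id]]

proxy = RoutingProxy()
proxy.set_tenant(1, "acme")
proxy.set_tenant(2, "globex")
print(proxy.route(1))
```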

317
00:18:37,440 --> 00:18:42,480
And then we also have some cases
where some developers want to

318
00:18:42,480 --> 00:18:44,680
write queries that touch multiple
tenants.

319
00:18:45,040 --> 00:18:49,240
Those are not going to be as fast,
but we do allow them by use

320
00:18:49,240 --> 00:18:51,160
of foreign data wrappers.

321
00:18:52,360 --> 00:18:58,660
And mixing partitions with our
own partitioning rules with foreign

322
00:18:58,660 --> 00:19:03,400
data wrappers and still keeping
things efficient, we didn't want

323
00:19:03,540 --> 00:19:08,080
the planner on any machine to be
aware of all the partitions

324
00:19:08,600 --> 00:19:14,380
in all the other shards, because
it just explodes the planning

325
00:19:14,380 --> 00:19:16,420
time in ways that we saw as unacceptable.

326
00:19:17,200 --> 00:19:25,040
So what we did is represent each
shard with a table and then

327
00:19:25,040 --> 00:19:29,660
put hierarchical table inheritance
on top of it.

328
00:19:30,660 --> 00:19:36,280
And the end result is basically
a union all between the table

329
00:19:36,280 --> 00:19:39,560
with the table inheritance that
points to all those other shards,

330
00:19:41,040 --> 00:19:42,600
and the table with the partitions.
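The cross-shard read path described above can be modeled roughly: the query's predicate is pushed down to each shard, each shard filters locally against the partitions only it knows about, and the results are unioned. Data and predicate below are invented for illustration:

```python
# Each shard holds only its own tenants' partitions.
shards = {
    "shard-1": [{"tenant": "acme", "amount": 100},
                {"tenant": "acme", "amount": 250}],
    "shard-2": [{"tenant": "globex", "amount": 75}],
}

def cross_tenant_query(predicate):
    """UNION ALL across shards, with the predicate pushed to each shard."""
    results = []
    for rows in shards.values():
        # Predicate pushdown: filter on the shard, not after the union,
        # so no single planner needs to see every partition everywhere.
        results.extend(r for r in rows if predicate(r))
    return results

big = cross_tenant_query(lambda r: r["amount"] >= 100)
print(len(big))  # 2
```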

331
00:19:43,520 --> 00:19:47,920
Now this gives us basically predicate
pushdown because the planner

332
00:19:47,920 --> 00:19:52,700
will push the plan to all those
different shards.

333
00:19:53,200 --> 00:19:56,540
The other shards know that they
have partitions, which the source

334
00:19:56,540 --> 00:19:57,760
planner didn't know.

335
00:19:58,120 --> 00:20:02,360
They will plan correctly with all
the partitions, but they only

336
00:20:02,360 --> 00:20:03,980
know about a subset of partitions.

337
00:20:05,020 --> 00:20:09,700
So we see it's a bit hacky and
it's a bit weird to explain.

338
00:20:10,360 --> 00:20:14,040
And we think we can do better with
some modifications to Postgres,

339
00:20:14,060 --> 00:20:19,440
which we have not done yet, but
this does give us predicate pushdown,

340
00:20:20,140 --> 00:20:24,240
fairly fast planning, and the ability
to do queries that cross

341
00:20:24,240 --> 00:20:28,320
tenants in situations where this
is required, essentially.

342
00:20:28,320 --> 00:20:32,660
Nikolay: Yeah, if you don't involve
2PC, just rely on foreign data

343
00:20:32,660 --> 00:20:35,120
wrappers, no two-phase commit.

344
00:20:35,600 --> 00:20:39,440
I'm curious what kind of anomalies
can happen there.

345
00:20:40,080 --> 00:20:44,680
Gwen: Yes, and we prevent a lot
of things that could cause anomalies.

346
00:20:45,080 --> 00:20:50,140
So we do have a transaction coordinator,
but in order to not

347
00:20:50,140 --> 00:20:54,280
overload and also not overcomplicate
our architecture, we limit

348
00:20:54,280 --> 00:20:54,860
some things.

349
00:20:54,860 --> 00:21:00,360
So, DDL has to be done on a single
tenant, and you cannot mix

350
00:21:00,700 --> 00:21:02,000
cross-tenant queries.

351
00:21:02,120 --> 00:21:06,280
Sorry, DML, INSERTs and UPDATEs have to be done on a single tenant,

352
00:21:06,280 --> 00:21:10,840
and you cannot mix in a Transaction cross, so the moment you

353
00:21:10,840 --> 00:21:13,500
start a Transaction, you have to know what tenant you're working

354
00:21:13,500 --> 00:21:14,000
on.

355
00:21:14,340 --> 00:21:17,120
And then we route it to the correct shard, which has the correct

356
00:21:17,120 --> 00:21:21,000
table, and everything has the absolutely correct guarantees.

357
00:21:21,600 --> 00:21:26,100
If you need to do something cross-tenant, you do not involve

358
00:21:26,180 --> 00:21:28,100
it with any kind of UPDATE.

359
00:21:28,460 --> 00:21:32,320
You could still be exposed to some anomalies, I agree, because

360
00:21:32,320 --> 00:21:36,780
there could be ongoing Transactions from other people in other places.

361
00:21:36,820 --> 00:21:42,600
So you get the basic Read Committed guarantees that Postgres

362
00:21:42,880 --> 00:21:46,520
gives you, I believe, but not anything more than that.

363
00:21:47,280 --> 00:21:53,300
But again, we believe that cross tenant Queries are rare and

364
00:21:53,300 --> 00:21:57,620
mostly done in analytical cases, where you do reporting where

365
00:21:57,620 --> 00:22:00,640
it's slightly less critical to have those.

366
00:22:01,020 --> 00:22:05,440
Nikolay: So you forbid the writing to 2 shards in 1 Transaction?

367
00:22:05,900 --> 00:22:08,000
Gwen: If you want to write to a shard, it's fantastic.

368
00:22:08,300 --> 00:22:12,180
You tell us what tenant you're writing data into, and we will

369
00:22:12,180 --> 00:22:13,580
direct you to the correct shard.

370
00:22:13,580 --> 00:22:16,760
Nikolay: I mean, if there is a Transaction which needs to write

371
00:22:16,760 --> 00:22:23,000
to 2 different shards, this is a big problem, because without

372
00:22:23,000 --> 00:22:23,500
2PC.

373
00:22:24,720 --> 00:22:25,660
Gwen: Yes, exactly.

374
00:22:25,840 --> 00:22:28,660
And we don't let you do that, essentially, in order to avoid

375
00:22:28,660 --> 00:22:29,040
anomalies.

376
00:22:29,040 --> 00:22:29,640
Nikolay: I see.

377
00:22:29,760 --> 00:22:33,760
Another question here, have you considered the approach used

378
00:22:33,760 --> 00:22:34,420
in Vitess?

379
00:22:34,640 --> 00:22:36,720
As I understand it, maybe I'm wrong.

380
00:22:37,040 --> 00:22:41,320
Where mostly for analytical Queries maybe, to avoid distributed

381
00:22:41,320 --> 00:22:45,660
Transactions, data is brought synchronously from 1 shard to another.

382
00:22:45,660 --> 00:22:48,920
And we have it locally, like basically a kind of Materialized View

383
00:22:49,020 --> 00:22:51,880
on top of logical replication, for example, or something.

384
00:22:52,200 --> 00:22:55,640
And it has eventual consistency approach, of course, but you

385
00:22:55,640 --> 00:23:00,120
can just join it in 1 Postgres, in 1 shard, right?

386
00:23:00,140 --> 00:23:01,660
Have you considered this approach?

387
00:23:01,840 --> 00:23:02,300
Gwen: Yeah.

388
00:23:02,300 --> 00:23:03,780
We have considered it.

389
00:23:04,700 --> 00:23:10,680
I think maybe CitusDB has something similar, if I remember correctly.

390
00:23:10,680 --> 00:23:12,280
I'm not 100% sure.

391
00:23:12,740 --> 00:23:16,080
But yeah, it's something that we were like, yeah, this is a good

392
00:23:16,080 --> 00:23:18,460
idea that we may examine in the future.

393
00:23:19,280 --> 00:23:23,820
It's definitely, we are trying to build something useful gradually,

394
00:23:24,280 --> 00:23:28,440
and we understand that early on, it's almost safer to have a

395
00:23:28,440 --> 00:23:32,180
bunch of limitations that over time will resolve, rather than

396
00:23:32,440 --> 00:23:37,080
allow people to do something unsafe and also build a kitchen

397
00:23:37,080 --> 00:23:37,580
sink.

398
00:23:37,660 --> 00:23:41,140
Nikolay: It sounds to me like Postgres versus MySQL approaches

399
00:23:41,820 --> 00:23:45,200
because MySQL approach, you remember MyISAM?

400
00:23:45,560 --> 00:23:50,540
Maybe you don't remember, but it was like quite bad.

401
00:23:50,540 --> 00:23:55,080
You needed to run REPAIR TABLE
all the time because it's not

402
00:23:55,080 --> 00:23:56,760
ACID and so on.

403
00:23:56,980 --> 00:23:58,220
It allowed too much.

404
00:23:58,980 --> 00:24:00,420
Gwen: Exactly, yeah.

405
00:24:00,540 --> 00:24:02,340
And MyISAM had a lot of issues.

406
00:24:02,360 --> 00:24:06,320
I mean, there is a reason why InnoDB
became extremely popular.

407
00:24:06,820 --> 00:24:07,800
Nikolay: Right, right.

408
00:24:07,900 --> 00:24:11,120
Michael: But also, with a multi-tenancy
use case, I think you're

409
00:24:11,120 --> 00:24:14,440
quite right that, well, we, I mean,
you'll find out soon enough,

410
00:24:14,440 --> 00:24:18,340
right, if lots of people want these
cross-tenant or cross-shard

411
00:24:18,420 --> 00:24:22,800
queries which are by definition
cross-tenant queries and if they

412
00:24:22,800 --> 00:24:25,600
don't, if you don't need to worry
about it, you save a bunch

413
00:24:25,600 --> 00:24:27,900
of effort having to even implement
that.

414
00:24:27,900 --> 00:24:29,680
So yeah, I like that a lot.

415
00:24:30,300 --> 00:24:34,580
Nikolay: Yeah, last comment here
is I'm excited to see that finally

416
00:24:34,600 --> 00:24:38,620
the Postgres ecosystem receives some
attention in the area of sharding.

417
00:24:38,620 --> 00:24:43,180
I guess the time has just come,
and more and more databases became

418
00:24:43,780 --> 00:24:45,420
too large to be handled.

419
00:24:45,420 --> 00:24:46,520
Gwen: I'm almost surprised.

420
00:24:46,960 --> 00:24:49,040
I'm honestly surprised it took
that long.

421
00:24:49,160 --> 00:24:51,840
I mean, again, if you look at MySQL.

422
00:24:51,900 --> 00:24:53,540
Nikolay: We just, it's just unfortunate.

423
00:24:53,860 --> 00:24:57,260
First time I touched this topic,
it was 2006 immediately when

424
00:24:57,260 --> 00:24:59,080
we started working with Postgres,
honestly.

425
00:24:59,440 --> 00:25:04,280
And There was a PL/Proxy from Skype
at that time already, but

426
00:25:04,280 --> 00:25:08,300
it required you to write everything
in functions.

427
00:25:08,400 --> 00:25:10,700
Gwen: Partitioning didn't exist
back then.

428
00:25:10,960 --> 00:25:11,460
Nikolay: Existed.

429
00:25:11,580 --> 00:25:13,980
It was based on inheritance.

430
00:25:14,480 --> 00:25:14,640
It

431
00:25:14,640 --> 00:25:16,960
required much more manual work.

432
00:25:17,700 --> 00:25:18,880
It was fun, actually.

433
00:25:19,660 --> 00:25:21,440
You understood it better, you know?

434
00:25:22,240 --> 00:25:25,020
But yeah, but it was not convenient,
100%.

435
00:25:25,640 --> 00:25:26,740
Not super convenient.

436
00:25:26,980 --> 00:25:31,260
It's just, I see that it's just
unfortunate how it turned out

437
00:25:31,260 --> 00:25:32,320
in Postgres ecosystem.

438
00:25:32,320 --> 00:25:34,420
And now definitely there is huge
pressure.

439
00:25:34,460 --> 00:25:36,880
Many companies need partitioning...

440
00:25:37,400 --> 00:25:37,740
Need sharding.

441
00:25:37,740 --> 00:25:38,260
And it seems

442
00:25:38,260 --> 00:25:39,960
Gwen: Like, how would history have worked?

443
00:25:39,960 --> 00:25:44,760
Imagine that YouTube picked Postgres
and not MySQL as their first

444
00:25:44,760 --> 00:25:45,260
database.

445
00:25:46,580 --> 00:25:50,140
Nikolay: Yeah, Google and Facebook,
they both chose MySQL somehow.

446
00:25:50,140 --> 00:25:50,640
Yeah.

447
00:25:52,040 --> 00:25:54,140
Michael: Then Vitess would be for
Postgres first, right?

448
00:25:54,140 --> 00:25:54,640
Gwen: Exactly.

449
00:25:55,680 --> 00:25:57,580
It could have turned out so differently.

450
00:25:57,720 --> 00:25:58,220
Michael: Yeah.

451
00:25:59,320 --> 00:26:00,420
Funny, isn't it?

452
00:26:01,020 --> 00:26:04,740
Changing topics slightly, as LLMs
are quite top of mind at the

453
00:26:04,740 --> 00:26:09,880
moment, are you seeing people's
initial choices change as a result

454
00:26:09,880 --> 00:26:14,120
of asking for advice earlier from
our robot friends?

455
00:26:15,540 --> 00:26:21,260
Gwen: Oh my god, we're seeing so
many weird things, it's just

456
00:26:21,820 --> 00:26:24,120
unbelievable how much things are
changing.

457
00:26:24,600 --> 00:26:28,900
First of all, we're having people
show up on our Discord and

458
00:26:28,900 --> 00:26:34,120
say things like, I'm using Nile
because my LLM thought it's a

459
00:26:34,120 --> 00:26:34,820
good idea.

460
00:26:36,660 --> 00:26:40,760
And I don't really know Postgres,
so I need some help, but my

461
00:26:40,760 --> 00:26:43,360
LLM assured me that this is still
a good idea.

462
00:26:44,700 --> 00:26:48,580
Like a lot of people are just like
people who are very much beginners,

463
00:26:49,440 --> 00:26:53,160
like maybe I can say when I started
developing, the first time

464
00:26:53,160 --> 00:26:57,940
I had to use a database, my company
sent me to a 3-week database

465
00:26:58,240 --> 00:26:58,740
class.

466
00:26:58,780 --> 00:27:01,720
I think it was Oracle about 20
years back.

467
00:27:01,980 --> 00:27:06,140
And I came back a lot more confident
that I know how to use Oracle

468
00:27:06,600 --> 00:27:09,660
and not to leave transactions open
for too long because people

469
00:27:09,660 --> 00:27:10,780
will yell at me.

470
00:27:11,520 --> 00:27:16,520
These days people don't do the
three-week class before they try

471
00:27:16,520 --> 00:27:17,460
using a database.

472
00:27:17,880 --> 00:27:21,740
So you see a lot more, people are
using a database earlier on

473
00:27:21,740 --> 00:27:26,340
and they do expect more hand-holding
from the vendors.

474
00:27:26,580 --> 00:27:32,620
Like the LLMs give them advice
up to a certain point, but eventually,

475
00:27:33,080 --> 00:27:35,980
if things are slow, they will come
to you and say, hey, why is

476
00:27:35,980 --> 00:27:36,920
my query slow?

477
00:27:37,960 --> 00:27:40,520
I'm sure you guys have seen your
share of that.

478
00:27:41,840 --> 00:27:49,340
I'm also seeing people use Postgres
for their LLMs in different

479
00:27:49,340 --> 00:27:49,840
ways.

480
00:27:49,920 --> 00:27:52,420
And this is really exciting to
me.

481
00:27:52,420 --> 00:27:57,940
People use Postgres via MCPs, people
use Postgres with vectors,

482
00:27:58,260 --> 00:28:01,960
people building AI applications
on top of Postgres.

483
00:28:02,440 --> 00:28:04,040
We're seeing a lot of that.

484
00:28:06,100 --> 00:28:10,220
And, I mean, personally, I'm really
excited that people can program

485
00:28:10,460 --> 00:28:15,200
with LLMs, not knowing a lot about
Postgres, not knowing even

486
00:28:15,200 --> 00:28:21,780
a lot about software engineering
at all, and still get reasonable

487
00:28:22,800 --> 00:28:23,860
security guarantees.

488
00:28:24,480 --> 00:28:31,640
You don't need to know to ask your
LLM about RLS or about, Are

489
00:28:31,640 --> 00:28:35,640
you sure this query actually properly
isolates tenants?

490
00:28:36,900 --> 00:28:41,920
And it's also interesting how much
the results differ when people

491
00:28:41,920 --> 00:28:44,880
use different LLMs.

492
00:28:46,260 --> 00:28:51,820
Like I would say thinking models
do fairly well iterating in order

493
00:28:51,820 --> 00:28:55,320
to get good code and checking their
own results, again, given

494
00:28:55,320 --> 00:28:57,460
via MCP access to a database.

495
00:28:58,340 --> 00:29:04,220
I would say that if you use ChatGPT
4o, you will get a lot of

496
00:29:04,220 --> 00:29:08,100
random hallucinatory stuff still
in your code.

497
00:29:08,680 --> 00:29:11,000
Nikolay: Yeah, it's funny.

498
00:29:11,000 --> 00:29:15,660
I already told Michael, we had
these cases doing consulting, but

499
00:29:15,660 --> 00:29:17,680
like maybe already almost a year
ago.

500
00:29:17,680 --> 00:29:23,620
I started noticing that people
send us something like: we are building

501
00:29:23,620 --> 00:29:24,640
this part of the database.

502
00:29:24,640 --> 00:29:25,800
Can you review it?

503
00:29:25,800 --> 00:29:30,740
We are reviewing, we use different
LLMs supporting this review

504
00:29:31,040 --> 00:29:36,960
and then we have a call and I'm
curious, code looks great, I

505
00:29:36,960 --> 00:29:42,440
mean schema looks great but something
is off, right?

506
00:29:42,440 --> 00:29:46,420
And then we have a call and I see
they open tabs and ChatGPT,

507
00:29:46,420 --> 00:29:51,280
Claude there as well, so I realize
they used an LLM to create

508
00:29:51,280 --> 00:29:54,380
the schema and then sent it to us for review
and we use LLMs to review

509
00:29:54,380 --> 00:29:59,880
it and then there's like a 4-party
process, you know. It

510
00:29:59,880 --> 00:30:04,300
leads to a good place, but someone
needs to jump in with proper expertise

511
00:30:04,440 --> 00:30:06,880
and say, this is not a good approach.

512
00:30:07,660 --> 00:30:08,560
Gwen: This is hilarious.

513
00:30:08,560 --> 00:30:12,340
Do you think at some point you
and your customer can just step

514
00:30:12,340 --> 00:30:15,100
out and let the LLMs figure it
out between themselves?

515
00:30:15,100 --> 00:30:19,000
Nikolay: Well, there's a problem
here because like it's great,

516
00:30:19,000 --> 00:30:23,920
but there is like it does 80% of
the job in 1%, not even 20%

517
00:30:24,280 --> 00:30:28,140
of effort very quickly, but there
is 20% of problems, which again,

518
00:30:28,140 --> 00:30:31,780
like, I don't know, maybe in next
few years, it will change.

519
00:30:31,780 --> 00:30:32,860
I think it will change.

520
00:30:32,860 --> 00:30:38,000
But right now I feel my internal
LLM is trained much better

521
00:30:38,000 --> 00:30:38,960
than ChatGPT.

522
00:30:39,440 --> 00:30:41,180
Gwen: You trained your own LLM,
right?

523
00:30:41,180 --> 00:30:43,040
Nikolay: Yeah, no, I mean my own.

524
00:30:43,620 --> 00:30:46,360
Gwen: Oh yeah, no, but I think
you actually trained your own.

525
00:30:46,360 --> 00:30:47,460
Nikolay: Yeah, we experiment.

526
00:30:48,920 --> 00:30:50,440
We do some stuff, we experiment.

527
00:30:50,600 --> 00:30:55,400
We have some things like we have,
we start with fine tuning,

528
00:30:55,680 --> 00:31:00,200
moving to our own LLM, but not
there yet still.

529
00:31:01,260 --> 00:31:05,400
So yeah, there is a lot of stuff
that can be done there properly.

530
00:31:06,040 --> 00:31:10,680
It's hard to compete with Claude
and they have a very high pace.

531
00:31:11,120 --> 00:31:14,320
With the Claude 4 release, I see, wow,
it's really great.

532
00:31:14,320 --> 00:31:17,720
But it's still missing many things
you learn from practice, which

533
00:31:17,720 --> 00:31:18,620
were not discussed.

534
00:31:18,780 --> 00:31:20,580
That's why they don't bring it.

535
00:31:21,580 --> 00:31:24,220
Many problems were not discussed
yet.

536
00:31:24,520 --> 00:31:29,080
And you explore them if you have
a lot of data and heavy workloads.

537
00:31:30,040 --> 00:31:35,680
Gwen: So you're saying that even
training on the Postgres mailing

538
00:31:35,680 --> 00:31:39,600
list doesn't have all the information
in it, essentially?

539
00:31:39,880 --> 00:31:40,780
Nikolay: Well, yes.

540
00:31:41,280 --> 00:31:44,820
For example, random problem, we
recently touched it.

541
00:31:45,060 --> 00:31:47,220
There is a buffer pool in Postgres.

542
00:31:47,780 --> 00:31:52,040
And there are basically 128 partitions.

543
00:31:52,720 --> 00:31:55,220
So you can have 128 locks.

544
00:31:55,640 --> 00:31:59,280
And if you have a huge buffer pool,
this becomes bottleneck.

545
00:31:59,440 --> 00:32:03,580
And some people say, let's maybe
make it tunable, configurable.

546
00:32:05,500 --> 00:32:09,380
But when you start researching
this topic, you end up finding

547
00:32:09,380 --> 00:32:12,940
some recent, well, recent last
August and September last year,

548
00:32:13,280 --> 00:32:17,120
conversation in the hackers mailing
list, which is open-ended.

549
00:32:17,380 --> 00:32:20,500
It's not complete, because somebody
needs to run benchmarks and

550
00:32:20,500 --> 00:32:26,060
prove that this is worth having
a new setting.

551
00:32:26,880 --> 00:32:27,540
And that's it.

552
00:32:27,540 --> 00:32:29,440
There's a patch proposed, but that's
it.

553
00:32:29,440 --> 00:32:31,360
No experiments yet.

554
00:32:31,680 --> 00:32:34,460
Gwen: This is, by the way, 1 of
the reasons I was so excited

555
00:32:34,480 --> 00:32:38,600
about your LLM approach, because
if you think about what is the

556
00:32:38,600 --> 00:32:43,480
bottleneck for doing a lot of the
database improvement things,

557
00:32:43,780 --> 00:32:48,360
and I am feeling it very personally,
running benchmarks is hard.

558
00:32:49,020 --> 00:32:51,000
Properly planning a benchmark is
hard.

559
00:32:51,000 --> 00:32:56,000
LLMs can help with that, but again,
they sometimes just go off

560
00:32:56,000 --> 00:32:56,680
the rails.

561
00:32:57,700 --> 00:33:02,920
And even if they help plan that,
I don't know LLMs that actually

562
00:33:03,460 --> 00:33:07,400
run benchmarks to the point where
they provision the machines

563
00:33:07,440 --> 00:33:11,980
in AWS and know that you have to
provision a separate machine

564
00:33:11,980 --> 00:33:15,520
as a database and a separate machine,
maybe a few of them, to

565
00:33:15,520 --> 00:33:19,340
drive the workload and they both
need to have appropriate resources.

566
00:33:19,940 --> 00:33:23,700
Like all those and then don't get
me started on analyzing the

567
00:33:23,700 --> 00:33:27,480
results, which is kind of 99.9%
of the work.

568
00:33:28,660 --> 00:33:32,220
So the fact that you actually kind
of started your LLM project so that you

569
00:33:32,220 --> 00:33:36,820
have an LLM that can actually do
benchmarks, is just, I think

570
00:33:36,820 --> 00:33:41,980
this will be the biggest breakthrough
in both people tuning their

571
00:33:41,980 --> 00:33:47,500
own Postgres, and also Postgres
as a community being able to

572
00:33:47,500 --> 00:33:49,500
advance the state of the art.

573
00:33:49,540 --> 00:33:51,000
Nikolay: Let me share sad news.

574
00:33:51,580 --> 00:33:55,240
Like, I'm not giving up, but it's
a roller coaster.

575
00:33:55,240 --> 00:33:58,880
I spent, well, we spent, like
the team of maybe 5 engineers

576
00:33:58,900 --> 00:34:02,840
spent more than 1 year trying to
achieve that.

577
00:34:03,740 --> 00:34:08,240
We achieved many things, but first
of all, we chose Gemini because

578
00:34:08,240 --> 00:34:09,300
they gave us credits.

579
00:34:09,640 --> 00:34:11,140
And I think it was a huge mistake.

580
00:34:11,140 --> 00:34:12,740
Gemini has a lot of problems.

581
00:34:14,360 --> 00:34:20,520
Suddenly you have 500 errors, which
are so many problems.

582
00:34:21,100 --> 00:34:26,780
It's just not a mature product, Gemini,
and it has hallucinations

583
00:34:27,040 --> 00:34:27,880
all the time.

584
00:34:27,880 --> 00:34:31,160
It's good, for example, for JSON,
working with JSON, because

585
00:34:31,160 --> 00:34:34,280
when you need to run experiment,
we decided to choose JSON as

586
00:34:34,280 --> 00:34:35,140
config format.

587
00:34:35,460 --> 00:34:39,860
It writes much better than GPT-4,
4.0 and so on.

588
00:34:39,960 --> 00:34:43,040
But many things, it just hallucinates.

589
00:34:43,260 --> 00:34:44,860
It invents all the time some things.

590
00:34:44,860 --> 00:34:47,120
It just makes up results all the
time.

591
00:34:47,400 --> 00:34:51,960
And we have a system to control
it but it bypasses all the time.

592
00:34:52,060 --> 00:34:53,460
Like it's silly hard.

593
00:34:53,500 --> 00:34:59,180
So then yeah, we experimented with
additional models, DeepSeek, Llama,

594
00:34:59,480 --> 00:35:02,480
and we've fine-tuned a lot.

595
00:35:02,680 --> 00:35:07,480
All versions of GPT, all modern
fresh versions, we also bring

596
00:35:07,480 --> 00:35:08,600
them all the time.

597
00:35:08,720 --> 00:35:12,680
And Claude, Claude is much better,
we just added it to this system

598
00:35:12,780 --> 00:35:13,600
we have.

599
00:35:14,060 --> 00:35:19,800
But after 1 year, I decided, you
know what, I'm like, benchmarks

600
00:35:19,900 --> 00:35:21,320
are an extremely hard topic.

601
00:35:21,860 --> 00:35:23,560
And we cannot trust it anymore.

602
00:35:23,560 --> 00:35:30,940
I mean, we cannot trust LLM to
create precise configuration and

603
00:35:30,940 --> 00:35:32,720
process results fully.

604
00:35:32,980 --> 00:35:38,540
So we decided that the LLM is just
more like a connection thing.

605
00:35:38,680 --> 00:35:41,540
When you engineer benchmark, expert
needs to engineer.

606
00:35:41,540 --> 00:35:47,900
I don't trust any LLM for now,
because any experiment, we plan

607
00:35:47,900 --> 00:35:52,360
to publish maybe 15 to 20 experiments
in our blog last year.

608
00:35:52,360 --> 00:35:54,880
And if you open our blog, you see
just 1 experiment.

609
00:35:55,320 --> 00:35:59,640
And even there we screwed up and
someone on Twitter said this

610
00:35:59,640 --> 00:36:02,220
is not right and we quickly corrected,
which is good.

611
00:36:02,320 --> 00:36:06,640
And we had achievements like interesting
things, bottlenecks

612
00:36:07,420 --> 00:36:08,560
popped up here and there.

613
00:36:08,560 --> 00:36:10,940
It's really fun to iterate with
LLM.

614
00:36:11,060 --> 00:36:14,440
But once you allow it to think, to
design the experiment and to interpret

615
00:36:14,440 --> 00:36:20,880
the results, 99% of the time we have
wrong, wrong results, wrong conclusions

616
00:36:21,040 --> 00:36:21,760
and so on.

617
00:36:21,760 --> 00:36:26,400
So for now we are thinking, okay,
this is just an accelerator of

618
00:36:26,400 --> 00:36:30,880
performing experiments, but design
and understanding results

619
00:36:30,960 --> 00:36:33,140
should be in human brain for now.

620
00:36:33,540 --> 00:36:34,040
100%.

621
00:36:34,120 --> 00:36:38,080
Gwen: I love the fact that Benchmarks
are also hard for robots,

622
00:36:38,240 --> 00:36:39,040
to be honest.

623
00:36:39,060 --> 00:36:40,940
It's so hard for humans, right?

624
00:36:41,600 --> 00:36:45,960
Nikolay: Yeah, and we collect so
many artifacts, but somehow

625
00:36:45,960 --> 00:36:47,260
it's super hard still.

626
00:36:48,100 --> 00:36:50,240
So you always think where is the
bottleneck?

627
00:36:51,180 --> 00:36:53,040
And simple question, right?

628
00:36:53,560 --> 00:37:00,480
But for now, it's extremely hard
to let LLM find bottleneck and

629
00:37:00,480 --> 00:37:01,620
draw proper conclusions.

630
00:37:02,780 --> 00:37:05,740
Gwen: And honestly, if you just
had an LLM that always said it's

631
00:37:05,740 --> 00:37:09,180
the network, it would be correct
about 80% of the time.

632
00:37:10,400 --> 00:37:11,600
Nikolay: What if it's local?

633
00:37:12,340 --> 00:37:15,640
If it's, there's no network, everything
through Unix sockets

634
00:37:15,660 --> 00:37:17,740
and we had these cases as well.

635
00:37:18,390 --> 00:37:20,860
So I agree with you in production.

636
00:37:20,900 --> 00:37:24,960
Like in production, yes, but in
experiments when we learn Postgres

637
00:37:24,960 --> 00:37:28,660
behavior on single machine running
pgbench locally, we don't

638
00:37:28,660 --> 00:37:29,560
care sometimes.

639
00:37:30,060 --> 00:37:32,620
So there's no network there sometimes.

640
00:37:33,340 --> 00:37:35,140
So it's hard.

641
00:37:35,140 --> 00:37:37,800
So I had many moments of frustration.

642
00:37:38,860 --> 00:37:42,880
But it's so good, like I still
believe that iterations are great.

643
00:37:42,880 --> 00:37:46,660
So if you say this is a great benchmark,
just check it on a new version.

644
00:37:46,960 --> 00:37:48,380
Just changing 1 thing.

645
00:37:48,900 --> 00:37:49,820
This is good.

646
00:37:50,280 --> 00:37:54,960
We have automation, we have an interface,
and it repeats the process

647
00:37:54,960 --> 00:37:56,260
of analysis again.

648
00:37:56,580 --> 00:37:58,580
This is where LLM helps a lot.

649
00:38:01,420 --> 00:38:05,220
Because without LLM, you could
just, you have some form, and

650
00:38:05,220 --> 00:38:07,320
oh, we don't have this parameter
programmed.

651
00:38:07,360 --> 00:38:08,980
It's not exposed in the interface.

652
00:38:09,020 --> 00:38:09,840
It's bad.

653
00:38:09,880 --> 00:38:13,740
With LLM, you have freedom to change
things and iterate based

654
00:38:13,740 --> 00:38:15,260
on existing good benchmarks.

655
00:38:16,160 --> 00:38:18,940
So, yeah, I cannot say we are there
yet.

656
00:38:18,940 --> 00:38:22,780
This project, like, right now,
we are, like, thinking about the

657
00:38:22,780 --> 00:38:29,340
next level of it, where I think
we will let LLM, we will give

658
00:38:29,340 --> 00:38:35,120
it less freedom, you know, that's
the key, and control more by

659
00:38:35,740 --> 00:38:36,500
human brain.

660
00:38:37,700 --> 00:38:39,520
Gwen: Human in the loop kind of
thing.

661
00:38:39,520 --> 00:38:40,640
Nikolay: Yeah, yeah, exactly.

662
00:38:40,640 --> 00:38:41,140
Exactly.

663
00:38:41,220 --> 00:38:46,720
So design and first analysis, only
human should be there.

664
00:38:46,720 --> 00:38:49,960
But once you have confidence that
you're moving in the right

665
00:38:49,960 --> 00:38:53,680
direction and you just need to
iterate and expand, for example,

666
00:38:53,680 --> 00:38:56,820
to different versions, platforms,
everything, this is where you

667
00:38:56,820 --> 00:38:57,540
can relax.

668
00:38:57,600 --> 00:38:59,180
You already verified results.

669
00:38:59,480 --> 00:39:02,060
You can say just repeat, but on
different something.

670
00:39:02,520 --> 00:39:04,760
This is where LLM already can bring
you.

671
00:39:04,760 --> 00:39:06,360
It can just speed up everything.

672
00:39:06,660 --> 00:39:10,320
You can throw it to this benchmarking
process and have like 10

673
00:39:10,320 --> 00:39:13,220
experiments running in 10 different
versions or something.

674
00:39:13,280 --> 00:39:16,460
Gwen: And I think this is also
kind of almost the general direction

675
00:39:16,640 --> 00:39:19,020
in which agents are taking shape.

676
00:39:19,280 --> 00:39:23,300
I mean, people said that 2025
was supposed to be the year of

677
00:39:23,300 --> 00:39:23,980
the agent.

678
00:39:24,720 --> 00:39:28,620
I think it's almost becoming the
year of the human in the loop

679
00:39:28,620 --> 00:39:29,660
with the agent.

680
00:39:29,720 --> 00:39:30,220
Yeah.

681
00:39:30,780 --> 00:39:35,080
Like all the successful products
I see are you tell the agent

682
00:39:35,080 --> 00:39:36,140
to do some stuff.

683
00:39:36,820 --> 00:39:38,620
You ask it, please plan something.

684
00:39:38,620 --> 00:39:39,720
You give it feedback.

685
00:39:39,960 --> 00:39:43,260
You then say, okay, now that we
have a good plan, go and execute

686
00:39:43,260 --> 00:39:43,700
on it.

687
00:39:43,700 --> 00:39:46,660
You come back an hour later, you
had your coffee.

688
00:39:47,020 --> 00:39:48,500
Okay, let's see what you've got.

689
00:39:48,500 --> 00:39:49,460
Here's some feedback.

690
00:39:49,640 --> 00:39:51,440
Go fix some stuff.

691
00:39:52,120 --> 00:39:57,600
I think every successful
product is a bit like this.

692
00:39:57,800 --> 00:39:58,200
Nikolay: Right.

693
00:39:58,200 --> 00:40:02,000
But sometimes humans start using
a different LLM when reviewing

694
00:40:02,000 --> 00:40:02,820
things, right?

695
00:40:03,560 --> 00:40:04,780
Being lazy, right?

696
00:40:06,500 --> 00:40:07,000
Gwen: Yes.

697
00:40:08,420 --> 00:40:09,220
Nikolay: That's interesting.

698
00:40:10,840 --> 00:40:15,520
So I'm not sure how successful
it will be, but my gut tells me

699
00:40:15,520 --> 00:40:18,740
that we need to move, move, move
in this direction anyway and

700
00:40:18,740 --> 00:40:22,280
have some, I don't know, like more
experiments and so on.

701
00:40:22,280 --> 00:40:26,260
I hope we will have more soon to
publish, I mean, and start iterating.

702
00:40:27,880 --> 00:40:33,640
But yeah, it was a rollercoaster
last year, so now we are rebuilding

703
00:40:33,640 --> 00:40:34,140
stuff.

704
00:40:35,460 --> 00:40:36,920
We will see how it works.

705
00:40:37,660 --> 00:40:41,160
And for example, 1 of the experiments
we must do, I think, is

706
00:40:41,160 --> 00:40:43,660
to conduct various benchmarks for
RLS.

707
00:40:44,800 --> 00:40:45,800
Because it's obviously...

708
00:40:45,800 --> 00:40:46,320
That would

709
00:40:46,320 --> 00:40:47,860
Gwen: be a fantastic example.

710
00:40:48,160 --> 00:40:48,660
Yes.

711
00:40:49,760 --> 00:40:52,280
And I mean, this is something that
humans with experience are

712
00:40:52,280 --> 00:40:56,960
pretty good at finding cases where
you're like, is RLS going

713
00:40:56,960 --> 00:41:04,340
to actually be an issue and have
the stories that then the LLM

714
00:41:04,340 --> 00:41:06,180
can go implement and test.

715
00:41:06,340 --> 00:41:10,580
Nikolay: Yeah, we had several cases
and also Supabase has public

716
00:41:10,580 --> 00:41:12,680
materials, blog posts about this.

717
00:41:12,720 --> 00:41:16,720
Obviously, there is already a quite
well-known case when you have like

718
00:41:16,720 --> 00:41:20,040
the current_setting function inside
an RLS expression.

719
00:41:21,100 --> 00:41:28,700
And this is, yeah, if you select
a count of 1,000,000 rows, it's terrible.

720
00:41:28,860 --> 00:41:31,560
And it's quite easy to fix actually.

721
00:41:31,560 --> 00:41:35,040
But yeah, so these kinds of experiments
to collect them and see

722
00:41:35,040 --> 00:41:35,540
how.
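The well-known issue mentioned here, calling `current_setting()` inside an RLS policy so it is re-evaluated per row, and the easy fix, wrapping it in a scalar subquery so the planner evaluates it once, can be sketched like this (table and setting names are illustrative assumptions):

```sql
-- Slow variant: current_setting() runs for every row scanned,
-- which hurts badly on large counts.
CREATE POLICY tenant_isolation_slow ON invoices
  USING (tenant_id = current_setting('app.tenant_id')::uuid);

-- Faster variant: the scalar subquery is evaluated once per query
-- (as an InitPlan), then compared against each row.
CREATE POLICY tenant_isolation_fast ON invoices
  USING (tenant_id = (SELECT current_setting('app.tenant_id')::uuid));
```

This is the same pattern described in Supabase's public RLS performance material referenced in the conversation.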

723
00:41:35,580 --> 00:41:38,680
Actually, my goal with this experiment
I think will be to prove

724
00:41:38,680 --> 00:41:42,420
that they are not a problem if
you do it right.

725
00:41:42,900 --> 00:41:43,400
Gwen: Interesting.

726
00:41:43,980 --> 00:41:48,680
I would contribute, I think there
was a recent post on the bug

727
00:41:48,680 --> 00:41:53,720
tracker where basically the optimizer,
the planner refused to

728
00:41:53,720 --> 00:41:59,380
use, I think it was a GiST index
or a GIN index due to the belief

729
00:41:59,380 --> 00:42:04,080
that the RLS optimization is incorrect,
is unsafe.

730
00:42:04,900 --> 00:42:06,800
I can look it up and send it to
you.

731
00:42:06,900 --> 00:42:08,480
Yeah, it's interesting.

732
00:42:08,480 --> 00:42:10,700
But yeah, that can also be interesting.

733
00:42:10,940 --> 00:42:15,620
Like if you have a fix for Postgres,
then you can, it will obviously

734
00:42:15,720 --> 00:42:17,920
be nice to showcase.

735
00:42:18,820 --> 00:42:19,320
Nikolay: Right.

736
00:42:19,600 --> 00:42:25,280
I think also if you, so you don't
use RLS, you said bypass it,

737
00:42:25,280 --> 00:42:25,760
right?

738
00:42:25,760 --> 00:42:27,280
Gwen: We bypass it, yeah.

739
00:42:27,880 --> 00:42:28,860
Nikolay: Yeah, that's interesting.

740
00:42:28,860 --> 00:42:34,660
I'm curious if you, in this mixed
schema when we have partitions

741
00:42:34,780 --> 00:42:41,140
or shards, whether it makes
sense at all to involve RLS additionally

742
00:42:41,280 --> 00:42:46,480
locally if some of the shards have
a mixed pool of customers.

743
00:42:46,500 --> 00:42:49,600
Gwen: Okay, 1 of the questions
that I have in mind, and this

744
00:42:49,600 --> 00:42:53,220
is something that we're trying
to help figure out for our users,

745
00:42:53,900 --> 00:42:57,680
often on top of the tenant, you
still have permissions for specific

746
00:42:57,780 --> 00:42:59,120
users in the tenant.

747
00:42:59,120 --> 00:43:02,560
Like you have an admin that can
do anything, and then you may

748
00:43:02,560 --> 00:43:05,680
have someone who is not allowed
to see some rows at all because

749
00:43:05,680 --> 00:43:07,940
they're too sensitive, all this
kind of stuff.

750
00:43:08,600 --> 00:43:13,380
And our users ask us whether they
can do it with RLS inside those partitions,

751
00:43:14,500 --> 00:43:19,440
or they can do it in their application.

752
00:43:19,780 --> 00:43:23,540
There are a lot of application-level
tools, or middleware

753
00:43:23,680 --> 00:43:26,060
kind of tools that will do it for
them.

754
00:43:26,180 --> 00:43:31,520
Whether it's better to do it in the
app layer or in Postgres with RLS

755
00:43:31,840 --> 00:43:34,900
is a good question that I don't
have an immediate answer for.

756
00:43:35,380 --> 00:43:35,660
Nikolay: Right.

757
00:43:35,660 --> 00:43:39,900
So there are several layers of
multi-tenancy basically.

758
00:43:40,840 --> 00:43:41,340
Yeah.

759
00:43:41,400 --> 00:43:44,160
Not multi-tenancy, but different
layers.

760
00:43:45,020 --> 00:43:49,860
And if your customer is a tenant
of yours, they might have

761
00:43:49,860 --> 00:43:54,460
their own tenants, but also inside
them, they might have additional

762
00:43:55,600 --> 00:43:59,760
clusters or segments of users inside
each tenant.

763
00:43:59,820 --> 00:44:04,340
So it's apartments and rooms inside
apartments.

764
00:44:05,220 --> 00:44:05,720
Gwen: Nice.

765
00:44:06,100 --> 00:44:07,160
I like that.

766
00:44:08,860 --> 00:44:09,360
Yeah.

767
00:44:09,520 --> 00:44:13,030
This is one of the things that makes
multi-tenancy confusing, right?

768
00:44:13,030 --> 00:44:17,520
Because it's almost like a Matryoshka
kind of scenario.

769
00:44:17,800 --> 00:44:18,300
Nikolay: Yes.

770
00:44:18,620 --> 00:44:21,600
This term, by the way, is used in
the Postgres ecosystem multiple times,

771
00:44:21,600 --> 00:44:25,520
starting with GiST, which you mentioned;
the original paper by Hellerstein

772
00:44:25,900 --> 00:44:32,020
mentions the RD-tree, the Russian
doll tree, so basically a

773
00:44:32,020 --> 00:44:32,130
Matryoshka

774
00:44:32,130 --> 00:44:32,240
tree.

775
00:44:32,240 --> 00:44:33,140
Gwen: That's true.

776
00:44:33,840 --> 00:44:39,220
And also in embeddings, they have
the Matryoshka-type embeddings

777
00:44:39,340 --> 00:44:42,320
where you can make them of any
size.

778
00:44:42,800 --> 00:44:44,200
Nikolay: Yeah, I read about this
as well.

779
00:44:44,200 --> 00:44:45,140
Yeah, it's funny.

780
00:44:45,580 --> 00:44:46,560
So yeah, great.

781
00:44:47,120 --> 00:44:50,180
And what are your plans for AI?

782
00:44:50,180 --> 00:44:54,580
I saw MCP servers already with
some integration.

783
00:44:55,640 --> 00:44:57,900
Gwen: We basically have three directions.

784
00:44:58,360 --> 00:45:03,220
One is the MCP server and making it
public, giving it authentication.

785
00:45:03,600 --> 00:45:06,560
Right now, it's open-source, you
can run it on your own, but

786
00:45:06,560 --> 00:45:10,180
we're not hosting it, so we should
start hosting it at some point.

787
00:45:11,400 --> 00:45:15,400
So, things that just make it easier
for people who use LLMs

788
00:45:15,400 --> 00:45:16,580
to use it.

789
00:45:16,840 --> 00:45:19,820
Another thing we've done that we
think is very useful for LLMs,

790
00:45:19,820 --> 00:45:24,040
and this we already have, is making
it zero time to create new databases.

791
00:45:24,640 --> 00:45:28,220
Because for LLMs it's just zero
time and zero cost.

792
00:45:28,440 --> 00:45:30,300
Because they love creating a lot
of them.

793
00:45:30,300 --> 00:45:33,880
Every time something goes wrong,
okay, let's try from scratch

794
00:45:34,120 --> 00:45:36,560
with a new database kind of situation.

795
00:45:37,540 --> 00:45:39,400
So we're making it fast and cheap.

796
00:45:39,440 --> 00:45:42,080
The other thing that we're still
working on is really to make

797
00:45:42,080 --> 00:45:44,200
our documentation more LLM-friendly.

798
00:45:45,240 --> 00:45:48,540
LLMs.txt is absolutely not enough.

799
00:45:49,760 --> 00:45:53,720
Actually, it creates a very
large file.

800
00:45:53,720 --> 00:45:55,680
The LLMs tend to get lost in it.

801
00:45:55,680 --> 00:45:58,180
We need to figure out how to make it better.

802
00:45:59,440 --> 00:46:03,220
And then also, everyone is kind of thinking, can we have our

803
00:46:03,220 --> 00:46:03,980
own agents?

804
00:46:04,340 --> 00:46:06,800
Can we do something around that?

805
00:46:07,540 --> 00:46:09,360
We're kind of thinking about that.

806
00:46:09,900 --> 00:46:14,540
Something we already have that is useful is just that with the

807
00:46:14,540 --> 00:46:19,540
multi-tenant model, the embedding, the vector indexes are much

808
00:46:19,540 --> 00:46:20,040
smaller.

809
00:46:20,740 --> 00:46:25,120
And this is a huge deal for people building those agents and

810
00:46:25,120 --> 00:46:25,620
LLMs.

811
00:46:26,440 --> 00:46:26,780
Nikolay: Yeah.

812
00:46:26,780 --> 00:46:32,120
And actually, so this zero-startup goal, you mentioned

813
00:46:32,120 --> 00:46:34,440
separation of compute and data.

814
00:46:34,860 --> 00:46:36,240
How is that achieved?

815
00:46:36,500 --> 00:46:38,540
Using what approach?

816
00:46:39,160 --> 00:46:39,940
Can you elaborate?

817
00:46:39,940 --> 00:46:40,520
Gwen: Oh, yeah.

818
00:46:40,520 --> 00:46:41,000
Sorry.

819
00:46:41,000 --> 00:46:45,780
This is more than a two-minute answer, but the short one is that

820
00:46:45,780 --> 00:46:49,960
you kind of, you patch Postgres, and then you find a better way

821
00:46:49,960 --> 00:46:55,240
to do your storage and basically wrap every Postgres storage

822
00:46:55,240 --> 00:46:58,700
function with an equivalent with your storage.

823
00:46:59,340 --> 00:47:03,940
And then you also need to apply the WAL continuously to the

824
00:47:03,940 --> 00:47:04,860
storage layer.

825
00:47:05,180 --> 00:47:08,300
Nikolay: Great job doing this like in actually 20 seconds.

826
00:47:09,320 --> 00:47:10,720
I understood very well.

827
00:47:10,720 --> 00:47:11,220
Yeah.

828
00:47:11,280 --> 00:47:15,400
So, and do you have plans to make this open source as well?

829
00:47:16,800 --> 00:47:17,820
Gwen: Yes, absolutely.

830
00:47:17,860 --> 00:47:21,540
I mean, first of all, as you know, WAL readers have
831
00:47:21,540 --> 00:47:27,280
to be registered; ours already is, and with an open-source license.

832
00:47:28,380 --> 00:47:31,400
And then, yeah, we are planning to open

833
00:47:31,400 --> 00:47:31,880
source our

834
00:47:31,880 --> 00:47:33,720
storage layer, our extension.

835
00:47:34,200 --> 00:47:37,940
I think at this point it's about seven different patches we've made

836
00:47:38,080 --> 00:47:38,800
on Postgres.

837
00:47:39,160 --> 00:47:42,320
I don't think we'll want to open source it as a Postgres fork,

838
00:47:42,660 --> 00:47:46,100
because the number of patches is quite small and we are maintaining

839
00:47:46,220 --> 00:47:46,680
it.

840
00:47:46,680 --> 00:47:51,020
I think everything from Postgres 12 to 18 at this point, or maybe

841
00:47:51,020 --> 00:47:53,500
13 to 18, something along those lines.

842
00:47:53,680 --> 00:47:55,580
Nikolay: All supported versions, I guess.

843
00:47:55,960 --> 00:47:56,460
Yeah.

844
00:47:56,460 --> 00:47:57,040
That's great.

845
00:47:57,040 --> 00:47:57,400
Yeah.

846
00:47:57,400 --> 00:47:59,440
Well, looking forward to checking it out.

847
00:47:59,440 --> 00:48:02,880
And actually, one more question for me, maybe last one.

848
00:48:02,980 --> 00:48:06,800
Are you open to some benchmarks we will probably do?

849
00:48:07,200 --> 00:48:09,680
We plan to do some benchmarks with various platforms.

850
00:48:10,520 --> 00:48:14,240
It all started after the acquisition
of Neon, and there was a discussion
851
00:48:14,680 --> 00:48:15,560
on LinkedIn.

852
00:48:15,600 --> 00:48:18,900
So I thought about trying some
benchmarks for different platforms.

853
00:48:19,540 --> 00:48:23,800
What do you think, like what are
your thoughts about it?

854
00:48:24,280 --> 00:48:26,060
Gwen: Oh my God, this is very scary.

855
00:48:26,140 --> 00:48:29,120
We are benchmarking ourselves all
the time.

856
00:48:29,180 --> 00:48:33,220
So I'm keenly aware of exactly
like what benchmarks make us look

857
00:48:33,220 --> 00:48:36,020
good and what benchmarks make us
look bad.

858
00:48:36,020 --> 00:48:39,800
And I think that's also how we
react to benchmarks in social

859
00:48:39,800 --> 00:48:40,300
media.

860
00:48:40,640 --> 00:48:44,020
Unless it's a benchmark that shows
something in Postgres itself

861
00:48:44,040 --> 00:48:47,540
and is clearly attempting to
educate and help people.

862
00:48:48,260 --> 00:48:52,280
You can design a benchmark to make
anyone in the world look bad.

863
00:48:52,280 --> 00:48:53,252
Nikolay: Benchmarketing, it's called.

864
00:48:53,252 --> 00:48:53,508
Michael: You can design

865
00:48:53,508 --> 00:48:53,860
Nikolay: a benchmark to

866
00:48:53,860 --> 00:48:55,240
Gwen: make everyone look good.

867
00:48:55,240 --> 00:48:55,740
Exactly.

868
00:48:56,440 --> 00:49:01,800
So, at any given time, I can
publish my benchmarks that

869
00:49:01,800 --> 00:49:07,720
make Nile look fantastic, and
you can run benchmarks that you

870
00:49:07,720 --> 00:49:12,780
designed yourself, which may
be very realistic and may

871
00:49:12,780 --> 00:49:15,900
even expose a problem that we have
that I didn't know about.

872
00:49:16,840 --> 00:49:19,080
But, yeah, in general, I love benchmarks.

873
00:49:19,920 --> 00:49:23,620
I just have opinions on how benchmarks
are used by marketing

874
00:49:23,620 --> 00:49:24,120
people.

875
00:49:25,580 --> 00:49:26,660
Michael: Very good answer.

876
00:49:27,260 --> 00:49:30,880
For anybody interested in a bit
more about Nile's architecture,

877
00:49:31,120 --> 00:49:35,140
Gwen gave a really good talk at
PGConf.dev recently and the video

878
00:49:35,140 --> 00:49:37,800
just went up on YouTube so I will
put that in the show notes

879
00:49:37,800 --> 00:49:40,020
for anybody that wants a deeper
dive there.

880
00:49:40,840 --> 00:49:45,560
There was also a good talk I saw
at a PgDay Paris event by Pierre

881
00:49:45,580 --> 00:49:48,560
Ducroquet, I'm not sure if I'm
pronouncing that anywhere near

882
00:49:48,560 --> 00:49:52,360
correctly, all about multi-tenant
database design, especially

883
00:49:52,460 --> 00:49:56,260
focusing on something we didn't
focus much on today, which was

884
00:49:56,600 --> 00:50:01,620
the downsides of schema per tenant
design, including things like

885
00:50:01,620 --> 00:50:04,300
observability and monitoring, which
I thought was really fascinating.

886
00:50:04,360 --> 00:50:07,360
So anybody considering going down
that route definitely check

887
00:50:07,360 --> 00:50:09,940
out that video and I'll put that
in the show notes as well.

888
00:50:10,120 --> 00:50:12,900
So yeah, thank you so much, Gwen.
I think we're out of time. It's

889
00:50:12,900 --> 00:50:13,940
been a real pleasure.

890
00:50:14,060 --> 00:50:16,380
Gwen: It's been a pleasure. Thank
you for having me on.