1
00:00:00,060 --> 00:00:02,160
Michael: Hello and welcome to Postgres.FM, a weekly show about

2
00:00:02,160 --> 00:00:03,260
all things PostgreSQL.

3
00:00:03,260 --> 00:00:06,140
I am Michael, founder of pgMustard, and I'm joined by Nik, as

4
00:00:06,140 --> 00:00:07,640
usual, from Postgres.AI.

5
00:00:07,640 --> 00:00:08,800
Hey Nik, how's it going?

6
00:00:09,720 --> 00:00:10,820
Nikolay: Everything is fantastic.

7
00:00:10,920 --> 00:00:11,960
How are you, Michael?

8
00:00:12,500 --> 00:00:13,340
Michael: Yeah, good.

9
00:00:13,380 --> 00:00:14,160
Good also.

10
00:00:15,940 --> 00:00:17,800
What are we talking about this week?

11
00:00:17,980 --> 00:00:23,040
Nikolay: So today, this week, we are talking about let's move

12
00:00:23,040 --> 00:00:30,980
out of RDS or put your own managed Postgres name there.

13
00:00:31,700 --> 00:00:32,340
And how?

14
00:00:32,980 --> 00:00:36,100
Michael: Yeah, so you mentioned how to move off RDS but I was

15
00:00:36,100 --> 00:00:39,020
going to ask how specific you wanted to be to, well because there's

16
00:00:39,020 --> 00:00:43,160
even multiple flavors of RDS in terms, even Postgres compatible.

17
00:00:43,680 --> 00:00:47,080
Nikolay: Yeah, it's great. And we have a lot of managed service

18
00:00:47,080 --> 00:00:49,660
providers and that's super good.

19
00:00:49,660 --> 00:00:53,000
All of them have their pros and cons.

20
00:00:53,000 --> 00:00:54,500
Some of them are already dead.

21
00:00:55,320 --> 00:01:00,800
We must say, and you know, there's a saying, like: a human

22
00:01:00,800 --> 00:01:04,540
can be not only dead, but can be suddenly dead. And this is

23
00:01:04,540 --> 00:01:07,560
true about... this is from Russian literature.

24
00:01:07,580 --> 00:01:08,360
Michael: Okay, okay,

25
00:01:08,360 --> 00:01:14,940
Nikolay: Bulgakov. The Master and Margarita. So managed Postgres

26
00:01:14,960 --> 00:01:18,980
service can be suddenly dead also. And we know several examples,

27
00:01:19,540 --> 00:01:25,740
and what annoys me is how they provide very short notice sometimes.

28
00:01:26,480 --> 00:01:28,980
I don't expect this from RDS, but who knows?

29
00:01:29,280 --> 00:01:29,620
Michael: Right?

30
00:01:29,620 --> 00:01:31,020
No, definitely not.

31
00:01:31,020 --> 00:01:32,620
Nikolay: RDS is huge, right?

32
00:01:32,620 --> 00:01:37,440
But we know Google can kill services.

33
00:01:38,140 --> 00:01:41,620
Not with 30 days short notice, right, or 2 weeks.

34
00:01:42,100 --> 00:01:48,480
But among recent examples, I remember Bit.io, which was an interesting

35
00:01:48,480 --> 00:01:55,580
idea, drop CSV file and it gives you Postgres and you can start

36
00:01:55,580 --> 00:01:56,080
querying.

37
00:01:56,360 --> 00:01:59,440
I remember, I think Tembo is a recent example, right?

38
00:02:00,040 --> 00:02:01,820
Michael: Yeah, but how much notice are they giving?

39
00:02:03,480 --> 00:02:06,360
Nikolay: Bit.io gave, I think 30 day notice only.

40
00:02:06,420 --> 00:02:07,100
Michael: It's not,

41
00:02:07,120 --> 00:02:09,440
Nikolay: it's not, it's not all right for databases.

42
00:02:09,620 --> 00:02:13,860
We know how difficult it is to do any action.

43
00:02:13,860 --> 00:02:17,320
If you exceed, for example, 1 terabyte and it's already like

44
00:02:17,320 --> 00:02:21,480
significant size when you, any move you need to plan carefully

45
00:02:21,600 --> 00:02:23,800
and practice and test and then perform.

46
00:02:24,440 --> 00:02:26,320
Yeah, well, Tembo, right?

47
00:02:26,320 --> 00:02:30,720
Tembo, I think they completely pivoted to different area and

48
00:02:30,720 --> 00:02:32,660
also PostgresML is a new example.

49
00:02:33,620 --> 00:02:38,440
And, but PostgresML, I think 1 of the founders is PgCat.

50
00:02:38,560 --> 00:02:40,340
He was here long ago.

51
00:02:40,340 --> 00:02:40,940
Michael: Yeah, Lev.

52
00:02:41,520 --> 00:02:41,880
Nikolay: Right.

53
00:02:41,880 --> 00:02:45,020
But PostgresML is closed, but others
as well.

54
00:02:45,020 --> 00:02:45,280
Right.

55
00:02:45,280 --> 00:02:47,140
So I think it's, it happens.

56
00:02:47,440 --> 00:02:51,340
So 1 of the reasons to migrate
out is lack of control.

57
00:02:52,720 --> 00:02:56,180
I think, if we already discussed
reasons, do we?

58
00:02:56,180 --> 00:02:58,160
Michael: Yeah, let's do it a little
bit.

59
00:02:58,260 --> 00:03:02,180
I hear people talking a little
bit about costs, but I think the

60
00:03:02,180 --> 00:03:07,160
ones I see more often seem to be
yeah low-level access extensions

61
00:03:07,540 --> 00:03:11,040
like if you've got a specific extension
you want supported you

62
00:03:11,040 --> 00:03:14,780
might migrate I guess sometimes
between providers rather than

63
00:03:14,780 --> 00:03:19,500
off but yeah migrating off it also
could include migrating on

64
00:03:19,500 --> 00:03:20,560
to a different one.

65
00:03:21,040 --> 00:03:23,560
Nikolay: Yeah, yeah, definitely, because
some of them are much more

66
00:03:23,560 --> 00:03:27,720
open. Some of them, not many, but
some. Crunchy, right? Crunchy Bridge.

67
00:03:27,720 --> 00:03:30,360
Michael: Yeah, or some support different
extensions, like maybe the

68
00:03:30,360 --> 00:03:34,580
extension you want is on 1 managed
service but not another.

69
00:03:35,240 --> 00:03:38,080
Nikolay: You want TimescaleDB
or you need to migrate to Timescale Cloud?

70
00:03:39,440 --> 00:03:42,100
Michael: Or even I think I saw
a conversation you were having

71
00:03:42,100 --> 00:03:43,580
that is it PL/Proxy?

72
00:03:44,720 --> 00:03:45,520
Nikolay: Oh, Citus.

73
00:03:45,900 --> 00:03:46,400
Yeah.

74
00:03:46,500 --> 00:03:46,740
Yeah.

75
00:03:46,740 --> 00:03:50,400
Because how the Citus is working
there, this is my theory.

76
00:03:50,500 --> 00:03:55,640
That's why PGQ and SkyTools are
there basically, right?

77
00:03:55,640 --> 00:03:57,180
So and PL/Proxy as well.

78
00:03:57,620 --> 00:04:02,300
So yeah, well, and all others only
have, not actually all others,

79
00:04:02,300 --> 00:04:06,040
RDS doesn't have even PgBouncer,
which also originated from Skype

80
00:04:06,820 --> 00:04:09,180
20, almost 20 years ago.

81
00:04:09,340 --> 00:04:10,420
Michael: I didn't know that.

82
00:04:10,580 --> 00:04:14,600
Nikolay: Yeah, PgBouncer is the
most popular product created

83
00:04:14,600 --> 00:04:16,580
inside Skype 20 years ago.

84
00:04:17,020 --> 00:04:19,740
Others are PGQ, Londiste and
PL/Proxy.

85
00:04:20,860 --> 00:04:25,240
I think there are more, but these
I used and remember very well.

86
00:04:25,240 --> 00:04:25,740
Michael: Nice.

87
00:04:26,120 --> 00:04:31,740
Nikolay: So yeah, if we talk about
reasons, There are many reasons.

88
00:04:32,540 --> 00:04:36,900
For me, inside me, and the services
are different.

89
00:04:36,900 --> 00:04:38,040
Some of them are open,

90
00:04:38,140 --> 00:04:38,480
Michael: some of

91
00:04:38,480 --> 00:04:41,040
Nikolay: them are fully based on
open source, which is great,

92
00:04:41,040 --> 00:04:45,520
like Supabase, and we're not strictly
based on open source,

93
00:04:45,520 --> 00:04:46,580
and this is great.

94
00:04:47,480 --> 00:04:51,000
Some of them provide super user
like Citus or access

95
00:04:51,000 --> 00:04:52,120
to physical backups.

96
00:04:52,120 --> 00:04:57,900
It's a miracle among managed
Postgres providers, but most

97
00:04:57,900 --> 00:05:02,920
of them are telling you, we are
protecting you give us almost

98
00:05:02,920 --> 00:05:08,260
twice more money and we will limit
your access to your own things.

99
00:05:09,280 --> 00:05:14,340
And we will take care of backups
and failover, HA and DR, and

100
00:05:14,340 --> 00:05:15,360
some other features.

101
00:05:16,500 --> 00:05:19,940
And all of these features are very
well established already in the

102
00:05:19,940 --> 00:05:21,240
open source ecosystem.

103
00:05:22,340 --> 00:05:29,800
So I think anger lives inside me
and grows towards managed service

104
00:05:29,800 --> 00:05:34,700
providers because I think first
of all they make money, much

105
00:05:34,700 --> 00:05:38,860
more money than they should on
open source, not contributing

106
00:05:39,400 --> 00:05:40,220
back enough.

107
00:05:40,760 --> 00:05:42,440
Not everyone, but many.

108
00:05:42,580 --> 00:05:48,580
But the main problem is, I think, I
truly believe they stole the openness

109
00:05:48,640 --> 00:05:53,500
from us, the open source which
we loved, always comparing to

110
00:05:53,500 --> 00:05:57,080
SQL Server, Oracle, or Linux versus
Microsoft Windows.

111
00:05:57,540 --> 00:06:01,620
Those guys who create managed service
providers, among them are

112
00:06:01,620 --> 00:06:06,140
great engineers, and I admire and
have huge respect to many of

113
00:06:06,140 --> 00:06:06,640
them.

114
00:06:06,740 --> 00:06:11,880
But in the grand scheme, this is the new,
like Oracle and vendor lock-in

115
00:06:12,180 --> 00:06:16,940
and closed everything and bad processes,
inability to fix problems,

116
00:06:16,940 --> 00:06:19,080
understand what's happening and
so on.

117
00:06:19,080 --> 00:06:23,040
All things which open source intended
to solve.

118
00:06:23,480 --> 00:06:24,880
Like, for example, you have...

119
00:06:25,020 --> 00:06:28,220
Like, why I stopped using Oracle
8i in 2000?

120
00:06:28,260 --> 00:06:31,460
Because I spent 3 days trying to
fix something.

121
00:06:32,280 --> 00:06:36,360
It worked on 1 machine and didn't
work at our client machine.

122
00:06:36,600 --> 00:06:40,160
And then I read somewhere, OK,
internet already was a thing.

123
00:06:40,160 --> 00:06:43,600
I read somewhere that if host name
has a closing parenthesis,

124
00:06:43,780 --> 00:06:45,460
then Oracle cannot connect.

125
00:06:45,620 --> 00:06:47,360
You cannot connect to Oracle.

126
00:06:47,360 --> 00:06:48,740
And no errors, nothing.

127
00:06:49,540 --> 00:06:52,680
And I eventually switched to open
source where such kind of problems

128
00:06:52,680 --> 00:06:54,360
you can troubleshoot easily.

129
00:06:54,960 --> 00:06:56,680
Like if you have advanced skills.

130
00:06:56,840 --> 00:07:00,520
And many people, they are not dumb,
like many people are quite

131
00:07:00,520 --> 00:07:04,800
smart and can learn how to use
Pure or Solve and troubleshoot.

132
00:07:05,460 --> 00:07:09,400
But managed service providers stole
this openness and ability

133
00:07:09,440 --> 00:07:11,060
to see what's happening.

134
00:07:11,840 --> 00:07:17,220
And they just packaged open source
in the form of service, charged

135
00:07:17,220 --> 00:07:21,000
a lot of money, and also closed
important features like access

136
00:07:21,000 --> 00:07:25,900
to physical backups or ability
to have a physical replication

137
00:07:26,040 --> 00:07:29,680
connection basically implementing
vendor lock-in.

138
00:07:30,640 --> 00:07:34,900
Michael: So, well, but they didn't
steal it, right? It's licensed

139
00:07:34,940 --> 00:07:38,600
that way deliberately, to be very
permissive, Postgres.

140
00:07:38,800 --> 00:07:39,720
So there's a...

141
00:07:39,720 --> 00:07:41,560
Nikolay: It's Postgres, right?

142
00:07:41,560 --> 00:07:42,540
It's open source.

143
00:07:42,560 --> 00:07:43,880
It's the same as Postgres.

144
00:07:44,280 --> 00:07:45,620
We almost didn't change it.

145
00:07:45,620 --> 00:07:46,400
We use it as it is.

146
00:07:46,400 --> 00:07:47,140
It's Postgres.

147
00:07:47,580 --> 00:07:49,440
And it's like enterprise level.

148
00:07:49,440 --> 00:07:52,000
Michael: To call it, you mean,
you mean it's the calling it Postgres

149
00:07:52,000 --> 00:07:52,740
is what you're...

150
00:07:52,740 --> 00:07:55,940
Nikolay: Yeah, they call it Postgres
and Postgres popularity.

151
00:07:56,600 --> 00:08:00,360
Actually, it's a two-sided question,
because I also think RDS

152
00:08:00,360 --> 00:08:03,660
contributed to the rise of Postgres
popularity around 2015.

153
00:08:04,200 --> 00:08:08,160
I think JSON was a big contributor,
but if you check carefully,

154
00:08:08,480 --> 00:08:11,520
RDS probably was a bigger contributor
to the growth of Postgres

155
00:08:11,520 --> 00:08:14,080
popularity, because everyone hated
installing Postgres.

156
00:08:14,160 --> 00:08:15,040
This is true.

157
00:08:15,240 --> 00:08:16,460
But things are changing.

158
00:08:16,460 --> 00:08:19,560
It's much much easier these days
to install.

159
00:08:20,280 --> 00:08:22,980
Michael: And when I started, when
I got more and more involved

160
00:08:22,980 --> 00:08:28,040
in the Postgres community, maybe
6 years ago, most Postgres committers

161
00:08:28,320 --> 00:08:32,360
were, well, I didn't look at the
numbers, but I think from memory,

162
00:08:32,360 --> 00:08:36,560
most committers and contributors
to Postgres were working at

163
00:08:36,560 --> 00:08:42,180
consulting firms who did a lot
of community work but also I guess

164
00:08:42,180 --> 00:08:43,940
implementing features for their
clients.

165
00:08:44,480 --> 00:08:50,220
These days I wouldn't be surprised
if, and I know the lines are

166
00:08:50,220 --> 00:08:53,680
blurred between consulting firms
and managed service providers

167
00:08:53,680 --> 00:08:56,280
these days, but I wouldn't be surprised
if a group, I definitely

168
00:08:56,280 --> 00:08:59,560
know that lots and lots of committers
are moving to the hyperscalers

169
00:09:00,060 --> 00:09:02,060
and other managed service providers.

170
00:09:02,200 --> 00:09:02,820
Nikolay: Of course.

171
00:09:03,540 --> 00:09:06,720
Michael: But what I mean is there
is some contributing back from

172
00:09:06,720 --> 00:09:07,120
them.

173
00:09:07,120 --> 00:09:10,260
But I do take your point that there's
an interesting dynamic

174
00:09:10,320 --> 00:09:10,520
here.

175
00:09:10,520 --> 00:09:14,700
And I would be interested in your
take on what proportion of

176
00:09:14,700 --> 00:09:18,980
people are moving for those kind
of ideological reasons versus

177
00:09:19,380 --> 00:09:20,320
cost reasons.

178
00:09:20,440 --> 00:09:23,060
Because you mentioned it's like
double the price for less access.

179
00:09:23,420 --> 00:09:25,700
Which of those is more important
to people?

180
00:09:25,760 --> 00:09:27,440
Double the cost or the less access?

181
00:09:27,440 --> 00:09:28,400
Nikolay: That's a good question.

182
00:09:28,400 --> 00:09:32,220
First of all, again, the question
is not like one-sided.

183
00:09:33,340 --> 00:09:37,360
I admit all things RDS brought
and others brought and

184
00:09:37,360 --> 00:09:37,860
Michael: it's

185
00:09:39,560 --> 00:09:40,240
Nikolay: for sure.

186
00:09:40,240 --> 00:09:46,960
But also I just feel it like how
working with Oracle and SQL

187
00:09:46,960 --> 00:09:51,860
Server, not understanding what's
happening, always like a need

188
00:09:51,860 --> 00:09:56,460
to reach out support and waiting
months to get any small response.

189
00:09:56,600 --> 00:09:59,000
I was like, it's hard.

190
00:09:59,220 --> 00:10:03,480
Then working with open source,
you just see how it works and

191
00:10:03,480 --> 00:10:06,400
you can even fix it if needed,
or you can troubleshoot.

192
00:10:06,580 --> 00:10:08,860
It's like documentation always
lacks details.

193
00:10:09,120 --> 00:10:12,780
So source code is best documentation
and ability to debug.

194
00:10:14,680 --> 00:10:18,360
This is a huge part of open source
concept for me.

195
00:10:18,700 --> 00:10:20,380
And this lacks there, right?

196
00:10:20,380 --> 00:10:22,800
So for me, it's a big problem.

197
00:10:23,360 --> 00:10:26,600
But I understand I'm in minority
here.

198
00:10:26,880 --> 00:10:28,700
But I'm not afraid to be in minority.

199
00:10:28,700 --> 00:10:32,280
20 years ago, Postgres itself was
deeply in minority, deeply.

200
00:10:32,500 --> 00:10:37,580
It was like, MySQL everywhere,
Oracle is in enterprise.

201
00:10:37,940 --> 00:10:39,300
What is Postgres, right?

202
00:10:39,620 --> 00:10:47,080
So I truly believe that the RDS rise
started roughly 10 years ago,

203
00:10:47,080 --> 00:10:49,260
maybe 12, right, around that time.

204
00:10:49,400 --> 00:10:51,060
It was like a storm.

205
00:10:51,060 --> 00:10:54,140
Heroku started first, and I think
Heroku was historically first,

206
00:10:54,140 --> 00:10:56,120
like, good managed Postgres, right?

207
00:10:56,120 --> 00:10:57,320
And RDS, others.

208
00:10:58,780 --> 00:10:59,620
Everyone has it.

209
00:10:59,620 --> 00:11:03,640
Even IBM or Huawei has it, or Oracle
has it.

210
00:11:03,640 --> 00:11:05,700
It's insane, absolutely insane.

211
00:11:05,980 --> 00:11:12,480
So I truly believe this peak, we
already like should go down

212
00:11:13,320 --> 00:11:17,920
or soon will start to go down, and
something new has to be created.

213
00:11:18,540 --> 00:11:19,780
This is what I feel.

214
00:11:19,840 --> 00:11:24,380
Maybe it will take 10 years or
so, another 10 years, but it should

215
00:11:24,380 --> 00:11:31,480
not be so because the whole philosophical
point of open source

216
00:11:31,480 --> 00:11:32,620
is deeply disturbed.

217
00:11:33,420 --> 00:11:33,920
Right?

218
00:11:34,640 --> 00:11:38,240
It sucks to troubleshoot problem
on RDS trying to reproduce it

219
00:11:38,240 --> 00:11:39,860
on regular Postgres.

220
00:11:40,160 --> 00:11:42,260
You try to guess things all the
time.

221
00:11:42,260 --> 00:11:44,160
You don't see, you are blind, right?

222
00:11:44,680 --> 00:11:50,640
Some database is suffering, their
support is slow or inexperienced.

223
00:11:50,740 --> 00:11:53,620
Michael: We're repeating a conversation
we've had several times.

224
00:11:54,020 --> 00:11:54,440
Nikolay: Yeah, yeah.

225
00:11:54,440 --> 00:11:58,700
Well, the problem is like, for
me, this reason is deep, it's

226
00:11:58,700 --> 00:12:03,840
strong, but I also understand I'm
among very few people who feel

227
00:12:03,840 --> 00:12:04,700
it I think

228
00:12:05,080 --> 00:12:07,800
Michael: So why bring this one up
about how to move off?

229
00:12:07,800 --> 00:12:10,080
Is it because you want people to
move or is it because you're

230
00:12:10,080 --> 00:12:11,760
seeing that they're moving and
they need advice?

231
00:12:11,760 --> 00:12:13,080
Nikolay: I want to start this discussion.

232
00:12:13,080 --> 00:12:17,400
I also know that probably more
people will move because of budgets

233
00:12:18,000 --> 00:12:23,000
especially like economical reasons
they can yeah you at some

234
00:12:23,000 --> 00:12:28,920
point you realize you spend a lot
on RDS and other services probably,

235
00:12:28,940 --> 00:12:30,760
but RDS sometimes is huge.

236
00:12:30,760 --> 00:12:34,940
Like it can be sometimes it's like
between 20 and 50% of the whole

237
00:12:34,940 --> 00:12:36,780
budget for cloud RDS.

238
00:12:37,340 --> 00:12:42,760
And I think there are interesting
pieces of how you can get support.

239
00:12:43,480 --> 00:12:46,320
As I understand, to get support
at RDS, whole account should

240
00:12:46,320 --> 00:12:47,140
be upgraded.

241
00:12:47,200 --> 00:12:49,240
So you cannot upgrade only RDS
part.

242
00:12:49,240 --> 00:12:50,700
But this I'm not sure.

243
00:12:50,740 --> 00:12:55,080
What I'm sure, it costs a lot,
and it costs for something that

244
00:12:55,080 --> 00:12:59,440
can be achieved with open source,
and a little bit of support

245
00:12:59,440 --> 00:13:00,400
from some guys.

246
00:13:01,880 --> 00:13:05,800
You can hire those guys, or you
can use services, but it will

247
00:13:05,800 --> 00:13:09,360
cost a fraction of what you can
have on AWS.

248
00:13:11,320 --> 00:13:13,680
And so sometimes some companies
are in trouble.

249
00:13:14,120 --> 00:13:15,680
They check spendings.

250
00:13:16,360 --> 00:13:22,580
They see a big check, like, say,
10,000 per month for some small

251
00:13:22,580 --> 00:13:23,080
company.

252
00:13:23,200 --> 00:13:24,640
It's significant money, right?

253
00:13:24,960 --> 00:13:26,920
So you can hire a guy for this.

254
00:13:27,820 --> 00:13:32,300
And you check what you can achieve
on low costers like Hetzner

255
00:13:32,560 --> 00:13:34,460
or OVH or something.

256
00:13:35,240 --> 00:13:37,460
And you see, you can drop it 5
times.

257
00:13:39,400 --> 00:13:40,120
Why not?

258
00:13:40,440 --> 00:13:40,940
Right.

259
00:13:41,400 --> 00:13:45,780
Or maybe it can be 120,000
after migration.

260
00:13:46,520 --> 00:13:51,060
And you see that, like, as I said
between 20 and 50 percent of

261
00:13:51,060 --> 00:13:58,160
costs are databases, right? We're
talking about guys who are not raising,

262
00:13:58,260 --> 00:14:02,360
but they need to survive as well
There are such companies and

263
00:14:02,540 --> 00:14:04,240
sometimes they need help as well.

264
00:14:05,140 --> 00:14:09,480
Michael: Yeah, so the cynic in
me is thinking, oh, Nikolay wants

265
00:14:09,480 --> 00:14:10,220
more customers.

266
00:14:11,480 --> 00:14:14,380
Nikolay: No, unfortunately, in
this case, I can help 1 time,

267
00:14:14,380 --> 00:14:16,580
like, move, but we don't have a solution
yet.

268
00:14:17,780 --> 00:14:21,340
Others might have solutions to
this, and sometimes it's Kubernetes,

269
00:14:21,380 --> 00:14:22,680
sometimes it's not Kubernetes.

270
00:14:22,780 --> 00:14:24,820
I would prefer not Kubernetes in
this case.

271
00:14:26,880 --> 00:14:29,340
If you're very familiar with Kubernetes,
okay, good.

272
00:14:29,340 --> 00:14:33,360
There are options, but I would
just prefer maybe like old-school

273
00:14:34,660 --> 00:14:38,560
solution without additional layers
if it's a small project, it's

274
00:14:38,560 --> 00:14:43,520
maybe easier and yeah, we can help,
but I'm not looking for customers

275
00:14:43,520 --> 00:14:48,040
because this is actually not, not
super huge customer.

276
00:14:48,060 --> 00:14:52,120
Usually We had such cases, but
it's like small project for us

277
00:14:52,120 --> 00:14:55,960
to migrate and I just wanted to
share some pieces of advice and

278
00:14:55,960 --> 00:15:02,380
start like spark this negotiation
about why like we use RDS so

279
00:15:02,380 --> 00:15:06,240
much while backups, HA and DR are already
solved in open source,

280
00:15:06,300 --> 00:15:07,240
fully solved.

281
00:15:07,420 --> 00:15:10,040
Take Patroni, and pgBackRest or WAL-G,
that's it.

282
00:15:10,040 --> 00:15:14,560
And just find some packaging or
build your own and that's it.

283
00:15:15,200 --> 00:15:15,700
Yeah.

284
00:15:16,400 --> 00:15:18,620
So not convincing or what?

285
00:15:19,200 --> 00:15:22,480
Michael: Yeah, well, I mean, I
think you're right at a certain

286
00:15:22,480 --> 00:15:25,920
scale, but I think there's a lot
of smaller scales where it just

287
00:15:25,920 --> 00:15:26,600
makes sense.

288
00:15:26,600 --> 00:15:30,440
Like it doesn't make as much sense
to try and save that amount

289
00:15:30,440 --> 00:15:31,740
of money for the...

290
00:15:32,360 --> 00:15:35,660
Nikolay: How much, like, let's
take some company, how much is

291
00:15:35,660 --> 00:15:38,980
total cost and like in percentage,
how much databases?

292
00:15:40,360 --> 00:15:41,740
Michael: Yeah, good, good question.

293
00:15:41,940 --> 00:15:47,220
But I think like, even up to spending,
Well, definitely less

294
00:15:47,220 --> 00:15:50,140
if you're spending less than
$1,000 a month on the database,

295
00:15:50,340 --> 00:15:51,340
like why bother?

296
00:15:51,900 --> 00:15:52,860
Nikolay: Not noticeable, right.

297
00:15:52,860 --> 00:15:53,760
But if it's

298
00:15:54,140 --> 00:15:55,080
Michael: Do you see what I mean?

299
00:15:55,080 --> 00:15:59,280
Like, that's quite a big like,
for a lot of very small startups,

300
00:15:59,480 --> 00:16:00,920
they won't be spending that.

301
00:16:00,920 --> 00:16:03,060
So, well, I think there were a
lot.

302
00:16:03,340 --> 00:16:06,880
Nikolay: Well, depending on the
term small, because if it's still

303
00:16:06,880 --> 00:16:11,900
small team, but they accumulated
a lot of data, the budget for

304
00:16:12,440 --> 00:16:15,300
to keep this data in RDS Postgres
will be high.

305
00:16:15,300 --> 00:16:18,620
And I think we touched on an interesting
question.

306
00:16:18,620 --> 00:16:23,360
I think there is some threshold
where you can estimate how much

307
00:16:23,360 --> 00:16:26,700
effort in terms of engineering
resources it will take.

308
00:16:26,700 --> 00:16:30,080
And then how much, like, there
should be some threshold.

309
00:16:30,540 --> 00:16:34,040
Below that, it's not, like, reasonable
to move out of RDS.

310
00:16:34,040 --> 00:16:37,740
But above that, it might be reasonable,
especially considering

311
00:16:37,740 --> 00:16:43,100
that the quality of Patroni, WAL-G,
and pgBackRest, it's very

312
00:16:43,100 --> 00:16:43,620
good enough.

313
00:16:43,620 --> 00:16:45,540
They are battle-proven many years
already.

314
00:16:46,320 --> 00:16:52,120
And yeah, so I think, and we see
in RDS, I don't know if they

315
00:16:52,120 --> 00:16:56,140
implemented using Patroni, but
this so-called HA cluster, in

316
00:16:56,140 --> 00:17:00,320
addition to HA, not HA, multi-AZ,
multi-AZ cluster, in addition

317
00:17:00,320 --> 00:17:05,600
to multi-AZ instance, 3 node cluster,
it already looks like classic

318
00:17:05,600 --> 00:17:08,080
Patroni setup, right?

319
00:17:08,320 --> 00:17:11,020
And if you compare costs, it will
be interesting.

320
00:17:11,200 --> 00:17:17,440
But also you, if you move out of
AWS to out of cloud or to cheaper

321
00:17:17,440 --> 00:17:22,760
cloud like Hetzner Cloud, which
has regions in the US and a lot

322
00:17:22,760 --> 00:17:24,640
of regions, not a lot, some regions

323
00:17:24,640 --> 00:17:24,960
Michael: in Europe.

324
00:17:24,960 --> 00:17:26,400
As of recently, yeah.

325
00:17:27,040 --> 00:17:30,980
Nikolay: And they also, since I
think last year, have S3 compatible

326
00:17:31,680 --> 00:17:35,040
object storage for backups, but
only in Europe.

327
00:17:35,220 --> 00:17:35,740
That's a fact.

328
00:17:35,740 --> 00:17:36,220
Michael: Oh, really?

329
00:17:36,220 --> 00:17:37,060
Nikolay: That's a fact.

330
00:17:37,060 --> 00:17:40,340
I hope they will add it to US regions
as well soon.

331
00:17:41,040 --> 00:17:43,680
And I actually really like their
dedicated offering.

332
00:17:43,700 --> 00:17:47,660
I used it many years ago when I
was bootstrapping it.

333
00:17:47,660 --> 00:17:49,140
And I know many people use it.

334
00:17:49,140 --> 00:17:52,320
And sometimes customers come to
us and they use Hetzner for bootstrapping,

335
00:17:52,640 --> 00:17:53,720
and it makes sense.

336
00:17:54,720 --> 00:17:58,600
And it's even more cost saving.

337
00:17:58,820 --> 00:18:01,620
But it's also not available in
the US, unfortunately.

338
00:18:02,240 --> 00:18:09,240
So anyway, if you take that price
for instances, EC2 versus virtual

339
00:18:09,240 --> 00:18:12,040
machine in Hetzner Cloud, it's already 5X.
it's already 5X.

340
00:18:12,920 --> 00:18:19,180
But on top of that, the premium AWS
adds to run backups and failover

341
00:18:19,220 --> 00:18:20,900
and other services they provide.

342
00:18:21,820 --> 00:18:25,220
It's additionally like 60, 70%,
right?

343
00:18:25,640 --> 00:18:25,960
Michael: Yeah.

344
00:18:25,960 --> 00:18:26,820
Well, but okay.

345
00:18:26,820 --> 00:18:29,200
So I think we've covered enough
on like why.

346
00:18:29,380 --> 00:18:30,300
I think people,

347
00:18:31,020 --> 00:18:34,580
Nikolay: yeah, control and money,
money and control.

348
00:18:35,400 --> 00:18:35,900
Yeah.

349
00:18:36,000 --> 00:18:39,660
Michael: So, so then I think the
more, the trickier question

350
00:18:39,660 --> 00:18:40,360
is how?

351
00:18:41,600 --> 00:18:45,360
And especially how to do it with
minimal or no downtime.

352
00:18:46,060 --> 00:18:49,020
Nikolay: Yeah, well, again,

353
00:18:49,020 --> 00:18:53,000
this is not only about RDS. Maybe you want to migrate out of other,
you want to migrate out of other,

354
00:18:53,000 --> 00:18:57,680
like CloudSQL or Crunchy Bridge
or something, despite their achievements.

355
00:18:58,860 --> 00:19:04,040
First thing, let's talk about versions,
compatibility, plugins.

356
00:19:04,400 --> 00:19:08,300
If you move out to different cloud,
it's of course important

357
00:19:08,300 --> 00:19:11,980
to compare all the extensions and
capabilities, versions supported,

358
00:19:12,600 --> 00:19:17,260
how fast the service delivers minor
upgrades, this is big deal.

359
00:19:17,260 --> 00:19:18,280
This is big deal.

360
00:19:18,620 --> 00:19:23,640
If they lag, it's like some flag
for me, like we need to make

361
00:19:23,640 --> 00:19:28,380
sure if bugs happen in Postgres,
the fix will be delivered to my setup

362
00:19:28,380 --> 00:19:28,880
quickly.

363
00:19:29,440 --> 00:19:32,720
Also how my major upgrades are
done, right?

364
00:19:33,100 --> 00:19:36,680
What kind of control I have there. But if it's in your hands,
but if it's to your hands,

365
00:19:37,120 --> 00:19:39,960
well, it's easier because you can
take, it's open source.

366
00:19:39,960 --> 00:19:43,940
You can take any extension besides
extensions AWS, for example,

367
00:19:43,940 --> 00:19:48,060
created for RDS and didn't publish
to open source.

368
00:19:48,180 --> 00:19:52,900
They have it, like AWS Lambda,
extensions for plan control in

369
00:19:52,900 --> 00:19:57,680
Aurora, to freeze plans, to mitigate
plan flips, to avoid plan

370
00:19:57,680 --> 00:19:58,180
flips.

371
00:19:58,680 --> 00:20:02,920
These extensions, unfortunately,
are not available in open source,

372
00:20:02,920 --> 00:20:04,420
so you need to find a replacement.

373
00:20:05,140 --> 00:20:07,060
But this is rare, actually.

374
00:20:07,060 --> 00:20:08,460
So usually people...

375
00:20:08,480 --> 00:20:12,780
It's either observability-related
capabilities, you can find

376
00:20:12,780 --> 00:20:16,800
a replacement for it, or It's something
people usually rarely

377
00:20:16,800 --> 00:20:17,240
use.

378
00:20:17,240 --> 00:20:22,160
So most of the offering in RDS,
it's based on open source pieces.

379
00:20:22,500 --> 00:20:25,240
And you can bring even more extensions
if you're taking it into

380
00:20:25,240 --> 00:20:26,180
your hands, right?

381
00:20:26,520 --> 00:20:27,720
You can have- Make

382
00:20:27,720 --> 00:20:27,900
Michael: your own?

383
00:20:27,900 --> 00:20:31,200
Nikolay: Yeah, make your own extension,
easier, compile everything.

384
00:20:31,420 --> 00:20:33,640
Michael: So- P-Y-O-A, yeah.

385
00:20:33,920 --> 00:20:34,960
Nikolay: Yeah, yeah.

386
00:20:34,980 --> 00:20:38,620
So yeah, there was a mini conference
inside, a recent conference

387
00:20:38,620 --> 00:20:39,800
in Montreal, right?

388
00:20:39,800 --> 00:20:42,480
So it was organized by Yurii, right?

389
00:20:42,720 --> 00:20:44,980
It was Postgres Extensions Day or something.

390
00:20:45,520 --> 00:20:47,900
I know several people who presented
talks there.

391
00:20:47,980 --> 00:20:48,760
It was interesting.

392
00:20:49,400 --> 00:20:54,620
So, actually, a part of my lack
of love for extensions over the

393
00:20:54,620 --> 00:20:58,020
last 5 to 10 years is because of
RDS and these guys.

394
00:20:58,260 --> 00:21:01,420
Because I understand if I create
extension, it will take ages

395
00:21:01,920 --> 00:21:03,740
for them to take it, right?

396
00:21:04,160 --> 00:21:10,120
So if you move out, yeah, you have
control and can use more extensions.

397
00:21:11,540 --> 00:21:12,040
What?

398
00:21:12,900 --> 00:21:15,920
Michael: Yeah, so but now we're
talking about some, yeah, of

399
00:21:15,920 --> 00:21:18,440
course you need to do a bit of
research bit of prep as to where

400
00:21:18,440 --> 00:21:21,480
you're going but I'm assuming most
people that are thinking how

401
00:21:21,480 --> 00:21:25,420
do I move off have done that step.
I think the tricky part

402
00:21:25,520 --> 00:21:31,500
is often you don't you're not able
to for example set up logical

403
00:21:31,500 --> 00:21:35,860
replication to you know you an
easy an easy ish way to do this

404
00:21:35,860 --> 00:21:40,960
would be set up a logical replica
somewhere else, set up replication,

405
00:21:41,280 --> 00:21:46,100
decide a cutover point, pause writes,
cut over, send writes to

406
00:21:46,100 --> 00:21:50,800
the new primary, great, easy, done,
episode done. But I believe

407
00:21:50,800 --> 00:21:52,620
that's not possible in most cases.

408
00:21:53,300 --> 00:21:56,020
Nikolay: Well, it depends on the
scale.

409
00:21:57,060 --> 00:22:02,560
So before we go there, 1 more thing
related to extensions and

410
00:22:02,560 --> 00:22:03,060
compatibility.

411
00:22:03,740 --> 00:22:08,800
So if you want to think about how
not to lose anything, it's

412
00:22:08,800 --> 00:22:10,940
important to think about observability
bits.

413
00:22:11,740 --> 00:22:15,660
And in this case, pg_wait_sampling
and pg_stat_kcache and pg_stat_statements

414
00:22:16,120 --> 00:22:20,040
and auto_explain, they bring a lot
of observability bits.

415
00:22:20,500 --> 00:22:24,720
And there are several monitoring
solutions available that can

416
00:22:24,720 --> 00:22:26,140
be used on top of them.

417
00:22:26,200 --> 00:22:31,360
So the idea is everyone who works
with RDS, literally everyone,

418
00:22:31,360 --> 00:22:32,720
they use Performance Insights.

419
00:22:33,900 --> 00:22:37,460
And it's not a good idea to migrate
off and lose this bit.

420
00:22:38,100 --> 00:22:42,660
Fortunately, pg_wait_sampling, especially
with recent fixes, now,

421
00:22:42,660 --> 00:22:43,160
right?

422
00:22:43,320 --> 00:22:47,880
It's possible to use it and have
similar charts in your monitoring

423
00:22:48,720 --> 00:22:49,700
thanks to that.

424
00:22:49,800 --> 00:22:53,980
So it's not only about extensions
functionality, but also additional

425
00:22:54,060 --> 00:22:56,080
observability, so management extensions.

426
00:22:56,740 --> 00:23:00,960
As for cutover and so on, logical
replication is usually available

427
00:23:00,960 --> 00:23:02,680
everywhere, on RDS included.

428
00:23:03,740 --> 00:23:05,540
You can provision logical replicas.

429
00:23:06,020 --> 00:23:10,580
The tricky part is how to do it
if your database exceeds, say,

430
00:23:10,600 --> 00:23:12,440
a few terabytes or 10 terabytes.

431
00:23:12,540 --> 00:23:13,760
It's really not easy.

432
00:23:14,600 --> 00:23:17,400
Michael: As in that first initialization,
you just never catch

433
00:23:17,400 --> 00:23:17,900
up.

434
00:23:18,920 --> 00:23:19,580
Nikolay: Yeah, yeah.

435
00:23:19,980 --> 00:23:22,900
Well, yeah, never catch up.

436
00:23:23,040 --> 00:23:25,940
You can use multiple slots to catch
up, right?

437
00:23:26,460 --> 00:23:29,000
Michael: Yeah, but so, okay, but
you said it's not easy.

438
00:23:29,480 --> 00:23:33,600
Why, is it because of the time
between setting it up and getting

439
00:23:33,600 --> 00:23:34,340
it to work.

440
00:23:34,340 --> 00:23:36,880
Nikolay: To create a logical replica,
we need to do 2 things.

441
00:23:36,880 --> 00:23:41,140
First is initialization, and next
is switching to CDC and catching

442
00:23:41,140 --> 00:23:42,380
up inside the CDC.

443
00:23:42,700 --> 00:23:46,800
I don't think it's a huge deal
in terms of CDC to catch up if

444
00:23:46,800 --> 00:23:48,060
you use multiple slots.

445
00:23:48,340 --> 00:23:52,460
Usually with 1 slot you can handle
workloads like 1000 tuple

446
00:23:52,540 --> 00:23:53,420
writes per second.

447
00:23:53,420 --> 00:23:59,340
If you check pg_stat_user_tables, n_tup_del,
n_tup_upd, and n_tup_ins,

448
00:24:00,240 --> 00:24:03,400
These 3 numbers, you check how
many of them you can have per

449
00:24:03,400 --> 00:24:03,900
second.

450
00:24:03,900 --> 00:24:08,760
If it's on modern hardware like
Intel, AMD as well, ARM maybe,

451
00:24:08,760 --> 00:24:09,860
but maybe less.

452
00:24:09,860 --> 00:24:11,880
You need to go down with this threshold.

453
00:24:12,660 --> 00:24:15,980
Thousand tuple writes per second,
of course, it's a relative

454
00:24:16,120 --> 00:24:19,900
number because it depends on the
content of those tuples as well.

455
00:24:20,030 --> 00:24:27,040
You can have a thousand tuples
per second, you should catch up

456
00:24:27,180 --> 00:24:29,660
well just with a single slot.

457
00:24:30,060 --> 00:24:32,280
If you have more, just use multiple
slots.
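
The rule of thumb above can be sketched roughly as follows. This is a minimal illustration, assuming you have taken two snapshots of the summed `n_tup_ins`, `n_tup_upd`, and `n_tup_del` counters some seconds apart; the function names and the 1000-writes-per-slot default are illustrative and should be tuned to your hardware and tuple sizes.

```python
import math

def tuple_writes_per_second(before, after, seconds):
    """Tuple write rate between two snapshots of summed per-table counters.

    `before` and `after` are dicts with n_tup_ins, n_tup_upd and n_tup_del
    (hypothetical inputs, e.g. summed over all user tables).
    """
    keys = ("n_tup_ins", "n_tup_upd", "n_tup_del")
    delta = sum(after[k] - before[k] for k in keys)
    return delta / seconds

def slots_needed(rate, per_slot_capacity=1000):
    """Rough slot count, assuming one slot keeps up with ~1000 writes/s."""
    return max(1, math.ceil(rate / per_slot_capacity))

if __name__ == "__main__":
    before = {"n_tup_ins": 0, "n_tup_upd": 0, "n_tup_del": 0}
    after = {"n_tup_ins": 150_000, "n_tup_upd": 60_000, "n_tup_del": 30_000}
    rate = tuple_writes_per_second(before, after, seconds=60)
    print(rate, slots_needed(rate))
```

The threshold is relative, as noted in the conversation, so treat the result as a starting point for testing rather than a guarantee.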

458
00:24:32,280 --> 00:24:35,840
The only problem is that when you
use a logical replication using

459
00:24:35,840 --> 00:24:38,660
multiple slots, foreign key violation
happens temporarily.

460
00:24:38,860 --> 00:24:42,380
It's eventually consistent on the
subscriber, right?

461
00:24:42,520 --> 00:24:45,560
So you cannot use it while it's
a replica.

462
00:24:46,220 --> 00:24:47,900
But so this is not a problem.

463
00:24:47,900 --> 00:24:51,840
Problem is this initialization
because standard approach.

464
00:24:52,060 --> 00:24:53,420
So there are several tricks here.

465
00:24:53,420 --> 00:24:55,620
First, there is a binary mode.

466
00:24:56,780 --> 00:25:00,320
There is a traditional way to create
a replica with copy data

467
00:25:00,320 --> 00:25:03,060
set to true, but also you can set
binary to true.

468
00:25:03,120 --> 00:25:05,900
I think it's supported since Postgres
15 or 16.

469
00:25:06,780 --> 00:25:12,900
In this case, it gets data in binary
form and it's kind of faster.
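
As a sketch of the subscription options mentioned here, the helper below only builds the `CREATE SUBSCRIPTION` statement as a string; it does not connect to anything. The subscription, publication, and connection-string values are placeholders, and which PostgreSQL versions use binary format for the initial copy should be checked against the documentation for your version.

```python
def create_subscription_sql(name, conninfo, publication,
                            copy_data=True, binary=True):
    """Build a CREATE SUBSCRIPTION statement with copy_data and binary options."""
    def literal(value):
        # Quote a SQL string literal by doubling single quotes.
        return "'" + value.replace("'", "''") + "'"

    def identifier(value):
        # Quote a SQL identifier by doubling double quotes.
        return '"' + value.replace('"', '""') + '"'

    options = "copy_data = {}, binary = {}".format(
        str(copy_data).lower(), str(binary).lower()
    )
    return (
        f"CREATE SUBSCRIPTION {identifier(name)} "
        f"CONNECTION {literal(conninfo)} "
        f"PUBLICATION {identifier(publication)} "
        f"WITH ({options});"
    )

if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    print(create_subscription_sql("migrate_sub",
                                  "host=old-primary dbname=app",
                                  "migrate_pub"))
```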

470
00:25:13,260 --> 00:25:16,880
It's not the same as to create
physical replica and convert it

471
00:25:16,880 --> 00:25:20,100
to logical replica, which became,
by the way, official in Postgres 17.

472
00:25:20,280 --> 00:25:20,900
I missed this.

473
00:25:20,900 --> 00:25:21,900
There's a new

474
00:25:21,900 --> 00:25:22,620
Michael: CLI tool.

475
00:25:22,660 --> 00:25:25,820
Nikolay: There's a new CLI tool,
but it will be only available

476
00:25:26,140 --> 00:25:28,020
Postgres 17, so in future.

477
00:25:28,380 --> 00:25:31,400
Now, usually we have an older version
on production.

478
00:25:31,460 --> 00:25:36,300
Although, if you can upgrade to
17 first, then this tool is

479
00:25:36,300 --> 00:25:36,760
available.

480
00:25:36,760 --> 00:25:40,460
But these guys don't provide the
physical backups besides CrunchyBridge.

481
00:25:41,260 --> 00:25:43,420
Michael: I was going to say, so
you wouldn't be able to use this

482
00:25:43,420 --> 00:25:45,260
to migrate off RDS.

483
00:25:45,480 --> 00:25:49,780
Nikolay: But this flag, what I'm
talking about, like flag binary

484
00:25:49,780 --> 00:25:53,040
set to true, this is for regular
logical replica provisioning.

485
00:25:53,160 --> 00:25:55,020
It's not physical to logical conversion.

486
00:25:55,020 --> 00:25:55,760
It's something else.

487
00:25:55,760 --> 00:26:00,560
Like, and it, like it should speed
up, should speed things up.

488
00:26:01,160 --> 00:26:04,700
But additionally, you can speed
up things to implement it manually.

489
00:26:05,340 --> 00:26:11,080
So if you just create slot, open
transaction, export snapshot,

490
00:26:11,680 --> 00:26:17,640
and then dump everything in many
workers, actually, pg_dump supports

491
00:26:17,640 --> 00:26:18,060
it.

492
00:26:18,060 --> 00:26:22,620
In pg_dump, you can specify -j,
the number of jobs.

493
00:26:22,740 --> 00:26:28,260
And also you can specify exact
snapshot to work with.

494
00:26:28,260 --> 00:26:31,200
If that transaction is still open
and this snapshot is being

495
00:26:31,200 --> 00:26:36,100
held in repeatable read transaction
isolation mode, a snapshot

496
00:26:36,100 --> 00:26:36,600
isolation.

497
00:26:36,740 --> 00:26:38,000
So it holds snapshot.

498
00:26:38,300 --> 00:26:42,100
Then you can use multiple pg_dumps
or multiple pg_dump workers

499
00:26:42,160 --> 00:26:43,400
to export data faster.
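
A rough sketch of the pg_dump side of this: the helper below assembles a command line using the `--jobs` and `--snapshot` options with the directory output format, which parallel dumps require. The snapshot name is assumed to come from the still-open transaction that exported it when the slot was created; the value shown, the database name, and the output path are placeholders.

```python
import shlex

def pg_dump_command(dbname, snapshot, jobs, out_dir):
    """Build a parallel pg_dump command pinned to an exported snapshot."""
    return [
        "pg_dump",
        "--format=directory",      # parallel dump needs the directory format
        f"--jobs={jobs}",          # number of worker processes
        f"--snapshot={snapshot}",  # snapshot exported by the open transaction
        f"--file={out_dir}",
        dbname,
    ]

if __name__ == "__main__":
    # Placeholder values, for illustration only.
    cmd = pg_dump_command("app", "00000003-0000001B-1", 8, "/tmp/app_dump")
    print(shlex.join(cmd))
```

The transaction that exported the snapshot has to stay open until pg_dump has started using it, as described above.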

500
00:26:43,700 --> 00:26:44,760
There is even more.

501
00:26:45,320 --> 00:26:46,640
Even more can be done here.

502
00:26:46,640 --> 00:26:50,460
If you have huge tables, a single
worker will work alone.

503
00:26:50,680 --> 00:26:53,680
It will be a single-threaded, single
worker processing a huge

504
00:26:53,680 --> 00:26:56,620
table, like a billion rows, for
example.

505
00:26:57,740 --> 00:27:01,560
And in this case, it's good to
think about custom dumping and

506
00:27:01,560 --> 00:27:05,320
restoring using multiple workers
and using like logical partitioning

507
00:27:05,460 --> 00:27:11,180
like you have ID ranges for example
if it's integer big int or

508
00:27:11,320 --> 00:27:16,560
UUIDv7 or actually if it's
even if it's UUIDv4

509
00:27:16,560 --> 00:27:20,360
you can have ranges because it's
like randomly distributed

510
00:27:20,400 --> 00:27:20,900
like

511
00:27:21,300 --> 00:27:25,680
Michael: so no longer using pg_dump
but like copying to CSV or

512
00:27:25,680 --> 00:27:26,180
something.

513
00:27:26,180 --> 00:27:27,240
Nikolay: Yeah, yeah.

514
00:27:27,740 --> 00:27:28,500
Yeah, yeah.

515
00:27:28,500 --> 00:27:29,800
So something like this.

516
00:27:29,800 --> 00:27:34,740
And you can specify ranges and
export data in huge table using

517
00:27:34,740 --> 00:27:35,720
multiple workers.
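
The range-based export could look roughly like this. It is a sketch for an integer or bigint key only: it splits the ID span into contiguous chunks and produces one `COPY` statement per chunk for separate workers to run. The table and column names are hypothetical, and each worker would still need to run against the same exported snapshot for the chunks to be mutually consistent.

```python
def id_ranges(min_id, max_id, workers):
    """Split [min_id, max_id] into up to `workers` half-open ranges [lo, hi)."""
    total = max_id - min_id + 1
    step = -(-total // workers)  # ceiling division
    ranges = []
    lo = min_id
    while lo <= max_id:
        hi = min(lo + step, max_id + 1)
        ranges.append((lo, hi))
        lo = hi
    return ranges

def copy_statements(table, id_column, ranges):
    """One COPY ... TO STDOUT statement per ID range."""
    return [
        f"COPY (SELECT * FROM {table} "
        f"WHERE {id_column} >= {lo} AND {id_column} < {hi}) TO STDOUT"
        for lo, hi in ranges
    ]

if __name__ == "__main__":
    # Hypothetical table with IDs from 1 to 1 billion, 16 workers.
    for stmt in copy_statements("events", "id", id_ranges(1, 1_000_000_000, 16)):
        print(stmt)
```

Equal-width ranges only balance well when IDs are dense; gaps in the key space would make some chunks much smaller than others.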

518
00:27:38,800 --> 00:27:40,740
This is to speed up the process.

519
00:27:41,680 --> 00:27:44,620
And the more we speed up, the less
we need to catch up.

520
00:27:46,060 --> 00:27:46,560
Obviously.

521
00:27:47,300 --> 00:27:50,760
But the problem with speeding up
too much is if you do it from

522
00:27:50,760 --> 00:27:55,200
the primary, and until only recently,
16 or 17, it was when it

523
00:27:55,200 --> 00:27:59,680
became possible to create logical
replication from physical standbys.

524
00:28:00,260 --> 00:28:02,660
Before that, you need to deal with
primary.

525
00:28:03,120 --> 00:28:05,040
And this is not fun.

526
00:28:05,800 --> 00:28:07,940
Michael: But also, I don't understand
how you...

527
00:28:08,000 --> 00:28:12,340
Can you on RDS set up logical replicas
from a standby?

528
00:28:12,380 --> 00:28:15,200
Nikolay: Never tried, but it should
work in 17, right?

529
00:28:15,580 --> 00:28:18,380
Michael: Oh, I mean, the features,
they're in Postgres.

530
00:28:18,380 --> 00:28:20,280
I just didn't know it was available
in RDS.

531
00:28:20,280 --> 00:28:21,520
Nikolay: I also don't know.

532
00:28:22,420 --> 00:28:28,460
But actually, you can have, you
can think about having, if, ah,

533
00:28:28,660 --> 00:28:30,160
so it's interesting, right?

534
00:28:30,480 --> 00:28:35,280
If we create a slot on the primary
and we have physical replica.

535
00:28:35,940 --> 00:28:38,860
Theoretically, we could have replica
lagging a little bit.

536
00:28:38,860 --> 00:28:43,340
We create a slot and then we use
recovery_target_lsn to catch

537
00:28:43,340 --> 00:28:49,200
up and synchronize with slot position
using physical replication.

538
00:28:49,740 --> 00:28:54,380
And then we pause it again, and
we can grab data from physical

539
00:28:54,760 --> 00:28:55,260
replica.

540
00:28:55,740 --> 00:28:59,120
And we know it will correspond
exactly to the slot position,

541
00:28:59,340 --> 00:29:01,180
which is waiting for us, right?

542
00:29:01,680 --> 00:29:05,320
But the problem is RDS doesn't
support recovery_target_lsn, you

543
00:29:05,320 --> 00:29:06,360
cannot change it.

544
00:29:06,580 --> 00:29:09,860
So my question is, can we pause
replica and instead of recovery

545
00:29:09,860 --> 00:29:13,760
target_lsn, advance slot, there
is a function,

546
00:29:13,820 --> 00:29:16,940
pg_replication_slot_advance or something like
this, you can, knowing position

547
00:29:17,040 --> 00:29:20,240
of your physical standby, you could
advance the slot.

548
00:29:20,500 --> 00:29:21,800
So they again synchronized.

549
00:29:22,280 --> 00:29:25,460
And then you can grab your data
from physical standby and you

550
00:29:25,460 --> 00:29:28,120
know this data will correspond
to the slot position.
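
To make the idea concrete, here is a small sketch that compares two LSNs and, if the paused standby is ahead of the slot, builds a call to `pg_replication_slot_advance`. It only produces SQL text; whether this procedure actually yields a consistent starting point on RDS is untested, as said in the conversation. The slot name and LSN values are placeholders.

```python
def parse_lsn(lsn):
    """Convert an LSN like '16/B374D848' into a 64-bit integer."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)

def advance_slot_sql(slot_name, slot_lsn, standby_replay_lsn):
    """SQL to advance the slot to the standby position, or None if not needed.

    Slots can only move forward, so nothing is returned when the standby
    is at or behind the slot position.
    """
    if parse_lsn(standby_replay_lsn) <= parse_lsn(slot_lsn):
        return None
    return (
        f"SELECT pg_replication_slot_advance('{slot_name}', "
        f"'{standby_replay_lsn}');"
    )

if __name__ == "__main__":
    # Placeholder values, for illustration only.
    print(advance_slot_sql("migration_slot", "16/B3000000", "16/B374D848"))
```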

551
00:29:29,560 --> 00:29:34,780
Michael: So, but in reality, how
are you seeing people, when

552
00:29:34,780 --> 00:29:38,920
you do a project to move people
off, what are you actually ending

553
00:29:38,920 --> 00:29:39,820
up doing normally?

554
00:29:40,160 --> 00:29:43,820
Nikolay: Well, recently we decided
to do it with downtime, honestly,

555
00:29:44,040 --> 00:29:48,960
because this project could allocate
maintenance window 3 hours,

556
00:29:48,960 --> 00:29:49,620
no problem.

557
00:29:49,840 --> 00:29:52,000
But the scale was, I think, a couple
of terabytes.

558
00:29:53,560 --> 00:29:54,580
It was not, yeah.

559
00:29:55,240 --> 00:29:58,240
But at this scale, actually, I
would probably use traditional

560
00:29:58,260 --> 00:29:59,700
logical replication provisioning.

561
00:30:00,060 --> 00:30:02,960
But what I'm describing right now
here, it's interesting because

562
00:30:02,960 --> 00:30:07,940
again, we want to move as fast
as possible with initialization

563
00:30:08,720 --> 00:30:10,740
to catch up less later.

564
00:30:11,120 --> 00:30:15,240
But we also don't want to overload
the primary if it's still

565
00:30:15,240 --> 00:30:16,020
being used.

566
00:30:16,400 --> 00:30:19,900
So this trade-off is like competing
reasons, right?

567
00:30:19,900 --> 00:30:23,520
That's why it's a good idea to
move this disk I/O or reading disk

568
00:30:23,520 --> 00:30:26,060
I/O off the primary to a standby.

569
00:30:26,480 --> 00:30:29,320
This is what I just tried to elaborate.

570
00:30:29,440 --> 00:30:31,400
Honestly, I never tried on RDS.

571
00:30:31,740 --> 00:30:32,460
I never tried.

572
00:30:32,460 --> 00:30:34,700
I only tried from self-managed.

573
00:30:35,080 --> 00:30:37,360
Michael: That's why I wanted to
bring us back a little bit to

574
00:30:37,360 --> 00:30:41,140
how to move, because it feels like,
even though these things

575
00:30:41,140 --> 00:30:44,620
might be possible, they start to
get quite complicated.

576
00:30:44,860 --> 00:30:50,200
And sometimes complexity in these
situations is scary, it's hard

577
00:30:50,200 --> 00:30:54,560
to test, it's hard to be sure it's
going to work well, and often

578
00:30:55,600 --> 00:30:58,000
because it's a project that won't
happen, hopefully you won't

579
00:30:58,000 --> 00:31:02,120
be migrating every year or 2, so
hopefully it's a fairly infrequent

580
00:31:02,220 --> 00:31:03,620
task.

581
00:31:04,120 --> 00:31:08,960
I think sometimes people can arrange
for a couple of hours of

582
00:31:08,960 --> 00:31:12,720
downtime at really, you know, over
a weekend announce it really

583
00:31:12,720 --> 00:31:13,580
far in advance.

584
00:31:13,740 --> 00:31:18,140
Let people like obviously some
some can't but I think plenty

585
00:31:18,140 --> 00:31:20,560
can and then the complexity just
drops.

586
00:31:20,640 --> 00:31:23,940
Nikolay: So far there are tools
that are open source tools and

587
00:31:23,940 --> 00:31:27,260
there are proprietary tools additionally,
which help with logical

588
00:31:27,260 --> 00:31:27,760
replication.

589
00:31:28,100 --> 00:31:28,780
For example-

590
00:31:28,860 --> 00:31:30,360
Michael: Like CDC type.

591
00:31:30,540 --> 00:31:34,780
Nikolay: Yeah, so DMS from AWS,
and Google also supports their

592
00:31:34,780 --> 00:31:35,720
own tooling.

593
00:31:36,040 --> 00:31:38,040
And there are third party tools
like Qlik.

594
00:31:38,040 --> 00:31:39,300
I wouldn't recommend Qlik.

595
00:31:39,380 --> 00:31:41,260
They suck at Postgres for sure.

596
00:31:41,360 --> 00:31:42,440
But there's Fivetran.

597
00:31:42,440 --> 00:31:44,600
Fivetran is good proprietary tool.

598
00:31:45,480 --> 00:31:51,080
Like they promise to work very
reliably in very big databases,

599
00:31:51,140 --> 00:31:55,740
so you can use them to migrate
out and then that's it.

600
00:31:55,960 --> 00:31:59,540
Or there are open source tools
like there is pgcopydb, which

601
00:31:59,540 --> 00:32:01,800
I think works fully at logical
level, right?

602
00:32:01,800 --> 00:32:07,000
So it should be compatible with
RDS, by Dimitri Fontaine, right?

603
00:32:07,080 --> 00:32:09,980
Again, I only know it, I never
use it myself.

604
00:32:10,760 --> 00:32:13,580
There's something from Xata, I
think?

605
00:32:14,320 --> 00:32:14,820
No?

606
00:32:15,060 --> 00:32:15,820
Should be.

607
00:32:16,640 --> 00:32:17,580
Michael: I don't know.

608
00:32:17,780 --> 00:32:20,140
Nikolay: The problem is support
of DDL, because the problem with

609
00:32:20,140 --> 00:32:21,780
logical is DDL always, right?

610
00:32:21,780 --> 00:32:26,040
If you need to create DDL often,
sometimes it's a part of user

611
00:32:26,040 --> 00:32:27,940
activity, DDL creation.

612
00:32:28,380 --> 00:32:32,460
And this is not a good position
to be in, because DDL is not

613
00:32:32,460 --> 00:32:34,640
propagated with logical replication
and it's...

614
00:32:34,640 --> 00:32:35,140
Yeah.

615
00:32:36,220 --> 00:32:37,320
If it's like...

616
00:32:37,440 --> 00:32:40,100
If you can pause it, it's great,
but if you cannot...

617
00:32:40,940 --> 00:32:41,780
Yeah, so...

618
00:32:42,700 --> 00:32:44,200
Michael: Another reason to keep
the window closed.

619
00:32:44,200 --> 00:32:45,360
Nikolay: Yeah, I agree with you.

620
00:32:45,560 --> 00:32:50,460
Complexity can grow, but it's already
not rocket science.

621
00:32:51,160 --> 00:32:52,220
Michael: Yeah, that's fair.

622
00:32:52,500 --> 00:32:56,120
Nikolay: And you can evaluate this
complexity and how much you

623
00:32:56,120 --> 00:33:00,660
need to engineer, compare, compare
with your savings.

624
00:33:02,100 --> 00:33:05,940
Michael: Yeah, it's a good point
though because if we're talking

625
00:33:06,080 --> 00:33:09,900
like projects that are already
spending more like a few thousand

626
00:33:09,900 --> 00:33:13,860
a month or more, they're gonna
be of a certain scale,

627
00:33:13,860 --> 00:33:16,880
so there is gonna be that you I
think you mentioned a couple

628
00:33:16,880 --> 00:33:20,920
of terabytes and it was, say, about
a couple of hours of downtime

629
00:33:20,920 --> 00:33:21,500
or something.

630
00:33:21,500 --> 00:33:23,900
Nikolay: Sometimes people spend
millions and we know there are

631
00:33:23,940 --> 00:33:28,580
cases of quite good, in terms of
scale, bigger companies who

632
00:33:28,580 --> 00:33:33,640
migrated out of cloud even, you
know, like posts by DHH and 37signals.

633
00:33:34,980 --> 00:33:38,660
So it was I think a couple of years
ago already right?

634
00:33:39,020 --> 00:33:42,940
Michael: Oh yeah all I mean is
the bigger you are the more that

635
00:33:42,940 --> 00:33:47,040
complexity makes sense because
you well unless I'm mistaken I

636
00:33:47,040 --> 00:33:49,680
was always thinking the
downtime you'd need to take

637
00:33:49,680 --> 00:33:54,400
would be somewhat proportional
to the amount of data because

638
00:33:54,400 --> 00:33:58,620
of the dump restore time, so it's
like Yeah, if you've got a

639
00:33:58,620 --> 00:34:02,320
small database the amount of downtime
you would need to cut over

640
00:34:02,320 --> 00:34:03,740
would be much lower.

641
00:34:04,300 --> 00:34:05,580
Nikolay: Yeah, yeah, that's

642
00:34:05,580 --> 00:34:06,080
Michael: fair.

643
00:34:06,580 --> 00:34:08,900
But you're not gonna have a small
database and be moving for

644
00:34:08,900 --> 00:34:09,660
cost reasons.

645
00:34:09,880 --> 00:34:11,460
That doesn't make sense to me.

646
00:34:11,740 --> 00:34:14,280
Nikolay: By the way, there is a
good approach.

647
00:34:14,340 --> 00:34:20,140
So we don't want to have long-lasting
initialization just because

648
00:34:20,200 --> 00:34:21,040
we lag a lot.

649
00:34:21,040 --> 00:34:24,480
We just don't want to risk the
health of the primary in case

650
00:34:24,480 --> 00:34:25,940
if we cancel this migration.

651
00:34:26,080 --> 00:34:28,680
Because if we cancel, the health of the...

652
00:34:28,680 --> 00:34:31,640
It will be disturbed because of accumulation of dead tuples and

653
00:34:31,640 --> 00:34:33,180
eventually bloat, right?

654
00:34:33,940 --> 00:34:37,480
And there is an approach, which I think some tools I just mentioned

655
00:34:37,480 --> 00:34:41,260
implemented: you start consuming from CDC immediately when

656
00:34:41,260 --> 00:34:44,980
slot is created and put it into some intermediate place like

657
00:34:44,980 --> 00:34:47,420
Kafka or something, or object storage.

658
00:34:47,920 --> 00:34:54,220
In this case, the xmin horizon is already advancing, right?

659
00:34:55,360 --> 00:34:55,860
Yeah.

660
00:34:55,960 --> 00:34:57,680
So you're already using this slot.

661
00:34:57,940 --> 00:35:01,640
You're just using it not by the final user Postgres, but some

662
00:35:01,640 --> 00:35:04,600
intermediate user who will promise to deliver all the changes

663
00:35:04,600 --> 00:35:07,980
later when final destination will be ready.

664
00:35:08,680 --> 00:35:12,520
So this is also this also like a reasonable approach, but it

665
00:35:12,520 --> 00:35:15,440
adds a little even more complexity because now you need to manage

666
00:35:15,440 --> 00:35:17,720
Kafka and this is a whole other story.

667
00:35:17,720 --> 00:35:18,900
Or something else.

668
00:35:19,200 --> 00:35:19,700
Files.

669
00:35:19,900 --> 00:35:21,260
Files on object storage.

670
00:35:21,340 --> 00:35:22,160
It's like.

671
00:35:22,200 --> 00:35:24,640
Michael: Well, like, is that what Debezium is used for?

672
00:35:24,780 --> 00:35:28,020
Nikolay: Yeah, well, but I stopped hearing from Debezium for

673
00:35:28,020 --> 00:35:28,860
quite long.

674
00:35:28,860 --> 00:35:30,120
I don't know what's happening.

675
00:35:30,120 --> 00:35:32,980
If you know, like, who's listening, if you know, like, can you

676
00:35:32,980 --> 00:35:34,620
leave some comments somewhere?

677
00:35:34,940 --> 00:35:36,920
I'm curious what's happening with this project.

678
00:35:38,100 --> 00:35:40,680
Michael: So, but you raised a good point about testing, like

679
00:35:40,680 --> 00:35:44,800
what if you need to go back, or I mean, I think there's a valid

680
00:35:44,800 --> 00:35:47,720
question around how do you even test this kind of thing?

681
00:35:47,720 --> 00:35:49,260
How do you do a dry run?

682
00:35:49,280 --> 00:35:51,580
Nikolay: So the trickiest part is switchover.

683
00:35:52,060 --> 00:35:56,380
Because it's hard to undo this, because it's already like jump.

684
00:35:56,760 --> 00:36:00,540
But provisioning of logical replica, it can be tested in production.

685
00:36:00,660 --> 00:36:03,620
You start from non-production, then you go to production and

686
00:36:03,640 --> 00:36:07,600
you are very careful, 2 big risks with logical is to be out of

687
00:36:07,600 --> 00:36:10,760
disk space, and we have a new setting, I keep forgetting the

688
00:36:10,760 --> 00:36:15,420
name of new setting that can mitigate, like so maximum size of

689
00:36:15,420 --> 00:36:19,200
the lag, you can control it and say, better kill my slot if you

690
00:36:19,200 --> 00:36:21,700
achieve this threshold.

691
00:36:22,120 --> 00:36:26,620
So you can set, I don't know, like 10 or like 100 gigabytes there

692
00:36:26,920 --> 00:36:27,900
to avoid risks.
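
The setting being described here appears to be `max_slot_wal_keep_size` (available since PostgreSQL 13), which caps how much WAL a slot may retain before it is invalidated. Independently of that, the monitoring side could be sketched as below: a query that reports retained WAL per slot, plus a small decision helper. The warning and drop thresholds are illustrative values, not recommendations.

```python
GIB = 1024 ** 3

# Query to run on the source: WAL retained by each replication slot, in bytes.
SLOT_LAG_SQL = """
SELECT slot_name,
       pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) AS retained_wal_bytes
FROM pg_replication_slots;
"""

def slot_action(retained_wal_bytes, warn_gib=50, drop_gib=100):
    """Decide what to do with a slot based on how much WAL it holds back."""
    if retained_wal_bytes >= drop_gib * GIB:
        return "drop"   # give up on this attempt and protect the primary
    if retained_wal_bytes >= warn_gib * GIB:
        return "warn"
    return "ok"

if __name__ == "__main__":
    for size in (10 * GIB, 60 * GIB, 120 * GIB):
        print(size // GIB, "GiB ->", slot_action(size))
```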

693
00:36:27,940 --> 00:36:32,700
And second danger is to affect health, because xmin horizon is

694
00:36:32,700 --> 00:36:33,880
not advancing.

695
00:36:34,000 --> 00:36:37,840
So this you can just monitor and then define some threshold when

696
00:36:37,840 --> 00:36:43,940
you say stop, we are killing the slot, and let it go. But besides

697
00:36:43,940 --> 00:36:48,480
these 2 risks users won't notice anything well disk I/O as well

698
00:36:48,480 --> 00:36:52,440
If you provision right from the primary, disk I/O can be significant

699
00:36:52,440 --> 00:36:54,040
if you use multiple workers.

700
00:36:54,720 --> 00:36:56,880
So it's better to control this.

701
00:36:57,520 --> 00:36:58,020
Michael: Yeah.

702
00:36:58,080 --> 00:37:01,820
You also can't be doing migrations
during that time.

703
00:37:01,820 --> 00:37:03,360
Like you can't be doing DDL.

704
00:37:03,540 --> 00:37:04,600
Nikolay: You can, you can.

705
00:37:04,600 --> 00:37:05,140
If it's a

706
00:37:05,140 --> 00:37:05,880
Michael: good point,

707
00:37:06,180 --> 00:37:10,020
Nikolay: you yeah, they will just
put your replication on pause.

708
00:37:10,080 --> 00:37:13,860
And if it's not a lot, you can
just if it's a test, what I did,

709
00:37:13,860 --> 00:37:18,840
I just checked in the logs, I see,
I see DDL, it's good to have

710
00:37:18,840 --> 00:37:22,260
a log_statement DDL, it's good
advice anyway for any setup, just

711
00:37:22,260 --> 00:37:23,940
to control all the changes.
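
With `log_statement = 'ddl'` set, spotting schema changes in the server log can be sketched like this. The parser assumes the default-style `LOG:  statement: ...` message text; the exact prefix depends on your `log_line_prefix`, and the sample lines are made up for illustration.

```python
import re

# Matches statement log entries that start with a common DDL keyword.
DDL_RE = re.compile(
    r"LOG:\s+statement:\s+((?:CREATE|ALTER|DROP)\b.*)", re.IGNORECASE
)

def ddl_statements(log_lines):
    """Return the DDL statements found in an iterable of log lines."""
    found = []
    for line in log_lines:
        match = DDL_RE.search(line)
        if match:
            found.append(match.group(1).strip())
    return found

if __name__ == "__main__":
    sample = [
        "2024-01-01 00:00:00 UTC [123] LOG:  statement: ALTER TABLE t ADD COLUMN c int;",
        "2024-01-01 00:00:05 UTC [99] LOG:  checkpoint starting: time",
    ]
    for stmt in ddl_statements(sample):
        print(stmt)
```

Each statement found this way would then have to be applied manually on the subscriber before replication can continue.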

712
00:37:23,940 --> 00:37:27,140
And you see, this is when DDL happened,
and now our replication

713
00:37:27,160 --> 00:37:31,160
gets stuck because of this DDL,
okay, we have up to 1 minute

714
00:37:31,720 --> 00:37:33,120
when I manually propagate.

715
00:37:33,120 --> 00:37:34,260
It's a test, right?

716
00:37:34,540 --> 00:37:39,600
But a good test is a production
test, because it's really hard.

717
00:37:39,600 --> 00:37:42,940
Well, I recommend starting from
non-production tests, but eventually,

718
00:37:43,120 --> 00:37:46,640
due to complexity, production tests
here I would definitely recommend

719
00:37:46,640 --> 00:37:47,720
having as well.

720
00:37:48,480 --> 00:37:50,740
Under control, risk is not high.

721
00:37:53,100 --> 00:37:57,040
And if you see like the lag, okay,
propagate it.

722
00:37:57,040 --> 00:38:01,480
If the DDL happens, new partition
is created every hour or what,

723
00:38:01,480 --> 00:38:01,980
right?

724
00:38:02,360 --> 00:38:06,420
So you can just propagate it manually
and understand this spike

725
00:38:06,420 --> 00:38:06,980
was there.

726
00:38:06,980 --> 00:38:07,540
That's it.

727
00:38:07,540 --> 00:38:08,480
So it doesn't affect.

728
00:38:08,480 --> 00:38:11,000
But good thing is that users don't
notice, right?

729
00:38:11,000 --> 00:38:17,980
So it gives you benefit of performing
tests inside, like in the

730
00:38:17,980 --> 00:38:18,840
field, right?

731
00:38:19,080 --> 00:38:20,720
Where the real battle happens.

732
00:38:21,340 --> 00:38:23,640
Unlike switchover, switchover is
different.

733
00:38:24,960 --> 00:38:29,700
Unless you have switchover backed
by PgBouncer, suppose you,

734
00:38:30,300 --> 00:38:32,800
It won't be fully 0 downtime.

735
00:38:32,900 --> 00:38:36,000
You will need to accept like up
to 1 minute loss, for example.

736
00:38:36,620 --> 00:38:39,140
Loss of availability, not of data.

737
00:38:39,140 --> 00:38:41,540
Data should not be lost, any of
data.

738
00:38:41,760 --> 00:38:43,000
It's to switch, right?

739
00:38:43,500 --> 00:38:46,440
And PgBouncer, if you want to
pause/resume, I guess it should

740
00:38:46,440 --> 00:38:52,380
be installed on separate nodes
in Kubernetes or somehow and under

741
00:38:52,380 --> 00:38:53,080
your control.

742
00:38:53,720 --> 00:38:57,040
Because if it's PgBouncer provided
by that service, you need

743
00:38:57,040 --> 00:38:58,980
to switch out of that service,
right?

744
00:38:59,480 --> 00:39:03,180
So it's quite rare when people
control their own PgBouncer.

745
00:39:04,120 --> 00:39:04,940
But it happens.

746
00:39:05,080 --> 00:39:08,500
I mean, in the case of users of
managed Postgres.

747
00:39:08,560 --> 00:39:09,440
But it happens.

748
00:39:09,780 --> 00:39:11,400
It happened recently with us.

749
00:39:12,260 --> 00:39:12,760
People-

750
00:39:13,260 --> 00:39:16,420
Michael: So might that be a sensible
initial thing to consider

751
00:39:16,420 --> 00:39:16,920
migrating?

752
00:39:18,620 --> 00:39:20,760
Like run your own proxy?

753
00:39:20,900 --> 00:39:22,460
Or migrate to running your own?

754
00:39:22,460 --> 00:39:22,960
Nikolay: Oh yeah.

755
00:39:22,960 --> 00:39:24,820
Well, and you can start from it,
actually.

756
00:39:24,820 --> 00:39:25,960
This is what happened recently.

757
00:39:25,960 --> 00:39:29,360
It was a managed service and PgBouncer;
that managed service

758
00:39:29,360 --> 00:39:32,920
provided PgBouncer but they didn't
provide important bits of

759
00:39:32,920 --> 00:39:34,140
it, some control.

760
00:39:35,740 --> 00:39:38,800
I think pause/resume was either
not supported...

761
00:39:38,800 --> 00:39:41,200
I don't remember the details, but they
definitely did not support

762
00:39:41,200 --> 00:39:42,780
all the details about monitoring.

763
00:39:43,280 --> 00:39:47,340
So you could not export these bits
to some custom monitoring solution.

764
00:39:47,980 --> 00:39:51,420
So the first thing was to migrate those
PgBouncers, get control over

765
00:39:51,420 --> 00:39:51,660
them.

766
00:39:51,660 --> 00:39:54,560
Then you can have pure pause/resume.

767
00:39:56,540 --> 00:39:57,360
And it's good.

768
00:39:57,500 --> 00:40:00,540
Oh, important thing I forgot I
also wanted to mention.

769
00:40:00,860 --> 00:40:05,660
When you do all this, migrate out,
you need to first think, you

770
00:40:05,660 --> 00:40:06,660
need to plan.

771
00:40:06,960 --> 00:40:10,340
And inside planning, it's super
important to understand the topology

772
00:40:10,380 --> 00:40:13,220
and round-trip times between the
nodes.

773
00:40:14,280 --> 00:40:16,420
The distance: is it, like, the same
region?

774
00:40:17,880 --> 00:40:22,640
If it's, for example, you're migrating
out of RDS, but you stay

775
00:40:22,640 --> 00:40:25,300
inside AWS, you can have the same
region.

776
00:40:25,460 --> 00:40:30,320
It's good, because where will be
your users, I mean application

777
00:40:30,440 --> 00:40:30,940
nodes?

778
00:40:31,780 --> 00:40:33,480
If they are far, it's bad.

779
00:40:33,740 --> 00:40:36,960
Sometimes people migrate to like,
okay, Hetzner or something.

780
00:40:36,980 --> 00:40:39,640
And in this case, it's better to
be closer.

781
00:40:40,320 --> 00:40:46,080
And some AWS regions have like
a couple of milliseconds latency,

782
00:40:46,280 --> 00:40:51,260
round-trip time to a few Hetzner
regions.

783
00:40:51,300 --> 00:40:54,060
I think we have only 2 or 3, I
don't remember.

784
00:40:54,520 --> 00:40:57,840
So maybe you first need to migrate
to a different region, not

785
00:40:57,840 --> 00:41:01,400
the DB yet, your application nodes,
I mean, to be closer to that

786
00:41:01,620 --> 00:41:02,500
Hetzner region.

787
00:41:02,600 --> 00:41:03,980
Geography matters here.

788
00:41:04,000 --> 00:41:07,320
How many miles or kilometers between
them?

789
00:41:07,900 --> 00:41:09,780
And you need to test latency.

790
00:41:10,120 --> 00:41:11,140
How to test latency?

791
00:41:11,760 --> 00:41:15,520
Like, an absolutely simple test is:
you have Postgres, you connect

792
00:41:15,520 --> 00:41:21,640
to it, you write \timing
in psql, and just run "select;"

793
00:41:22,280 --> 00:41:23,000
multiple times.

794
00:41:23,000 --> 00:41:26,400
You already see, it's not scientific,
you need to do something

795
00:41:26,400 --> 00:41:30,360
better, but it's super easy, because
ping usually doesn't work,

796
00:41:30,660 --> 00:41:35,440
and you need some way to test
at the TCP/IP level, some round-trip time.

797
00:41:35,440 --> 00:41:39,340
I think there are tools that we
used in the past, I don't remember,

798
00:41:39,340 --> 00:41:44,860
but this is the easiest way just
to check round-trip time

799
00:41:44,860 --> 00:41:50,140
to Postgres and choose the better one, because
actually 1 millisecond

800
00:41:50,140 --> 00:41:51,680
is already noticeable, right?

801
00:41:51,980 --> 00:41:54,260
Like 2 milliseconds if you have many queries.

802
00:41:54,380 --> 00:41:58,520
10 milliseconds I would already start hesitating to have it.

803
00:42:00,540 --> 00:42:02,980
It's... you remember our first episode, right?

804
00:42:02,980 --> 00:42:05,220
Michael: Yeah, I was going to say, that's what you're aiming

805
00:42:05,220 --> 00:42:08,720
for, to have most queries be less than 10 milliseconds, right?

806
00:42:08,720 --> 00:42:12,800
Nikolay: Yeah, because an HTTP request can have many queries, for example,

807
00:42:12,800 --> 00:42:16,220
10, and for HTTP, 100 milliseconds is already noticeable.

808
00:42:16,400 --> 00:42:18,580
Not noticeable until 200, okay.

809
00:42:20,420 --> 00:42:23,800
Should be noticeable for engineers, not to users yet.

810
00:42:24,520 --> 00:42:25,780
Okay, that's it.

811
00:42:26,040 --> 00:42:29,840
I think we covered many specific areas.

812
00:42:30,240 --> 00:42:33,960
I just wanted to say it's not like, it feels maybe scary, but

813
00:42:33,960 --> 00:42:34,920
it should not be scary.

814
00:42:34,920 --> 00:42:37,940
And I think in the future we will have more mature products,

815
00:42:38,300 --> 00:42:42,180
purely open source, delivering HA and DR in packaged form.

816
00:42:42,500 --> 00:42:45,880
Not only in Kubernetes, but in Kubernetes it's already so.

817
00:42:45,880 --> 00:42:51,380
You can choose among multiple Kubernetes operators, fully open-source,

818
00:42:51,940 --> 00:42:54,780
and they have, or at least they promise to have everything.

819
00:42:55,600 --> 00:42:59,980
But if you don't like Kubernetes, like actually I do more and

820
00:42:59,980 --> 00:43:02,540
more, I don't like Kubernetes for databases more and more.

821
00:43:02,540 --> 00:43:08,600
In this case, I think more products will arise and help you to

822
00:43:08,940 --> 00:43:13,880
stop worrying about that, about backups and availability.

823
00:43:14,960 --> 00:43:16,960
Michael: Yeah, I think it's good to have options as well.

824
00:43:16,960 --> 00:43:17,780
Good to have.

825
00:43:18,000 --> 00:43:21,140
It's good to have a second best option when you're negotiating

826
00:43:21,360 --> 00:43:21,820
as well.

827
00:43:21,820 --> 00:43:22,540
Nikolay: If you

828
00:43:22,540 --> 00:43:26,680
Michael: because if costs are the main concern, step 1 might

829
00:43:26,680 --> 00:43:29,340
be try to negotiate a lower cost.

830
00:43:29,340 --> 00:43:32,880
And at that point, it helps to have an "oh, we could actually migrate."

831
00:43:32,880 --> 00:43:34,980
We've looked into it, and here's our plan.

832
00:43:35,740 --> 00:43:38,540
We can move on to something that will cost us this much less.

833
00:43:38,560 --> 00:43:42,480
You might be able to get a bit of the best of both worlds and just

834
00:43:42,640 --> 00:43:44,940
get the cost reduced without having to migrate.

835
00:43:45,540 --> 00:43:46,500
But yeah, great.

836
00:43:46,500 --> 00:43:50,220
So we covered a bit of planning, we covered the how-to in terms of

837
00:43:50,220 --> 00:43:53,160
the technical side, we covered things you have to make sure you don't

838
00:43:53,160 --> 00:43:55,540
forget about, a few bits there.

839
00:43:55,840 --> 00:43:57,180
Anything else before we wrap

840
00:43:57,180 --> 00:43:57,620
it up?

841
00:43:57,620 --> 00:44:01,980
Nikolay: I just wanted to add a comment that I understand that this discussion

842
00:44:01,980 --> 00:44:03,400
maybe is a little bit early.

843
00:44:03,940 --> 00:44:07,860
So let's see what happens in the next, say, 5 years.

844
00:44:09,000 --> 00:44:16,160
And am I right to predict that the rise of new products around

845
00:44:16,160 --> 00:44:18,140
Postgres should happen?

846
00:44:19,800 --> 00:44:20,640
We'll see.

847
00:44:21,280 --> 00:44:21,780
Michael: Yeah.

848
00:44:21,860 --> 00:44:24,520
Well, I'm going to predict that actually I think we might even

849
00:44:24,520 --> 00:44:26,540
go slightly in the other direction.

850
00:44:27,340 --> 00:44:30,900
I think we're gonna see more and
more managed services, and I

851
00:44:30,900 --> 00:44:32,340
think they're gonna be-
Nikolay: Because of AI.

852
00:44:33,760 --> 00:44:38,300
Michael: Maybe because of AI, but also,
like, some of the companies that

853
00:44:38,440 --> 00:44:41,480
are spinning up thousands of these,
like Supabase seem to be

854
00:44:41,480 --> 00:44:43,480
going from strength to strength,
good for them.

855
00:44:43,480 --> 00:44:46,660
I know Neon just got acquired, but
the number of new databases
the numbers of new databases

856
00:44:46,720 --> 00:44:48,780
on their platform is significant.

857
00:44:48,780 --> 00:44:49,340
They're both

858
00:44:49,340 --> 00:44:51,720
Nikolay: Because of AI builders,
so to speak, right?

859
00:44:52,060 --> 00:44:53,220
Michael: Yeah, but not only.

860
00:44:53,940 --> 00:44:54,840
I think there was...

861
00:44:55,440 --> 00:44:57,160
Nikolay: This is what I read in
social media.

862
00:44:57,160 --> 00:45:01,100
Michael: I think the recent continued
trend is definitely that.

863
00:45:01,100 --> 00:45:01,860
But Supabase...

864
00:45:01,900 --> 00:45:05,080
I know a lot of people building
small businesses and a lot of

865
00:45:05,080 --> 00:45:09,320
them are doing it on Supabase
instead of Firebase in the

866
00:45:09,320 --> 00:45:14,540
old days, so I think there are a
lot of non-AI projects using

867
00:45:14,540 --> 00:45:19,560
that kind of service as well. So
I'm seeing a lot of those, maybe

868
00:45:19,560 --> 00:45:25,680
not the enterprise and the
larger-scale projects, but it'll

869
00:45:25,680 --> 00:45:27,940
be interesting to see in a few
years' time.

870
00:45:28,280 --> 00:45:29,280
Hopefully we'll have both.

871
00:45:29,280 --> 00:45:30,660
Hopefully we'll have both ends
of the spectrum.

872
00:45:30,660 --> 00:45:33,800
Nikolay: What I remember is the
saying that, not a saying, like

873
00:45:33,800 --> 00:45:39,180
phrase that more and more, maybe
more than half of new databases

874
00:45:39,400 --> 00:45:43,660
created, it's from integration
automation from like say some

875
00:45:43,660 --> 00:45:46,960
AI, like coding or something, and
they need a database and it's

876
00:45:46,960 --> 00:45:52,120
just created basically fully automatically,
right? Yeah, yeah, interesting.

877
00:45:52,120 --> 00:45:53,640
Michael: Yeah, what's the other half,
then?

878
00:45:54,960 --> 00:45:57,960
Nikolay: I don't know, I don't know
the exact numbers, it's just what,

879
00:45:57,960 --> 00:46:01,240
like, yeah, I just put some random
number, 50.

880
00:46:01,240 --> 00:46:01,740
Michael: Yeah.

881
00:46:03,460 --> 00:46:03,820
Nikolay: Yeah, so...

882
00:46:03,820 --> 00:46:06,300
Michael: And scale matters too,
right? Like, I think whilst that's

883
00:46:06,300 --> 00:46:09,360
a lot of databases, I don't imagine
that's a lot of huge databases,

884
00:46:09,480 --> 00:46:11,960
like, we're probably not talking
about... those are not the ones that

885
00:46:11,960 --> 00:46:14,280
are going to be migrating to RDS
anytime soon.

886
00:46:14,280 --> 00:46:15,180
Most of them.

887
00:46:15,240 --> 00:46:16,920
Nikolay: Yeah, well, yeah.

888
00:46:17,520 --> 00:46:18,360
It's interesting.

889
00:46:18,380 --> 00:46:23,820
Like, I would love to see
some report from good, trustworthy

890
00:46:24,240 --> 00:46:29,960
sources, landscape of databases
in terms of sizes, budgets, and

891
00:46:29,960 --> 00:46:30,640
so on.

892
00:46:31,500 --> 00:46:33,780
About open source, or maybe only
Postgres.

893
00:46:34,740 --> 00:46:37,460
I remember some Gartner report
from 2018.

894
00:46:39,480 --> 00:46:42,040
Yeah, that was already many years
ago.

895
00:46:42,900 --> 00:46:46,960
It was saying that the open source
database market exceeded 1 billion

896
00:46:46,960 --> 00:46:48,540
already, so it was great.

897
00:46:49,020 --> 00:46:50,260
But what's the distribution?

898
00:46:52,280 --> 00:46:54,940
Those that are created automatically,
they are small.

899
00:46:55,680 --> 00:46:58,640
And they are definitely below the
threshold we discussed, right?

900
00:46:59,200 --> 00:47:00,780
Where it doesn't work.

901
00:47:00,780 --> 00:47:01,020
And

902
00:47:01,020 --> 00:47:02,060
Michael: also, what counts?

903
00:47:02,420 --> 00:47:04,040
What counts as open source?

904
00:47:04,040 --> 00:47:10,340
Like is Gartner including Aurora
Postgres?

905
00:47:12,380 --> 00:47:13,780
As open source?

906
00:47:13,780 --> 00:47:15,220
Nikolay: Well, the basis is open.

907
00:47:17,320 --> 00:47:18,140
Michael: I understand that.

908
00:47:18,140 --> 00:47:20,780
I just think it's a blurred line
at this point.

909
00:47:20,860 --> 00:47:23,680
Nikolay: Yeah, it's hard to say,
there are so many flavors, I agree.

910
00:47:23,680 --> 00:47:26,520
Michael: Yeah, all right. Anyway,
really a pleasure speaking

911
00:47:26,520 --> 00:47:29,840
with you, as always, and catch you next
week.

912
00:47:30,130 --> 00:47:30,880
Nikolay: Good, good.