1
00:00:00,060 --> 00:00:02,720
Michael: Hello and welcome to Postgres.FM, a weekly show about

2
00:00:02,720 --> 00:00:03,660
all things PostgreSQL.

3
00:00:03,900 --> 00:00:05,660
I am Michael, founder of pgMustard.

4
00:00:05,660 --> 00:00:08,080
This is my co-host, Nikolay, founder of Postgres.AI.

5
00:00:08,200 --> 00:00:08,940
Hey Nikolay.

6
00:00:09,360 --> 00:00:10,100
Nikolay: Hi Michael.

7
00:00:10,180 --> 00:00:10,980
How are you?

8
00:00:11,260 --> 00:00:11,520
Michael: Good.

9
00:00:11,520 --> 00:00:11,780
How are

10
00:00:11,780 --> 00:00:12,280
you?

11
00:00:12,740 --> 00:00:13,520
Nikolay: Very good.

12
00:00:13,940 --> 00:00:15,200
Michael: Well, yeah, me too.

13
00:00:15,780 --> 00:00:16,840
Haki: Thank you for asking.

14
00:00:18,720 --> 00:00:19,740
Nikolay: Thank you for asking.

15
00:00:20,620 --> 00:00:21,740
Haki: Like I'm here too.

16
00:00:21,740 --> 00:00:25,120
If you just want to keep chat among yourselves, I'll just wait

17
00:00:25,120 --> 00:00:25,620
here.

18
00:00:25,760 --> 00:00:27,880
Nikolay: Look, look, we have a guest today.

19
00:00:28,260 --> 00:00:28,520
Haki: Yeah.

20
00:00:28,520 --> 00:00:29,020
Hi.

21
00:00:29,300 --> 00:00:29,640
Michael: Yeah.

22
00:00:29,640 --> 00:00:34,300
We are delighted to have a guest today and it is the one and only

23
00:00:34,300 --> 00:00:38,600
Haki Benita, technical lead and database enthusiast who also

24
00:00:38,600 --> 00:00:42,520
writes an excellent blog over at hakibenita.com that we've mentioned

25
00:00:42,520 --> 00:00:44,020
many times on the podcast before.

26
00:00:44,020 --> 00:00:46,560
He also gives popular talks and training sessions.

27
00:00:46,680 --> 00:00:48,220
Thank you so much for joining us.

28
00:00:48,220 --> 00:00:48,560
Haki: Thank you

29
00:00:48,560 --> 00:00:49,620
for having me.

30
00:00:50,020 --> 00:00:51,520
Big fan, very excited.

31
00:00:52,760 --> 00:00:53,940
Nikolay: The blog is great.

32
00:00:53,980 --> 00:00:58,140
If someone didn't read it yet, like it's, it's a must-read

33
00:00:58,140 --> 00:01:00,780
for, I think, and long reads, right?

34
00:01:00,780 --> 00:01:02,800
So, It's long reads, but not very frequent.

35
00:01:02,800 --> 00:01:05,860
I guess it's not possible to make them frequent, right?

36
00:01:06,380 --> 00:01:07,860
Haki: Yeah, it's a lot of work.

37
00:01:09,140 --> 00:01:13,440
Nikolay: Funny story is we agreed that our bot will have your

38
00:01:13,440 --> 00:01:16,180
blog post in our knowledge base.

39
00:01:16,180 --> 00:01:16,800
Haki: Yeah, that's right.

40
00:01:16,800 --> 00:01:18,040
Nikolay: And thank you for that.

41
00:01:18,280 --> 00:01:21,740
And like a few days ago, I told the team like, you know, like

42
00:01:21,740 --> 00:01:25,100
this part of our knowledge base was not updated since January,

43
00:01:25,580 --> 00:01:26,520
since alpha release.

44
00:01:26,520 --> 00:01:28,640
And now we're approaching beta release.

45
00:01:28,940 --> 00:01:31,840
And they said, no problem, we'll update and pay attention to

46
00:01:31,840 --> 00:01:37,280
this blog, I said, and then they came and said, we have updated

47
00:01:37,540 --> 00:01:43,180
only one article added since January, but it makes sense because

48
00:01:43,180 --> 00:01:44,800
long reads require a lot of time.

49
00:01:44,800 --> 00:01:45,920
I know it very well.

50
00:01:45,920 --> 00:01:46,280
Right.

51
00:01:46,280 --> 00:01:46,440
Haki: Yeah.

52
00:01:46,440 --> 00:01:50,360
It takes a very long time to produce
these types of articles.

53
00:01:50,440 --> 00:01:51,660
Yeah, I know.

54
00:01:51,760 --> 00:01:54,720
I've been writing for a very, very
long time and I've been very

55
00:01:54,720 --> 00:01:56,820
surprised about this article.

56
00:01:56,820 --> 00:01:59,940
I'll be honest with you, you know,
after you write for some time,

57
00:02:00,060 --> 00:02:03,260
you start to develop like a sense
of which ones are going to

58
00:02:03,260 --> 00:02:05,460
do well and be very accepted.

59
00:02:05,540 --> 00:02:08,760
And which ones are going to be
like, you know, just one for the,

60
00:02:08,760 --> 00:02:09,580
for the archives.

61
00:02:10,160 --> 00:02:15,360
So, you know, when I wrote stuff
like comparing pandas to, to

62
00:02:15,360 --> 00:02:18,840
SQL, I knew that it's going to
be a hit because people like pandas.

63
00:02:18,840 --> 00:02:21,600
And if you say pandas is slow,
then people are going to hate

64
00:02:21,600 --> 00:02:22,100
you.

65
00:02:22,120 --> 00:02:25,760
And when I wrote this, me and Michael
wrote something about hash

66
00:02:25,760 --> 00:02:30,980
indexes a few years back, and I
think it's a top result for Postgres

67
00:02:31,020 --> 00:02:33,940
hash index on just about every
search engine.

68
00:02:33,940 --> 00:02:36,760
So that one, you know, you, you released
that one, you know, that

69
00:02:36,760 --> 00:02:39,600
it's going to do well, but then
I released this one.

70
00:02:39,600 --> 00:02:42,340
I said, man, it's so long, such
a niche problem.

71
00:02:42,400 --> 00:02:46,400
No, no chance anybody's actually
going to read all of that.

72
00:02:46,400 --> 00:02:50,780
And I was very surprised by the
amount of responses that I received

73
00:02:51,040 --> 00:02:51,840
to this article.

74
00:02:51,840 --> 00:02:57,300
And I think the most surprising
comment that I received on this

75
00:02:57,300 --> 00:03:01,260
article is that a lot of Postgres
people didn't know about merge.

76
00:03:02,080 --> 00:03:06,440
Now I came from, I came from Oracle,
so I knew all about merge.

77
00:03:06,680 --> 00:03:11,680
And actually when I came to Postgres,
I wrote a lot of ETL processes

78
00:03:11,840 --> 00:03:15,860
and I really wanted merge and I
didn't have merge, so I had to

79
00:03:15,860 --> 00:03:17,740
learn about INSERT ON CONFLICT.

80
00:03:18,600 --> 00:03:23,000
So a lot of people told me this
is the first time I heard about

81
00:03:23,000 --> 00:03:23,500
merge.

82
00:03:23,560 --> 00:03:24,640
Such a cool command.

83
00:03:24,840 --> 00:03:26,640
Nikolay: Because it's new, it's
still new.

84
00:03:26,640 --> 00:03:27,040
Haki: I know.

85
00:03:27,040 --> 00:03:28,820
Nikolay: It was 15, right?

86
00:03:28,940 --> 00:03:29,440
15.

87
00:03:29,540 --> 00:03:31,320
So it's super new.

88
00:03:31,940 --> 00:03:36,020
Haki: And also to be honest, INSERT
ON CONFLICT is like what

89
00:03:36,020 --> 00:03:36,700
you want.

90
00:03:37,040 --> 00:03:38,260
99% of the cases.

91
00:03:39,160 --> 00:03:44,760
And while merge is nice, I think
that ON CONFLICT is, you seem

92
00:03:44,760 --> 00:03:45,060
pissed.

93
00:03:45,060 --> 00:03:46,500
Why, why did I say?

94
00:03:47,780 --> 00:03:50,860
Nikolay: Well, it was you describing
your article, why it's bad,

95
00:03:50,860 --> 00:03:51,180
right?

96
00:03:51,180 --> 00:03:52,760
Let's talk about why it's bad.

97
00:03:52,960 --> 00:03:56,100
These gaps in sequences, like a
BLOAT created.

98
00:03:56,720 --> 00:03:57,220
Yeah.

99
00:03:57,900 --> 00:03:59,540
This is not what people want.

100
00:03:59,600 --> 00:04:00,100
Right.

101
00:04:00,720 --> 00:04:01,160
I know.

102
00:04:01,160 --> 00:04:02,540
Let's step, let's step back.

103
00:04:02,540 --> 00:04:07,660
Like our, Michael likes us to pronounce
the topic.

104
00:04:10,840 --> 00:04:11,920
Haki: I have to tell you something.

105
00:04:11,920 --> 00:04:12,640
I have to tell you something.

106
00:04:12,640 --> 00:04:13,140
It's funny.

107
00:04:13,140 --> 00:04:13,320
Okay.

108
00:04:13,320 --> 00:04:14,200
I have to tell you something.

109
00:04:14,200 --> 00:04:18,440
So when I talked with Michael about
doing this show and I asked

110
00:04:18,440 --> 00:04:20,140
him, okay, let's do it.

111
00:04:20,140 --> 00:04:23,860
Can you give me like a rough outline
of how it's going to go?

112
00:04:23,920 --> 00:04:27,600
And he told me, I can tell you
how it's going to go, but to be

113
00:04:27,600 --> 00:04:30,960
honest, if Nikolay is going to
be on the show, then I can't promise

114
00:04:30,960 --> 00:04:31,460
anything.

115
00:04:33,780 --> 00:04:35,360
Michael: I can't promise any structure.

116
00:04:35,660 --> 00:04:38,760
Nikolay: Well, I see you are the
same type of person, actually.

117
00:04:38,800 --> 00:04:39,240
Haki: I don't know.

118
00:04:39,240 --> 00:04:41,400
You delivered like on the first
5 minutes.

119
00:04:41,400 --> 00:04:43,440
Nikolay: Don't follow the agenda,
right?

120
00:04:43,580 --> 00:04:47,360
So you said that you were surprised
this topic would be popular,

121
00:04:47,360 --> 00:04:49,080
but how come? It's super popular.

122
00:04:49,080 --> 00:04:53,100
It's like in computer science,
particularly in databases, find

123
00:04:53,100 --> 00:04:56,760
or create, this is like a super
common pattern, right?

124
00:04:56,820 --> 00:05:00,360
And the fact that in Postgres,
the solutions have so many pros

125
00:05:00,360 --> 00:05:03,740
and cons, like it's super interesting
because everyone needs

126
00:05:03,740 --> 00:05:04,440
it actually.

127
00:05:04,540 --> 00:05:04,920
Yeah.

128
00:05:04,920 --> 00:05:06,900
You build something, you want this,
right?

129
00:05:07,280 --> 00:05:07,760
Haki: Yeah.

130
00:05:07,760 --> 00:05:09,300
I'm going to tell you a secret.

131
00:05:09,600 --> 00:05:14,620
After I published this article,
I actually discovered that there

132
00:05:14,620 --> 00:05:19,400
are still situations where you
get very unpredictable results.

133
00:05:20,280 --> 00:05:20,740
Okay.

134
00:05:20,740 --> 00:05:24,720
And, and, and, and I've done some
experimenting in the past 2

135
00:05:24,720 --> 00:05:28,260
weeks based on comments that I
received and I haven't cracked

136
00:05:28,260 --> 00:05:28,940
it yet.

137
00:05:30,520 --> 00:05:31,020
Yeah.

138
00:05:31,280 --> 00:05:32,720
It gets a lot more complicated.

139
00:05:33,700 --> 00:05:35,460
And I'll even tell you another
secret.

140
00:05:35,540 --> 00:05:40,380
There's a very, in my opinion,
an unexpected difference between

141
00:05:40,380 --> 00:05:45,160
the way merge and insert on conflict
behave under different circumstances.

142
00:05:46,640 --> 00:05:49,440
But, you know, we promised Michael
that we'll describe the topic

143
00:05:49,440 --> 00:05:51,980
before we actually dive into the
details.

144
00:05:52,840 --> 00:05:53,340
Nikolay: Right.

145
00:05:54,000 --> 00:05:55,580
Let's name the topic.

146
00:05:55,920 --> 00:05:56,120
Yeah.

147
00:05:56,120 --> 00:05:56,620
Yeah.

148
00:05:56,820 --> 00:05:58,780
Michael: Well, I was interested
in your thoughts on this actually,

149
00:05:58,780 --> 00:06:02,160
because I feel like you deliberately
called your blog post get

150
00:06:02,160 --> 00:06:02,880
or create.

151
00:06:03,180 --> 00:06:06,360
Whereas from the database side
of it, I've always thought of

152
00:06:06,360 --> 00:06:08,300
it as like INSERT or SELECT.

153
00:06:08,300 --> 00:06:11,120
And I think Nikolay's called it
that, but in the past in like

154
00:06:11,120 --> 00:06:12,440
a how-to guide and things.

155
00:06:12,440 --> 00:06:16,020
So we'll link up the blog post
and we'll link up Nikolay's how

156
00:06:16,020 --> 00:06:18,480
to, and there's like a few other
things.

157
00:06:18,620 --> 00:06:21,340
I think one of the things you're
referencing about how it's become

158
00:06:21,340 --> 00:06:23,940
even more complicated in the last
few weeks was like you linked

159
00:06:23,940 --> 00:06:28,540
me to a great answer by Erwin Brandstetter
on Stack Overflow

160
00:06:28,740 --> 00:06:31,260
that discusses this as well, but
it's a surprise.

161
00:06:31,260 --> 00:06:32,620
It sounds really simple, right?

162
00:06:32,720 --> 00:06:35,360
SELECT or INSERT, get or create.

163
00:06:35,360 --> 00:06:37,780
It seems such a simple piece of
functionality.

164
00:06:37,900 --> 00:06:43,260
And in small, low traffic systems,
it can be like you can, like

165
00:06:43,260 --> 00:06:47,180
any newbie developer could come
up with a solution to this problem.

166
00:06:47,400 --> 00:06:51,960
It just gets complicated when
you have to factor in concurrency

167
00:06:52,420 --> 00:06:55,240
and MVCC implementation and things.

168
00:06:55,240 --> 00:06:57,180
So, yeah, I loved your post.

169
00:06:57,180 --> 00:07:00,040
I found it really interesting how
complicated this could get,

170
00:07:00,040 --> 00:07:02,560
But yeah, maybe we should start
simple and like, it'd be great

171
00:07:02,560 --> 00:07:04,800
to hear a little bit about like
you, you both said it's quite

172
00:07:04,800 --> 00:07:07,540
a common thing that you come, like
you have to implement it quite

173
00:07:07,540 --> 00:07:08,040
often.

174
00:07:08,560 --> 00:07:10,020
I haven't had to that often.

175
00:07:10,020 --> 00:07:13,040
Like I've got a couple of occasions,
but not maybe not as often

176
00:07:13,040 --> 00:07:13,380
as you.

177
00:07:13,380 --> 00:07:15,640
So I'd be interested to hear like
where you're coming across

178
00:07:15,640 --> 00:07:16,140
this.

179
00:07:16,400 --> 00:07:16,900
Haki: Okay.

180
00:07:16,960 --> 00:07:21,060
So my use case was very similar
to what I actually implemented

181
00:07:21,100 --> 00:07:25,680
in the article because I had this
organization table and we had

182
00:07:25,680 --> 00:07:30,260
members and we wanted to have users
be able to tag members with

183
00:07:30,260 --> 00:07:33,560
their own tags and we wanted to
encourage reuse of tags.

184
00:07:33,560 --> 00:07:35,360
Otherwise, they don't make much
sense.

185
00:07:35,500 --> 00:07:38,720
So we've set up this very simple
HTML form.

186
00:07:38,720 --> 00:07:40,680
Now HTML forms are not SPAs.

187
00:07:40,680 --> 00:07:42,900
They're not, they're not very complicated.

188
00:07:43,100 --> 00:07:44,340
You can send JSON.

189
00:07:44,440 --> 00:07:49,700
You just send a list of names, like
from a radio selection or like

190
00:07:49,700 --> 00:07:50,960
an autocomplete, whatever.

191
00:07:51,740 --> 00:07:55,460
So at the end of the day, the server
receives a list of strings

192
00:07:56,240 --> 00:07:59,620
and it wants to set these tags
on the member.

193
00:08:00,040 --> 00:08:04,540
Now, if the tag exists, you want
to reuse the tag.

194
00:08:04,700 --> 00:08:07,920
And if the tag does not exist,
you want to create a new tag and

195
00:08:07,920 --> 00:08:08,800
get the ID.

196
00:08:09,060 --> 00:08:13,380
Because the next step would be
to get all the IDs and put them in

197
00:08:13,380 --> 00:08:16,560
the table associating tags with
members, right?

198
00:08:16,560 --> 00:08:23,560
So that's the first part where
you have a list of tag names and

199
00:08:23,560 --> 00:08:26,340
you want to get or create tags.

200
00:08:26,720 --> 00:08:28,920
This is where I came to this idea.

201
00:08:29,140 --> 00:08:32,920
Now, the thing that surprised me
is that now this is not a very

202
00:08:32,920 --> 00:08:34,840
high traffic part of the application.

203
00:08:35,140 --> 00:08:39,140
I could have done the brute force
approach would have been just

204
00:08:39,140 --> 00:08:39,640
fine.

205
00:08:40,080 --> 00:08:44,500
But you know, I wrote some unit
testing and 1 of the tests was,

206
00:08:44,500 --> 00:08:48,380
let's see what happens when you
just have an existing tag, you

207
00:08:48,380 --> 00:08:49,900
want to make sure that it's reused.

208
00:08:50,280 --> 00:08:56,180
So I use INSERT ON CONFLICT DO
NOTHING with RETURNING *.

209
00:08:56,840 --> 00:09:02,480
So I expected that if I insert,
for example, 2 tags and 1 of

210
00:09:02,480 --> 00:09:06,840
them already exists, I expected
to get 2 tags in return.

211
00:09:07,800 --> 00:09:10,660
But in fact, what I got was just
1 tag.

212
00:09:11,060 --> 00:09:14,320
So this was very strange to me.

213
00:09:14,600 --> 00:09:18,220
So at this point I started investigating,
you know, and starting

214
00:09:18,220 --> 00:09:20,080
to explore why this is happening.

215
00:09:20,080 --> 00:09:25,460
And in fact, the first thing that
I thought about was let's do

216
00:09:25,560 --> 00:09:26,960
a meaningless UPDATE.

217
00:09:26,960 --> 00:09:32,240
Like instead of doing ON CONFLICT
DO NOTHING, I did ON CONFLICT

218
00:09:32,980 --> 00:09:35,820
SET ID = EXCLUDED.ID.

219
00:09:36,100 --> 00:09:40,300
Like let's fool the database into
thinking that this tag was

220
00:09:40,520 --> 00:09:41,020
modified.

221
00:09:41,160 --> 00:09:44,640
So I get, get that in return, but
it really bugged me because

222
00:09:44,640 --> 00:09:48,080
it's very like, it's a very ugly
solution.

223
00:09:48,080 --> 00:09:51,180
Why would I want to update something
for no reason just to get

224
00:09:51,180 --> 00:09:51,840
it back?

225
00:09:52,080 --> 00:09:55,220
So this is where, you know, all
the, the different scenarios

226
00:09:55,440 --> 00:09:56,360
started to uncover.

227
00:09:56,360 --> 00:09:59,740
And as I tested farther and farther,
I came to the conclusion

228
00:09:59,760 --> 00:10:01,740
like, Hell man, why is this so
complicated?

229
00:10:02,140 --> 00:10:05,140
I mean, this, this is what database
and applications do.

230
00:10:05,280 --> 00:10:06,580
This should be so simple.

231
00:10:06,580 --> 00:10:07,940
Why is this so complicated?

232
00:10:08,520 --> 00:10:13,760
And then I started digging and
1 of the places that I eventually

233
00:10:13,840 --> 00:10:16,480
arrived was Django.

234
00:10:16,900 --> 00:10:21,780
Django is a web framework and there's
a, an implementation of

235
00:10:21,780 --> 00:10:23,120
get_or_create().

236
00:10:23,140 --> 00:10:29,760
And what Django is doing, they
try to fetch the record.

237
00:10:30,260 --> 00:10:32,060
If they find it, they return it.

238
00:10:32,060 --> 00:10:34,900
If they don't find it, they try
to insert.

239
00:10:35,740 --> 00:10:36,240
Right.

240
00:10:36,740 --> 00:10:41,060
But then, and that's like the funny
part, the non-intuitive part,

241
00:10:41,280 --> 00:10:45,700
then they actually handle an integrity,
unique constraint violation,

242
00:10:46,320 --> 00:10:48,100
and then they try to get it again.

243
00:10:48,660 --> 00:10:52,580
This is the select, insert, select,
but then it gets even more

244
00:10:52,580 --> 00:10:57,080
complicated because if you have
a unique constraint violation

245
00:10:57,160 --> 00:11:00,540
inside a transaction, it aborts
the transaction, right?

246
00:11:01,060 --> 00:11:05,440
You need to wrap the second insert
in another transaction.

247
00:11:06,820 --> 00:11:07,320
Right.

248
00:11:07,360 --> 00:11:08,260
Nikolay: Or use subtransactions.

249
00:11:08,680 --> 00:11:12,360
I think Django by default uses
subtransactions, which is a very

250
00:11:12,360 --> 00:11:14,880
bad idea, actually, but we will
talk about it later.

251
00:11:15,040 --> 00:11:18,160
Haki: The reason that he uses subtransactions,
and I know you're

252
00:11:18,160 --> 00:11:21,260
a very big fan of subtransaction,
you use them all the time.

253
00:11:21,260 --> 00:11:23,100
You encourage everybody to use
subtransactions.

254
00:11:23,920 --> 00:11:27,320
You wrote a very long blog post
about why they are absolutely

255
00:11:27,480 --> 00:11:29,480
great and you should use them all
the time.

256
00:11:30,360 --> 00:11:34,720
But the reason that you do that
in Django, the reason that they're

257
00:11:34,720 --> 00:11:39,620
doing that is because if you happen
to call this function inside

258
00:11:39,620 --> 00:11:45,720
a transaction of your own, and
if you reach a unique constraint

259
00:11:45,720 --> 00:11:48,580
violation, it gets your transaction
aborted.

260
00:11:48,680 --> 00:11:49,820
So that's a problem.

261
00:11:50,000 --> 00:11:53,360
So the only way that Django can
make sure that they don't get

262
00:11:53,360 --> 00:11:57,660
your transaction aborted is to
execute this part in a subtransaction

263
00:11:57,780 --> 00:11:58,700
of its own.

264
00:11:59,440 --> 00:12:02,800
Now There's also another very interesting
thing that happens

265
00:12:02,800 --> 00:12:03,300
here.

266
00:12:03,340 --> 00:12:05,860
And this is also something that
I mentioned in the article.

267
00:12:06,580 --> 00:12:10,020
Python encourages asking for forgiveness.

268
00:12:11,500 --> 00:12:15,800
So in Python, the idiom says that
you should try something and

269
00:12:15,800 --> 00:12:17,580
then handle all the possible exceptions.

270
00:12:18,460 --> 00:12:21,660
So trying to insert a record and
then handling the unique constraint

271
00:12:21,660 --> 00:12:25,520
violation is actually consistent
with how you're expected to

272
00:12:25,520 --> 00:12:26,700
do things in Python.

273
00:12:27,180 --> 00:12:31,400
But it kind of conflicts with the
way Postgres handles this situation.

274
00:12:31,500 --> 00:12:36,300
Because in Postgres, when you have
an exception inside a transaction,

275
00:12:36,300 --> 00:12:40,160
it aborts the transaction, which
is not a very good thing.

276
00:12:40,800 --> 00:12:44,380
So the approaches between how you
would do things in Postgres

277
00:12:44,380 --> 00:12:47,660
and how you would do things in,
in Python kind of clash in

278
00:12:47,660 --> 00:12:48,360
this case.

279
00:12:48,420 --> 00:12:49,980
I thought it was very interesting.

280
00:12:50,500 --> 00:12:54,360
So, you know, I explored different
approaches, like what would

281
00:12:54,360 --> 00:12:58,240
be the equivalent of asking for
forgiveness in Postgres, what

282
00:12:58,240 --> 00:13:01,560
would be the equivalent of looking
before you leap, check all

283
00:13:01,560 --> 00:13:02,940
the preconditions in advance.

284
00:13:04,460 --> 00:13:08,800
So, yeah, it turned out to be way
more complicated than I anticipated.

285
00:13:10,320 --> 00:13:11,880
Nikolay: Yeah, that's interesting.

286
00:13:11,980 --> 00:13:15,660
And you explored several properties,
right?

287
00:13:16,020 --> 00:13:20,500
Bloat, concurrency, constraint
and idempotency, right?

288
00:13:20,500 --> 00:13:21,880
Haki: Idempotency, yeah.

289
00:13:23,080 --> 00:13:24,640
Nikolay: So I'm very curious.

290
00:13:25,360 --> 00:13:29,140
I saw some parts of benchmarks,
but you decided not to include

291
00:13:29,140 --> 00:13:33,920
the performance to the resulting
table when you compared methods.

292
00:13:34,280 --> 00:13:35,040
Why so?

293
00:13:35,380 --> 00:13:37,160
Performance is also interesting, right?

294
00:13:37,540 --> 00:13:39,060
Haki: Performance is interesting.

295
00:13:39,100 --> 00:13:43,580
And I've seen some articles
that do this from a performance

296
00:13:43,580 --> 00:13:44,440
point of view.

297
00:13:45,040 --> 00:13:48,220
In my situation, performance was
not a concern.

298
00:13:48,820 --> 00:13:52,960
I was more focused on getting the
functional attributes that

299
00:13:52,960 --> 00:13:57,400
I wanted, like idempotency, for
example, was a lot more important 

300
00:13:57,400 --> 00:13:59,280
to me than making it fast.

301
00:13:59,280 --> 00:14:02,820
Because at the end of the day,
you don't UPDATE lots of tags.

302
00:14:03,240 --> 00:14:05,580
You probably set no more than 5.

303
00:14:06,100 --> 00:14:08,220
So performance was not a concern.

304
00:14:08,480 --> 00:14:11,420
I did want to focus on the functional
attributes.

305
00:14:11,720 --> 00:14:13,100
Also, it got pretty long.

306
00:14:13,100 --> 00:14:16,080
So at some point you need to let
something go, you know.

307
00:14:16,860 --> 00:14:17,320
Nikolay: Right.

308
00:14:17,320 --> 00:14:17,820
Yeah.

309
00:14:18,220 --> 00:14:22,460
Let's maybe talk about your favorite
method right now and for

310
00:14:22,460 --> 00:14:22,860
future.

311
00:14:22,860 --> 00:14:24,300
For future, I guess it's MERGE,
right?

312
00:14:24,300 --> 00:14:27,720
Because in Postgres 17 it will 
have RETURNING clause.

313
00:14:27,720 --> 00:14:28,880
Actually, I didn't know.

314
00:14:29,060 --> 00:14:33,340
And when I read about it in your
blog post, I immediately thought,

315
00:14:33,340 --> 00:14:37,740
oh, this is a candidate to be one
of my favorite additions to Postgres

316
00:14:37,740 --> 00:14:38,240
17.

317
00:14:38,380 --> 00:14:40,280
I somehow overlooked it.

318
00:14:40,360 --> 00:14:41,180
So it's great.

319
00:14:41,780 --> 00:14:44,840
MERGE was added to Postgres 15.

320
00:14:45,480 --> 00:14:48,200
It was being developed for more
than 10 years, I think.

321
00:14:49,080 --> 00:14:50,460
It was a huge story.

322
00:14:51,220 --> 00:14:53,860
But it lacks RETURNING in 2015
and 2016.

323
00:14:53,880 --> 00:14:58,240
Now in 2017, to be honest, it's 
receiving RETURNING.

324
00:14:58,740 --> 00:15:03,720
It looks like all your checkboxes
are green, right?

325
00:15:07,460 --> 00:15:11,980
You will choose this if you already
was on, we are on Postgres

326
00:15:12,240 --> 00:15:16,020
17, but before 17, what's your
favorite method?

327
00:15:16,220 --> 00:15:20,640
Haki: Well, INSERT ON CONFLICT,
DO NOTHING with a UNION is currently

328
00:15:20,680 --> 00:15:22,320
my way to go.

329
00:15:22,680 --> 00:15:28,080
And I expect it to remain my go-to
approach, mostly because of

330
00:15:28,080 --> 00:15:28,580
habit.

331
00:15:29,340 --> 00:15:35,320
And also because I experimented
with the new MERGE under different

332
00:15:35,320 --> 00:15:37,360
circumstances in Postgres 17.

333
00:15:38,140 --> 00:15:43,080
And it turns out that MERGE can 
actually fail with unique constraint

334
00:15:43,080 --> 00:15:43,580
violation.

335
00:15:45,520 --> 00:15:47,580
So I found it unexpected.

336
00:15:48,480 --> 00:15:51,800
I don't know, because I don't know
much about how these two are 

337
00:15:51,820 --> 00:15:54,640
different in the underlying implementation,
but I'm guessing

338
00:15:55,080 --> 00:15:57,420
that they are not implemented the 
same way.

339
00:15:57,700 --> 00:15:59,960
Nikolay: And we talk about it commitment
level.

340
00:15:59,960 --> 00:16:00,560
Haki: Yes, of course.

341
00:16:00,560 --> 00:16:02,320
Nikolay: I think it also matters,
right?

342
00:16:02,320 --> 00:16:03,220
It's the lowest.

343
00:16:03,900 --> 00:16:04,400
Haki: Yeah.

344
00:16:04,440 --> 00:16:09,900
So, so after I published this article,
I, some reader reached

345
00:16:09,900 --> 00:16:13,080
out to me and he said, I really
liked your posts and everything

346
00:16:13,080 --> 00:16:16,780
is very comprehensive, blah, blah,
blah, But there is one race

347
00:16:16,780 --> 00:16:20,420
condition that can still cause
inconsistent results.

348
00:16:20,900 --> 00:16:24,480
So basically, if you have two sessions
inserting the exact same

349
00:16:24,480 --> 00:16:28,880
tag at the exact same time, then
you can end up with a situation

350
00:16:29,100 --> 00:16:35,040
where using INSERT ON CONFLICT,
you would not get idempotency.

351
00:16:35,500 --> 00:16:42,920
meaning the first statement would
return the tag, but the second

352
00:16:42,920 --> 00:16:44,040
would return nothing.

353
00:16:45,940 --> 00:16:52,000
And the reason for that is when
you start the first transaction,

354
00:16:52,660 --> 00:16:54,360
you INSERT the tag A.

355
00:16:55,200 --> 00:17:03,060
So tag A does not exist and INSERT,
INSERT the tag, right?

356
00:17:03,780 --> 00:17:06,700
And then you get that in return,
but then you don't COMMIT.

357
00:17:07,640 --> 00:17:12,560
Now another session starts, the
transaction begins, and now you

358
00:17:12,560 --> 00:17:13,820
do the exact same thing.

359
00:17:13,820 --> 00:17:19,900
You do INSERT tag A ON CONFLICT
DO NOTHING, RETURNING *, and

360
00:17:19,900 --> 00:17:22,660
then you SELECT from the table
and SELECT from the RETURNING.

361
00:17:23,160 --> 00:17:27,560
What's happening now is interesting,
because right now, the row

362
00:17:27,560 --> 00:17:28,300
is LOCKED.

363
00:17:28,660 --> 00:17:31,900
So, the second session hangs.

364
00:17:32,620 --> 00:17:35,440
So now you COMMIT the first session.

365
00:17:36,180 --> 00:17:39,860
So the second session at this point
is going to return nothing.

366
00:17:41,180 --> 00:17:45,820
And the reason is INSERT ON CONFLICT
LOCKED.

367
00:17:46,020 --> 00:17:50,600
It encountered a unique constraint
violation, so the row was

368
00:17:50,600 --> 00:17:51,420
not inserted.

369
00:17:51,800 --> 00:17:55,900
But then when you tried to SELECT
from the table, it found nothing

370
00:17:55,960 --> 00:18:01,520
because the snapshot is older than
the other transaction.

371
00:18:03,160 --> 00:18:07,900
So this is a scenario where, where
you get, you don't get idempotency.

372
00:18:08,540 --> 00:18:13,500
You expect to get the same output
in both cases, but you don't.

373
00:18:13,740 --> 00:18:17,140
And the more interesting thing
is if you do the exact same exercise

374
00:18:17,160 --> 00:18:19,640
with MERGE, you get different results.

375
00:18:19,640 --> 00:18:20,600
What do you get?

376
00:18:20,600 --> 00:18:21,520
Can you guess?

377
00:18:22,180 --> 00:18:23,600
I found it to be surprising.

378
00:18:24,620 --> 00:18:27,380
If you do the exact same experiment
with MERGE, you are going

379
00:18:27,380 --> 00:18:28,860
to get unique constraint violation.

380
00:18:31,720 --> 00:18:36,220
Even though you can say in the
MERGE clause, you can say ON CONFLICT,

381
00:18:36,220 --> 00:18:38,500
DO NOTHING WHEN MATCHED, DO NOTHING.

382
00:18:38,940 --> 00:18:41,600
You are still going to get unique
constraint violation.

383
00:18:42,080 --> 00:18:47,780
So this raises a question of whether
MERGE is really, or I would,

384
00:18:47,780 --> 00:18:52,240
I wouldn't want to say something
like, is it safe in concurrent,

385
00:18:52,240 --> 00:18:56,280
highly concurrent reloads, but
it doesn't do what it promises.

386
00:18:56,280 --> 00:18:59,480
Better understanding of how MERGE
handles these types of situations,

387
00:18:59,620 --> 00:19:00,260
you know.

388
00:19:01,260 --> 00:19:01,500
Nikolay: Yeah.

389
00:19:01,500 --> 00:19:06,300
It shouldn't produce unique constraint
violation in read committed,

390
00:19:06,300 --> 00:19:07,120
but it does.

391
00:19:08,680 --> 00:19:11,760
But it should be easy to check
based on what you say, like if

392
00:19:11,760 --> 00:19:17,320
you just do pgbench multiple sessions
in parallel doing the same

393
00:19:17,320 --> 00:19:18,100
thing, right?

394
00:19:18,100 --> 00:19:19,580
Haki: It's very easily reproducible.

395
00:19:19,740 --> 00:19:22,940
You just need 2 terminal windows
and you can immediately replicate

396
00:19:23,000 --> 00:19:23,260
it.

397
00:19:23,260 --> 00:19:27,240
I, the thing I found to be, I would
say disturbing.

398
00:19:27,540 --> 00:19:27,880
Yeah.

399
00:19:27,880 --> 00:19:31,960
And I'm using air quotes for those
of you just listening is that

400
00:19:31,960 --> 00:19:36,100
when you do MERGE and you set ON
MATCH, DO NOTHING, you don't

401
00:19:36,100 --> 00:19:38,040
expect to get Unique constraint
violation.

402
00:19:38,620 --> 00:19:38,940
Right.

403
00:19:38,940 --> 00:19:43,260
If I wrote this statement in my
code, I would not handle any

404
00:19:43,260 --> 00:19:44,120
Constraint violation.

405
00:19:45,060 --> 00:19:45,320
Okay.

406
00:19:45,320 --> 00:19:49,280
Because if I'm inserting into a
row and I know that it can raise,

407
00:19:49,280 --> 00:19:52,840
I would handle the situation and
do whatever I need to do, but

408
00:19:52,840 --> 00:19:58,120
I would not expect to get integrity
error when I explicitly handle

409
00:19:58,120 --> 00:20:01,100
the case where there is a match.

410
00:20:01,560 --> 00:20:03,220
So I found this to be surprising.

411
00:20:03,240 --> 00:20:06,760
So to answer your previous question,
what would I use now?

412
00:20:07,240 --> 00:20:11,180
And that's a, that's a tough choice
between you don't get the

413
00:20:11,180 --> 00:20:13,340
expected result to you get an exception.

414
00:20:15,060 --> 00:20:17,360
You know, when I'm thinking about
right now, I think that it

415
00:20:17,360 --> 00:20:19,020
kind of makes sense to get the
exception.

416
00:20:20,240 --> 00:20:20,740
Right?

417
00:20:20,820 --> 00:20:21,060
Nikolay: Yeah.

418
00:20:21,060 --> 00:20:25,960
Well, I guess we're moving from
the end of article backwards,

419
00:20:26,000 --> 00:20:26,420
right?

420
00:20:26,420 --> 00:20:31,180
So much, this, the problem you
describe, is it discussed somewhere?

421
00:20:31,360 --> 00:20:35,860
Like, do you know, like, like this
behavior was discussed in

422
00:20:35,860 --> 00:20:37,310
mailing lists or somewhere, no?

423
00:20:37,310 --> 00:20:38,000
Haki: I don't know.

424
00:20:38,000 --> 00:20:40,840
Maybe I'm guessing that it might
have, but I haven't seen any.

425
00:20:40,840 --> 00:20:43,220
And it also, as you said, it's
kind of new in Postgres.

426
00:20:43,280 --> 00:20:46,640
So it's possible that some of the
people that worked on it didn't

427
00:20:46,640 --> 00:20:49,200
have enough time to actually document
it.

428
00:20:49,200 --> 00:20:52,700
I wouldn't say document, but maybe
write these types of articles

429
00:20:52,720 --> 00:20:56,140
on it, analyzing how it behaves
under different circumstances.

430
00:20:57,040 --> 00:20:57,440
Nikolay: Yeah.

431
00:20:57,440 --> 00:20:58,240
I'm curious.

432
00:20:58,260 --> 00:20:59,720
I think this should be documented.

433
00:20:59,820 --> 00:21:01,880
Maybe, maybe it's actually documented.

434
00:21:01,940 --> 00:21:03,980
We should check, But it's super
interesting.

435
00:21:04,080 --> 00:21:06,780
I never used MERGE in production,
never, ever.

436
00:21:07,360 --> 00:21:08,760
But yeah, so what...

437
00:21:09,520 --> 00:21:14,840
Well, honestly, most production
systems are still behind a lot

438
00:21:14,840 --> 00:21:16,360
in terms of major version.

439
00:21:16,860 --> 00:21:21,560
But moving backwards, in SERP,
on conflict, do nothing or do

440
00:21:21,560 --> 00:21:22,040
UPDATE.

441
00:21:22,040 --> 00:21:27,340
This is interesting because we
said our goal is find or create

442
00:21:27,340 --> 00:21:30,920
or get or create, right?

443
00:21:31,020 --> 00:21:34,960
But when you say, let's also UPDATE,
it's like a third layer

444
00:21:34,960 --> 00:21:36,240
of logic, right?

445
00:21:36,680 --> 00:21:41,440
Find or create, but if you found
also UPDATE, like it's, it's

446
00:21:41,440 --> 00:21:42,180
more complicated.

447
00:21:42,180 --> 00:21:45,020
I like it, but it's already a different
task, right?

448
00:21:45,020 --> 00:21:45,520
Haki: Yeah.

449
00:21:45,860 --> 00:21:49,640
Well, you know, when I wrote lots
of ETL, I don't know, 15 years

450
00:21:49,640 --> 00:21:53,100
ago in, in Oracle, I used MERGE
all the time.

451
00:21:53,460 --> 00:21:56,420
It put like a third hand every
time I needed to do anything,

452
00:21:56,420 --> 00:22:00,040
I would use MERGE because when
you do ETL processes, you're basically

453
00:22:00,060 --> 00:22:03,560
taking data from one place and you
want to sync it to some kind

454
00:22:03,560 --> 00:22:04,260
of target.

455
00:22:04,440 --> 00:22:06,360
So you use MERGE all the time.

456
00:22:06,620 --> 00:22:10,340
In Postgres, I kind of gotten used
to not using it.

457
00:22:10,760 --> 00:22:16,460
And also some may also claim that
the whole ETL, the way that

458
00:22:16,460 --> 00:22:20,220
we do ETL now is kind of different
than the way we used to do

459
00:22:20,220 --> 00:22:21,860
ETL 15 years ago.

460
00:22:21,940 --> 00:22:25,060
A lot of it is the same, but some
of it is still kind of different.

461
00:22:25,320 --> 00:22:27,340
Yeah, I think it's interesting.

462
00:22:27,380 --> 00:22:31,020
And I, and hopefully when more
people adopt MERGE, I know that

463
00:22:31,020 --> 00:22:33,960
it was a very desirable feature,
right?

464
00:22:34,200 --> 00:22:35,780
A lot of people wanted that.

465
00:22:37,060 --> 00:22:37,560
Right.

466
00:22:37,660 --> 00:22:37,900
Nikolay: People

467
00:22:37,900 --> 00:22:39,840
Haki: were very excited when it
came out.

468
00:22:39,840 --> 00:22:41,060
It was all in there.

469
00:22:41,060 --> 00:22:44,940
What I anticipate, what I expect,
most in version, you know,

470
00:22:45,060 --> 00:22:45,980
the MERGE command.

471
00:22:46,800 --> 00:22:49,760
So hopefully we're going to see
some people doing interesting

472
00:22:49,760 --> 00:22:50,780
stuff with MERGE.

473
00:22:51,760 --> 00:22:51,960
Nikolay: Yeah.

474
00:22:51,960 --> 00:22:56,580
And I was surprised to see, so
there is a difference between, in

475
00:22:56,580 --> 00:22:59,760
certain ON CONFLICT DO NOTHING and
DO UPDATE in the table you provided

476
00:22:59,760 --> 00:23:04,860
in the end of the article and DO
UPDATE, there is a like red

477
00:23:04,860 --> 00:23:09,400
cross in the column of bloat, meaning
that it has bloat issues,

478
00:23:09,400 --> 00:23:11,020
so obviously update, okay.

479
00:23:11,160 --> 00:23:16,500
But also you have a green checkbox
for INSERT ON CONFLICT DO NOTHING.

480
00:23:16,720 --> 00:23:21,140
The thing is that when we created
our bot, we also used INSERT

481
00:23:21,140 --> 00:23:23,640
ON CONFLICT DO NOTHING for the knowledge
base.

482
00:23:23,640 --> 00:23:28,440
For example, if your article gets
inserted, it searches if this

483
00:23:28,440 --> 00:23:29,940
URL is known already.

484
00:23:31,220 --> 00:23:33,180
And if it's known already, DO NOTHING.

485
00:23:33,180 --> 00:23:35,420
We decided not to apply update
yet.

486
00:23:36,280 --> 00:23:42,080
So I was surprised to see you concluded
that there is no bloat

487
00:23:42,080 --> 00:23:46,980
issues because we had a lot of
activity, a lot of data to insert

488
00:23:47,420 --> 00:23:49,140
and it suffers from bloat.

489
00:23:49,700 --> 00:23:50,660
We have gaps.

490
00:23:51,580 --> 00:23:55,580
So if you have a collision, right?

491
00:23:55,680 --> 00:23:58,460
So you insert something, you try
to insert and DO NOTHING.

492
00:23:59,340 --> 00:24:01,740
It actually generates that tuple,
I think.

493
00:24:01,740 --> 00:24:02,240
No,

494
00:24:02,480 --> 00:24:05,040
Haki: No, I think that, first of
all, gaps,

495
00:24:05,200 --> 00:24:05,980
Nikolay: gaps for sure.

496
00:24:05,980 --> 00:24:06,380
I just

497
00:24:06,380 --> 00:24:06,760
Haki: checked it.

498
00:24:06,760 --> 00:24:07,740
Nikolay: Gaps for sure.

499
00:24:07,800 --> 00:24:08,040
Haki: Yeah.

500
00:24:08,040 --> 00:24:09,060
Gaps for sure.

501
00:24:09,160 --> 00:24:09,720
And okay.

502
00:24:09,720 --> 00:24:10,920
I know what you're talking about.

503
00:24:10,920 --> 00:24:15,140
You, you, you're talking about 
the, the issue of bloat, right?

504
00:24:15,140 --> 00:24:19,540
So this is also a very interesting 
topic, which I briefly mentioned,

505
00:24:19,540 --> 00:24:22,940
but I think that this is also something 
that some people commented

506
00:24:22,940 --> 00:24:26,720
about that they weren't aware of 
the fact that it causes bloat.

507
00:24:26,920 --> 00:24:30,140
So apparently in Postgres, when 
you have unique constraint violation,

508
00:24:30,300 --> 00:24:34,340
it can cause bloat because the 
way it works is that Postgres

509
00:24:34,440 --> 00:24:39,860
tries to, it basically inserts 
the row into the table and then

510
00:24:39,860 --> 00:24:43,640
it checks that there are no unique 
constraint violations, right?

511
00:24:43,640 --> 00:24:46,560
And if there is a unique constraint 
violation, this row is marked

512
00:24:46,560 --> 00:24:48,460
as dead, which causes bloat.

513
00:24:48,960 --> 00:24:53,680
So this is what's happening if 
you rely, heavily rely on catching

514
00:24:53,680 --> 00:24:54,860
unique constraint violation.

515
00:24:55,240 --> 00:24:59,740
However, however, however, and 
this is also in the original RDS

516
00:24:59,760 --> 00:25:02,720
article, which is how I found out 
about it.

517
00:25:02,720 --> 00:25:06,080
It was also very unexpected that 
unique constraint violation

518
00:25:06,080 --> 00:25:07,020
would cause bloat.

519
00:25:07,200 --> 00:25:09,900
But according to the article and 
according to the benchmarks

520
00:25:09,960 --> 00:25:14,040
that I've done, when you use INSERT 
ON CONFLICT, it uses a different

521
00:25:14,040 --> 00:25:20,640
mechanism that can check in advance 
for collisions, which prevents

522
00:25:20,640 --> 00:25:21,140
bloat.

523
00:25:21,600 --> 00:25:26,520
So I'm pretty confident that INSERT 
ON CONFLICT DO NOTHING cause

524
00:25:26,520 --> 00:25:30,960
bloat, which is a big plus for 
DO NOTHING.

525
00:25:32,380 --> 00:25:36,340
And also, you know, we, we talked 
before about the difference

526
00:25:36,340 --> 00:25:39,300
between how you do things in Python, 
like asking for forgiveness

527
00:25:39,840 --> 00:25:41,640
versus look before you leap.

528
00:25:41,640 --> 00:25:45,940
So in Python, the pattern that 
is encouraged, basically try to

529
00:25:45,940 --> 00:25:49,200
do something and then handle 
the exception can end up causing

530
00:25:49,200 --> 00:25:51,660
lots of bloat, which is a problem.

531
00:25:51,960 --> 00:25:55,080
So if you have a process with lots 
of collisions where you actually

532
00:25:55,080 --> 00:25:58,380
rely on unique constraint violation, 
this is something that you

533
00:25:58,380 --> 00:25:59,740
need to think about.

534
00:26:00,180 --> 00:26:00,600
Nikolay: Right.

535
00:26:00,600 --> 00:26:01,840
So yeah, I double-checked.

536
00:26:01,840 --> 00:26:02,440
You're right.

537
00:26:02,440 --> 00:26:05,460
No bloat, but the sequence gaps are happening.

538
00:26:05,580 --> 00:26:06,960
We have a sequence gap.

539
00:26:07,600 --> 00:26:11,520
Haki: You know, there was a period 
where I liked to ask in a job

540
00:26:11,520 --> 00:26:14,000
interviews, how would you implement 
gapless sequences?

541
00:26:15,600 --> 00:26:17,780
Well, have you ever talked about 
this?

542
00:26:17,780 --> 00:26:18,680
Michael: Evil question.

543
00:26:18,740 --> 00:26:19,720
Haki: Evil question.

544
00:26:19,760 --> 00:26:20,260
Yeah.

545
00:26:20,380 --> 00:26:20,820
Nikolay: Yeah.

546
00:26:20,820 --> 00:26:25,320
It has many depths to 
go into, right?

547
00:26:25,320 --> 00:26:25,600
Yeah.

548
00:26:25,600 --> 00:26:26,740
Deeper, deeper, deeper.

549
00:26:26,820 --> 00:26:27,320
Yeah.

550
00:26:27,840 --> 00:26:30,720
Michael: There's a company that 
a few of my friends and former

551
00:26:30,720 --> 00:26:32,580
colleagues work at called incident.io.

552
00:26:33,340 --> 00:26:36,800
They do like incident management
within things like Slack and

553
00:26:36,800 --> 00:26:37,300
teams.

554
00:26:37,700 --> 00:26:41,640
And they had this issue where they,
well, they blogged about

555
00:26:41,640 --> 00:26:45,560
it, but they wanted gapless sequences
for their incident numbers.

556
00:26:45,620 --> 00:26:49,540
They didn't, The customers got
confused by the idea of the incident

557
00:26:49,540 --> 00:26:52,800
number jumping by like 5 when they'd
only had, you know, they've

558
00:26:52,800 --> 00:26:55,080
had 1 incident, then a second incident,
a third incident, suddenly

559
00:26:55,080 --> 00:26:56,260
they're on incident 6.

560
00:26:56,260 --> 00:26:56,980
What happened?

561
00:26:57,040 --> 00:26:59,540
So there's a really nice blog post
there.

562
00:26:59,760 --> 00:27:01,740
We can explain how that was.

563
00:27:03,740 --> 00:27:08,080
But I'm curious, why are you bothered
by sequence gaps, Nikolay?

564
00:27:08,160 --> 00:27:09,280
Why is that an issue?

565
00:27:09,280 --> 00:27:10,820
Nikolay: Well, it's not a big issue.

566
00:27:12,040 --> 00:27:15,560
It might be an issue for some cases,
but actually I don't care.

567
00:27:15,660 --> 00:27:18,960
It gave me a false impression that
there is bloat.

568
00:27:19,920 --> 00:27:23,040
But now I double-checked, I'm absolutely
right.

569
00:27:23,160 --> 00:27:23,900
No bloat.

570
00:27:24,760 --> 00:27:29,440
So INSERT ON CONFLICT DO NOTHING
probably is the winner in Postgres

571
00:27:29,440 --> 00:27:30,520
16 or older.

572
00:27:32,520 --> 00:27:34,960
Michael: I've got a question here,
because you mentioned the

573
00:27:34,960 --> 00:27:39,360
kind of trade-off between the issues
in the highly concurrent

574
00:27:39,440 --> 00:27:44,440
cases or like the potential issue
of INSERT ON CONFLICT DO NOTHING

575
00:27:44,760 --> 00:27:51,100
returning null if you happen to
have this case with the 2 concurrent

576
00:27:51,140 --> 00:27:56,140
sessions inserting the exact same
tag at the same time, wouldn't

577
00:27:56,820 --> 00:28:01,820
INSERT ON CONFLICT DO UPDATE avoid
that issue at the cost of

578
00:28:01,820 --> 00:28:03,480
some bloat?

579
00:28:04,820 --> 00:28:07,200
Haki: Well, that's a different
issue.

580
00:28:07,200 --> 00:28:11,920
The issue I describe is that when
you have 2 transactions trying

581
00:28:11,920 --> 00:28:15,520
to insert a new tag at the same
time, the same new tag at the

582
00:28:15,520 --> 00:28:21,140
same time, then the second ON CONFLICT
would not do anything

583
00:28:21,280 --> 00:28:23,540
and it would also not return anything.

584
00:28:24,840 --> 00:28:28,780
This is the unexpected part because
one of my requirements is that

585
00:28:28,780 --> 00:28:30,260
the operation would be idempotent.

586
00:28:30,400 --> 00:28:34,300
So if I give it tag a, I expect
to get an item returned.

587
00:28:34,740 --> 00:28:37,800
So in this case, there is a scenario
where I provide it with

588
00:28:37,800 --> 00:28:42,020
a list of N tags and I get less
than N tags in return.

589
00:28:43,260 --> 00:28:46,440
Michael: But if they both updated,
imagine if both concurrent

590
00:28:46,440 --> 00:28:50,280
sessions were doing INSERT ON 
CONFLICT UPDATE, they both get

591
00:28:50,280 --> 00:28:51,180
to the update stage.

592
00:28:51,180 --> 00:28:54,820
You get 2 updates, but you still
get back all the tags that you

593
00:28:55,240 --> 00:28:56,260
wanted to insert.

594
00:28:56,840 --> 00:29:00,080
Haki: First of all, in my scenario,
I don't update.

595
00:29:00,240 --> 00:29:04,120
This is a very, a lot of people
got this very confused.

596
00:29:04,120 --> 00:29:08,240
This is why I added the comment
and some people made like a very,

597
00:29:09,140 --> 00:29:11,120
let's call them funny comments
about it.

598
00:29:11,120 --> 00:29:11,620
Okay.

599
00:29:12,840 --> 00:29:16,620
But there's a difference between
upsert where you want to update

600
00:29:16,780 --> 00:29:21,420
or insert, which is also a very,
a very popular pattern.

601
00:29:21,780 --> 00:29:26,200
In this case, by the way, it's
simpler because if you actually

602
00:29:26,200 --> 00:29:30,780
update the row, then it's going
to get returned by RETURNING

603
00:29:30,820 --> 00:29:31,320
STAR.

604
00:29:31,340 --> 00:29:31,840
Michael: Returning.

605
00:29:31,980 --> 00:29:32,180
Haki: Yeah.

606
00:29:32,180 --> 00:29:33,840
So that's like the easy case.

607
00:29:33,840 --> 00:29:37,120
In my case, I don't want to touch
the row, but I still want

608
00:29:37,120 --> 00:29:38,140
to get it back.

609
00:29:38,440 --> 00:29:42,760
This is why get_or_create() is like
a more difficult variation

610
00:29:42,980 --> 00:29:43,720
of UPSERT.

611
00:29:44,480 --> 00:29:45,420
If you will.

612
00:29:45,700 --> 00:29:46,200
Okay.

613
00:29:46,640 --> 00:29:49,520
So, there's a tricky part.

614
00:29:50,020 --> 00:29:52,040
So apparently there's no good solution.

615
00:29:52,040 --> 00:29:57,840
By the way, the post by Erwin on Stack Overflow lands

616
00:29:58,100 --> 00:30:02,440
on a brute force solution where
you basically loop until you

617
00:30:02,440 --> 00:30:03,580
get what you expect.

618
00:30:05,340 --> 00:30:09,520
I don't know if I would have, if
I, maybe this is the way to

619
00:30:09,520 --> 00:30:10,020
go.

620
00:30:10,520 --> 00:30:14,360
Nikolay: Loop where in, in a different
language or in PL/pgSQL?

621
00:30:14,380 --> 00:30:17,660
Haki: He ended up writing a function
where you essentially loop

622
00:30:18,340 --> 00:30:20,900
and you need constraint violations.

623
00:30:20,960 --> 00:30:21,340
Yeah.

624
00:30:21,340 --> 00:30:23,440
Nikolay: But you, you need a subtransaction.

625
00:30:23,760 --> 00:30:25,380
So this is no go for me.

626
00:30:25,380 --> 00:30:25,840
Definitely.

627
00:30:25,840 --> 00:30:26,340
Right.

628
00:30:27,180 --> 00:30:28,320
This doesn't scale.

629
00:30:28,320 --> 00:30:28,940
I mean, this,

630
00:30:29,280 --> 00:30:31,360
Haki: Nikolay, you are a man with
a mission.

631
00:30:32,380 --> 00:30:34,260
Nikolay: Well, it's, it's no joke.

632
00:30:34,260 --> 00:30:38,080
I, I would have clients literally
last week I spent like half

633
00:30:38,100 --> 00:30:42,760
1 hour, and people went and switched
off it in Django.

634
00:30:43,260 --> 00:30:47,280
Because it hurts, it hurts constantly
people, like subtransaction

635
00:30:47,440 --> 00:30:49,500
SLRU, wait a minute.

636
00:30:49,640 --> 00:30:51,000
It happens all the time.

637
00:30:51,180 --> 00:30:54,940
The fact that Django by default
uses them and people don't realize

638
00:30:54,960 --> 00:30:59,080
and then come to us with subtransaction
SLRU, it's good for my income,

639
00:30:59,800 --> 00:31:03,200
but it's bad in general.

640
00:31:03,480 --> 00:31:05,460
By the way, I wanted to highlight
1 thing.

641
00:31:05,460 --> 00:31:10,560
If you do INSERT ON CONFLICT DO
NOTHING, RETURNING STAR, it won't

642
00:31:10,560 --> 00:31:12,700
return anything if a row already
exists.

643
00:31:12,700 --> 00:31:15,700
I just wanted to highlight that
you have a trick.

644
00:31:16,100 --> 00:31:20,140
You use CTE, right? And then SELECT
for such case, right.

645
00:31:20,140 --> 00:31:20,580
Yeah.

646
00:31:20,580 --> 00:31:22,740
Additionally, with UNION, right?

647
00:31:22,840 --> 00:31:24,060
Haki: Yeah, that's right.

648
00:31:24,060 --> 00:31:24,860
Nikolay: UNION or what?

649
00:31:24,860 --> 00:31:25,080
Haki: Yeah.

650
00:31:25,080 --> 00:31:27,540
This is the only way that he can
actually communicate between

651
00:31:27,540 --> 00:31:28,680
parts of a query.

652
00:31:28,680 --> 00:31:31,600
Otherwise, they all see the same
snapshot and you don't get.

653
00:31:32,220 --> 00:31:34,700
Nikolay: This is a trick also not
straightforward at all.

654
00:31:34,760 --> 00:31:35,260
Honestly.

655
00:31:35,500 --> 00:31:35,860
Yeah.

656
00:31:35,860 --> 00:31:40,200
Like some people can move away
from INSERT ON CONFLICT DO NOTHING

657
00:31:40,200 --> 00:31:41,360
just because of that.

658
00:31:42,040 --> 00:31:42,340
Michael: Yeah.

659
00:31:42,340 --> 00:31:42,840
Right.

660
00:31:42,900 --> 00:31:43,400
Yeah.

661
00:31:43,900 --> 00:31:46,060
One of many cool tricks in that book.

662
00:31:46,720 --> 00:31:49,620
Haki: I went through this process
where I try to figure out,

663
00:31:49,620 --> 00:31:53,260
wait, I just inserted this row
while, when I SELECT outside the

664
00:31:53,260 --> 00:31:53,800
CTE, I

665
00:31:53,800 --> 00:31:54,940
Nikolay: don't get it.

666
00:31:56,040 --> 00:31:56,540
Michael: Yeah.

667
00:31:57,260 --> 00:31:57,540
Yeah.

668
00:31:57,540 --> 00:32:02,880
I understand there's some costs
to UPSERT, but given the complexity

669
00:32:03,000 --> 00:32:07,780
we're running into here, is it,
would you both see it as acceptable

670
00:32:08,000 --> 00:32:13,180
to pay that extra cost, the kind
of heavier, I know you mentioned

671
00:32:13,180 --> 00:32:16,320
the double hit of the table
being annoying, but like

672
00:32:16,620 --> 00:32:19,840
those updates, even though they'd
be setting the tag name to

673
00:32:19,840 --> 00:32:23,360
the exact same tag name, potentially
over and over again.

674
00:32:23,360 --> 00:32:23,860
Again.

675
00:32:24,120 --> 00:32:27,780
I just see the guarantees
that provides is so attractive.

676
00:32:28,940 --> 00:32:30,200
Haki: I'll be honest with you.

677
00:32:30,200 --> 00:32:32,700
I understand what you're saying
and you're probably correct.

678
00:32:33,160 --> 00:32:36,760
This is like the practical approach,
but I would not do it.

679
00:32:37,120 --> 00:32:38,100
It just bugs me.

680
00:32:38,100 --> 00:32:39,440
I would not do it.

681
00:32:40,640 --> 00:32:43,900
I would not do a meaningless update
just to save, I would just

682
00:32:43,900 --> 00:32:44,860
do two queries.

683
00:32:44,860 --> 00:32:46,820
I would INSERT, and then I would
SELECT.

684
00:32:47,640 --> 00:32:48,080
Separately.

685
00:32:48,080 --> 00:32:50,680
Nikolay: Like an INSERT fallback,
INSERT fallback, you end up

686
00:32:50,680 --> 00:32:53,980
having a lot of bloat depending
on the concrete situation.

687
00:32:54,940 --> 00:33:01,340
So this, so subtransactions and
bloat, huge limitations of performance

688
00:33:01,380 --> 00:33:01,720
here.

689
00:33:01,720 --> 00:33:02,120
Right?

690
00:33:02,120 --> 00:33:02,660
Haki: I know.

691
00:33:02,660 --> 00:33:03,960
I think this is a scenario.

692
00:33:03,960 --> 00:33:08,680
This is a case where you kind of
understand the limitation and

693
00:33:08,680 --> 00:33:12,780
restrictions of the database and
you kind of end up solving it

694
00:33:12,780 --> 00:33:14,180
at the application level.

695
00:33:15,720 --> 00:33:20,160
Now, you mentioned at the beginning,
I know that I, well,

696
00:33:20,160 --> 00:33:26,760
get_or_create() is, is useful, but I
haven't had a chance to implement

697
00:33:26,760 --> 00:33:28,060
it as much.

698
00:33:30,060 --> 00:33:33,400
So yeah, for the rare occasion,
what I actually need to do get

699
00:33:33,400 --> 00:33:37,660
or create, I would just do it at
the application level because

700
00:33:37,660 --> 00:33:41,020
as it turns out, it's very complicated
at the database level.

701
00:33:41,520 --> 00:33:41,880
Nikolay: Right.

702
00:33:41,880 --> 00:33:43,720
In applications, it becomes simpler.

703
00:33:44,340 --> 00:33:47,220
You like, you don't deal with,
probably don't deal with subtransactions

704
00:33:47,480 --> 00:33:52,760
and bloat, but latency like there
are trip times between attempts

705
00:33:52,800 --> 00:33:57,940
to do one step, another step, it
increases chances of failures,

706
00:33:57,980 --> 00:33:58,260
right?

707
00:33:58,260 --> 00:34:00,200
I mean, collisions and so on.

708
00:34:00,420 --> 00:34:00,580
Yeah.

709
00:34:00,580 --> 00:34:04,200
Haki: But you know, if, if my main
business is to get or create,

710
00:34:04,200 --> 00:34:07,120
I would come up with a very innovative
solution, put a lot of

711
00:34:07,120 --> 00:34:08,680
engineering work into it.

712
00:34:08,680 --> 00:34:11,760
But if it's just updating tags
for members in an organization,

713
00:34:12,040 --> 00:34:12,940
I would go.

714
00:34:13,200 --> 00:34:13,520
Yeah.

715
00:34:13,520 --> 00:34:14,920
But I understand what you're saying.

716
00:34:14,920 --> 00:34:19,640
I spent a lot of time doing unnecessary
optimizations on, you

717
00:34:19,640 --> 00:34:22,860
know, weird places in the code
just for fun.

718
00:34:23,260 --> 00:34:24,700
I do it all the time.

719
00:34:25,440 --> 00:34:25,840
Yeah.

720
00:34:25,840 --> 00:34:30,860
Like 90% of my blog posts are inspired
by these strange endeavors

721
00:34:30,860 --> 00:34:32,660
where I try to optimize things.

722
00:34:33,760 --> 00:34:35,540
So yeah, I, this is interesting.

723
00:34:35,760 --> 00:34:40,320
And I think that this very simple
problem surfaced a lot of issues.

724
00:34:41,000 --> 00:34:43,620
I also learned a lot from writing
this.

725
00:34:43,860 --> 00:34:48,100
It got me interested in, in speculative
insertion, the mechanism

726
00:34:48,340 --> 00:34:50,040
used by INSERT ON CONFLICT.

727
00:34:50,740 --> 00:34:54,220
It, I think, broadened my understanding
of how the interaction

728
00:34:54,280 --> 00:34:57,400
between different queries and a
common table expressions work

729
00:34:57,400 --> 00:34:59,120
within the same query.

730
00:34:59,640 --> 00:35:03,200
No, lots of things that I'm now
aware of them.

731
00:35:03,200 --> 00:35:09,140
So at least I'm better equipped
to debug issues I might have

732
00:35:09,140 --> 00:35:10,660
and don't even know about.

733
00:35:12,160 --> 00:35:12,660
Michael: Yeah.

734
00:35:12,720 --> 00:35:12,980
Yeah.

735
00:35:12,980 --> 00:35:14,600
I really liked it as a journey.

736
00:35:14,600 --> 00:35:17,600
I would encourage people to read
this, even if they don't have

737
00:35:17,600 --> 00:35:21,560
this problem, just as a, almost
as just to watch somebody go

738
00:35:21,560 --> 00:35:23,000
through this problem.

739
00:35:24,160 --> 00:35:26,880
And you know, I feel like you've
included things as well.

740
00:35:26,880 --> 00:35:30,660
I like there's 1 section that's
I took a long time as well, like

741
00:35:30,660 --> 00:35:31,080
that.

742
00:35:31,080 --> 00:35:33,600
And that's so helpful to share
those things with people.

743
00:35:33,600 --> 00:35:36,380
Cause otherwise you can read some
of these blog posts and they

744
00:35:36,380 --> 00:35:39,660
just sound like the person knew,
like just got it right straight

745
00:35:39,660 --> 00:35:40,080
away.

746
00:35:40,080 --> 00:35:42,660
And it's quite frustrating as somebody
that quite often goes

747
00:35:42,660 --> 00:35:44,580
down long-term for me reading those.

748
00:35:44,580 --> 00:35:46,280
So I appreciated that.

749
00:35:46,380 --> 00:35:49,840
Nikolay: I just wish you had
also some warning about subtransactions

750
00:35:50,220 --> 00:35:50,720
anyway.

751
00:35:53,300 --> 00:35:55,600
Haki: You know what, let me just
change the banner.

752
00:35:55,600 --> 00:35:58,860
The top of the website, instead
of the about page, I would just

753
00:35:58,860 --> 00:36:01,380
place like this huge warning, They
don't need subtransactions.

754
00:36:02,020 --> 00:36:03,380
Nikolay: Let me put right here.

755
00:36:03,380 --> 00:36:09,360
If you see a PL/pgSQL with BEGIN,
EXCEPTION

756
00:36:09,480 --> 00:36:10,400
WHEN or something.

757
00:36:10,440 --> 00:36:15,400
So anyway, nested BEGIN... END blocks,
this is subtransaction.

758
00:36:16,720 --> 00:36:17,220
Haki: Yeah.

759
00:36:17,580 --> 00:36:20,720
But in this case, it's warranted
because otherwise, it cancels

760
00:36:20,720 --> 00:36:21,720
any calling transaction.

761
00:36:22,060 --> 00:36:24,800
So that would be the responsible
thing to do in this case.

762
00:36:24,800 --> 00:36:25,760
But you know what?

763
00:36:25,760 --> 00:36:29,180
Maybe it needs a comment saying
this is a good subtransaction.

764
00:36:30,360 --> 00:36:33,040
Nikolay: It's not a good subtransaction
because if you have

765
00:36:33,040 --> 00:36:33,440
high...

766
00:36:33,440 --> 00:36:36,360
Haki: This is the least worst type
of subtransaction.

767
00:36:36,980 --> 00:36:40,100
Nikolay: I cannot agree with you
here because you talk about,

768
00:36:40,840 --> 00:36:44,420
you explore clashes and concurrency
issues.

769
00:36:44,680 --> 00:36:45,180
Yeah.

770
00:36:45,720 --> 00:36:50,140
It means you think that there will
be a lot of sessions issuing

771
00:36:50,900 --> 00:36:52,000
this query.

772
00:36:52,440 --> 00:36:56,740
It means eventually the project
will bump into this wall.

773
00:36:57,940 --> 00:37:03,120
Good subtransactions is, for example,
DDL, which happens not

774
00:37:03,120 --> 00:37:04,660
often in 1 session.

775
00:37:04,780 --> 00:37:05,540
This is good.

776
00:37:05,540 --> 00:37:09,600
And even there, it can become bad.

777
00:37:09,900 --> 00:37:14,280
But here I cannot agree because
we should consider multiple sessions

778
00:37:14,540 --> 00:37:17,120
fighting and this is where things
can go wrong.

779
00:37:17,120 --> 00:37:19,860
Haki: What you're saying reinforces
what I'm saying that in this

780
00:37:19,860 --> 00:37:22,700
situation, because it's so complicated
in the database, I would

781
00:37:22,700 --> 00:37:27,940
just elevate that to the application
level and try to find a

782
00:37:27,940 --> 00:37:30,420
solution there to avoid all the
complexity.

783
00:37:30,480 --> 00:37:33,480
But it was a nice thought exercise,
understanding different types

784
00:37:33,480 --> 00:37:34,080
of transactions.

785
00:37:34,080 --> 00:37:37,280
Nikolay: Just to try regular, regular
transaction, just to try

786
00:37:37,280 --> 00:37:38,260
regular transaction maybe.

787
00:37:38,260 --> 00:37:38,680
Right.

788
00:37:38,680 --> 00:37:38,860
Yeah.

789
00:37:38,860 --> 00:37:40,920
It should not be so complex,
right?

790
00:37:40,920 --> 00:37:41,460
Haki: I know.

791
00:37:41,580 --> 00:37:43,520
I was surprised that it was so
complicated.

792
00:37:44,240 --> 00:37:45,180
Nikolay: Why is it so?

793
00:37:45,560 --> 00:37:49,120
Michael: What do you, like, what
would you both like as a, is

794
00:37:49,120 --> 00:37:51,700
it that you would like MERGE to
work as expected?

795
00:37:51,780 --> 00:37:54,940
Like what is the nice solution
on the database side for this?

796
00:37:55,080 --> 00:37:59,120
Haki: Well, the dream syntax would
be for SELECT DO NOTHING RETURNING

797
00:37:59,120 --> 00:38:01,480
STAR to just return everything.

798
00:38:02,020 --> 00:38:03,780
That would be like the dream syntax.

799
00:38:04,220 --> 00:38:07,080
Nikolay: I think part of the problem
is that RETURNING STAR is

800
00:38:07,080 --> 00:38:12,240
quite like not super popular thing,
which was added later to

801
00:38:12,240 --> 00:38:14,000
INSERT and UPDATE and DELETE.

802
00:38:14,180 --> 00:38:15,740
It was not originally there.

803
00:38:15,960 --> 00:38:17,880
So it's not a super natural thing.

804
00:38:18,820 --> 00:38:21,540
Michael: Also, once you've defined
that behavior, it's so difficult

805
00:38:21,560 --> 00:38:24,840
to like, you can't change the behavior
for existing applications.

806
00:38:24,840 --> 00:38:25,340
Haki: You can't.

807
00:38:25,340 --> 00:38:30,740
And I can even give you an example
because you usually use INSERT

808
00:38:31,180 --> 00:38:32,060
ON CONFLICT.

809
00:38:32,980 --> 00:38:36,160
With, you know, there is no MERGE
command in Postgres.

810
00:38:36,300 --> 00:38:42,340
So if you want to sync data, you
use INSERT ON CONFLICT and then

811
00:38:42,340 --> 00:38:43,680
when matched do nothing.

812
00:38:44,100 --> 00:38:44,600
OK.

813
00:38:44,820 --> 00:38:48,400
And more often what you want to
do is you want to avoid the unnecessary

814
00:38:48,540 --> 00:38:48,960
updates.

815
00:38:48,960 --> 00:38:54,900
So you do when matched, do UPDATE
where something actually changed.

816
00:38:55,200 --> 00:38:55,580
OK.

817
00:38:55,580 --> 00:38:59,120
And then some rows end up not
being affected.

818
00:38:59,500 --> 00:39:02,300
So at this point you do RETURNING
star.

819
00:39:02,660 --> 00:39:07,120
And then usually what I like to
do in my ETL processes is I count

820
00:39:07,120 --> 00:39:09,860
the rows so I can log how many
rows were affected.

821
00:39:11,180 --> 00:39:11,680
Right?

822
00:39:11,820 --> 00:39:16,440
So now the question is, if I expect
to get the number of affected

823
00:39:16,560 --> 00:39:19,440
rows, we go for a circle here.

824
00:39:19,440 --> 00:39:20,020
I know.

825
00:39:20,020 --> 00:39:23,860
So if I'm just expecting that the
affected rows, that if I'm

826
00:39:23,860 --> 00:39:29,600
going to get rows that were not
affected, also very strange,

827
00:39:29,620 --> 00:39:30,120
right?

828
00:39:31,060 --> 00:39:31,560
Nikolay: Right.

829
00:39:32,540 --> 00:39:33,040
Haki: Yeah.

830
00:39:33,420 --> 00:39:34,520
So what's the conclusion?

831
00:39:34,640 --> 00:39:35,320
Come on, guys.

832
00:39:35,320 --> 00:39:36,560
I mean, what's the conclusion?

833
00:39:36,600 --> 00:39:37,940
Just do it in the application?

834
00:39:37,960 --> 00:39:38,900
Nikolay: Don't go there.

835
00:39:38,940 --> 00:39:40,780
The main conclusion is don't go
there.

836
00:39:40,840 --> 00:39:41,920
Don't think about it.

837
00:39:41,920 --> 00:39:42,420
Haki: Yeah.

838
00:39:44,440 --> 00:39:49,340
Nikolay: If an error happens, it only
confirms that application

839
00:39:49,540 --> 00:39:52,580
developers should implement proper
retry logic.

840
00:39:52,820 --> 00:39:53,320
Always.

841
00:39:54,020 --> 00:39:57,940
If a transaction fails, code should
be ready to retry it.

842
00:39:58,040 --> 00:39:58,780
That's it.

843
00:39:59,180 --> 00:40:00,880
Universal solution to everything.

844
00:40:02,420 --> 00:40:05,400
Haki: We should rename this to
just do it in the application

845
00:40:05,560 --> 00:40:06,060
FM.

846
00:40:08,160 --> 00:40:10,960
Well, it's complicated FM.

847
00:40:13,140 --> 00:40:13,960
Michael: It's complicated.

848
00:40:14,140 --> 00:40:15,260
Haki: No, it is complicated.

849
00:40:15,480 --> 00:40:16,620
No, no jokes aside.

850
00:40:16,620 --> 00:40:16,860
It is.

851
00:40:16,860 --> 00:40:19,520
I don't know why it's so complicated,
but it is.

852
00:40:19,700 --> 00:40:21,500
Michael: Turns out concurrency
is hard.

853
00:40:21,500 --> 00:40:22,000
Haki: Yeah.

854
00:40:22,240 --> 00:40:25,240
You know, one of the things
that I remember is that

855
00:40:25,240 --> 00:40:29,060
when I read this RDS article about
a unique constraint violation

856
00:40:29,060 --> 00:40:31,160
causing bloat, I was shocked.

857
00:40:32,020 --> 00:40:37,120
I was shocked because coming from
Python, I'm like encouraged

858
00:40:37,360 --> 00:40:38,440
to trigger exceptions.

859
00:40:39,220 --> 00:40:40,860
This is how I do things.

860
00:40:41,040 --> 00:40:43,680
I fail and then I, I adjust.

861
00:40:44,840 --> 00:40:47,540
So, that was very strange for me.

862
00:40:47,540 --> 00:40:52,160
I have this entire lecture about
how to handle concurrency with

863
00:40:52,160 --> 00:40:53,980
a short URL system and everything.

864
00:40:53,980 --> 00:40:58,680
Like there's a problem of how you
generate short random IDs.

865
00:40:59,180 --> 00:41:00,200
How do you do it?

866
00:41:00,200 --> 00:41:03,320
You need to, if you know, like
a short URL system, you need to

867
00:41:03,320 --> 00:41:05,240
generate keys, very small keys.

868
00:41:06,040 --> 00:41:07,420
So the space is limited.

869
00:41:07,900 --> 00:41:10,520
So you want to generate random
unique keys.

870
00:41:10,520 --> 00:41:11,920
So how do you do it?

871
00:41:12,700 --> 00:41:16,960
So you end up in a solution where
you try to INSERT and then

872
00:41:16,960 --> 00:41:21,540
you loop until you don't get unique
constraint violations anymore.

873
00:41:22,200 --> 00:41:22,700
Right.

874
00:41:22,800 --> 00:41:28,440
And now all I'm thinking is I may
have caused lots of bloat without

875
00:41:28,440 --> 00:41:28,940
intention.

876
00:41:30,040 --> 00:41:31,500
Maybe you have a better solution.

877
00:41:31,720 --> 00:41:34,320
By the way, it's very similar to 
the question about the gap, 

878
00:41:34,320 --> 00:41:35,040
the sequences.

879
00:41:36,340 --> 00:41:36,840
Michael: Yeah.

880
00:41:37,540 --> 00:41:39,240
I get that bloat is bad, right?

881
00:41:39,240 --> 00:41:42,540
Like I understand, I see lots of
slow queries that are caused by

882
00:41:42,540 --> 00:41:46,220
bloat, but once you're aware of it,
it's like something that can be managed.

883
00:41:46,220 --> 00:41:47,500
And you've written a great blog
post about how to manage it at

884
00:41:47,500 --> 00:41:50,080
Haki as well, that I'll include.

885
00:41:50,080 --> 00:41:55,640
But it feels to me like it might
be one of those pains worth paying 

886
00:41:55,640 --> 00:41:59,100
sometimes when you're aware, like
every single PostgreSQL table

887
00:41:59,100 --> 00:42:02,240
is going to have some bloat in
it, in a natural work life.

888
00:42:02,240 --> 00:42:02,640
You're not going to get it down
to zero, even when you do some of 

889
00:42:02,640 --> 00:42:05,360
these maintenance tasks.

890
00:42:05,360 --> 00:42:06,820
So we're always managing some amount.

891
00:42:06,820 --> 00:42:08,720
It's just like, what's a, what's
a healthy amount and can, can

892
00:42:08,720 --> 00:42:11,720
we keep it within those boundaries?

893
00:42:11,720 --> 00:42:13,420
It's going to be more painful with
certain workflows.

894
00:42:13,440 --> 00:42:15,760
Like maybe once we're getting near
the end of that, like once

895
00:42:15,760 --> 00:42:18,560
we get past maybe halfway full
of that space, we're starting

896
00:42:18,560 --> 00:42:22,320
like an average of 2 tries per
unique URL might start to get

897
00:42:22,320 --> 00:42:26,480
annoying.

898
00:42:26,480 --> 00:42:26,940
So maybe that's the point where
you start expanding the space.

899
00:42:29,540 --> 00:42:30,900
by an extra digit or something.

900
00:42:30,900 --> 00:42:33,740
I don't, I imagine that once you're
aware of the problems, you

901
00:42:33,740 --> 00:42:37,040
can design a solution that isn't
that bad and then manage the

902
00:42:37,040 --> 00:42:37,540
bloat.

903
00:42:37,540 --> 00:42:40,120
Haki: Well, it all boils down to
how many collisions you actually

904
00:42:40,120 --> 00:42:41,140
expect to get.

905
00:42:42,040 --> 00:42:50,580
Nikolay: I think the fact that
rollback inserts cause bloat makes

906
00:42:50,580 --> 00:42:53,420
me think that only undo people
can save us someday.

907
00:42:54,280 --> 00:42:55,900
Michael: We had a quote last week.

908
00:42:57,260 --> 00:42:58,840
Nikolay: The new term, undo people.

909
00:42:58,940 --> 00:43:00,080
Because indeed, if you think

910
00:43:00,080 --> 00:43:00,980
Michael: about it,

911
00:43:03,940 --> 00:43:07,480
Nikolay: this different approach
would place a new row in the

912
00:43:07,480 --> 00:43:10,740
same position on the page, right?

913
00:43:11,540 --> 00:43:14,380
And if it's committed, it's there.

914
00:43:14,380 --> 00:43:18,220
If not, like, I mean, the different
approach would not cause

915
00:43:19,120 --> 00:43:21,700
bloat if an insert is rolled back.

916
00:43:22,540 --> 00:43:23,040
Right.

917
00:43:23,200 --> 00:43:28,680
So this is just the design of PostgreSQL's
MVCC we should blame.

918
00:43:29,540 --> 00:43:30,040
Right.

919
00:43:30,940 --> 00:43:31,440
Yeah.

920
00:43:32,020 --> 00:43:34,300
It's very unexpected for sure.

921
00:43:34,300 --> 00:43:34,700
Haki: Yeah.

922
00:43:34,700 --> 00:43:38,140
Well, at least we can run very
long queries without worrying

923
00:43:38,140 --> 00:43:41,300
about undo segments running out.

924
00:43:42,560 --> 00:43:43,060
Nikolay: Yeah.

925
00:43:43,360 --> 00:43:44,140
Michael: I've been there.

926
00:43:44,140 --> 00:43:47,120
And it's great to have an old group,
like a former ORACLE person

927
00:43:47,120 --> 00:43:49,740
understanding the downsides of
undo too.

928
00:43:49,740 --> 00:43:50,240
Right.

929
00:43:51,820 --> 00:43:54,560
Haki, any, any last things that
we should have mentioned that

930
00:43:54,560 --> 00:43:54,960
we didn't?

931
00:43:54,960 --> 00:43:55,440
Nikolay: And

932
00:43:55,440 --> 00:43:58,980
Haki: no, I think that we zigzagged
across this article very,

933
00:43:58,980 --> 00:43:59,820
very nicely.

934
00:44:00,240 --> 00:44:02,520
Nikolay definitely delivered on
his promise.

935
00:44:03,500 --> 00:44:03,900
Nikolay: Yeah.

936
00:44:03,900 --> 00:44:06,420
Well, are you planning part 2 with
benchmarks?

937
00:44:07,800 --> 00:44:10,520
Haki: I don't know if I still don't
care much about performance

938
00:44:10,520 --> 00:44:14,800
in this case, but I think that
I might add an update on some

939
00:44:14,800 --> 00:44:19,180
of the findings about how merge
and insert can end up eventually

940
00:44:19,500 --> 00:44:26,180
causing a unique constraint violation
and non-idempotent results,

941
00:44:27,180 --> 00:44:33,180
I hope that you implemented upsert
so you can identify where

942
00:44:33,180 --> 00:44:35,200
blog post is updated.

943
00:44:36,260 --> 00:44:39,120
So your AI can answer this correctly.

944
00:44:40,300 --> 00:44:41,640
Nikolay: Oh, this is good.

945
00:44:41,760 --> 00:44:42,500
Good goal.

946
00:44:42,500 --> 00:44:44,080
Actually, we don't have.

947
00:44:45,240 --> 00:44:45,740
Haki: Yeah.

948
00:44:46,620 --> 00:44:48,520
Misconceptions about blog posts.

949
00:44:48,520 --> 00:44:49,500
They never update.

950
00:44:52,720 --> 00:44:52,940
Yeah.

951
00:44:52,940 --> 00:44:55,760
I just solved your get or create
problem.

952
00:44:55,760 --> 00:44:57,940
You can just update all the time.

953
00:44:58,260 --> 00:45:01,200
Nikolay: It's not that easy because
we have many pre-processing

954
00:45:01,420 --> 00:45:05,460
stuff for really long articles
as yours, because you know, the

955
00:45:05,580 --> 00:45:10,460
LLM has usually has some limitation
OpenAI has like 8K tokens

956
00:45:10,840 --> 00:45:14,640
as input, and I'm sure your article
exceeds it a lot.

957
00:45:15,120 --> 00:45:15,620
So.

958
00:45:16,120 --> 00:45:17,780
Haki: I try to write shorter ones.

959
00:45:18,700 --> 00:45:19,200
Nikolay: Yeah.

960
00:45:20,060 --> 00:45:21,680
Probably you should have 2 versions.

961
00:45:21,980 --> 00:45:25,760
No, like for people who have issues
with long reads.

962
00:45:25,760 --> 00:45:25,940
Yeah.

963
00:45:25,940 --> 00:45:26,720
I do actually.

964
00:45:26,720 --> 00:45:28,420
Haki: You like blog posts in parts.

965
00:45:30,060 --> 00:45:33,840
Nikolay: I like short versions
till they are, but with some details.

966
00:45:35,460 --> 00:45:37,280
Michael: I'm a big fan of long
blog posts.

967
00:45:37,280 --> 00:45:39,440
I can, I know there's a lot of
others out there too?

968
00:45:39,440 --> 00:45:42,080
So thank you for continuing to
work for us.

969
00:45:42,260 --> 00:45:42,760
Nikolay: Yeah.

970
00:45:43,660 --> 00:45:47,920
It's something that I I'm going
to cite for sure.

971
00:45:48,280 --> 00:45:48,780
Yeah.

972
00:45:48,780 --> 00:45:51,060
Links will be used to this
article.

973
00:45:51,060 --> 00:45:51,560
Yeah.

974
00:45:51,780 --> 00:45:52,740
Thank you for this.

975
00:45:53,040 --> 00:45:54,720
Haki: I need an idea for the next
1.

976
00:45:54,720 --> 00:46:00,300
So if you have 1 ping me, I have
a working title in my mind.

977
00:46:01,500 --> 00:46:04,620
It's called how to take a dump
like a pro.

978
00:46:04,940 --> 00:46:06,540
I couldn't resist the title.

979
00:46:07,360 --> 00:46:10,180
Nikolay: That's a huge topic actually.

980
00:46:10,800 --> 00:46:11,040
Haki: Yeah.

981
00:46:11,040 --> 00:46:14,200
I've actually had some experience
with it lately and I think

982
00:46:14,200 --> 00:46:17,900
that it's Interesting to see
how you can optimize pg_dump with

983
00:46:17,900 --> 00:46:22,060
the different compression algorithms,
how they affect the CPU.

984
00:46:23,420 --> 00:46:29,700
And when dumping in parallel is
actually not useful at all.

985
00:46:30,860 --> 00:46:32,120
Nikolay: Unpartitioned tables.

986
00:46:33,460 --> 00:46:35,960
Haki: No, if you have 1 big table.

987
00:46:36,660 --> 00:46:40,100
Nikolay: Right, it's possible,
but you need to engineer yourself

988
00:46:40,200 --> 00:46:42,540
with snapshots and repeatable transactions.

989
00:46:44,240 --> 00:46:45,200
Haki: So that's common.

990
00:46:45,420 --> 00:46:47,020
Nikolay: And use ranges of IDs.

991
00:46:47,320 --> 00:46:51,760
This is what Patroni, for
example, does for logical replication

992
00:46:51,940 --> 00:46:52,440
initialization.

993
00:46:53,080 --> 00:46:54,740
So it's possible, for sure.

994
00:46:55,080 --> 00:46:57,540
Haki: Yeah, but I do have another
consideration that you didn't

995
00:46:57,540 --> 00:46:58,340
think about.

996
00:46:59,440 --> 00:47:04,600
Because 1 of my restrictions was
that I wanted to stream the

997
00:47:04,600 --> 00:47:10,740
output directly to S3 and not to
the local file system.

998
00:47:10,760 --> 00:47:12,900
Imagine that you run this in a
Lambda.

999
00:47:13,380 --> 00:47:18,820
In this case, using parallel is
problematic, but using a single

1000
00:47:18,820 --> 00:47:22,460
file, you can just stream it directly
and skip the file system.

1001
00:47:22,920 --> 00:47:29,120
And also if you want to pass the
dump through some kind of encryption,

1002
00:47:30,160 --> 00:47:33,760
then also doing things in parallel
makes it a bit difficult.

1003
00:47:34,500 --> 00:47:37,460
So yeah, different restrictions,
different solution.

1004
00:47:38,040 --> 00:47:41,080
Nikolay: When you do this, like
you, you find some like lacking

1005
00:47:41,200 --> 00:47:41,700
feature.

1006
00:47:42,560 --> 00:47:46,480
Do you have sometimes idea to implement
something?

1007
00:47:47,940 --> 00:47:48,300
Haki: Implement?

1008
00:47:48,300 --> 00:47:48,440
Nikolay: To

1009
00:47:48,440 --> 00:47:49,420
become a hacker.

1010
00:47:49,560 --> 00:47:50,640
To become a hacker.

1011
00:47:51,340 --> 00:47:54,200
Haki: Oh, what, like contribute
to Postgres?

1012
00:47:55,960 --> 00:47:58,780
Nikolay: Because if you, well,
dump lacks a lot of stuff all

1013
00:47:58,780 --> 00:48:01,840
the time, like, but it's super
complicated to add some things

1014
00:48:01,840 --> 00:48:04,460
there, but definitely it likes
a lot of stuff.

1015
00:48:05,140 --> 00:48:09,960
So I'm curious if you've had such
feeling, like idea, like, Oh,

1016
00:48:09,960 --> 00:48:12,080
I'm going to try to implement this.

1017
00:48:12,440 --> 00:48:12,900
No?

1018
00:48:12,900 --> 00:48:17,700
Haki: I've looked at the Postgres
code many, many times.

1019
00:48:18,160 --> 00:48:22,400
I think that the documentation
in the source itself is absolutely

1020
00:48:22,680 --> 00:48:23,180
excellent.

1021
00:48:23,940 --> 00:48:29,020
I remember when we worked on this,
hash indexes article, we looked

1022
00:48:29,020 --> 00:48:30,180
at the actual source file.

1023
00:48:30,180 --> 00:48:33,060
There's lots of very, very useful
information.

1024
00:48:33,420 --> 00:48:36,940
By the way, this information was
a lot more useful than anything

1025
00:48:36,940 --> 00:48:39,600
I could find online, including the
documentation itself.

1026
00:48:40,240 --> 00:48:42,040
So it was absolutely excellent.

1027
00:48:42,040 --> 00:48:44,180
And I think that the code is very
well written.

1028
00:48:44,600 --> 00:48:45,720
I'll be honest with you.

1029
00:48:45,720 --> 00:48:48,820
I don't think I'm capable enough
to contribute to Postgres itself.

1030
00:48:49,400 --> 00:48:54,440
But, you know, I do from time to
time think about my contribution,

1031
00:48:54,680 --> 00:48:57,100
my type of contribution to the
Postgres community.

1032
00:48:57,380 --> 00:49:00,600
And I know that the most straightforward
way to contribute to

1033
00:49:00,600 --> 00:49:07,940
Postgres is to contribute code,
but I feel like I contribute

1034
00:49:08,000 --> 00:49:13,840
in my way by advocating for Postgres
and educating people how

1035
00:49:13,840 --> 00:49:17,820
to use Postgres correctly because
I believe that if people would

1036
00:49:17,960 --> 00:49:21,800
have good systems, they'll be happy
with Postgres, they'll continue

1037
00:49:21,820 --> 00:49:25,760
using it, they'll recommend it
to their friends, and this is

1038
00:49:25,760 --> 00:49:26,940
how you build a community.

1039
00:49:27,360 --> 00:49:31,880
So I think that I'm more capable
in telling stories, educate

1040
00:49:31,880 --> 00:49:35,520
people about Postgres, working
on ETL processes and applications

1041
00:49:35,660 --> 00:49:39,380
that I do working on, you know,
the tooling, the internal stuff.

1042
00:49:39,800 --> 00:49:43,220
There are people much smarter than
me that can do this stuff.

1043
00:49:44,600 --> 00:49:45,980
I'll write about them.

1044
00:49:47,380 --> 00:49:48,140
Nikolay: Well, yeah.

1045
00:49:48,140 --> 00:49:52,580
Postgres documentation lacks how-to
recipes, and this is what

1046
00:49:52,580 --> 00:49:53,000
you do.

1047
00:49:53,000 --> 00:49:54,960
You describe some recipes and pros
and cons.

1048
00:49:54,960 --> 00:49:56,180
This is an excellent thing.

1049
00:49:56,180 --> 00:49:56,680
Haki: Yeah.

1050
00:49:57,040 --> 00:49:57,900
War stories.

1051
00:49:58,440 --> 00:50:00,900
Michael: You absolutely do all
those things, and we really appreciate

1052
00:50:00,900 --> 00:50:01,340
it, Haki.

1053
00:50:01,340 --> 00:50:02,840
Thank you so much for coming on
as well.

1054
00:50:02,840 --> 00:50:04,460
It's been a pleasure having you
here.

1055
00:50:04,640 --> 00:50:05,140
Haki: Okay.

1056
00:50:05,140 --> 00:50:06,720
Thank you, Nikolay, Michael.

1057
00:50:07,120 --> 00:50:08,000
It was a pleasure.

1058
00:50:08,000 --> 00:50:10,460
This is the first time for me in
a podcast.

1059
00:50:10,900 --> 00:50:11,980
Michael: But you did great.

1060
00:50:12,180 --> 00:50:13,400
Haki: Thank you so much.

1061
00:50:14,440 --> 00:50:14,820
Bye.

1062
00:50:14,820 --> 00:50:15,580
Nikolay: Thank you. Have a good week.

1063
00:50:16,700 --> 00:50:16,744
Michael: Take care.

1064
00:50:16,744 --> 00:50:16,766
Haki: Thanks.