1
00:00:00,060 --> 00:00:01,840
Nikolay: Hello, hello, this is
Postgres.FM.

2
00:00:01,840 --> 00:00:05,900
My name is Nikolay, Postgres.AI, and as usual my co-host is

3
00:00:05,900 --> 00:00:07,060
Michael, pgMustard.

4
00:00:07,360 --> 00:00:08,180
Hi Michael

5
00:00:08,840 --> 00:00:09,820
Michael: Hello Nikolay

6
00:00:10,640 --> 00:00:15,240
Nikolay: So there is a topic I
wanted to discuss and it's widely

7
00:00:15,240 --> 00:00:19,380
used by those projects which are
grown-up projects.

8
00:00:19,780 --> 00:00:24,500
Usually in tiny projects, in new
projects this topic is skipped.

9
00:00:25,040 --> 00:00:28,940
This topic is related to my favorite
area in Postgres administration.

10
00:00:30,040 --> 00:00:32,220
It's zero downtime changes.

11
00:00:32,980 --> 00:00:34,300
First of all, schema changes.

12
00:00:35,740 --> 00:00:39,560
We all know that it's a mistake
to create index without the word

13
00:00:39,560 --> 00:00:43,680
CONCURRENTLY live, because it's
not a live operation, not online

14
00:00:43,680 --> 00:00:44,180
operation.

15
00:00:44,380 --> 00:00:46,420
There's such term online operations.

16
00:00:47,080 --> 00:00:51,060
It means it's blocking other sessions
which deal with the same

17
00:00:51,060 --> 00:00:53,080
table and it's bad.

18
00:00:53,680 --> 00:00:57,540
While CREATE INDEX CONCURRENTLY
is good, if it succeeds it takes

19
00:00:58,180 --> 00:01:02,700
roughly 2 times longer because
it scans table twice, but if it

20
00:01:02,700 --> 00:01:05,940
succeeds, we have an index and
nobody is blocked, which is great.

21
00:01:06,260 --> 00:01:09,860
And we discussed many times that
there are many changes which

22
00:01:09,860 --> 00:01:10,580
are tricky.

23
00:01:10,920 --> 00:01:15,360
For example, if you just add a
column, you can block others if

24
00:01:15,360 --> 00:01:18,480
you are blocked by, for example,
a vacuum running in transaction

25
00:01:18,480 --> 00:01:19,860
ID wraparound mode.

26
00:01:20,800 --> 00:01:25,520
And in many cases, there are tricks
how to, instead of single

27
00:01:25,520 --> 00:01:29,560
step, do multiple steps, but achieve
the same result in truly

28
00:01:29,600 --> 00:01:30,640
online fashion.

29
00:01:31,500 --> 00:01:34,260
And Let's talk about some details
here.

30
00:01:34,640 --> 00:01:40,400
First of all, how we introduce
new constraints on existing columns.

31
00:01:40,920 --> 00:01:43,280
First of all, check constraints
on foreign keys.

32
00:01:43,820 --> 00:01:47,580
Michael: Yeah, should we talk about
if we didn't have this feature,

33
00:01:47,580 --> 00:01:50,280
so if we didn't have the ability
to do it in multiple steps,

34
00:01:50,280 --> 00:01:53,420
if we couldn't set it NOT VALID
first, what would we do?

35
00:01:53,420 --> 00:01:55,400
Or would we live without the constraint?

36
00:01:55,720 --> 00:01:59,940
I think it's worth discussing what
kind of lock is taken and

37
00:02:00,360 --> 00:02:02,740
why we need this feature or why
this feature was implemented

38
00:02:02,740 --> 00:02:03,740
in the first place.

39
00:02:04,740 --> 00:02:06,540
Nikolay: Yeah, let's talk about
this.

40
00:02:06,820 --> 00:02:11,540
If we imagine that we have already
a column and we want to change

41
00:02:11,540 --> 00:02:12,740
something about it.

42
00:02:12,740 --> 00:02:17,040
For example, we want to change
default, it's easy because default

43
00:02:17,040 --> 00:02:18,660
is only about the future.

44
00:02:18,820 --> 00:02:23,780
Postgres doesn't need to check the
existing rows, doesn't need to

45
00:02:23,780 --> 00:02:28,600
change anything in existing rows,
it's only some metadata change.

46
00:02:29,160 --> 00:02:34,400
So, ALTER TABLE, ALTER COLUMN
set default, super fast, unless

47
00:02:34,700 --> 00:02:36,100
we acquired a lock.

48
00:02:37,080 --> 00:02:41,100
This topic is important almost
to any alter.

49
00:02:41,480 --> 00:02:43,540
It's important, lock acquisition.

50
00:02:44,040 --> 00:02:47,060
In some cases, you need exclusive
lock acquisition.

51
00:02:47,080 --> 00:02:49,080
In some cases, also exclusive.

52
00:02:49,080 --> 00:02:50,420
But I remember differences.

53
00:02:50,500 --> 00:02:54,460
For example, if you alter table,
adding a table to publication,

54
00:02:54,960 --> 00:02:59,020
to enable publication with logical
replication, there it's not

55
00:02:59,340 --> 00:03:00,100
that bad.

56
00:03:00,520 --> 00:03:05,040
But if we alter table, alter column,
set default, it's quite

57
00:03:05,040 --> 00:03:05,460
bad.

58
00:03:05,460 --> 00:03:07,380
It's fast, but you need a lock.

59
00:03:07,640 --> 00:03:10,120
If you cannot acquire a lock, you
are blocking others.

60
00:03:10,520 --> 00:03:12,180
But this is a simple example.

61
00:03:12,660 --> 00:03:16,380
It's super fast, nothing to worry
about in terms of this operation

62
00:03:16,380 --> 00:03:17,240
takes long.

63
00:03:17,320 --> 00:03:20,380
Lock acquisition might take long,
right?

64
00:03:21,040 --> 00:03:22,500
We need to take care of it.

65
00:03:22,500 --> 00:03:27,520
We need to have a low lock timeout
and retry logic, be ready to

66
00:03:27,520 --> 00:03:28,140
do it.

67
00:03:28,140 --> 00:03:32,680
Again, I'm discussing here, my
comment in the beginning was specifically

68
00:03:32,800 --> 00:03:36,600
we talk about projects which are
grown already and they have

69
00:03:36,600 --> 00:03:39,640
significant load, say more than
1000 TPS.

70
00:03:40,020 --> 00:03:43,340
Other projects also experience
it but they just don't notice

71
00:03:43,340 --> 00:03:44,940
because traffic is too low.

72
00:03:45,680 --> 00:03:46,180
Right.

73
00:03:47,020 --> 00:03:49,720
Michael: Yeah, and we did a whole
episode on the general case

74
00:03:49,720 --> 00:03:50,220
here.

75
00:03:50,280 --> 00:03:54,380
I think we did one on zero downtime
migrations, which I think we went

76
00:03:54,380 --> 00:03:57,040
into detail on the general case,
didn't we?

77
00:03:57,040 --> 00:03:58,160
Nikolay: Yeah, I agree.

78
00:03:58,260 --> 00:04:00,980
If you are interested in this topic,
you listen to this episode.

79
00:04:01,080 --> 00:04:04,640
Please listen to that episode as
well because today we are going

80
00:04:04,640 --> 00:04:09,520
to go deeper in a specific narrow
area, which is not that narrow.

81
00:04:11,000 --> 00:04:15,460
If you take any project, take three
years of its lifetime, definitely

82
00:04:15,460 --> 00:04:16,760
these things happen.

83
00:04:18,380 --> 00:04:20,900
Adjustment of default is relatively
easy.

84
00:04:20,900 --> 00:04:24,360
We need just to remember about
this lock acquisition issue.

85
00:04:25,080 --> 00:04:29,200
But if we go further, for example,
we created a table.

86
00:04:29,200 --> 00:04:29,880
It has...

87
00:04:30,980 --> 00:04:32,820
Oh, by the way, another comment.

88
00:04:33,080 --> 00:04:39,660
Folks who just prefer JSON, JSONB,
or very flexible schema would

89
00:04:39,660 --> 00:04:41,320
say, it doesn't matter.

90
00:04:41,320 --> 00:04:45,220
We keep everything in text, no
additional constraints.

91
00:04:45,300 --> 00:04:48,660
Let's application take care of
everything.

92
00:04:49,220 --> 00:04:53,980
And I understand this position,
but I fully disagree with it

93
00:04:53,980 --> 00:04:58,000
because I think database is the
only, like database constraints

94
00:05:00,080 --> 00:05:04,840
is very powerful mechanism to ensure
the data is clean and high

95
00:05:04,840 --> 00:05:05,340
quality.

96
00:05:05,900 --> 00:05:06,400
Right?

97
00:05:06,600 --> 00:05:07,400
So we...

98
00:05:07,900 --> 00:05:10,940
Michael: And I think there are
even extensions that help you

99
00:05:11,040 --> 00:05:15,380
validate schemas even within JSONB
columns.

100
00:05:15,540 --> 00:05:15,780
Right.

101
00:05:15,780 --> 00:05:17,540
And I think they're using constraint.

102
00:05:18,080 --> 00:05:22,080
Nikolay: But anyway, constraint
capabilities in database system,

103
00:05:22,580 --> 00:05:23,540
it's very powerful.

104
00:05:24,240 --> 00:05:28,980
And when you shift these to application
logic, you should be

105
00:05:28,980 --> 00:05:32,440
100% sure that in future you don't
write second application which

106
00:05:32,440 --> 00:05:36,220
deals with the same database or
different parts of it in different

107
00:05:36,220 --> 00:05:36,720
code.

108
00:05:37,060 --> 00:05:43,520
Or don't use some UIs which can
allow people to change data and

109
00:05:43,520 --> 00:05:46,600
bypass those checks which are left
in your application.

110
00:05:47,660 --> 00:05:52,200
So Constraint validation should
be done closer to data, and this

111
00:05:52,200 --> 00:05:53,320
is database system.

112
00:05:53,760 --> 00:05:55,360
Okay, default, easy.

113
00:05:55,580 --> 00:06:00,460
Next, for example, let's jump to
the main case.

114
00:06:00,460 --> 00:06:01,660
We have not null.

115
00:06:02,540 --> 00:06:09,160
Not null is a tricky constraint
because, in my opinion, it should

116
00:06:09,160 --> 00:06:09,900
not exist.

117
00:06:10,080 --> 00:06:12,840
Because we have check constraint,
which is much more powerful,

118
00:06:13,580 --> 00:06:16,400
it's a superset of not null.

119
00:06:17,140 --> 00:06:20,600
It includes it because we can say,
check, this column is not

120
00:06:20,600 --> 00:06:20,900
null.

121
00:06:20,900 --> 00:06:21,600
That's it.

122
00:06:22,280 --> 00:06:22,780
Easy.

123
00:06:23,180 --> 00:06:28,480
But is not null also, it exists
because it's SQL standard.

124
00:06:29,140 --> 00:06:34,520
I think it was created before check
constraint was added to standard.

125
00:06:34,620 --> 00:06:35,380
I guess.

126
00:06:35,460 --> 00:06:36,140
It's my guess.

127
00:06:36,140 --> 00:06:37,220
I might be wrong.

128
00:06:37,280 --> 00:06:39,640
Michael: I'm not even sure, check
constraint is definitely in

129
00:06:39,640 --> 00:06:40,140
standard.

130
00:06:40,460 --> 00:06:41,960
Nikolay: Check constraint should
be in standard.

131
00:06:42,180 --> 00:06:42,680
Michael: Okay.

132
00:06:42,740 --> 00:06:45,860
Nikolay: I'm not 100% sure, but
it would be very strange if not.

133
00:06:46,220 --> 00:06:49,820
But it's obviously, like, why do
we have not null constraints

134
00:06:49,820 --> 00:06:50,160
at all?

135
00:06:50,160 --> 00:06:51,640
We have check constraints, right?

136
00:06:52,300 --> 00:06:55,460
And interesting that if you define
a primary key, it requires

137
00:06:55,640 --> 00:06:56,880
a not null constraint.

138
00:06:58,080 --> 00:06:59,080
It needs it.

139
00:06:59,380 --> 00:07:00,320
It requires 2 things.

140
00:07:00,320 --> 00:07:01,040
Michael: It's inherent.

141
00:07:01,240 --> 00:07:01,640
Nikolay: Yeah.

142
00:07:01,640 --> 00:07:03,540
Unique constraint and not null
constraint.

143
00:07:04,300 --> 00:07:08,540
Fortunately, since Postgres 12
optimizations, we can forget about

144
00:07:08,540 --> 00:07:13,000
this distinction and we can build
the check constraint with not

145
00:07:13,000 --> 00:07:17,420
null and implicitly not null actual
constraint will be built

146
00:07:17,420 --> 00:07:20,540
efficiently on top of our check
constraint and primary key will

147
00:07:20,540 --> 00:07:21,300
use it.

148
00:07:21,660 --> 00:07:23,540
But some implementation detail,
right?

149
00:07:23,680 --> 00:07:25,220
Before 12 it was nightmare.

150
00:07:26,040 --> 00:07:30,700
I remember discussing with very
experienced DBAs, discussing

151
00:07:31,000 --> 00:07:33,140
how we can introduce not null constraints.

152
00:07:34,000 --> 00:07:36,720
Not null doesn't support this two-phase
approach we discussed

153
00:07:37,000 --> 00:07:37,500
today.

154
00:07:38,300 --> 00:07:42,240
You cannot define not null constraint,
first NOT VALID and then

155
00:07:42,240 --> 00:07:42,740
validate.

156
00:07:43,620 --> 00:07:44,560
It's not possible.

157
00:07:45,220 --> 00:07:49,700
Before 12, it always led to full
table scan.

158
00:07:50,420 --> 00:07:54,180
I remember our experience DBA said,
you know what, between us,

159
00:07:54,400 --> 00:07:59,880
we just adjust system catalogs
because we know there are no records

160
00:08:00,480 --> 00:08:05,100
violating these constraints, so
we accept this, we just do it.

161
00:08:05,380 --> 00:08:08,040
Seems like Postgres 8.0 or something.

162
00:08:09,920 --> 00:08:12,940
But yeah, this is an ugly hack
actually.

163
00:08:14,180 --> 00:08:15,220
Risky if you don't know

164
00:08:15,220 --> 00:08:16,560
Michael: exactly what you're doing.

165
00:08:16,560 --> 00:08:19,900
Nikolay: Well, it seems like not
super ugly, not super risky,

166
00:08:20,140 --> 00:08:23,720
but fortunately since Postgres
12 we don't need it and we can

167
00:08:23,720 --> 00:08:28,620
define check constraint in this
two-phased manner and then not

168
00:08:28,620 --> 00:08:33,180
null constraint will be implicitly
defined very fast without

169
00:08:33,180 --> 00:08:34,320
full table scan.

170
00:08:34,320 --> 00:08:39,600
So, for example, we create check
constraint, we define some logic

171
00:08:39,600 --> 00:08:40,160
in this.

172
00:08:40,160 --> 00:08:42,620
For example, not null, or something
else.

173
00:08:42,620 --> 00:08:47,960
For example, we can say all integer
values in this column should

174
00:08:47,960 --> 00:08:48,660
be positive.

175
00:08:49,460 --> 00:08:51,100
Michael: Yeah, well it's super
flexible isn't it?

176
00:08:51,100 --> 00:08:55,840
You can pretty much define whatever
you want via any query via

177
00:08:55,840 --> 00:08:56,580
a check constraint.

178
00:08:56,580 --> 00:08:56,720
It's

179
00:08:56,720 --> 00:08:57,440
Nikolay: really cool.

180
00:08:57,440 --> 00:08:58,480
Not any query unfortunately.

181
00:08:58,820 --> 00:09:00,040
There is a huge limitation.

182
00:09:00,480 --> 00:09:03,040
Michael: For multiple tables you
can talk about columns?

183
00:09:04,020 --> 00:09:07,000
Nikolay: You need to stay inside
the same table context, you

184
00:09:07,000 --> 00:09:08,900
cannot refer other tables, unfortunately.

185
00:09:09,520 --> 00:09:10,980
Michael: But it's pretty powerful,
right?

186
00:09:10,980 --> 00:09:11,680
Nikolay: Of course.

187
00:09:12,740 --> 00:09:15,360
Michael: You can reference multiple
columns within the same table.

188
00:09:15,580 --> 00:09:19,460
But yeah, the normal use cases
are like, this number cannot be

189
00:09:19,500 --> 00:09:21,820
negative or greater

190
00:09:21,820 --> 00:09:22,780
Nikolay: than or odd.

191
00:09:23,100 --> 00:09:24,100
Or something.

192
00:09:24,280 --> 00:09:24,960
This number should

193
00:09:24,960 --> 00:09:25,080
Michael: be proper.

194
00:09:25,080 --> 00:09:29,640
Yeah, or, I've not seen that in
the wild, but yeah, or certain,

195
00:09:30,040 --> 00:09:32,140
Maybe like a maximum value or something.

196
00:09:32,560 --> 00:09:32,680
Nikolay: Yeah.

197
00:09:32,680 --> 00:09:36,900
Yeah, ranges for many, many things
can be, can like make, it's

198
00:09:36,900 --> 00:09:37,360
good.

199
00:09:37,360 --> 00:09:37,752
It's good.

200
00:09:37,752 --> 00:09:41,980
It's good practice because you,
this is how you ensure data has

201
00:09:41,980 --> 00:09:42,680
high quality.

202
00:09:43,440 --> 00:09:43,940
Yes.

203
00:09:44,860 --> 00:09:45,900
And that's it.

204
00:09:45,900 --> 00:09:51,060
But imagine we have a billion rows
table, which should not happen.

205
00:09:51,060 --> 00:09:54,940
You should have partitioning in
place and don't allow it to grow

206
00:09:55,600 --> 00:09:58,600
beyond a few dozens of millions
of rows maybe.

207
00:09:58,940 --> 00:10:00,480
But it happens all the time.

208
00:10:00,480 --> 00:10:05,720
In grown-up projects, I see it
as a billion-plus rows all the

209
00:10:05,720 --> 00:10:06,220
time.

210
00:10:07,840 --> 00:10:10,880
And we define this constraint and
how Postgres...

211
00:10:11,540 --> 00:10:15,640
If we just say alter blah blah
blah, once a lock is acquired,

212
00:10:16,120 --> 00:10:20,740
to adjust metadata, system catalogs,
Postgres acquires lock, and

213
00:10:20,740 --> 00:10:25,100
then what it does, of course it
goes and checks existing rows

214
00:10:25,520 --> 00:10:27,740
to see if there is any violation.

215
00:10:28,580 --> 00:10:30,480
If there is, it will roll back.

216
00:10:31,060 --> 00:10:34,540
And this is super bad because for
1 billion rows, it takes a

217
00:10:34,540 --> 00:10:35,340
lot of time.

218
00:10:35,380 --> 00:10:35,920
Michael: Full table

219
00:10:35,920 --> 00:10:36,540
scale.

220
00:10:36,660 --> 00:10:40,240
And while it's doing that, it's
blocking any new writes.

221
00:10:40,240 --> 00:10:40,740
Exactly.

222
00:10:41,680 --> 00:10:44,250
Nikolay: Not only writes, SELECTs
as well, because it's an algorithm.

223
00:10:44,250 --> 00:10:44,940
Yeah, true.

224
00:10:45,540 --> 00:10:45,660
Yeah.

225
00:10:45,660 --> 00:10:46,940
So it's super bad.

226
00:10:47,040 --> 00:10:52,000
This is where some engineers have
feeling like...

227
00:10:52,280 --> 00:10:55,100
They promised us that SELECTs are
not blocked.

228
00:10:55,240 --> 00:10:56,400
What's happening here?

229
00:10:57,600 --> 00:11:01,260
Well, they are not blocked by writes,
regular writes, but you

230
00:11:01,260 --> 00:11:03,940
are changing the rules right now.

231
00:11:03,940 --> 00:11:07,800
Rules, I mean, you're changing
the schema.

232
00:11:08,560 --> 00:11:11,100
So this is not simple, right?

233
00:11:11,640 --> 00:11:15,660
So you acquire the lock, you start
reading this.

234
00:11:15,660 --> 00:11:20,080
It can take a lot, like minutes,
maybe more, depends on disk

235
00:11:20,080 --> 00:11:22,540
capabilities and actual table size.

236
00:11:23,200 --> 00:11:24,440
Even SELECTs are blocked.

237
00:11:24,440 --> 00:11:25,200
This is super bad.

238
00:11:25,200 --> 00:11:29,280
That's why we have for CHECK constraints,
not for NOT NULL.

239
00:11:30,060 --> 00:11:31,320
For NOT NULL, we don't have it.

240
00:11:31,320 --> 00:11:35,960
For CHECK constraints, we have
the ability to say, NOT VALID.

241
00:11:36,580 --> 00:11:38,240
We say, ALTER TABLE, blah, blah,
blah.

242
00:11:38,240 --> 00:11:42,700
We say, add constraint, and then
we specify NOT VALID.

243
00:11:42,700 --> 00:11:43,080
Flag.

244
00:11:43,080 --> 00:11:43,880
It's a flag.

245
00:11:44,040 --> 00:11:45,780
By the way, terrible naming.

246
00:11:46,720 --> 00:11:47,220
Michael: Yeah.

247
00:11:47,280 --> 00:11:48,000
Or confusing.

248
00:11:48,400 --> 00:11:51,960
If you had to guess what it was
without knowing, I think you

249
00:11:51,960 --> 00:11:53,820
wouldn't get the details right.

250
00:11:53,900 --> 00:11:54,400
Nikolay: Right.

251
00:11:54,520 --> 00:11:55,660
Why is it terrible?

252
00:11:55,660 --> 00:12:00,400
Because once this transaction is
fast because it doesn't check

253
00:12:00,400 --> 00:12:02,540
the existing volume of data at
all.

254
00:12:02,680 --> 00:12:05,940
It just defines this constraint
as it was with...

255
00:12:06,460 --> 00:12:09,100
As the table was very small, like
empty.

256
00:12:09,480 --> 00:12:09,960
Super fast.

257
00:12:09,960 --> 00:12:11,280
Just Lock acquisition is needed.

258
00:12:11,280 --> 00:12:13,940
Again, we need lock_timeout and
retries.

259
00:12:14,600 --> 00:12:15,520
This is the rule.

260
00:12:16,380 --> 00:12:18,220
Michael: I think that's not the
bit that would have confused

261
00:12:18,220 --> 00:12:18,340
me.

262
00:12:18,340 --> 00:12:21,400
The thing that would have confused
me is that it will be checking

263
00:12:21,420 --> 00:12:23,400
new rows and updating.

264
00:12:23,400 --> 00:12:24,840
Exactly, yes.

265
00:12:24,840 --> 00:12:26,540
That's what's confusing, right?

266
00:12:27,920 --> 00:12:29,620
Nikolay: You, okay, you brought
it before.

267
00:12:29,860 --> 00:12:30,580
It's okay.

268
00:12:31,020 --> 00:12:36,540
Yeah, but I mean, mechanics is
like, we create constraint like

269
00:12:36,820 --> 00:12:38,860
table's empty, and that's it.

270
00:12:39,100 --> 00:12:43,140
And it's marked NOT VALID in table
definition if you use backslash

271
00:12:43,140 --> 00:12:46,000
D in psql, and you say, oh,
This is NOT VALID constraint,

272
00:12:46,000 --> 00:12:46,860
I don't care.

273
00:12:46,960 --> 00:12:53,600
And I remember I was preparing
the first complex transformation

274
00:12:53,880 --> 00:12:58,480
of 4-byte primary key to 8-byte
primary key.

275
00:12:59,440 --> 00:13:04,520
And it was a very complex operation,
like 7 steps.

276
00:13:05,220 --> 00:13:06,440
Each step was reversible.

277
00:13:07,640 --> 00:13:11,680
And I remember for the sake of
reversibility, it was not about

278
00:13:11,680 --> 00:13:14,680
check constraints, it was about
foreign keys.

279
00:13:14,680 --> 00:13:15,760
It's quite similar there.

280
00:13:15,760 --> 00:13:18,420
You can also say NOT VALID for
foreign key.

281
00:13:18,960 --> 00:13:23,980
And I left NOT VALID foreign keys
for some like shadow table

282
00:13:24,400 --> 00:13:25,940
or old table or something.

283
00:13:27,040 --> 00:13:30,780
And I thought they won't be used.

284
00:13:30,940 --> 00:13:34,660
It was huge mistake, led to some
bug.

285
00:13:35,660 --> 00:13:38,560
Unfortunately, I quickly realized,
okay, NOT VALID doesn't mean

286
00:13:38,560 --> 00:13:39,780
it's not working.

287
00:13:40,600 --> 00:13:41,980
It's still working.

288
00:13:42,040 --> 00:13:49,400
It means that it's applied to all
new writes, INSERTs and UPDATEs,

289
00:13:51,060 --> 00:13:51,800
not DELETEs.

290
00:13:52,360 --> 00:13:54,860
In the case of CHECK constraint
DELETEs don't matter, right,

291
00:13:54,860 --> 00:13:56,680
because we don't need to check.

292
00:13:57,260 --> 00:13:57,760
Yeah.

293
00:13:58,500 --> 00:13:59,000
Yeah.

294
00:14:01,240 --> 00:14:02,860
But in the case of foreign keys,
DELETEs also matter, right?

295
00:14:03,180 --> 00:14:03,920
Yeah, true.

296
00:14:04,540 --> 00:14:09,560
Yeah, because if you DELETE the
referenced row...

297
00:14:11,780 --> 00:14:12,940
Michael: Referencing row.

298
00:14:12,980 --> 00:14:17,300
Nikolay: Yeah, referencing row
becomes an orphan if...

299
00:14:18,140 --> 00:14:24,260
Yeah, so for all new writes, let's
say writes, like, aggregate

300
00:14:24,440 --> 00:14:26,760
all changes of data here.

301
00:14:27,120 --> 00:14:28,400
And if you have...

302
00:14:29,180 --> 00:14:33,280
You define check constraint, you
try to insert, if you have,

303
00:14:33,280 --> 00:14:36,240
for example, let's say we have
not null.

304
00:14:36,420 --> 00:14:42,140
You try to insert null, it says,
okay, violation of NOT VALID

305
00:14:42,140 --> 00:14:42,640
constraint.

306
00:14:43,440 --> 00:14:44,520
How does it sound?

307
00:14:46,560 --> 00:14:47,800
It sounds not good.

308
00:14:48,100 --> 00:14:49,000
It's confusing.

309
00:14:50,020 --> 00:14:52,860
So I think a better name could
be...

310
00:14:54,000 --> 00:14:57,180
I had an idea yesterday we were
discussing with 1 client.

311
00:14:57,580 --> 00:14:58,220
I forgot.

312
00:14:59,540 --> 00:15:00,460
What would be better?

313
00:15:00,460 --> 00:15:02,060
Not verified maybe, right?

314
00:15:03,960 --> 00:15:04,700
Not verified.

315
00:15:05,200 --> 00:15:06,260
Michael: Yeah, it's better.

316
00:15:06,760 --> 00:15:09,440
Naming things is notoriously difficult,
right?

317
00:15:11,040 --> 00:15:13,680
But yeah, we've got this now and
we're stuck with it.

318
00:15:13,680 --> 00:15:17,780
I think that distinction between
the types of rights is quite

319
00:15:17,780 --> 00:15:18,960
important though.

320
00:15:19,540 --> 00:15:24,440
I think I would then have gone
away and thought, oh, if I change

321
00:15:24,440 --> 00:15:27,480
that column, it will validate it.

322
00:15:27,580 --> 00:15:30,720
But I wouldn't have realized that
if I changed a

323
00:15:30,720 --> 00:15:31,820
Nikolay: different column.

324
00:15:31,860 --> 00:15:33,380
Or no columns at all.

325
00:15:33,840 --> 00:15:39,060
As you remember, my favorite example,
update setId equals id

326
00:15:39,060 --> 00:15:42,760
doing nothing logically, but physically
it produces a new tuple.

327
00:15:43,140 --> 00:15:46,880
A new tuple is also check against
this NOT VALID constraint,

328
00:15:47,620 --> 00:15:48,260
Michael: and it

329
00:15:48,260 --> 00:15:49,660
Nikolay: complains violation.

330
00:15:51,100 --> 00:15:54,620
Michael: Yeah, so in case that
wasn't clear, if that tuple has

331
00:15:55,080 --> 00:15:59,120
a violation in it, and because
we haven't validated yet, it didn't

332
00:15:59,120 --> 00:16:04,080
fail, then it tries to write that
tuple again with the invalid

333
00:16:04,300 --> 00:16:04,760
column.

334
00:16:04,760 --> 00:16:08,240
So yeah, not verified feels good,
actually, yeah, that makes

335
00:16:08,240 --> 00:16:08,740
sense.

336
00:16:08,800 --> 00:16:13,260
But I don't think the side effects
of this, leaving a constraint

337
00:16:13,260 --> 00:16:16,600
in a NOT VALID state, are obvious.

338
00:16:17,180 --> 00:16:21,800
Or at least not to me when I first
came across it.

339
00:16:22,060 --> 00:16:27,260
Nikolay: But despite naming, I
think the idea that it's already

340
00:16:27,260 --> 00:16:31,440
working for new rights is great
because it gives you opportunity

341
00:16:31,720 --> 00:16:38,600
to say, okay, we now are covered
by all future rights, all future

342
00:16:38,600 --> 00:16:39,100
data.

343
00:16:39,480 --> 00:16:41,980
Now let's take care of existing
data.

344
00:16:45,560 --> 00:16:46,560
How do we do this?

345
00:16:46,560 --> 00:16:49,040
First of all, we need to...

346
00:16:49,700 --> 00:16:53,540
I would not go straight to AlTER TABLE
VALIDATE CONSTRAINT, which

347
00:16:53,940 --> 00:16:57,780
scans the table in non-blocking
way.

348
00:16:57,780 --> 00:16:58,760
This is the key.

349
00:16:58,860 --> 00:17:00,200
It doesn't block anyone.

350
00:17:00,200 --> 00:17:01,520
It just scans it.

351
00:17:02,080 --> 00:17:06,500
And if everything is clear, it
updates system catalogs.

352
00:17:06,500 --> 00:17:09,140
Again, I think lock will be needed
maybe, right?

353
00:17:09,140 --> 00:17:10,520
But only in the end.

354
00:17:10,520 --> 00:17:13,220
I'm not sure if Retrialogic is
inside this.

355
00:17:13,280 --> 00:17:13,860
It's an interesting question.

356
00:17:13,860 --> 00:17:14,020
I

357
00:17:14,020 --> 00:17:14,940
Michael: looked it up.

358
00:17:15,180 --> 00:17:20,560
There's a really good note in the
ALTER TABLE docs, and it says

359
00:17:20,800 --> 00:17:24,280
validation acquires only a share
update exclusive lock for the

360
00:17:24,280 --> 00:17:27,540
table being altered for like a
check constraint, and there is

361
00:17:27,540 --> 00:17:31,660
an additional lock when it's a
foreign key on the...

362
00:17:33,080 --> 00:17:35,780
If a constraint is a foreign key,
then a row share lock is also

363
00:17:35,800 --> 00:17:38,540
required on the table referenced
by the constraint.

364
00:17:39,720 --> 00:17:40,580
Nikolay: Which makes sense.

365
00:17:40,840 --> 00:17:42,980
Yeah, yeah, yeah.

366
00:17:42,980 --> 00:17:47,140
It might fail, for example, again,
if autovacuum is running transaction

367
00:17:47,360 --> 00:17:51,800
ID wraparound prevention mode,
I think it can be blocked by it.

368
00:17:53,640 --> 00:17:55,900
Fortunately, we can retry it multiple
times.

369
00:17:55,920 --> 00:18:02,020
Yes, it will read the whole table,
quiet data, disk I/O intensive

370
00:18:02,020 --> 00:18:04,840
operation, but it's already much
better.

371
00:18:05,240 --> 00:18:06,980
The key is we don't block anyone.

372
00:18:07,360 --> 00:18:08,860
If we succeed, it's great.

373
00:18:08,860 --> 00:18:13,840
But we might not succeed if there
are nulls, or if it's different

374
00:18:13,860 --> 00:18:16,940
check constraint, if there are
rows which violated.

375
00:18:17,520 --> 00:18:21,440
In this case, before doing this
second step, there should be

376
00:18:21,440 --> 00:18:22,840
additional step in between.

377
00:18:22,840 --> 00:18:25,520
We should take care of violations
ourselves.

378
00:18:25,640 --> 00:18:29,440
Either we need to delete those
rows or update them, setting value,

379
00:18:29,700 --> 00:18:30,200
right?

380
00:18:30,720 --> 00:18:33,160
Michael: Look, question, why not
do that before

381
00:18:33,480 --> 00:18:35,640
Nikolay: the setting?

382
00:18:37,200 --> 00:18:38,980
Yeah, good question.

383
00:18:39,520 --> 00:18:45,300
And I think if we think like, since
NOT VALID constraint already

384
00:18:45,300 --> 00:18:50,100
covers all future rights, I would
prefer first introduce constraint,

385
00:18:50,280 --> 00:18:56,520
then clean up, taking care of violations
in one way or another.

386
00:18:56,520 --> 00:18:57,680
It's very domain-specific.

387
00:18:59,440 --> 00:19:01,760
And then VALIDATE CONSTRAINT.

388
00:19:02,440 --> 00:19:02,800
Why?

389
00:19:02,800 --> 00:19:09,640
Because in this way we don't introduce
any gaps where new violations

390
00:19:09,720 --> 00:19:12,180
can be added.

391
00:19:13,120 --> 00:19:17,760
If we move the step of cleanup
before, like if we have it as

392
00:19:17,760 --> 00:19:23,260
the very first step, in this case,
what is like again, depending

393
00:19:23,300 --> 00:19:27,800
on the logic of application, but
in general case, after we're

394
00:19:27,800 --> 00:19:34,000
done, and before we start creating
constraint, some new rights

395
00:19:34,000 --> 00:19:38,180
might happen and violate it once
again right and this is bad.

396
00:19:38,180 --> 00:19:42,400
Michael: So yeah I guess it depends
how many we've got but I

397
00:19:42,400 --> 00:19:45,400
would still like if we've got loads
then we're gonna have we're

398
00:19:45,400 --> 00:19:49,280
gonna end up with a NOT VALID constraint
for quite a while while

399
00:19:49,280 --> 00:19:51,100
we clean up quite a lot of data.

400
00:19:51,480 --> 00:19:56,500
My feeling would be, I guess both
might be the solution.

401
00:19:57,980 --> 00:20:01,720
Get an idea of how many you've
got in your existing data, work

402
00:20:01,720 --> 00:20:05,680
out a plan for fixing those, deal
with the ones that you knew

403
00:20:05,680 --> 00:20:06,897
about at that time.

404
00:20:06,897 --> 00:20:11,520
So, there's some snapshot from
yesterday, we checked which rows

405
00:20:11,520 --> 00:20:14,640
were in violation, we've got this
script to fix them all up,

406
00:20:14,640 --> 00:20:15,640
we've run that.

407
00:20:16,100 --> 00:20:19,080
Now we check, maybe we check again,
see if there's any new ones,

408
00:20:19,080 --> 00:20:21,300
or maybe then you flip it to NOT VALID,
because at least you're

409
00:20:21,300 --> 00:20:25,680
only going to have a day's worth
of, or however long it takes

410
00:20:25,680 --> 00:20:28,300
your company or your organization
to deal with those.

411
00:20:28,660 --> 00:20:31,760
So it could be that you do it before
and then in between.

412
00:20:32,380 --> 00:20:36,000
Nikolay: Yes, unless we don't want
to scan the table too many

413
00:20:36,000 --> 00:20:41,400
times, because we already need,
when we do it, to clean up.

414
00:20:42,180 --> 00:20:47,080
And also this VALIDATE CONSTRAINT
step, the final step, also

415
00:20:47,080 --> 00:20:48,140
needs to do it.

416
00:20:48,600 --> 00:20:52,100
But maybe we should do it before,
like this very first step,

417
00:20:52,120 --> 00:20:55,360
then introduce constraint, do it
again, and then validate.

418
00:20:55,520 --> 00:20:58,380
Maybe this is the ideal situation.

419
00:20:58,520 --> 00:21:04,840
But regardless of that, if we just
blindly say, okay, we introduce

420
00:21:04,840 --> 00:21:10,340
constraint, then clean up, then
validate, this may lead to not

421
00:21:10,340 --> 00:21:13,620
understanding that NOT VALID constraint
is already applied to

422
00:21:13,620 --> 00:21:14,200
new writes.

423
00:21:14,200 --> 00:21:17,000
This may lead to late code fixes.

424
00:21:17,040 --> 00:21:23,240
And if code, application code or
whatever, continues to perform

425
00:21:23,240 --> 00:21:26,860
new writes which violate constraint,
this is a problem because

426
00:21:27,380 --> 00:21:29,840
it will bring more and more and more,
right?

427
00:21:30,040 --> 00:21:31,580
But there is another problem also.

428
00:21:32,120 --> 00:21:33,420
Exactly what you brought.

429
00:21:33,760 --> 00:21:37,060
Imagine we perform an update, changing
the other column.

430
00:21:37,060 --> 00:21:43,500
For example, if it's some post
or something, we just edit title.

431
00:21:43,700 --> 00:21:47,440
And then there is some additional
column which we want to be

432
00:21:47,440 --> 00:21:48,160
not NULL.

433
00:21:48,740 --> 00:21:52,240
We had some NULLs, we want to convert
them to, I don't know,

434
00:21:52,240 --> 00:21:56,980
false or 0 or something, because
we finally realized we don't

435
00:21:56,980 --> 00:22:00,080
consider NULLs as zeros or empty
strings.

436
00:22:00,820 --> 00:22:04,700
And then we have a bunch of rows
which still have NULLs, and

437
00:22:04,700 --> 00:22:06,240
we already introduced constraint.

438
00:22:06,580 --> 00:22:10,900
Then user goes and tries to, for
example, edit the title, and

439
00:22:10,900 --> 00:22:14,840
cannot because constraint is already
working.

440
00:22:15,420 --> 00:22:19,620
During editing a title, Postgres,
as we know, in MVCC, Postgres

441
00:22:19,780 --> 00:22:23,460
copies the whole row, it's a physical
row, it's called tuple,

442
00:22:24,020 --> 00:22:24,520
right?

443
00:22:25,600 --> 00:22:29,420
And new tuple is violating
NOT VALID constraint.

444
00:22:30,300 --> 00:22:33,580
So editing of title is broken.

445
00:22:34,860 --> 00:22:38,860
It means we must clean up as the
first step.

446
00:22:39,520 --> 00:22:40,740
Accept gaps.

447
00:22:41,280 --> 00:22:44,440
Michael: So this is why we always
need to validate the constraint

448
00:22:44,440 --> 00:22:44,940
eventually.

449
00:22:45,360 --> 00:22:48,900
When I say eventually, I mean at
the end of this process and

450
00:22:48,900 --> 00:22:51,800
ideally as quickly as possible
to avoid issues.

451
00:22:52,340 --> 00:22:54,440
Nikolay: Do you see how deep the topic is?

452
00:22:54,900 --> 00:22:56,460
Yeah, it's interesting.

453
00:22:57,500 --> 00:22:58,220
Michael: For sure.

454
00:22:58,480 --> 00:22:59,540
Nikolay: Unexpectedly, right?

455
00:22:59,540 --> 00:23:00,860
Michael: What do you see in the wild?

456
00:23:00,860 --> 00:23:04,100
Yeah, do you see people trying to do this almost instantaneously?

457
00:23:04,540 --> 00:23:07,760
Because running, still, if we run them back to back, it's still

458
00:23:07,760 --> 00:23:15,200
much, much better than introducing the constraint without the

459
00:23:15,200 --> 00:23:15,940
NOT VALID.

460
00:23:16,080 --> 00:23:19,320
Nikolay: Honestly, I think I saw systems, maybe it was smaller

461
00:23:19,320 --> 00:23:22,540
projects, and maybe it was because of me where we made a mistake

462
00:23:22,540 --> 00:23:29,600
and did cleanup as the second step, not having it before the

463
00:23:29,600 --> 00:23:30,480
first ALTER TABLE.

464
00:23:30,480 --> 00:23:32,720
It should be before the first ALTER TABLE, definitely.

465
00:23:33,080 --> 00:23:35,360
And we just didn't notice this bug.

466
00:23:35,580 --> 00:23:38,660
I mean, it's a bug of application, it's not a bug of Postgres.

467
00:23:39,960 --> 00:23:43,340
When user tries to edit a title, but because of some additional

468
00:23:43,380 --> 00:23:45,640
column violation, what's happening here?

469
00:23:46,120 --> 00:23:49,340
And now I'm more cautious about it.

470
00:23:49,340 --> 00:23:51,960
So I would put it on the first place.

471
00:23:52,840 --> 00:23:53,860
And yeah.

472
00:23:54,800 --> 00:23:59,540
Michael: So just to clarify, first step, we manually run, we

473
00:23:59,540 --> 00:24:00,020
manually

474
00:24:00,020 --> 00:24:00,760
Nikolay: check for violations.

475
00:24:00,760 --> 00:24:01,400
Michael: Automatic validation.

476
00:24:02,240 --> 00:24:02,940
Yeah, exactly.

477
00:24:02,940 --> 00:24:03,860
So you know what I mean?

478
00:24:03,860 --> 00:24:08,400
Like we don't use the validation step to validate whether

479
00:24:08,400 --> 00:24:08,420
Nikolay: it's actually...

480
00:24:08,420 --> 00:24:11,980
First step is get rid of violations even before you introduce

481
00:24:11,980 --> 00:24:12,340
constraint.

482
00:24:12,340 --> 00:24:16,100
Then you introduce NOT VALID constraint, then ideally once again

483
00:24:16,100 --> 00:24:19,540
get rid of violations because maybe we had a gap, maybe a new

484
00:24:19,540 --> 00:24:21,420
came, and then VALIDATE CONSTRAINT.

485
00:24:21,980 --> 00:24:24,240
Michael: I've got, I actually think I'm wrong there.

486
00:24:24,240 --> 00:24:28,880
I think we'd be better off checking for violations, fixing them,

487
00:24:28,980 --> 00:24:32,560
check for violations again, And if you find any, don't go ahead

488
00:24:32,560 --> 00:24:35,260
with your plan, because you've still got a problem at the application

489
00:24:35,340 --> 00:24:37,180
level of bad data being inserted.

490
00:24:37,960 --> 00:24:42,780
So probably then don't get started, because you need to go back

491
00:24:42,780 --> 00:24:43,760
and stop.

492
00:24:43,860 --> 00:24:47,220
Nikolay: This algorithm needs to be visualized already.

493
00:24:47,960 --> 00:24:48,340
Michael: Yeah, yeah.

494
00:24:48,340 --> 00:24:49,640
But do you see what I mean?

495
00:24:49,640 --> 00:24:51,540
Because otherwise we're going to have errors.

496
00:24:52,000 --> 00:24:52,700
Nikolay: Yes, exactly.

497
00:24:53,320 --> 00:24:54,840
But how does it change it?

498
00:24:54,840 --> 00:24:58,940
Okay, clean up, create constraint with NOT VALID while you are

499
00:24:58,940 --> 00:24:59,980
shaking your head.

500
00:25:00,040 --> 00:25:02,540
Michael: I'm talking about leaving a gap at the beginning.

501
00:25:02,720 --> 00:25:07,280
So, clean up the existing data, wait a while.

502
00:25:07,280 --> 00:25:08,820
Nikolay: Wait is a good idea, yeah.

503
00:25:10,520 --> 00:25:12,260
See if you need to clean up again.

504
00:25:12,440 --> 00:25:14,120
First, fix your application.

505
00:25:14,380 --> 00:25:15,300
Fix your application.

506
00:25:15,480 --> 00:25:19,200
Not to produce wrong data anymore.

507
00:25:19,900 --> 00:25:21,100
Second, clean up.

508
00:25:22,940 --> 00:25:24,360
1 more step, wait.

509
00:25:24,800 --> 00:25:25,300
Yeah.

510
00:25:25,440 --> 00:25:27,440
Like days, weeks, I don't know.

511
00:25:27,440 --> 00:25:31,580
Then introduce, no need to second clean up, introduce constraint.

512
00:25:32,960 --> 00:25:33,420
Why not?

513
00:25:33,420 --> 00:25:34,570
Ah, okay, second cleanup.

514
00:25:34,570 --> 00:25:35,160
Why not

515
00:25:35,800 --> 00:25:36,040
Michael: check?

516
00:25:36,040 --> 00:25:36,540
Why

517
00:25:36,600 --> 00:25:37,860
Nikolay: not check without cleanup?

518
00:25:37,860 --> 00:25:39,620
Okay, I mean, just, yeah.

519
00:25:39,620 --> 00:25:41,820
Michael: It's just a minor step, it doesn't take long.

520
00:25:42,880 --> 00:25:44,540
1 more scan of the table, though.

521
00:25:44,540 --> 00:25:47,000
Nikolay: Okay, this is, yeah, This is full-fledged.

522
00:25:47,080 --> 00:25:49,340
Okay, 1 more time.

523
00:25:49,960 --> 00:25:51,980
Fix application code.

524
00:25:52,420 --> 00:25:53,220
Clean up.

525
00:25:53,680 --> 00:25:55,740
Check and clean up if needed again.

526
00:25:55,760 --> 00:25:58,580
But if clean up needed again the second time, it means code is

527
00:25:58,580 --> 00:25:59,660
not fixed yet.

528
00:26:00,240 --> 00:26:01,140
This is great.

529
00:26:01,780 --> 00:26:04,500
Then, ALTER TABLE constraint NOT VALID.

530
00:26:05,060 --> 00:26:06,100
And then what do you think?

531
00:26:06,100 --> 00:26:07,320
Do you need new cleanup?

532
00:26:07,380 --> 00:26:08,600
Maybe no already, right?

533
00:26:09,240 --> 00:26:10,580
Michael: Shouldn't do, right?

534
00:26:11,040 --> 00:26:11,540
Nikolay: Yeah.

535
00:26:12,620 --> 00:26:12,880
Well,

536
00:26:12,880 --> 00:26:15,740
Michael: because at this point, if you just validate it, if you

537
00:26:15,740 --> 00:26:17,280
needed cleanup, it will fail.

538
00:26:17,780 --> 00:26:19,740
Nikolay: Yeah, it's similar to just checking.

539
00:26:19,860 --> 00:26:20,280
Yeah.

540
00:26:20,280 --> 00:26:20,660
Yeah.

541
00:26:20,660 --> 00:26:21,160
Yeah.

542
00:26:21,220 --> 00:26:26,600
So yeah, I agree I let's remove clean up between to ALTER TABLE

543
00:26:26,600 --> 00:26:31,420
steps and but have to Clean up plus additional check and some

544
00:26:31,900 --> 00:26:33,260
some time between them.

545
00:26:33,260 --> 00:26:34,180
Yeah, that's great.

546
00:26:34,540 --> 00:26:36,180
Final question I have here.

547
00:26:37,060 --> 00:26:43,820
Could we have just ALTER TABLE CONCURRENTLY and have less headache

548
00:26:43,860 --> 00:26:45,080
when doing these things?

549
00:26:45,520 --> 00:26:46,400
What do you think?

550
00:26:46,400 --> 00:26:49,780
For example, if you create a similar situation, CREATE INDEX

551
00:26:49,780 --> 00:26:53,480
CONCURRENTLY, CREATE UNIQUE INDEX
CONCURRENTLY, it's not just

552
00:26:53,480 --> 00:26:55,620
index, it's constraint basically.

553
00:26:56,680 --> 00:26:59,420
And if we have duplicates, it will
fail.

554
00:27:00,240 --> 00:27:03,540
What we need to do in this case
is clean up a manually invalid

555
00:27:03,540 --> 00:27:04,040
index.

556
00:27:04,400 --> 00:27:06,660
It will be shown as invalid.

557
00:27:07,800 --> 00:27:11,180
And then make a new attempt from
scratch.

558
00:27:12,180 --> 00:27:14,680
Michael: You said index there,
but did you mean constraint?

559
00:27:15,300 --> 00:27:17,360
Nikolay: Well, constraint will
be created implicitly.

560
00:27:17,540 --> 00:27:22,280
There's a subtle difference between
unique indexes and constraints.

561
00:27:22,740 --> 00:27:23,900
Let's not go there.

562
00:27:24,640 --> 00:27:30,040
And I know places in Postgres where
confusion is violating logic.

563
00:27:30,480 --> 00:27:34,900
I have a patch proposed 5 years
ago or so, nobody took care of

564
00:27:34,900 --> 00:27:35,860
it, it's okay.

565
00:27:36,380 --> 00:27:40,680
Anyway, when we create a unique
index, logically we are building

566
00:27:40,680 --> 00:27:41,180
constraints.

567
00:27:41,820 --> 00:27:46,400
So if we have duplicates, this
operation will fail, leaving invalid

568
00:27:46,400 --> 00:27:49,940
index behind because this attempt
also two-phase, just single

569
00:27:49,940 --> 00:27:54,300
line of code, but it's two-phase,
right?

570
00:27:54,620 --> 00:27:58,580
We need to drop that index again
CONCURRENTLY, and then start

571
00:27:58,580 --> 00:27:59,280
from scratch.

572
00:27:59,340 --> 00:28:02,580
Oh, again, we need to clean up
duplicates and start from scratch.

573
00:28:02,680 --> 00:28:03,380
How about here?

574
00:28:03,380 --> 00:28:08,740
We could say, ALTER TABLE CONCURRENTLY,
ALTER TABLE blah blah,

575
00:28:08,740 --> 00:28:10,740
like, add, check constraint CONCURRENTLY.

576
00:28:11,140 --> 00:28:13,880
If there are problems, it would
fail.

577
00:28:14,380 --> 00:28:17,360
And we need to clean up and try
again.

578
00:28:17,360 --> 00:28:18,220
I don't know.

579
00:28:18,740 --> 00:28:21,880
Michael: I don't understand the
reason...

580
00:28:22,540 --> 00:28:25,940
I've never really thought about
this before, but why does CREAT INDEX CONCURRENTLY,

581
00:28:27,260 --> 00:28:28,860
if it fails, leave behind...

582
00:28:28,860 --> 00:28:31,240
Why can't it fully tidy up after
itself?

583
00:28:31,240 --> 00:28:32,300
Nikolay: Because it's not transactional.

584
00:28:33,840 --> 00:28:37,040
Michael: But if it's not transactional
already, why can't it

585
00:28:37,040 --> 00:28:38,260
do the DROP INDEX?

586
00:28:38,260 --> 00:28:40,200
Like, why can't it know that it's
left?

587
00:28:40,200 --> 00:28:41,740
Nikolay: Yeah, it's a good question
actually.

588
00:28:41,740 --> 00:28:43,480
And it could DROP INDEX CONCURRENTLY.

589
00:28:44,320 --> 00:28:44,820
Yeah.

590
00:28:44,860 --> 00:28:45,180
Yeah.

591
00:28:45,180 --> 00:28:46,200
It's a good question.

592
00:28:47,020 --> 00:28:48,500
I will think about this.

593
00:28:48,780 --> 00:28:50,380
Michael: There's probably a reason,
right?

594
00:28:50,380 --> 00:28:51,580
People would have thought of that.

595
00:28:51,580 --> 00:28:54,000
So I imagine there is a good reason,
but I don't know it.

596
00:28:54,000 --> 00:28:56,620
Nikolay: Maybe it's just easier
to implement, that's it.

597
00:28:57,660 --> 00:29:01,120
Michael: Maybe, but that would
be a good feature if it's not

598
00:29:01,120 --> 00:29:01,720
really difficult.

599
00:29:01,720 --> 00:29:05,720
Nikolay: Yeah, the user should
be okay just with the error message.

600
00:29:06,740 --> 00:29:08,400
Michael: Yeah, exactly, it didn't
work.

601
00:29:08,940 --> 00:29:11,040
You're back where you started,
even

602
00:29:11,040 --> 00:29:11,680
Nikolay: if I hadn't

603
00:29:11,680 --> 00:29:12,840
Michael: seen several transitions.

604
00:29:13,140 --> 00:29:15,000
Nikolay: Fix your duplicates and
try again.

605
00:29:15,680 --> 00:29:20,860
But if you don't use explicit naming
for your indexes, there

606
00:29:20,860 --> 00:29:27,280
is a risk that you had 10 attempts
and didn't notice that you

607
00:29:27,280 --> 00:29:33,120
have 9 invalid indexes left behind
because you didn't check the

608
00:29:33,120 --> 00:29:33,620
definition.

609
00:29:33,920 --> 00:29:35,100
This is a risk here.

610
00:29:35,200 --> 00:29:38,560
If you use explicit naming, of
course it will say the index with

611
00:29:38,560 --> 00:29:40,580
such name already exists, right?

612
00:29:41,000 --> 00:29:41,680
Or no?

613
00:29:41,740 --> 00:29:42,265
Index comparable.

614
00:29:42,265 --> 00:29:44,140
Michael: I actually
don't know if it's valid.

615
00:29:44,140 --> 00:29:45,320
I assume so.

616
00:29:45,420 --> 00:29:46,750
Nikolay: I think,
yeah, it will be in the index.

617
00:29:46,750 --> 00:29:47,860
It must be, yeah.

618
00:29:47,860 --> 00:29:49,020
Yeah.

619
00:29:49,080 --> 00:29:55,280
So I always prefer to use explicit
naming for indexes, just to

620
00:29:55,280 --> 00:29:59,340
control the naming schema, like
understanding which columns are

621
00:29:59,340 --> 00:30:00,340
indexed and so on.

622
00:30:00,340 --> 00:30:04,240
But also because of this, to better
control of my attempts.

623
00:30:04,240 --> 00:30:07,980
If it's CONCURRENTLY, it's always
CONCURRENTLY, almost always.

624
00:30:08,440 --> 00:30:11,600
Yeah, so anyway, back to this ALTER TABLE,
let constraint, check

625
00:30:11,600 --> 00:30:12,100
constraint.

626
00:30:13,040 --> 00:30:14,360
CONCURRENTLY, why not?

627
00:30:14,640 --> 00:30:16,940
It could just do 2 phases.

628
00:30:17,580 --> 00:30:20,320
First phase NOT VALID, second phase
validation.

629
00:30:20,940 --> 00:30:25,760
If validation fails, clean up or
no is a separate topic now.

630
00:30:26,120 --> 00:30:27,460
You detached this topic.

631
00:30:27,840 --> 00:30:29,560
Michael: I prefer clean up, but
yeah.

632
00:30:29,620 --> 00:30:32,460
Nikolay: But it wouldn't it be
so great?

633
00:30:33,340 --> 00:30:33,840
Michael: Yeah.

634
00:30:34,300 --> 00:30:37,200
I feel like CONCURRENTLY is coming
up in every episode these

635
00:30:37,200 --> 00:30:37,430
days.

636
00:30:37,430 --> 00:30:38,160
It's good.

637
00:30:39,140 --> 00:30:39,520
Nikolay: Great, great.

638
00:30:39,520 --> 00:30:39,760
Yeah.

639
00:30:39,760 --> 00:30:41,080
So, ALTER TABLE CONCURRENTLY.

640
00:30:41,880 --> 00:30:43,440
Michael: I think it'd be a great
feature.

641
00:30:43,480 --> 00:30:46,900
Nikolay: It's not an easy to implement,
I'm sure.

642
00:30:47,080 --> 00:30:48,080
Michael: And again, non-transactional.

643
00:30:48,420 --> 00:30:50,400
It's another 1 of those things
that would have...

644
00:30:50,860 --> 00:30:54,940
Nikolay: It means that there is
a promise of Postgres has transactional

645
00:30:55,460 --> 00:30:59,200
SQL, but when you start working,
it's already so.

646
00:30:59,640 --> 00:31:04,040
In real life, we don't have transactional
DDL, because create-index-concurrent

647
00:31:04,560 --> 00:31:11,080
is not transactional, and these
steps, when you split it to several

648
00:31:11,100 --> 00:31:15,520
steps, each 1 of them is transactional,
which is great, but Sometimes

649
00:31:15,520 --> 00:31:18,360
you cannot reverse this.

650
00:31:19,220 --> 00:31:22,940
And overall, you lose some data
or something.

651
00:31:24,140 --> 00:31:27,100
Michael: Yeah, I think it's fair to say we do have transactional

652
00:31:27,100 --> 00:31:31,520
DDO in Postgres, but it's not practical in larger projects with

653
00:31:31,520 --> 00:31:35,240
high load, because you just can't afford the downtime that comes

654
00:31:35,660 --> 00:31:37,920
is as a result of that transactional data.

655
00:31:38,300 --> 00:31:42,520
So in medium to heavy load, large enough setups, we don't have

656
00:31:42,520 --> 00:31:44,560
transaction data or not practical.

657
00:31:44,600 --> 00:31:46,120
We can't practically use it.

658
00:31:46,120 --> 00:31:49,940
So these practical solutions are really valuable, I think.

659
00:31:49,960 --> 00:31:54,980
The other approach seems to be, yeah, like the completely rewriting

660
00:31:55,020 --> 00:31:58,200
the table type approaches like we talked about last week.

661
00:31:58,440 --> 00:32:03,340
Or there's another project and maybe it's a whole, It's definitely

662
00:32:03,340 --> 00:32:06,880
a whole other topic, but have you come across pgroll?

663
00:32:06,880 --> 00:32:09,120
I think it's been brought up a couple of times on the podcast.

664
00:32:09,120 --> 00:32:10,120
Nikolay: Yes, in the past.

665
00:32:10,200 --> 00:32:11,780
Fully wrapped off a table.

666
00:32:12,380 --> 00:32:13,120
Michael: Yeah, exactly.

667
00:32:13,320 --> 00:32:15,400
I think that's another solution to this, right?

668
00:32:15,400 --> 00:32:16,720
Nikolay: Just to introduce constraints.

669
00:32:17,080 --> 00:32:17,900
Thank you, no.

670
00:32:17,900 --> 00:32:19,840
Michael: I don't know if that's what they do for this, but I

671
00:32:19,840 --> 00:32:22,980
think that's how they handle most schema changes.

672
00:32:23,100 --> 00:32:24,180
Nikolay: I understand that.

673
00:32:24,640 --> 00:32:27,760
Michael: It's heavy-handed, but it does also solve this problem.

674
00:32:28,140 --> 00:32:28,620
Nikolay: Yeah, yeah, yeah.

675
00:32:28,620 --> 00:32:32,140
Now I remember when we discussed pg_squeeze, we discussed

676
00:32:32,500 --> 00:32:34,460
changing order of Columns.

677
00:32:34,640 --> 00:32:37,980
This makes sense and of course CONCURRENTLY... not CONCURRENTLY.

678
00:32:40,680 --> 00:32:44,880
Michael: I guess 1 final solution is again super heavy-handed,

679
00:32:45,060 --> 00:32:49,600
but you could logically replicate to an Instance that has the

680
00:32:49,600 --> 00:32:51,580
Constraints that you want in place already.

681
00:32:55,360 --> 00:32:56,920
A huge amount of work.

682
00:32:56,920 --> 00:33:01,720
Nikolay: When we want to just constrain, we need just to read

683
00:33:01,720 --> 00:33:04,460
everything to ensure there are no violations.

684
00:33:05,540 --> 00:33:09,580
It's not a big change, it's just adding a restriction.

685
00:33:10,160 --> 00:33:11,530
So we don't need to change data.

686
00:33:11,530 --> 00:33:11,880
Yeah,

687
00:33:11,880 --> 00:33:12,680
Michael: shouldn't be.

688
00:33:13,160 --> 00:33:17,300
Nikolay: Anyway, I think that's it, what I wanted

689
00:33:17,300 --> 00:33:17,980
to discuss.

690
00:33:19,760 --> 00:33:20,580
Anything else?

691
00:33:21,540 --> 00:33:22,660
Michael: Yes, super useful feature.

692
00:33:23,480 --> 00:33:26,140
I'm hoping more people become aware of it now.

693
00:33:26,140 --> 00:33:28,540
Nikolay: You mean ALTER TABLE CONCURRENTLY doesn't exist?

694
00:33:29,200 --> 00:33:33,920
Michael: No, I mean alter table add constraint NOT VALID.

695
00:33:34,020 --> 00:33:34,740
Nikolay: NOT VALID.

696
00:33:36,100 --> 00:33:40,360
And a little bit confusing naming, so watch out.

697
00:33:40,440 --> 00:33:42,780
New rights are already being checked.

698
00:33:43,500 --> 00:33:46,360
Michael: Yeah, I'll include some links to the documentation because

699
00:33:46,380 --> 00:33:48,340
that explains it pretty well as well.

700
00:33:48,340 --> 00:33:49,520
Nikolay: Great, great.

701
00:33:49,940 --> 00:33:50,700
Thank you.

702
00:33:51,660 --> 00:33:53,340
Michael: Merci Nikolay, thanks so
much.

703
00:33:53,340 --> 00:33:53,574
Nikolay: See you soon.

704
00:33:53,574 --> 00:33:54,067
Bye bye.

705
00:33:54,067 --> 00:33:54,481
Michael: See you soon. Bye