1
0:0:0,06 --> 0:0:2,6
Michael: Hello and welcome to PostgresFM,
a weekly show about

2
0:0:2,6 --> 0:0:3,5
all things PostgreSQL.

3
0:0:3,6799998 --> 0:0:6,7599998
I am Michael, founder of pgMustard,
and I'm joined as always

4
0:0:6,7599998 --> 0:0:8,559999
by Nik, founder of Postgres.AI.

5
0:0:8,559999 --> 0:0:9,22
Hey Nik!

6
0:0:9,8 --> 0:0:11,82
Nikolay: Hi Michael, how are you?

7
0:0:12,179999 --> 0:0:13,679999
Michael: I am good, how are you?

8
0:0:14,2 --> 0:0:15,0
Nikolay: Very good.

9
0:0:16,06 --> 0:0:16,24
Fantastic.

10
0:0:16,24 --> 0:0:16,74
Oh

11
0:0:18,88 --> 0:0:19,38
Michael: great.

12
0:0:19,66 --> 0:0:21,54
What are we talking about this
week?

13
0:0:22,44 --> 0:0:23,46
Nikolay: Not about AI.

14
0:0:24,380001 --> 0:0:25,880001
Michael: Yes, for a few.

15
0:0:26,18 --> 0:0:27,099998
We have a detox.

16
0:0:27,5 --> 0:0:30,2
Nikolay: Everyone is talking about
AI, so we are not going to

17
0:0:30,2 --> 0:0:31,22
talk about AI.

18
0:0:31,64 --> 0:0:33,54
Michael: Great, so what are we
talking about?

19
0:0:34,08 --> 0:0:37,08
Nikolay: Let's talk about plan
flips and it's quite a complex

20
0:0:37,08 --> 0:0:42,98
topic because I feel Postgres lacks
a lot of stuff and I'm not

21
0:0:42,98 --> 0:0:47,28
sure what it needs to have in future
to be better.

22
0:0:47,38 --> 0:0:50,44
And there is a lot of criticism,
including the recent one.

23
0:0:50,489998 --> 0:0:55,4
A recent wave of criticism against
Postgres in this area.

24
0:0:56,18 --> 0:1:1,739998
After some incidents from one company,
the company name is Clerk,

25
0:1:2,059998 --> 0:1:2,559998
right?

26
0:1:3,14 --> 0:1:3,64
Yeah.

27
0:1:4,2 --> 0:1:4,7
Clerk.

28
0:1:4,78 --> 0:1:5,22
Clerk.

29
0:1:5,22 --> 0:1:6,54
How do you pronounce it?

30
0:1:6,66 --> 0:1:7,58
Michael: I say Clerk.

31
0:1:8,24 --> 0:1:8,64
Nikolay: Clerk.

32
0:1:8,64 --> 0:1:8,94
Okay.

33
0:1:8,94 --> 0:1:9,66
That's interesting.

34
0:1:9,66 --> 0:1:10,34
Michael: So it's E,

35
0:1:10,34 --> 0:1:11,6
Nikolay: but you say I.

36
0:1:11,66 --> 0:1:12,26
I don't know.

37
0:1:12,26 --> 0:1:12,7
Yeah.

38
0:1:12,7 --> 0:1:13,2
Okay.

39
0:1:14,06 --> 0:1:14,44
Yeah.

40
0:1:14,44 --> 0:1:17,36
And there was an outage or series
of outages.

41
0:1:17,36 --> 0:1:18,04
I don't know.

42
0:1:18,04 --> 0:1:18,54
Michael: Yeah.

43
0:1:20,86 --> 0:1:21,78
Nikolay: Two weeks ago.

44
0:1:21,78 --> 0:1:26,88
And I'm as usual kudos to all companies
who share experience.

45
0:1:26,92 --> 0:1:27,7
This is great.

46
0:1:27,7 --> 0:1:28,68
They shared experience.

47
0:1:29,54 --> 0:1:33,6
They had a blog post with timeline
and root cause analysis to

48
0:1:33,6 --> 0:1:39,2
some extent and then I saw on Twitter,
I saw discussions which

49
0:1:39,64 --> 0:1:45,92
looked like Postgres is not good
enough, let's choose better systems

50
0:1:46,98 --> 0:1:50,04
because look what you might have
with Postgres.

51
0:1:50,58 --> 0:1:52,66
Michael: Yeah, look what can happen
and...

52
0:1:52,66 --> 0:1:54,68
Nikolay: Yeah, might happen, can
happen.

53
0:1:54,68 --> 0:1:55,18
Michael: Exactly.

54
0:1:55,4 --> 0:1:56,68
Something quite innocuous.

55
0:1:56,72 --> 0:1:59,86
So it was, they've said the plan
flip was the, maybe not the

56
0:1:59,86 --> 0:2:5,1
root cause, but the thing that
took them down and due to insufficient

57
0:2:5,22 --> 0:2:8,36
statistics but yeah they're an
infrastructure provider right

58
0:2:8,36 --> 0:2:13,38
so they they do authentication
and if auth is down that means

59
0:2:13,5 --> 0:2:17,16
all of their customers are down
yeah exactly so this is a huge

60
0:2:17,16 --> 0:2:19,3
issue for them and for all of their
customers.

61
0:2:19,6 --> 0:2:24,78
They are very wary of things that
can change unexpectedly, understandably.

62
0:2:25,28 --> 0:2:29,72
But equally, lots of services that
need extreme high availability

63
0:2:29,92 --> 0:2:32,9
run on Postgres, so how can they
do it?

64
0:2:32,9 --> 0:2:34,38
I think that would be a good discussion.

65
0:2:35,9 --> 0:2:40,9
Nikolay: So in my opinion, their
root cause analysis is halfway

66
0:2:40,9 --> 0:2:41,4
done.

67
0:2:42,18 --> 0:2:47,54
And naming plan flip A root cause
is not all right.

68
0:2:48,26 --> 0:2:48,76
Why?

69
0:2:49,54 --> 0:2:50,78
Why 5 why's?

70
0:2:51,5 --> 0:2:52,62
You know this approach.

71
0:2:52,96 --> 0:2:54,22
Michael: Yeah, I think to a...

72
0:2:54,24 --> 0:2:57,08
Nikolay: Yeah, basic approach for
root cause analysis is ask

73
0:2:57,08 --> 0:2:59,28
why until you cannot answer at
all.

74
0:2:59,88 --> 0:3:5,08
So their analysis was saying like,
the problem happened because

75
0:3:5,08 --> 0:3:9,92
Postgres triggered Analyze, obviously,
automatic Analyze by autovacuum

76
0:3:9,92 --> 0:3:13,64
process, which caused plan
flip.

77
0:3:14,06 --> 0:3:17,62
So they did only 2 steps in this
5-y approach.

78
0:3:18,08 --> 0:3:19,7
They said okay what happened?

79
0:3:19,78 --> 0:3:21,18
Why we are down?

80
0:3:21,66 --> 0:3:22,8
Because statistics changed.

81
0:3:22,8 --> 0:3:23,48
Why is this?

82
0:3:23,48 --> 0:3:24,64
Because plan flip.

83
0:3:24,64 --> 0:3:25,08
Why?

84
0:3:25,08 --> 0:3:29,06
Because it was triggered by autovacuum.

85
0:3:29,06 --> 0:3:30,22
2 of 3 steps.

86
0:3:30,9 --> 0:3:33,02
But you can go deeper, right?

87
0:3:33,06 --> 0:3:33,56
Yeah.

88
0:3:33,72 --> 0:3:34,74
You can go deeper.

89
0:3:34,96 --> 0:3:35,22
I think there was 1

90
0:3:35,22 --> 0:3:38,32
Michael: more, I think they did
increase statistics target for

91
0:3:38,32 --> 0:3:42,1
example to make it less likely
to happen again but there still

92
0:3:42,1 --> 0:3:43,78
are more steps for sure

93
0:3:43,78 --> 0:3:45,74
Nikolay: yeah they increased
default_statistics_target?

94
0:3:46,5 --> 0:3:49,94
Michael: Maybe not default but
for the table I think.

95
0:3:51,34 --> 0:3:53,26
Or for the column, I forget which.

96
0:3:53,26 --> 0:3:55,16
Nikolay: Yeah, it's either Column
or global.

97
0:3:55,32 --> 0:3:59,48
Anyway, that's interesting because
in my opinion there are different

98
0:4:0,06 --> 0:4:3,28
plan flips, different kinds of
plan flips.

99
0:4:3,6 --> 0:4:9,18
And I'm still trying to figure
out for myself, but if you have

100
0:4:9,18 --> 0:4:15,8
plan flip when suddenly like it
flipped after recalculation of

101
0:4:15,8 --> 0:4:20,58
statistics, that's interesting because that might happen.

102
0:4:20,58 --> 0:4:24,02
This is actually a difficult type, this is a hard type of plan

103
0:4:24,02 --> 0:4:24,52
flip.

104
0:4:24,52 --> 0:4:28,5
But the question is why it happened only now, not 1 hour ago,

105
0:4:28,5 --> 0:4:29,06
for example.

106
0:4:29,06 --> 0:4:30,84
Michael: And what's to stop it happening again?

107
0:4:31,3 --> 0:4:31,74
Nikolay: Yeah, yeah.

108
0:4:31,74 --> 0:4:33,72
And this is interesting, right?

109
0:4:33,74 --> 0:4:35,84
Michael: So I have some thoughts on this specific case but I

110
0:4:35,84 --> 0:4:38,68
think it might actually be useful for us to go back to basics

111
0:4:38,68 --> 0:4:42,34
a little bit and define what do we mean by a plan flip and what

112
0:4:42,34 --> 0:4:45,16
the normal causes of them and I think this is a little bit of

113
0:4:45,16 --> 0:4:48,3
a special case that should be interesting and might be completely

114
0:4:48,32 --> 0:4:52,66
avoidable I think, but a lot of cases aren't avoidable and I

115
0:4:52,66 --> 0:4:54,1
think that's an interesting discussion.

116
0:4:54,76 --> 0:4:59,22
But yeah, in terms of plan flips, I wanted to ask you, do you

117
0:4:59,22 --> 0:5:3,36
define them only when they're bad because we have a cost-based

118
0:5:3,42 --> 0:5:4,08
optimizer, right?

119
0:5:4,08 --> 0:5:8,86
The reason we can flip query plans is because the planner has

120
0:5:8,86 --> 0:5:12,4
choices and it makes those choices based on estimated costs of

121
0:5:12,4 --> 0:5:16,86
doing different join algorithms or different scan types.

122
0:5:17,98 --> 0:5:21,04
But flipping is part of its design, right?

123
0:5:21,04 --> 0:5:24,68
The same query running on different data as the data changes,

124
0:5:24,92 --> 0:5:27,04
perhaps should use a different execution plan.

125
0:5:27,04 --> 0:5:29,54
It would be faster to use a different execution plan.

126
0:5:29,6 --> 0:5:33,6
So flipping is designed, it's deliberate, but the idea is it

127
0:5:33,6 --> 0:5:37,4
flips to a better plan when the data suits that.

128
0:5:38,4 --> 0:5:41,88
Nikolay: So sudden plan change is normal.

129
0:5:43,18 --> 0:5:43,52
Michael: Yes.

130
0:5:43,52 --> 0:5:44,64
So, yes.

131
0:5:45,78 --> 0:5:49,12
When people say plan flip, it feels like they're only ever

132
0:5:49,12 --> 0:5:50,52
Nikolay: talking about that.

133
0:5:50,74 --> 0:5:51,36
Michael: Yes, exactly.

134
0:5:51,94 --> 0:5:57,42
Nikolay: Yeah, because this is when it changes to wrong state.

135
0:5:57,44 --> 0:6:1,8
Because the planner sometimes have mistakes.

136
0:6:2,86 --> 0:6:4,22
Michael: Yes, or misestimations.

137
0:6:6,22 --> 0:6:10,12
Nikolay: For example, if you have random page cost 4, which is

138
0:6:10,12 --> 0:6:16,96
default, I saw many situations when we have 2 candidate plans

139
0:6:18,18 --> 0:6:22,04
which have very similar cost under these circumstances.

140
0:6:23,04 --> 0:6:27,78
So for this data and what we have in pg_statistic, and for these

141
0:6:27,78 --> 0:6:32,04
settings, including random page cost 4, which is default, I saw

142
0:6:32,04 --> 0:6:34,82
the cases when there are 2 plans.

143
0:6:34,84 --> 0:6:36,42
1 is obviously wrong.

144
0:6:37,36 --> 0:6:39,9
For example, sequential scan for huge table.

145
0:6:40,08 --> 0:6:43,04
And another is obviously right, which is index scan.

146
0:6:43,26 --> 0:6:47,12
Like I'm remembering some Simple case, like simple in terms of

147
0:6:47,12 --> 0:6:51,5
it's obvious to a human or even AI that this plan for this table

148
0:6:51,5 --> 0:6:54,52
shouldn't be chosen because it's sequential scan on whole table,

149
0:6:54,52 --> 0:7:1,28
but since for example Postgres thinks that random I/O is so expensive

150
0:7:2,52 --> 0:7:7,96
suddenly chooses sequential scan plan instead of index And you

151
0:7:7,96 --> 0:7:12,32
don't notice until some point when
these costs, they are very

152
0:7:12,32 --> 0:7:13,14
close to each other.

153
0:7:13,14 --> 0:7:18,18
And then suddenly when a little
bit more data comes to the table

154
0:7:18,18 --> 0:7:19,54
and ANALYZE happens.

155
0:7:19,94 --> 0:7:24,16
Sequential bad 1 wins and that's
it.

156
0:7:24,24 --> 0:7:29,68
And this we usually solve by adjusting
random_page_cost.

157
0:7:30,04 --> 0:7:33,9
Telling the planner that please
don't choose sequential plans.

158
0:7:33,9 --> 0:7:39,22
I know we had, like, Tomas Vondra,
Tomas Vondra, has great

159
0:7:39,22 --> 0:7:43,08
proof that maybe random_page_cost
4 is not bad.

160
0:7:43,08 --> 0:7:45,66
Maybe even it should be bigger.

161
0:7:46,08 --> 0:7:52,0
But I have brain split here because
I see what his reasoning

162
0:7:52,02 --> 0:7:56,68
is makes sense theoretically, but
what I saw in production proves

163
0:7:56,74 --> 0:7:57,24
opposite.

164
0:7:57,38 --> 0:8:2,04
I saw these plan flips happening
when just like suddenly, and

165
0:8:2,04 --> 0:8:6,3
it was like some ticking bomb it
was like yeah it was sitting

166
0:8:6,3 --> 0:8:9,52
there sitting and then boom like
suddenly the planner chooses

167
0:8:9,96 --> 0:8:14,62
the sequential scan and we are
down because the time's out right

168
0:8:14,76 --> 0:8:17,7
Michael: yeah I have a theory about
this I've been thinking a

169
0:8:17,7 --> 0:8:24,4
little bit about it and I think
it's like, Tomasz's work is very

170
0:8:24,4 --> 0:8:28,84
good in theory because he cares
about the optimal time to flip,

171
0:8:28,98 --> 0:8:32,36
he cares about how much data do
we need to be reading before

172
0:8:32,36 --> 0:8:33,98
sequential scan is faster?

173
0:8:34,34 --> 0:8:39,78
But in practice, especially with
large tables, the flipping point

174
0:8:39,78 --> 0:8:40,76
is not that important.

175
0:8:40,76 --> 0:8:45,83
The main point is we almost never
want to request that much data.

176
0:8:45,83 --> 0:8:48,8
Often, especially in OLTP, we want
very small SELECTs.

177
0:8:48,8 --> 0:8:52,54
We're only selecting a time, 1
record or only a very few number

178
0:8:52,54 --> 0:8:53,2
of records.

179
0:8:53,9 --> 0:8:58,14
So the flipping point, we want
to put so far out of sight that

180
0:8:58,14 --> 0:9:1,28
even if our row estimate is really
bad, because often it's a

181
0:9:1,28 --> 0:9:4,04
combination of 2 things right it's
not just the the cost of doing

182
0:9:4,04 --> 0:9:8,08
random access it's also how many
rows do we expect to be reading

183
0:9:8,08 --> 0:9:12,78
and if that's out by a lot and
the cost is high those multiply

184
0:9:12,8 --> 0:9:15,38
together and suddenly Postgres
is thinking you know what I might

185
0:9:15,38 --> 0:9:17,72
as well scan the whole table here
because I'm expecting to read

186
0:9:17,72 --> 0:9:18,4
so much.

187
0:9:18,4 --> 0:9:23,1
But if we can, if we decrease the
random_page_cost, we get away

188
0:9:23,1 --> 0:9:24,5
with the bad row estimate.

189
0:9:24,52 --> 0:9:27,84
So I think you can fix either and
get towards it.

190
0:9:27,84 --> 0:9:31,26
But if you do both, then you massively
decrease your chance

191
0:9:31,26 --> 0:9:31,56
Nikolay: of

192
0:9:31,56 --> 0:9:32,94
Michael: this kind of issue.

193
0:9:33,54 --> 0:9:39,48
Nikolay: I hear you and this in
my mind connects very well to

194
0:9:39,48 --> 0:9:44,64
the topic of OLTP versus analytical
versus hybrid because there

195
0:9:44,64 --> 0:9:48,56
are good advancements in the area
of analytical queries processing

196
0:9:48,58 --> 0:9:50,52
for Postgres in various ways.

197
0:9:51,04 --> 0:9:55,44
And there is also, I'm strongly
believe that there is a very

198
0:9:56,1 --> 0:9:58,58
simple case called pure OLTP.

199
0:9:58,78 --> 0:10:3,98
Pure OLTP means that We need to
serve users who are humans mostly,

200
0:10:4,12 --> 0:10:6,52
maybe, and they have human perception.

201
0:10:6,74 --> 0:10:9,06
Our very first episode was about
this.

202
0:10:9,14 --> 0:10:13,76
200 milliseconds, this is when
they recognize something is already

203
0:10:13,76 --> 0:10:14,26
happening.

204
0:10:14,64 --> 0:10:18,8
And If it's significantly more
than 200 milliseconds, it feels

205
0:10:18,8 --> 0:10:23,4
slow, which means we need to have
our queries below 20 milliseconds

206
0:10:23,44 --> 0:10:27,84
because 1 web page consists of
multiple queries, usually roughly

207
0:10:27,84 --> 0:10:30,04
10, say 10, like some guess, right?

208
0:10:30,14 --> 0:10:34,28
It means we need to be below 20
milliseconds, in this case, by

209
0:10:34,28 --> 0:10:38,26
no means I want any sequential
scan be applied.

210
0:10:38,52 --> 0:10:41,52
Never, like, I don't want it, like,
just forget about it.

211
0:10:41,52 --> 0:10:43,4
I want to turn it off completely.

212
0:10:44,68 --> 0:10:44,76
I

213
0:10:44,76 --> 0:10:48,08
Michael: would encounter that Sequential
scans on tiny tables

214
0:10:48,08 --> 0:10:48,84
are fine, right?

215
0:10:48,84 --> 0:10:52,86
If it's only got 100 rows in it,
but also an index scan on that

216
0:10:52,86 --> 0:10:56,14
table would be fine as well because
it's still only got 100 rows.

217
0:10:56,4 --> 0:11:0,72
So the times where sequential scan
is best don't matter anymore

218
0:11:0,72 --> 0:11:3,72
because an index scan, whilst being
less efficient, or even a

219
0:11:3,72 --> 0:11:7,86
bitmap scan would be absolutely
fine, even if it's twice as slow,

220
0:11:7,86 --> 0:11:10,94
or even 5 times as slow because
it's such a tiny table.

221
0:11:11,6 --> 0:11:15,04
Nikolay: But we also have different
other, there is also bitmap

222
0:11:15,18 --> 0:11:19,14
scan, index and heap scan, So it
also, it's like in the middle

223
0:11:19,54 --> 0:11:21,82
and how the planner thinks about
the cost.

224
0:11:21,82 --> 0:11:22,58
It's interesting.

225
0:11:22,68 --> 0:11:28,1
So I think how we tune some service
for OLTP, and this is obviously

226
0:11:28,1 --> 0:11:28,52
OLTP.

227
0:11:28,52 --> 0:11:30,84
People didn't want this service
to,

228
0:11:31,08 --> 0:11:31,74
Michael: of course,

229
0:11:32,02 --> 0:11:32,52
Nikolay: Yeah.

230
0:11:32,64 --> 0:11:34,9
We definitely want very strict
timeouts.

231
0:11:36,22 --> 0:11:39,22
I think 15 milliseconds, it's old.

232
0:11:40,02 --> 0:11:42,42
We should go down to 10 seconds.

233
0:11:43,7 --> 0:11:45,64
So statement timeout should be
10 seconds.

234
0:11:45,64 --> 0:11:48,16
Even transaction timeout should
be 10 seconds for radical.

235
0:11:48,16 --> 0:11:50,46
Let's introduce concept of radical
OLTP.

236
0:11:50,82 --> 0:11:54,66
Yes, you have some workloads that
needs to be analytical and

237
0:11:54,66 --> 0:11:57,22
some people can wait on or some
background jobs.

238
0:11:57,34 --> 0:12:0,84
Yes, the create index taking a
couple of hours.

239
0:12:1,1 --> 0:12:5,02
It's also a different story, but
if it's pure OLTP mobile app,

240
0:12:5,02 --> 0:12:10,2
which many people depend on, put
transaction timeout and statement

241
0:12:10,2 --> 0:12:15,48
timeout, very low value, be very
strict, and just cut those queries

242
0:12:15,48 --> 0:12:17,72
which start running longer.

243
0:12:18,16 --> 0:12:22,52
In this case, if you, for example,
have a bunch of them happening,

244
0:12:23,1 --> 0:12:28,44
there's a chance that on powerful
machine, you will survive.

245
0:12:29,08 --> 0:12:35,14
If you don't have it, it will be
a snowball effect of many sessions

246
0:12:35,22 --> 0:12:38,24
being stuck and running in sequential
scan or something like

247
0:12:40,36 --> 0:12:43,22
a not optimal and it will be consuming
resources.

248
0:12:43,84 --> 0:12:48,28
Michael: The other thing it might
do is it might more quickly

249
0:12:48,38 --> 0:12:51,94
point you in the right direction
as to what's causing the issues

250
0:12:51,94 --> 0:12:52,8
that you start...

251
0:12:53,76 --> 0:12:54,86
Nikolay: Yeah, how to diagnose.

252
0:12:54,94 --> 0:12:58,98
Let's maybe, I think I'm pushing
us to some like problems and

253
0:12:58,98 --> 0:13:1,02
resolutions already but let's go
back.

254
0:13:1,02 --> 0:13:5,58
You raised a very good point about
what plan flip is so in my

255
0:13:5,58 --> 0:13:9,8
head, it's also like this is negative
term plan flip is Sudden

256
0:13:9,8 --> 0:13:13,0
plan change in the wrong direction
when like basically the planner

257
0:13:13,58 --> 0:13:17,24
Chooses a plan we don't want like
sequential scan or maybe bitmap

258
0:13:17,24 --> 0:13:19,3
scan which is timing out for example.

259
0:13:19,82 --> 0:13:22,36
Michael: Yeah I think that it might
even be worse than that I

260
0:13:22,36 --> 0:13:25,36
think it might even be like it's
noticeably worse.

261
0:13:25,36 --> 0:13:25,52
Nikolay: If

262
0:13:25,52 --> 0:13:29,06
Michael: it flips and it's 1.1
times slower I think most people

263
0:13:29,06 --> 0:13:30,18
don't really care.

264
0:13:31,0 --> 0:13:34,2
But if it's double or triple or
it's like it takes you from being

265
0:13:34,2 --> 0:13:37,86
90% CPU to 100% CPU It crosses
that threshold.

266
0:13:37,86 --> 0:13:38,3
Maybe again.

267
0:13:38,3 --> 0:13:41,52
It's the Tomas episode we did
on performance cliffs It's a cliff

268
0:13:41,52 --> 0:13:42,26
for you then.

269
0:13:42,26 --> 0:13:45,7
Nikolay: Yeah in my opinion performance
cliff term is different

270
0:13:45,7 --> 0:13:49,86
and it's macro Yeah, while plan
shape is micro term.

271
0:13:49,86 --> 0:13:53,72
It's like it's just 1 session and
we see suddenly execution of

272
0:13:53,72 --> 0:13:56,98
this query became much worse.

273
0:13:57,12 --> 0:14:2,64
It was 10 milliseconds, suddenly
became like 1 minute.

274
0:14:2,9 --> 0:14:3,4
Boom.

275
0:14:3,54 --> 0:14:7,08
And obvious example is like we
had an index scan, but suddenly

276
0:14:7,08 --> 0:14:7,76
the planner...

277
0:14:7,9 --> 0:14:10,08
And usually it happens in complex
queries, right?

278
0:14:10,08 --> 0:14:14,58
So joins, subqueries, CTE, multi-stage,
like it's something complex.

279
0:14:15,24 --> 0:14:16,1
And then boom.

280
0:14:16,38 --> 0:14:20,2
While performance cliffs for me,
like it's macro level, like

281
0:14:20,2 --> 0:14:23,4
some lightweight lock contention
happens and didn't happen and

282
0:14:23,4 --> 0:14:24,56
then suddenly it happens.

283
0:14:25,68 --> 0:14:27,6
There is no official terminology
here.

284
0:14:27,62 --> 0:14:32,16
So I'm just reading some blog posts
and trying to adjust my own

285
0:14:32,16 --> 0:14:34,12
mind to them mostly, right?

286
0:14:34,16 --> 0:14:35,52
This is how I feel.

287
0:14:35,86 --> 0:14:38,04
So plan flip is negative term,
right?

288
0:14:38,52 --> 0:14:40,38
So we don't want it to happen.

289
0:14:40,68 --> 0:14:43,66
And we started with difficult example.

290
0:14:44,2 --> 0:14:47,18
This company obviously experienced
this difficult example.

291
0:14:47,74 --> 0:14:51,48
And my question to them is like,
what random page cost you had?

292
0:14:51,48 --> 0:14:56,12
Did you have statement timeout,
item transaction timeout, maybe

293
0:14:56,12 --> 0:14:58,34
not related in this case, and transaction
timeout?

294
0:14:58,78 --> 0:15:4,3
So safeguards, safeguards which
would protect from a huge wave

295
0:15:4,3 --> 0:15:10,36
of sessions which experience this
bad plan to be executed and

296
0:15:10,56 --> 0:15:13,76
also what were your autovacuum
settings.

297
0:15:15,72 --> 0:15:16,12
I think

298
0:15:16,12 --> 0:15:16,98
Michael: that's relevant.

299
0:15:17,74 --> 0:15:22,5
Nikolay: Before we go to an increased
number of buckets with

300
0:15:22,5 --> 0:15:27,44
statistics target, these are macro
settings we need to keep in

301
0:15:27,44 --> 0:15:27,94
mind.

302
0:15:28,78 --> 0:15:32,14
Michael: I had another thought.

303
0:15:34,02 --> 0:15:36,6
This for me felt like a state of skew issue.

304
0:15:36,78 --> 0:15:44,8
And so they mentioned the column in question was 99.9996% NULL

305
0:15:44,8 --> 0:15:45,3
values.

306
0:15:46,12 --> 0:15:49,96
And this query was looking for the not null values which of which

307
0:15:49,96 --> 0:15:53,1
there are still 17,000 so we're talking about billions of records

308
0:15:53,1 --> 0:15:56,3
but only 17,000 were not NULL.

309
0:15:56,94 --> 0:16:1,56
In cases like that, where you've got such a data skew, you could

310
0:16:1,56 --> 0:16:4,54
be looking at, like for those columns you could be looking at

311
0:16:4,54 --> 0:16:9,96
increased statistics or often you want a partial index on a column

312
0:16:9,96 --> 0:16:13,88
like that right because you don't care about indexing the billions

313
0:16:13,94 --> 0:16:17,36
of NULL values you only care about indexing the not null values

314
0:16:17,36 --> 0:16:20,28
and once you've got the perfect index on that I know it could

315
0:16:20,28 --> 0:16:22,64
be a more complex query where you want different columns but

316
0:16:22,64 --> 0:16:26,86
if you've got those columns in the index at where the value is

317
0:16:26,86 --> 0:16:30,24
not null then actually the planners almost certainly going to

318
0:16:30,24 --> 0:16:33,12
choose that because it's such the perfect index for that query

319
0:16:33,12 --> 0:16:36,42
and you don't have to worry about misestimations it's never going

320
0:16:36,42 --> 0:16:38,38
to pick a sex scan in that case.

321
0:16:40,24 --> 0:16:43,86
Nikolay: So we definitely lack information here.

322
0:16:43,86 --> 0:16:44,36
Yeah.

323
0:16:44,38 --> 0:16:45,98
We didn't see the plans.

324
0:16:46,3 --> 0:16:47,64
We don't know their settings.

325
0:16:47,64 --> 0:16:49,16
We even don't know the version.

326
0:16:49,46 --> 0:16:49,66
Right.

327
0:16:49,66 --> 0:16:51,36
So we can only guess.

328
0:16:51,68 --> 0:16:54,88
And again, there are difficult situations with plan flips.

329
0:16:54,88 --> 0:16:58,28
I saw quite difficult ones and where obviously like everything

330
0:16:58,28 --> 0:16:59,08
was normal.

331
0:16:59,34 --> 0:17:3,72
But what I'm trying to say, you can get a plan flip because you

332
0:17:3,72 --> 0:17:9,06
didn't do a good job configuring Postgres well enough and putting

333
0:17:9,52 --> 0:17:11,78
these safeguards for OLTP workloads.

334
0:17:12,44 --> 0:17:15,2
And this is what we do, number 1, with all the clients who are

335
0:17:15,2 --> 0:17:19,88
doing consulting and which also the clients who install our monitoring

336
0:17:19,9 --> 0:17:21,0
and checkup service.

337
0:17:21,62 --> 0:17:26,2
And this is like number 1 thing we need to be, we recognize we

338
0:17:26,2 --> 0:17:26,92
are OLTP.

339
0:17:28,26 --> 0:17:32,74
Priority is to have the super fast queries and we need to prepare

340
0:17:33,12 --> 0:17:34,26
basics, right?

341
0:17:35,38 --> 0:17:39,72
Unfortunately, they cannot protect us in 100% of cases.

342
0:17:40,12 --> 0:17:43,94
But if you have your autovacuum not tuned, for example, you

343
0:17:43,94 --> 0:17:46,78
have this relatively new setting.

344
0:17:46,78 --> 0:17:49,64
I remember Darafei raised this topic.

345
0:17:49,74 --> 0:17:55,44
We didn't have autovacuum insert, autovacuum vacuum insert

346
0:17:55,44 --> 0:17:56,24
scale factor.

347
0:17:57,5 --> 0:18:0,04
In autovacuum analyzing, there are 2, right?

348
0:18:0,18 --> 0:18:2,44
So which triggers on insert?

349
0:18:3,92 --> 0:18:5,34
And by default, it's 20%.

350
0:18:6,82 --> 0:18:11,52
And also we have all the regular vacuum scale factor, which triggers

351
0:18:11,52 --> 0:18:14,36
on updates and deletes, also 20%.

352
0:18:15,48 --> 0:18:20,68
For analyze, it's 10%, 0.10, which
means 10% of your table needs

353
0:18:20,68 --> 0:18:21,42
to be changed.

354
0:18:21,42 --> 0:18:23,54
If it's a huge table, it's a lot
of data.

355
0:18:23,62 --> 0:18:26,54
Again, this data skew is a good
point.

356
0:18:27,78 --> 0:18:31,56
You had 1 distribution, but suddenly
all the new records, they

357
0:18:31,56 --> 0:18:32,54
are very different.

358
0:18:33,24 --> 0:18:38,14
And you don't have properly tuned
autovacuum, you accumulate

359
0:18:38,14 --> 0:18:39,16
that data a lot.

360
0:18:39,4 --> 0:18:44,74
And then suddenly after 10 or 20%
of data changed or inserted,

361
0:18:45,28 --> 0:18:47,64
autovacuum finally ran analyze.

362
0:18:48,26 --> 0:18:49,88
Suddenly we have a flick.

363
0:18:50,28 --> 0:18:54,58
If it happened earlier, maybe it
also would be bad.

364
0:18:54,58 --> 0:18:55,9
It's actually a good point.

365
0:18:55,9 --> 0:18:59,06
Maybe still we need the statistics
target, but we, like in my

366
0:18:59,06 --> 0:19:3,42
opinion, it's better to have more
frequent process running on

367
0:19:3,42 --> 0:19:4,2
large tables.

368
0:19:4,46 --> 0:19:8,8
So statistics will represent actual
data better, more often,

369
0:19:8,8 --> 0:19:9,3
right?

370
0:19:9,86 --> 0:19:12,22
To allocate more resources, to
have more workers.

371
0:19:12,28 --> 0:19:15,78
This is what we do all the time
with larger services.

372
0:19:15,82 --> 0:19:22,0
And unfortunately, we see, like
I would say 2 out of 3 clients

373
0:19:22,06 --> 0:19:27,84
coming, most of our clients are
startups, which like have say

374
0:19:27,94 --> 0:19:33,16
database exceeding 100 gigabytes
up to 10 or 20 terabytes.

375
0:19:33,16 --> 0:19:36,68
This is like sweet spot for us
where they experienced some problems

376
0:19:36,68 --> 0:19:42,18
already and 2 out of 3 roughly
have poorly tuned autovacuum.

377
0:19:42,78 --> 0:19:43,72
Poorly tuned.

378
0:19:44,48 --> 0:19:47,24
Michael: Only 2 out of 3 is quite
surprising to me.

379
0:19:47,52 --> 0:19:48,82
Nikolay: Yeah this is my rough.

380
0:19:48,82 --> 0:19:51,6
I can check statistics you

381
0:19:51,6 --> 0:19:53,1
Michael: need to update your statistics

382
0:19:56,68 --> 0:20:0,62
Nikolay: my internal yeah I need
to ask a guy to maintain statistics

383
0:20:2,64 --> 0:20:7,18
Michael: anyway but I had a question
though yeah default_statistics_target

384
0:20:7,66 --> 0:20:11,52
is like 100 and I quite
often see people increasing it

385
0:20:11,52 --> 0:20:15,3
to 1000 when an issue like this
happens or at least for like

386
0:20:15,3 --> 0:20:20,6
specific columns other than increased
analyze time which is like

387
0:20:20,6 --> 0:20:24,36
important but not that important
for yeah and it doesn't increase

388
0:20:24,36 --> 0:20:27,54
that much and also it's not that
important for OLTP systems right

389
0:20:27,54 --> 0:20:31,23
like it's a background thing it
can be like it's I'm wondering...

390
0:20:31,23 --> 0:20:33,48
Nikolay: I'm not really sure during
upgrades though.

391
0:20:34,3 --> 0:20:35,98
Michael: Only some types of upgrades,
right?

392
0:20:35,98 --> 0:20:39,14
Because if you've got such an important
infrastructure OLTP system,

393
0:20:39,14 --> 0:20:41,98
you're probably not going to be
doing pg_upgrade.

394
0:20:41,98 --> 0:20:44,44
Like, you're probably not going
to be doing those types of upgrades

395
0:20:44,44 --> 0:20:45,38
that care about analyze.

396
0:20:45,38 --> 0:20:48,54
You're probably going to be doing
logical failover.

397
0:20:48,9 --> 0:20:50,14
Nikolay: You still need to analyze.

398
0:20:50,58 --> 0:20:51,1
Still not.

399
0:20:51,1 --> 0:20:54,78
You create this logical replica,
you need to analyze this replica.

400
0:20:55,12 --> 0:20:58,12
Michael: Not while you're, you
don't have to be down during that.

401
0:20:58,38 --> 0:21:1,4
That could be done while the other
system's serving queries.

402
0:21:1,4 --> 0:21:2,2
Nikolay: You are right.

403
0:21:2,2 --> 0:21:2,94
You are right.

404
0:21:3,24 --> 0:21:5,74
Michael: And the statistics will
have been getting updated while

405
0:21:5,74 --> 0:21:9,68
it's being populated because vacuum
will be running and analyze

406
0:21:9,68 --> 0:21:10,46
will be running.

407
0:21:10,76 --> 0:21:12,34
So it's not in this.

408
0:21:12,34 --> 0:21:17,36
So I was wondering for really important
OLTP systems, why not

409
0:21:17,36 --> 0:21:20,98
just boost the default_statistics_target
everywhere and just

410
0:21:20,98 --> 0:21:22,34
pay that extra cost?

411
0:21:22,76 --> 0:21:26,4
Nikolay: Did we raise this question
with Tomas Vondra or no?

412
0:21:26,48 --> 0:21:27,86
Michael: I don't think so.

413
0:21:27,9 --> 0:21:32,22
Nikolay: Maybe we raised it when
he was on PostgresTV YouTube

414
0:21:32,22 --> 0:21:32,72
channel.

415
0:21:32,86 --> 0:21:36,7
I remember the opinion that 100
buckets is a lot.

416
0:21:38,12 --> 0:21:38,36
Right?

417
0:21:38,36 --> 0:21:41,36
This topic connects to partitioning
for me.

418
0:21:41,68 --> 0:21:47,06
Because when I see a table with
10 billion records, like a table

419
0:21:47,06 --> 0:21:53,22
like multi-terabyte in size, It's,
oh, this is like already quite

420
0:21:53,22 --> 0:21:53,56
late.

421
0:21:53,56 --> 0:21:54,9
We should have it partitioned.

422
0:21:55,14 --> 0:22:0,74
In this case, we would have many
tables and at least we would

423
0:22:0,74 --> 0:22:2,08
have localized problem.

424
0:22:2,08 --> 0:22:4,46
Only for this partition we have
a problem.

425
0:22:4,92 --> 0:22:8,16
Michael: And we would have, let's
say it had 30 partitions, we

426
0:22:8,16 --> 0:22:10,9
would have 30 times more data in
the samples

427
0:22:10,92 --> 0:22:11,54
Nikolay: because it...

428
0:22:11,54 --> 0:22:13,06
Different statistics, yeah.

429
0:22:13,1 --> 0:22:13,6
Yeah.

430
0:22:14,1 --> 0:22:14,54
Yeah.

431
0:22:14,54 --> 0:22:15,8
So I don't know.

432
0:22:15,8 --> 0:22:17,22
I see people raise it.

433
0:22:17,58 --> 0:22:21,14
I like, okay, does it affect or
analyze?

434
0:22:21,14 --> 0:22:22,28
Yeah, it does.

435
0:22:22,36 --> 0:22:24,18
How much we can afford it?

436
0:22:24,18 --> 0:22:24,52
Okay.

437
0:22:24,52 --> 0:22:25,96
Set it to thousand globally.

438
0:22:26,04 --> 0:22:26,54
Okay.

439
0:22:26,96 --> 0:22:27,82
How to prove it?

440
0:22:27,82 --> 0:22:28,68
It helps or not.

441
0:22:28,68 --> 0:22:30,92
We need holistic analysis of plans.

442
0:22:30,92 --> 0:22:35,2
So we need, This is actually, we
tried to do many years ago before

443
0:22:35,46 --> 0:22:39,68
like current state of art of like
things with branching and so

444
0:22:39,68 --> 0:22:40,18
on.

445
0:22:40,52 --> 0:22:41,62
Actually I disagree.

446
0:22:41,82 --> 0:22:41,98
I've

447
0:22:41,98 --> 0:22:43,12
Michael: just realized I disagree.

448
0:22:43,18 --> 0:22:46,62
I don't think this is about on
average improving things.

449
0:22:46,72 --> 0:22:49,4
I don't think this is a case of
on average I want a slightly

450
0:22:49,4 --> 0:22:50,22
better execution plan.

451
0:22:50,22 --> 0:22:54,02
I think this is a case of I never
want to completely flip to

452
0:22:54,02 --> 0:22:55,02
an awful plan.

453
0:22:55,08 --> 0:22:58,7
So it's about risk mitigating,
it's like mitigating the outliers

454
0:22:59,06 --> 0:23:0,2
not about improving.

455
0:23:0,24 --> 0:23:4,08
I would actually accept slightly
worse on average latencies.

456
0:23:4,4 --> 0:23:6,14
Nikolay: I don't talk about average.

457
0:23:6,26 --> 0:23:7,28
I don't talk about average.

458
0:23:7,28 --> 0:23:9,04
I'm talking about complete analysis.

459
0:23:9,14 --> 0:23:12,88
When you have, for example, thousands
of plans based on production,

460
0:23:12,9 --> 0:23:16,68
you have a clone of production
and using branching you can create

461
0:23:16,68 --> 0:23:19,9
replicas and compare plans before
and after the change.

462
0:23:20,22 --> 0:23:23,5
This is what if framework which
we have with Database Lab.

463
0:23:24,24 --> 0:23:25,9
And you can compare plans.

464
0:23:26,04 --> 0:23:28,78
There's a question how to properly
compare plans at the ground

465
0:23:28,78 --> 0:23:29,28
scale.

466
0:23:29,44 --> 0:23:30,82
We talked about this before.

467
0:23:31,12 --> 0:23:34,92
Our approach is just compare structure
and cost and the buffers

468
0:23:34,92 --> 0:23:35,9
and also timing.

469
0:23:36,04 --> 0:23:39,22
But timing, it's tricky because
it's volatile and so on.

470
0:23:39,22 --> 0:23:43,84
If cost changes seriously, this
is already a signal to investigate

471
0:23:43,9 --> 0:23:44,66
what happened.

472
0:23:45,04 --> 0:23:48,46
If structure changes significantly,
also a signal.

473
0:23:48,6 --> 0:23:51,5
Structure, by the way, maybe like
you don't need the whole structure,

474
0:23:51,5 --> 0:23:54,72
you probably just need to access
methods like you had 3 index

475
0:23:54,72 --> 0:23:58,28
scans, suddenly you have 2 index
scans and 1 heap scan, sequential

476
0:23:58,28 --> 0:23:59,16
scan, right?

477
0:23:59,16 --> 0:24:0,62
So it's bad, right?

478
0:24:0,62 --> 0:24:1,76
So what happened?

479
0:24:2,06 --> 0:24:4,6
Anyway, this holistic approach
is great.

480
0:24:4,6 --> 0:24:9,94
So you can have a huge amount of
plans, and then you have the

481
0:24:9,94 --> 0:24:13,3
ability to do the change and compare
before and after.

482
0:24:13,78 --> 0:24:17,08
Again, I haven't visited the topic
of default_statistics_target

483
0:24:17,08 --> 0:24:21,38
particularly in this context for
quite a while, but it would

484
0:24:21,38 --> 0:24:22,62
be great to do it.

485
0:24:22,74 --> 0:24:25,82
And again, I have a bunch of unfinished
thoughts.

486
0:24:26,18 --> 0:24:29,44
This topic, plan flips, is unfinished
in my head.

487
0:24:29,44 --> 0:24:31,3
I'm still thinking what's better.

488
0:24:31,48 --> 0:24:34,58
So once again, you don't need average
or something.

489
0:24:34,6 --> 0:24:37,74
You want, it's a classic optimization
problem.

490
0:24:38,0 --> 0:24:42,14
You want to optimize something
and you don't want everything

491
0:24:42,18 --> 0:24:44,82
else to degrade significantly.

492
0:24:45,9 --> 0:24:50,5
You say, I'm okay if something
degrades like maximum 10% on some

493
0:24:50,5 --> 0:24:53,36
metric There's a question which
metric to choose also, but I

494
0:24:53,36 --> 0:24:54,94
want this to be improved.

495
0:24:55,02 --> 0:24:56,1
This is 1 thing.

496
0:24:56,2 --> 0:25:0,32
So we have framework We don't have
all the answers yet Second

497
0:25:0,32 --> 0:25:4,98
thing is how to understand we are
on the edge of flip.

498
0:25:5,98 --> 0:25:8,36
Or cliff if you like, doesn't matter.

499
0:25:8,4 --> 0:25:8,8999
How to understand.

500
0:25:8,8999 --> 0:25:9,64
Yeah, I think,

501
0:25:10,14 --> 0:25:13,22
Michael: I actually don't know
if this is important as much as

502
0:25:13,22 --> 0:25:15,13
how do we stick with what we've
got.

503
0:25:15,13 --> 0:25:15,42
I think

504
0:25:15,42 --> 0:25:17,04
Nikolay: I want to predict the
flip.

505
0:25:17,42 --> 0:25:20,02
Michael: Yeah, but I think it's
really difficult to predict bad

506
0:25:20,02 --> 0:25:22,74
plan flips without also predicting
good ones.

507
0:25:22,74 --> 0:25:24,52
There would be so many false positives.

508
0:25:24,76 --> 0:25:27,02
If you think about how you would
do it.

509
0:25:27,54 --> 0:25:32,72
Nikolay: I forgot to mention that
in our case, We execute on

510
0:25:32,72 --> 0:25:36,5
real database, full size, full
clone, using database branching

511
0:25:36,5 --> 0:25:37,62
provisioned very quickly.

512
0:25:37,72 --> 0:25:42,42
And we see not only planner, but
executor in action, like realistic

513
0:25:42,62 --> 0:25:43,32
testing, right?

514
0:25:43,32 --> 0:25:44,68
Full-fledged realistic testing.

515
0:25:44,68 --> 0:25:51,3
We execute hundreds or thousands
of queries, samples properly

516
0:25:51,78 --> 0:25:52,28
collected.

517
0:25:53,3 --> 0:25:56,98
So it's not we have 100 samples
and 90 of them is the same query

518
0:25:56,98 --> 0:25:57,48
ID.

519
0:25:57,66 --> 0:26:1,1
We take query ID, we collect multiple
samples for each query

520
0:26:1,1 --> 0:26:4,12
ID, and then we execute before
and after the change.

521
0:26:4,12 --> 0:26:9,06
We developed this framework specifically
to predict plan flips

522
0:26:9,72 --> 0:26:11,06
for major upgrades.

523
0:26:12,88 --> 0:26:15,78
Michael: Yeah, which is, it's slightly
different, right?

524
0:26:15,78 --> 0:26:16,32
I think

525
0:26:16,32 --> 0:26:17,64
Nikolay: it's different.

526
0:26:17,68 --> 0:26:18,64
Michael: I think it's different.

527
0:26:18,96 --> 0:26:23,14
I think why it's different is often
people talk about plan flips

528
0:26:23,14 --> 0:26:26,98
in the context of I didn't change
anything and this thing just

529
0:26:26,98 --> 0:26:30,24
happened yeah whereas with a major
version upgrade you know in

530
0:26:30,24 --> 0:26:31,42
advance you're doing

531
0:26:31,42 --> 0:26:32,32
Nikolay: it you

532
0:26:32,32 --> 0:26:35,74
Michael: can look it yeah you you
have this reference point if

533
0:26:35,74 --> 0:26:36,28
we change you

534
0:26:36,28 --> 0:26:37,7
Nikolay: have new angel big

535
0:26:37,98 --> 0:26:41,96
Michael: yeah yeah So I think psychologically
at least it's different.

536
0:26:41,96 --> 0:26:45,06
I understand that in practice there's
a lot of the same harnesses.

537
0:26:45,06 --> 0:26:48,42
Nikolay: So that's why I'm saying
there are easy plan flips and

538
0:26:48,42 --> 0:26:49,94
there are difficult plan flips.

539
0:26:50,06 --> 0:26:53,5
Easy is when you did some action,
you upgraded Postgres.

540
0:26:53,56 --> 0:26:57,24
You jumped between 13 to 18, 5
versions for example.

541
0:26:57,32 --> 0:26:59,08
There are new nodes and plans,
right?

542
0:26:59,08 --> 0:27:0,32
Never existed before.

543
0:27:0,48 --> 0:27:1,22
So definitely

544
0:27:1,56 --> 0:27:4,32
Michael: new settings, new configuration
parameter defaults.

545
0:27:4,64 --> 0:27:7,8
Nikolay: In my opinion, plan flip
analysis should be a part of

546
0:27:7,8 --> 0:27:9,24
major Postgres upgrade.

547
0:27:9,24 --> 0:27:10,62
And we have it for our clients.

548
0:27:10,68 --> 0:27:12,22
Yeah, we have it for our clients.

549
0:27:12,5 --> 0:27:16,86
It still requires some, like It's
semi-automated right now.

550
0:27:16,86 --> 0:27:19,74
I hope at some point we will fully
automate it, and it should

552
0:27:21,76 --> 0:27:25,32
And it shines when you have Database
branching properly implemented.

553
0:27:26,28 --> 0:27:28,68
But there are difficult plan flips.

554
0:27:30,06 --> 0:27:36,54
So to explain how to properly categorize
it, let's a little bit

555
0:27:36,94 --> 0:27:42,04
stop and think about what the planner, what does it

556
0:27:42,04 --> 0:27:45,18
take into account when choosing plans?

557
0:27:45,36 --> 0:27:46,74
It's only 3 things.

558
0:27:47,78 --> 0:27:52,48
It's version, which defines the
version of the planner, basically,

559
0:27:52,66 --> 0:27:54,1
which engine we are using.

560
0:27:55,08 --> 0:28:0,26
And second, it's settings, not
all settings.

561
0:28:1,5 --> 0:28:4,48
My current vision is like the planner
settings plus work, plus

562
0:28:4,48 --> 0:28:5,22
work_mem.

563
0:28:5,22 --> 0:28:6,38
Why I'm saying plus work_mem?

564
0:28:6,38 --> 0:28:8,4
Because the planner settings, there
is a category.

565
0:28:8,42 --> 0:28:11,98
If you check pg_settings, 1 of
the columns will be category,

566
0:28:11,98 --> 0:28:12,44
right?

567
0:28:12,44 --> 0:28:15,52
And there is a category called
planner settings, but work_mem

568
0:28:15,52 --> 0:28:17,14
is beyond.

569
0:28:17,8 --> 0:28:19,44
So plus work_mem.

570
0:28:19,94 --> 0:28:23,58
And finally, the number 3 is pg_statistic
content.

571
0:28:25,02 --> 0:28:25,94
And that's it.

572
0:28:26,4 --> 0:28:30,98
It doesn't care about hardware
or actual data.

573
0:28:31,6 --> 0:28:32,22
Michael: Yeah, true.

574
0:28:32,22 --> 0:28:34,8
It's not actually the data volume,
it's how much data it thinks

575
0:28:34,8 --> 0:28:36,3
it has, which is the statistics.

576
0:28:36,6 --> 0:28:40,52
Nikolay: Yeah and if the statistics
is lagging, it might live

577
0:28:40,52 --> 0:28:41,46
in different reality.

578
0:28:41,46 --> 0:28:44,72
That's why I say I would prefer,
it might still be wrong, but

579
0:28:44,72 --> 0:28:48,24
I would prefer it be wrong sooner
and be closer to reality, right?

580
0:28:48,24 --> 0:28:48,74
Yeah.

581
0:28:49,08 --> 0:28:52,26
So when we change engine, it's
obvious we know when it happens.

582
0:28:52,66 --> 0:28:55,74
And before that, we can perform
this plan flip analysis holistically

583
0:28:56,12 --> 0:28:57,18
according to our methodology.

584
0:28:57,18 --> 0:28:57,78
This is great.

585
0:28:57,78 --> 0:29:2,22
We can predict this plan flip and
prepare a resolution, solution

586
0:29:2,22 --> 0:29:3,34
for it in advance.

587
0:29:3,48 --> 0:29:5,04
This is what we do with clients.

588
0:29:5,74 --> 0:29:10,76
Second, if we change settings,
basically it's the same.

589
0:29:10,76 --> 0:29:12,18
We also know when we do it.

590
0:29:12,18 --> 0:29:14,64
For example, okay, we're going
to raise default_statistics_target

591
0:29:14,64 --> 0:29:15,14
globally.

592
0:29:16,02 --> 0:29:17,54
This will help to this plan.

593
0:29:17,58 --> 0:29:20,44
Question, how will other plans
behave?

594
0:29:21,02 --> 0:29:24,36
We don't want them to degrade significantly,
at least now.

595
0:29:24,96 --> 0:29:27,08
Same approach can happen, it can
be applied.

596
0:29:27,88 --> 0:29:31,24
The third thing is pg_statistic
content is changing.

597
0:29:32,1 --> 0:29:36,32
Again I would prefer to be changing
more often, like continuously

598
0:29:36,6 --> 0:29:40,08
changing so more often an autoanalyze
should happen more often

599
0:29:40,08 --> 0:29:41,54
to reflect the current reality.

600
0:29:41,88 --> 0:29:43,82
For this we need autovacuum tuning.

601
0:29:44,44 --> 0:29:46,26
And Then how to predict?

602
0:29:46,92 --> 0:29:49,84
Michael: Also, you missed something
which is we could gather

603
0:29:49,84 --> 0:29:52,78
more statistics, like it's a sample
and we could increase that.

604
0:29:52,78 --> 0:29:54,96
Nikolay: Yeah, this is mitigation
activity already.

605
0:29:54,96 --> 0:29:59,0
I agree that moving from 100 to
1,000 sometimes might make sense.

606
0:29:59,44 --> 0:30:2,12
And honestly, like, I don't have
strong opinion here.

607
0:30:2,12 --> 0:30:4,62
I have clients with 100, with 1000.

608
0:30:4,96 --> 0:30:8,3
In the past, I had clients who
thought that going down to 10

609
0:30:8,3 --> 0:30:9,24
is a good idea.

610
0:30:9,24 --> 0:30:11,18
I don't have them anymore somehow.

611
0:30:13,58 --> 0:30:17,8
Somehow right now, all the thoughts
are to direction of raising

612
0:30:17,8 --> 0:30:20,82
this, not to reducing this.

613
0:30:20,94 --> 0:30:23,04
Yeah, but I remember we had such
discussions.

614
0:30:23,14 --> 0:30:24,5
Maybe we should raise it.

615
0:30:24,52 --> 0:30:27,24
Counterintuitive ideas, sometimes
we know, and we don't have

616
0:30:27,24 --> 0:30:27,9
it anymore.

617
0:30:27,98 --> 0:30:30,92
Anyway, 2 first things we usually
control.

618
0:30:31,64 --> 0:30:35,24
We don't have self-driving Postgres
yet, so we control it.

619
0:30:35,24 --> 0:30:38,56
And we can perform plan flip analysis
if we have such tool.

620
0:30:38,56 --> 0:30:39,06
But...

621
0:30:39,14 --> 0:30:40,58
Michael: Wait, there's 1 more thing.

622
0:30:40,84 --> 0:30:44,76
It's not available everywhere,
but another thing that does affect

623
0:30:44,76 --> 0:30:47,9
this is, for example, if you're
using hints with pg_hint_plan.

624
0:30:49,22 --> 0:30:50,24
Nikolay: Okay, hints.

625
0:30:50,82 --> 0:30:53,8
Michael: I think that's a big topic,
though, in production.

626
0:30:54,28 --> 0:30:55,36
Nikolay: That's a great topic.

627
0:30:55,38 --> 0:30:59,86
Let's consider the exclusion because
it's not a major approach

628
0:30:59,86 --> 0:31:0,25
right now.

629
0:31:0,25 --> 0:31:3,04
It's not, it's like very rare.

630
0:31:3,04 --> 0:31:3,76
Is that true?

631
0:31:4,2 --> 0:31:4,76
I hope so.

632
0:31:4,76 --> 0:31:8,6
Michael: I think in other databases,
this is actually often 1

633
0:31:8,6 --> 0:31:8,98
of the main.

634
0:31:8,98 --> 0:31:9,78
Yeah, and

635
0:31:10,08 --> 0:31:12,52
Nikolay: you just try to open another
like gate or something

636
0:31:12,52 --> 0:31:15,92
and I agree let's open it but let's
just finish with this second

637
0:31:15,92 --> 0:31:17,88
plan while I think about second
plan.

638
0:31:17,88 --> 0:31:22,2
Again, if you change major version,
you control it, you can predict

639
0:31:22,2 --> 0:31:22,94
plan flip.

640
0:31:23,0 --> 0:31:25,22
If you change settings, the same
thing.

641
0:31:25,24 --> 0:31:27,14
You control it, you can predict.

642
0:31:29,32 --> 0:31:34,64
If changing pg_statistic content
due to data changes leads to

643
0:31:34,64 --> 0:31:35,64
some plan flip.

644
0:31:36,42 --> 0:31:37,62
This is hard to predict.

645
0:31:38,24 --> 0:31:39,46
How to predict it?

646
0:31:40,48 --> 0:31:45,4
When I think about it, I want to
be able to check the second

647
0:31:45,4 --> 0:31:45,9
plan.

648
0:31:46,26 --> 0:31:51,44
Or maybe first 5 plans which were
like almost winning because

649
0:31:51,44 --> 0:31:56,16
in this case I can say okay I see
the winning plan current plan

650
0:31:56,16 --> 0:32:1,3
has cost say 500 but also there
is a second plan which has cost

651
0:32:1,56 --> 0:32:4,22
490 and it's quite close.

652
0:32:4,9 --> 0:32:9,78
If it's close, say, below 20%,
if difference is below 20% I will

653
0:32:9,78 --> 0:32:14,96
start thinking might it flip soon
because this is why I want

654
0:32:14,96 --> 0:32:15,48
second plan.

655
0:32:15,48 --> 0:32:17,94
Michael: I think this is incredibly
tricky though.

656
0:32:18,94 --> 0:32:23,42
Maybe it's simple, but for example,
parameters matter a lot.

657
0:32:23,42 --> 0:32:25,52
Like per query, different parameters.

658
0:32:25,52 --> 0:32:26,9
If you get, for example...

659
0:32:26,9 --> 0:32:28,14
Nikolay: Oh yes, parameters matter.

660
0:32:28,14 --> 0:32:28,98
Yeah, I agree.

661
0:32:29,1 --> 0:32:30,04
Michael: Yeah, a huge amount.

662
0:32:30,04 --> 0:32:33,3
If you look at the solutions that
SQL Server and Oracle have,

663
0:32:33,4 --> 0:32:36,68
bear with me because I think SQL
Server and Oracle are a bit

664
0:32:36,68 --> 0:32:37,86
further ahead of Postgres.

665
0:32:38,42 --> 0:32:41,78
On this, maybe DB2 sometimes has
some features in this kind of

666
0:32:41,78 --> 0:32:42,28
area.

667
0:32:42,4 --> 0:32:46,86
But the solutions they've ended
up implementing are around forcing

668
0:32:46,94 --> 0:32:47,6
the existing.

669
0:32:47,86 --> 0:32:49,38
Yeah, so plan pinning.

670
0:32:49,46 --> 0:32:50,36
Nikolay: Aurora has it.

671
0:32:50,36 --> 0:32:51,64
Postgres Aurora has it.

672
0:32:52,14 --> 0:32:52,64
Michael: Interesting.

673
0:32:53,1 --> 0:32:56,38
So yeah, these are the solutions
that other databases have ended

674
0:32:56,38 --> 0:32:59,82
up implementing rather than detection
of, rather than trying

675
0:32:59,82 --> 0:33:3,96
to predict a plan flip, what they
try and do is pin it or freeze

676
0:33:3,96 --> 0:33:6,98
it and then maybe require approval
when it does flip.

677
0:33:6,98 --> 0:33:10,48
So instead of trying to predict
it, they prevent it and then

678
0:33:10,48 --> 0:33:12,78
have you have it as like an opt-in
thing.

679
0:33:12,8 --> 0:33:17,04
Maybe this change for the better,
But maybe you want someone

680
0:33:17,04 --> 0:33:18,18
to approve it first.

681
0:33:18,55 --> 0:33:20,74
Nikolay: So there are several things
here.

682
0:33:20,74 --> 0:33:21,9
You touched several things.

683
0:33:21,9 --> 0:33:23,46
First, depending on parameters.

684
0:33:23,5 --> 0:33:24,56
I agree with you.

685
0:33:24,86 --> 0:33:29,7
But in a system which is quite
big, we have usually like Pareto

686
0:33:29,7 --> 0:33:31,1
principle is working basically.

687
0:33:31,1 --> 0:33:36,94
Like A lot of workload is defined
by a limited number of specific

688
0:33:37,54 --> 0:33:38,04
plans.

689
0:33:39,06 --> 0:33:41,76
And this is how we...

690
0:33:42,04 --> 0:33:46,26
In our plan flip analysis, we have
methodology to identify not

691
0:33:46,26 --> 0:33:50,24
only normalized queries from pg_stat_statements, but also particular

692
0:33:50,28 --> 0:33:52,66
examples, particular parameters.

693
0:33:53,2 --> 0:33:58,22
So we would reproduce exact queries
with exact execution, so

694
0:33:58,22 --> 0:34:1,76
plans, which contribute to workload
the most.

695
0:34:1,96 --> 0:34:5,46
For example, contribute in terms
of calls or in terms of total

696
0:34:5,46 --> 0:34:7,12
execution and planned time.

697
0:34:7,8 --> 0:34:12,38
So we want to cover like 80% of
workload in several directions,

698
0:34:12,38 --> 0:34:13,48
in several metrics.

699
0:34:15,52 --> 0:34:20,18
And this is the magic, like how
we collect those samples and

700
0:34:20,46 --> 0:34:26,96
prepare our testing query set for
testing, so it would represent

701
0:34:26,98 --> 0:34:28,24
the majority of workload.

702
0:34:28,32 --> 0:34:32,22
Once we found a way to do it, there
is some tail.

703
0:34:33,54 --> 0:34:36,3
Those queries are less important
because if you just think about

704
0:34:36,3 --> 0:34:41,18
database time in terms of execution
and planning, if some query

705
0:34:41,92 --> 0:34:47,14
defines only minority of that time,
even if plan flips, it wouldn't

706
0:34:47,14 --> 0:34:50,32
affect the whole cluster so much
right yeah, that's make sense

707
0:34:50,32 --> 0:34:50,82
right

708
0:34:51,14 --> 0:34:53,98
Michael: It does and actually reminds
me of a different idea

709
0:34:54,28 --> 0:34:57,18
Nikolay: Maybe I'm wrong you know
what it might be like okay

710
0:34:57,18 --> 0:35:0,8
now It's only like 1% of the whole
time and London due to plan

711
0:35:0,8 --> 0:35:3,02
flip boom authority half of time

712
0:35:3,62 --> 0:35:5,86
Michael: It reminds me though of
something that I think could

713
0:35:5,86 --> 0:35:8,68
help here, it's not in Postgres,
actually maybe there's a couple

714
0:35:8,68 --> 0:35:12,66
of examples in Postgres core at
the moment, but the idea that

715
0:35:12,88 --> 0:35:16,5
the plan is chosen and then is
stuck to no matter what during

716
0:35:16,5 --> 0:35:19,64
execution, it could be, I think
Tom has actually brought this

717
0:35:19,64 --> 0:35:22,96
up at the end of our call on performance
cliffs, it could be

718
0:35:22,96 --> 0:35:24,22
adaptive at runtime.

719
0:35:24,44 --> 0:35:30,06
If it realizes it's doing, it's
chosen a bad plan, instead of

720
0:35:30,06 --> 0:35:30,92
sticking to it till

721
0:35:30,92 --> 0:35:31,56
Nikolay: the end.

722
0:35:31,56 --> 0:35:33,22
We should invite Andrey Lepikhov.

723
0:35:33,6 --> 0:35:38,72
Andrey Lepikhov, who I think, yeah,
who can talk about this adaptive

724
0:35:38,72 --> 0:35:40,18
query optimization a lot.

725
0:35:40,58 --> 0:35:42,84
I consider him an expert in this
area.

726
0:35:42,84 --> 0:35:43,34
Michael: Great.

727
0:35:43,5 --> 0:35:44,8
But do you see what I mean?

728
0:35:45,04 --> 0:35:47,62
It's able to adapt during execution.

729
0:35:47,84 --> 0:35:51,9
Maybe we avoid the issue, or at
least it's way less bad when

730
0:35:51,9 --> 0:35:52,2
it happens.

731
0:35:52,2 --> 0:35:52,58
Nikolay: Yeah.

732
0:35:52,58 --> 0:35:57,38
And also, like, to finish my thought,
maybe I said maybe this

733
0:35:57,38 --> 0:35:58,34
approach is wrong.

734
0:35:58,34 --> 0:36:3,38
Maybe if some plan which contributed
only like 1% to whole database

735
0:36:3,38 --> 0:36:4,8
time, now suddenly half.

736
0:36:5,74 --> 0:36:9,98
Because of this, I think my gut
told me that we need multiple

737
0:36:10,08 --> 0:36:10,58
directions.

738
0:36:11,02 --> 0:36:15,64
1 of the directions when we collect
this working set, we also

739
0:36:15,64 --> 0:36:16,7
order by calls.

740
0:36:17,08 --> 0:36:18,58
So it should be quite...

741
0:36:18,76 --> 0:36:23,96
So for a query which had the plan
flip to influence the whole

742
0:36:23,96 --> 0:36:27,72
cluster so much that it causes
downtime, it should be quite frequent

743
0:36:27,72 --> 0:36:28,62
enough, I think.

744
0:36:28,62 --> 0:36:32,48
If it's super infrequent, It should
be fine.

745
0:36:32,54 --> 0:36:36,92
If it's not super frequent, it
means that it won't occupy multiple

746
0:36:37,2 --> 0:36:38,84
backends in parallel.

747
0:36:39,4 --> 0:36:41,14
Michael: Unless it gets really
bad, yeah.

748
0:36:41,58 --> 0:36:44,92
Nikolay: Unless you forgot to tune
the statement timeout or transaction

749
0:36:44,96 --> 0:36:45,46
timeout.

750
0:36:45,72 --> 0:36:46,0
True.

751
0:36:46,0 --> 0:36:49,4
Yeah, we have multiple vectors and I think this is a good approach,

752
0:36:49,4 --> 0:36:51,22
good enough approach, maybe it can be improved.

753
0:36:51,22 --> 0:36:53,14
So I'm very curious about this area.

754
0:36:53,26 --> 0:36:54,64
And I'm coming from practice.

755
0:36:55,3 --> 0:36:57,88
Michael: Yeah, I do think there's still complex cases where you

756
0:36:57,88 --> 0:37:1,36
might have clusters of even the same query in pg_stat_statements

757
0:37:1,44 --> 0:37:5,06
could have 4 different plans that are actually all good.

758
0:37:5,06 --> 0:37:8,32
But let's say you're a multi-tenant system, maybe small customers

759
0:37:8,32 --> 0:37:11,72
look like, or tiny customers have this plan, huge customers that

760
0:37:11,72 --> 0:37:15,14
are typical have this plan, huge customers that have this weird

761
0:37:15,14 --> 0:37:17,58
distribution, or maybe like healthcare customers look different

762
0:37:17,58 --> 0:37:18,66
for some other concern.

763
0:37:18,74 --> 0:37:19,08
Yeah.

764
0:37:19,08 --> 0:37:19,54
Nikolay: Yeah.

765
0:37:19,54 --> 0:37:19,78
Yeah.

766
0:37:19,78 --> 0:37:23,0
And in analysis, again, I said, like, we collect multiple plans

767
0:37:23,0 --> 0:37:24,06
per query ID.

768
0:37:24,52 --> 0:37:25,2
Why multiple?

769
0:37:25,2 --> 0:37:26,18
Because of this.

770
0:37:26,24 --> 0:37:31,36
But so we like say top 5 for each query ID, we collect multiple

771
0:37:31,42 --> 0:37:31,88
them.

772
0:37:31,88 --> 0:37:37,96
But all of them are like, we take from this top end by each metric.

773
0:37:38,26 --> 0:37:43,58
So we try to capture like the concept of how influential this

774
0:37:43,58 --> 0:37:46,84
query is and multiple plans might appear there.

775
0:37:46,84 --> 0:37:50,4
It would be great to have plan ID, right, and shift aggregation

776
0:37:50,74 --> 0:37:53,96
logic there, but we have query ideas remain.

777
0:37:54,49 --> 0:37:54,99
I

778
0:37:55,52 --> 0:37:58,82
Michael: think pganalyze, Luke, I think did some work on a

779
0:37:58,82 --> 0:37:59,62
plan ID.

780
0:37:59,8 --> 0:38:2,64
Nikolay: Yeah, yeah, there's a, I saw it and we, yeah, I commented

781
0:38:2,64 --> 0:38:4,86
on LinkedIn and it's great work.

782
0:38:5,32 --> 0:38:6,5
Absolutely needed, I think.

783
0:38:6,5 --> 0:38:7,58
Yeah, it's great work.

784
0:38:7,72 --> 0:38:14,64
So then, like, if we have this, the question, 2 questions, like,

785
0:38:14,64 --> 0:38:21,02
first, like, how to understand that some plan candidate is approaching

786
0:38:21,02 --> 0:38:22,7
and suddenly can flip soon.

787
0:38:23,14 --> 0:38:26,3
Because data is in large tables, data usually is not changing

788
0:38:26,84 --> 0:38:27,78
too much suddenly.

789
0:38:27,84 --> 0:38:28,52
I agree with you.

790
0:38:28,52 --> 0:38:29,56
Michael: It's like gradual.

791
0:38:29,96 --> 0:38:30,18
Nikolay: Yeah.

792
0:38:30,18 --> 0:38:31,72
Yeah, it's quite gradual.

793
0:38:31,86 --> 0:38:37,24
So the planner behavior involving large tables, it's dangerous

794
0:38:37,24 --> 0:38:38,5
when we have 2 plans.

795
0:38:38,5 --> 0:38:40,08
This is my hypothesis, right?

796
0:38:40,08 --> 0:38:42,98
So again, for me, a lot of open questions still.

797
0:38:42,98 --> 0:38:44,7
It might be on the edge.

798
0:38:44,7 --> 0:38:51,76
And I want to feel this on the edge for each query idea or for

799
0:38:51,76 --> 0:38:55,38
each plan in advance, slightly in advance to be able to predict.

800
0:38:55,38 --> 0:39:0,58
I think someone recently had some work in this area to show Plan

801
0:39:0,58 --> 0:39:1,08
candidates.

802
0:39:1,16 --> 0:39:4,4
I might be mistaken, but I saw something somewhere It didn't

803
0:39:4,4 --> 0:39:8,56
dive deep and the second thing is in my opinion hints and plan

804
0:39:8,56 --> 0:39:14,18
freeze are 2 great features I can I put them to enterprise But

805
0:39:14,18 --> 0:39:17,02
yeah basket like this is for me?

806
0:39:17,02 --> 0:39:21,9
It's enterprise thing because it's interesting doing hints doing

807
0:39:21,9 --> 0:39:27,32
in plant phase for larger systems, probably we should have both.

808
0:39:28,26 --> 0:39:32,28
But I also understand the reasoning that it makes system complex

809
0:39:32,78 --> 0:39:34,78
and it requires a lot of management.

810
0:39:35,0 --> 0:39:36,42
There is a price you pay.

811
0:39:36,42 --> 0:39:39,84
If you put something, you start to be responsible for it.

812
0:39:39,84 --> 0:39:41,24
You need to maintain it.

813
0:39:41,24 --> 0:39:44,34
And if you throw some plan, the same thing, you, you're responsible

814
0:39:44,44 --> 0:39:45,14
now, right?

815
0:39:45,18 --> 0:39:49,12
Basically, do we have capacity to manage it?

816
0:39:49,12 --> 0:39:53,0
Because maybe later it helps less and maybe then it's not helping

817
0:39:53,0 --> 0:39:53,88
but opposite.

818
0:39:53,88 --> 0:39:54,94
That's the problem.

819
0:39:55,44 --> 0:39:55,92
Michael: Absolutely.

820
0:39:55,92 --> 0:40:0,72
But I do think it's interesting that these other databases, given

821
0:40:0,72 --> 0:40:2,8
the choice between the 2, went in that direction.

822
0:40:2,8 --> 0:40:5,54
I'm not saying it's a good idea, but they've clearly thought

823
0:40:5,54 --> 0:40:7,96
about it and there's some merit to it.

824
0:40:7,96 --> 0:40:10,48
And it's been working for many years, of course, with issues,

825
0:40:10,48 --> 0:40:15,8
of course, with unintended side effects, but it is a solution

826
0:40:15,92 --> 0:40:17,12
I think worth considering.

827
0:40:17,9 --> 0:40:19,2
So yeah, this is good.

828
0:40:19,2 --> 0:40:20,88
I thought it was actually really interesting.

829
0:40:20,9 --> 0:40:22,24
Nikolay: A lot of open questions.

830
0:40:23,62 --> 0:40:27,98
Of course I wish, I saw you send me this Postgres 19 improvements

831
0:40:27,98 --> 0:40:31,08
in terms of plan management and hidden plans mentioned there.

832
0:40:31,26 --> 0:40:36,1
I wish Aurora team open sourced this.

833
0:40:36,74 --> 0:40:38,04
It's not Aurora specifically.

834
0:40:38,86 --> 0:40:43,16
It's called APG plan, MGMT extension.

835
0:40:44,84 --> 0:40:46,14
APG plan management.

836
0:40:46,64 --> 0:40:46,96
Yeah.

837
0:40:46,96 --> 0:40:48,18
So it's great thing.

838
0:40:48,82 --> 0:40:52,7
And I wish it existed in the open source ecosystem.

839
0:40:52,72 --> 0:40:56,18
I think eventually it should exist in open source ecosystem.

840
0:40:56,88 --> 0:41:1,96
Because when I would use it, When suddenly we have plan flip,

841
0:41:1,96 --> 0:41:6,86
we know what plan we would prefer, and let's just pin it right

842
0:41:6,86 --> 0:41:11,16
now before we apply permanent solution, just mitigate as fast

843
0:41:11,16 --> 0:41:13,06
as possible without any deployments.

844
0:41:13,82 --> 0:41:18,7
Michael: But let's say we're Clark, and we know we've got 15

845
0:41:18,7 --> 0:41:22,64
to 20 really important queries, they're on the hot path for every

846
0:41:22,64 --> 0:41:26,68
single auth request, they're really frequent, they're thousands

847
0:41:26,68 --> 0:41:30,44
of times a second, we could just, we could deliberately pin all

848
0:41:30,44 --> 0:41:32,68
of those or put hints into all of those

849
0:41:32,68 --> 0:41:33,12
Nikolay: to make

850
0:41:33,12 --> 0:41:34,34
Michael: them really unlike...

851
0:41:34,82 --> 0:41:35,52
Pardon me?

852
0:41:35,74 --> 0:41:36,88
Nikolay: So this is different.

853
0:41:36,9 --> 0:41:41,2
So what I'm saying is temporarily
pin before we apply permanent

854
0:41:41,2 --> 0:41:41,7
solution.

855
0:41:42,28 --> 0:41:47,9
What you say, permanent pin, which
means there might be some

856
0:41:47,9 --> 0:41:50,58
cost you will pay to maintain it.

857
0:41:51,74 --> 0:41:55,96
Michael: Maybe yes, but my point
is at least we then can guarantee

858
0:41:56,0 --> 0:41:57,04
it won't go bad.

859
0:41:57,04 --> 0:42:0,02
Like it maybe it will gradually
degrade and we can look at fixing

860
0:42:0,02 --> 0:42:1,48
if it's going to degrade it.

861
0:42:1,48 --> 0:42:5,18
I think it's more likely to degrade
gradually instead of this

862
0:42:5,32 --> 0:42:9,64
all of a sudden, which I think
we can then fix Monday to Friday

863
0:42:9,64 --> 0:42:12,88
9 till 5 instead of when there's
a huge database outage.

864
0:42:12,88 --> 0:42:14,88
Nikolay: And again, just partition
your tables.

865
0:42:15,3 --> 0:42:18,36
I know it costs some effort, but
with AI it's easier.

866
0:42:19,54 --> 0:42:22,02
Michael: And gather statistics
on the parent Partition.

867
0:42:23,3 --> 0:42:24,78
Nikolay: Yeah, which is another
topic.

868
0:42:24,86 --> 0:42:25,58
It's fixed.

869
0:42:25,58 --> 0:42:27,98
Laurenz Albe from CYBERTEC fixed
it.

870
0:42:27,98 --> 0:42:29,48
I think it will be in 19.

871
0:42:30,16 --> 0:42:30,66
Michael: Nice.

872
0:42:30,78 --> 0:42:32,54
So automatic gathering of statistics.

873
0:42:33,74 --> 0:42:36,24
Nikolay: So it's a headache during
upgrades especially.

874
0:42:36,42 --> 0:42:37,72
I think it's solved now.

875
0:42:37,72 --> 0:42:38,68
I mean now it means

876
0:42:38,68 --> 0:42:39,56
Michael: in the future.

877
0:42:39,78 --> 0:42:40,52
Nikolay: In the future.

878
0:42:40,52 --> 0:42:41,02
Yeah.

879
0:42:42,12 --> 0:42:42,63
Great talk.

880
0:42:42,63 --> 0:42:43,38
Nice 1.

881
0:42:43,44 --> 0:42:43,78
Yeah.

882
0:42:43,78 --> 0:42:44,24
Enjoyed.

883
0:42:44,24 --> 0:42:44,94
Thank you.

884
0:42:45,28 --> 0:42:45,78
Michael: Likewise.

885
0:42:45,8 --> 0:42:46,84
Thanks so much and catch

886
0:42:46,84 --> 0:42:46,855
you next time.