1
00:00:00,140 --> 00:00:03,180
Nikolay: Hello, hello, this is
PostgresFM, episode number 96.

2
00:00:03,900 --> 00:00:08,400
My name is Nikolay, founder of
Postgres.AI, and as usual, my

3
00:00:08,400 --> 00:00:10,780
co-host is Michael from pgMustard.

4
00:00:10,920 --> 00:00:11,700
Hi, Michael.

5
00:00:12,100 --> 00:00:13,000
Michael: Hello, Nikolay.

6
00:00:13,660 --> 00:00:18,220
Nikolay: So pgMustard, as I mentioned
last time, is focusing on

7
00:00:18,420 --> 00:00:21,800
explain plans and helping optimize
Postgres queries.

8
00:00:22,900 --> 00:00:28,180
And the topic I chose this time
is very, very close to that topic,

9
00:00:28,180 --> 00:00:28,440
right?

10
00:00:28,440 --> 00:00:29,640
I mean, it's a subtopic.

11
00:00:30,140 --> 00:00:35,900
I know many people discuss a lot
of very in-depth materials around

12
00:00:35,900 --> 00:00:40,580
this topic, but let's just focus
on basics just for awareness.

13
00:00:41,360 --> 00:00:46,220
This topic is how plan caching
works and generic versus custom

14
00:00:46,220 --> 00:00:46,720
plans.

15
00:00:47,260 --> 00:00:50,820
Because at a very high level, we
don't know about them at all.

16
00:00:50,860 --> 00:00:53,440
We know that EXPLAIN shows us a
plan.

17
00:00:54,140 --> 00:00:55,680
We know there is plan caching.

18
00:00:56,140 --> 00:01:00,700
We know pgBouncer since recently
started supporting prepared

19
00:01:00,700 --> 00:01:04,360
statements, so you can avoid planning
time, because sometimes

20
00:01:04,360 --> 00:01:05,340
it's very costly.

21
00:01:05,640 --> 00:01:08,300
For example, if you have a lot
of partitions, a lot of indexes,

22
00:01:09,160 --> 00:01:11,180
log manager contention can happen.

23
00:01:11,840 --> 00:01:15,820
So it's good, especially for queries
which have very high frequency,

24
00:01:15,820 --> 00:01:21,040
for example, a thousand times per
second, it's good to just perform

25
00:01:21,040 --> 00:01:26,120
planning once per session and then
rely on the cached plan.

26
00:01:26,120 --> 00:01:28,640
But it's not really once per session,
right?

27
00:01:28,660 --> 00:01:31,100
There are more nuances in this
area.

28
00:01:31,580 --> 00:01:35,640
So let's uncover them and discuss.

29
00:01:36,700 --> 00:01:37,500
Michael: Yeah, nice.

30
00:01:37,580 --> 00:01:41,680
At a kind of high level, I see
people that are using ORMs and

31
00:01:41,680 --> 00:01:45,420
frameworks using prepared statements,
sometimes without realizing

32
00:01:45,420 --> 00:01:47,840
it, or sometimes they know but
they don't really know what it

33
00:01:47,840 --> 00:01:48,340
means.

34
00:01:48,520 --> 00:01:52,260
Do you see a lot of people, like
is that the same in your experience?

35
00:01:52,440 --> 00:01:55,640
A lot of people using this without
realizing, or are you seeing

36
00:01:55,640 --> 00:01:58,740
quite a lot of conscious choice,
like conscious use of prepared

37
00:01:58,740 --> 00:02:02,220
statements, even for people that
aren't using it via an ORM or

38
00:02:02,220 --> 00:02:02,720
similar?

39
00:02:03,240 --> 00:02:07,200
Nikolay: Well, you know my point
of view on this world.

40
00:02:07,200 --> 00:02:10,480
On one hand, I think people are quite
smart in general,

41
00:02:11,200 --> 00:02:11,380
Michael: but

42
00:02:11,380 --> 00:02:14,420
Nikolay: they are smart only when
they are very motivated.

43
00:02:15,360 --> 00:02:18,740
And we know a lot of engineers
don't distinguish between binary tree

44
00:02:18,740 --> 00:02:23,000
and B-tree at all, because they
are not motivated enough to understand

45
00:02:23,040 --> 00:02:23,760
the differences.

46
00:02:24,480 --> 00:02:28,120
Similarly here, a lot of people,
in my opinion, don't understand

47
00:02:28,220 --> 00:02:34,860
that one normalized query in pg_stat_statements
might have different

48
00:02:34,860 --> 00:02:36,300
plans depending on parameters.

49
00:02:36,460 --> 00:02:43,000
This simple truth is something
that should be understood, but

50
00:02:43,000 --> 00:02:46,640
it can be understood only when
you are motivated, when you have

51
00:02:46,640 --> 00:02:49,700
some incident, for example, and
you start off thinking, oh, okay,

52
00:02:49,860 --> 00:02:54,020
this query can behave differently
depending on parameters.

53
00:02:55,080 --> 00:02:57,780
So this is like basic knowledge.

54
00:02:58,780 --> 00:03:02,280
And honestly, I didn't realize
it for quite long in my career

55
00:03:02,280 --> 00:03:02,940
as well.

56
00:03:03,700 --> 00:03:06,140
I thought, okay, this is so, but
why is it important?

57
00:03:06,140 --> 00:03:09,780
Well, it's very important because
looking at pg_stat_statements

58
00:03:09,940 --> 00:03:15,480
and understanding heavy queries,
it's not easy to jump to optimizing

59
00:03:15,480 --> 00:03:21,360
these queries, especially when
we talk about mixed situation,

60
00:03:21,420 --> 00:03:26,260
when it's also quite high frequency
query and also it has issues

61
00:03:26,260 --> 00:03:30,300
with not perfect indexes chosen,
not perfect plan, like not only

62
00:03:30,300 --> 00:03:32,140
just index only scan, that's it.

63
00:03:32,680 --> 00:03:37,340
So it means that we need to understand,
okay, we depend on parameters,

64
00:03:37,540 --> 00:03:41,420
but which parameters should we
choose for optimization?

65
00:03:42,500 --> 00:03:45,420
Michael: When you say for optimization,
do you mean like we get

66
00:03:45,420 --> 00:03:48,300
the PG stat statements and it has
these parameterized queries,

67
00:03:48,660 --> 00:03:53,500
which ones do we put in in order
to run, explain, analyze buffers?

68
00:03:54,060 --> 00:03:55,280
Nikolay: Right, right, right.

69
00:03:55,760 --> 00:03:59,440
This is unfortunately a big gap
between macro and micro levels

70
00:03:59,920 --> 00:04:02,340
of query analysis, which still
exists.

71
00:04:03,440 --> 00:04:08,660
Of course, with the addition of
query ID to logs and to pg_stat_activity,

72
00:04:09,720 --> 00:04:13,320
we have this bridge between pg_stat_statements and individual

73
00:04:13,540 --> 00:04:16,420
queries, which we can find in logs
or pg_stat_activity.

74
00:04:17,380 --> 00:04:22,460
But still, I remember 1 lesson
I learned in practice when I spent

75
00:04:22,740 --> 00:04:25,420
1 hour or so optimizing some query.

76
00:04:26,260 --> 00:04:28,780
It was some parameter which was
true or false.

77
00:04:30,060 --> 00:04:35,780
And I was trying to optimize for
value which was present in 90%

78
00:04:36,100 --> 00:04:38,100
of rows, majority of rows.

79
00:04:38,860 --> 00:04:42,380
And it was, of course, difficult
because selectivity is low,

80
00:04:42,620 --> 00:04:44,480
cardinality is high, right?

81
00:04:45,160 --> 00:04:51,440
But then I understood all this
hour spent was wasted because

82
00:04:52,260 --> 00:04:53,260
it's never solved.

83
00:04:53,680 --> 00:05:00,540
Never, ever, application runs this
value, uses this value in

84
00:05:00,540 --> 00:05:01,040
query.

85
00:05:01,700 --> 00:05:06,600
So The idea is that we need to
understand real cases and also

86
00:05:06,660 --> 00:05:11,600
not super rare cases or non-existing
at all cases, but we need

87
00:05:11,600 --> 00:05:16,300
to understand the cases which hurt
user experience, for example,

88
00:05:16,720 --> 00:05:18,500
and focus on them.

89
00:05:18,940 --> 00:05:23,620
In this case, generic plan is something
really weird in this

90
00:05:23,620 --> 00:05:26,500
picture I just painted, right?

91
00:05:26,820 --> 00:05:28,780
So what do you think?

92
00:05:28,780 --> 00:05:32,860
Like generic and custom and maybe
let's talk about how planner

93
00:05:33,260 --> 00:05:39,020
uses them and then discuss what
should we do about it at all.

94
00:05:39,160 --> 00:05:42,940
So this 5 times rule, what do you
think?

95
00:05:42,980 --> 00:05:45,260
Michael: Yeah, I think that catches
people out sometimes.

96
00:05:45,360 --> 00:05:49,420
So the way prepared statements
works, at least my understanding,

97
00:05:49,920 --> 00:05:54,160
is that you can prepare a query
in advance.

98
00:05:54,160 --> 00:05:56,700
And that will do, like, the parse
stage.

99
00:05:57,260 --> 00:06:00,900
And it will do, like, several stages
up to but not including

100
00:06:00,960 --> 00:06:05,640
the execution of that query, then
you can ask it to execute with

101
00:06:05,640 --> 00:06:06,140
parameters.

102
00:06:07,540 --> 00:06:10,960
Now, that then, on the subsequent
executions, you don't have

103
00:06:10,960 --> 00:06:15,360
to reparse, like do the set, like
do several of those.

104
00:06:15,480 --> 00:06:19,400
Well, planning at first does have
to be done again, right?

105
00:06:19,400 --> 00:06:22,460
So it's the parsing and the rewriting
stages.

106
00:06:22,960 --> 00:06:28,520
Yes, on a per session basis, we
don't have to do several of those

107
00:06:28,520 --> 00:06:29,020
steps.

108
00:06:29,440 --> 00:06:32,640
But there is also this weird kind
of initial phase.

109
00:06:32,640 --> 00:06:35,500
So if you've done prepare, if you've
prepared the statement and

110
00:06:35,500 --> 00:06:40,940
then you execute it, the first
5 times, Postgres will plan each

111
00:06:40,940 --> 00:06:44,200
of those 5 executions separately
based on the parameters that

112
00:06:44,200 --> 00:06:44,880
are given.

113
00:06:45,380 --> 00:06:49,600
So if those are in the case you
gave if you used true was it

114
00:06:49,600 --> 00:06:52,680
that was 90% of time You've used
true every single time for those

115
00:06:52,680 --> 00:06:58,440
5 times It's gonna plan it the
same way each time and on the

116
00:06:58,440 --> 00:07:02,820
sixth execution it will then look
at the cost of planning.

117
00:07:03,940 --> 00:07:09,360
And if that cost is less than a
generic, or if it's more than

118
00:07:09,360 --> 00:07:12,180
a generic plan that it's calculated,
or I think there's some

119
00:07:12,180 --> 00:07:16,260
subtlety around exactly how precise
that is, then it will choose

120
00:07:16,260 --> 00:07:17,920
its generic plan thereafter.

121
00:07:18,080 --> 00:07:22,240
It will flip to not planning each
new execution.

122
00:07:22,440 --> 00:07:25,620
So sixth time, seventh time, eighth
time, ninth time, forever

123
00:07:25,860 --> 00:07:29,720
onwards, during that session, it
won't replan.

124
00:07:29,820 --> 00:07:34,560
So now you're not only saving on
the parsing, on the rewriting

125
00:07:34,740 --> 00:07:37,880
stages, you're also saving on the
planning stage, which you mentioned

126
00:07:37,880 --> 00:07:38,380
already.

127
00:07:38,680 --> 00:07:44,280
But that's not true if, for example,
you did 3 or 4 executions

128
00:07:44,340 --> 00:07:47,160
with true and then a couple of
executions with false in your

129
00:07:47,160 --> 00:07:50,920
example, then you're going to get
a different kind of average

130
00:07:51,580 --> 00:07:53,180
cost for those 5 executions.

131
00:07:53,240 --> 00:07:56,160
Then that will get compared to
the generic plan and then a different

132
00:07:56,160 --> 00:07:57,240
decision will get made.

133
00:07:57,240 --> 00:08:01,880
So depending on those first 5,
So those first 5 do actually determine

134
00:08:01,880 --> 00:08:05,340
quite a lot what happens thereafter,
which is interesting.

135
00:08:06,020 --> 00:08:10,300
But yeah, the 1 time I've seen
that the 5 become important is...

136
00:08:10,440 --> 00:08:12,280
This can be really confusing for
folks.

137
00:08:12,280 --> 00:08:16,020
If you're trying to debug something
and it's fast for a while,

138
00:08:16,020 --> 00:08:20,060
and then suddenly it goes slow,
This is 1 of those telltale signs

139
00:08:20,060 --> 00:08:21,540
once you've been around a while.

140
00:08:21,900 --> 00:08:25,740
If you notice that it's fast for
5 executions then slow on the

141
00:08:25,740 --> 00:08:30,040
sixth, you should look into this
issue.

142
00:08:30,060 --> 00:08:31,860
Custom plan versus generic plan.

143
00:08:32,140 --> 00:08:34,920
It's kind of a telltale sign, like
a smoking gun, or like a,

144
00:08:34,920 --> 00:08:37,680
you know, it's a telltale sign
that this could be what you're

145
00:08:37,680 --> 00:08:38,180
hitting.

146
00:08:38,940 --> 00:08:41,600
But aside from that, I haven't
really thought too much about

147
00:08:41,600 --> 00:08:41,760
it.

148
00:08:41,760 --> 00:08:44,020
I don't think it's configurable,
that 5 number.

149
00:08:44,020 --> 00:08:45,140
I think it's hard-coded.

150
00:08:45,140 --> 00:08:47,540
Nikolay: Yeah, it's, by the way,
why 5?

151
00:08:47,660 --> 00:08:48,360
Why 5?

152
00:08:48,680 --> 00:08:49,700
Michael: Yeah, good question.

153
00:08:50,140 --> 00:08:51,260
Nikolay: Some decision, right?

154
00:08:51,280 --> 00:08:51,980
That's it.

155
00:08:52,740 --> 00:08:55,660
But like, what is the generic plan?

156
00:08:56,040 --> 00:08:59,080
Like generic plan is just we don't
take into account parameters

157
00:08:59,140 --> 00:08:59,820
at all, right?

158
00:08:59,820 --> 00:09:05,540
It means that we don't take into
account this distribution of

159
00:09:05,540 --> 00:09:06,040
values?

160
00:09:06,820 --> 00:09:07,580
Or we take…

161
00:09:07,580 --> 00:09:08,440
Michael: Yeah, good question.

162
00:09:08,440 --> 00:09:11,460
Did you look into how it's calculated
in the first place?

163
00:09:11,600 --> 00:09:13,240
Nikolay: No, but it's an interesting
question.

164
00:09:15,480 --> 00:09:17,960
Michael: I could be wrong, but
I would have guessed that it would

165
00:09:17,960 --> 00:09:20,260
be based on some statistics at
the time.

166
00:09:22,280 --> 00:09:25,320
If you were implementing this feature,
you would think if you've

167
00:09:25,320 --> 00:09:30,340
got a data distribution there,
maybe you'd factor that in.

168
00:09:30,340 --> 00:09:31,200
But it's difficult.

169
00:09:31,200 --> 00:09:34,540
Because if it's skewed, if it's
like completely uniform, it's

170
00:09:34,540 --> 00:09:35,140
quite easy.

171
00:09:35,140 --> 00:09:37,640
Like if we're talking about primary
key lookups, it's primary

172
00:09:37,640 --> 00:09:40,580
key lookup query and every single
one's unique and every single

173
00:09:40,580 --> 00:09:42,340
one's only going to return 1 row.

174
00:09:42,440 --> 00:09:45,040
It's only scanning 1 table, that
kind of thing.

175
00:09:45,620 --> 00:09:48,380
The generic plan is really easy
because every single parameter

176
00:09:48,380 --> 00:09:52,060
you could provide to that is going
to be planned the same way,

177
00:09:52,060 --> 00:09:53,400
even in the custom plan mode.

178
00:09:53,400 --> 00:09:55,580
So generic plan in that case is
quite easy.

179
00:09:55,580 --> 00:09:58,680
But in the case where it's a skewed
distribution, some values

180
00:09:58,680 --> 00:10:04,120
have more entries than others,
it becomes tricky, I think, to

181
00:10:04,120 --> 00:10:06,880
know how you would plan that in
a generic way.

182
00:10:06,880 --> 00:10:08,080
So I don't actually know.

183
00:10:08,360 --> 00:10:14,340
Nikolay: For me, a generic plan
is a good thing that we don't

184
00:10:14,340 --> 00:10:18,620
have a lot of time to spend finding
particular examples in logs

185
00:10:18,620 --> 00:10:19,900
or in pg_stat_activity.

186
00:10:20,660 --> 00:10:24,900
In pg_stat_activity, we have a
different problem, the track activity

187
00:10:25,680 --> 00:10:26,180
size.

188
00:10:26,740 --> 00:10:31,960
So this parameter, which by default
is 1024, means that really

189
00:10:31,960 --> 00:10:36,340
long queries are truncated and
we don't see the full query.

190
00:10:36,740 --> 00:10:39,060
Parameters usually are in the end,
right?

191
00:10:40,240 --> 00:10:47,680
Select all columns, our favorite,
from table and so many column

192
00:10:47,680 --> 00:10:49,620
names, so I don't see parameter.

193
00:10:50,200 --> 00:10:51,640
Something equals truncated.

194
00:10:52,800 --> 00:10:54,040
I saw it not once.

195
00:10:55,080 --> 00:10:58,440
So when we don't want to spend
time, we want something really

196
00:10:58,440 --> 00:11:02,960
fast, this trick prepare statement,
we prepare statement and

197
00:11:02,960 --> 00:11:07,320
then we use just nulls as all values.

198
00:11:07,940 --> 00:11:12,320
This gives us a generic plan, but
also we can say set plan cache

199
00:11:12,320 --> 00:11:17,000
mode to force generic plan, which
exists as we just checked before

200
00:11:17,260 --> 00:11:19,900
we started recording this episode.

201
00:11:21,480 --> 00:11:26,920
Surprisingly, the plan cache mode
configuration setting, it exists

202
00:11:27,660 --> 00:11:30,560
since many, many years ago.

203
00:11:30,760 --> 00:11:34,260
So in all current versions, it's
supported, right?

204
00:11:34,700 --> 00:11:36,080
Michael: Yes, since version 12.

205
00:11:36,260 --> 00:11:36,760
Nikolay: Exactly.

206
00:11:36,820 --> 00:11:39,560
So 12 is the oldest currently supported
version.

207
00:11:40,080 --> 00:11:43,220
A few months left until the end
of it.

208
00:11:44,020 --> 00:11:44,520
Michael: Yeah.

209
00:11:45,060 --> 00:11:45,760
Nikolay: Yeah, yeah.

210
00:11:45,940 --> 00:11:52,400
So this trick allows us to, looking
at the pg_stat_statements,

211
00:11:53,940 --> 00:11:58,340
just take some normalized query,
which has this $1, $2, doesn't

212
00:11:58,340 --> 00:12:00,980
have parameters at all, they are
removed.

213
00:12:01,640 --> 00:12:06,500
And just to use this prepared statements
with null streak and

214
00:12:06,500 --> 00:12:11,140
get some generic plan and think
already how bad the situation

215
00:12:11,180 --> 00:12:14,880
in general, not in particular cases,
but overall.

216
00:12:15,600 --> 00:12:19,880
But this trick, as you absolutely
correctly mentioned before

217
00:12:19,920 --> 00:12:23,820
this episode recording, doesn't
work if we have, for example,

218
00:12:24,080 --> 00:12:25,460
a primary key there.

219
00:12:26,580 --> 00:12:30,860
Because primary key equals null
or is null won't work properly

220
00:12:30,860 --> 00:12:33,740
because there is no null constraint,
so the planner knows that

221
00:12:33,740 --> 00:12:35,140
nothing will be returned.

222
00:12:36,220 --> 00:12:38,460
Doesn't show us the proper plan
at all.

223
00:12:39,360 --> 00:12:44,180
Michael: Yeah, if you don't specify,
like, if you don't set plan

224
00:12:44,180 --> 00:12:49,340
cache mode to force generic plan,
then you'll get a custom plan

225
00:12:49,340 --> 00:12:53,240
and the planner will shortcut it
because there's a really, yeah.

226
00:12:53,240 --> 00:12:58,540
So, but if you do force generic
plan on version 12 onwards, then

227
00:12:58,540 --> 00:13:01,580
you will get the generic plan you're
hoping for.

228
00:13:01,840 --> 00:13:02,720
Nikolay: Well, that's great.

229
00:13:02,720 --> 00:13:04,280
I didn't realize that.

230
00:13:04,280 --> 00:13:05,580
So yeah, good, good.

231
00:13:05,580 --> 00:13:10,320
But if we are lucky enough to have
already Postgres 16, all these

232
00:13:10,320 --> 00:13:14,700
tricks are not needed because we
can just say, explain generic

233
00:13:14,700 --> 00:13:15,180
plan.

234
00:13:15,180 --> 00:13:19,900
There is a new option in explain
command, and this is very handy.

235
00:13:19,940 --> 00:13:22,160
We just ask a generic plan and
explain.

236
00:13:22,760 --> 00:13:23,580
Don't care.

237
00:13:23,600 --> 00:13:28,840
And interesting that we can even
use $1, $2, and this query text

238
00:13:29,180 --> 00:13:31,080
right from pg_stat_statements.

239
00:13:31,120 --> 00:13:32,280
This is super convenient.

240
00:13:32,300 --> 00:13:36,040
We can say, explain generic plan
and that text.

241
00:13:36,460 --> 00:13:38,860
We can even probably automate this,
right?

242
00:13:39,160 --> 00:13:44,720
And see, for example, if we have
sequential scans in generic

243
00:13:44,720 --> 00:13:48,580
plans and put warnings somewhere
and so on.

244
00:13:49,180 --> 00:13:52,780
But of course, this won't work
if we want to execute and we want

245
00:13:52,780 --> 00:13:54,160
to explain analyze buffers.

246
00:13:54,640 --> 00:13:55,740
This won't work.

247
00:13:55,760 --> 00:13:59,740
I mean, in this case, we need to
replace $1, $2 with some values.

248
00:14:00,300 --> 00:14:02,460
Michael: Yeah, and it makes sense,
right?

249
00:14:03,060 --> 00:14:06,760
We can't do the execution unless
we specify something, otherwise

250
00:14:06,760 --> 00:14:07,820
what are you executing?

251
00:14:08,420 --> 00:14:10,080
So it makes sense.

252
00:14:10,080 --> 00:14:14,640
And kudos to Lawrence from CyberTech
who added this and has blogged

253
00:14:14,640 --> 00:14:15,060
about it.

254
00:14:15,060 --> 00:14:17,340
So I'll share the blog post about
that.

255
00:14:17,500 --> 00:14:20,980
Nikolay: I remember actually this
wave from Lawrence and also

256
00:14:20,980 --> 00:14:25,600
Lucas Fittal, who generalized blogging
about generic plans and

257
00:14:25,600 --> 00:14:27,180
also getting excited.

258
00:14:27,440 --> 00:14:30,780
Also, Frank Pascho, I think, joined
at some point.

259
00:14:31,400 --> 00:14:34,300
I actually don't know who started
this, but I remember waves

260
00:14:34,300 --> 00:14:39,860
of, wave of thinking, multiple
blog authors, blog posts about

261
00:14:39,860 --> 00:14:42,860
this, generic plans a few years
ago and it was great.

262
00:14:43,340 --> 00:14:46,080
Michael: I don't think it's super
surprising that this has come

263
00:14:46,080 --> 00:14:50,200
out of a consultancy, like a Postgres
consultancy, and that Lucas

264
00:14:50,200 --> 00:14:51,180
is excited about it.

265
00:14:51,180 --> 00:14:54,660
So I think it's 1 of those things
that really helps people help

266
00:14:54,660 --> 00:14:55,160
others.

267
00:14:55,440 --> 00:14:59,100
Because if you really know what
you're doing, like these people

268
00:14:59,100 --> 00:15:03,400
all do, you can now set up Postgres
in a way where you can find

269
00:15:03,520 --> 00:15:07,320
parameter values in the vast majority
of cases, or you can reconfigure

270
00:15:07,640 --> 00:15:12,440
things so that it does log the
query ID, or you can get your

271
00:15:12,440 --> 00:15:14,320
slowest queries logged with parameters.

272
00:15:15,040 --> 00:15:18,740
So there are these ways of doing
it now, but if people haven't

273
00:15:18,840 --> 00:15:21,820
set that up in advance and you're
helping them with a production

274
00:15:21,860 --> 00:15:25,640
issue, this is a really helpful
tool to give you a clue as to

275
00:15:25,640 --> 00:15:26,740
what's going on.

276
00:15:27,100 --> 00:15:30,040
Now explain will always, there
will only ever be a clue, not

277
00:15:30,040 --> 00:15:32,400
because you don't have the execution
system, you don't know for

278
00:15:32,400 --> 00:15:34,140
sure why it's slow.

279
00:15:34,340 --> 00:15:38,300
But those clues are often enough
for experienced people to get

280
00:15:38,300 --> 00:15:42,320
a theory, get a hypothesis as to
what's going wrong, and then

281
00:15:42,380 --> 00:15:45,360
give them something to be getting
on with instead of being able

282
00:15:45,360 --> 00:15:49,360
to say, I have no idea why this
is like, instead of, it's kind

283
00:15:49,360 --> 00:15:52,440
of like extra information to make
the guess a little bit more

284
00:15:52,440 --> 00:15:55,760
educated rather than an actual
answer.

285
00:15:56,040 --> 00:15:59,620
Nikolay: Yeah, and we can have
a holistic approach now, like checking

286
00:15:59,620 --> 00:16:03,740
everything in an automated fashion,
not guessing and so on.

287
00:16:03,820 --> 00:16:04,620
It's great.

288
00:16:05,460 --> 00:16:09,620
So, yeah, what else worth mentioning
in this area?

289
00:16:10,760 --> 00:16:16,360
Maybe partitioning, which we like,
it's funny that you found

290
00:16:16,360 --> 00:16:20,700
this commit in Postgres 15, which
improved the situation.

291
00:16:22,120 --> 00:16:26,120
My general impression, if you use
a lot of partitioning, you

292
00:16:26,120 --> 00:16:30,860
should perform major upgrades promptly,
not lagging at all, because

293
00:16:30,960 --> 00:16:33,060
every year a lot of things are
improved.

294
00:16:33,760 --> 00:16:38,620
So this is a quite in-depth thing
that was improved in Postgres

295
00:16:38,620 --> 00:16:44,820
15, saying that now, like partition
pruning, so exclusion for

296
00:16:45,020 --> 00:16:53,360
irrelevant partitions based on,
like if user asks to return some

297
00:16:53,360 --> 00:16:58,580
data in a query, Postgres can remove
partitions which are for

298
00:16:58,580 --> 00:17:00,480
sure irrelevant from consideration.

299
00:17:00,480 --> 00:17:05,500
For example, we have partitioning
by time, range by time and

300
00:17:05,500 --> 00:17:10,300
by date, and then the user for
sure wants fresh data, definitely

301
00:17:10,460 --> 00:17:14,060
didn't need to look at very old
partitions, partitions with very

302
00:17:14,060 --> 00:17:14,840
old data.

303
00:17:15,040 --> 00:17:19,340
But this might happen both at planning
time and execution time,

304
00:17:19,460 --> 00:17:21,980
which can be a big surprise.

305
00:17:22,960 --> 00:17:28,700
And you want this to happen at
planning time, basically, better.

306
00:17:30,040 --> 00:17:35,780
Because this will mean that we
can rely later on cached plan,

307
00:17:36,400 --> 00:17:40,640
or if we use prepared statements,
we can rely more on cached plans.

308
00:17:41,320 --> 00:17:44,500
And then we have a very good situation
when partition pruning

309
00:17:44,500 --> 00:17:47,720
was already done at planning time,
we just execute that set.

310
00:17:47,720 --> 00:17:51,740
But it's tricky when we talk about
generic plan, because generic

311
00:17:51,740 --> 00:17:55,360
plan doesn't take into account
the values.

312
00:17:55,640 --> 00:17:58,980
So something was improved in Postgres
15 in this area, right?

313
00:17:59,340 --> 00:18:02,220
Michael: Yeah, We tried to look
into it before the call, didn't

314
00:18:02,220 --> 00:18:02,560
we?

315
00:18:02,560 --> 00:18:03,220
And couldn't...

316
00:18:03,260 --> 00:18:05,280
Nikolay: Yeah, I couldn't reproduce
the problem.

317
00:18:06,980 --> 00:18:10,400
Michael: Yeah, so we tried on an
old version before these.

318
00:18:11,680 --> 00:18:15,520
So it would be interesting to hear
from people that have reproduced

319
00:18:15,640 --> 00:18:16,140
this.

320
00:18:16,220 --> 00:18:19,800
I didn't see a blog post about
it, I saw it referenced a couple

321
00:18:19,800 --> 00:18:20,420
of times.

322
00:18:20,660 --> 00:18:23,820
Nikolay: Yeah, Lukas Fittel in
this blog post, the generalized

323
00:18:23,880 --> 00:18:28,080
blog post, mentioned that before
we had a problem with generic

324
00:18:28,080 --> 00:18:31,280
plans, that partition pruning didn't
work in planning time.

325
00:18:31,280 --> 00:18:34,460
It worked only at execution time.

326
00:18:34,820 --> 00:18:38,040
I couldn't reproduce it, but if
it's so it means that, for example,

327
00:18:38,040 --> 00:18:41,420
if you have a lot of partitions
with a lot of indexes and at

328
00:18:41,420 --> 00:18:43,880
execution time you involve all
of them.

329
00:18:43,940 --> 00:18:48,300
In general, if you have a lot of
partitions, just checking them,

330
00:18:48,400 --> 00:18:52,520
not returning everything, it's
a big overhead, huge overhead.

331
00:18:53,640 --> 00:19:00,300
But at least at execution time,
Postgres doesn't put access share

332
00:19:00,300 --> 00:19:04,080
locks on all indexes, because at
planning time it does.

333
00:19:04,860 --> 00:19:11,260
And if at planning time we don't
have partition pruning working,

334
00:19:11,880 --> 00:19:12,840
it's a nightmare.

335
00:19:13,960 --> 00:19:15,420
This case

336
00:19:15,460 --> 00:19:16,800
Michael: of manager spikes.

337
00:19:18,140 --> 00:19:20,120
Well, I think that would happen
at X.

338
00:19:20,580 --> 00:19:21,080
Yeah.

339
00:19:21,420 --> 00:19:21,920
Nikolay: So,

340
00:19:23,100 --> 00:19:24,640
Michael: a question for you.

341
00:19:24,680 --> 00:19:28,389
Because we do now have this plan
cache mode, so plan cache mode

342
00:19:28,389 --> 00:19:32,560
got added in 12, that's 1 of the
things we can use to force generic

343
00:19:32,560 --> 00:19:36,600
plan, force custom plan while we're
debugging, while we're trying

344
00:19:36,900 --> 00:19:41,180
to, pre-version 16 releases, before
we had this explained generic

345
00:19:41,180 --> 00:19:43,000
plan, we could use that for debugging.

346
00:19:43,660 --> 00:19:46,220
But I don't think I've ever seen
anybody change.

347
00:19:46,220 --> 00:19:49,060
So, Plan cache mode got added.

348
00:19:49,280 --> 00:19:54,960
You have 3 options, Auto, Force
Custom Plan and Force Generic

349
00:19:54,960 --> 00:19:55,460
Plan.

350
00:19:55,640 --> 00:19:59,160
Auto is the default, so Auto acts
as it did as...

351
00:19:59,880 --> 00:20:00,740
Nikolay: As you described.

352
00:20:01,360 --> 00:20:02,560
Michael: Yes, so the first 5 executions...

353
00:20:02,560 --> 00:20:02,920
5

354
00:20:02,920 --> 00:20:06,420
Nikolay: times custom, then decision
is made.

355
00:20:06,760 --> 00:20:07,260
Yeah.

356
00:20:07,540 --> 00:20:08,100
What choice?

357
00:20:08,100 --> 00:20:08,320
Well, I

358
00:20:08,320 --> 00:20:10,940
Michael: was going to ask because
I think I got the impression

359
00:20:11,020 --> 00:20:14,800
that some of these ORMs, like ActiveRecord,
for example, uses

360
00:20:14,800 --> 00:20:15,780
prepared statements.

361
00:20:17,140 --> 00:20:20,920
I got the impression the main reason
they do so is to protect

362
00:20:20,980 --> 00:20:25,140
against SQL injection attacks,
not for performance reasons.

363
00:20:26,400 --> 00:20:28,680
And that might be not quite accurate.

364
00:20:29,060 --> 00:20:32,680
But there is this other benefit
that seems to be mentioned quite

365
00:20:32,680 --> 00:20:34,220
a lot around prepared statements.

366
00:20:34,780 --> 00:20:38,440
Now, I wondered if you'd ever seen
anybody change, like if you

367
00:20:38,440 --> 00:20:41,600
only were, if you only were using
prepared statements for.

368
00:20:42,040 --> 00:20:45,360
SQL injection protection, and you
didn't, you weren't as bothered

369
00:20:45,360 --> 00:20:50,080
about the performance benefits
in some cases, or pitfalls in

370
00:20:50,080 --> 00:20:54,960
other cases, you might want to
allow Postgres to replan every

371
00:20:54,960 --> 00:20:55,560
single time.

372
00:20:55,560 --> 00:21:00,860
I can imagine some people considering
turning force custom plan

373
00:21:00,860 --> 00:21:01,360
on.

374
00:21:01,860 --> 00:21:04,500
I can imagine like the upside of
doing so.

375
00:21:04,500 --> 00:21:07,280
I can't see quite the same upside
for doing forced generic plan

376
00:21:07,280 --> 00:21:09,840
because all you're doing is saving
those 5.

377
00:21:09,840 --> 00:21:13,120
Well, I think you'd only save those
5 executions and then you

378
00:21:13,120 --> 00:21:14,740
flip to the generic plan anyway.

379
00:21:15,360 --> 00:21:17,060
Maybe there's another case for
that.

380
00:21:17,060 --> 00:21:20,380
But yeah, have you seen anybody
changing that setting in any

381
00:21:20,380 --> 00:21:20,880
cases?

382
00:21:21,020 --> 00:21:23,740
Nikolay: I see everyone doesn't
know about it, almost everyone.

383
00:21:24,640 --> 00:21:27,540
It's very deep.

384
00:21:28,780 --> 00:21:30,700
It's not super, super...

385
00:21:32,360 --> 00:21:34,440
It feels like some kind of fine
tuning.

386
00:21:34,440 --> 00:21:39,060
I see people have fears, for example,
to rely on prepared statements

387
00:21:39,060 --> 00:21:45,380
globally, because this idea, what
if global plan will work not

388
00:21:45,380 --> 00:21:45,880
well?

389
00:21:46,500 --> 00:21:46,740
Michael: Right?

390
00:21:46,740 --> 00:21:47,240
And...

391
00:21:47,280 --> 00:21:50,640
So generic plan is bad for a specific
parameter...

392
00:21:51,020 --> 00:21:53,680
Nikolay: If generic plan is chosen,
it's cached, and then it's

393
00:21:53,680 --> 00:21:58,040
bad for some cases and we have
performance degradation for particular

394
00:21:58,260 --> 00:21:58,760
parameters.

395
00:21:59,640 --> 00:22:05,780
But also I think this idea to general,
to use prepared statements,

396
00:22:06,500 --> 00:22:10,220
I don't think these days it already
matters as protection from...

397
00:22:11,920 --> 00:22:16,780
I think the main reason is planning
time is usually fast, but

398
00:22:16,780 --> 00:22:18,920
not always fast, as we know.

399
00:22:19,440 --> 00:22:23,260
And the idea that during planning
time Postgres locks all indexes,

400
00:22:24,520 --> 00:22:25,900
it's a super big surprise.

401
00:22:27,100 --> 00:22:30,800
Many people, including myself,
had not long ago.

402
00:22:31,800 --> 00:22:36,600
And we had benchmarks showing this
small overhead adding with

403
00:22:36,600 --> 00:22:39,440
each index for select, not for
update, for select.

404
00:22:40,080 --> 00:22:44,240
And at some point when total number
of relations, both tables

405
00:22:44,240 --> 00:22:49,500
and indexes, reaches 16, which
is also a hard-coded constant.

406
00:22:50,980 --> 00:22:58,100
If a query is very frequent, 1000
or 2000 per second, primary kilo

407
00:22:58,100 --> 00:22:59,740
caps are usually so.

408
00:23:00,240 --> 00:23:04,660
We have a log manager spike, and
partitioning increases the probability

409
00:23:04,820 --> 00:23:05,520
of it.

410
00:23:05,800 --> 00:23:08,800
And that's why getting rid of planning
is good.

411
00:23:09,240 --> 00:23:12,780
And when you get rid of planning,
it's worth understanding this

412
00:23:12,860 --> 00:23:16,560
behavior of how caching works and
generic versus custom.

413
00:23:16,720 --> 00:23:21,920
But just changing this globally,
I'm not sure this is a popular

414
00:23:21,920 --> 00:23:22,420
decision.

415
00:23:22,800 --> 00:23:25,580
But of course, my experience is
very limited.

416
00:23:26,820 --> 00:23:28,260
Michael: Let us know if you have
changed it.

417
00:23:28,260 --> 00:23:29,940
It'd be interesting to hear from
you.

418
00:23:30,060 --> 00:23:32,140
Nikolay: Yeah, it's an interesting
topic anyway.

419
00:23:33,580 --> 00:23:37,660
But in general, I think it's worth
spending efforts to get rid

420
00:23:37,660 --> 00:23:41,500
of planning for high-frequency queries,
especially for partitioning

421
00:23:42,340 --> 00:23:42,840
tables.

422
00:23:43,260 --> 00:23:44,980
Michael: I have one more related
question.

423
00:23:45,060 --> 00:23:48,500
So now pgBouncer supports prepared
statements.

424
00:23:49,200 --> 00:23:49,340
Nikolay: I

425
00:23:49,340 --> 00:23:51,500
Michael: saw they have a max prepared
state.

426
00:23:51,500 --> 00:23:53,660
Well, so it's off by default, which
I didn't realize.

427
00:23:53,680 --> 00:23:56,260
They have a max prepared statements
parameter.

428
00:23:56,820 --> 00:23:58,700
Do you have any opinions on tuning
that?

429
00:23:58,780 --> 00:24:02,720
Or like I saw that they said a
default of 100 might make sense.

430
00:24:02,720 --> 00:24:03,400
Like, might be a sensible start.

431
00:24:03,400 --> 00:24:05,140
Nikolay: Yeah, some high value
is good.

432
00:24:05,140 --> 00:24:09,020
I think it's like, depends on the
situation, of course, right?

433
00:24:09,020 --> 00:24:13,300
I mean, but in general, it's not
an issue to just increase it.

434
00:24:13,380 --> 00:24:13,880
Yeah.

435
00:24:14,060 --> 00:24:14,560
Cool.

436
00:24:15,040 --> 00:24:21,620
But again, I see the tendency to
be afraid of turning prepare

437
00:24:21,620 --> 00:24:25,180
statements globally in already
existing heavily loaded projects.

438
00:24:25,760 --> 00:24:29,240
If you start from scratch, it's
a very good decision to make

439
00:24:29,240 --> 00:24:32,460
to start using prepared statements
from the very beginning.

440
00:24:32,960 --> 00:24:36,980
And then you just grow and see
the problems, fix them as usual,

441
00:24:37,120 --> 00:24:38,460
like you would do anyway.

442
00:24:39,240 --> 00:24:43,900
And then there's confidence level
of using already there, but

443
00:24:43,900 --> 00:24:50,280
switching for a big project from
0 to 100% coverage, it's a very

444
00:24:50,280 --> 00:24:51,710
hard decision to make.

445
00:24:51,710 --> 00:24:53,700
Michael: That's hard as in scary
or?

446
00:24:54,720 --> 00:24:57,880
Nikolay: Just people are not sure
and that's the problem.

447
00:24:58,920 --> 00:25:00,560
And there is no good path.

448
00:25:00,560 --> 00:25:02,540
Maybe there is a good path to it.

449
00:25:02,780 --> 00:25:06,940
But in a couple of big companies
I observed recently, people

450
00:25:06,940 --> 00:25:10,400
decided not to proceed with this
after a lot of consideration.

451
00:25:12,120 --> 00:25:14,120
And it's kind of an interesting
topic.

452
00:25:14,700 --> 00:25:18,740
Instead, It feels safer to start
using prepare statements only

453
00:25:18,740 --> 00:25:19,900
for specific queries.

454
00:25:20,380 --> 00:25:21,840
Michael: That's a slow way in.

455
00:25:21,940 --> 00:25:23,100
Maybe for new features.

456
00:25:23,800 --> 00:25:24,780
Nikolay: Not new features.

457
00:25:25,440 --> 00:25:26,260
Not at all.

458
00:25:26,280 --> 00:25:30,040
Again, the problem is usually this
log manager overhead.

459
00:25:30,120 --> 00:25:39,600
When we have 1,000, 2,000 QPS,
we see very simple queries, select

460
00:25:39,600 --> 00:25:41,158
star from or select something from
table When we have 1000-2000

461
00:25:41,158 --> 00:25:42,540
QPS, we see a very simple query,
select something from the table

462
00:25:42,720 --> 00:25:44,940
where ID equals something, primary
queue lookup.

463
00:25:45,480 --> 00:25:49,740
And if we have planning time every
time, select itself is fast,

464
00:25:49,960 --> 00:25:52,120
planning itself is also fast.

465
00:25:52,200 --> 00:25:58,440
But if you have a lot of backends
competing, access share lock

466
00:25:58,440 --> 00:26:00,400
is not a problem at all.

467
00:26:00,580 --> 00:26:04,400
But lock manager has overhead because
of fast path is false.

468
00:26:04,700 --> 00:26:07,700
And this is what multiple companies
experienced recently.

469
00:26:07,880 --> 00:26:12,660
And Jeremy Schneider said, local
managers strikes back or bites

470
00:26:12,660 --> 00:26:14,000
back or something like that.

471
00:26:14,440 --> 00:26:19,900
Last year was a local manager year
in terms of performance issues

472
00:26:19,900 --> 00:26:20,740
people experienced.

473
00:26:21,400 --> 00:26:24,380
Michael: So you're saying, yeah,
so I'll share the episode where

474
00:26:24,380 --> 00:26:25,940
we discussed that in more detail.

475
00:26:27,540 --> 00:26:30,140
And so you're saying they choose
to turn prepared statements

476
00:26:30,140 --> 00:26:34,160
on only for those queries, because
it reduces the issue there,

477
00:26:34,160 --> 00:26:35,100
but not globally.

478
00:26:35,840 --> 00:26:37,000
Okay, makes sense.

479
00:26:37,080 --> 00:26:40,220
Nikolay: Just to remove these hotspots.

480
00:26:40,960 --> 00:26:41,460
Yeah.

481
00:26:41,600 --> 00:26:43,280
Only for these primary lookups.

482
00:26:44,320 --> 00:26:46,440
Michael: I mean, we're talking,
So if you're talking about these

483
00:26:46,440 --> 00:26:49,680
super fast queries, even though
both execution and planning time

484
00:26:49,680 --> 00:26:52,640
are fast, it could be that planning
time is more than execution

485
00:26:52,640 --> 00:26:52,800
time.

486
00:26:52,800 --> 00:26:53,860
Like it's quite common.

487
00:26:53,860 --> 00:26:54,560
Nikolay: It happens.

488
00:26:55,320 --> 00:26:58,480
Michael: So even if it wasn't for
the issues you're describing,

489
00:26:58,840 --> 00:27:01,220
you could maybe increase throughput.

490
00:27:01,300 --> 00:27:04,540
Or like there's a lot of overhead
to be gained by not having

491
00:27:04,540 --> 00:27:06,380
to plan those queries.

492
00:27:06,380 --> 00:27:09,160
Nikolay: Yeah, shave off planning
time is a good idea in many

493
00:27:09,160 --> 00:27:11,540
cases in general.

494
00:27:12,200 --> 00:27:15,060
It's like, for example, partitioning,
right?

495
00:27:15,060 --> 00:27:19,120
If partitioning is there, planning
time can be high if you have

496
00:27:19,120 --> 00:27:19,540
a lot.

497
00:27:19,540 --> 00:27:21,600
It can depend on the number of
partitions.

498
00:27:22,200 --> 00:27:24,840
This is the problem as well.

499
00:27:24,840 --> 00:27:27,680
In this case, of course, getting
rid of it is good.

500
00:27:28,040 --> 00:27:28,540
Nice.

501
00:27:29,120 --> 00:27:29,620
Cool.

502
00:27:29,880 --> 00:27:30,480
Anything else you want

503
00:27:30,480 --> 00:27:31,660
Michael: to add to this 1?

504
00:27:32,080 --> 00:27:32,580
Nikolay: No.

505
00:27:32,800 --> 00:27:36,340
It's an interesting thing to keep
in mind when dealing with plan

506
00:27:36,340 --> 00:27:36,840
optimization.

507
00:27:37,660 --> 00:27:44,020
So use explain-generic-plan in
Postgres 16 and in future Postgres

508
00:27:44,140 --> 00:27:44,640
versions.

509
00:27:45,920 --> 00:27:49,280
Michael: Yeah, if you can't, as
a starting point, or if you can't

510
00:27:49,280 --> 00:27:49,380
get

511
00:27:49,380 --> 00:27:49,880
Nikolay: parameters.

512
00:27:50,340 --> 00:27:51,280
Michael: Lazy mode.

513
00:27:51,280 --> 00:27:52,260
Lazy mode.

514
00:27:52,260 --> 00:27:55,380
Nikolay: If you don't want to go
to logs to find examples, or

515
00:27:55,380 --> 00:27:58,220
maybe a log min duration statement
was not configured somehow.

516
00:27:58,780 --> 00:27:59,240
Michael: Exactly.

517
00:27:59,240 --> 00:28:02,060
If it wasn't configured for the
time where you're having issues,

518
00:28:02,640 --> 00:28:05,900
or if you can't turn it on easily,
like if you've got a lot of

519
00:28:05,900 --> 00:28:08,900
process at your company before
changing some of these parameters,

520
00:28:09,340 --> 00:28:13,360
you know, so there might be reasons
why you can't get real values.

521
00:28:13,840 --> 00:28:16,920
But of course, I think The reason
I don't know as much as maybe

522
00:28:16,920 --> 00:28:20,580
I should on this is, if you can
get the real values, performance

523
00:28:20,600 --> 00:28:22,860
optimization work becomes so much
easier.

524
00:28:23,040 --> 00:28:23,560
Nikolay: Of course.

525
00:28:23,560 --> 00:28:24,520
Michael: There's less guesswork.

526
00:28:24,620 --> 00:28:25,080
Nikolay: Yeah.

527
00:28:25,080 --> 00:28:28,680
This generic plan should be something
like a partial solution

528
00:28:29,120 --> 00:28:30,300
in terms of analysis.

529
00:28:30,560 --> 00:28:31,900
It's a partial analysis.

530
00:28:32,020 --> 00:28:36,580
It always should be with remark,
you know, this is not, maybe

531
00:28:36,580 --> 00:28:37,320
not it.

532
00:28:37,540 --> 00:28:38,980
Maybe but maybe not.

533
00:28:39,380 --> 00:28:39,820
Michael: Yeah.

534
00:28:39,820 --> 00:28:43,120
Like a stepping stone on the way
to the solution, but like it's

535
00:28:43,380 --> 00:28:44,840
not a solution in itself.

536
00:28:45,060 --> 00:28:45,560
Nikolay: Right.

537
00:28:46,320 --> 00:28:49,900
On the other hand, it might give
very quick insights what to

538
00:28:49,900 --> 00:28:51,520
do and how to speed up.

539
00:28:51,580 --> 00:28:52,180
So yeah,

540
00:28:52,180 --> 00:28:52,860
Michael: good point.

541
00:28:52,860 --> 00:28:53,760
Yeah, good point.

542
00:28:54,140 --> 00:28:54,640
Yeah.

543
00:28:54,640 --> 00:28:55,440
Nice one, Nikolay.

544
00:28:55,440 --> 00:28:56,400
Thank you so much.

545
00:28:56,400 --> 00:28:57,160
Nikolay: Thank you.

546
00:28:57,440 --> 00:28:58,260
Have a good day.

547
00:28:58,260 --> 00:28:58,760
Bye.

548
00:28:58,780 --> 00:28:59,280
Michael: You too.