1
00:00:00,030 --> 00:00:04,110
Michael: Hello and welcome to PostgresFM,
a weekly show about all things PostgreSQL.

2
00:00:04,350 --> 00:00:06,060
I am Michael, founder of pgMustard.

3
00:00:06,120 --> 00:00:08,340
This is my co-host Nikolay, founder of Postgres.ai.

4
00:00:08,550 --> 00:00:09,770
Hey, Nikolay, what are we talking about?

5
00:00:10,399 --> 00:00:10,999
Nikolay: Hello.

6
00:00:11,026 --> 00:00:17,932
Let's talk about query optimization, like the
second or third step, after you've identified a problem.

7
00:00:18,281 --> 00:00:19,721
Maybe third step, right?

8
00:00:20,141 --> 00:00:23,531
First step is you found some queries which don't behave well.

9
00:00:23,887 --> 00:00:28,219
Second step, you found some optimization idea. And what's next?

10
00:00:28,849 --> 00:00:29,719
Michael: Yeah, exactly.

11
00:00:29,724 --> 00:00:35,089
How do we know whether it'll be safe to make the change that we are
thinking about, whether it'll make a difference, that kind of thing.

12
00:00:35,363 --> 00:00:35,783
Nikolay: Yep.

13
00:00:35,783 --> 00:00:38,650
And this is one of the topics our listeners requested.

14
00:00:39,569 --> 00:00:40,289
Michael: Yeah, exactly.

15
00:00:40,289 --> 00:00:41,549
I think they phrased it quite nicely.

16
00:00:41,549 --> 00:00:47,436
It was like a 102, I think they said, like the American
class numbering, in terms of university courses, right?

17
00:00:47,436 --> 00:00:48,486
Is that how, how it works?

18
00:00:48,486 --> 00:00:51,316
101 is the intro and then 102 is the next step along.

19
00:00:51,421 --> 00:00:54,051
Nikolay: You're the proper person to explain this, but yeah.

20
00:00:54,056 --> 00:00:55,821
I've seen 101 everywhere.

21
00:00:55,821 --> 00:00:56,331
Right, Right.

22
00:00:57,251 --> 00:00:58,448
So, where to start?

23
00:00:58,478 --> 00:01:05,053
Let's first take a step back and think about
how we verify our optimization ideas, as usual.

24
00:01:05,058 --> 00:01:07,558
I have a strong opinion here, because what I see as

25
00:01:07,969 --> 00:01:16,645
the usual approach is either to rely on experience or to test
right on production. The usual problem is that

26
00:01:16,645 --> 00:01:25,023
non-production environments have different data, a different database
size, everything is different, sometimes different hardware,

27
00:01:25,023 --> 00:01:28,353
but actually hardware is less important, much less important.

28
00:01:28,383 --> 00:01:36,443
We can tune our Postgres on weak hardware to behave, in terms
of the plan choices, exactly like on production.

29
00:01:36,971 --> 00:01:43,313
It's possible, but the data, statistics and planner
settings, of course, these things are the most important.

30
00:01:43,793 --> 00:01:47,483
So the question is how you verify your ideas.

31
00:01:47,542 --> 00:01:48,322
how do you do it?

32
00:01:48,591 --> 00:01:49,521
And the answer.

33
00:01:49,827 --> 00:01:50,247
Great.

34
00:01:50,757 --> 00:01:50,937
So

35
00:01:51,042 --> 00:01:52,000
Michael: Well, I'm interested.

36
00:01:52,002 --> 00:01:52,272
Yeah.

37
00:01:52,272 --> 00:01:55,662
Well, so I think you've probably gone really advanced there already.

38
00:01:55,662 --> 00:01:58,062
Talking about how hardware is less important.

39
00:01:58,067 --> 00:02:03,492
I think the first lesson is that testing in
a similar setup as possible is really useful.

40
00:02:03,612 --> 00:02:09,899
And maybe there's some exceptions to that, but by similar,
I think the, the first thing I always try and encourage

41
00:02:09,899 --> 00:02:13,409
people to do is make sure the quantity of data is similar.

42
00:02:13,619 --> 00:02:15,299
So, and I guess we're testing two things, right?

43
00:02:15,299 --> 00:02:21,372
The first thing we want to test is: for the single query that I was
trying to optimize, or for the single problem I was trying to solve,

44
00:02:21,642 --> 00:02:23,322
Does this solve that problem?

45
00:02:23,622 --> 00:02:30,904
And then there's a secondary level, Does it introduce other
problems or does it work well amongst the whole load of the system?

46
00:02:31,279 --> 00:02:31,609
Nikolay: Exactly.

47
00:02:31,669 --> 00:02:32,409
Right, Right, right.

48
00:02:32,629 --> 00:02:37,762
But right. So we want our non-production Postgres to behave

49
00:02:38,078 --> 00:02:41,482
ideally exactly in the same manner as production would.

50
00:02:41,482 --> 00:02:43,672
Actually, production itself can change.

51
00:02:43,732 --> 00:02:45,562
Yesterday, it chose one plan.

52
00:02:45,562 --> 00:02:48,112
Today, there's no guarantee it chooses the same plan.

53
00:02:48,112 --> 00:02:49,972
And this is quite well known in Postgres.

54
00:02:50,452 --> 00:02:54,472
And unfortunately, Postgres doesn't provide a good way to solve this problem.

55
00:02:54,472 --> 00:02:59,919
I mean, to freeze the plan choice and avoid
sudden plan flips for critical queries.

56
00:03:00,011 --> 00:03:00,971
But it's not our topic today.

57
00:03:02,076 --> 00:03:03,006
This is not our topic.

58
00:03:03,011 --> 00:03:08,595
I mentioned it just to give the
feeling that the planner behavior is not stable.

59
00:03:08,893 --> 00:03:14,426
So achieving the same planner behavior in a non-production
environment is quite a challenge, right?

60
00:03:15,054 --> 00:03:18,937
But there are specific tricks that can help us.

61
00:03:19,278 --> 00:03:21,883
For example, of course, ideally we want the same data,

62
00:03:22,240 --> 00:03:27,201
because the same data means the pg_statistic content is the same.

63
00:03:27,368 --> 00:03:28,118
So we are fine.

64
00:03:28,364 --> 00:03:33,648
There are trick to export, import produce,
statistic content, even if you don't have data.

65
00:03:33,648 --> 00:03:37,045
There are some not very popular, but interesting extensions.

66
00:03:37,217 --> 00:03:38,747
I think they came from Japan.

67
00:03:38,777 --> 00:03:40,857
I don't remember the names, but we will provide links.

68
00:03:40,878 --> 00:03:42,688
It's possible to export statistics.

69
00:03:43,158 --> 00:03:45,528
But this approach has limitations.

70
00:03:45,528 --> 00:03:48,590
It's similar to hypothetical indexes and partitions.

71
00:03:48,800 --> 00:03:49,700
Very interesting approach.

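The hypothetical-index idea mentioned here can be sketched with the hypopg extension; the table and column names below are made up for illustration:

```sql
-- Sketch using the hypopg extension (assumed installed); "orders" and
-- "customer_id" are hypothetical names, not from the episode.
CREATE EXTENSION IF NOT EXISTS hypopg;

-- Register a hypothetical index: nothing is built, no disk writes.
SELECT * FROM hypopg_create_index(
  'CREATE INDEX ON orders (customer_id)'
);

-- Plain EXPLAIN (without ANALYZE) now considers the hypothetical index.
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;

-- Remove all hypothetical indexes for this session.
SELECT hypopg_reset();
```

As Nikolay notes next, only planning benefits: the executor cannot actually use an index that does not exist.
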
72
00:03:49,700 --> 00:04:01,254
And sometimes it's good, but what if we want true behavior, so the executor
really executes our plans and we can see the behavior of Postgres for our query?

73
00:04:01,602 --> 00:04:03,432
In this case, we, we want the same data.

74
00:04:03,600 --> 00:04:05,395
Of course, there are several levels

75
00:04:05,423 --> 00:04:07,133
of how close we can be to the same.

76
00:04:07,133 --> 00:04:09,395
We can just take it as is.

77
00:04:09,605 --> 00:04:10,835
On the physical level.

78
00:04:11,075 --> 00:04:13,798
This is an absolutely good approach.

79
00:04:13,803 --> 00:04:16,108
We just grab the same pages of data.

80
00:04:16,468 --> 00:04:18,928
If we have a managed service, we just clone it.

81
00:04:19,108 --> 00:04:21,448
Most managed services allow cloning.

82
00:04:21,780 --> 00:04:26,058
It takes time, it costs money, but it'll give us the same data.

83
00:04:26,058 --> 00:04:31,090
And so we can see the behavior there, but sometimes we cannot copy on the physical level.

84
00:04:31,655 --> 00:04:37,805
So we can only copy on the logical level. But, as you said,
the number of rows, for example, will be the same.

85
00:04:37,805 --> 00:04:38,495
So we're good,

86
00:04:38,896 --> 00:04:43,242
Michael: A problem, not really in the topic of
performance, but a problem I see quite a few customers

87
00:04:43,242 --> 00:04:47,247
coming across is access to data.

88
00:04:47,277 --> 00:04:52,107
So giving developers access to production data can be problematic from a

89
00:04:52,107 --> 00:04:53,367
privacy point of view.

90
00:04:53,667 --> 00:04:54,057
Okay.

91
00:04:54,257 --> 00:04:54,947
What do

92
00:04:54,972 --> 00:05:01,474
Nikolay: Very problematic. And in my opinion, there are
two problems that access of an arbitrary

93
00:05:01,474 --> 00:05:04,294
developer in a company to production can bring us.

94
00:05:04,392 --> 00:05:09,814
The first problem is security, and the second problem, I think,
is even more important. Security we can handle:

95
00:05:10,563 --> 00:05:13,973
ask them to sign something, do security trainings, a lot of things.

96
00:05:13,973 --> 00:05:16,027
But second problem is more interesting.

97
00:05:16,108 --> 00:05:23,045
Developers want to develop, so they add a lot of
instability if we allow developers to go to production.

98
00:05:23,379 --> 00:05:31,834
It can introduce challenges in terms of stability, because we want
to have an established process of development, testing, and deployment.

99
00:05:31,834 --> 00:05:37,334
So we don't want developers to go and check,
for example: oh, I have a great index idea,

100
00:05:37,334 --> 00:05:38,864
I will create it right on production.

101
00:05:39,006 --> 00:05:40,986
On the primary node, and I will check.

102
00:05:41,346 --> 00:05:41,946
It's not good.

103
00:05:42,171 --> 00:05:43,041
So, Right.

104
00:05:43,071 --> 00:05:44,677
But security is also a concern.

105
00:05:44,697 --> 00:05:51,308
But even if we don't talk about security, which is a
very big concern: if we need to use a logical copy of

106
00:05:51,313 --> 00:05:54,240
production, for example, the question is: will the plans be the same?

107
00:05:54,930 --> 00:05:56,485
Row counts are the same, right?

108
00:05:56,490 --> 00:05:58,485
Statistics should be the same.

109
00:05:58,912 --> 00:06:05,752
And, for example, we took even the same hardware,
and we put the same Postgres settings in the configuration.

110
00:06:06,175 --> 00:06:11,095
Question is does it guarantee that the planner
will behave in exactly the same manner?

111
00:06:11,962 --> 00:06:17,828
Michael: So the place my head's
going to is ANALYZE and its sampling.

112
00:06:18,132 --> 00:06:22,032
Even when you run ANALYZE, it takes samples of tables, right?

113
00:06:22,212 --> 00:06:23,922
So maybe that's not the only way.

114
00:06:23,922 --> 00:06:26,682
It could be different, but that's one way it could.

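The sampling Michael refers to can be seen directly: ANALYZE reads a bounded random sample whose size follows the statistics target. The table and column names here are illustrative:

```sql
-- ANALYZE samples roughly 300 * statistics_target rows per table, so
-- two runs (or two clones) can land on slightly different statistics.
SHOW default_statistics_target;   -- 100 by default

-- Larger sample for one skewed column only ("orders" is illustrative).
ALTER TABLE orders ALTER COLUMN customer_id SET STATISTICS 1000;
ANALYZE orders;

-- Inspect what the planner will actually use.
SELECT attname, n_distinct, most_common_vals
FROM pg_stats
WHERE tablename = 'orders';
```
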
115
00:06:26,817 --> 00:06:27,417
Nikolay: I agree.

116
00:06:27,507 --> 00:06:27,807
Yeah.

117
00:06:27,807 --> 00:06:28,017
Yeah.

118
00:06:28,022 --> 00:06:34,107
Some instability there. But also, if you have a
lot of bloat, for example, relpages is different.

119
00:06:34,603 --> 00:06:39,793
So on your new node, on your
logical clone, relpages will be smaller.

120
00:06:39,793 --> 00:06:43,644
And this directly affects the planner behavior.

121
00:06:43,644 --> 00:06:50,916
So it can choose a different plan; we can see different
execution nodes, a different plan,

122
00:06:50,940 --> 00:06:51,565
when you compare

123
00:06:52,200 --> 00:06:58,980
production and this logical replica. Unfortunately, I don't
know a solution to this problem, but it's quite an interesting problem.

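The relpages difference Nikolay describes is visible in pg_class, so it can at least be compared between production and a logical clone (table name is illustrative):

```sql
-- relpages and reltuples are the planner's view of table size; after a
-- logical restore of a bloated table, relpages comes out smaller.
SELECT relname, relpages, reltuples,
       pg_size_pretty(pg_relation_size(oid)) AS on_disk
FROM pg_class
WHERE relname = 'orders';
```
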
124
00:06:59,280 --> 00:06:59,490
So

125
00:06:59,645 --> 00:07:01,235
Michael: And actually not just bloat.

126
00:07:01,295 --> 00:07:03,485
Also, I guess insert order, right?

127
00:07:03,485 --> 00:07:03,755
Like

128
00:07:04,586 --> 00:07:05,726
Nikolay: Physical distribution?

129
00:07:05,786 --> 00:07:06,206
Yeah.

130
00:07:06,211 --> 00:07:06,236
yeah.

131
00:07:06,596 --> 00:07:09,896
Well, you mean buffer numbers will probably be different.

132
00:07:09,926 --> 00:07:11,096
This I understand.

133
00:07:11,366 --> 00:07:14,216
For example, we need to read a thousand rows.

134
00:07:14,646 --> 00:07:25,014
We had a thousand buffer reads and hits on our source, but we have
only a hundred, or, like, much fewer hits and reads on our

135
00:07:25,014 --> 00:07:29,195
replica, because of more compact storage of these rows, for example.

136
00:07:29,225 --> 00:07:30,305
this is possible, right?

137
00:07:31,025 --> 00:07:36,333
But the question is: will the structure of the plan be
the same? And that's where, unfortunately, we cannot guarantee.

138
00:07:36,350 --> 00:07:43,190
Usually it's very close, but unfortunately relpages
is different, and it directly affects the planner behavior.

139
00:07:43,257 --> 00:07:45,704
Michael: But we are, we are talking about edge cases, right?

140
00:07:45,734 --> 00:07:47,396
Not the majority of the time.

141
00:07:47,401 --> 00:07:52,916
If we're talking about simple LTP type
queries, does it use the index, does it not?

142
00:07:53,096 --> 00:07:55,616
The kind of things people are probably worried most about.

143
00:07:55,838 --> 00:08:00,435
They're likely to be easily replicated once
you have a similar like sample of data.

144
00:08:00,895 --> 00:08:01,645
Nikolay: Right, right.

145
00:08:01,789 --> 00:08:07,579
I agree and sometimes, for example, we might have multiple
production environments and they are slightly different.

146
00:08:07,579 --> 00:08:15,522
So things can be very interesting in various cases,
but at least we can be as close as possible if we

147
00:08:16,197 --> 00:08:17,127
aim to do it.

148
00:08:17,187 --> 00:08:20,637
We say: okay, we have a billion rows in a table.

149
00:08:20,637 --> 00:08:27,087
Let's have a billion rows in a table in our production environment,
non-production environments, and then we can see the plans.

150
00:08:27,327 --> 00:08:28,467
Now, back to the topic.

151
00:08:28,527 --> 00:08:37,130
If we have a query and the plan, and
now we want to ensure that it's a good idea and we need

152
00:08:37,130 --> 00:08:40,199
to approve it, of course, a direct comparison of before and after.

153
00:08:41,099 --> 00:08:42,359
It's already good, right?

154
00:08:42,359 --> 00:08:49,349
So we have one query and we see that our change, for example,
we create an index or we do something else, we change

155
00:08:49,349 --> 00:08:53,386
the query somehow, and we directly see that the IO numbers improved.

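The direct before/after comparison of IO numbers discussed here usually means running the query with BUFFERS on both sides; the query itself is illustrative:

```sql
-- Run before and after the change (e.g. creating an index) and compare
-- the "shared hit/read" buffer counts rather than timing, which is
-- environment-dependent.
EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM orders
WHERE customer_id = 42
ORDER BY created_at DESC
LIMIT 10;
```
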
156
00:08:54,076 --> 00:08:55,696
So the conclusion is: let's do it.

157
00:08:56,236 --> 00:09:02,430
But as you mentioned in the very beginning, of course there
are some cases when our change can affect other queries.

158
00:09:02,975 --> 00:09:07,761
And the question is how to find it holistically, how to test it holistically.

159
00:09:07,821 --> 00:09:09,201
And this is an unsolved problem.

160
00:09:09,621 --> 00:09:13,695
First of all, let's consider one case where it can happen.

161
00:09:13,695 --> 00:09:15,705
I had it and I wrote an article about it.

162
00:09:16,120 --> 00:09:17,230
We will provide it.

163
00:09:17,440 --> 00:09:23,289
So it's an interesting case when optimization of an index led to

164
00:09:23,458 --> 00:09:26,060
all updates being worse in terms of execution.

165
00:09:26,270 --> 00:09:26,690
Why?

166
00:09:26,733 --> 00:09:27,783
The idea was simple.

167
00:09:27,860 --> 00:09:37,706
For example, we have some SELECT. It's quite slow, or maybe not slow,
but we see that it uses some index which we can reduce in size significantly.

168
00:09:37,711 --> 00:09:41,666
By adding a WHERE clause. We add a WHERE clause, everything is fine.

169
00:09:41,796 --> 00:09:48,466
This SELECT has better performance because the index size is
smaller, and so on, because now it's partial, right?

170
00:09:49,173 --> 00:09:55,810
But then we deploy it, and suddenly we see that the
latency of all updates, on average, got worse.

171
00:09:56,205 --> 00:09:56,745
Why?

172
00:09:56,775 --> 00:10:01,444
Because we lost HOT updates. Because adding some column to an

173
00:10:01,864 --> 00:10:10,876
index definition, even if it's in the WHERE clause, leads to
different behavior of updates, because HOT updates, heap-only

174
00:10:10,876 --> 00:10:19,660
tuple updates, are possible only if no indexes have, in their
definitions, the column which we are changing.

175
00:10:20,362 --> 00:10:20,564
Right.

176
00:10:21,006 --> 00:10:29,616
So if we change some column, a usual case is, for example, an updated_at
timestamp; we usually tend to change it when we do updates, right?

177
00:10:29,616 --> 00:10:33,044
Because it's the semantics of this column.

178
00:10:33,464 --> 00:10:39,677
But if we have an index on it, no such updates can be HOT, heap-only tuple.

179
00:10:39,917 --> 00:10:46,414
HOT updates are much faster than regular updates, because
during these updates Postgres doesn't touch any indexes.

180
00:10:46,414 --> 00:10:48,904
It doesn't need to change all indexes.

181
00:10:49,324 --> 00:10:56,434
But with regular updates, Postgres needs to update each
index, and this is called index write amplification.

182
00:10:56,584 --> 00:10:57,424
Index, right?

183
00:10:57,424 --> 00:10:58,699
Amplification issue.

184
00:10:58,749 --> 00:11:05,845
One of those issues which Uber highlighted in their
article. So losing HOT updates may be painful.

185
00:11:06,205 --> 00:11:08,987
Michael: Especially on a heavily loaded system for sure.

186
00:11:09,032 --> 00:11:16,592
The other thing, the place I am starting to
see some progress in terms of tooling, is people even

187
00:11:16,592 --> 00:11:19,052
being aware that some of these things are problems.

188
00:11:19,052 --> 00:11:22,836
So, pganalyze added an Index Advisor.

189
00:11:23,136 --> 00:11:24,576
In the last year or two.

190
00:11:24,966 --> 00:11:29,747
And one of the things that it points out is
an estimate of write performance overhead.

191
00:11:29,747 --> 00:11:39,397
So not only, not only can indexes be a problem in terms of preventing
hot updates, they also add right overhead if a new rose, for example.

192
00:11:39,602 --> 00:11:40,780
Nikolay: yeah, any index.

193
00:11:40,850 --> 00:11:44,540
If you create one more index, definitely you add overhead.

194
00:11:45,020 --> 00:11:45,380
Definitely.

195
00:11:45,530 --> 00:11:47,900
And we can actually estimate, we can measure.

196
00:11:48,445 --> 00:11:52,135
And this is interesting, and it's definitely possible to improve the tooling.

197
00:11:52,285 --> 00:11:52,915
Definitely.

198
00:11:53,005 --> 00:11:54,799
I don't see the best tools here yet.

199
00:11:55,189 --> 00:11:58,042
I made a couple of attempts to create tools.

200
00:11:58,282 --> 00:11:58,872
One of the

201
00:11:59,139 --> 00:12:06,417
ideas was to create a benchmarking tool which would
be used for any change related to the database.

202
00:12:06,422 --> 00:12:08,614
It, it was called Nancy Bot.

203
00:12:08,824 --> 00:12:13,444
And the idea was we will replay workload before and after our change.

204
00:12:13,474 --> 00:12:20,502
We will collect all metrics, for example from pg_stat_statements,
from logs, auto_explain, everything, and then we will

205
00:12:20,682 --> 00:12:28,812
provide the direct comparison Before and after, and we see
exactly which queries degraded and which queries improved.

206
00:12:29,082 --> 00:12:30,732
And this was a great idea.

207
00:12:30,732 --> 00:12:32,082
We even had a good interface.

208
00:12:32,082 --> 00:12:39,777
But then I realized that practically it's not possible
to run full-fledged benchmarks every time a developer makes

209
00:12:39,837 --> 00:12:42,897
any change, because it just economically doesn't make sense.

210
00:12:43,407 --> 00:12:44,397
It's too expensive.

211
00:12:44,397 --> 00:12:49,107
You need a separate machine, or a couple of
machines, sometimes several in a batch.

212
00:12:49,107 --> 00:12:57,339
But sometimes we ran several experiments, like: what if we do
this, this, this? Like, we have 10 options and we executed them.

213
00:12:57,609 --> 00:12:59,139
Then questions arise.

214
00:12:59,169 --> 00:13:03,901
Like, we executed them in parallel, not
sequentially, because sequentially is too slow,

215
00:13:04,321 --> 00:13:06,991
If, if each experiment takes 10 minutes.

216
00:13:07,576 --> 00:13:10,756
You want to execute in parallel, but then questions arise.

217
00:13:10,966 --> 00:13:11,356
Okay.

218
00:13:11,356 --> 00:13:20,931
But what about: are all virtual machines the same, or are there some
deviations? For example, one of them got a worse disk or CPU or something.

219
00:13:21,171 --> 00:13:21,681
So we

220
00:13:21,681 --> 00:13:21,891
had to

221
00:13:22,056 --> 00:13:22,536
Michael: A noisy neighbor.

222
00:13:22,906 --> 00:13:23,266
Yeah.

223
00:13:23,331 --> 00:13:24,411
Nikolay: Right, right, right, right.

224
00:13:24,411 --> 00:13:31,205
So we added a lot of logic, like micro-benchmarks, to ensure
that some baseline is met, and so on and so on.

225
00:13:31,565 --> 00:13:32,675
And the complexity grew.

226
00:13:32,675 --> 00:13:37,844
But the main problem was you just need to pay a lot to do such experiments.

227
00:13:37,934 --> 00:13:43,876
Some companies can afford it. Also, one of
the biggest problems was how to replay the workload.

228
00:13:44,126 --> 00:13:46,106
There were pgreplay and pgreplay-

229
00:13:46,106 --> 00:13:46,376
go.

230
00:13:46,376 --> 00:13:58,613
Maybe yesterday I saw an article from Franck Pachot about
how to run pgreplay, or actually pgreplay-go, for YugabyteDB.

231
00:13:58,613 --> 00:14:01,587
But collecting queries is also a challenge.

232
00:14:01,587 --> 00:14:02,846
And I did it several times.

233
00:14:02,846 --> 00:14:04,436
And so, the bottom line:

234
00:14:04,956 --> 00:14:06,816
It's quite complex.

235
00:14:06,846 --> 00:14:08,226
It's possible to do it.

236
00:14:08,272 --> 00:14:13,769
Some companies actually implemented something, but
it's very hard to create a universal tool here.

237
00:14:14,069 --> 00:14:16,679
But still, there is a need here.

238
00:14:16,733 --> 00:14:20,145
We want to cover all changes with some testing, right?

239
00:14:20,688 --> 00:14:24,146
But the good news here is that we don't need full-fledged

240
00:14:24,581 --> 00:14:27,121
clusters. We can do it in a shared environment.

241
00:14:27,121 --> 00:14:32,311
We can focus on IO metrics and run many experiments on just one machine.

242
00:14:32,708 --> 00:14:36,758
This is what we do with Database Lab, with thin clones and database branching.

243
00:14:37,273 --> 00:14:38,143
and it's possible.

244
00:14:38,353 --> 00:14:40,543
So we can have thin clones.

245
00:14:40,603 --> 00:14:46,433
Of course they have different timing, because of a different
file system, for example ZFS or anything else.

246
00:14:46,502 --> 00:14:47,762
Btrfs is also an option.

247
00:14:47,832 --> 00:14:52,662
And then you run multiple experiments right on
one machine and you don't care about timing.

248
00:14:52,692 --> 00:14:54,912
You care about plan structures and

249
00:14:54,971 --> 00:14:57,911
IO numbers, and you can verify many things.

250
00:14:58,121 --> 00:14:59,141
So question.

251
00:14:59,141 --> 00:15:00,671
Next question is what to verify.

252
00:15:00,850 --> 00:15:01,075
Okay.

253
00:15:01,075 --> 00:15:02,155
We have our query.

254
00:15:02,155 --> 00:15:03,265
We want to optimize.

255
00:15:03,265 --> 00:15:04,615
We found some index.

256
00:15:05,460 --> 00:15:07,428
We verified that the query is improved.

257
00:15:07,565 --> 00:15:08,735
How to check other queries?

258
00:15:09,125 --> 00:15:11,747
Well, some new tooling needs to be built here, right?

259
00:15:11,897 --> 00:15:14,549
And I think it consists of two areas.

260
00:15:14,549 --> 00:15:16,491
The first area is reaction.

261
00:15:16,911 --> 00:15:18,891
We need a reactive approach.

262
00:15:18,891 --> 00:15:25,062
We want to grab the content of pg_stat_statements periodically
from production and have some regression testing.

263
00:15:25,782 --> 00:15:27,612
We can reduce the scope.

264
00:15:27,673 --> 00:15:31,302
We can parse queries, or already-normalized queries,

265
00:15:31,592 --> 00:15:33,974
using the pg_query library from Lukas Fittl,

266
00:15:34,529 --> 00:15:42,103
right, to identify which tables are involved. And if we consider
optimization for a particular table, we can find only those

267
00:15:42,103 --> 00:15:46,753
queries from the top 100 or top 500 by total time or by calls.

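A rough sketch of narrowing the regression-testing scope from pg_stat_statements; a crude text filter stands in here for proper parse-tree analysis with pg_query, and the table name is illustrative:

```sql
-- Top queries by total execution time that mention the table we are
-- changing. total_exec_time is the PostgreSQL 13+ column name; older
-- versions call it total_time. The ILIKE filter is a rough stand-in
-- for real parsing of the query text.
SELECT queryid, calls, total_exec_time, mean_exec_time, query
FROM pg_stat_statements
WHERE query ILIKE '%orders%'
ORDER BY total_exec_time DESC
LIMIT 100;
```
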
268
00:15:47,023 --> 00:15:53,887
Somebody needs to decide what is the most important metric and
which queries are most critical for our application, but we

269
00:15:53,887 --> 00:15:56,745
can find those queries which potentially might be affected.

270
00:15:57,015 --> 00:15:58,365
Of course, we can also

271
00:15:58,590 --> 00:16:02,070
think about triggers and various dependencies, foreign keys, and so on.

272
00:16:02,400 --> 00:16:03,466
So this is tricky.

273
00:16:03,466 --> 00:16:10,011
But anyway, we can find them automatically, and using a
thin clone, using a single connection, sequentially,

274
00:16:10,011 --> 00:16:17,401
we can test many queries. But the problem will
be how to find parameters, which parameters to use for

275
00:16:17,692 --> 00:16:19,730
queries. And this is an unsolved problem.

276
00:16:19,730 --> 00:16:24,480
I already mentioned that this is a
big problem: which parameters to test.

277
00:16:25,370 --> 00:16:27,440
Michael: In, in reality, what do you see?

278
00:16:27,470 --> 00:16:35,090
Like, in reality, what I see people doing, I think probably at
a smaller scale than you, is that they reason about the potential side

279
00:16:35,750 --> 00:16:38,300
effects and discuss it amongst the team.

280
00:16:38,390 --> 00:16:46,312
During the PR process, during their discussions.
Then they deploy to production and monitor.

281
00:16:46,352 --> 00:16:47,732
Maybe they provision a little bit

282
00:16:47,912 --> 00:16:49,502
Nikolay: Purely reactive approach, right?

283
00:16:49,502 --> 00:16:51,962
So we, we deploy and see, yeah, this is possible.

284
00:16:51,962 --> 00:16:53,702
First of all, of course, code review helps.

285
00:16:53,746 --> 00:16:58,465
And having a bunch of very experienced Postgres experts helps a lot, right?

286
00:16:58,945 --> 00:17:00,505
They can say, Oh, you know what?

287
00:17:00,955 --> 00:17:03,925
This is a good idea, but what will happen?

288
00:17:03,925 --> 00:17:05,155
What overhead will come from it?

289
00:17:05,575 --> 00:17:06,715
Let's think about it.

290
00:17:07,075 --> 00:17:14,387
Of course, we can manually analyze and try to predict, and
if you have 10-plus years of experience, it'll work quite well,

291
00:17:14,867 --> 00:17:18,762
unless you are very tired, very busy, having days off, and so on.

292
00:17:19,482 --> 00:17:28,935
So I still think about a fully automated approach. But yeah, it's
possible, and we can check metrics and have some alerts.

293
00:17:29,265 --> 00:17:34,702
For example, if updates are very important
for us: if the ratio of HOT updates drops

294
00:17:35,272 --> 00:17:38,662
after our release, it can be somehow flagged, alerted, and so on.

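The HOT-update ratio suggested for monitoring here is available in pg_stat_user_tables, so a regression after a release can be spotted with a query like:

```sql
-- Share of HOT updates per table; a sudden drop after adding an index
-- on a frequently updated column is a red flag.
SELECT relname,
       n_tup_upd,
       n_tup_hot_upd,
       round(100.0 * n_tup_hot_upd / nullif(n_tup_upd, 0), 1) AS hot_pct
FROM pg_stat_user_tables
ORDER BY n_tup_upd DESC
LIMIT 10;
```
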
295
00:17:39,082 --> 00:17:40,012
This is a good approach.

296
00:17:40,072 --> 00:17:42,562
Yes, but it's purely reactive.

297
00:17:42,802 --> 00:17:43,072
Right?

298
00:17:43,282 --> 00:17:48,452
And actually, most projects don't even have this.

299
00:17:48,512 --> 00:17:51,332
So I agree this could be a good step towards it.

300
00:17:52,052 --> 00:17:59,882
But back to the fully automated fashion: which queries to check? We
need to check queries from production for regression testing.

301
00:17:59,882 --> 00:18:00,632
And that's possible.

302
00:18:00,632 --> 00:18:07,667
And in a couple of cases we actually already have something like this.
With our Database Lab Engine, everything is done in CI/CD,

303
00:18:08,957 --> 00:18:15,390
and the users see the difference and
potential degradation before we deploy.

304
00:18:15,420 --> 00:18:20,310
It's very good, but additionally we
need to think about absolutely new queries.

305
00:18:20,310 --> 00:18:28,173
For example, if it's a new feature, a new feature branch,
and it introduces some new queries, or rewrites of

306
00:18:28,173 --> 00:18:31,143
existing queries, then production doesn't have these queries.

307
00:18:31,143 --> 00:18:34,173
So the second part of it is even more challenging.

308
00:18:34,503 --> 00:18:37,113
We need to test absolutely new queries here.

309
00:18:37,443 --> 00:18:42,976
And I'm just explaining our thoughts; we are moving
in this direction with Database Lab.

310
00:18:43,115 --> 00:18:54,580
I'm excited that in the future we will have fully automated testing of
all things related to databases, even for those who are not experts at all.

311
00:18:55,060 --> 00:19:02,847
And when experts are involved, they have a lot of artifacts
and metrics to analyze, and understand much quicker what to

312
00:19:02,847 --> 00:19:06,507
do and how to fix it, so everyone can move faster.

313
00:19:07,062 --> 00:19:08,412
With better quality and so on.

314
00:19:08,818 --> 00:19:17,261
So right now, without this tooling, the idea is: we need to
have code review for every database schema change.

315
00:19:17,598 --> 00:19:23,628
We need to test the change itself, but
also think about how it'll affect the workload.

316
00:19:24,423 --> 00:19:24,723
Michael: Yeah.

317
00:19:25,083 --> 00:19:29,996
And I guess, as a first step, some people
don't even check the locks involved, right?

318
00:19:29,996 --> 00:19:37,450
That's, I guess, maybe 101 rather than
102, but making sure that it's not a blocking change.

319
00:19:37,450 --> 00:19:39,310
I know this has got a lot better in more

320
00:19:39,625 --> 00:19:40,255
Nikolay: Well, right.

321
00:19:40,540 --> 00:19:44,830
I don't even discuss this, considering it too trivial.

322
00:19:44,890 --> 00:19:45,160
Right.

323
00:19:45,160 --> 00:19:49,960
So, like, index creation should use CONCURRENTLY, and updates should be in batches.

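The "101" checks mentioned here, non-blocking index creation and batched updates, look roughly like this; the names are illustrative:

```sql
-- Fail fast instead of queueing behind long transactions while waiting
-- for a lock.
SET lock_timeout = '1s';

-- CONCURRENTLY builds the index without blocking writes (note: it
-- cannot run inside a transaction block).
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_orders_customer_id
    ON orders (customer_id);

-- Backfill in batches instead of one huge UPDATE; repeat until
-- 0 rows are affected.
UPDATE orders
SET status = 'archived'
WHERE id IN (
    SELECT id FROM orders
    WHERE status IS NULL
    LIMIT 1000
);
```
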
324
00:19:50,410 --> 00:19:50,740
Yeah.

325
00:19:50,740 --> 00:19:57,045
And this we actually solved with Database Lab: we
have fully automated verification that no exclusive locks

326
00:19:57,045 --> 00:20:00,045
last more than a specified number of seconds.

327
00:20:00,255 --> 00:20:01,875
Well, for example, one second. Yeah.

328
00:20:02,355 --> 00:20:05,595
This is possible to fully automate for the change itself.

329
00:20:06,086 --> 00:20:10,856
We spent some time understanding how to test
the change itself, and have largely solved it already.

330
00:20:11,336 --> 00:20:17,443
We can put it into your CI/CD, but now
the question is, I think, a bigger one.

331
00:20:17,743 --> 00:20:25,756
For example, some developer created something but forgot an
index — how do we see whether it'll be better if we deploy?

332
00:20:26,406 --> 00:20:33,848
And how to see it without involving very, very
expensive and very experienced DBAs every time, right?

333
00:20:34,268 --> 00:20:45,000
So we should involve them only if we have a hard question, not a simple
question like a forgotten LIMIT, or an index which degrades our other queries.

334
00:20:46,410 --> 00:20:46,740
Michael: Yeah.

335
00:20:46,800 --> 00:20:49,023
And, I was — you mentioned pgreplay-go.

336
00:20:49,047 --> 00:20:54,910
That's from the company I used to work at,
GoCardless, and I think that's where it came from.

337
00:20:55,000 --> 00:20:55,420
Nikolay: Right.

338
00:20:55,670 --> 00:20:56,143
Yeah.

339
00:20:56,145 --> 00:20:59,595
Michael: So I think that's how I remember it being used most.

340
00:20:59,595 --> 00:21:02,595
It was big changes, it was risky things.

341
00:21:02,595 --> 00:21:05,235
It was things that people already knew were potentially

342
00:21:05,480 --> 00:21:05,880
Nikolay: Also.

343
00:21:05,950 --> 00:21:09,202
For example, a big functionality.

344
00:21:09,202 --> 00:21:14,212
We deployed big functionality, or we had
some refactoring over several months.

345
00:21:14,242 --> 00:21:14,602
Right,

346
00:21:15,172 --> 00:21:16,282
Michael: large migration, that

347
00:21:16,312 --> 00:21:16,732
Nikolay: Right.

348
00:21:17,212 --> 00:21:17,362
Right.

349
00:21:17,542 --> 00:21:24,987
Well, in my opinion, benchmarks with multiple sessions should
be applied very rarely, when something big changes: either, like,

350
00:21:25,287 --> 00:21:33,687
a big new release happens, or migration to a new major Postgres version,
or a change of infrastructure, a change of operating system.

351
00:21:34,107 --> 00:21:38,307
It happens a couple of times per quarter maximum, usually, right?

352
00:21:38,337 --> 00:21:38,697
Even in

353
00:21:39,177 --> 00:21:41,097
big companies — much less, normally.

354
00:21:41,097 --> 00:21:44,307
It doesn't depend on the size of the company, actually, because anyway,

355
00:21:45,242 --> 00:21:47,492
it's a very expensive change, usually.

356
00:21:47,877 --> 00:21:52,809
And it's good that this is when we need benchmarks, full-fledged benchmarks.

357
00:21:52,809 --> 00:21:58,809
So not every day, but application changes might
happen every day, sometimes several times per day.

358
00:21:59,349 --> 00:22:02,259
And they might change some small part of database.

359
00:22:02,559 --> 00:22:04,509
Good news: it's only a small part of the database, right?

360
00:22:04,509 --> 00:22:06,843
We don't perform huge refactoring every day.

361
00:22:07,263 --> 00:22:09,693
So if we change only one table, two tables,

362
00:22:10,483 --> 00:22:13,243
we should be able to test it in a more lightweight manner.

363
00:22:13,243 --> 00:22:18,586
This is why I'm such a big fan of thin cloning, and I'm so excited to see others

364
00:22:18,586 --> 00:22:26,086
also thinking about database branching. Neon — they say a lot about
it, like, they say you can test a lot of things in CI/CD.

365
00:22:26,691 --> 00:22:27,831
It'll be cheap.

366
00:22:27,838 --> 00:22:31,954
Actually, Aurora has — as we discussed, Aurora has thin clones, but it's not cheap.

367
00:22:31,954 --> 00:22:35,464
You need to pay for compute power for each clone separately.

368
00:22:35,934 --> 00:22:37,054
I hope the new one

369
00:22:37,589 --> 00:22:37,789
Michael: Interesting.

370
00:22:38,164 --> 00:22:45,934
Nikolay: will be different — running on one machine, and you pay for a single
machine, but run many, many tests at the same time on that one machine.

371
00:22:46,264 --> 00:22:47,734
Otherwise it doesn't scale.

372
00:22:47,824 --> 00:22:49,684
Because you sometimes need dozens of them,

373
00:22:50,229 --> 00:22:56,471
all happening in parallel. If you have a big organization,
it can be very costly, so you will turn it off and you

374
00:22:56,471 --> 00:23:00,110
will be without testing, returning to bad times again.

375
00:23:00,704 --> 00:23:06,206
But the advice right now is: yes, have database
reviews, try to automate some things.

376
00:23:06,304 --> 00:23:09,122
Learn your pg_stat_user_tables.

377
00:23:09,847 --> 00:23:15,457
Right, because it shows you how many updates
and HOT updates you have — some tuple statistics.

378
00:23:15,457 --> 00:23:17,712
It's very good to understand it.
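
[Editor's note: a query along these lines shows the update and HOT-update counts per table — the table name is illustrative:]

```sql
-- Tuple statistics: how many updates happen, and how many are HOT
SELECT relname,
       n_tup_upd,
       n_tup_hot_upd,
       round(100.0 * n_tup_hot_upd / nullif(n_tup_upd, 0), 1) AS hot_update_pct
FROM pg_stat_user_tables
WHERE relname = 'orders';
```

A sudden drop in the HOT percentage after an index change is exactly the "oops, I lost updates" moment described here.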

379
00:23:17,892 --> 00:23:21,042
I don't agree with the Datadog approach.

380
00:23:21,166 --> 00:23:27,226
If you go to the database dashboard in Datadog — they have
a lot of improvements lately, over the last year or two.

381
00:23:27,826 --> 00:23:33,805
But if you check what they name — what they present as tuple statistics?

382
00:23:34,013 --> 00:23:34,703
No, No.

383
00:23:34,733 --> 00:23:35,863
Throughput is TPS,

384
00:23:35,868 --> 00:23:36,413
QPS.

385
00:23:36,443 --> 00:23:37,133
This is throughput.

386
00:23:37,304 --> 00:23:44,794
Our global throughput for our database — that's not tuple statistics. Tuple statistics
are important, but they should be detailed. But that's off topic anyway.

387
00:23:44,794 --> 00:23:51,782
Understanding, when you change something for a table, the tuple
statistics from pg_stat_user_tables — it's a good thing to have.

388
00:23:52,319 --> 00:23:55,562
And you can see: oh, I lost HOT updates.

389
00:23:55,622 --> 00:23:56,222
Oops.

390
00:23:57,182 --> 00:23:57,422
Right.

391
00:23:57,482 --> 00:24:01,502
Michael: How would your advice change
if we're talking about smaller teams?

392
00:24:01,502 --> 00:24:10,109
If we're talking, let's say like you're a six person startup, you've got
a decent amount of data, but not necessarily, you know,  not billions

393
00:24:10,109 --> 00:24:17,992
of rows. And maybe the team is currently prioritizing moving fast, and
if they have a few problems that they have to fix, that's okay.

394
00:24:18,142 --> 00:24:19,762
Are there any changes in this case?

395
00:24:19,762 --> 00:24:24,837
Because this feels like a very safe approach —
the kind of tooling we've discussed seems very much

396
00:24:24,837 --> 00:24:29,037
optimized for trying to avoid a problem at almost any cost.

397
00:24:29,951 --> 00:24:31,510
Nikolay: Well, yeah, good question.

398
00:24:31,990 --> 00:24:37,510
First of all, I would anyway change the mindset
to use buffers instead of timing.

399
00:24:37,720 --> 00:24:41,050
Even if you are small, because it will save you so much.

400
00:24:41,080 --> 00:24:47,230
Of course there is such a thing as premature
optimization, when you try to optimize too early, too much.

401
00:24:47,230 --> 00:24:54,103
But still, if you know that some table — you
will use it, like, in the queue pattern:

402
00:24:54,713 --> 00:24:57,203
Insert, update and delete, delete, insert, update, delete.

403
00:24:57,203 --> 00:25:04,258
it'll bloat a lot, so you need some experiment anyway. This
experiment, by the way, can still be a single-user, single-session

404
00:25:04,258 --> 00:25:11,638
experiment in a shared environment, because you don't actually
need to use pgbench and utilize all CPUs. You don't need

405
00:25:11,723 --> 00:25:17,637
some kind of background workload, unless
this workload also deals with your table.

406
00:25:17,937 --> 00:25:27,691
What you do need is to think about the future — like, what will happen in one
or two years with this table, how big it'll be — and run some benchmark

407
00:25:27,901 --> 00:25:34,268
with one session to fill it, and then to run one more session next.

408
00:25:34,268 --> 00:25:41,799
You run it to update it heavily and so on, and see what
bloat there will be, and: do you have good indexes here or not?

409
00:25:41,829 --> 00:25:50,079
If you focus on buffers, you will quickly... This mindset
change is very important, because you will start comparing —

410
00:25:50,079 --> 00:25:58,599
like in your tool, pgMustard, you compare,
for example, rows to buffers — buffer numbers to get one row.

411
00:25:58,639 --> 00:26:00,489
We deal with a thousand buffers.

412
00:26:00,609 --> 00:26:02,559
It doesn't feel good at all.
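
[Editor's note: that buffers-per-row comparison comes from EXPLAIN; a sketch, with an illustrative query:]

```sql
-- Show buffer counts alongside row counts for each plan node
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders WHERE user_id = 42;
-- If a node reads thousands of shared buffers to return one row,
-- that's the mismatch being described
```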

413
00:26:02,979 --> 00:26:06,549
And in one year it will be even worse,

414
00:26:06,903 --> 00:26:11,103
probably, so we should prepare for further growth right here, in this area.

415
00:26:11,343 --> 00:26:12,573
So we need to test it.

416
00:26:12,813 --> 00:26:18,528
We need to test our queries and see what will
happen if we have more data, for example.

417
00:26:19,248 --> 00:26:25,698
Michael: Yeah, the only other thing I would add is
I think I see a couple of extreme failure cases here.

418
00:26:25,698 --> 00:26:30,528
I see the failure case where people have not
really started adding many indexes at all.

419
00:26:30,618 --> 00:26:34,218
They almost have no indexes, and then they
start to scale and they have every problem.

420
00:26:34,473 --> 00:26:35,073
Nikolay: On every column, right?

421
00:26:35,163 --> 00:26:35,433
Yeah,

422
00:26:35,568 --> 00:26:36,438
Michael: Or Yeah, exactly.

423
00:26:36,438 --> 00:26:40,428
Or the other extreme where they've added
an index every time they've seen a problem.

424
00:26:40,443 --> 00:26:41,055
Or Yeah.

425
00:26:41,175 --> 00:26:44,535
I've not seen an every-column case; I've heard about them.

426
00:26:45,135 --> 00:26:49,935
But definitely, you know, every time they see a
potential for a multi column index, they add one.

427
00:26:49,935 --> 00:26:54,345
So a single table — even a table of six columns — could have 20 indexes on it.

428
00:26:54,645 --> 00:26:55,924
Different orders, different.

429
00:26:55,924 --> 00:27:02,440
So I've seen both of those failure cases, and I guess if you know you're
in one of those, then it kind of changes your approach quite a lot.

430
00:27:02,500 --> 00:27:08,440
If you're in the "we already have too many indexes" case, maybe you
need to be aware of that and be looking at reducing them.

431
00:27:08,480 --> 00:27:15,740
If you know you're in the "we don't have enough" case, maybe you've got a
little more leeway and don't have to worry as much about the impact of adding

432
00:27:15,815 --> 00:27:20,915
Nikolay: You need to understand your workload
to know what to test — this is true, but...

433
00:27:20,974 --> 00:27:24,526
Change of mindset to buffers and just experiment a lot.

434
00:27:24,531 --> 00:27:27,046
This is a good thing, like experiment, experiment, experiment.

435
00:27:27,046 --> 00:27:36,386
And you will see. And then, if
everything is based on human decisions — like, based on gut, right?

436
00:27:36,866 --> 00:27:43,442
In this case, at some point you will start skipping some checks because
you already know that it's okay — but sometimes you'll be wrong.

437
00:27:43,532 --> 00:27:44,942
Anyway, this is a bad thing,

438
00:27:45,812 --> 00:27:47,912
Michael: Yeah, in fact, the buffers thing came up.

439
00:27:47,912 --> 00:27:57,285
Just the other day I saw a really interesting blog post from a
company that moved from RDS for Postgres to Aurora for Postgres.

440
00:27:58,010 --> 00:28:01,430
They, in my opinion, must have not had enough indexes.

441
00:28:01,430 --> 00:28:06,810
They must have been doing too much I/O, because their costs
maybe tripled or something, according to the graphs.

442
00:28:06,870 --> 00:28:14,450
And it was really interesting to see that and see that they thought
this simple migration would be straightforward, but actually because

443
00:28:14,450 --> 00:28:20,853
they hadn't optimized for buffers, hadn't been, you know,
quite efficient on the I/O front, it hit them with a big bill.

444
00:28:21,168 --> 00:28:21,828
Nikolay: Yeah.

445
00:28:22,158 --> 00:28:26,592
On Aurora, optimization is not only about performance, it's about money as well.

446
00:28:26,652 --> 00:28:26,922
Right.

447
00:28:26,922 --> 00:28:35,952
And I/O-centric optimization is the best thing there,
because if you focus on reduction of I/O — I don't remember,

448
00:28:35,952 --> 00:28:37,782
is it about writes or reads?

449
00:28:37,782 --> 00:28:39,162
I remember something different.

450
00:28:39,167 --> 00:28:41,532
There was this detail, I already forgot.

451
00:28:42,012 --> 00:28:45,169
But either writes or reads — if you have a lot of them,

452
00:28:45,884 --> 00:28:48,704
you need to pay for them at Aurora.

453
00:28:48,704 --> 00:28:54,468
So you definitely want to reduce I/O, and you want
your buffer pool to have better efficiency —

454
00:28:54,848 --> 00:29:01,231
a better ratio in terms of hit rate versus reads. And you can
just check your queries and optimize them, and so on.

455
00:29:01,231 --> 00:29:09,389
And if, for example, indeed, the index you
added broke all your HOT updates —

456
00:29:09,479 --> 00:29:12,059
so the HOT updates ratio became zero.

457
00:29:12,119 --> 00:29:13,769
This is what happened with me a couple of times.

458
00:29:14,459 --> 00:29:19,409
In this case, you'll start paying more. So
you optimized one query, but other queries

459
00:29:20,581 --> 00:29:21,091
degraded.

460
00:29:21,314 --> 00:29:24,794
you have worse performance for other queries, but you're also paying more.

461
00:29:24,794 --> 00:29:25,334
It's not good.

462
00:29:25,801 --> 00:29:28,761
So my advice is: know your workload.

463
00:29:28,881 --> 00:29:31,581
Learn it from pg_stat_statements, for example, right?

464
00:29:31,582 --> 00:29:32,952
Focus on buffers.
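
[Editor's note: with pg_stat_statements, focusing on buffers rather than timing might look like this — the column choice is one reasonable option, not a prescription:]

```sql
-- Rank queries by total buffer traffic instead of by execution time
SELECT queryid,
       calls,
       shared_blks_hit + shared_blks_read AS total_buffers
FROM pg_stat_statements
ORDER BY total_buffers DESC
LIMIT 10;
```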

465
00:29:33,687 --> 00:29:34,797
Not on timing.

466
00:29:35,307 --> 00:29:42,267
Optimizing timing is our final goal for optimization,
but it should not be used during the process of optimization.

467
00:29:42,597 --> 00:29:45,747
It should be only like checked in the beginning and the end.

468
00:29:45,987 --> 00:29:54,057
But inside the process of optimization, we should be focused
on buffers. And, finally: experiment a lot. And to experiment a lot,

469
00:29:54,084 --> 00:29:54,467
you —

470
00:29:54,758 --> 00:29:58,312
just be ready to clone and reset a lot, and so on.

471
00:29:58,312 --> 00:30:05,992
That's why I think thin cloning is good, and as I've said, Aurora has thin
cloning, so you can run additional load and test it there, then stop it.

472
00:30:06,532 --> 00:30:11,153
I guess they also have, like, per-minute or even per-second billing.

473
00:30:11,733 --> 00:30:14,793
AWS in general has it for nodes.

474
00:30:14,793 --> 00:30:16,203
So they are per second.

475
00:30:16,203 --> 00:30:17,860
So it can be used very efficiently.

476
00:30:18,226 --> 00:30:25,186
We will need to wait some time, like minutes, to
provision a node, but then at least you can quickly stop it.

477
00:30:25,246 --> 00:30:25,529
Right?

478
00:30:25,529 --> 00:30:30,625
But pure thin clones — I name them local thin clones.

479
00:30:30,630 --> 00:30:33,155
So we have one VM and many clones inside it.

480
00:30:33,215 --> 00:30:35,015
So we pay only for one VM.

481
00:30:35,375 --> 00:30:35,675
It's the best

482
00:30:36,535 --> 00:30:42,805
for experimenting. So having such a platform
for experimenting is a good thing.

483
00:30:42,955 --> 00:30:45,775
Anyway, the general advice is: experiment a lot.

484
00:30:46,585 --> 00:30:46,945
Michael: Yeah.

485
00:30:48,565 --> 00:30:49,165
Sounds great.

486
00:30:50,038 --> 00:30:51,028
Anything else you wanted to add?

487
00:30:51,913 --> 00:30:55,243
Nikolay: Now as usual, subscribe, like, share, provide feedback.

488
00:30:55,243 --> 00:30:56,546
Feedback is very important.

489
00:30:56,555 --> 00:31:02,707
Sharing is also very important — in your social networks,
working groups, like Discord, Telegram, whatever you use.

490
00:31:02,707 --> 00:31:05,167
Please share and bring us more ideas.

491
00:31:05,317 --> 00:31:08,977
Actually, we have a line, a queue of ideas anyway, but we want more.

492
00:31:09,924 --> 00:31:10,614
Michael: Yeah, exactly.

493
00:31:10,619 --> 00:31:14,222
It's not always that a request comes in and we do that exact topic.

494
00:31:14,222 --> 00:31:19,802
Sometimes it gives us an idea for a slightly different one, or we
put two together and that makes a good episode, that kind of thing.

495
00:31:19,952 --> 00:31:20,252
So

496
00:31:20,882 --> 00:31:21,332
we'll see.

497
00:31:21,572 --> 00:31:22,052
Nikolay: exactly.

498
00:31:22,712 --> 00:31:23,102
Okay.

499
00:31:23,177 --> 00:31:23,867
Michael: Thank you so much, everybody.

500
00:31:24,107 --> 00:31:24,617
Take care.

501
00:31:25,172 --> 00:31:25,412
Nikolay: Bye.

502
00:31:25,412 --> 00:31:25,622
Bye.