1
00:00:00,010 --> 00:00:04,090
Michael: Hello and welcome to Postgres fm,
a weekly show about all things PostgreSQL.

2
00:00:04,180 --> 00:00:08,020
I'm Michael, founder of PG Mustard, and this
is my Cohot Cola, founder of Postgres ai.

3
00:00:08,350 --> 00:00:09,760
Hey Nicola, what are we talking about today?

4
00:00:10,152 --> 00:00:12,732
Nikolay: Hot updates, Hot updates, or hot updates.

5
00:00:13,032 --> 00:00:14,022
Let's discuss this

6
00:00:14,322 --> 00:00:16,422
Michael: Heap only two pull topple.

7
00:00:17,022 --> 00:00:20,262
We, are at that point again, heap only, tap updates.

8
00:00:20,529 --> 00:00:21,923
Nikolay: let's start from Short story.

9
00:00:21,923 --> 00:00:24,563
First time I was in the US it was 2008.

10
00:00:24,563 --> 00:00:28,433
I came for, for the conference in Maryland and I was in New York.

11
00:00:28,433 --> 00:00:31,643
First time I entered to Starbucks and they asked me what I want.

12
00:00:31,643 --> 00:00:33,743
I say coffee, like cappuccinos.

13
00:00:34,103 --> 00:00:41,522
I think they ask me, should it be hot or cold, and
I say, and they gimme, of course, cold because they

14
00:00:41,522 --> 00:00:46,312
hear, you know, American pronunciation is hot, not hot.

15
00:00:46,882 --> 00:00:47,572
Michael: So hot.

16
00:00:48,052 --> 00:00:54,112
Nikolay: And if you say hot, they can hear
cold, actually, Cause it's Oh, oh . So.

17
00:00:54,112 --> 00:00:54,292
Okay.

18
00:00:54,292 --> 00:00:55,912
Hot updates or hot updates.

19
00:00:56,542 --> 00:00:58,612
Michael: Who serves cold cappuccinos.

20
00:00:58,612 --> 00:00:59,002
Wow.

21
00:00:59,092 --> 00:01:02,812
Nikolay: Yeah, it's, it's, it's, You
should live here for a while to understand

22
00:01:03,112 --> 00:01:06,682
Michael: Yeah, I wish I lived in a climate
where that was a reasonable thing to do.

23
00:01:06,832 --> 00:01:07,312
But yeah.

24
00:01:07,372 --> 00:01:12,127
So  actually, this is a listener question and topic suggestion.

25
00:01:12,127 --> 00:01:13,687
So thank you to Andy for this one.

26
00:01:13,756 --> 00:01:17,670
The question was effectively when should I think about tuning fill factor?

27
00:01:17,670 --> 00:01:20,501
What kind of what kind of workloads?

28
00:01:20,506 --> 00:01:20,801
Yes.

29
00:01:21,018 --> 00:01:21,318
Yes.

30
00:01:21,378 --> 00:01:23,744
For tables, I guess, but he actually didn't specify.

31
00:01:23,906 --> 00:01:27,484
But yeah,  because he an asked about hot updates, I'm assuming tables.

32
00:01:27,964 --> 00:01:30,914
And what kind of workloads would it be worth customizing for?

33
00:01:30,974 --> 00:01:32,624
So, yeah, really good question.

34
00:01:32,834 --> 00:01:40,502
And to answer it, we kind of really have to dive into what heap only
type updates are how they help the advantages, that kind of thing.

35
00:01:41,042 --> 00:01:41,312
Nikolay: Right.

36
00:01:41,312 --> 00:01:49,003
And this is one of the questions which are originally
closer to infrastructure level, like DBA DBRE.

37
00:01:49,018 --> 00:01:58,172
But I see it clearly that developers should learn it as well and understand
it better because like, this is one of administrative topics, which.

38
00:01:58,908 --> 00:02:01,848
Are very beneficial to learn for developers for.

39
00:02:02,502 --> 00:02:06,807
people who write sql, not just maintain progress notes.

40
00:02:07,107 --> 00:02:13,957
Because this depending on which indexes you define and
how you write your queries, you can benefit from hot

41
00:02:13,957 --> 00:02:17,587
updates from having or, or you can avoid it, right?

42
00:02:17,592 --> 00:02:23,515
So of course I wish we had all updates hot, but it's not possible, of course.

43
00:02:23,521 --> 00:02:26,348
And of course fill factor by default is a hundred.

44
00:02:26,528 --> 00:02:30,968
For tables, we should probably should start from basics, right?

45
00:02:31,478 --> 00:02:38,378
What, what fill factories and how updates are
happening according to Postgres and VCC model.

46
00:02:38,678 --> 00:02:42,458
Michael: Yeah, let's start with how updates happen without this optimization.

47
00:02:42,458 --> 00:02:44,479
So before version 8.3.

48
00:02:44,479 --> 00:02:49,009
Before your 2008 story, Actually it was around
about that same time that they, this got added.

49
00:02:49,459 --> 00:02:50,791
Um, All updates.

50
00:02:50,851 --> 00:02:51,031
It,

51
00:02:51,436 --> 00:02:51,856
Nikolay: you say?

52
00:02:52,036 --> 00:02:52,366
Right?

53
00:02:52,816 --> 00:02:56,722
The same version which which got XML data type and functions.

54
00:02:56,722 --> 00:02:57,322
So I remember it

55
00:02:57,322 --> 00:02:58,372
Michael: Oh wow.

56
00:02:58,522 --> 00:02:59,512
Of course your one.

57
00:02:59,662 --> 00:03:01,792
So yes, so same version.

58
00:03:01,912 --> 00:03:05,169
And that means there were times before this optimization.

59
00:03:05,169 --> 00:03:09,159
So, and of course we have regular updates or cold updates.

60
00:03:09,159 --> 00:03:10,989
I've heard, I've seen them called at least once.

61
00:03:11,417 --> 00:03:18,347
So by default Postgres, because due to nvcc
and I guess other design decisions will.

62
00:03:19,022 --> 00:03:21,302
Create a new topple for each.

63
00:03:21,332 --> 00:03:25,172
Like if you do an update on, on a row, you'll get a new couple.

64
00:03:25,282 --> 00:03:32,809
And an you'll have an old one that gets marked eventually as dead
and cleaned up once, once no transactions can see it anymore.

65
00:03:32,812 --> 00:03:33,892
Nikolay: And cleaned by vacuum.

66
00:03:34,192 --> 00:03:34,942
By vacuum.

67
00:03:35,212 --> 00:03:39,802
It's cleaned later as synchronistic, so
independently by our activities, right,

68
00:03:40,102 --> 00:03:40,912
Michael: Yeah, exactly.

69
00:03:41,122 --> 00:03:49,312
So the main optimization here, and there are some caveats,
is that we can, if there's enough space on the page

70
00:03:49,312 --> 00:03:52,765
Nikolay: Which is eight k kb bytes by default and usually it's all.

71
00:03:53,589 --> 00:04:00,519
Michael: Yeah, total size of the page is, is eight
kilobytes and total amount left free is, is generally zero.

72
00:04:00,519 --> 00:04:05,169
So Postgres would try and pack these pages as much as it can by default.

73
00:04:05,328 --> 00:04:07,818
That's, that's the fill factor we were talking about briefly.

74
00:04:07,823 --> 00:04:12,086
So by default we have a hundred percent or a hundred as the default factor.

75
00:04:12,086 --> 00:04:14,156
So it won't leave space in them.

76
00:04:14,426 --> 00:04:14,996
Yeah, exactly.

77
00:04:14,996 --> 00:04:15,506
For tables.

78
00:04:15,511 --> 00:04:15,716
Yeah.

79
00:04:16,296 --> 00:04:17,006
Nikolay: forensic says 90.

80
00:04:17,056 --> 00:04:17,496
Okay.

81
00:04:17,906 --> 00:04:18,566
Michael: Or be treat.

82
00:04:18,566 --> 00:04:18,926
Yeah.

83
00:04:18,931 --> 00:04:19,551
Yeah, exactly.

84
00:04:19,701 --> 00:04:28,005
So this optimization with some caveats
will try and update a row on the same page.

85
00:04:28,005 --> 00:04:35,835
So the new, the new type ends up on the same page as the old one
and tis it up like without needing a vacuum, is my understanding.

86
00:04:36,255 --> 00:04:43,000
So yeah, the main caveat is there needs to be enough space on
the page, and the other one is that it needs to be updating.

87
00:04:43,015 --> 00:04:45,565
Columns that are not indexed in any way.

88
00:04:45,990 --> 00:04:46,410
Nikolay: Right.

89
00:04:46,410 --> 00:04:48,300
And here is very important.

90
00:04:48,450 --> 00:04:48,960
So right.

91
00:04:49,020 --> 00:04:50,709
Two,  conditions should be met.

92
00:04:51,219 --> 00:04:52,209
You're absolutely right.

93
00:04:52,268 --> 00:04:53,635
Enough space in page.

94
00:04:53,852 --> 00:04:58,744
And we, we are, by the way, we are skipping, discussing toast here.

95
00:04:58,744 --> 00:05:05,674
We, excuse me, Like if, if, if we insert value which
exceeds page, it's shrinked and to tables are used,

96
00:05:05,674 --> 00:05:08,719
and shrinked, like two gigabytes also compressed.

97
00:05:08,719 --> 00:05:11,179
Like, but we whiskey it here.

98
00:05:11,479 --> 00:05:12,377
But so, okay.

99
00:05:12,587 --> 00:05:15,204
Of, of course the space should be there inside our page.

100
00:05:15,354 --> 00:05:26,351
, a second reason, second condition says you, you said indexed
columns, but indexed should be defined by like explicitly.

101
00:05:26,356 --> 00:05:27,851
What, what, what does it mean?

102
00:05:27,911 --> 00:05:35,102
Indexed, Of course, if you create index on some
column, this means you, this column is indexed.

103
00:05:35,107 --> 00:05:41,732
Okay, But what if you use column, column inside
wear clothes, meaning you create a partial.

104
00:05:42,824 --> 00:05:44,714
Michael: Yeah, Partial or expression indexes.

105
00:05:44,714 --> 00:05:45,614
Count two, I believe.

106
00:05:46,387 --> 00:05:46,687
Nikolay: Right?

107
00:05:46,687 --> 00:05:55,087
So if, if com is not used in any indexes inside like the main
body of definition, but or including covering indexes, right?

108
00:05:55,417 --> 00:06:00,727
But it's used only inside wear clouds, it's also counted as like being used.

109
00:06:00,732 --> 00:06:04,687
And if you update such com, you, you cannot have hot updates, unfortunately.

110
00:06:05,017 --> 00:06:10,977
And the very popular example here is
having a colon cold, for example, modified.

111
00:06:11,794 --> 00:06:12,284
Michael: Yeah.

112
00:06:12,439 --> 00:06:14,329
Nikolay: and you just modify time and that's it.

113
00:06:14,789 --> 00:06:21,833
And then you say, I want to index only part of my
table where modification happened only recently.

114
00:06:21,983 --> 00:06:22,523
And that's it.

115
00:06:22,583 --> 00:06:25,585
You, you put  like, create some index, but blah, blah, blah.

116
00:06:25,585 --> 00:06:26,485
Were ified it.

117
00:06:27,548 --> 00:06:31,470
Is in this, this here only, for example, and that's it.

118
00:06:31,470 --> 00:06:37,180
If you try to modify, modify it, add, call,
updated, you don't have hot updates, never.

119
00:06:37,240 --> 00:06:41,320
So like, it's not possible or, I mean, you, you, any update.

120
00:06:41,375 --> 00:06:47,015
We'll modify this column because modify that, usually
updated by some trigger or by application itself.

121
00:06:47,015 --> 00:06:56,675
But any update is supposed to, to insert new value inside this
column, meaning you will always be touching a column, which is

122
00:06:56,735 --> 00:07:00,295
participating in some index definition, so you will never have.

123
00:07:00,815 --> 00:07:01,535
Hot updates.

124
00:07:01,805 --> 00:07:06,407
I have an article about it, so  about
this because I, I've hit it in production.

125
00:07:06,407 --> 00:07:10,127
I, I thought I'm optimizing things, but I lost hot updates completely.

126
00:07:10,127 --> 00:07:13,181
And it was uh, it was not good in terms of performance.

127
00:07:13,191 --> 00:07:18,501
Actually question why, Like, Okay, these two
conditions are made, but why do we want it?

128
00:07:18,700 --> 00:07:18,910
Right?

129
00:07:19,140 --> 00:07:19,710
Michael: Yeah, that's a

130
00:07:19,810 --> 00:07:21,050
Nikolay: What's the, what's the benefit?

131
00:07:21,350 --> 00:07:30,110
Michael: So my understanding is the, the, without this optimization
update, heavy workloads become very difficult on Postgres.

132
00:07:30,530 --> 00:07:35,510
So it, we get what's famously called right amplification.

133
00:07:35,680 --> 00:07:40,456
The Uber blog post is like the most extreme,
I think example I've seen of this in the wild.

134
00:07:40,486 --> 00:07:42,953
But if we have a.

135
00:07:43,343 --> 00:07:46,553
Actually, there's a really good ADM blog post on this more recently as well.

136
00:07:46,651 --> 00:07:54,961
But they, the way they explained it I thought was really neat, which is
if we have a single row that has, let's say 10 indexed columns, doing a

137
00:07:54,966 --> 00:08:04,786
single update to that row requires us to not just wrap to a new page in
the heap on the, in the table, but also to update 10 index pages too.

138
00:08:05,386 --> 00:08:07,919
So for one update, we have 11 updates.

139
00:08:07,919 --> 00:08:09,353
Even if that update.

140
00:08:09,781 --> 00:08:11,041
Update an index column.

141
00:08:11,041 --> 00:08:19,124
So even if those are effectively unnecessary updates, whereas
with this optimization, if we are updating a co, a single

142
00:08:19,129 --> 00:08:26,384
column that's not involved in any of those 10 indexes,
we can get away with a single page right instead of 11.

143
00:08:26,714 --> 00:08:28,314
And that's a huge benefit, obviously say

144
00:08:28,464 --> 00:08:28,854
Nikolay: Right.

145
00:08:28,854 --> 00:08:33,024
So yeah, Postgres, All POGS indexes are direct.

146
00:08:33,264 --> 00:08:41,214
Unlike in some other database systems where sometimes there
is discussions should we have indirect indexes as well.

147
00:08:41,244 --> 00:08:49,074
This, these discussions pop up theoretically and probably like we
should, for example, in some, some design could be only primary.

148
00:08:49,851 --> 00:08:57,711
PRI indexes, which support primary keys are direct, meaning that
they point to heap and other indexes could point to primary keys.

149
00:08:58,131 --> 00:09:04,241
And in this case we would not need to update all of
them, but, currently in PO all indexes, point to heap.

150
00:09:04,931 --> 00:09:12,341
They say this page and some of that, and if, and index doesn't
have visibility information, this is the key also to understand it.

151
00:09:12,911 --> 00:09:18,907
So visibility means which is this top debt or still live for our transaction.

152
00:09:19,447 --> 00:09:25,032
And it means that index should uh, you
need to consult with he to understand it.

153
00:09:25,032 --> 00:09:35,433
And, and this means that if you, if the sum topple is dead,
Indexes and, and new ah, also we forgot to say in, in vcc, in PO

154
00:09:35,943 --> 00:09:40,203
basics of it any update means some tap you actually described it.

155
00:09:40,368 --> 00:09:41,238
Table is deleted.

156
00:09:41,238 --> 00:09:42,798
A new, new table is created.

157
00:09:43,098 --> 00:09:45,978
A top is a raw version, physical, raw version.

158
00:09:46,308 --> 00:09:50,268
And since Synex doesn't know, which is life.

159
00:09:50,568 --> 00:09:56,958
It needs to have both old one and new one because old
one probably still needed for some transactions and.

160
00:09:57,498 --> 00:09:58,308
So Right.

161
00:09:58,313 --> 00:10:02,778
If we have 10 indexes we need to create to, to tie, to change all of them.

162
00:10:03,018 --> 00:10:08,628
And later auto, auto vacuum or regular manual
vacuum, we'll need to clean up all of them.

163
00:10:08,868 --> 00:10:10,368
All that interest there.

164
00:10:10,698 --> 00:10:19,368
And also page split can happen like very like affecting
performance even more if some index ari index already

165
00:10:19,368 --> 00:10:23,333
don't have space in some page,  although fill factor for.

166
00:10:23,633 --> 00:10:30,206
Indexes between indexes is 90, so some space should be
there, but if not, then new page needs to be created.

167
00:10:30,206 --> 00:10:33,866
Split happens, like it's, it's penalty for performance,

168
00:10:34,298 --> 00:10:41,476
Michael: While we're on the advantages or the benefits of this, there's a
few others that I know we've covered in other episodes briefly, but as well

169
00:10:41,476 --> 00:10:49,966
as not the update the indexes as well as less bloat in the indexes as, as
a result, there's also potential for less bloat in the table long term.

170
00:10:50,916 --> 00:10:53,166
less right ahead log generation.

171
00:10:53,417 --> 00:10:56,117
Normally I believe it would normally be faster.

172
00:10:56,314 --> 00:10:58,294
The correlated data stays correlated.

173
00:10:58,294 --> 00:11:02,764
So when we were talking about brain indexes, for
example, so there's a bunch of other Oh yeah.

174
00:11:02,794 --> 00:11:07,273
Also I think reduced vacuum overhead because some of this is being tidied up

175
00:11:07,292 --> 00:11:14,917
Nikolay: Right, we, we, lets don't touch additional in page vacuum,
which happens on the fly here, but like, it's too much details.

176
00:11:15,457 --> 00:11:20,765
Usually DB is usually more, Let, let's,
let's focus on performance for developers.

177
00:11:20,801 --> 00:11:23,981
So right heart of that usually are much, much better.

178
00:11:23,981 --> 00:11:28,046
Meaning that you, your actual update is much lighter.

179
00:11:28,166 --> 00:11:31,406
So it's, if you check, explain and analyze buffer.

180
00:11:32,039 --> 00:11:36,509
, you will see that your update modified much fewer buffers, right?

181
00:11:36,839 --> 00:11:39,359
Made the majority or written.

182
00:11:39,659 --> 00:11:40,799
And, and this is great.

183
00:11:41,309 --> 00:11:46,101
And of course here in this topic, we should not use only microanalysis.

184
00:11:46,101 --> 00:11:54,289
We need to check many of uh,  update cases so it makes sense to generate.

185
00:11:54,889 --> 00:12:02,766
Benchmark  and take your data, your, like simulate
to your workload and see how many updates are hot.

186
00:12:03,216 --> 00:12:06,456
It's easy to you, you just need to inspect that.

187
00:12:06,456 --> 00:12:07,446
User tables.

188
00:12:07,656 --> 00:12:09,506
It has both number of.

189
00:12:10,031 --> 00:12:17,028
Taps updated a number of taps, updated in hot
manner, and updated includes hu of course.

190
00:12:17,088 --> 00:12:19,374
So it's like you cannot summarize.

191
00:12:19,864 --> 00:12:21,154
One includes the other.

192
00:12:21,222 --> 00:12:28,850
And the ratio is easy to get and the closer to hundred you
are, the better performance will be because again uh, buffer.

193
00:12:29,415 --> 00:12:32,325
Buffers rule works here as well, right?

194
00:12:32,813 --> 00:12:37,339
The fewer buffers will be touched because fewer indexes need to be adjusted,

195
00:12:37,533 --> 00:12:42,996
Michael: And that's a really good point It's possible that
you are already even without doing any tuning, you might

196
00:12:42,996 --> 00:12:46,366
already be achieving close to a hundred percent hot updates.

197
00:12:46,369 --> 00:12:50,459
So there might be, it might, there might be no need to change your settings.

198
00:12:50,522 --> 00:12:56,762
Equally in that situation, you need to be very aware of that because
if you, if you do anything to prevent hot updates, like you mentioned

199
00:12:56,762 --> 00:13:00,002
briefly earlier, if you add an index yeah, it'll add an, yeah.

200
00:13:00,692 --> 00:13:06,495
Nikolay: yeah, it was a Ruby application, very interesting,
like very good startup acquired by a large company later in

201
00:13:06,495 --> 00:13:09,965
San Francisco, and I was advising them and I said, Oh, here we.

202
00:13:10,500 --> 00:13:15,060
We can reduce our index size and it's scope, just making it partial.

203
00:13:15,060 --> 00:13:15,690
Let's do it.

204
00:13:15,690 --> 00:13:20,550
We will put this column to wear clouds indexes much smaller, faster.

205
00:13:20,610 --> 00:13:20,970
Great.

206
00:13:21,360 --> 00:13:23,963
By the way, also worth pointing that if.

207
00:13:24,485 --> 00:13:31,274
,   we have a partial index if it's scope in
terms of how many rows is out of our change.

208
00:13:31,274 --> 00:13:33,734
So, I mean, change is happening outside.

209
00:13:33,734 --> 00:13:35,954
This index doesn't need to be touched, of course.

210
00:13:35,954 --> 00:13:43,904
So if, if one of our 10 indexes is saying like, it's index,
don't leave this part of table, but we change another part

211
00:13:43,904 --> 00:13:46,124
of table, of course this index doesn't need to be changed.

212
00:13:46,514 --> 00:13:48,786
But in my case, I, speed it up.

213
00:13:49,226 --> 00:13:54,986
My query, but then we deployed it and then I suddenly see that hot updates.

214
00:13:55,286 --> 00:14:00,596
I was checking picture, start user tables, gathering
statistics, showing that we, we good, we're good.

215
00:14:00,596 --> 00:14:02,306
We have so many hot updates.

216
00:14:02,546 --> 00:14:07,330
And then suddenly, After release, it dropped like to zero basically.

217
00:14:07,630 --> 00:14:09,190
And I, I, what's happening?

218
00:14:09,220 --> 00:14:14,473
And then I realized, so I  explained this
case in,   in, um, my article actually.

219
00:14:14,683 --> 00:14:18,160
This is quite Bright case, which says that.

220
00:14:18,640 --> 00:14:25,779
Our changes should be tested in at, at various angles,
not only for our workload, but questions should be,

221
00:14:26,169 --> 00:14:29,979
will other parts of our work workload be affected?

222
00:14:29,979 --> 00:14:35,324
And I didn't see so far comprehensive testing,
like the testing, whole workload every time.

223
00:14:35,324 --> 00:14:38,311
It's quite, quite big and expensive task.

224
00:14:38,951 --> 00:14:44,115
I like still, there are some ideas and some, in some cases we achieved.

225
00:14:44,115 --> 00:14:53,632
I mean, I've built some systems which are close or it's easier, like you would
press button and have this kind of testing, but it's not topic of course.

226
00:14:53,737 --> 00:15:00,427
Michael: So in terms of what developers can do, I guess being aware
of what their current state is is a really good starting point.

227
00:15:00,427 --> 00:15:06,726
So checking that view for highly updated tables and
just getting a sense of what's the current proportion

228
00:15:06,731 --> 00:15:10,446
of updates that are hot is a, is a great starting point.

229
00:15:10,506 --> 00:15:13,131
If it's zero you might have room for a lot of improvement.

230
00:15:13,131 --> 00:15:18,891
If it's a hundred, you might need to be very careful, but anything
in between, there might be room, I guess, for a bit of optimization.

231
00:15:19,511 --> 00:15:20,391
Nikolay: Right, right.

232
00:15:20,421 --> 00:15:26,436
Having good monitoring, which absorbs all
tables and uh, psta user tables and absorbs.

233
00:15:26,436 --> 00:15:33,119
It, it, it helps and, but also of course,
having a regular update doesn't hurt.

234
00:15:33,629 --> 00:15:41,239
Originally, like it's not, it's okay to have it right, but if
you expect that this table right, performance matters a lot.

235
00:15:41,659 --> 00:15:45,409
And also data volumes are huge and data grows.

236
00:15:45,409 --> 00:15:48,379
So vacuum pressure on vacuum will be high.

237
00:15:48,575 --> 00:15:54,215
A lot of work to, to delete this dead and both in he and all indexes, right?

238
00:15:54,425 --> 00:16:00,335
Of course it makes sense to try to achieve
higher ratio for hot updates, hot updates.

239
00:16:00,635 --> 00:16:00,995
Michael: Yeah.

240
00:16:01,595 --> 00:16:01,835
Yeah.

241
00:16:01,835 --> 00:16:06,725
I guess the bigger your workloads, the more this percent.

242
00:16:07,145 --> 00:16:08,495
Improvement will help

243
00:16:08,585 --> 00:16:17,026
Nikolay: Question to you actually as an expert in explaining plants
and, and and so on do you think having auto explain running with

244
00:16:17,026 --> 00:16:22,486
buffers options enabled for slow queries can help to identify issues?

245
00:16:22,516 --> 00:16:23,836
For example, we have update.

246
00:16:23,992 --> 00:16:26,122
Which became, which was slow.

247
00:16:26,272 --> 00:16:31,049
We see it on auto explain log, and we see a lot of buffers involved.

248
00:16:31,259 --> 00:16:34,259
We know that our update just one row, but why so many buffers?

249
00:16:34,319 --> 00:16:37,109
Meaning that maybe many indexes we are involved.

250
00:16:37,349 --> 00:16:40,621
Of course, maybe this update includes select.

251
00:16:40,671 --> 00:16:41,481
Subquery.

252
00:16:41,964 --> 00:16:45,664
So, so that's why we try to find which we want update.

253
00:16:45,664 --> 00:16:51,799
This is another reason for, But this would, no,
this would mean that buffers are, are in red.

254
00:16:52,029 --> 00:16:54,124
I mean, Hit and, and red.

255
00:16:54,124 --> 00:16:54,364
Right?

256
00:16:54,574 --> 00:17:00,499
But if we see a lot of buffers directed and
written also might be related to hi bit updating.

257
00:17:00,499 --> 00:17:01,879
Hit bits a different story.

258
00:17:01,884 --> 00:17:08,809
But do you think it's a sign like we, we see big numbers,
so probably we should try to understand what's happening.

259
00:17:08,809 --> 00:17:09,799
We've had update ratio.

260
00:17:10,099 --> 00:17:11,029
Michael: Yeah, exactly.

261
00:17:11,029 --> 00:17:17,509
I think that's the kind of thing where you can spot that
there is an issue, but you, you can't necessarily tell what

262
00:17:17,509 --> 00:17:20,689
the issue is and it, this would be a very big sign of it.

263
00:17:20,719 --> 00:17:28,529
Like even in select queries, we, we now to try and point out
when we think there's a disproportionate number of blocks, Even

264
00:17:28,589 --> 00:17:32,569
shared blocks hit or red versus the amount of data being returned.

265
00:17:32,779 --> 00:17:36,829
Cuz that's normally a sign of bloke, but
it can be a sign of other issues as well.

266
00:17:36,829 --> 00:17:42,128
So it's, it's one of those ones where a high number
of blocks might mean you've got a low hot update

267
00:17:42,128 --> 00:17:44,258
proportion, but it could be a sign of other things.

268
00:17:44,258 --> 00:17:45,432
So, yeah definitely.

269
00:17:45,612 --> 00:17:51,852
Yeah, it definitely feels like the kind of thing where you could
point out there's a problem and, and suggest some things to look into

270
00:17:52,002 --> 00:17:52,332
.
Nikolay: All right.

271
00:17:52,452 --> 00:17:52,632
Right.

272
00:17:52,722 --> 00:17:55,302
So checking pta, user tables.

273
00:17:55,332 --> 00:17:56,862
This is number one thing here.

274
00:17:57,162 --> 00:17:57,792
Okay, good.

275
00:17:58,122 --> 00:18:03,425
And um,  if we, for example, achieve high
number, we should like keep watching, right?

276
00:18:03,578 --> 00:18:03,968
That's it.

277
00:18:04,268 --> 00:18:04,538
Michael: Yeah.

278
00:18:04,543 --> 00:18:05,768
Monitoring again, isn't it?

279
00:18:05,864 --> 00:18:11,654
In terms of what people can expect in terms of speedups
or in terms of like reduction in some of these numbers?

280
00:18:11,954 --> 00:18:15,134
I've, I think the AGM blog post I read was the best,

281
00:18:15,479 --> 00:18:15,501
Nikolay: Hm.

282
00:18:15,651 --> 00:18:17,001
Michael: like real case I saw.

283
00:18:17,001 --> 00:18:23,183
So they, well, they had a serious amount of right ahead log
generation, but they reduced it by about about 20% or so.

284
00:18:23,453 --> 00:18:26,603
That feels like an interesting, nu like interesting number to me.

285
00:18:27,143 --> 00:18:32,393
So I don't think people are necessarily, I don't think people
should expect, you know, 10 x improvements in some of these things.

286
00:18:32,828 --> 00:18:34,598
Nikolay: This is a very good point.

287
00:18:34,598 --> 00:18:42,459
So it's not only about timing of your update, but also you
generate a lot of walls data and means that you put additional

288
00:18:42,464 --> 00:18:45,609
pressure to your back up system and also to replication system.

289
00:18:45,609 --> 00:18:52,059
So more, more bites should be trans transferred
to standby notes, both physical and logical.

290
00:18:52,059 --> 00:19:02,697
So it's like a lot of work and having pogs 13 or later helps here
because in producer statements, And in explain analyze, you will see wall

291
00:19:02,810 --> 00:19:08,512
metrics, how many bites and how many full page shorts happened and so on.

292
00:19:08,602 --> 00:19:09,772
This is great, great point.

293
00:19:10,072 --> 00:19:20,019
Also, actually, you mentioned this earlier, having pos 13 or 14
or maybe 15 helps as well because B3 will be more compact, right.

294
00:19:20,019 --> 00:19:20,859
In this case.

295
00:19:20,938 --> 00:19:23,308
How, how is it related to hot updates?

296
00:19:23,458 --> 00:19:26,305
We have Even updates became not hot.

297
00:19:26,305 --> 00:19:31,495
We, we have smaller penalty in terms of maintaining in Index State, right?

298
00:19:31,990 --> 00:19:37,090
Michael: There's a, there's a good video by Lucas Fiddle
on this that came out around about that time that I'll

299
00:19:37,195 --> 00:19:41,668
Nikolay: ah, you, it's about let's delete
from, it was originally per Kona block post.

300
00:19:41,668 --> 00:19:47,038
I guess let's, right, let's, let's delete
from the end or something like this, right?

301
00:19:47,068 --> 00:19:47,398
Or.

302
00:19:47,442 --> 00:19:51,329
Michael: It was something about, I think the, the
video might have been titled Bottom Up Index D.

303
00:19:51,329 --> 00:19:51,869
Exactly.

304
00:19:52,064 --> 00:19:56,934
Nikolay: Right, because if you start from the
end we will have empty pages or something.

305
00:19:57,114 --> 00:19:57,984
I may be wrong.

306
00:19:57,984 --> 00:19:58,734
Let's not guess.

307
00:19:58,854 --> 00:20:04,723
Let's just provide some link and, and, and read about it later again, right?

308
00:20:05,053 --> 00:20:05,713
Yeah, yeah.

309
00:20:05,713 --> 00:20:11,838
Actually, you, you mentioned Lucas um,  fi, um,
YouTube channel, Pigeon generalized YouTube channel.

310
00:20:12,138 --> 00:20:15,034
I, I think it's underestimated because.

311
00:20:15,184 --> 00:20:18,274
I, I would warn our audience because it's too deep.

312
00:20:18,574 --> 00:20:20,704
It's like, you, you, yeah.

313
00:20:20,704 --> 00:20:29,536
You should go there if you already, not a new user of pogs, but once
you already know basics, it's super cool channel because it's very deep.

314
00:20:29,536 --> 00:20:30,586
It's very interesting.

315
00:20:30,586 --> 00:20:34,154
I, I learn a lot every time I open it, so it's good.

316
00:20:34,304 --> 00:20:34,514
Michael: Yeah.

317
00:20:34,514 --> 00:20:36,614
And short videos as well, like five minutes.

318
00:20:37,184 --> 00:20:37,514
Nikolay: Right.

319
00:20:37,544 --> 00:20:40,375
Reverting us to, to the very beginning.

320
00:20:40,405 --> 00:20:49,535
I, I should point to the fact that if you want to learn about
s you previously I always said, said you cannot find it in

321
00:20:49,535 --> 00:20:52,206
documentation because the convention didn't have anything about it.

322
00:20:52,385 --> 00:20:56,075
Now it has since POS 15 or 1415.

323
00:20:56,075 --> 00:20:56,345
Right.

324
00:20:56,795 --> 00:20:58,025
But it has only a small.

325
00:20:58,325 --> 00:20:59,575
It'll not be enough for you.

326
00:20:59,665 --> 00:21:07,105
Like, I mean, even if you just starting, if you want to
understand how updates are working, I encourage everyone to

327
00:21:07,105 --> 00:21:11,838
open source code again and find rhythm related to hot updates.

328
00:21:12,198 --> 00:21:16,758
Much more interesting information there,
but of course, maybe too deep as well.

329
00:21:16,935 --> 00:21:17,775
Michael: Yeah, very in depth.

330
00:21:17,775 --> 00:21:20,415
I was gonna say, but there is, it's quite beginner friendly.

331
00:21:20,415 --> 00:21:26,595
Like I, I looked at it earlier today and there's a glossary
even of, of kind of definitions even within the read.

332
00:21:27,195 --> 00:21:28,185
Yeah, it's great.

333
00:21:28,795 --> 00:21:31,612
So yes, I'll include links to both of those.

334
00:21:31,612 --> 00:21:36,380
There's also one final kind of place that
people could look into if they want, if you.

335
00:21:37,292 --> 00:21:39,032
Diagrams or you're quite a visual learner.

336
00:21:39,392 --> 00:21:49,167
There's a good part of the guide by, we've mentioned it before,
by Hu Nobu Suzuki that exactly the internals book, it has the

337
00:21:49,167 --> 00:21:53,627
online version of I can link to with diagrams of how this works.

338
00:21:53,767 --> 00:21:54,257
Nikolay: Yeah.

339
00:21:54,257 --> 00:22:00,164
And  Ros book also has this topic card also internal worth mentioning.

340
00:22:00,524 --> 00:22:01,394
Well, okay, good.

341
00:22:01,429 --> 00:22:01,693
And.

342
00:22:01,870 --> 00:22:04,270
Finally like related to hot updates.

343
00:22:04,270 --> 00:22:08,230
I have a feeling that a lot of things can be improved here.

344
00:22:08,320 --> 00:22:08,890
Definitely.

345
00:22:08,890 --> 00:22:13,150
For example, there was this discussion of the idea of partial hot updates.

346
00:22:13,150 --> 00:22:24,355
So let's have some indexes are updated in regular weight, some in partial, if
it's possible, but as I know, it was not like, Prototype was not get finished.

347
00:22:24,775 --> 00:22:28,075
Maybe in future we, we will see some improvements in this area.

348
00:22:28,075 --> 00:22:35,875
I have strong feeling that can be much, much better in
the future, but in general, definitely if you are thinking

349
00:22:35,875 --> 00:22:39,985
about this area, try to reach at least 13 or maybe 14.

350
00:22:40,195 --> 00:22:44,365
In terms of pogs major version, you will feel much better, right?

351
00:22:44,365 --> 00:22:48,385
Because in terms of visibility, in terms
of how between indexes, behave and so on.

352
00:22:49,045 --> 00:22:49,525
Okay.

353
00:22:49,647 --> 00:22:51,327
Anything else related to heart?

354
00:22:52,275 --> 00:22:53,955
Michael: I think it's quite important to recap.

355
00:22:53,955 --> 00:22:55,855
So the, the two big.

356
00:22:56,340 --> 00:23:05,090
Things needed to achieve a hot update are one enough space on
the page and two, no index, no index columns being updated.

357
00:23:05,270 --> 00:23:09,170
And that kind of leads developers to, two things you can do about this.

358
00:23:09,170 --> 00:23:16,580
One is make sure there's more space available on each page,
so that's where fill factor tuning can be looked into.

359
00:23:17,060 --> 00:23:21,500
And then second one is make sure no, so
we've covered the kind of make sure your.

360
00:23:22,511 --> 00:23:25,931
You don't have indexes on columns you want to update if you don't need them.

361
00:23:25,931 --> 00:23:30,221
So any, if you've got any unused indexes,
it's a real benefit for cleaning those up.

362
00:23:30,663 --> 00:23:37,563
Or if there's a trade off there to consider, if you've got a very right heavy
workload, maybe you're okay with some reads being a bit slower as a result.

363
00:23:37,563 --> 00:23:38,253
So like there's that.

364
00:23:38,813 --> 00:23:43,133
That natural trade off there, but the one we haven't
talked about much is that fill factor tuning.

365
00:23:43,133 --> 00:23:48,293
And it's something I didn't know until today when I was
reading up on it, is that if you reduce the fill factor,

366
00:23:48,383 --> 00:23:51,113
so, oh wait, if you have a partition table, it's quite nice.

367
00:23:51,113 --> 00:23:53,583
You can experiment on fill factor of new partitions.

368
00:23:53,652 --> 00:23:58,363
Not in production maybe, but The other point that I
learned today though, was that you can change the fill

369
00:23:58,363 --> 00:24:01,393
factor on an existing table without rewriting it all.

370
00:24:01,398 --> 00:24:07,243
It doesn't change your past data, but you can benefit
going forwards at least, which is something that's quite

371
00:24:07,243 --> 00:24:10,133
interesting as a as something you might actually be able to do.

372
00:24:10,193 --> 00:24:16,756
You don't have to do a full table rewrite or similar
to, to start getting the benefits going forwards.

373
00:24:16,924 --> 00:24:17,314
Nikolay: Right.

374
00:24:17,314 --> 00:24:19,958
So again, fuel factor is the threshold.

375
00:24:20,393 --> 00:24:29,265
When defining when new page should be allocated and used
when new two tops are coming during inserts or updates.

376
00:24:29,295 --> 00:24:29,565
Right?

377
00:24:29,985 --> 00:24:32,655
And for table, it's handed by default.

378
00:24:32,655 --> 00:24:38,765
For indexes, it's 90 by default, meaning
for in, for indexes, we are better prepared.

379
00:24:39,265 --> 00:24:43,493
New pages use, I mean, I mean, to, to avoid them actually, right?

380
00:24:43,623 --> 00:24:47,306
To, to make them less likely to happen.

381
00:24:47,636 --> 00:24:54,441
But the question is, should we decrease fill
factor for tables from hundred to 90 or 80?

382
00:24:54,441 --> 00:24:55,101
70?

383
00:24:55,106 --> 00:24:55,971
60 50?

384
00:24:56,031 --> 00:24:57,501
Maybe I saw 50.

385
00:24:57,501 --> 00:24:58,611
Some people use 50.

386
00:24:58,631 --> 00:24:59,141
Michael: Oh wow.

387
00:24:59,691 --> 00:25:00,021
Nikolay: Yeah.

388
00:25:00,021 --> 00:25:03,231
Answer is test always right.

389
00:25:03,771 --> 00:25:04,741
, there's no magic answer.

390
00:25:04,891 --> 00:25:11,249
And it'll depend on data and especially on
the workload and also on your activity of owa.

391
00:25:11,249 --> 00:25:15,869
If you have not tuned owa, you will not benefit a lot.

392
00:25:16,289 --> 00:25:18,765
So OWA will need to clean up very quickly.

393
00:25:18,915 --> 00:25:20,145
Michael: That's such a great point.

394
00:25:20,205 --> 00:25:23,955
And things that prevent cleanup are also problematic here.

395
00:25:23,955 --> 00:25:24,855
So it.

396
00:25:25,155 --> 00:25:29,477
For example, if you have long running transactions
that prevent things being marked as dead.

397
00:25:29,597 --> 00:25:34,667
So there, there's a whole interplay in
terms of system level things to consider.

398
00:25:35,023 --> 00:25:40,483
Um, So yeah, testing and testing I think on something
that's realistic as possible is quite important here.

399
00:25:40,483 --> 00:25:48,023
So your pages need to be filled with data that's roughly realistic
in order to determine whether there'll be a space available.

400
00:25:48,173 --> 00:25:58,184
Nikolay: Or if you have a, a long transaction on a physical standby with
hot feedback on, or if you have a logical standby, which is lagging a lot.

401
00:25:58,334 --> 00:26:03,164
You will, your auto vacuum will not do your proper work.

402
00:26:03,194 --> 00:26:04,526
And this is an issue.

403
00:26:04,526 --> 00:26:08,946
And in this case, fill factor won't save you actually fill factor.

404
00:26:08,946 --> 00:26:13,506
Even make, will make things worse because more pages fill factor.

405
00:26:13,506 --> 00:26:17,916
If you reduce it you will need more space to store the same data.

406
00:26:17,946 --> 00:26:18,166
Right.

407
00:26:18,316 --> 00:26:19,666
If you said 50.

408
00:26:20,096 --> 00:26:20,416
Michael: yes,

409
00:26:20,896 --> 00:26:21,616
Nikolay: Twice more.

410
00:26:22,186 --> 00:26:23,476
Well, not well.

411
00:26:23,536 --> 00:26:24,016
Yeah, yeah.

412
00:26:24,016 --> 00:26:24,556
Initially.

413
00:26:24,616 --> 00:26:25,276
Right, right, right.

414
00:26:25,906 --> 00:26:31,186
So experiments like I, I saw a blog post from Cyber Tech, which.

415
00:26:31,486 --> 00:26:41,148
For some synthetic benchmark workload showed that if you decrease fuel factor
from a hundred to 90 roughly, you, you have improved performance improvement

416
00:26:41,148 --> 00:26:46,088
for updates also, like roughly 10%, roughly, but further, It's not.

417
00:26:46,088 --> 00:26:55,020
So if you next 10% doesn't give, you, don't give you 10%
of performance improvement only like a couple of percent.

418
00:26:55,200 --> 00:27:01,819
So maybe like the conclusion is maybe
you should have 94 tables as well, right?

419
00:27:01,999 --> 00:27:02,059
It

420
00:27:02,684 --> 00:27:03,854
Michael: Oh, you mean by default?

421
00:27:04,124 --> 00:27:05,114
Well, yeah,

422
00:27:05,509 --> 00:27:07,099
Nikolay: by defaulting your application.

423
00:27:07,099 --> 00:27:10,029
But we cannot say it's, it's rule for everyone.

424
00:27:10,359 --> 00:27:11,359
It should be test.

425
00:27:11,989 --> 00:27:16,879
Michael: If you have a pen only, like if you have an
append only workload or if you have a pen only tables,

426
00:27:16,969 --> 00:27:19,369
there's no point reducing it as far as I can tell.

427
00:27:19,369 --> 00:27:22,729
So it's only, we're only worried about update heavy

428
00:27:23,134 --> 00:27:26,794
Nikolay: If append only means no, ensures, no, no updates, no deletes,

429
00:27:26,869 --> 00:27:28,279
Michael: like Yeah, exactly.

430
00:27:28,369 --> 00:27:28,909
Yes.

431
00:27:29,006 --> 00:27:29,246
Nikolay: Mm-hmm.

432
00:27:29,486 --> 00:27:33,266
, like log, log, like table, which definitely should be partitioned.

433
00:27:33,596 --> 00:27:36,959
And only one or two indexes a maximum should be used on it.

434
00:27:37,121 --> 00:27:39,970
. Okay, Because inserts also unify indexes.

435
00:27:40,120 --> 00:27:42,340
There are no hot, hot inserts, unfortunately.

436
00:27:42,340 --> 00:27:42,610
Right.

437
00:27:42,970 --> 00:27:44,290
Index insert.

438
00:27:44,590 --> 00:27:47,140
Always modify all indexes, Unfortu.

439
00:27:47,530 --> 00:27:55,779
So if you want very good performance for insert, you
need to have fewer indexes as few as possible, right?

440
00:27:55,909 --> 00:27:56,904
Michael: Yeah, yeah, yeah.

441
00:27:56,904 --> 00:27:57,144
I

442
00:27:57,309 --> 00:27:58,359
Nikolay: Delete is the same.

443
00:27:58,364 --> 00:28:00,129
Delete also mag all indexes.

444
00:28:00,159 --> 00:28:01,419
There is no hard delete.

445
00:28:01,659 --> 00:28:04,509
I'm just like raising silly questions.

446
00:28:04,509 --> 00:28:06,279
Do we have hundred answers?

447
00:28:06,279 --> 00:28:06,639
No.

448
00:28:06,939 --> 00:28:07,299
Right?

449
00:28:07,695 --> 00:28:08,985
So delete will modify.

450
00:28:09,430 --> 00:28:11,050
Both inserts and deletes.

451
00:28:11,050 --> 00:28:19,517
Deletes, they will write to wall a lot because they need to,
implicitly, they will modify all indexes you have on table.

452
00:28:19,517 --> 00:28:27,227
So hand only case, usually, like I see people prefer not
having indexes at all, but maybe it's not a good idea as well

453
00:28:27,227 --> 00:28:31,255
because like who will take care of duplicates for example.

454
00:28:31,675 --> 00:28:31,945
Right?

455
00:28:32,162 --> 00:28:35,452
But maybe it's not, It doesn't matter a lot, but.

456
00:28:35,752 --> 00:28:36,232
Sometimes

457
00:28:36,372 --> 00:28:37,687
Michael: key feels important.

458
00:28:38,197 --> 00:28:38,677
Anyway,

459
00:28:38,977 --> 00:28:39,427
Nikolay: It's,

460
00:28:39,442 --> 00:28:40,702
Michael: think we've covered that quite well.

461
00:28:40,702 --> 00:28:42,862
I think in terms, I'm looking back over the question.

462
00:28:43,062 --> 00:28:49,872
We've covered fill factor, hot updates, kind, workloads, performance,
when it, when it's worth customizing and the age old advice

463
00:28:49,932 --> 00:28:52,152
Nikolay: how to control digital user tables.

464
00:28:52,572 --> 00:28:53,112
Michael: Yeah, exactly.

465
00:28:53,112 --> 00:28:54,042
How to monitor, How to

466
00:28:54,222 --> 00:28:55,692
Nikolay: Don't forget about buffers.

467
00:28:55,722 --> 00:28:55,812
Right.

468
00:28:56,382 --> 00:28:56,682
Michael: Yep.

469
00:28:56,832 --> 00:28:57,657
Nikolay: Buffers buffers.

470
00:28:57,963 --> 00:28:58,633
YouTube.

471
00:28:58,783 --> 00:29:00,433
Viewers should see buffers.

472
00:29:00,463 --> 00:29:00,943
Buffers.

473
00:29:00,943 --> 00:29:01,453
Buffers

474
00:29:01,843 --> 00:29:02,143
Michael: Yeah.

475
00:29:02,209 --> 00:29:08,018
Nicola has a t-shirt on for anybody listening with
explain, analyze, and then buffers in bold, which is, yeah.

476
00:29:08,108 --> 00:29:08,588
Awesome.

477
00:29:08,679 --> 00:29:09,789
Well, thanks everybody.

478
00:29:09,856 --> 00:29:11,206
keep the suggestions coming.

479
00:29:11,298 --> 00:29:15,258
We wouldn't, I don't think we'd have done this topic so
early if, if we hadn't had it requested, so that's great.

480
00:29:15,264 --> 00:29:16,271
Thank you, Nicola.

481
00:29:16,581 --> 00:29:24,865
Nikolay: Definitely I would like, I would be happy to hear about some real
life stories production stories related to field factor and hot updates.

482
00:29:24,895 --> 00:29:31,023
It's, it's always interesting because we, we have some
materials, but most of them are talking about some synthetic

483
00:29:31,173 --> 00:29:34,623
like experimental environment, not about real production.

484
00:29:35,073 --> 00:29:41,403
And the production stories are always interesting to hear
because they tend to to be reproduced in other places.

485
00:29:41,703 --> 00:29:47,145
Michael: In fact,  Adian uh, shared that they changed
their fuel factor to 85%, which is super interesting.

486
00:29:47,390 --> 00:29:47,630
Nikolay: Interesting.

487
00:29:47,630 --> 00:29:48,030
Yeah.

488
00:29:48,074 --> 00:29:52,244
Michael: Again, it'd be great to hear what numbers
other people have settled on based on their own testing.

489
00:29:52,919 --> 00:29:53,069
Nikolay: Yep.

490
00:29:53,369 --> 00:29:53,519
Good.

491
00:29:53,819 --> 00:29:54,619
Michael: Wonderful.

492
00:29:54,879 --> 00:29:57,249
Well, thanks again everybody and see you next

493
00:29:57,404 --> 00:29:59,754
Nikolay: Oh, it was episode number 20, right?

494
00:30:00,249 --> 00:30:00,849
Michael: Yeah.

495
00:30:01,059 --> 00:30:02,289
Nikolay: Yeah, . Okay.

496
00:30:02,619 --> 00:30:04,959
Almost half a year without interruptions.

497
00:30:04,959 --> 00:30:05,709
Without breaks.

498
00:30:05,829 --> 00:30:06,159
Good.

499
00:30:06,729 --> 00:30:08,949
Thank you everyone for being with us.

500
00:30:09,249 --> 00:30:09,759
Michael: Absolutely.

501
00:30:09,774 --> 00:30:10,074
Nikolay: next.

502
00:30:10,164 --> 00:30:11,214
Next till next week.

503
00:30:11,514 --> 00:30:11,934
Michael: Bye.