1
0:0:0,06 --> 0:0:2,22
Michael: Hello and welcome to Postgres.FM,
a weekly show about

2
0:0:2,22 --> 0:0:3,06
all things PostgreSQL.

3
0:0:3,24 --> 0:0:4,6
I am Michael, founder of pgMustard.

4
0:0:4,6 --> 0:0:6,92
I'm joined as always by Nik, founder of PostgresAI.

5
0:0:7,08 --> 0:0:7,7799997
Hey, Nik.

6
0:0:8,4 --> 0:0:9,0
Nikolay: Hi, Michael.

7
0:0:9,0 --> 0:0:9,78
How are you?

8
0:0:10,08 --> 0:0:10,44
Michael: Good.

9
0:0:10,44 --> 0:0:15,06
And we're delighted to be joined by Bruce Momjian, who is VP of

10
0:0:15,06 --> 0:0:19,84
Postgres Evangelist at EDB and long-serving Postgres Core Team 

11
0:0:19,84 --> 0:0:23,16
member who recently gave a new talk about the missing features 

12
0:0:23,16 --> 0:0:23,86
in Postgres.

13
0:0:23,939999 --> 0:0:26,54
So welcome Bruce, it's an honour to have you here.

14
0:0:26,68 --> 0:0:27,779999
Bruce: Good to be with you.

15
0:0:28,619999 --> 0:0:29,06
Michael: Wonderful.

16
0:0:29,06 --> 0:0:33,62
Perhaps we could get started with why this topic, why is it important 

17
0:0:33,62 --> 0:0:35,14
or why is it important to you?

18
0:0:35,46 --> 0:0:38,239998
Bruce: So I'd love to say that I thought of this topic and I'm 

19
0:0:38,239998 --> 0:0:43,0
a genius, but no, actually Melanie Plageman came to me in Riga, 

20
0:0:43,34 --> 0:0:45,08
PG Europe in the fall.

21
0:0:45,56 --> 0:0:51,96
And she said, we'd really love you to do a new talk at the

22
0:0:51,96 --> 0:0:54,44
PGConf.dev Conference in Vancouver.

23
0:0:55,14 --> 0:0:56,08
I said, okay.

24
0:0:56,5 --> 0:0:59,739998
And she said, how about things that aren't in Postgres?

25
0:0:59,82 --> 0:1:1,82
I thought, wow, that's a great idea.

26
0:1:2,32 --> 0:1:5,58
So then I guess I was showing these slides to Robert Haas once 

27
0:1:5,58 --> 0:1:6,34
I finished them.

28
0:1:6,34 --> 0:1:8,56
And he's like, wow, Melanie got a mention.

29
0:1:8,56 --> 0:1:13,16
I'm like, yeah, I said half the job is figuring out what topic 

30
0:1:13,2 --> 0:1:14,62
to do for a talk.

31
0:1:15,18 --> 0:1:17,02
So he said, I'll have to think about that now.

32
0:1:17,02 --> 0:1:19,76
I can get a mention just by thinking of a topic.

33
0:1:19,76 --> 0:1:21,4
I don't even have to do the talk.

34
0:1:22,92 --> 0:1:24,86
So yeah, I thought it was a cool idea.

35
0:1:24,86 --> 0:1:26,42
I never would have thought of it.

36
0:1:26,42 --> 0:1:29,82
There's a couple of my talks where you'll see in the introduction

37
0:1:30,02 --> 0:1:36,5
a credit for the title concept, because again, I have about 67

38
0:1:36,56 --> 0:1:41,76
talks now and it's hard to think of

39
0:1:41,76 --> 0:1:43,4
topics.

40
0:1:44,06 --> 0:1:49,84
I've been really lucky because I did 2 new ones in the fall of 

41
0:1:49,86 --> 0:1:50,72
last year.

42
0:1:51,18 --> 0:1:55,94
One is this talk, another is The Wonderful World of WAL, which

43
0:1:55,94 --> 0:1:59,64
is about the write-ahead log, and then I have a third one which

44
0:1:59,64 --> 0:2:0,8
is more of an AI talk.

45
0:2:0,8 --> 0:2:2,34
I already have 2 AI talks.

46
0:2:2,36 --> 0:2:5,56
This is another one, called Building an MCP Server Using Postgres.

47
0:2:6,18 --> 0:2:9,06
So I got 3 new talks queued up for 2026.

48
0:2:9,34 --> 0:2:11,64
So it's unusual for me.

49
0:2:11,78 --> 0:2:16,46
I write my talks and basically give them around for maybe a year

50
0:2:16,5 --> 0:2:19,0
and then come up with a new one because I've been doing it for

51
0:2:19,0 --> 0:2:19,6
30 years.

52
0:2:19,6 --> 0:2:23,24
So on average, I'm making 2 talks
a year because I have 67, but

53
0:2:23,24 --> 0:2:24,28
this year I get 3.

54
0:2:24,28 --> 0:2:26,0
So I was like, wow, this is great.

55
0:2:26,0 --> 0:2:28,78
And Melanie, I have to give her
credit for the topic on this

56
0:2:28,78 --> 0:2:29,28
one.

57
0:2:30,3 --> 0:2:31,5
Michael: That's awesome to hear.

58
0:2:31,5 --> 0:2:33,98
Friend of the show Melanie, she's
been a guest before.

59
0:2:34,12 --> 0:2:35,14
That's good to hear.

60
0:2:35,14 --> 0:2:36,3
Nikolay: Yeah, I have a question.

61
0:2:36,36 --> 0:2:40,76
If my memory doesn't play tricks
on me, 20 years ago I remember

62
0:2:40,76 --> 0:2:42,58
you maintained the to-do list.

63
0:2:43,08 --> 0:2:43,78
Bruce: That's right.

64
0:2:44,02 --> 0:2:45,56
Nikolay: Isn't this a to-do list?

65
0:2:46,22 --> 0:2:47,56
Bruce: Is this a to-do list?

66
0:2:47,56 --> 0:2:48,74
I would say no.

67
0:2:48,74 --> 0:2:54,22
The to-do list is basically driven
by what people have asked

68
0:2:54,22 --> 0:2:54,6
for.

69
0:2:54,6 --> 0:2:56,82
It could be a small thing, it could
be a big thing.

70
0:2:57,18 --> 0:3:1,9
This is more of the, I would say,
the big missing features.

71
0:3:2,18 --> 0:3:5,04
And a lot of them are ones that
we had no intention of doing.

72
0:3:5,68 --> 0:3:8,3
So it's sort of like, let's step
back.

73
0:3:9,64 --> 0:3:11,48
Let's look at what big things are
missing.

74
0:3:11,48 --> 0:3:16,22
Let's look at what big things we're
maybe working on.

75
0:3:16,22 --> 0:3:18,72
And then look at some of the big
things that we'll probably never

76
0:3:18,72 --> 0:3:19,22
do.

77
0:3:19,84 --> 0:3:20,78
Nikolay: So it's strategic.

78
0:3:22,28 --> 0:3:23,26
Bruce: Much more strategic.

79
0:3:23,6 --> 0:3:24,02
You're right.

80
0:3:24,02 --> 0:3:26,6
If I drill down to the to-do list,
we'll be here forever.

81
0:3:26,72 --> 0:3:31,4
But again, I think it is valuable
to step back and look at what

82
0:3:31,4 --> 0:3:32,72
is missing and what isn't.

83
0:3:32,72 --> 0:3:36,24
And I even learned something just
by writing the talk because

84
0:3:36,68 --> 0:3:40,2
stepping back far enough and looking
at the missing things that,

85
0:3:40,2 --> 0:3:42,7
so I think I have, what is that?

86
0:3:43,04 --> 0:3:44,02
12 items.

87
0:3:45,04 --> 0:3:49,14
And what's interesting is 11 of
the 12 items relate to performance.

88
0:3:50,2 --> 0:3:51,42
That surprised me.

89
0:3:52,2 --> 0:3:54,64
Michael: If you had to guess beforehand,
what would you have thought

90
0:3:54,64 --> 0:3:55,88
more of them would be?

91
0:3:56,04 --> 0:4:1,66
Bruce: I thought it would be basically
missing operational features

92
0:4:2,12 --> 0:4:6,36
or missing security features.

93
0:4:6,5 --> 0:4:8,56
Well, TDE is that one.

94
0:4:8,56 --> 0:4:10,28
That's one of the 12, right?

95
0:4:10,52 --> 0:4:15,76
Missing just integration, infrastructure,
some of these big missing

96
0:4:15,76 --> 0:4:16,26
things.

97
0:4:16,5 --> 0:4:20,1
But when I actually wrote them
down, I kind of like, wow.

98
0:4:20,74 --> 0:4:25,28
So 11 of the 12 are basically related
to scaling, either on a

99
0:4:25,28 --> 0:4:29,86
single host, 7 of them, or multiple
hosts, which is another area

100
0:4:29,86 --> 0:4:30,6
we need to work on.

101
0:4:30,6 --> 0:4:35,6
So you really got 2 kinds: 11
of the 12 are performance, 7 of them

102
0:4:35,6 --> 0:4:39,38
are for single host and 4 of them
are for multi-host.

103
0:4:40,52 --> 0:4:45,56
And again, I just never suspected
that it would be a performance

104
0:4:46,5 --> 0:4:48,62
need and not an operational need.

105
0:4:48,62 --> 0:4:51,34
If I had done this 10 years ago,
I know there would have been

106
0:4:51,34 --> 0:4:52,54
a whole bunch of things that...

107
0:4:52,54 --> 0:4:54,52
Nikolay: I can bring some ideas.

109
0:4:54,52 --> 0:4:57,74
Maybe I should do it in the end,
which are not performance related

110
0:4:57,74 --> 0:5:1,32
and definitely big missing pieces,
and they're not included in

111
0:5:1,32 --> 0:5:1,72
your list.

112
0:5:1,72 --> 0:5:2,44
So maybe let's

113
0:5:2,44 --> 0:5:2,78
Bruce: see.

114
0:5:2,78 --> 0:5:3,4
I have...

115
0:5:3,4 --> 0:5:7,12
What's interesting, I've given
this twice, both in January, one

116
0:5:7,12 --> 0:5:10,92
in Prague, one in Brussels, and nobody
in those

117
0:5:10,92 --> 0:5:15,26
groups really came up with
anything that was wrong.

118
0:5:16,02 --> 0:5:21,04
The only, I think, feeling I got
was some of the people were

119
0:5:21,04 --> 0:5:24,78
sort of like, well Oracle RAC has
some value in these small use

120
0:5:24,78 --> 0:5:25,18
cases.

121
0:5:25,18 --> 0:5:26,62
And I'm like, yeah.

122
0:5:26,72 --> 0:5:30,52
So there was a little bit of, you
don't have this Oracle thing

123
0:5:30,52 --> 0:5:34,5
or multi-master like, well maybe
there was some value.

124
0:5:34,74 --> 0:5:35,74
So I get it.

125
0:5:35,74 --> 0:5:38,86
Yeah, it's there, but I didn't
get anything that sort of shocked

126
0:5:38,86 --> 0:5:38,99
me.

127
0:5:38,99 --> 0:5:40,84
So I'll be interested to see if
you have anything that was sort

128
0:5:40,84 --> 0:5:41,2
of like,

129
0:5:41,2 --> 0:5:42,54
Nikolay: Sounds good, I will do
it.

130
0:5:42,54 --> 0:5:43,94
Bruce: Yeah, I want to hear this.

131
0:5:44,12 --> 0:5:45,64
Michael: Why don't we do it straight
away?

132
0:5:46,12 --> 0:5:46,62
Bruce: Sure.

133
0:5:46,78 --> 0:5:49,2
Nikolay: Right now, let's discuss
this.

134
0:5:49,2 --> 0:5:52,62
Okay, I'm looking at your list
and I don't see, for example,

135
0:5:53,86 --> 0:5:56,58
synchronous replication improvements
because right now synchronous

136
0:5:56,58 --> 0:5:58,94
replication is not all right.

137
0:5:59,24 --> 0:6:4,62
If you maybe watched Kukushkin's
talk last summer talking about

138
0:6:4,62 --> 0:6:10,16
caveats. And this is a big topic,
for example, when there are

139
0:6:10,16 --> 0:6:14,58
new people coming to Postgres ecosystem
from MySQL, for example,

140
0:6:14,58 --> 0:6:19,74
and one big group is Multigres,
from the creators of Vitess.

141
0:6:23,0 --> 0:6:25,92
Some words from them, some words
from Kukushkin, and I see synchronous

142
0:6:25,92 --> 0:6:30,48
replication is not all right because
it's based on logs. And if the

143
0:6:30,48 --> 0:6:36,14
primary has already written data, and
then it suddenly restarts, the

144
0:6:36,46 --> 0:6:39,9
primary thinks the data is written,
but the standby never received it.

145
0:6:40,24 --> 0:6:41,08
This is 1 thing.

146
0:6:41,08 --> 0:6:45,0
Another thing is that if you have
logical replication involved,

147
0:6:46,38 --> 0:6:49,74
it is a nightmare to handle
that.

148
0:6:49,74 --> 0:6:54,52
Basically, during a failover
with issues, you lose the logical

149
0:6:54,52 --> 0:6:55,02
replica.

150
0:6:55,26 --> 0:6:58,98
And looking at other systems, including
MySQL, there is a big

151
0:6:58,98 --> 0:7:0,98
potential to improve synchronous
replication.

152
0:7:1,78 --> 0:7:4,94
Basically, some people... I won't say
names, but just watch Kukushkin's

153
0:7:5,2 --> 0:7:5,7
talk.

154
0:7:5,9 --> 0:7:8,14
It's broken right now.

155
0:7:8,3 --> 0:7:11,92
Bruce: So I'm working on the PG 19
release notes.

156
0:7:11,92 --> 0:7:16,88
I do see improvements in both of
those areas in the release notes

157
0:7:16,88 --> 0:7:17,68
that I'm working on.

158
0:7:17,68 --> 0:7:20,02
They're not done yet; they won't
be done for about 2 weeks.

159
0:7:20,02 --> 0:7:22,64
But it'd be interesting if you
could come back and take a look

160
0:7:22,64 --> 0:7:24,78
and see what was done.

161
0:7:24,92 --> 0:7:28,32
So what happens
in that particular example, both

162
0:7:28,32 --> 0:7:31,64
examples, is you're now out of
the 30,000-foot view and you're now

163
0:7:31,64 --> 0:7:32,2
getting down.

164
0:7:32,2 --> 0:7:34,18
And there's totally stuff to do.

165
0:7:34,18 --> 0:7:35,82
Nikolay: Sorry, I'm not getting
down.

166
0:7:36,02 --> 0:7:38,94
When people with enterprise and
big cluster experience come,

167
0:7:38,94 --> 0:7:40,56
they say, where is synchronous
replication?

168
0:7:40,56 --> 0:7:42,22
We cannot build proper clusters.

169
0:7:43,14 --> 0:7:46,42
And what I'm telling them, most
people still use asynchronous

170
0:7:46,42 --> 0:7:48,14
replication and they cannot believe.

171
0:7:48,46 --> 0:7:49,58
They cannot believe.

172
0:7:50,28 --> 0:7:50,78
Really?

173
0:7:52,66 --> 0:7:53,12
Bruce: Yes.

174
0:7:53,12 --> 0:7:53,28
Nikolay: How can

175
0:7:53,28 --> 0:7:53,36
they say?

176
0:7:53,36 --> 0:7:56,28
Other systems have proper synchronous
replication with a proper

177
0:7:56,28 --> 0:7:57,32
consensus algorithm.

178
0:7:57,4 --> 0:7:58,58
This is another topic.

179
0:7:59,1 --> 0:8:2,96
There are many attempts to bring
a consensus algorithm or something

180
0:8:2,96 --> 0:8:4,24
inside Postgres, right?

181
0:8:4,28 --> 0:8:6,16
That's similar to the pooler question.

182
0:8:6,58 --> 0:8:10,2
There are always many ways,
and so this topic is not about

183
0:8:10,2 --> 0:8:11,78
performance, it's about HA.

184
0:8:12,66 --> 0:8:15,06
Bruce: No, what I'm
saying, what I'm saying

185
0:8:15,06 --> 0:8:18,74
is I'm trying to look at this from
a very high level to look

186
0:8:18,74 --> 0:8:20,1
at big holes.

187
0:8:20,74 --> 0:8:23,56
What you're actually looking at
is we have the feature but it

188
0:8:23,56 --> 0:8:25,46
doesn't work right or it needs
whatever.

189
0:8:25,76 --> 0:8:29,44
I get that but again if I go down
that level I'll never get done

190
0:8:29,44 --> 0:8:29,94
right.

191
0:8:30,74 --> 0:8:35,54
I'm trying to from the top look
for these big holes and actually

192
0:8:35,54 --> 0:8:38,94
to me not only look at the big
holes, but look at why are they

193
0:8:38,94 --> 0:8:40,22
holes, right?

194
0:8:40,32 --> 0:8:43,66
Because a lot of people aren't
me or you, sitting

195
0:8:43,66 --> 0:8:48,1
watching everything that's happening
25 hours a day, right?

196
0:8:48,1 --> 0:8:49,86
They're like, what's the problem?

197
0:8:49,86 --> 0:8:50,78
What's the holdup?

198
0:8:50,98 --> 0:8:55,62
And then the other thing for me
is, by having that 10,000-foot

199
0:8:55,86 --> 0:9:1,8
view, I see a pattern clearer than
I could see when I was down

200
0:9:1,8 --> 0:9:2,28
in the...

201
0:9:2,28 --> 0:9:8,1
I'm not disputing that these things are not feature complete,

202
0:9:8,1 --> 0:9:8,98
I would say.

203
0:9:9,14 --> 0:9:12,74
It's not atypical for us, particularly with partitioning and

204
0:9:12,74 --> 0:9:17,96
logical replication and failover, to take years to get to a feature

205
0:9:17,96 --> 0:9:19,68
complete point.

206
0:9:19,74 --> 0:9:22,64
And I think we aren't feature complete in a whole bunch of areas,

207
0:9:22,64 --> 0:9:25,96
but I tried to stay away from, okay, we're not feature complete

208
0:9:25,96 --> 0:9:27,68
here, we're not feature complete there.

209
0:9:27,7 --> 0:9:30,24
I'm like, what things are we not even like moving in?

210
0:9:30,24 --> 0:9:33,58
And there are a bunch of them that I would say we're just not

211
0:9:34,08 --> 0:9:35,4
engaged at all.

212
0:9:36,14 --> 0:9:37,66
Nikolay: Yeah, I get that.

213
0:9:37,66 --> 0:9:38,68
That's a good description.

214
0:9:38,68 --> 0:9:42,24
And when you're at this highest level, you see a lot of performance

215
0:9:42,26 --> 0:9:43,48
related stuff, right?

216
0:9:43,52 --> 0:9:44,2
Bruce: That's right.

217
0:9:44,2 --> 0:9:48,28
At that level, I see a lot of stuff that we're not even in the

218
0:9:48,28 --> 0:9:49,34
ballpark in.

219
0:9:49,34 --> 0:9:51,9
That we're just scratching the surface.

220
0:9:52,28 --> 0:9:55,56
I'm not saying all the things we've done so far have been

221
0:9:55,56 --> 0:9:58,26
implemented or done perfectly or are feature complete; clearly they are

222
0:9:58,26 --> 0:9:58,7
not.

223
0:9:58,7 --> 0:10:1,16
But at least I feel we're moving there.

224
0:10:1,3 --> 0:10:6,1
What I see up top is a whole class of things where we're not

225
0:10:6,1 --> 0:10:9,42
even engaging, and we may not need to engage.

226
0:10:9,44 --> 0:10:13,14
It's more of a thought experiment for people to come in and say,

227
0:10:13,14 --> 0:10:15,82
okay, what things are they not even in the ballpark on?

228
0:10:15,82 --> 0:10:18,72
And it's helpful to know, because if you look at the list it's

229
0:10:18,72 --> 0:10:20,04
not a terrible list.

230
0:10:20,32 --> 0:10:23,1
And if you think this is the biggest stuff we don't have, we

231
0:10:23,1 --> 0:10:24,44
must have a lot of stuff, right?

232
0:10:24,44 --> 0:10:27,98
Because I can't think of any other big things except these 12.

233
0:10:27,98 --> 0:10:30,82
There might be 1 or 2 that somebody will think of at some point.

234
0:10:30,92 --> 0:10:36,36
But I think for me it's a great exercise in seeing where we are

235
0:10:36,6 --> 0:10:37,74
at that big level.

236
0:10:37,74 --> 0:10:41,1
Where do we think we need to go 5 to 10 years out?

237
0:10:41,4 --> 0:10:45,12
That's usually the area, that's the distance I'm looking at,

238
0:10:45,3 --> 0:10:47,62
not necessarily what comes in 19 or 20.

239
0:10:47,66 --> 0:10:50,16
I think some of the stuff you're looking for is coming in those

240
0:10:50,16 --> 0:10:50,6
releases.

241
0:10:50,6 --> 0:10:51,52
Nikolay: That's good.

242
0:10:51,56 --> 0:10:54,72
Bruce: But what stuff do we really need to at least evaluate

243
0:10:55,02 --> 0:10:57,54
and say is this a direction we need to go in or not?

244
0:10:58,52 --> 0:11:1,32
Michael: Looking through the list it struck me that there's a

245
0:11:1,32 --> 0:11:5,72
bunch of things that you can currently do, but you need extensions

246
0:11:6,22 --> 0:11:7,88
or a separate tool.

247
0:11:7,88 --> 0:11:10,96
You can use Postgres and get those features, but you need an

248
0:11:10,96 --> 0:11:14,48
extension or another service like PgBouncer or something.

249
0:11:15,04 --> 0:11:18,54
But there's also a bunch that you literally can't do with Postgres,

250
0:11:18,84 --> 0:11:22,42
so you would stick with maybe a proprietary database or just

251
0:11:22,42 --> 0:11:25,92
a different solution completely, maybe a fork or something like

252
0:11:25,92 --> 0:11:26,42
that.

253
0:11:26,64 --> 0:11:31,16
So it felt like there's kind of 2 categories there where

254
0:11:31,68 --> 0:11:34,32
there are things you can do with Postgres, but you need something

255
0:11:34,32 --> 0:11:35,22
outside of core.

256
0:11:35,22 --> 0:11:37,48
And there are things that you can't do with it.

257
0:11:37,96 --> 0:11:40,94
So is there one of those that's more interesting than the other

258
0:11:40,94 --> 0:11:42,68
in terms of bringing into core?

259
0:11:42,98 --> 0:11:43,98
Bruce: That's funny you mentioned that.

260
0:11:43,98 --> 0:11:45,58
I really struggled with that question.

261
0:11:45,66 --> 0:11:47,5
And the audience struggled with that question.

262
0:11:47,5 --> 0:11:50,46
I remember people saying, hey, what about this?

263
0:11:50,46 --> 0:11:50,98
What about that?

264
0:11:50,98 --> 0:11:55,26
And I think the version, these slides are regularly updated.

265
0:11:55,32 --> 0:11:58,5
Like every time I give a talk, I think it's something that, I'm

266
0:11:58,5 --> 0:12:1,64
just looking at one of the capitalizations I did wrong.

267
0:12:1,64 --> 0:12:3,74
Anyway, I'll fix that after this talk.

268
0:12:3,9 --> 0:12:6,5
But my point is that you're right.

269
0:12:6,5 --> 0:12:9,14
You have, there's like concentric circles.

270
0:12:9,14 --> 0:12:15,3
So if you're willing, if you're willing to go to a proprietary

271
0:12:15,36 --> 0:12:18,22
fork of Postgres, it's a big circle, right?

272
0:12:18,54 --> 0:12:22,86
And then if you're willing to use extension or external tools,

273
0:12:22,86 --> 0:12:27,74
let's say external tools and then proprietary extensions and

274
0:12:27,74 --> 0:12:31,88
then extensions, and then the core, right?

275
0:12:31,92 --> 0:12:35,42
And in fact, maybe I should have a slide for this to illustrate

276
0:12:35,9 --> 0:12:37,7
what that span is.

277
0:12:38,12 --> 0:12:42,18
And in a lot of cases, we will say we have it.

278
0:12:42,5 --> 0:12:44,34
This is just great you asked that question.

279
0:12:44,34 --> 0:12:46,62
Nobody asked it quite the same way you did.

280
0:12:46,64 --> 0:12:51,0
So the question, some of the complaints we have are not that

281
0:12:51,0 --> 0:12:56,3
it's not available as either an extension or an external tool

282
0:12:56,4 --> 0:12:58,98
or a proprietary fork of Postgres.

283
0:12:59,18 --> 0:13:1,82
The complaint we have is it's not in core Postgres.

284
0:13:3,34 --> 0:13:8,0
Or that there's an extension, but the extension, because it's

285
0:13:8,0 --> 0:13:12,88
an extension, doesn't work exactly the same way it could possibly

286
0:13:12,88 --> 0:13:14,12
work if it was in core.

287
0:13:14,28 --> 0:13:18,96
Or if it's a third-party tool, the third-party tool is not as

288
0:13:18,96 --> 0:13:23,36
effective or easy to use as it would be if it was integrated,

289
0:13:23,6 --> 0:13:26,02
which is the connection pooling example.

290
0:13:26,04 --> 0:13:27,6
Nikolay: Or there's a choice of them.

291
0:13:27,6 --> 0:13:29,1
Like, why a choice of them?

292
0:13:29,1 --> 0:13:30,36
Bruce: Right, there's a choice of them.

293
0:13:30,36 --> 0:13:33,26
Although the choice of them I kind of like, particularly in the

294
0:13:33,26 --> 0:13:36,98
pgpool, PgBouncer realm, there is actually a reason for that.

295
0:13:37,04 --> 0:13:42,04
But my point is that some of the missing stuff, even something

296
0:13:42,04 --> 0:13:43,2
like columnar, right?

297
0:13:43,2 --> 0:13:47,92
You could say we have extensions to do that, but yeah, I guess,

298
0:13:47,92 --> 0:13:51,66
but I'm not really happy with that and I'm not necessarily wedded

299
0:13:51,66 --> 0:13:55,58
to Citus and it has limitations and because it's an extension,

300
0:13:55,64 --> 0:13:56,7
it might not be available.

301
0:13:56,94 --> 0:14:2,32
So you get all these sorts of complaints.
And it brings you to

302
0:14:2,32 --> 0:14:5,38
a bigger philosophical question
of, does everything need to be

303
0:14:5,38 --> 0:14:6,34
in core, right?

304
0:14:6,34 --> 0:14:7,58
And there's probably not.

305
0:14:7,66 --> 0:14:11,52
I think pgvector is a great example
of something external that

306
0:14:11,52 --> 0:14:14,6
can be developed on its own release
cycle.

307
0:14:14,6 --> 0:14:18,54
And it really doesn't hamper its
ability to be used effectively

308
0:14:18,6 --> 0:14:19,5
with the code.

309
0:14:19,54 --> 0:14:21,88
But when you're starting to talk
columnar, you're starting

310
0:14:21,88 --> 0:14:25,58
to talk connection pooling, where
you have problems with authentication

311
0:14:25,84 --> 0:14:27,26
going through the connection pooler.

312
0:14:27,34 --> 0:14:29,76
We've tried to improve that, but
it's still kind of yucky.

313
0:14:29,96 --> 0:14:33,12
You know, logical replication of
DDL, you know, another one's

314
0:14:33,12 --> 0:14:37,28
available as an external proprietary
product but not in the code.

315
0:14:37,58 --> 0:14:41,0
TDE is a classic that's available
in a bunch of external stuff.

316
0:14:41,0 --> 0:14:42,54
It's not available in the code.

317
0:14:42,54 --> 0:14:48,34
So sometimes when people complain
that it's missing, what

318
0:14:48,34 --> 0:14:52,0
I tried to focus on in this talk
was basically, is it available

319
0:14:52,16 --> 0:14:56,14
either in Postgres or as an extension
that has no downsides?

320
0:14:57,44 --> 0:14:59,0
That's my litmus test.

321
0:15:0,26 --> 0:15:5,04
Anything that has limitations and
is an extension, limitations

322
0:15:5,22 --> 0:15:9,52
as a third party tool, or obviously
as a proprietary fork, those

323
0:15:9,52 --> 0:15:13,96
are going to be effectively not
available in the community.

324
0:15:14,18 --> 0:15:17,2
And we have to decide, is that
a good place to be, a bad place

325
0:15:17,2 --> 0:15:17,84
to be?

326
0:15:17,84 --> 0:15:18,9
Maybe that's fine.

327
0:15:19,54 --> 0:15:23,08
Michael: I really like the concentric
circles thought and I want

328
0:15:23,08 --> 0:15:27,5
just on the kind of the last 2,
is there also a difference in

329
0:15:27,5 --> 0:15:31,4
your head between contrib extensions
and core?

330
0:15:31,42 --> 0:15:36,8
Like, for example, pg_stat_statements,
a lot of the pg_stat_* views

331
0:15:36,82 --> 0:15:40,84
are on by default when people install
Postgres, but pg_stat_statements

332
0:15:40,84 --> 0:15:41,32
isn't.

333
0:15:41,32 --> 0:15:43,86
So like that's an interesting,
I don't know if that's too semantic

334
0:15:43,86 --> 0:15:46,92
a discussion, but that feels interesting
to me: is that missing

335
0:15:46,92 --> 0:15:48,1
or not really missing?

336
0:15:48,16 --> 0:15:48,64
Bruce: You're right.

337
0:15:48,64 --> 0:15:50,42
That's another concentric circle.

338
0:15:50,44 --> 0:15:50,64
Nikolay: Yeah.

339
0:15:50,64 --> 0:15:51,14
Bruce: Right.

340
0:15:51,34 --> 0:15:55,24
So yeah, see like pgvector, maybe
that would be nice to be in

341
0:15:55,24 --> 0:15:57,38
contrib instead of being external.

342
0:15:57,66 --> 0:16:1,62
But then if we pulled it into contrib,
its release cycle becomes

343
0:16:1,62 --> 0:16:3,06
tied to the major releases.

344
0:16:3,48 --> 0:16:5,56
Maybe that's not a good idea, I
don't know.

345
0:16:5,66 --> 0:16:8,2
And when you start to talk about
the cloud vendors will support

346
0:16:8,2 --> 0:16:11,66
some contrib extensions and not
others, and you get people who

347
0:16:11,66 --> 0:16:14,86
pull me aside, hey, why don't you
move this into core, get out

348
0:16:14,86 --> 0:16:16,98
of contrib? And I'm like, what's
the matter with contrib?

349
0:16:17,22 --> 0:16:20,34
And you're like, well, my cloud
vendor doesn't support it.

350
0:16:20,74 --> 0:16:24,4
I'm like, okay, so you're complaining
to me that your cloud vendor

351
0:16:24,4 --> 0:16:26,1
doesn't support something we're already shipping.

352
0:16:26,32 --> 0:16:28,6
Is that, do you think that's gonna be like fruitful?

353
0:16:29,28 --> 0:16:31,12
And they're like, well, why isn't it?

354
0:16:31,78 --> 0:16:34,78
But there's a variety of reasons that stuff is in contrib.

355
0:16:34,82 --> 0:16:36,6
It may be an edge use case.

356
0:16:36,6 --> 0:16:42,74
It may be like pg_stat_statements, which creates its own tables and

357
0:16:42,9 --> 0:16:47,0
has its own overhead that we're not sure everyone would want.

358
0:16:47,16 --> 0:16:49,42
So that kind of makes sense out there.

359
0:16:49,82 --> 0:16:57,94
Some of them are oddball ones, like cube, an edge case, or pg_trgm.

360
0:17:0,24 --> 0:17:1,08
I don't know.

361
0:17:1,08 --> 0:17:2,22
It's hard to say.

362
0:17:2,26 --> 0:17:3,62
Is it better as an extension?

363
0:17:3,64 --> 0:17:5,46
Is it going to be better in core?

364
0:17:5,66 --> 0:17:7,9
You can make a case either way, I would say.

365
0:17:8,44 --> 0:17:8,94
Michael: Nice.

366
0:17:9,48 --> 0:17:12,5
From the list, are there any that are like particularly your

367
0:17:12,5 --> 0:17:16,78
favorites or ones that you would love to see progress on or do

368
0:17:16,78 --> 0:17:18,84
you not like to pick favourites?

369
0:17:19,84 --> 0:17:24,94
Bruce: Yeah, the 2 that I've really felt called to champion are

370
0:17:25,08 --> 0:17:28,56
cluster file encryption, which is called TDE in the industry

371
0:17:28,9 --> 0:17:29,64
and sharding.

372
0:17:29,7 --> 0:17:35,78
Those are the 2 that have the biggest impact in terms of adoption

373
0:17:36,16 --> 0:17:38,76
and expanding Postgres workloads for me.

374
0:17:39,12 --> 0:17:41,5
The other ones are interesting, but they're more...

375
0:17:42,66 --> 0:17:46,36
I don't see them as expanding the Postgres adoption universe.

376
0:17:46,5 --> 0:17:49,32
That's, I guess, the phrase I would use.

377
0:17:49,46 --> 0:17:51,86
Yeah, The other ones are good, but they're more operational.

378
0:17:51,92 --> 0:17:53,3
It would be nice to have this.

379
0:17:54,72 --> 0:17:57,48
I would be more optimized if I had this other thing.

380
0:17:58,14 --> 0:18:3,58
For me, the sharding and the TDE really take Postgres to a new

381
0:18:3,58 --> 0:18:4,08
level.

382
0:18:4,6 --> 0:18:7,62
I would say in both of those, I've pretty much either failed

383
0:18:7,86 --> 0:18:12,04
or executed poorly, because I've been working on, at

384
0:18:12,04 --> 0:18:14,88
least championing those for at least 5 years now.

385
0:18:15,26 --> 0:18:19,54
I don't feel I've made a lot of progress; I guess I've been waiting for

386
0:18:19,54 --> 0:18:23,04
a groundswell of support because both of them were very hard

387
0:18:23,04 --> 0:18:24,52
to implement as an individual.

388
0:18:25,64 --> 0:18:28,68
I got close with TDE but got stuck on something.

389
0:18:29,34 --> 0:18:32,34
And sharding, I think, it really requires a team.

390
0:18:33,26 --> 0:18:37,36
I don't think I've seen the kind of desire behind that.

391
0:18:37,36 --> 0:18:41,32
I think part of the reason for sharding is that machines keep

392
0:18:41,32 --> 0:18:47,74
getting so big that effectively you're better off just buying

393
0:18:47,74 --> 0:18:51,06
a bigger machine or renting a bigger machine than going down

394
0:18:51,06 --> 0:18:51,88
the sharding route.

395
0:18:51,88 --> 0:18:57,9
Because the sharding route really is very workload specific and

396
0:18:58,42 --> 0:19:3,42
there's an increasing dislike of workload specific things in

397
0:19:3,42 --> 0:19:4,06
an enterprise.

398
0:19:4,54 --> 0:19:9,18
There's a much more push toward just generic compute, generic

399
0:19:9,72 --> 0:19:15,9
solutions, generic, and the sort of, oh, I got... with special

400
0:19:16,22 --> 0:19:21,06
chips on the drive, which filter stuff, and oh, I got this special,

401
0:19:21,06 --> 0:19:22,86
whatever, special hardware does something.

402
0:19:23,08 --> 0:19:27,08
It just, the enterprise focus is a lot less on hardware now,

403
0:19:27,28 --> 0:19:29,88
a lot less on infrastructure, a lot more on solutions.

404
0:19:30,34 --> 0:19:33,84
That might be good, I'm not saying good or bad, but I think

405
0:19:33,84 --> 0:19:34,78
sharding is...

406
0:19:34,92 --> 0:19:40,66
As people needed sharding, systems have increased in size, really

407
0:19:40,68 --> 0:19:42,62
pretty much in lockstep with what they needed.

408
0:19:42,62 --> 0:19:46,1
So the number of people who actually need it is limited, and

409
0:19:46,1 --> 0:19:48,52
the number of people who are willing to put in the effort to

410
0:19:48,52 --> 0:19:50,58
architect it is also limited.

411
0:19:51,34 --> 0:19:55,18
Nikolay: Yeah, I agree with you that hardware available right

412
0:19:55,18 --> 0:19:56,6
now is so huge.

413
0:19:57,4 --> 0:20:1,8
Recently, I saw a cluster which had default vacuum settings,

414
0:20:2,96 --> 0:20:3,46
self-managed.

415
0:20:4,22 --> 0:20:9,18
So usually like RDS or others, they tune, half-tune it.

416
0:20:9,52 --> 0:20:13,08
But it had the default autovacuum settings, 3 workers only,

417
0:20:13,08 --> 0:20:16,82
nothing tuned, and it was 20 terabytes and somehow surviving.

418
0:20:16,86 --> 0:20:18,4
It was insane, absolutely.

419
0:20:19,16 --> 0:20:21,68
So how is it possible?
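As an aside for readers: whether a cluster is still on stock autovacuum defaults can be checked with a query like the following (these GUC names are standard Postgres; what to tune them to is workload dependent):

```sql
-- Stock defaults: 3 workers, 1min naptime, cost limit -1
-- (meaning it falls back to vacuum_cost_limit = 200).
SELECT name, setting, boot_val, source
FROM pg_settings
WHERE name IN ('autovacuum_max_workers',
               'autovacuum_naptime',
               'autovacuum_vacuum_cost_limit');
-- source = 'default' means the value was never changed.
```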

420
0:20:23,64 --> 0:20:27,16
But at the same time, RDS and CloudSQL and others, they have

421
0:20:27,16 --> 0:20:27,94
hard limit.

422
0:20:28,48 --> 0:20:29,82
I know they work on it.

423
0:20:29,82 --> 0:20:32,34
And Aurora has more, 64 terabytes.

424
0:20:32,48 --> 0:20:37,16
And these days to collect that amount of data is not rare already.

425
0:20:37,2 --> 0:20:41,76
Even 1 person surrounded by AIs can collect a lot of data.

426
0:20:42,04 --> 0:20:42,84
This is 1 thing.

427
0:20:42,84 --> 0:20:44,76
And another thing is that there are hard problems.

428
0:20:44,76 --> 0:20:48,34
For example, lightweight lock manager problem, which was solved

429
0:20:48,34 --> 0:20:49,74
in Postgres 18.

430
0:20:50,34 --> 0:20:53,68
There are other problems as well when you grow to some heights

431
0:20:53,68 --> 0:20:59,02
you encounter. Some people would not like to spend time firefighting,

432
0:20:59,44 --> 0:21:1,52
so sharding is really needed.

433
0:21:1,72 --> 0:21:3,42
But I guess, what's your take?

434
0:21:3,42 --> 0:21:7,2
For me, it's really hard to find consensus on how to achieve it.

435
0:21:7,48 --> 0:21:11,32
Also, sharding topic usually triggers topics like internal connection

436
0:21:11,32 --> 0:21:13,72
pooler, for example, because they are adjacent.

437
0:21:15,06 --> 0:21:18,82
If you talk about routing and so on, you think about also auto

438
0:21:18,82 --> 0:21:21,1
failover, you think about connection pooling.

439
0:21:21,68 --> 0:21:24,44
These topics usually come together in my head.

440
0:21:24,48 --> 0:21:26,54
But there are other ways to do sharding.

441
0:21:26,76 --> 0:21:27,74
There are many ways.

442
0:21:27,74 --> 0:21:28,88
What's your take?

443
0:21:29,16 --> 0:21:31,94
Will it ever be implemented in core or no?

444
0:21:33,28 --> 0:21:35,74
Bruce: I've always been of the opinion, I do have a sharding talk

445
0:21:35,74 --> 0:21:39,06
on my website, but I've always been of the opinion that sharding

446
0:21:39,06 --> 0:21:43,04
has to be really developed organically within Postgres, and it

447
0:21:43,04 --> 0:21:47,5
really has to be built, effectively, on partitioning, foreign data

448
0:21:47,5 --> 0:21:48,76
wrappers and parallelism.

449
0:21:49,64 --> 0:21:52,98
I don't think we're going to have the appetite to create a whole

450
0:21:53,68 --> 0:21:56,62
new architecture for sharding.

451
0:21:56,66 --> 0:21:59,54
And as every year that goes by,
it becomes clearer that, yeah,

452
0:21:59,54 --> 0:22:0,6
we just can't.

453
0:22:0,94 --> 0:22:5,58
So I think we can get closer than
we are now with little impact.

454
0:22:6,1 --> 0:22:9,48
I think read-only sharding we can do; but if we want to do

455
0:22:9,48 --> 0:22:12,62
read-write sharding, then we have
to have a global LockManager,

456
0:22:12,62 --> 0:22:14,18
global snapshot manager.

457
0:22:14,38 --> 0:22:15,9
It becomes much more complicated.

458
0:22:17,06 --> 0:22:21,6
Yeah, I keep going there and people
who think they need it effectively

459
0:22:21,72 --> 0:22:25,64
don't end up needing it or end
up getting bigger hardware or

460
0:22:25,64 --> 0:22:28,38
they re-architect what they're
doing and then they don't need

461
0:22:28,38 --> 0:22:29,0
it anymore.

462
0:22:29,24 --> 0:22:32,86
So I think that's 1 of the reasons
it hasn't moved forward.

463
0:22:32,86 --> 0:22:36,76
But I do think, for the
read-only, sort of a data warehouse

464
0:22:36,76 --> 0:22:40,68
kind of sharding, I think within
2 years we could have a pretty

465
0:22:40,68 --> 0:22:42,66
good solution in the industry.

466
0:22:42,88 --> 0:22:47,02
But I just haven't seen a lot of it.
There's 1 guy at Fujitsu who

467
0:22:47,02 --> 0:22:49,7
was working on it, but that was
about it.

468
0:22:50,74 --> 0:22:53,52
And frankly, it's been pretty dormant
for the past 2 years, and

469
0:22:53,52 --> 0:22:55,58
I haven't had time to work on it
much either.

470
0:22:55,9 --> 0:22:56,26
Nikolay: Yeah.

471
0:22:56,26 --> 0:23:1,16
If you stay on a single primary
cluster, there is a big limitation,

472
0:23:1,72 --> 0:23:5,94
well known for those who reached
some heights, where you have

473
0:23:6,04 --> 0:23:8,32
200, 300 maybe megabytes per second.

474
0:23:8,32 --> 0:23:11,74
We talked about this a lot with
folks who develop sharding systems.

475
0:23:13,66 --> 0:23:17,86
You hit the limits of a single-threaded
process of physical replication,

476
0:23:17,86 --> 0:23:21,1
the WAL receiver,
basically.

477
0:23:21,1 --> 0:23:24,1399
Not the WAL receiver, the replay process,
right?

478
0:23:24,1399 --> 0:23:29,24
In top, it shows up as startup,
which basically replays

479
0:23:29,24 --> 0:23:32,66
the changes from the primary and
it's a single threaded process.
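The single-threaded replay lag Nikolay describes can be observed on the primary with the standard pg_stat_replication view (a sketch; the replay_lag column is available since Postgres 10):

```sql
-- Per-standby replication lag: bytes of WAL not yet replayed,
-- plus the time-based replay_lag column.
SELECT application_name,
       pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS replay_lag_bytes,
       replay_lag
FROM pg_stat_replication;
```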

480
0:23:33,26 --> 0:23:37,76
And I know also somebody from Japan
working on it since 2013,

481
0:23:38,1 --> 0:23:39,76
I think, on and off.

482
0:23:39,76 --> 0:23:42,08
I saw some conference talks and
so on.

483
0:23:42,6 --> 0:23:44,44
This problem is not included in
your list.

484
0:23:44,44 --> 0:23:46,88
It's not a small problem if you
reach those heights.

485
0:23:46,88 --> 0:23:48,58
What do you think about that problem?

486
0:23:49,74 --> 0:23:53,14
Bruce: Yeah, so as I remember,
that was the 1 where they're trying

487
0:23:53,14 --> 0:23:59,08
to create a dependency graph from
the WAL, and therefore identify

488
0:23:59,32 --> 0:24:3,96
which parts of the WAL are parallelizable,
and send those off

489
0:24:3,96 --> 0:24:4,62
to workers.

490
0:24:4,84 --> 0:24:7,38
CPUs do that now with CPU instructions.

491
0:24:7,54 --> 0:24:11,94
They figure out which parts can
be run on different cores or

492
0:24:11,94 --> 0:24:13,12
in parallel cores.

493
0:24:13,14 --> 0:24:13,4
Yeah, you're right.

494
0:24:13,4 --> 0:24:14,02
But we don't

495
0:24:14,02 --> 0:24:15,84
Nikolay: have threading, so, yeah.

496
0:24:16,1 --> 0:24:17,02
Bruce: I don't know.

497
0:24:17,28 --> 0:24:19,12
I don't know if threading is...

498
0:24:19,3 --> 0:24:20,79
Threading's on the list that I
have.

499
0:24:20,79 --> 0:24:21,22
I don't

500
0:24:21,22 --> 0:24:21,84
Nikolay: know if

501
0:24:21,9 --> 0:24:25,44
Bruce: threading I'm not sure threading is a requirement for

502
0:24:25,58 --> 0:24:29,54
that, because we do have parallelism without threading now.

503
0:24:29,54 --> 0:24:30,9
It seems to work fine.

504
0:24:31,06 --> 0:24:33,8
Because you load them into a shared memory queue and the process

505
0:24:33,8 --> 0:24:35,04
just pulls out of there.

506
0:24:35,08 --> 0:24:36,14
I don't know.

507
0:24:36,14 --> 0:24:36,9
I think you're right.

508
0:24:36,9 --> 0:24:41,66
I think that would be getting a parallelizable replay of logical

509
0:24:41,68 --> 0:24:42,18
replication.

510
0:24:43,84 --> 0:24:47,46
Physical doesn't seem to matter too much because it's so fast

511
0:24:47,52 --> 0:24:48,02
that...

512
0:24:48,12 --> 0:24:49,68
Nikolay: Hold on 1 second.

513
0:24:49,68 --> 0:24:52,26
Logical, I think, already... there is work, maybe in Postgres 18

514
0:24:52,26 --> 0:24:54,94
or something, I remember something for logical to parallelize

515
0:24:54,96 --> 0:24:55,28
it.

516
0:24:55,28 --> 0:24:56,64
Bruce: To parallelize it, okay.

517
0:24:56,76 --> 0:24:57,66
Nikolay: I think so.

518
0:24:57,66 --> 0:25:0,66
Technically you can parallelize it with multiple publication

519
0:25:0,66 --> 0:25:4,92
subscription pairs, multiple slots, but it will be having problems

520
0:25:4,92 --> 0:25:5,94
with foreign keys.

521
0:25:6,14 --> 0:25:9,56
It will be eventually consistent in terms of referential integrity.
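A minimal sketch of the multiple publication/subscription pairs Nikolay mentions (table and connection names here are hypothetical). Each subscription gets its own apply worker, so replay is parallelized, but ordering across the two slots is lost, which is exactly the referential-integrity caveat above:

```sql
-- On the publisher: split the tables across two publications.
CREATE PUBLICATION pub_a FOR TABLE orders, order_items;
CREATE PUBLICATION pub_b FOR TABLE events, event_logs;

-- On the subscriber: one subscription (and one apply worker) each.
CREATE SUBSCRIPTION sub_a
  CONNECTION 'host=primary dbname=app' PUBLICATION pub_a;
CREATE SUBSCRIPTION sub_b
  CONNECTION 'host=primary dbname=app' PUBLICATION pub_b;
```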

522
0:25:10,84 --> 0:25:13,88
I'm talking purely about physical replication.

523
0:25:14,2 --> 0:25:16,04
Bruce: Physical replication, okay.

524
0:25:18,32 --> 0:25:22,36
Nikolay: And this is a limit some companies hit, and it's super

525
0:25:22,36 --> 0:25:24,38
painful because there is no escape from it.

526
0:25:24,38 --> 0:25:27,48
You need to do either sharding or vertical split.

527
0:25:28,32 --> 0:25:31,22
You need to reach some level, like a lot of writes basically.

528
0:25:32,22 --> 0:25:36,1
Bruce: I think the reason we don't hear about it a lot in the

529
0:25:36,1 --> 0:25:39,44
community level, and this might be different from the level that

530
0:25:39,44 --> 0:25:44,36
you're working at, is we're definitely a general purpose database.

531
0:25:45,56 --> 0:25:49,46
And not that we can't go to the heights and we keep pushing the

532
0:25:49,46 --> 0:25:58,18
ceiling up higher and higher, but there is a limit to how much

533
0:25:58,18 --> 0:26:1,62
complexity we're willing to add, and potentially unreliability

534
0:26:3,22 --> 0:26:6,22
to Postgres to get up to the super heights.

535
0:26:6,42 --> 0:26:9,72
It's possible that's why... I know you specialize in that area.

536
0:26:9,96 --> 0:26:12,84
It's possible that's why we don't hear a lot about it.

537
0:26:13,66 --> 0:26:18,04
That didn't get on my list as something that I've heard a lot

538
0:26:18,04 --> 0:26:18,54
about.

539
0:26:18,68 --> 0:26:21,88
I don't see many email threads really addressing that.

540
0:26:21,88 --> 0:26:22,7
Nikolay: This is so.

541
0:26:23,08 --> 0:26:26,32
But I know, for example, Instacart hit it years ago, during

542
0:26:26,32 --> 0:26:28,76
COVID actually, and there are others who hit it.

543
0:26:28,8 --> 0:26:31,5
Sometimes people hit it and don't notice because, okay, some

544
0:26:31,5 --> 0:26:34,64
replicas are lagging a little bit and so on.

545
0:26:34,64 --> 0:26:38,0
Yeah, but I understand it's not a super common problem.

546
0:26:38,0 --> 0:26:38,5
Yeah.

547
0:26:38,8 --> 0:26:43,22
Bruce: We had a, EDB had a customer who hit it on specialized

548
0:26:43,36 --> 0:26:46,62
storage hardware and we identified that it was the storage hardware

549
0:26:47,24 --> 0:26:48,54
that was causing it.

550
0:26:48,54 --> 0:26:51,74
So that was, yeah, that was kind of a get new hardware, get new

551
0:26:51,74 --> 0:26:53,5
storage hardware, your problem will go away.

552
0:26:53,5 --> 0:26:55,96
So that was, I think the answer to that 1.

553
0:26:56,28 --> 0:26:56,68
Nikolay: Okay,

554
0:26:56,68 --> 0:26:57,18
Bruce: yeah.

555
0:26:58,38 --> 0:27:1,46
That's the only case I've heard of replay complaints and lagging

556
0:27:2,28 --> 0:27:3,34
in my work.

557
0:27:3,34 --> 0:27:3,84
Yeah.

558
0:27:3,84 --> 0:27:6,4
Nikolay: But what's your take on the threading topic?

559
0:27:6,42 --> 0:27:10,6
Because there was a big impulse a couple of years ago from Heikki

560
0:27:10,6 --> 0:27:11,78
originally, I think.

561
0:27:13,48 --> 0:27:13,98
Bruce: Yeah.

562
0:27:13,98 --> 0:27:17,72
I have a blog entry about it, kind of from 2018, which

563
0:27:17,72 --> 0:27:23,76
references... I think it might even reference Heikki's,

564
0:27:23,8 --> 0:27:25,7
let me see if it references Heikki's email.

565
0:27:26,64 --> 0:27:27,76
No, it doesn't.

566
0:27:28,62 --> 0:27:31,0
I think we're still on the fence on that 1.

567
0:27:31,0 --> 0:27:35,14
We keep chewing away at some of the small things like getting

568
0:27:35,14 --> 0:27:37,04
rid of global variables, I think was 1.

569
0:27:37,04 --> 0:27:39,9
And I think the other thing that's going to bail us out here

570
0:27:40,0 --> 0:27:46,3
is that we now have, it appears, compiler

571
0:27:46,64 --> 0:27:47,86
support for this.

572
0:27:48,76 --> 0:27:53,8
So instead of having to re-architect all our code, it looks like

573
0:27:53,8 --> 0:27:56,76
there's some, if we can get rid of the global variables, which

574
0:27:56,76 --> 0:28:3,58
is pretty simple, it looks like the compiler will sort of thread-

575
0:28:3,58 --> 0:28:7,36
local a lot of stuff for us to make it easier without us having

576
0:28:7,36 --> 0:28:10,76
to like really re-architect a lot of the code.

577
0:28:10,76 --> 0:28:12,54
So I'm hoping in that area.

578
0:28:12,84 --> 0:28:17,44
But my understanding is that the threading prototypes don't seem

579
0:28:17,44 --> 0:28:18,76
to gain a whole lot.

580
0:28:18,84 --> 0:28:23,82
They don't, at least from what we found, because it reduces task

581
0:28:23,82 --> 0:28:24,64
switching time.

582
0:28:24,64 --> 0:28:31,2
But again, for example, if you're doing a GUI, you really need

583
0:28:31,2 --> 0:28:33,48
threading because you're updating a whole bunch of stuff on the

584
0:28:33,48 --> 0:28:36,34
screen at once and you want all these background jobs to

585
0:28:36,34 --> 0:28:39,84
be running in your address space and so forth to update a whole

586
0:28:39,84 --> 0:28:42,94
bunch of very light processes and light statuses.

587
0:28:43,34 --> 0:28:46,12
But when you start to talk about a database, there really isn't

588
0:28:46,12 --> 0:28:47,64
a whole lot of light out there.

589
0:28:47,64 --> 0:28:50,24
Yeah, I guess we could do index lookups with threads.

590
0:28:50,6 --> 0:28:51,6
That would be cool.

591
0:28:52,12 --> 0:28:53,54
Nikolay: Visibility bits maybe?

592
0:28:53,72 --> 0:28:54,52
Bruce: Visibility, yeah.

593
0:28:54,52 --> 0:28:57,5
But the problem is there's a lot of the stuff is really heavy.

594
0:28:57,5 --> 0:29:2,32
For example, when you're trying to parallelize a query, those are

595
0:29:2,32 --> 0:29:3,12
not light, right?

596
0:29:3,12 --> 0:29:6,22
So you might as well just get a process and run it and have its

597
0:29:6,22 --> 0:29:7,32
own everything.

598
0:29:7,36 --> 0:29:10,22
So I think that's what has slowed it down.

599
0:29:10,52 --> 0:29:12,24
Another problem obviously is resiliency.

600
0:29:12,9 --> 0:29:17,14
Right now if a session crashes or runs out of memory or whatever,

601
0:29:17,16 --> 0:29:20,28
we keep running just fine, unless
it's in a critical section.

602
0:29:21,04 --> 0:29:25,94
Whereas with threading, it would
make us less resilient to sessions

603
0:29:25,94 --> 0:29:28,86
that misbehave, and we have to
balance those 2 off.

604
0:29:28,86 --> 0:29:31,72
We know the value of threading,
and then the less resiliency

605
0:29:31,78 --> 0:29:32,08
of it.

606
0:29:32,08 --> 0:29:33,5
I don't really know the answer.

607
0:29:33,52 --> 0:29:38,24
I've been surprised we haven't
done more in this area, but it

608
0:29:38,24 --> 0:29:42,54
appears to not be a huge problem
if I'm not hearing about it

609
0:29:42,54 --> 0:29:43,26
every couple months.

610
0:29:43,26 --> 0:29:43,78
I certainly

611
0:29:43,78 --> 0:29:43,82
Nikolay: am not.

612
0:29:43,82 --> 0:29:48,68
Let's talk about huge problems
related to 4 byte transaction

613
0:29:48,74 --> 0:29:54,44
IDs, which is included in your
list: 8-byte transaction IDs.

614
0:29:54,64 --> 0:29:56,68
And some folks have them for years
already.

615
0:29:56,68 --> 0:29:59,38
And also there is, I saw some preparation
work and so on.

616
0:29:59,38 --> 0:30:4,16
I'm very curious What are your
forecasts and take on this?

617
0:30:4,16 --> 0:30:8,96
But also, which is not included
in your list, is undo log and

618
0:30:8,96 --> 0:30:12,56
redesigning MVCC and these topics
which were quite popular a

619
0:30:12,56 --> 0:30:13,54
few years ago.

620
0:30:13,7 --> 0:30:16,36
There are some efforts, but now
like silence.

621
0:30:16,56 --> 0:30:18,2
Maybe I don't see it, but...

622
0:30:18,2 --> 0:30:20,72
Bruce: No, I think you're right
on the MVCC.

623
0:30:20,82 --> 0:30:24,18
I mean, we had the Zheap effort
by Robert Haas.

624
0:30:24,64 --> 0:30:26,4
That was probably 10 years ago.

625
0:30:26,82 --> 0:30:31,1
And he really poured into it, but
I think the job just got too

626
0:30:31,12 --> 0:30:31,62
large.

627
0:30:32,22 --> 0:30:37,24
And I think you also, once you
start to look at what undo requires,

628
0:30:37,88 --> 0:30:39,1
things get very complicated.

629
0:30:39,52 --> 0:30:43,68
So I'm not saying it's not doable,
but you end up with a lot

630
0:30:43,68 --> 0:30:47,68
of complicated challenges And Oracle
suffered from those challenges

631
0:30:47,68 --> 0:30:52,1
for decades until they finally
figured out how to deal with them.

632
0:30:52,44 --> 0:30:56,26
So what would we gain by having
an undo?

633
0:30:57,74 --> 0:31:0,46
Certainly it would make updates
easier.

634
0:31:0,92 --> 0:31:7,36
I'm not sure it would help with
something like deletes or aborted

635
0:31:7,4 --> 0:31:7,9
inserts.

636
0:31:8,86 --> 0:31:9,84
Nikolay: I think it's deletes.

637
0:31:9,86 --> 0:31:11,28
Let's just update and see.

638
0:31:11,28 --> 0:31:13,48
Bruce: Yeah, just update and be
done with it, exactly.

639
0:31:13,86 --> 0:31:16,56
Michael: It's all the bloat and
vacuum issues, I think, that

640
0:31:16,56 --> 0:31:19,46
people are most excited about getting
rid of.

641
0:31:19,54 --> 0:31:22,54
Bruce: Yeah, I mean, but at the
same time they keep improving

642
0:31:22,54 --> 0:31:24,76
vacuum every release I saw it in
the thing.

643
0:31:24,76 --> 0:31:27,18
You have today's vacuum, you have
the vacuum from 2 years ago,

644
0:31:27,18 --> 0:31:28,82
you have the vacuum 5 years ago.

645
0:31:28,82 --> 0:31:30,22
Which 1 are we complaining about?

646
0:31:30,22 --> 0:31:33,04
Because in fact it has gotten better.

647
0:31:33,34 --> 0:31:35,28
I'd love to just get rid of the
whole thing.

648
0:31:35,28 --> 0:31:39,16
I don't think we... Even with an undo,
I'm not sure I'm ever going
I'm not sure I'm ever going

649
0:31:39,16 --> 0:31:40,64
to be able to get rid of vacuum.

650
0:31:41,74 --> 0:31:44,04
Nikolay: Because it has different
jobs as well.

651
0:31:44,08 --> 0:31:46,62
Bruce: Right, because of the problem with aborted inserts and

652
0:31:46,62 --> 0:31:47,82
deletes and so forth.

653
0:31:48,08 --> 0:31:51,3
So what is that going to look like?

654
0:31:52,08 --> 0:31:52,84
I don't know.

655
0:31:52,84 --> 0:31:53,46
I don't know.

656
0:31:53,46 --> 0:31:58,18
I'm surprised at how long we've been able to stay with what we

657
0:31:58,18 --> 0:32:0,46
got, what, 30, 40 years ago.

658
0:32:1,2 --> 0:32:4,24
A much simpler architectural setup.

659
0:32:4,66 --> 0:32:8,5
It's funny, I had a discussion with somebody and I was in Geneva

660
0:32:8,64 --> 0:32:13,52
for the CERN PGDay and somebody came to me and they were talking

661
0:32:13,52 --> 0:32:15,36
about page compression.

662
0:32:15,46 --> 0:32:17,56
Do we do page compression?

663
0:32:17,68 --> 0:32:22,04
And I have a blog entry about it and I showed it to him and he

664
0:32:22,04 --> 0:32:25,18
didn't really, I hoped he would read it from my phone, but he

665
0:32:25,18 --> 0:32:26,26
didn't want to.

666
0:32:27,1 --> 0:32:29,28
I said, why do you want page compression?

667
0:32:29,28 --> 0:32:31,3
He said, it requires less disk space.

668
0:32:31,78 --> 0:32:33,4
I said, this is not the 1990s.

669
0:32:34,2 --> 0:32:35,94
What actual reason?

670
0:32:35,94 --> 0:32:39,28
I said, I understand why it was needed 30 years ago, but I'm

671
0:32:39,28 --> 0:32:43,48
not sure that, because once we compress a page, then obviously

672
0:32:43,86 --> 0:32:48,46
if we do an update and then the new data can't be compressed

673
0:32:48,48 --> 0:32:51,42
as well, then we got a bigger page, you've got to put that somewhere,

674
0:32:51,66 --> 0:32:54,32
and where do we put it, and then how do we deal with the index

675
0:32:54,32 --> 0:32:58,78
entries, and like what are we really gaining once we're all done

676
0:32:58,78 --> 0:33:1,84
all that computation, all that moving the data around, what have

677
0:33:1,84 --> 0:33:3,48
we really gained with the compression?

678
0:33:3,9 --> 0:33:6,28
Now we can use storage compression, right?

679
0:33:6,28 --> 0:33:8,16
You can, there is storage compression.

680
0:33:8,16 --> 0:33:10,12
I think Postgres will run on that.

681
0:33:10,12 --> 0:33:12,98
But again, that pushes the problem down to the storage layer.

682
0:33:12,98 --> 0:33:15,14
I don't know what performance would be like, because frankly,

683
0:33:15,14 --> 0:33:17,98
databases don't like compression of that size.

684
0:33:17,98 --> 0:33:22,72
But I guess my point is that he was asking for a very specific

685
0:33:22,72 --> 0:33:26,24
thing, but when I asked him why he wanted it, he really couldn't

686
0:33:27,24 --> 0:33:32,72
articulate except, okay, it uses less disk space, but I wasn't

687
0:33:32,72 --> 0:33:34,2
sure what the goal was there.

688
0:33:34,2 --> 0:33:34,54
So for

689
0:33:34,54 --> 0:33:38,42
Nikolay: me, compression, as in let's compress data, is

690
0:33:38,42 --> 0:33:41,82
more related to column storage and how you can compress there.

691
0:33:41,82 --> 0:33:42,72
Bruce: I told him that.

692
0:33:42,72 --> 0:33:46,54
I said, if you're telling me column storage, that's a completely

693
0:33:46,56 --> 0:33:47,38
different setup.

694
0:33:47,9 --> 0:33:49,64
And we do have solutions for that.

695
0:33:49,84 --> 0:33:52,8
But again, you're assuming a lot of duplicate data in the same

696
0:33:52,8 --> 0:33:54,4
column.

697
0:33:54,4 --> 0:33:59,04
And that's a different case than compressing an 8K page on disk.

698
0:34:0,14 --> 0:34:2,5
And it's like, no, no, no, I don't want that.

699
0:34:2,5 --> 0:34:4,18
I want the 1 about the page.

700
0:34:4,74 --> 0:34:6,58
I don't see anybody working on that.

701
0:34:6,58 --> 0:34:7,4667
And I'm like, okay.

702
0:34:7,4667 --> 0:34:9,22
Nikolay: We don't serve this in
this restaurant.

703
0:34:9,52 --> 0:34:10,94
Bruce: Yeah, we don't serve that.

704
0:34:10,94 --> 0:34:14,48
In terms of 64-bit transaction
IDs, I think it's a little hard

705
0:34:14,48 --> 0:34:17,62
to understand what's going on when
you look at the email threads,

706
0:34:17,64 --> 0:34:24,02
because I think the impetus is
that Postgres Pro has a version

707
0:34:24,02 --> 0:34:27,1
of Postgres that does 64-bit transaction
IDs.

708
0:34:27,74 --> 0:34:31,8
And therefore, you don't have to
freeze and the whole thing there.
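The freeze pressure that 64-bit transaction IDs would remove can be watched today with the standard age() and mxid_age() functions (a monitoring sketch; alerting thresholds are a matter of taste):

```sql
-- Databases closest to 4-byte xid wraparound first.
SELECT datname,
       age(datfrozenxid)    AS xid_age,
       mxid_age(datminmxid) AS multixact_age
FROM pg_database
ORDER BY xid_age DESC;
```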

709
0:34:31,8 --> 0:34:37,08
But as we talked about earlier,
Nikolay, I think that the Postgres

710
0:34:37,24 --> 0:34:44,7
Pro customers are really at the
very high end throughput requirements

711
0:34:45,26 --> 0:34:46,72
And they're willing to pay for
that.

712
0:34:46,72 --> 0:34:50,52
They're willing to have a more
complex system that does that.

713
0:34:51,18 --> 0:34:56,4
But when we're now in the more
generic workload case, the Russians

714
0:34:56,4 --> 0:34:59,62
have been very willing to give
us the patches to do it.

715
0:35:0,06 --> 0:35:3,78
But there's a resistance in terms
of exactly how to do it.

716
0:35:4,0 --> 0:35:8,44
In a way, it's going to benefit
the high-end users, no question.

717
0:35:9,34 --> 0:35:12,98
And 1 of the things I've learned
in the past year is that 1 of

718
0:35:12,98 --> 0:35:16,44
the reasons that Oracle looks that
way, and a lot of the proprietary

719
0:35:16,56 --> 0:35:19,6
forks look the way they do, is
because they're really selling

720
0:35:19,6 --> 0:35:22,86
to that top 5% of market volume.

721
0:35:23,5 --> 0:35:27,5
Whereas when Postgres is working
on its code, we're trying to

722
0:35:27,5 --> 0:35:29,88
hit that 50% mark, right?

723
0:35:30,36 --> 0:35:33,12
And the reason Oracle is so complicated,
the reason a lot of

724
0:35:33,12 --> 0:35:36,1
these databases become complicated
or add these complicated features

725
0:35:36,22 --> 0:35:38,8
is because they're really selling,
they don't really care.

726
0:35:38,8 --> 0:35:41,88
I don't say they don't care, but
they're only really focused

727
0:35:41,88 --> 0:35:45,76
on their top 50 customers and everyone
else is just along for

728
0:35:45,76 --> 0:35:46,42
the ride.

729
0:35:46,56 --> 0:35:49,44
And therefore you get these very
complicated systems with a lot

730
0:35:49,44 --> 0:35:53,86
of weird options, which were added
only because 1 of the 50 wanted

731
0:35:53,86 --> 0:35:54,36
it.

732
0:35:54,84 --> 0:35:58,64
You have this very high-end group
who's very demanding, who's

733
0:35:58,64 --> 0:36:2,7
calling the shots, and they're
dragging the Database into this

734
0:36:2,7 --> 0:36:3,76
high-end volume.

735
0:36:4,02 --> 0:36:7,2
But every time they're dragging
it up there, they're potentially

736
0:36:7,42 --> 0:36:13,34
making the generic workload either
slower or harder or less efficient.

737
0:36:13,38 --> 0:36:16,7
And they effectively don't care
because the money is in that

738
0:36:16,7 --> 0:36:17,2
high-end.

739
0:36:17,22 --> 0:36:18,46
Nikolay: Let me comment, please.

740
0:36:18,74 --> 0:36:22,1
I cannot keep silence here because
we help startups.

741
0:36:22,36 --> 0:36:26,72
Our primary customer is a startup which grew to some terabytes and
which grew some terabytes and

742
0:36:26,72 --> 0:36:28,94
they saw something and they hit
problems.

743
0:36:29,25 --> 0:36:34,1
And last couple of years we experienced
a lot of AI startups,

744
0:36:34,82 --> 0:36:36,24
and this is crazy, absolutely.

745
0:36:36,74 --> 0:36:40,9
They reach heights much quicker
with much less effort and

746
0:36:40,9 --> 0:36:41,4
resources.

747
0:36:42,26 --> 0:36:45,68
And they start hitting problems,
for example, very quickly, where

748
0:36:45,72 --> 0:36:50,22
freezing is a big problem and we
need to do manual freezing to

749
0:36:50,22 --> 0:36:51,43
skip indexes, for example.

750
0:36:51,43 --> 0:36:53,5
This is very common right now for
us.
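The manual freezing that skips index work, which Nikolay mentions, can be expressed with standard VACUUM options since Postgres 12 (the table name here is hypothetical):

```sql
-- Freeze tuples proactively but skip the index cleanup phase.
VACUUM (FREEZE, INDEX_CLEANUP OFF, VERBOSE) big_append_only_table;
```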

751
0:36:54,34 --> 0:37:1,32
And second thing is that we're already losing some of them, because

752
0:37:1,34 --> 0:37:5,46
when we start saying you need partitioning, it's like you need

753
0:37:5,46 --> 0:37:8,48
to allocate some resources for this like experiments and so on.

754
0:37:8,48 --> 0:37:8,98
Yeah.

755
0:37:9,0 --> 0:37:11,96
They switch to a different database system or quickly consider

756
0:37:11,96 --> 0:37:14,48
one where there's less headache.

757
0:37:14,96 --> 0:37:16,62
But this concerns me a lot.

758
0:37:16,76 --> 0:37:18,46
Michael: Isn't that kind of Bruce's point?

759
0:37:18,48 --> 0:37:22,54
That if they're an Oracle customer and they grew to being a significant

760
0:37:22,64 --> 0:37:25,84
portion of revenue worth saving, that's different.

761
0:37:25,84 --> 0:37:28,44
If it's the Postgres, like the Postgres project doesn't have

762
0:37:28,44 --> 0:37:31,4
to listen to the huge customers, because they're not paying.

763
0:37:31,4 --> 0:37:34,84
Maybe Oracle is incentivized to listen to those customers more than

764
0:37:34,84 --> 0:37:37,54
the average customer or more than all the little guys.

765
0:37:37,8 --> 0:37:41,28
So Postgres is a different landscape where we can listen to people

766
0:37:41,28 --> 0:37:44,86
more like across the board, not just the guys hitting extreme

767
0:37:44,86 --> 0:37:45,36
scale.

768
0:37:45,8 --> 0:37:48,76
Nikolay: My point is that the smaller teams have bigger databases

769
0:37:49,22 --> 0:37:50,88
with increasing speed.

770
0:37:51,58 --> 0:37:57,92
They hit problems which were hit only by big users before, more

771
0:37:57,92 --> 0:37:58,82
and more often.

772
0:37:58,84 --> 0:38:3,46
These guys are more ready to switch faster because they are surrounded

773
0:38:3,46 --> 0:38:6,0
by AI themselves, because they are AI startups.

774
0:38:6,54 --> 0:38:7,94
Bruce: What do they switch to?

775
0:38:8,5 --> 0:38:13,02
Nikolay: Sometimes proprietary databases, AWS or GCP, what they

776
0:38:13,02 --> 0:38:13,52
offer.

777
0:38:13,94 --> 0:38:18,86
Sometimes MongoDB is also on the table, running into

778
0:38:18,86 --> 0:38:19,7
different problems.

779
0:38:19,74 --> 0:38:23,98
But I'm just concerned that I see this different thing that I

780
0:38:23,98 --> 0:38:26,02
didn't see 20 previous years.

781
0:38:26,28 --> 0:38:30,02
Bruce: So if we look at what's going on with the 64-bit IDs as

782
0:38:30,02 --> 0:38:33,18
a microcosm: they're willing to give us all the patches to

783
0:38:33,18 --> 0:38:34,62
make this happen, right?

784
0:38:35,28 --> 0:38:39,14
But what's happened is that we've only incrementally implemented

785
0:38:39,16 --> 0:38:39,4
those.

786
0:38:39,4 --> 0:38:40,86
We have some of them in PG 19.

787
0:38:41,52 --> 0:38:44,94
For example, the MultiXact sizes are now 64-bit.

788
0:38:45,28 --> 0:38:48,24
The members or the groups, I can't remember, I think it's the

789
0:38:48,24 --> 0:38:48,74
groups.

790
0:38:49,3 --> 0:38:53,14
Nikolay: And sorry, we had a couple of good examples last year

791
0:38:53,14 --> 0:38:56,86
and some of them came on our podcast, where this was exhausted,

792
0:38:56,98 --> 0:38:57,68
this space.

793
0:38:57,8 --> 0:38:58,48
Bruce: The MultiXact.

794
0:38:59,34 --> 0:38:59,82
Nikolay: Yeah, yeah.

795
0:38:59,82 --> 0:39:3,94
And also AI startups which have a lot of data very fast.

796
0:39:4,16 --> 0:39:8,94
Bruce: So there's 1, 2 ways we can go at it.

797
0:39:8,94 --> 0:39:15,36
We could just go whole hog, 64-bit everything, increase the header

798
0:39:15,36 --> 0:39:20,32
size by 33% by making them 64-bit and be done with it.

799
0:39:20,32 --> 0:39:23,1
But we're worried about the impact on an average user.

800
0:39:23,1 --> 0:39:29,24
So what we've been doing is slowly 64-bit, enabling the server,

801
0:39:29,54 --> 0:39:31,92
MultiXact is 1, and there's some other areas.

802
0:39:31,92 --> 0:39:33,78
I think CLOG we're looking at doing.

803
0:39:33,78 --> 0:39:35,26
I don't know if we got to that.

804
0:39:35,86 --> 0:39:37,26
So we're trying to do it.

805
0:39:37,26 --> 0:39:42,54
We're trying to 64-bit the areas that are costless to us, basically,

806
0:39:42,54 --> 0:39:45,96
where we can increase it without really any problems.

807
0:39:46,3 --> 0:39:50,38
And then I think once we get to that, now we have 64-bit pretty

808
0:39:50,38 --> 0:39:52,12
much everywhere that we can easily do it.

809
0:39:52,12 --> 0:39:55,12
Then we have to look at, okay, now at the places that are going

810
0:39:55,12 --> 0:39:57,48
to cost us, what do we do?

811
0:39:57,74 --> 0:40:2,66
I think one of the great ideas that I saw was to get basically

812
0:40:3,08 --> 0:40:9,88
an epoch LSN on the page header so that you could basically say

813
0:40:9,88 --> 0:40:14,44
that for mass loads of data, which a lot of these companies

814
0:40:14,44 --> 0:40:17,4
do, effectively all the tuples

815
0:40:17,4 --> 0:40:18,54
are in the same transaction.

816
0:40:18,64 --> 0:40:21,54
They're clearly all in the same epoch, right?

817
0:40:21,82 --> 0:40:25,88
So you just put one epoch and then all your heap tuples look just

818
0:40:25,88 --> 0:40:27,26
the same as they did before.

819
0:40:27,84 --> 0:40:28,34
Right.

820
0:40:28,38 --> 0:40:31,46
I think where we're going is the concept of having an

821
0:40:31,46 --> 0:40:35,28
epoch on the page and then basically allowing those pages not

822
0:40:35,46 --> 0:40:36,98
to have to be frozen at all.

823
0:40:37,28 --> 0:40:42,1
I think if we start to bring the epoch down to individual rows

824
0:40:42,7 --> 0:40:45,48
and we increase the header size, which people have already complained

825
0:40:45,48 --> 0:40:49,38
is too big, then we potentially could get complaints.

826
0:40:49,66 --> 0:40:53,0
I think that, if I had to take a guess, I think that's where

827
0:40:53,0 --> 0:40:56,66
we're going, but we're going at it very incrementally, again,

828
0:40:56,68 --> 0:41:2,78
hitting all the places that store LSNs outside of the heap and

829
0:41:2,78 --> 0:41:5,1
index pages, because we know those are going to be complicated,

830
0:41:5,32 --> 0:41:7,06
and get the infrastructure done.

831
0:41:7,3 --> 0:41:10,08
I know that we have some patches in 19 for that.

832
0:41:10,08 --> 0:41:11,4
I've seen them coming through.

833
0:41:11,48 --> 0:41:12,36
They got committed.

834
0:41:12,74 --> 0:41:16,72
Then we have to decide, okay, now that all this stuff is in place,

835
0:41:16,72 --> 0:41:18,26
let's put up a proof of concept.

836
0:41:18,48 --> 0:41:20,9
Let's see what a per-page epoch looks like.

837
0:41:21,04 --> 0:41:22,6
Let's do some loading of data.

838
0:41:22,6 --> 0:41:25,76
Let's see how far this gets us down the road.

839
0:41:25,76 --> 0:41:29,24
We don't have a whole lot of empty space on the pages, so it'll

840
0:41:29,24 --> 0:41:31,36
be a little tricky to figure out where to put it, but I think

841
0:41:31,36 --> 0:41:33,28
we can find a place to put it.

842
0:41:33,28 --> 0:41:37,2
But I think that's basically where we are: trying to figure

843
0:41:37,2 --> 0:41:42,38
out, once we get all the bottom stuff done, what we do for

844
0:41:42,38 --> 0:41:44,7
the main stuff, the heap and the index pages.

845
0:41:45,3 --> 0:41:45,62
Michael: Nice.

846
0:41:45,62 --> 0:41:46,92
I think that makes a ton of sense.

847
0:41:46,92 --> 0:41:50,22
And I really like the incremental approach that Postgres takes.

848
0:41:50,22 --> 0:41:53,26
I've only been in and around the project for the past 10 years

849
0:41:53,26 --> 0:41:53,98
or so.

850
0:41:54,52 --> 0:41:55,46
Bruce: It's a lot!

851
0:41:55,6 --> 0:41:58,84
Michael: I know, it's a lot compared to some, but it pales in

852
0:41:58,84 --> 0:42:0,58
comparison to 30 years.

853
0:42:0,68 --> 0:42:3,88
And it's been really nice seeing the incremental improvements

854
0:42:3,92 --> 0:42:6,22
but also that they stack up.

855
0:42:6,22 --> 0:42:9,96
It really has come a long way in 10 years, and things that looked

856
0:42:9,96 --> 0:42:13,66
incremental 9 years ago, 8 years ago, 7 years ago, they've

857
0:42:13,66 --> 0:42:16,68
really added up. And you mentioned vacuum, but it's not just

858
0:42:16,68 --> 0:42:19,4
vacuum, right? All of the changes that have helped

859
0:42:19,4 --> 0:42:20,56
with index bloat.

860
0:42:20,74 --> 0:42:23,0
They're kind of attacking it from the other direction.

861
0:42:23,36 --> 0:42:23,8
So it's

862
0:42:23,8 --> 0:42:24,66
Bruce: That's very true.

863
0:42:24,66 --> 0:42:26,34
Yeah, replication.

864
0:42:26,38 --> 0:42:28,46
Yeah, yeah, partitioning, yeah.

865
0:42:29,02 --> 0:42:31,8
Michael: Yeah, I like that approach and I feel like it might

866
0:42:31,8 --> 0:42:35,16
sound like it's going to be slow, but time flies, and in a few

867
0:42:35,16 --> 0:42:35,46
years,

868
0:42:35,46 --> 0:42:38,36
I can imagine us having made significant progress on some of

869
0:42:38,36 --> 0:42:38,76
these things.

870
0:42:38,76 --> 0:42:40,14
So yeah, that's really cool.

871
0:42:41,04 --> 0:42:42,6
Bruce, I'm conscious of time.

872
0:42:42,88 --> 0:42:45,32
Is there anything we didn't talk about that you wanted to make

873
0:42:45,32 --> 0:42:48,48
sure we did mention or any last shout outs or pointers you wanted

874
0:42:48,48 --> 0:42:49,34
to give people?

875
0:42:49,82 --> 0:42:53,52
Bruce: No, it was just nice that we got a chance to at least talk

876
0:42:53,52 --> 0:42:58,04
about it. To me, the categories of what's missing were a big takeaway,

877
0:42:58,04 --> 0:43:1,08
to understand, like, why are we here?

878
0:43:1,08 --> 0:43:4,94
Why don't we have some of the stuff that we're missing and some

879
0:43:4,94 --> 0:43:6,28
of the stuff we may never have?

880
0:43:6,28 --> 0:43:7,6
Optimizer hints.

881
0:43:7,6 --> 0:43:10,6
Although that seems like a portion of that is coming in Postgres

882
0:43:10,6 --> 0:43:10,92
19.

883
0:43:10,92 --> 0:43:13,78
It's not called optimizer hints, it's called plan advice, but

884
0:43:13,84 --> 0:43:16,04
it could be used in a similar way.

885
0:43:16,5 --> 0:43:19,44
So again, I think that we don't have a roadmap.

886
0:43:19,44 --> 0:43:20,14
That's the problem.

887
0:43:20,14 --> 0:43:22,76
We don't, because we're too dynamic to have a roadmap.

888
0:43:23,1 --> 0:43:26,88
So it's almost a surprise to me to see what gets in every release.

889
0:43:26,88 --> 0:43:30,64
And, I think everybody's pleasantly surprised by what's in 19

890
0:43:30,92 --> 0:43:35,28
and obviously what we're going to be starting for Postgres 20

891
0:43:35,28 --> 0:43:35,9
in July.

892
0:43:36,76 --> 0:43:37,48
Michael: Yeah, absolutely.

893
0:43:37,48 --> 0:43:39,38
And I think I saw the work from Robert.

894
0:43:39,38 --> 0:43:41,34
I think it was in contrib modules, right?

895
0:43:41,94 --> 0:43:43,08
At least the first version.

896
0:43:43,08 --> 0:43:44,56
Bruce: pg_plan_advice, yeah.

897
0:43:44,56 --> 0:43:45,06
Michael: Yes.

898
0:43:45,94 --> 0:43:46,92
Looks very cool.

899
0:43:47,22 --> 0:43:47,86
All right.

900
0:43:47,86 --> 0:43:49,28
Well, thank you so much for joining us.

901
0:43:49,28 --> 0:43:52,64
I'm sorry Nik had to drop off.
It's an absolute pleasure having

902
0:43:52,64 --> 0:43:53,14
you.

903
0:43:53,44 --> 0:43:53,8
Bruce: Great.

904
0:43:53,8 --> 0:43:55,36
Thanks, it was nice talking to you.

905
0:43:55,46 --> 0:43:55,96
Michael: Likewise.