1
00:00:00,060 --> 00:00:02,980
Michael: Hello and welcome to
PostgresFM, a weekly show about

2
00:00:02,980 --> 00:00:04,120
all things PostgreSQL.

3
00:00:04,120 --> 00:00:05,720
I am Michael, founder of pgMustard.

4
00:00:05,860 --> 00:00:08,540
This is my co-host Nikolay, founder
of Postgres.ai.

5
00:00:08,720 --> 00:00:10,820
Hey Nikolay, what are we talking
about today?

6
00:00:11,540 --> 00:00:13,880
Nikolay: Let's continue the discussion
about UUIDs.

7
00:00:15,060 --> 00:00:20,340
I chose this topic because I recently
implemented something myself

8
00:00:20,740 --> 00:00:24,180
and also because some of my customers
raised this again.

9
00:00:25,760 --> 00:00:29,440
So I think many people still have
questions.

10
00:00:30,060 --> 00:00:34,020
I had, was it a solo recording
about UUID?

11
00:00:34,200 --> 00:00:36,080
Okay, it was a solo episode.

12
00:00:36,900 --> 00:00:42,340
But obviously I have fresh things
to discuss and you also have

13
00:00:42,340 --> 00:00:43,380
something to discuss.

14
00:00:43,380 --> 00:00:48,440
So let's return to this topic and
be more practical this time

15
00:00:48,440 --> 00:00:53,180
and think about really, really
large datasets and how to work

16
00:00:53,180 --> 00:00:54,260
with them more efficiently.

17
00:00:54,720 --> 00:00:55,680
It means partitioning.

18
00:00:56,460 --> 00:00:57,900
UUID plus partitioning.

19
00:00:58,820 --> 00:01:00,260
Michael: So that's what you looked
into.

20
00:01:01,940 --> 00:01:02,440
Nikolay: Yep.

21
00:01:03,780 --> 00:01:05,500
Well, let's talk about only partitioning.

22
00:01:05,600 --> 00:01:08,600
We are not sharding experts, so
let's do only partitioning.

23
00:01:09,960 --> 00:01:12,900
Michael: But I would say, actually,
I think it's important to

24
00:01:12,900 --> 00:01:18,340
mention that one of the big benefits
of UUIDs in general is that

25
00:01:18,340 --> 00:01:23,040
they do suit a setup
where you're generating IDs in

26
00:01:23,040 --> 00:01:24,040
multiple places.

27
00:01:24,060 --> 00:01:27,780
So it is a natural fit in terms
of topic.

28
00:01:27,780 --> 00:01:30,560
Nikolay: So basically you ditch
sequences.

29
00:01:31,920 --> 00:01:34,800
Michael: I think this lets you
avoid some of the drawbacks that

30
00:01:35,000 --> 00:01:39,720
we're seeing in UUIDs, some
of the new implementations, some

31
00:01:39,720 --> 00:01:43,260
of the new standards mean that
we don't have the drawbacks that

32
00:01:43,260 --> 00:01:49,000
typically come up in discussions
of bigserial versus UUID or

33
00:01:49,000 --> 00:01:52,320
ULID, as they're often called if
they're sortable.

34
00:01:53,100 --> 00:01:55,240
So yeah, you were looking into
this.

35
00:01:55,240 --> 00:01:58,140
Nikolay: Never, ever integer for
primary keys, never.

36
00:01:58,140 --> 00:01:58,640
Michael: Yeah.

37
00:01:59,540 --> 00:02:01,220
Hopefully, people know that now.

38
00:02:02,320 --> 00:02:04,600
You were looking into it for partitioning
recently then.

39
00:02:04,600 --> 00:02:06,220
So what was the use case?

40
00:02:07,700 --> 00:02:11,780
Nikolay: So the use case was our
chatbot, a new AI chatbot we

41
00:02:11,780 --> 00:02:15,940
developed, and we needed to have
a storage for all communication

42
00:02:16,080 --> 00:02:19,860
that happens in a public channel,
because we have a paid version,

43
00:02:19,860 --> 00:02:23,300
and a free version, and the free
version obviously is recorded

44
00:02:23,300 --> 00:02:24,120
and shareable.

45
00:02:24,520 --> 00:02:29,980
So basically, we created a chats table,
and I originally used BigInt

46
00:02:30,480 --> 00:02:35,920
primary keys, and then I was ashamed
by my team members, saying

47
00:02:35,920 --> 00:02:41,060
that come on, we need to use Timescale
there first of all and

48
00:02:41,060 --> 00:02:47,160
also, like numbers in URL is not
looking great. We need some better

49
00:02:47,160 --> 00:02:49,220
ideas to have there.

50
00:02:49,640 --> 00:02:53,340
Michael: Is this because they would
be guessable, or is it a security

51
00:02:53,440 --> 00:02:55,120
thing or something else?

52
00:02:55,580 --> 00:02:56,360
Nikolay: Good question.

53
00:02:56,820 --> 00:02:58,960
I have a long history about this.

54
00:02:59,380 --> 00:03:03,280
So I remember my first social networks,
we tried to hide the

55
00:03:03,280 --> 00:03:06,240
actual number of posts and users
we have.

56
00:03:07,120 --> 00:03:11,140
But then a competitor was released
and they didn't care about it.

57
00:03:11,140 --> 00:03:12,480
They didn't care about it.

58
00:03:12,070 --> 00:03:15,360
And then I also started not
to care about it.

59
00:03:15,360 --> 00:03:19,440
It's okay to have some numbers
in the URL and share actual numbers

60
00:03:19,440 --> 00:03:20,240
with everyone.

61
00:03:20,580 --> 00:03:25,860
But here I just think maybe it
will be maybe shared data set

62
00:03:25,860 --> 00:03:26,070
in the future.

63
00:03:27,540 --> 00:03:29,280
So I don't want to have sequences.

64
00:03:30,060 --> 00:03:32,480
And also maybe I don't want to
share actual numbers.

65
00:03:32,480 --> 00:03:33,160
I don't know.

66
00:03:33,160 --> 00:03:35,940
Like it's one of the things where
you're not fully sure.

67
00:03:36,060 --> 00:03:40,020
So I'm kind of okay to share numbers,
right?

68
00:03:40,120 --> 00:03:44,820
So for example, we generate 5,200
messages per day with this

69
00:03:44,820 --> 00:03:45,320
bot.

70
00:03:45,480 --> 00:03:49,720
People are using it. We have 72 people
joined the program, many

71
00:03:49,720 --> 00:03:53,240
more waiting in the waitlist.

72
00:03:54,640 --> 00:03:59,860
It's not a lot at all, but obviously
it will explode soon, I

73
00:03:59,860 --> 00:04:00,060
think so.

74
00:04:01,680 --> 00:04:06,260
And maybe I don't want competitors
to see actual numbers.

75
00:04:07,780 --> 00:04:08,560
I don't know.

76
00:04:08,560 --> 00:04:09,720
50-50 here.

77
00:04:09,720 --> 00:04:11,780
Because I had experience in both
cases.

78
00:04:12,040 --> 00:04:15,320
I had experience sharing numbers,
I had experience hiding numbers,

79
00:04:15,620 --> 00:04:20,720
and we used two prime numbers to
have not actual random, but kind

80
00:04:20,720 --> 00:04:21,050
of random.

81
00:04:21,050 --> 00:04:27,060
If you take a sequence and multiply
it by a very large integer and

82
00:04:27,060 --> 00:04:32,120
then take a modulus with another
integer, and these two integers

83
00:04:32,120 --> 00:04:33,050
should be mutually prime.

84
00:04:34,280 --> 00:04:38,000
In this case, you have like a rotational,
without collisions, you

85
00:04:38,000 --> 00:04:42,080
have like kind of random numbers
but they are not actually random.

86
00:04:43,140 --> 00:04:46,580
And nobody knows your actual ID
number.

87
00:04:47,300 --> 00:04:51,820
So this is how you can hide the
growth rates and actual number

88
00:04:51,820 --> 00:04:57,560
of your, like, business metrics
and so on, like how many posts

89
00:04:57,560 --> 00:05:00,480
or comments or anything you have
in your social network, for

90
00:05:00,480 --> 00:05:00,980
example.

91
00:05:01,240 --> 00:05:03,320
Again, it's not my main reason.

92
00:05:03,320 --> 00:05:07,360
I just think, okay, we need to
use a UUID probably, and then the

93
00:05:07,360 --> 00:05:10,060
team said we also need to partition
it.

94
00:05:10,420 --> 00:05:12,280
We already used TimescaleDB.

95
00:05:13,260 --> 00:05:14,560
Let's just use it.

96
00:05:14,700 --> 00:05:19,820
And then I think, okay, obviously,
I want my indexes to behave well.

97
00:05:20,020 --> 00:05:25,340
If I use a regular UUID version 4,
for example, supported by Postgres,

98
00:05:25,440 --> 00:05:29,480
currently, I know we will have
performance issues because numbers

99
00:05:30,060 --> 00:05:33,660
are going to very different random
places on the B-tree index.

100
00:05:34,120 --> 00:05:39,480
Not purely random, but any insert
disturbs some arbitrary part

101
00:05:39,480 --> 00:05:40,280
of index.

102
00:05:40,760 --> 00:05:44,240
And this is not good for performance,
especially also if you

103
00:05:44,240 --> 00:05:51,900
want to show last 25 items, for
example, ordered by time, creation

104
00:05:51,980 --> 00:05:52,480
time.

105
00:05:52,960 --> 00:05:56,540
Yeah, you will deal with a lot
more buffers than you should deal

106
00:05:56,540 --> 00:05:57,040
with.

107
00:05:57,260 --> 00:06:00,260
Michael: And a lot more for write-ahead
logging, right?

108
00:06:00,900 --> 00:06:04,120
Because you've got full-page images
for different parts of the

109
00:06:04,120 --> 00:06:08,720
index each time instead of well
B-trees are very optimized for

110
00:06:08,720 --> 00:06:11,820
adding things to the to the end
to the right side continuously

111
00:06:12,040 --> 00:06:16,820
like timestamps and UUIDs don't
fit that pattern at all but that

112
00:06:16,820 --> 00:06:18,820
takes us on to I guess some of
the

113
00:06:19,360 --> 00:06:23,940
Nikolay: newer right so I I And
since we also had a session on

114
00:06:23,940 --> 00:06:28,200
PostgresFM YouTube channel with
Kirk and Andrei, I think it's

115
00:06:28,200 --> 00:06:34,320
Kirk who brought this topic, let's
implement ULID in Postgres.

116
00:06:34,980 --> 00:06:38,680
And using Andrei's hands, we usually
use Andrei's hands because

117
00:06:38,680 --> 00:06:41,600
he's a true hacker in those sessions.

118
00:06:42,120 --> 00:06:47,080
And he implemented quite quickly
some patch, sent it to hackers,

119
00:06:47,080 --> 00:06:53,900
and quite soon we realized there
is already ongoing proposal

120
00:06:54,400 --> 00:07:01,860
of RFC change to support version
7, version 8 of UUID, and Andrei

121
00:07:01,860 --> 00:07:08,860
renamed the patch from ULID to UUID version
7, and then it was blocked

122
00:07:08,860 --> 00:07:13,860
by the idea, let's wait until the
standard is finalized and fully

123
00:07:13,860 --> 00:07:14,360
approved.

124
00:07:16,500 --> 00:07:18,480
You read this thread.

125
00:07:18,480 --> 00:07:22,860
There was also an idea, let's just
avoid naming version 7 completely.

126
00:07:23,440 --> 00:07:27,040
Michael: Well, I thought that was
a really neat solution.

127
00:07:27,040 --> 00:07:31,920
So, ultimately, what we want in,
well, what you want, what I

128
00:07:31,920 --> 00:07:36,180
would like, I think what lots of
people would like, is a function

129
00:07:36,180 --> 00:07:41,180
within Postgres to generate things
that can be put in a column

130
00:07:41,280 --> 00:07:47,260
of type UUID, but that when we
generate new ones based on a timestamp,

131
00:07:48,580 --> 00:07:52,700
they are sorted at the increase
over time.

132
00:07:53,000 --> 00:07:56,260
So that comes with loads of benefits
for updating B-trees and

133
00:07:56,260 --> 00:07:58,520
some of the other benefits that
you mentioned already.

134
00:07:58,940 --> 00:08:04,140
Now there are some new definitions
of new specifications for

135
00:08:04,140 --> 00:08:08,160
UUIDs coming down the pipe that
would suit those perfectly.

136
00:08:08,760 --> 00:08:10,260
But somebody made a really good
point.

137
00:08:10,260 --> 00:08:12,280
I think it was Andres from Microsoft.

138
00:08:13,120 --> 00:08:16,560
One of the first replies that Andres
got was we could just name

139
00:08:16,560 --> 00:08:20,640
the function along the lines of
what it does rather than UUID

140
00:08:20,740 --> 00:08:21,240
v7.

141
00:08:21,980 --> 00:08:24,440
I can't remember the exact name,
but they came up with something

142
00:08:24,440 --> 00:08:25,780
very sensible, I'm sure.

143
00:08:26,140 --> 00:08:29,160
And, yeah, that seemed really smart
to me.

144
00:08:29,160 --> 00:08:33,840
Because then even if the spec changes
for v7, it doesn't matter.

145
00:08:33,840 --> 00:08:36,380
We can implement those later or
have different functions for

146
00:08:36,380 --> 00:08:40,520
those, but this one could still be
useful to users earlier, before

147
00:08:40,520 --> 00:08:43,180
the spec finalizes, before everything
else gets agreed.

148
00:08:43,180 --> 00:08:46,860
So I like that as a suggestion,
but I think it either got lost

149
00:08:46,860 --> 00:08:49,520
or people are very happy waiting
another year.

150
00:08:49,540 --> 00:08:53,980
Nikolay: Well, it's just because
it looks like RFC makes good

151
00:08:53,980 --> 00:08:55,920
progress and it should be finalized.

152
00:08:55,920 --> 00:08:56,920
It's still not finalized.

153
00:08:56,920 --> 00:08:58,780
I've checked it a couple of days
ago.

154
00:08:58,780 --> 00:08:59,740
It's not finalized.

155
00:09:00,820 --> 00:09:02,980
But let's explain once again what
we want.

156
00:09:02,980 --> 00:09:07,800
We want to take good things from
both worlds.

157
00:09:08,800 --> 00:09:10,460
Growing integers, right?

158
00:09:11,280 --> 00:09:16,980
Which if you order them actually
completely, if you have, for

159
00:09:16,980 --> 00:09:21,100
example, regular approach, you
have ID and created at.

160
00:09:21,380 --> 00:09:26,340
Created at is 16 bytes and ID is
8 bytes.

161
00:09:26,400 --> 00:09:30,140
If you use 4 bytes, you will still
use 8 bytes because of alignment

162
00:09:30,140 --> 00:09:31,420
padding, we know it.

163
00:09:31,800 --> 00:09:36,780
So, and if you just insert and
created at is just now, and you

164
00:09:36,780 --> 00:09:42,480
insert just 1 row per transaction,
or you have clock timestamp

165
00:09:42,500 --> 00:09:45,780
instead of now, in this case you
can insert many rows.

166
00:09:45,820 --> 00:09:52,500
So, the order of IDs and created
at values will be the same.

167
00:09:53,680 --> 00:09:58,080
You can order by ID desc limit
25 or you can order by created

168
00:09:58,080 --> 00:10:00,720
at desc limit 25, same.

169
00:10:01,340 --> 00:10:02,360
And this is good.

170
00:10:02,680 --> 00:10:05,520
This is good benefit, and it's
also good for performance because

171
00:10:05,820 --> 00:10:14,120
of locality of data, rows created
inside the same second go probably

172
00:10:14,120 --> 00:10:16,020
in 1 or 2 pages only.

173
00:10:16,020 --> 00:10:21,440
Like, very, it's packed, not distributed
sparsely, right?

174
00:10:22,360 --> 00:10:26,760
And UUID is good because it doesn't
require any sequences.

175
00:10:27,380 --> 00:10:30,520
You can generate it using thousands
of nodes.

176
00:10:30,820 --> 00:10:34,220
This is a name, UUID, universally
unique ID.

177
00:10:35,640 --> 00:10:39,520
But they go into different places
of B-tree and you cannot order

178
00:10:39,520 --> 00:10:40,080
by them.

179
00:10:40,080 --> 00:10:41,240
You cannot order by them.

180
00:10:41,240 --> 00:10:42,380
This is the main problem.

181
00:10:42,880 --> 00:10:44,780
Michael: In like version 4 of the
specification.

182
00:10:45,040 --> 00:10:47,120
Nikolay: Yeah, yeah, I'm talking
about current situation.

183
00:10:47,980 --> 00:10:50,860
And also with RFC, it's still not
finalized, so it's also the

184
00:10:50,860 --> 00:10:53,500
current situation if you don't
use drafts, right?

185
00:10:54,020 --> 00:10:55,180
And this is a problem.

186
00:10:55,440 --> 00:10:58,300
And we have good blog posts from
Shopify.

187
00:10:58,380 --> 00:11:00,200
It's MySQL, but it's very good.

188
00:11:00,200 --> 00:11:04,080
They explain the economical benefits
from switching to different

189
00:11:04,080 --> 00:11:06,440
type of UUID, newer type of UUID.

190
00:11:06,820 --> 00:11:13,580
And also, Husayn Nasir, sorry, I
pronounced his name wrong, on

191
00:11:13,580 --> 00:11:18,460
YouTube explaining very well for
backend engineers, This Shopify

192
00:11:18,520 --> 00:11:23,000
case and why a better version of
UUID is really better, he explains

193
00:11:23,000 --> 00:11:23,940
it very well.

194
00:11:24,120 --> 00:11:25,080
Very, very well.

195
00:11:25,080 --> 00:11:30,060
Like, his explanation is, anyone
can understand this now, right?

196
00:11:30,060 --> 00:11:32,120
Maybe better than I just explained,
right?

197
00:11:32,120 --> 00:11:33,400
Michael: Let's link it up.

198
00:11:34,280 --> 00:11:37,720
I actually pulled out a quote from
that Shopify blog post.

199
00:11:37,720 --> 00:11:42,720
It said, in 1 high throughput system,
they saw a 50% decrease

200
00:11:42,720 --> 00:11:47,540
in insert statement duration by
switching from UUID v4 to ULID.

201
00:11:49,020 --> 00:11:50,320
Nikolay: And you can order by them.

202
00:11:50,320 --> 00:11:55,060
If you want, like, 25 letters posts,
you need to order by, right?

203
00:11:57,340 --> 00:11:59,640
Michael: Even if you don't need
those, even if you don't need

204
00:11:59,640 --> 00:12:03,000
that, there could be significant
benefits because of

205
00:12:03,000 --> 00:12:03,480
Nikolay: the B-trees.

206
00:12:03,480 --> 00:12:05,780
Let's slowly move to partitioning.

207
00:12:06,340 --> 00:12:11,140
If you don't have an orderable
ID and you rely on created at

208
00:12:11,180 --> 00:12:14,100
column, or you can have an index on
created at, you can order by

209
00:12:14,100 --> 00:12:15,700
and use an index scan on that.

210
00:12:15,700 --> 00:12:16,825
Okay, It's okay.

211
00:12:16,825 --> 00:12:21,780
Not perfect situation because you
have many more buffer numbers

212
00:12:22,400 --> 00:12:23,300
in the plan, right?

213
00:12:23,300 --> 00:12:25,440
Because you fetch from random places.

214
00:12:25,960 --> 00:12:30,980
But if you want to partition such
table, in URL you have UUID,

215
00:12:31,080 --> 00:12:31,580
right?

216
00:12:32,120 --> 00:12:35,460
How to find which partition to
deal with?

217
00:12:36,420 --> 00:12:38,080
Michael: You could do like hash
partitioning.

218
00:12:38,240 --> 00:12:39,960
Nikolay: Well, yeah, yeah, yeah,
yeah.

219
00:12:39,960 --> 00:12:42,240
But I want to partition by time,
I forgot to say.

220
00:12:42,240 --> 00:12:45,400
I want to partition by time because
older data is used less often.

221
00:12:47,000 --> 00:12:50,320
Michael: And maybe like you want
to phase it out over time.

222
00:12:50,380 --> 00:12:54,100
Maybe you don't even, like, you
might want to eventually drop...

223
00:12:54,140 --> 00:12:54,640
Exactly.

224
00:12:55,020 --> 00:12:56,880
Nikolay: Or move to a different tablespace.

225
00:12:56,880 --> 00:12:59,120
It's not super popular solution,
of course.

226
00:12:59,440 --> 00:13:03,780
Also, There are some ideas to be
able to move it to object storage,

227
00:13:04,020 --> 00:13:06,680
which is only supported in Timescale
Cloud.

228
00:13:07,660 --> 00:13:12,160
I think one day it should be possible
with regular vanilla PostgreSQL.

229
00:13:12,180 --> 00:13:15,000
Michael: I think I read about another
provider doing it recently

230
00:13:15,040 --> 00:13:15,540
too.

231
00:13:16,020 --> 00:13:16,520
Nikolay: Interesting.

232
00:13:16,880 --> 00:13:17,540
Send me a link.

233
00:13:17,540 --> 00:13:20,000
I'm very interested in learning
about it.

234
00:13:20,280 --> 00:13:23,600
I think moving older partitions,
time-based partitions, like

235
00:13:23,600 --> 00:13:27,420
our old data, like 2 years old,
moving to object storage, eventually

236
00:13:27,660 --> 00:13:31,220
moving towards like bottomless
Postgres, right, it would be great.

237
00:13:31,640 --> 00:13:34,020
But back to our question.

238
00:13:34,300 --> 00:13:38,000
In URL you have blah blah blah,
in our case postgresql.org slash

239
00:13:38,000 --> 00:13:38,740
some UUID.

240
00:13:39,400 --> 00:13:41,740
And we need to understand which
partition to deal with.

241
00:13:41,740 --> 00:13:43,020
In our case it's TimescaleDB.

242
00:13:43,780 --> 00:13:46,240
How to tell TimescaleDB which
partition it is?

243
00:13:46,240 --> 00:13:50,200
If it's a regular UUID, I think it's
a nightmare.

244
00:13:50,800 --> 00:13:56,260
Michael: Well, I saw the update
and the reply you got from, was

245
00:13:56,260 --> 00:13:58,480
it James Sewell at Timescale?

246
00:13:58,480 --> 00:14:00,900
Nikolay: A couple of guys replied
from Timescale company.

247
00:14:02,860 --> 00:14:07,280
I really appreciate it, but it
was about already this sortable

248
00:14:07,360 --> 00:14:09,560
ULID version 7.

249
00:14:09,760 --> 00:14:11,180
This is already solved.

250
00:14:11,540 --> 00:14:14,120
This recipe I have in this Postgres
Marathon.

251
00:14:14,800 --> 00:14:15,580
It's recorded.

252
00:14:16,900 --> 00:14:18,540
I implemented it my own way.

253
00:14:18,540 --> 00:14:19,620
It was not efficient.

254
00:14:19,640 --> 00:14:23,240
And they just provided very good
advice how to do it much more

255
00:14:23,240 --> 00:14:23,740
elegantly.

256
00:14:25,240 --> 00:14:27,660
And I have it covered in my small
article.

257
00:14:28,440 --> 00:14:32,940
So ULID, it's similar to UUID version
7 or 8.

258
00:14:32,940 --> 00:14:38,040
ULID, it's actually it should be
UUID because it's universally

259
00:14:39,120 --> 00:14:43,280
unique, lexicographically orderable
or something.

260
00:14:43,480 --> 00:14:43,980
Sortable.

261
00:14:44,440 --> 00:14:46,600
Michael: Yeah, probably like UUID.

262
00:14:47,980 --> 00:14:51,560
But I like, I admire the people
that came up with ULID.

263
00:14:51,680 --> 00:14:53,260
It's nice and simple and unique.

264
00:14:53,260 --> 00:14:56,900
Nikolay: So I guess we are about
to abandon this abbreviation

265
00:14:57,180 --> 00:15:01,400
at all and just use UUID version
7 or 8 because of the standard.

266
00:15:02,280 --> 00:15:05,140
Michael: Or maybe in the future we'll just say UUID

267
00:15:05,140 --> 00:15:08,320
because I don't, there might
not be that many benefits to

268
00:15:08,320 --> 00:15:09,280
using the random ones.

269
00:15:09,280 --> 00:15:12,660
Yeah, sure, you get extra uniqueness,
like you get more bits

270
00:15:12,660 --> 00:15:16,620
assigned to the uniqueness part,
but we already have so many,

271
00:15:17,040 --> 00:15:22,200
like I was looking up version
7 and I think the spec currently

272
00:15:22,200 --> 00:15:24,120
says 48 bits for the timestamp.

273
00:15:24,620 --> 00:15:26,580
So if you're thinking in terms
of bits

274
00:15:26,580 --> 00:15:27,260
Nikolay: and bytes.

275
00:15:27,260 --> 00:15:27,840
16 bytes.

276
00:15:28,680 --> 00:15:29,140
Michael: Yeah.

277
00:15:29,140 --> 00:15:30,840
So 16 bytes or 128 bits.

278
00:15:31,240 --> 00:15:35,360
48 of those are reserved for the
timestamp component.

279
00:15:35,900 --> 00:15:40,460
And there's a little bit about,
to specify the version, 74 bits

280
00:15:40,480 --> 00:15:42,020
for the randomness.

281
00:15:42,540 --> 00:15:47,580
Now naturally in a UUID v4, you
get all 128 bits for randomness.

282
00:15:47,600 --> 00:15:50,820
So there is less randomness possible
in these ones.

283
00:15:51,760 --> 00:15:53,580
Nikolay: There may be collisions,
right?

284
00:15:54,000 --> 00:15:56,980
Michael: Or just a slightly higher
percentage chance of it, right?

285
00:15:56,980 --> 00:16:01,560
Like, it's not, we're still talking
about a, you know, especially

286
00:16:01,560 --> 00:16:05,140
if you're talking about in your
case, the chance of collisions

287
00:16:05,140 --> 00:16:08,940
is just basically 0, which is why
these are useful.

288
00:16:09,240 --> 00:16:11,960
Nikolay: Yeah, even if we will
have like 100 messages per second,

289
00:16:11,960 --> 00:16:13,000
we will be fine.

290
00:16:13,040 --> 00:16:13,760
Michael: Yeah, exactly.

291
00:16:13,860 --> 00:16:16,700
Nikolay: And we can generate them
on any server basically.

292
00:16:16,800 --> 00:16:23,080
Well, how clocks are set will be
a question of course.

293
00:16:24,940 --> 00:16:26,780
Michael: This needs to be UTC,
right?

294
00:16:27,100 --> 00:16:30,800
Nikolay: So this version 7 or ULID
or version 8, version 7 and

295
00:16:30,800 --> 00:16:36,320
8 they distinguish in precision
only, right?

296
00:16:36,580 --> 00:16:37,500
Michael: That was my understanding.

297
00:16:37,500 --> 00:16:40,640
Nikolay: The idea is let's take
timestamp, let's generate regular

298
00:16:40,640 --> 00:16:45,600
UUID, or something like that, and
then produce a new UUID, the

299
00:16:45,600 --> 00:16:50,160
prefix of which will be, not prefix,
first bytes of which will

300
00:16:50,160 --> 00:16:53,960
be corresponding to timestamp,
so they will be sortable.

301
00:16:54,280 --> 00:16:58,440
If you can order by and with very,
very high confidence, you

302
00:16:58,440 --> 00:17:01,360
can say, this corresponds to created
timestamp.

303
00:17:02,460 --> 00:17:05,400
And actually, the interesting thing...

304
00:17:06,100 --> 00:17:10,880
So, Postgres is on pause, right?

305
00:17:10,880 --> 00:17:12,940
Waiting for RFC to be finalized.

306
00:17:13,320 --> 00:17:18,840
I mean, I actually started to doubt
that Postgres 17 will get it.

307
00:17:19,140 --> 00:17:23,040
This idea to change name and just
provide something not depending

308
00:17:23,040 --> 00:17:27,180
on the RFC is good, but I guess
consensus was not, maybe not

309
00:17:27,180 --> 00:17:32,520
consensus, but what I saw from
these Postgres scale hackers mailing

310
00:17:32,520 --> 00:17:36,440
list, people decided to wait until
the RFC.

311
00:17:36,940 --> 00:17:40,080
And it's not fully clear, I checked
the status, it's still like

312
00:17:40,080 --> 00:17:43,680
waiting for some reviews, but it's
not fully clear when it will

313
00:17:43,680 --> 00:17:44,280
be finalized.

314
00:17:44,280 --> 00:17:46,940
So I guess, probably 17 will not
have it.

315
00:17:47,720 --> 00:17:51,220
Michael: My experience with these
things is it depends if a couple

316
00:17:51,220 --> 00:17:54,520
of people get excited about this
if somebody ends up needing

317
00:17:54,520 --> 00:17:59,880
it, somebody commits it, if somebody
puts a patch together that

318
00:18:00,040 --> 00:18:03,080
can generally be agreed on and
somebody else is willing to review

319
00:18:03,080 --> 00:18:04,440
it, it could get in.

320
00:18:04,440 --> 00:18:10,540
But it depends on a couple of people
at least having the energy

321
00:18:10,840 --> 00:18:11,340
and

322
00:18:11,840 --> 00:18:12,340
Nikolay: focus.

323
00:18:12,380 --> 00:18:15,860
Well I just remember Peter Eisentrout's
comment that we should

324
00:18:15,860 --> 00:18:19,440
probably wait for the standard and
Postgres is very close to some

325
00:18:19,440 --> 00:18:21,900
standards, you know, like including
SQL standards.

326
00:18:21,900 --> 00:18:24,940
So it's a very, very important
comment.

327
00:18:25,200 --> 00:18:27,760
My impression is that we're just
waiting for the RFC.

328
00:18:28,260 --> 00:18:32,520
But good news is that you can generate
it in any application

329
00:18:32,680 --> 00:18:33,580
code yourself.

330
00:18:34,740 --> 00:18:39,600
There are many, many, many ULID
or this ULID version 7, if RFC

331
00:18:39,600 --> 00:18:42,400
is not yet finalized, they're called
already version 7, you can

332
00:18:42,400 --> 00:18:42,880
do it.

333
00:18:42,880 --> 00:18:45,140
And you can find on GitHub a lot
of stuff.

334
00:18:45,360 --> 00:18:49,700
But you can also generate it using
PL/pgSQL.

335
00:18:49,700 --> 00:18:50,200
Yeah.

336
00:18:51,100 --> 00:18:55,620
And additionally, like, and Daniel
Verite, I hope I pronounce it

337
00:18:55,840 --> 00:19:01,080
right, sorry guys if I pronounce names
wrong, showed some proposal how

338
00:19:01,080 --> 00:19:06,720
to generate it using PL/pgSQL,
and then I asked, I'm just looking

339
00:19:06,720 --> 00:19:10,520
at the code, I'm saying, we don't
need actually PL/pgSQL here,

340
00:19:10,520 --> 00:19:14,600
we can use it with regular SQL
function, and he answers, yes,

341
00:19:14,700 --> 00:19:15,760
just scroll down.

342
00:19:16,640 --> 00:19:21,340
The same GitHub discussion, it
was GitHub gist, I see a SQL function,

343
00:19:21,340 --> 00:19:24,720
it's quite simple and I just started
to use it.

344
00:19:25,360 --> 00:19:30,240
So I used that function, and I
just generate UUID with 7 using

345
00:19:30,240 --> 00:19:31,920
small SQL function, that's it.

346
00:19:31,920 --> 00:19:35,860
But then the very big question,
do we really need a created that

347
00:19:35,860 --> 00:19:36,360
timestamp?

348
00:19:37,580 --> 00:19:41,320
If we have a timestamp already present
in this UUID value which

349
00:19:41,320 --> 00:19:45,420
is 18 bytes, and we don't need
a very good precision, maybe we

350
00:19:45,420 --> 00:19:47,180
can extract the timestamp back.

351
00:19:48,060 --> 00:19:49,340
And the answer is yes.

352
00:19:49,900 --> 00:19:55,620
I saw Andrei's comment in PostgreSQL
hackers saying that the authors

353
00:19:55,760 --> 00:20:02,080
of the RFC proposal don't encourage
it, extraction of UUID version

354
00:20:02,080 --> 00:20:04,340
7 values, extraction of timestamp.

355
00:20:05,380 --> 00:20:07,680
But for partitioning, we actually
need it.

356
00:20:07,680 --> 00:20:12,600
And also, we need it if we want
to just drop our created at column,

357
00:20:12,600 --> 00:20:13,320
because why?

358
00:20:13,320 --> 00:20:15,120
We have created that right here.

359
00:20:15,480 --> 00:20:18,280
Michael: I'm not sure we do need
it for partitioning.

360
00:20:18,480 --> 00:20:23,460
Like, if maybe in Timescale, the
way you define a hypertable,

361
00:20:23,460 --> 00:20:29,520
but for example, if I set up a
range partition using a ULID,

362
00:20:30,140 --> 00:20:34,300
And I could tell it, if I maybe,
maybe, yeah, I could, or maybe

363
00:20:34,300 --> 00:20:38,580
not even just prefix, maybe exact
UUID between these 2 UUIDs.

364
00:20:39,280 --> 00:20:40,640
Nikolay: Yeah, yeah, yeah, yeah,
I agree.

365
00:20:40,640 --> 00:20:41,120
I agree.

366
00:20:41,120 --> 00:20:46,080
Yes, I'm in the context of my particular
recipe with TimescaleDB.

367
00:20:46,360 --> 00:20:46,860
Yeah.

368
00:20:46,960 --> 00:20:49,780
And TimescaleDB, of course, would
like to have timestamps.

369
00:20:50,560 --> 00:20:54,520
And I created some recipe which
looked quite weird, but it worked.

370
00:20:54,520 --> 00:20:59,280
But then a couple of Timescale
guys provided good advice on how

371
00:20:59,280 --> 00:21:04,820
to use it much better, just saying
time partitioning func option

372
00:21:04,820 --> 00:21:09,100
when you create hypertable, TimescaleDB
hypertable, and that's

373
00:21:09,100 --> 00:21:09,600
it.

374
00:21:10,080 --> 00:21:15,140
This func function is our function
which converts UUID version

375
00:21:15,140 --> 00:21:16,240
7 to timestamps.

376
00:21:17,040 --> 00:21:20,040
I strongly believe this is a very
helpful function to have.

377
00:21:20,080 --> 00:21:21,680
So we need to have it.

378
00:21:21,700 --> 00:21:25,900
Even if RFC authors think we don't
need it, I think we need it.

379
00:21:25,920 --> 00:21:28,700
I hear you, we can have ranges,
yes.

380
00:21:29,340 --> 00:21:33,620
But having this function is super
helpful because I can reconstruct

381
00:21:34,540 --> 00:21:37,320
timestamps even with some not perfect
precision.

382
00:21:37,700 --> 00:21:40,460
Timestamps are 16 bytes.

383
00:21:40,960 --> 00:21:42,900
UUID version 7 is 16 bytes.

384
00:21:42,900 --> 00:21:46,820
We know not everything on those
16 bytes is related to a timestamp.

385
00:21:47,040 --> 00:21:49,500
You said like how many bytes, I
don't remember.

386
00:21:50,420 --> 00:21:52,320
Michael: More than half is random,
yeah.

387
00:21:52,660 --> 00:21:54,720
Nikolay: Yes, so we lose precision.

388
00:21:55,080 --> 00:21:57,100
But I don't need the milliseconds.

389
00:21:58,660 --> 00:21:59,240
Michael: Oh, that's

390
00:21:59,240 --> 00:22:00,120
Nikolay: a good point.

391
00:22:02,020 --> 00:22:03,020
Michael: That's a good point.

392
00:22:03,820 --> 00:22:05,540
I'm looking at the V7 spec.

393
00:22:05,540 --> 00:22:08,040
I suspect V8 with more precision.

394
00:22:09,060 --> 00:22:13,260
Nikolay: If you need better precision,
use V8 and have better

395
00:22:13,260 --> 00:22:18,560
precision paying some extra storage
costs and memory, of course,

396
00:22:18,560 --> 00:22:19,540
Michael: and buffers and

397
00:22:19,540 --> 00:22:20,240
Nikolay: so on.

398
00:22:20,820 --> 00:22:23,860
Michael: I don't think it's still
UUID format, right?

399
00:22:23,860 --> 00:22:27,600
So I don't think you do pay those
extra precision.

400
00:22:27,620 --> 00:22:29,240
But I think you get less randomness.

401
00:22:29,960 --> 00:22:30,900
Nikolay: Ah, less randomness.

402
00:22:31,880 --> 00:22:32,380
Well,

403
00:22:32,780 --> 00:22:34,820
Michael: Fewer bits for the random jump.

404
00:22:34,820 --> 00:22:36,420
Nikolay: Well, I need to look at it then.

405
00:22:36,420 --> 00:22:40,320
Yes, I only played with v7 and I decided to use it, but maybe

406
00:22:40,320 --> 00:22:43,240
I should look at v8 and understand the trade-offs.

407
00:22:43,740 --> 00:22:44,200
Yeah,

408
00:22:44,200 --> 00:22:46,180
Michael: Well, good to understand at least.

409
00:22:46,860 --> 00:22:47,780
Nikolay: Yeah, yeah, interesting.

410
00:22:48,540 --> 00:22:52,860
So anyway, you can take UUID v7 right now, generate it on application

411
00:22:53,080 --> 00:22:57,660
on using this SQL function, quite simple, and then join this

412
00:22:57,660 --> 00:23:00,720
with Timescale and have partitioning provided by TimescaleDB,

413
00:23:00,720 --> 00:23:03,100
which is like fully automatic, very good.

414
00:23:03,320 --> 00:23:04,620
And that's great, right?

415
00:23:04,820 --> 00:23:07,940
And it's sortable and it's quite efficient.

416
00:23:08,040 --> 00:23:13,040
Yes, it's 16 bytes versus 8, but for timestamps, we also had

417
00:23:13,040 --> 00:23:13,820
16 bytes.

418
00:23:14,380 --> 00:23:17,720
And I decided not to create created_at at all.

419
00:23:18,180 --> 00:23:20,060
Michael: Yeah, you said do we still need created_at?

420
00:23:20,060 --> 00:23:23,360
And then you answered yes, but I thought you meant the other

421
00:23:23,360 --> 00:23:23,740
way around.

422
00:23:23,740 --> 00:23:27,340
So you mean now we don't need created_at at all?

423
00:23:27,340 --> 00:23:29,160
Nikolay: Yeah, we can extract it and that's it.

424
00:23:29,160 --> 00:23:34,200
So I don't need super precision in my case, it's just some messages.

425
00:23:35,140 --> 00:23:37,260
And I'm fine with even second precision.

426
00:23:38,260 --> 00:23:41,960
So yeah, it's a good question which precision I get extracting

427
00:23:42,100 --> 00:23:44,280
from this UUID version set up.

428
00:23:45,140 --> 00:23:49,000
Michael: Well, there's an argument you could, if you don't care

429
00:23:49,000 --> 00:23:52,800
about the spec, you could invent your own version that only goes

430
00:23:52,800 --> 00:23:55,080
down to a second precision and gets even more random.

431
00:23:55,080 --> 00:23:57,280
You could play with that a little bit.

432
00:23:57,380 --> 00:24:00,220
Nikolay: And adjust the function and feed it to Timescale and

433
00:24:00,220 --> 00:24:00,860
so on.

434
00:24:01,200 --> 00:24:04,740
This recipe is already showing all internals, right?

435
00:24:04,900 --> 00:24:06,180
It's pure SQL.

436
00:24:06,760 --> 00:24:08,560
So we don't need to...

437
00:24:08,560 --> 00:24:13,680
We can use it in any place, in any Postgres flavor, RDS, Aurora,

438
00:24:13,840 --> 00:24:15,120
anywhere right now.

439
00:24:15,240 --> 00:24:16,220
So that's great.

440
00:24:17,020 --> 00:24:19,740
Of course, in RDS you don't have TimescaleDB, right?

441
00:24:19,900 --> 00:24:24,400
But you have a new sharding of Aurora, just released like a month

442
00:24:24,400 --> 00:24:25,300
ago or when.

443
00:24:25,680 --> 00:24:29,620
So yeah, I don't know, like I'm excited to see us shifting from

444
00:24:29,620 --> 00:24:32,780
regular numbers to these IDs, actually.

445
00:24:33,800 --> 00:24:37,120
Yeah, it's good to see some numbers.

446
00:24:37,120 --> 00:24:41,440
Let's provide some links to blog posts for those folks who want

447
00:24:41,440 --> 00:24:42,740
to explore overhead.

448
00:24:43,580 --> 00:24:47,780
Michael: Well, I think there's an interesting post by Brandur

449
00:24:48,040 --> 00:24:49,020
that I'll link to.

450
00:24:49,120 --> 00:24:50,760
Nikolay: Exactly, this is one of them.

451
00:24:50,760 --> 00:24:51,140
Michael: Yeah.

452
00:24:51,140 --> 00:24:51,640
Nice.

453
00:24:52,340 --> 00:24:57,040
But they also mentioned a couple
of downsides of ULIDs, which

454
00:24:57,040 --> 00:24:59,540
I hadn't considered that I think's
worth mentioning.

455
00:24:59,540 --> 00:25:01,820
And well, let's talk about them
quickly.

456
00:25:02,440 --> 00:25:06,500
1 is that, well, there's naturally
less randomness because we're

457
00:25:06,500 --> 00:25:07,860
taking up some of it with timestamps.

458
00:25:07,860 --> 00:25:09,100
We've talked about that 1 already.

459
00:25:09,100 --> 00:25:13,880
But the second 1 is you could end
up producing a lopsided index.

460
00:25:14,340 --> 00:25:21,060
So if you have deleted data, we
won't reuse that space, like

461
00:25:21,060 --> 00:25:22,100
sequential IDs.

462
00:25:22,640 --> 00:25:26,880
But in other UUID implementations,
because they're more random,

463
00:25:26,880 --> 00:25:28,360
you would reuse that space.

464
00:25:28,380 --> 00:25:32,640
So we've got kind of a different
bloat problem, except if we

465
00:25:32,640 --> 00:25:33,340
partition, right?

466
00:25:33,340 --> 00:25:37,200
If we're partitioning and if we've
got index maintenance, if

467
00:25:37,200 --> 00:25:40,080
we're dropping our partitions over
time, we'll naturally get

468
00:25:40,080 --> 00:25:40,760
rid of that bloat.

469
00:25:40,760 --> 00:25:43,720
But I thought it was a really good
point that there are some

470
00:25:43,980 --> 00:25:48,340
downsides to this always increasing
idea as well.

471
00:25:48,800 --> 00:25:51,900
But they mentioned at the end of
the blog post that they expected

472
00:25:51,900 --> 00:25:56,480
to go into the investigation very
much on the side of using big

473
00:25:56,480 --> 00:25:58,300
serial or big ints everywhere.

474
00:25:58,400 --> 00:26:02,480
And actually they ended up thinking
these ULIDs or they called

475
00:26:02,480 --> 00:26:06,280
them performance aware or intelligent
UUIDs, which I thought

476
00:26:06,280 --> 00:26:07,820
was quite a nice phrase.

477
00:26:09,480 --> 00:26:11,540
Was their favored approach now?

478
00:26:12,040 --> 00:26:13,140
Nikolay: I'm very sorry.

479
00:26:13,140 --> 00:26:17,800
You said in the end of his blog
posts, and I see pictures of

480
00:26:17,800 --> 00:26:18,340
the walk.

481
00:26:18,340 --> 00:26:20,420
I don't know how it's related at
all.

482
00:26:21,420 --> 00:26:22,100
No, no,

483
00:26:22,420 --> 00:26:23,560
Michael: A bit above that.

484
00:26:23,560 --> 00:26:24,480
Just at the end

485
00:26:24,480 --> 00:26:25,280
Nikolay: of the title.

486
00:26:25,680 --> 00:26:26,680
I've got distracted.

487
00:26:26,840 --> 00:26:31,100
First of all, very good title of
the blog post, identity crisis.

488
00:26:32,900 --> 00:26:34,060
So ID crisis.

489
00:26:34,540 --> 00:26:39,100
But yeah, these pictures of the San
Francisco Bay Area and the walk down

490
00:26:39,100 --> 00:26:39,740
to Pacifica.

491
00:26:40,200 --> 00:26:46,940
I know Brando already heard 1 of
our episodes and highlighted

492
00:26:47,080 --> 00:26:51,860
my phrase on Twitter when I was
saying, very good blog post,

493
00:26:52,340 --> 00:26:53,540
but completely wrong.

494
00:26:54,140 --> 00:26:55,120
Something like this.

495
00:26:55,320 --> 00:26:56,880
Not completely, right, okay.

496
00:26:56,980 --> 00:26:58,300
And I was right, actually.

497
00:26:58,380 --> 00:27:00,200
I like speaking from experience.

498
00:27:00,660 --> 00:27:05,180
And now I must say, you should
not put very good pictures of

499
00:27:05,640 --> 00:27:06,880
101 highway.

500
00:27:07,840 --> 00:27:09,220
Because it's super distracting.

501
00:27:10,080 --> 00:27:11,760
Michael: Yeah, I actually remember
that quote.

502
00:27:11,760 --> 00:27:14,640
I think it was I think the words
you used were something along

503
00:27:14,640 --> 00:27:17,700
the lines of great post 1 conclusion.

504
00:27:18,340 --> 00:27:19,360
Nikolay: Ah, yes, exactly.

505
00:27:19,360 --> 00:27:19,660
Right.

506
00:27:19,660 --> 00:27:19,980
Right.

507
00:27:19,980 --> 00:27:20,980
And then

508
00:27:20,980 --> 00:27:23,220
Michael: that must be I think there
must be a second one as well,

509
00:27:23,220 --> 00:27:25,920
because I think at Crunchy Data,
credit to them, they listened

510
00:27:25,920 --> 00:27:30,860
to the one where we were talking
about random page cost and how

511
00:27:30,860 --> 00:27:34,180
it was still 4, but they did some
benchmarking and reduced it,

512
00:27:34,180 --> 00:27:38,440
I think, to 1.1, which is a much
more SSD-friendly number.

513
00:27:38,900 --> 00:27:39,800
So maybe they listened

514
00:27:39,800 --> 00:27:40,440
Nikolay: to 2.0.

515
00:27:40,440 --> 00:27:44,620
And also, yeah, I also learned
from this session, which was full

516
00:27:44,620 --> 00:27:51,040
of, I also learned that 1.0 probably
is not better than 1.1,

517
00:27:51,280 --> 00:27:53,260
which is interesting because they
had numbers.

518
00:27:53,920 --> 00:27:58,200
Yeah, they had some numbers proving
that 1.1 is better than 1.0.

519
00:27:58,520 --> 00:27:59,960
So this is super interesting.

520
00:27:59,960 --> 00:28:03,260
Next time I touch this area, I
will pay attention to their numbers.

521
00:28:03,520 --> 00:28:04,020
Michael: Nice.

522
00:28:04,340 --> 00:28:05,320
Nikolay: Yeah, that's good.

523
00:28:06,000 --> 00:28:08,420
Michael: Were there any other posts
or things that you've seen

524
00:28:08,420 --> 00:28:10,820
that you wanted to draw people's
attention to?

525
00:28:11,280 --> 00:28:11,980
Nikolay: I don't know.

526
00:28:11,980 --> 00:28:13,680
There are many materials about
it.

527
00:28:13,680 --> 00:28:16,120
I don't remember particular ones.

528
00:28:16,320 --> 00:28:19,200
Let's just put something in the show
notes.

529
00:28:19,200 --> 00:28:25,780
It's an interesting topic, I think
very important for maybe everyone.

530
00:28:26,400 --> 00:28:26,900
Michael: Yeah.

531
00:28:28,260 --> 00:28:31,320
There's an outdated one by Christophe
Pettis that I thought was

532
00:28:31,320 --> 00:28:34,940
quite good, but then a more recent
one by them as well, suggesting

533
00:28:34,940 --> 00:28:39,740
that we think about it in two steps,
the age-old UUID versus serial

534
00:28:39,860 --> 00:28:40,360
question.

535
00:28:40,600 --> 00:28:44,440
They suggested thinking, firstly,
should our keys be random or

536
00:28:44,440 --> 00:28:45,460
should they be sequential?

537
00:28:45,720 --> 00:28:48,340
That's a very good point.

538
00:28:48,340 --> 00:28:49,640
And do you want them to be guessable?

539
00:28:49,640 --> 00:28:51,080
Do you want to be able to infer?

540
00:28:51,220 --> 00:28:54,000
Actually, we didn't talk about this,
but if you're using ULIDs

541
00:28:54,480 --> 00:28:57,660
and someone can tell their ULIDs,
they also get a little bit

542
00:28:57,660 --> 00:28:58,560
of extra information.

543
00:28:58,620 --> 00:29:00,980
They can tell when this ID was
created.

544
00:29:01,080 --> 00:29:03,020
Is that a problem for you or not?

545
00:29:03,340 --> 00:29:04,960
These are the questions you need
to ask yourself.

546
00:29:04,960 --> 00:29:05,740
So that's number one.

547
00:29:05,740 --> 00:29:07,460
Should they be random or should
they be sequential?

548
00:29:07,580 --> 00:29:09,360
Nikolay: Do you want to hide it,
right?

549
00:29:10,040 --> 00:29:10,540
Michael: Yeah.

550
00:29:10,640 --> 00:29:15,220
And then the second one is, should
they be 64 bits or should they be

551
00:29:15,220 --> 00:29:16,400
be larger than that?

552
00:29:16,400 --> 00:29:20,140
And that's a second separate question.

553
00:29:20,740 --> 00:29:21,660
But I think Christoph might be

554
00:29:21,660 --> 00:29:21,780
Nikolay: really good.

555
00:29:21,780 --> 00:29:24,680
I want them to be shorter, not
larger.

556
00:29:26,060 --> 00:29:26,560
Michael: Okay.

557
00:29:27,160 --> 00:29:28,640
But sorry, bits.

558
00:29:28,780 --> 00:29:29,780
Did I say bytes?

559
00:29:31,020 --> 00:29:32,340
I don't remember.

560
00:29:32,460 --> 00:29:36,160
Anyway, the point is how much randomness,
like how much entropy

561
00:29:36,160 --> 00:29:36,940
do you need?

562
00:29:37,080 --> 00:29:39,780
Nikolay: And that's collision risks,
basically.

563
00:29:39,780 --> 00:29:40,200
Michael: Yeah.

564
00:29:40,200 --> 00:29:42,800
But these 2 questions, they're
separate questions.

565
00:29:42,800 --> 00:29:45,560
And Michael made a really good
point that often they get conflated

566
00:29:45,720 --> 00:29:49,540
in these arguments between like
people that argue for UUIDs are

567
00:29:49,540 --> 00:29:52,780
arguing along 1 of these questions
and people that argue for

568
00:29:52,840 --> 00:29:56,040
bigints are just completely ignoring
that question and going

569
00:29:56,040 --> 00:29:57,540
down a completely different question.

570
00:29:57,880 --> 00:30:01,260
So I think asking both of those
questions is important before

571
00:30:01,260 --> 00:30:02,220
picking your ID.

572
00:30:02,500 --> 00:30:09,560
Nikolay: We could handle collisions
maybe in certain conflict

573
00:30:10,260 --> 00:30:15,840
and just adjusting last bits slightly
on conflict.

574
00:30:16,500 --> 00:30:18,080
Yeah, well, I'm not sure.

575
00:30:18,080 --> 00:30:20,040
I'm just thinking like out of cloud.

576
00:30:21,660 --> 00:30:23,800
Michael: But yeah, in terms of
likelihood of collisions, I actually

577
00:30:23,800 --> 00:30:27,780
haven't done the math to how many,
like how unlikely they are.

578
00:30:27,780 --> 00:30:31,160
But I think they're pretty minuscule
likelihood, at least in

579
00:30:31,160 --> 00:30:31,640
most of the deployments.

580
00:30:31,640 --> 00:30:35,200
Nikolay: It depends on the volumes
you're trying to ingest in

581
00:30:35,200 --> 00:30:36,040
your database.

582
00:30:36,980 --> 00:30:38,040
Michael: Yeah, of course.

583
00:30:38,400 --> 00:30:38,900
Nikolay: Yeah.

584
00:30:39,320 --> 00:30:43,780
But anyway, I think next time we
build a new system we need to

585
00:30:43,780 --> 00:30:47,440
think should we use these like
int8 surrogate keys at all

586
00:30:47,440 --> 00:30:52,220
Or maybe it's time for UUID version
7, version 8 with partitioning?

587
00:30:53,800 --> 00:30:56,600
Actually, my response to my team
was, partitioning?

588
00:30:57,080 --> 00:31:01,200
Come on, we know how to handle
a billion rows easily in 1 physical

589
00:31:01,200 --> 00:31:01,560
table.

590
00:31:01,560 --> 00:31:03,420
Let's just grow it until a billion.

591
00:31:03,420 --> 00:31:05,440
And I said, no, no, no, let's do
it.

592
00:31:05,640 --> 00:31:06,840
Like, it's so easy.

593
00:31:06,900 --> 00:31:09,840
Like, let's just do it normally,
in a normal way.

594
00:31:10,320 --> 00:31:11,260
Okay, okay.

595
00:31:11,280 --> 00:31:12,220
We have partitioning.

596
00:31:12,380 --> 00:31:14,840
We have UUID version 7.

597
00:31:15,140 --> 00:31:16,660
A modern approach, you know.

598
00:31:17,600 --> 00:31:19,360
Michael: Well, last question.

599
00:31:19,360 --> 00:31:20,500
Last question from me.

600
00:31:20,500 --> 00:31:22,660
What chunks interval did you go
for?

601
00:31:23,860 --> 00:31:26,260
Nikolay: Oh, that's a good question,
I actually need to check.

602
00:31:26,260 --> 00:31:30,120
Because I remember for development,
I think I used for experimentation.

603
00:31:30,240 --> 00:31:32,940
I used 1-hour chunks, very small,
like tiny.

604
00:31:33,340 --> 00:31:36,300
Michael: So, even smaller in your
test setup?

605
00:31:36,820 --> 00:31:39,640
Nikolay: Well, maybe a minute, just
for testing, but eventually,

606
00:31:39,680 --> 00:31:43,420
I think it will be, I don't know,
days or weeks, but it will

607
00:31:43,420 --> 00:31:44,380
be smaller chunks.

608
00:31:44,380 --> 00:31:46,220
TimescaleDB is good with small
chunks.

609
00:31:46,480 --> 00:31:47,740
A lot of small chunks.

610
00:31:48,440 --> 00:31:49,100
Michael: Oh really?

611
00:31:49,740 --> 00:31:50,520
Nikolay: Not months.

612
00:31:50,840 --> 00:31:52,380
Well, maybe months actually.

613
00:31:52,760 --> 00:31:53,760
It depends, actually.

614
00:31:53,760 --> 00:31:54,220
It depends.

615
00:31:54,220 --> 00:31:55,080
It's a good question.

616
00:31:55,080 --> 00:31:56,540
I need to reconsider it.

617
00:31:56,680 --> 00:31:59,900
This I need to finalize, but we
can adjust it over time as well,

618
00:31:59,900 --> 00:32:00,480
I think.

619
00:32:00,480 --> 00:32:02,800
Michael: Yeah, I wouldn't be surprised
if you start off much

620
00:32:02,800 --> 00:32:06,040
longer and then go smaller and
smaller.

621
00:32:06,140 --> 00:32:06,840
Nikolay: Makes sense.

622
00:32:07,660 --> 00:32:08,360
It depends.

623
00:32:08,380 --> 00:32:12,100
If you want to test it in advance
how it works with a big number

624
00:32:12,100 --> 00:32:16,700
of partitions, chunks of hypertable,
you probably want to go

625
00:32:16,700 --> 00:32:18,580
smaller and find problems earlier.

626
00:32:20,200 --> 00:32:22,940
It's a problem when you want to
go deeper, right?

627
00:32:23,100 --> 00:32:26,480
But if you build a normal system,
of course, it makes sense I

628
00:32:26,480 --> 00:32:29,860
think the rule of thumb is, it's
not about Timescale.

629
00:32:29,860 --> 00:32:35,140
Rule of thumb is, except Timescale,
partitioning for all tables

630
00:32:35,140 --> 00:32:39,620
which exceed 100 gigabytes, maybe
actually 10 gigabytes.

631
00:32:40,760 --> 00:32:44,760
This rule of thumb, so to speak,
was raised by several folks.

632
00:32:44,760 --> 00:32:46,420
I listen to them very well.

633
00:32:47,960 --> 00:32:53,240
Their words matter to me in a lot
of sense.

634
00:32:53,500 --> 00:32:56,060
So why 100 gigabytes?

635
00:32:56,280 --> 00:32:57,040
Why 100 gigabytes?

636
00:32:57,040 --> 00:32:57,440
Why?

637
00:32:57,440 --> 00:33:01,080
It's just like, okay, it's some
empirical rule based on, for

638
00:33:01,080 --> 00:33:04,240
example, Alexander Kukushkin said,
based on Zalando experience,

639
00:33:04,440 --> 00:33:05,960
those 100 gigabytes.

640
00:33:06,340 --> 00:33:10,760
But then I realized I can build
some theoretical basis and find

641
00:33:10,760 --> 00:33:12,280
a better threshold.

642
00:33:12,880 --> 00:33:17,980
So, theoretical basis is how many
transaction IDs consuming transactions

643
00:33:17,980 --> 00:33:19,660
per second you have.

644
00:33:20,380 --> 00:33:21,960
For example, 100, right?

645
00:33:21,960 --> 00:33:24,300
So how fast your move consuming
exceeds.

646
00:33:26,600 --> 00:33:32,220
And then how long does it take
for Autovacuum to process a single

647
00:33:32,220 --> 00:33:32,660
table?

648
00:33:32,660 --> 00:33:36,760
And also how long does it take
to create an index on the largest

649
00:33:36,760 --> 00:33:37,260
tables.

650
00:33:37,720 --> 00:33:42,660
Because when you create an index,
you hold a snapshot horizon.

651
00:33:43,180 --> 00:33:47,980
If it takes many hours, you have,
for example, 5 hours to build

652
00:33:47,980 --> 00:33:50,740
an index on a 5 terabyte table.

653
00:33:50,740 --> 00:33:53,800
For example, it's very arbitrary
numbers, right?

654
00:33:53,800 --> 00:33:57,160
During which, autovacuum cannot
delete the tuples from any table

655
00:33:57,160 --> 00:33:59,640
in your cluster, in your database.

656
00:34:00,040 --> 00:34:05,240
And it means that if you have a
lot of write growth during this,

657
00:34:05,320 --> 00:34:06,900
you insert a lot of...

658
00:34:08,460 --> 00:34:11,920
So you can start comparing these
numbers and understand, this

659
00:34:11,920 --> 00:34:12,420
is...

660
00:34:12,540 --> 00:34:19,540
Okay, I need not to go more than
like 100,000 writes to be spent

661
00:34:19,540 --> 00:34:20,660
during index creation.

662
00:34:21,040 --> 00:34:24,440
And if you split your physical
table into smaller physical tables,

663
00:34:24,440 --> 00:34:27,940
partitions, or chunks in the sense
of Timescale, index creation

664
00:34:27,940 --> 00:34:28,760
becomes faster.

665
00:34:30,060 --> 00:34:33,760
And autovacuum is blocked for a
smaller period of time.

666
00:34:33,940 --> 00:34:37,800
Michael: But a couple of other
things, like a couple of things

667
00:34:37,800 --> 00:34:44,280
I've heard being used for this
are size of cache, like how many

668
00:34:44,280 --> 00:34:47,360
chunks do you want to be in memory?

669
00:34:47,780 --> 00:34:53,940
Like if your memory is smaller
than your most recent chunk size,

670
00:34:54,720 --> 00:34:56,700
that might not be optimal for performance.

671
00:34:57,180 --> 00:34:59,520
You might be better off with smaller
chunks.

672
00:35:00,560 --> 00:35:03,660
Some of the more recent ones are
more likely to be in memory.

673
00:35:04,280 --> 00:35:07,520
And then and then another one.

674
00:35:07,900 --> 00:35:11,020
Nikolay: It's but it's very rough,
rough reasoning.

675
00:35:11,380 --> 00:35:11,720
I would

676
00:35:11,720 --> 00:35:13,940
Michael: say yes, but it's the
other direction, right?

677
00:35:13,940 --> 00:35:17,120
It's like, one of them is encouraging
you to do it earlier and

678
00:35:17,120 --> 00:35:17,620
earlier.

679
00:35:18,400 --> 00:35:19,940
That one is in the same direction,
isn't it?

680
00:35:19,940 --> 00:35:21,400
It's like earlier is better.

681
00:35:21,420 --> 00:35:22,780
But you can go too far.

682
00:35:22,780 --> 00:35:23,980
You could have too many.

683
00:35:24,020 --> 00:35:26,980
Let's say you want to regularly
query the last day, but that's

684
00:35:26,980 --> 00:35:30,920
going to involve querying 24 partitions.

685
00:35:31,440 --> 00:35:34,640
We've already talked in the past
about problems when you query

686
00:35:34,640 --> 00:35:37,320
too many relations at the same
time.

687
00:35:37,360 --> 00:35:41,120
Nikolay: Yes, planning time and
execution time and lock manager

688
00:35:41,120 --> 00:35:41,580
contention.

689
00:35:41,580 --> 00:35:45,060
A lot of stuff happens if you have
a lot of partitions, and especially

690
00:35:45,060 --> 00:35:46,980
if each partition has a lot of
indexes.

691
00:35:47,260 --> 00:35:50,380
So yeah, many, many things to,
there's a trade-off obviously

692
00:35:50,380 --> 00:35:53,500
here, but Timescale is very good
with a large number of partitions.

693
00:35:53,602 --> 00:35:54,820
Michael: Okay, good.

694
00:35:55,840 --> 00:35:59,560
Nikolay: Yeah, so we can go with
daily partitions, it's okay.

695
00:36:00,040 --> 00:36:05,200
And we have cases, my colleagues
have observed some cases where

696
00:36:05,200 --> 00:36:09,400
we have dozens of terabytes of
data with daily partitions, very

697
00:36:09,400 --> 00:36:12,340
small, maybe not even daily, I
don't remember details, maybe

698
00:36:12,340 --> 00:36:17,420
some like 8 hours partitions, ingesting
like a lot per second.

699
00:36:18,400 --> 00:36:19,840
Michael: Did they change the...

700
00:36:20,140 --> 00:36:24,000
Remember when we talked about the
lock manager issues with not...

701
00:36:24,000 --> 00:36:27,140
When you don't have partition pruning
and lots of indexes on

702
00:36:27,140 --> 00:36:27,840
each partition?

703
00:36:28,780 --> 00:36:29,540
Did Timescale...

704
00:36:29,860 --> 00:36:32,960
Like presumably that hard limit
is still there in Timescale as

705
00:36:32,960 --> 00:36:33,400
well.

706
00:36:33,400 --> 00:36:33,900
16.

707
00:36:34,280 --> 00:36:34,780
Yeah.

708
00:36:34,820 --> 00:36:38,040
So you say it's really good with
lots of partitions, but if you're

709
00:36:38,040 --> 00:36:41,920
querying too many of them at the
same time, like if we're doing

710
00:36:41,920 --> 00:36:47,740
monthly reports against 31 or however
many days, we're going

711
00:36:47,740 --> 00:36:49,500
to bump into those limits, right?

712
00:36:49,640 --> 00:36:52,620
Nikolay: Definitely but a monthly
report is just a single query.

713
00:36:52,800 --> 00:36:55,940
The problem is when you have thousands
of such queries per second,

714
00:36:56,340 --> 00:37:00,660
then they start competing and the
log manager, lightweight log

715
00:37:00,660 --> 00:37:01,960
contention is happening.

716
00:37:01,960 --> 00:37:04,140
If it's some not frequent...

717
00:37:04,860 --> 00:37:09,180
So, second question to check is
how many QPS you have.

718
00:37:09,380 --> 00:37:10,140
Michael: Yeah, yeah.

719
00:37:10,520 --> 00:37:12,880
So if it's like dashboards or something.

720
00:37:12,900 --> 00:37:13,860
If it's monthly reports,

721
00:37:13,860 --> 00:37:14,780
Nikolay: you're probably fine.

722
00:37:15,040 --> 00:37:19,620
Nobody says we cannot have more
than 15 indexes on a table, achieving

723
00:37:19,640 --> 00:37:24,060
16, this fast path locking threshold.

724
00:37:24,640 --> 00:37:26,980
It's okay to exceed it.

725
00:37:27,040 --> 00:37:31,420
The only problem when you need
to exceed it a thousand times

726
00:37:31,420 --> 00:37:32,040
per second.

727
00:37:32,040 --> 00:37:32,540
Yeah.

728
00:37:34,600 --> 00:37:35,100
Michael: Cool.

729
00:37:35,320 --> 00:37:35,820
Right.

730
00:37:35,920 --> 00:37:37,520
Any last words on this one?

731
00:37:38,260 --> 00:37:39,880
Nikolay: No, I think it was good.

732
00:37:41,380 --> 00:37:44,280
A lot of things to understand and
play with.

733
00:37:44,280 --> 00:37:48,600
And I think I encourage folks to
look at these new versions of

734
00:37:48,600 --> 00:37:50,100
UUID if not yet.

735
00:37:52,800 --> 00:37:56,920
Does it matter where we generate
it, on application or on database?

736
00:37:58,260 --> 00:38:00,600
Michael: I think the only risk
is time zones.

737
00:38:01,320 --> 00:38:02,720
Nikolay: Clocks can be off.

738
00:38:02,900 --> 00:38:03,420
Michael: Yeah, true.

739
00:38:03,420 --> 00:38:05,860
If they're off by seconds, then
it will be going to different

740
00:38:05,860 --> 00:38:06,820
database pages.

741
00:38:07,280 --> 00:38:08,160
Yeah, it won't be

742
00:38:08,160 --> 00:38:08,200
Nikolay: as bad.

743
00:38:08,200 --> 00:38:10,460
And also, order will be broken.

744
00:38:11,360 --> 00:38:13,300
Michael: Yeah, so I guess it matters
a bit.

745
00:38:13,820 --> 00:38:17,440
Nikolay: Yeah, I prefer generated
on Postgres database side.

746
00:38:17,440 --> 00:38:23,160
But interesting question, pros
and cons of database versus application

747
00:38:23,160 --> 00:38:23,660
side.

748
00:38:24,180 --> 00:38:28,220
Anyway, I think Postgres eventually
will have UUID version 7.

749
00:38:28,520 --> 00:38:32,480
Not sure for 17, but it should
be for 18 at least.

750
00:38:32,660 --> 00:38:34,780
Only a couple of years of waiting.

751
00:38:34,840 --> 00:38:40,080
But we don't need to wait, and
we know now how to use both this

752
00:38:40,080 --> 00:38:45,040
new type of UUID and partitioning,
TimescaleDB or any other.

753
00:38:45,740 --> 00:38:46,840
Yeah, good?

754
00:38:47,740 --> 00:38:48,480
Michael: Nice one.

755
00:38:48,720 --> 00:38:51,080
Well, thank you, Nikolay, and catch
you next week.

756
00:38:51,083 --> 00:38:51,981
Nikolay: Thank you.

757
00:38:52,175 --> 00:38:52,425
Bye.