1
00:00:00,060 --> 00:00:02,800
Michael: Hello and welcome to
PostgresFM, a weekly show about

2
00:00:02,800 --> 00:00:03,740
all things PostgreSQL.

3
00:00:04,000 --> 00:00:05,800
I am Michael, founder of pgMustard.

4
00:00:05,800 --> 00:00:09,000
And today I am joined by Markus
Winand, creator of the popular

5
00:00:09,000 --> 00:00:12,860
sites, 'Use the index, Luke,' Modern
SQL, author of SQL Performance

6
00:00:12,860 --> 00:00:13,360
Explained.

7
00:00:13,780 --> 00:00:15,260
Performance trainer and consultant.

8
00:00:15,300 --> 00:00:19,040
And according to your LinkedIn,
SQL Renaissance ambassador.

9
00:00:19,620 --> 00:00:21,400
It's a pleasure to have you here
today, Markus.

10
00:00:21,400 --> 00:00:22,580
Thank you for joining us.

11
00:00:22,580 --> 00:00:23,480
Markus: Yes, thank you.

12
00:00:23,480 --> 00:00:23,980
Michael: Awesome.

13
00:00:24,060 --> 00:00:28,240
Well, I suggested the topic this
week, but it is very much your

14
00:00:28,240 --> 00:00:28,740
topic.

15
00:00:28,820 --> 00:00:30,320
It's modern SQL.

16
00:00:30,560 --> 00:00:34,760
And I'd love for you to give us an
intro into what you mean by modern

17
00:00:34,760 --> 00:00:39,640
SQL and also how you came to be
such an ambassador for it.

18
00:00:39,800 --> 00:00:43,780
Markus: So, it basically goes back
when I realized that many,

19
00:00:43,780 --> 00:00:49,280
many people think that the old
SQL, so what is more technically

20
00:00:49,400 --> 00:00:53,500
described as SQL-92, is pretty
much the only SQL.

21
00:00:54,340 --> 00:00:57,780
Like we have joins, yeah, we have
them for decades and people

22
00:00:57,780 --> 00:01:01,220
think that's it because that's
what they learned like decades

23
00:01:01,240 --> 00:01:02,040
ago and so on.

24
00:01:02,040 --> 00:01:05,060
And they didn't realize that the
story didn't stop there.

25
00:01:05,060 --> 00:01:06,020
It went on.

26
00:01:06,420 --> 00:01:10,520
And we got what I refer to as more
modern features, which are

27
00:01:10,520 --> 00:01:15,580
not rooted in the relational idea,
like window functions or recursive

28
00:01:15,700 --> 00:01:19,400
queries, or more modernly like
JSON support, document support

29
00:01:19,400 --> 00:01:19,760
at all.

30
00:01:19,760 --> 00:01:21,900
We have XML for quite a while in
SQL.

31
00:01:22,200 --> 00:01:27,540
This is what I kindly refer to
as, yeah, well, modern SQL, as

32
00:01:27,540 --> 00:01:31,860
opposed to the grandparents' SQL,
everybody has learned like decades

33
00:01:31,860 --> 00:01:32,360
ago.

34
00:01:32,540 --> 00:01:35,380
Michael: So do you see this more
as an issue with people that

35
00:01:35,380 --> 00:01:39,900
have been around for a few decades
and got stuck in their ways?

36
00:01:39,900 --> 00:01:43,040
Or do you see it also with new
folks coming along and getting

37
00:01:43,040 --> 00:01:46,920
taught by those people or the lessons
they're learning only include

38
00:01:46,920 --> 00:01:48,900
the most basic, rudimentary SQL?

39
00:01:50,820 --> 00:01:53,600
Markus: I would draw the line a
little bit differently there.

40
00:01:53,880 --> 00:02:01,160
I think the problem is more related
to ORM users and SQL users.

41
00:02:01,860 --> 00:02:05,580
So you see SQL users, there are
also 2 kinds of users, those

42
00:02:05,600 --> 00:02:08,320
who just learned it once and never
got an update.

43
00:02:08,320 --> 00:02:09,520
That definitely exists.

44
00:02:09,940 --> 00:02:15,520
But I think the most problematic
area is actually in the ORM

45
00:02:15,520 --> 00:02:18,960
world or rooted in the ORM world,
where the people just really

46
00:02:18,960 --> 00:02:22,260
think because the ORM tool is doing
the best because well, they

47
00:02:22,260 --> 00:02:26,880
are specialized in using SQL, all
sorts. The ORM tool.

48
00:02:26,940 --> 00:02:29,100
So this is probably the best you
can do.

49
00:02:29,340 --> 00:02:32,280
But that's unfortunately totally
wrong.

50
00:02:32,860 --> 00:02:39,060
So the whole idea that SQL is just
about relational things is

51
00:02:39,060 --> 00:02:39,840
just outdated.

52
00:02:40,120 --> 00:02:46,960
That was right in '92 but it was
already wrong in '99 and somehow

53
00:02:47,540 --> 00:02:50,100
all the people managed to miss
that message.

54
00:02:50,380 --> 00:02:52,280
They didn't get the memo, so to
say.

55
00:02:53,180 --> 00:02:55,240
Michael: Yeah, that's such an interesting
point.

56
00:02:55,240 --> 00:02:57,660
I hadn't put 2 and 2 together with...

57
00:02:57,880 --> 00:03:00,820
I've listened to your talks before
on this around the change

58
00:03:00,820 --> 00:03:05,020
from being relational only to being
able to handle other data

59
00:03:05,020 --> 00:03:06,520
types or other data models.

60
00:03:06,900 --> 00:03:10,900
But the R in ORM, is that what
you mean by the relational part

61
00:03:10,900 --> 00:03:11,600
of it?

62
00:03:11,740 --> 00:03:15,320
Markus: Yeah, also the whole idea
that the database is just for

63
00:03:15,320 --> 00:03:15,820
persistency.

64
00:03:17,800 --> 00:03:22,340
Look at Java, JPA, the Java Persistence
API.

65
00:03:22,660 --> 00:03:26,020
Just the idea that the database
just stores data.

66
00:03:26,780 --> 00:03:28,520
That's a wrong idea for SQL.

67
00:03:28,820 --> 00:03:32,080
It might be a right idea for some
other systems.

68
00:03:32,860 --> 00:03:38,040
But for SQL, the fundamental idea
of SQL is that we have a separation

69
00:03:38,300 --> 00:03:42,880
of the persistent data layout and
the transient data layout that

70
00:03:42,880 --> 00:03:45,800
we need for 1 query right now.

71
00:03:46,860 --> 00:03:51,060
Because 1 other thing that I like
to put emphasis on is that

72
00:03:51,060 --> 00:03:52,980
SQL is not a query language.

73
00:03:54,340 --> 00:03:57,420
SQL is more like a transformation
language.

74
00:03:58,500 --> 00:04:02,620
If you look at query languages,
Go to any web page, open the

75
00:04:02,620 --> 00:04:05,040
CSS and look at the selectors of
CSS.

76
00:04:05,500 --> 00:04:06,920
This is a query language.

77
00:04:07,360 --> 00:04:11,180
A query language can pick out 1
small piece of something larger,

78
00:04:11,600 --> 00:04:12,680
and that's it.

79
00:04:13,180 --> 00:04:14,600
But now look at SQL.

80
00:04:15,060 --> 00:04:18,840
With SQL, we are most of the time
focusing on transforming data,

81
00:04:19,340 --> 00:04:23,160
transforming the persistent data
we have stored on the disk to

82
00:04:23,160 --> 00:04:27,560
last forever, more or less, to
answer a question we have at hand

83
00:04:27,560 --> 00:04:28,340
right now.

84
00:04:28,480 --> 00:04:28,820
Michael: Yeah.

85
00:04:28,820 --> 00:04:30,600
Markus: That might be a different
question tomorrow.

86
00:04:30,600 --> 00:04:32,580
It's most certainly a different
question.

87
00:04:32,620 --> 00:04:36,600
And the whole idea of SQL is that
we normalize, yeah, there is

88
00:04:36,600 --> 00:04:39,880
this bad word, normalization, we
normalize the data persistently

89
00:04:40,120 --> 00:04:46,500
on the disk so that the shape of
the data does not change often.

90
00:04:47,400 --> 00:04:50,020
Yeah, the shape not, the data itself,
yeah, the contents of the

91
00:04:50,020 --> 00:04:51,100
table, but not the shape.

92
00:04:51,100 --> 00:04:51,920
Not so often.

93
00:04:52,340 --> 00:04:56,140
Now compare how static is the on-disk
schema, if you have probably

94
00:04:56,140 --> 00:04:59,640
normalized it, compared to the
questions you try to answer with

95
00:04:59,640 --> 00:05:00,400
those data.

96
00:05:00,620 --> 00:05:03,960
You get every day new questions
from marketing or from reporting

97
00:05:03,960 --> 00:05:04,780
or from whatever.

98
00:05:05,660 --> 00:05:08,800
And the idea of SQL is to have
a flexible transformation between

99
00:05:08,800 --> 00:05:12,540
the rather static schema you have
on disk, because once you have

100
00:05:12,540 --> 00:05:15,640
collected terabytes of data on
disk, then it's hard to change

101
00:05:16,580 --> 00:05:18,180
The model you have there.

102
00:05:18,180 --> 00:05:23,040
No matter how you do it, it will
be hard, compared to the dynamic

103
00:05:23,080 --> 00:05:24,700
we have on the question side.

104
00:05:25,580 --> 00:05:29,720
SQL is the glue between these two
layers.

105
00:05:30,060 --> 00:05:32,800
Michael: One thing you mentioned
there is normalization being a

106
00:05:32,800 --> 00:05:33,580
bad word.

107
00:05:33,900 --> 00:05:35,860
I'm curious what you meant by that.

108
00:05:36,140 --> 00:05:39,120
Markus: Yeah, with normalization
we have all of us have learned

109
00:05:39,120 --> 00:05:39,940
these levels.

110
00:05:40,360 --> 00:05:42,940
First Normal Form, second normal
form, third normal.

111
00:05:43,260 --> 00:05:46,480
Okay, of course, that's totally
correct and useful.

112
00:05:46,840 --> 00:05:50,720
But to another extent, for the
layman people, just causing more

113
00:05:50,720 --> 00:05:52,320
confusion than it is helpful.

114
00:05:53,100 --> 00:05:57,100
What I like to explain normalization
is just, okay, don't focus

115
00:05:57,100 --> 00:05:59,680
on the third normal or on any of
those.

116
00:06:00,180 --> 00:06:04,660
Just try to figure out in which
tables would the data feel comfortable.

117
00:06:06,180 --> 00:06:09,600
And naturally there are some things
like if you have a one-to-many

118
00:06:09,880 --> 00:06:13,300
relationship, then it needs to
be two tables in SQL.

119
00:06:14,180 --> 00:06:16,400
Just keep it on that level.

120
00:06:16,400 --> 00:06:20,580
Don't go into the scientific definitions
of all of these six normalization

121
00:06:20,680 --> 00:06:21,880
levels and some extras.

122
00:06:22,240 --> 00:06:26,000
Try to keep it simple and you will
still get the gain from the

123
00:06:26,000 --> 00:06:26,500
normalization.

124
00:06:27,080 --> 00:06:30,040
So that you have a rather static
schema on disk.

125
00:06:31,120 --> 00:06:34,300
Michael: I really like in what
tables would the data feel comfortable.

126
00:06:34,660 --> 00:06:35,900
I'm going to steal that.

127
00:06:35,900 --> 00:06:37,320
I think that should be on T-shirts.

128
00:06:37,540 --> 00:06:39,220
If you make them, I'll buy them.

129
00:06:40,360 --> 00:06:40,840
Awesome.

130
00:06:40,840 --> 00:06:44,380
Right back to the modern SQL topic
a little bit more.

131
00:06:44,380 --> 00:06:47,980
You mentioned that a lot of people
are stuck a few decades ago

132
00:06:47,980 --> 00:06:49,680
with very basic SQL.

133
00:06:50,500 --> 00:06:53,560
The counter argument might be,
well, they're solving their problems.

134
00:06:53,560 --> 00:06:55,060
They can do a lot of things.

135
00:06:55,620 --> 00:06:59,680
What are the big downsides or limitations
for those folks who

136
00:06:59,700 --> 00:07:02,580
haven't explored some of the newer
features that have been added.

137
00:07:04,540 --> 00:07:05,640
Markus: I'll give you an example.

138
00:07:06,780 --> 00:07:11,400
With SQL '92, it was quite common
to do self-joins.

139
00:07:12,560 --> 00:07:16,320
It was required to do self-joins
because some problems could

140
00:07:16,320 --> 00:07:18,460
only be solved by using a self-join.

141
00:07:19,900 --> 00:07:22,180
But self-joins are troublesome
in two ways.

142
00:07:22,360 --> 00:07:23,940
First of all, they are complex.

143
00:07:24,620 --> 00:07:27,900
And second of all, they are slow,
because they are touching every

144
00:07:27,900 --> 00:07:29,020
data item twice.

145
00:07:30,300 --> 00:07:30,480
Michael: And

146
00:07:30,480 --> 00:07:32,980
Markus: now we have pretty much
for every situation, we have

147
00:07:32,980 --> 00:07:35,680
some other syntax, most importantly
window functions.

148
00:07:35,680 --> 00:07:38,600
Window functions are one of the big
self-join killers.

149
00:07:39,480 --> 00:07:43,480
And now they become, actually the
syntax is more concise.

150
00:07:43,780 --> 00:07:47,280
It's more clear actually when you
look at it, rather than scratching

151
00:07:47,420 --> 00:07:50,820
yourself this way, you just scratch
yourself this way.

152
00:07:51,060 --> 00:07:54,380
So it's actually better to read
the query.

153
00:07:54,380 --> 00:07:56,940
It's actually also better to write
the query once you know the

154
00:07:56,940 --> 00:07:57,440
syntax.

155
00:07:57,800 --> 00:07:59,280
And it will perform better.

156
00:08:00,040 --> 00:08:02,520
And you will have less maintenance
nightmares because you don't

157
00:08:02,520 --> 00:08:04,120
have the redundancy in the code.

158
00:08:04,120 --> 00:08:07,540
Like with a self join, if you add
a condition to one of the tables,

159
00:08:07,540 --> 00:08:10,320
you might need to add it to the
other one as well or to the join

160
00:08:10,320 --> 00:08:10,820
condition.

161
00:08:11,040 --> 00:08:14,440
And this is easy to forget, then
you get wrong results.

162
00:08:15,180 --> 00:08:16,220
Michael: Yeah, I love those.

163
00:08:16,220 --> 00:08:18,560
So clarity and performance.

164
00:08:18,600 --> 00:08:21,900
So clarity not just for yourself
writing things, but also for

165
00:08:21,900 --> 00:08:24,740
people coming back to it, reading
it yourself in future.

166
00:08:24,860 --> 00:08:29,100
And performance is the man after
my own heart on that front.

167
00:08:29,200 --> 00:08:33,220
I think a lot of people assume
that, or looking at new features,

168
00:08:33,640 --> 00:08:37,600
they assume they'll get new functionality
and forget that those

169
00:08:37,600 --> 00:08:39,480
other two things are the bigger part.

170
00:08:39,480 --> 00:08:44,340
And because of the way SQL works,
we likely could have done it

171
00:08:44,340 --> 00:08:44,840
before.

172
00:08:45,060 --> 00:08:46,380
That's a really good point.

173
00:08:46,400 --> 00:08:50,900
Markus: But it's actually true
because SQL 92 was from one perspective,

174
00:08:50,900 --> 00:08:52,160
it was quite complete.

175
00:08:52,640 --> 00:08:53,000
Michael: So it

176
00:08:53,000 --> 00:08:55,460
Markus: was completing the relational
idea.

177
00:08:55,900 --> 00:08:58,440
And of course, you can solve almost
everything.

178
00:08:59,000 --> 00:09:00,040
Asterisk recursion.

179
00:09:02,380 --> 00:09:04,800
Almost everything with the old
SQL.

180
00:09:05,020 --> 00:09:08,480
And that also makes it difficult
to learn new stuff because,

181
00:09:08,480 --> 00:09:10,160
well, I can solve it already.

182
00:09:12,180 --> 00:09:14,960
So why do I need to learn something
new that worked for me like

183
00:09:14,960 --> 00:09:15,300
decades?

184
00:09:15,300 --> 00:09:16,160
It's still working.

185
00:09:16,160 --> 00:09:17,580
So what is it about?

186
00:09:17,980 --> 00:09:21,380
And that's, of course, a little
bit more difficult to explain

187
00:09:21,380 --> 00:09:21,880
people.

188
00:09:22,360 --> 00:09:24,960
Michael: Well, I love how simply
you did it.

189
00:09:24,960 --> 00:09:30,580
So clarity of what's happening
when you look at the SQL and performance.

190
00:09:31,020 --> 00:09:35,580
What can they what choices
does the planner have when executing

191
00:09:35,860 --> 00:09:36,360
that?

192
00:09:36,500 --> 00:09:37,620
Yeah, really cool.

193
00:09:37,660 --> 00:09:40,060
And also, I guess, edge cases as
well.

194
00:09:40,760 --> 00:09:43,140
These features can be designed
to be more...

195
00:09:43,140 --> 00:09:45,860
Well, we've done whole episodes
on handling nulls.

196
00:09:46,460 --> 00:09:50,780
And they can have thought about
those more than, or if you have

197
00:09:50,780 --> 00:09:53,800
to roll your own versions of these
features, you have to handle

198
00:09:54,000 --> 00:09:58,040
nulls yourself, which can be painful,
to say the least.

199
00:09:58,940 --> 00:10:02,720
Markus: So one of my favorite features
in this regard is the is

200
00:10:02,720 --> 00:10:05,040
not distinct from operator.

201
00:10:05,900 --> 00:10:08,940
It's the null-safe operator so
that if you have nulls on

202
00:10:08,940 --> 00:10:10,620
both sides then you still get true.

203
00:10:10,840 --> 00:10:13,520
Yeah, you could say it's a modern
SQL feature.

204
00:10:13,520 --> 00:10:16,420
According to my definition, it
was added after SQL '92.

205
00:10:17,320 --> 00:10:20,320
And if you look at on my website,
Modern SQL, you can check out

206
00:10:20,320 --> 00:10:22,220
which systems support that operator.

207
00:10:23,000 --> 00:10:25,320
And yeah, unfortunately, not so
many.

208
00:10:25,580 --> 00:10:26,080
Michael: Interesting.

209
00:10:26,540 --> 00:10:28,880
Markus: Although I consider it
something quite essential.

210
00:10:29,920 --> 00:10:34,280
But still, also the vendors didn't
get the memo that there is

211
00:10:34,280 --> 00:10:36,340
something new they need to catch
up with.

212
00:10:36,540 --> 00:10:38,600
Michael: There I ask, I actually
haven't used it.

213
00:10:38,600 --> 00:10:40,760
Is it in PostgreSQL, do you remember?

214
00:10:41,040 --> 00:10:41,940
Markus: It is there.

215
00:10:42,040 --> 00:10:42,540
Yes.

216
00:10:42,700 --> 00:10:45,480
It does work syntactically, you
get the right results.

217
00:10:45,480 --> 00:10:45,980
Michael: Okay.

218
00:10:46,340 --> 00:10:49,660
Markus: But there are some gotchas
when it comes to optimization.

219
00:10:50,860 --> 00:10:51,880
Michael: Interesting performance.

220
00:10:51,940 --> 00:10:52,780
There we go.

221
00:10:53,560 --> 00:10:56,640
Markus: So I have had just like
a month ago, I've had a case

222
00:10:56,640 --> 00:11:01,220
where I would like using it, but
it didn't work out for performance

223
00:11:01,440 --> 00:11:01,940
reasons.

224
00:11:02,720 --> 00:11:04,020
Michael: That's so good to know.

225
00:11:04,020 --> 00:11:05,520
That's very, very interesting.

226
00:11:06,660 --> 00:11:09,260
I love those charts on Modern SQL.

227
00:11:09,720 --> 00:11:13,740
When you look up one of these
newer features, and you have the

228
00:11:13,740 --> 00:11:21,780
charts of one row per database,
per DBMS, and then a timeline of

229
00:11:22,300 --> 00:11:25,580
whether it supported it, at which
point it supported it, whether

230
00:11:25,580 --> 00:11:26,540
there are any caveats.

231
00:11:27,040 --> 00:11:29,660
And every time I look at that,
I think, wow, this is impressive.

232
00:11:29,660 --> 00:11:33,020
But also, this must be so much
work to maintain.

233
00:11:33,580 --> 00:11:35,900
I was gonna ask, how do you do
it?

234
00:11:36,100 --> 00:11:38,860
Markus: Yeah, I'm happy you asked
that because this is one of the

235
00:11:38,860 --> 00:11:39,780
untold stories.

236
00:11:40,920 --> 00:11:45,340
So I'm just a one-man show, and doing
the job site as a side project,

237
00:11:45,340 --> 00:11:45,780
actually.

238
00:11:45,780 --> 00:11:47,640
I have kind of a day job as well.

239
00:11:47,640 --> 00:11:48,140
Yeah.

240
00:11:48,260 --> 00:11:53,160
So the way I make it nowadays is
that everything is covered by

241
00:11:53,160 --> 00:11:56,420
tests, by tests that I can run
automatically.

242
00:11:57,940 --> 00:12:01,400
I test very deeply, like I always
check for the null case.

243
00:12:01,400 --> 00:12:02,820
I always check for collations.

244
00:12:02,900 --> 00:12:06,560
I always check for this and that
and a lot of, down to the SQL

245
00:12:06,560 --> 00:12:07,060
state.

246
00:12:19,140 --> 00:12:20,780
And yeah, well, then I get the
result.

247
00:12:21,040 --> 00:12:25,200
And if everything is going fine,
then updating the website for

248
00:12:25,200 --> 00:12:28,680
a new version takes me like maybe
an hour of work.

249
00:12:29,640 --> 00:12:31,120
It really depends on the system.

250
00:12:31,120 --> 00:12:36,640
Like I'm also featuring Google
BigQuery, which is a cloud-only

251
00:12:36,900 --> 00:12:37,400
system.

252
00:12:37,540 --> 00:12:40,520
You can imagine that I have higher
latencies compared to the

253
00:12:40,520 --> 00:12:41,700
VM on my box.

254
00:12:42,120 --> 00:12:45,560
So for that, it takes a little
bit longer to run the tests.

255
00:12:45,940 --> 00:12:50,360
But after that, it's basically
just saying make and everything

256
00:12:50,400 --> 00:12:53,040
is rendered again and updated again.

257
00:12:53,680 --> 00:12:56,200
And yeah, well, that kind of works.

258
00:12:56,200 --> 00:12:59,560
And now I'm working on getting
actually more features in there.

259
00:13:00,600 --> 00:13:01,360
Michael: Oh, nice.

260
00:13:01,380 --> 00:13:03,480
You mean, so what's coming up?

261
00:13:03,660 --> 00:13:06,540
Markus: Yeah, so I try to catch
up, like, of course, I like to

262
00:13:06,540 --> 00:13:08,480
focus on the modern SQL features.

263
00:13:08,480 --> 00:13:08,980
Michael: Yeah.

264
00:13:09,100 --> 00:13:12,500
Markus: And I also like to focus
on features that are already

265
00:13:12,500 --> 00:13:16,080
implemented in some systems, because
this is more valuable, of

266
00:13:16,080 --> 00:13:16,580
course.

267
00:13:16,920 --> 00:13:20,640
And I also have always an eye on
the new features added by some

268
00:13:20,640 --> 00:13:21,140
systems.

269
00:13:21,140 --> 00:13:25,760
Like if we talk about Postgres,
I think it was even in 15 that

270
00:13:25,760 --> 00:13:30,020
we got the functionality of the
unique constraints where we can

271
00:13:30,020 --> 00:13:32,460
choose how it should treat the
null values.

272
00:13:33,080 --> 00:13:36,220
Michael: So this is whether you
can determine that you are only

273
00:13:36,220 --> 00:13:38,160
allowed to have a single null value.

274
00:13:38,400 --> 00:13:38,900
Exactly.

275
00:13:39,520 --> 00:13:39,780
So

276
00:13:39,780 --> 00:13:41,460
Markus: it's the nulls distinct.

277
00:13:41,940 --> 00:13:43,540
Let me just check up my own.

278
00:13:44,240 --> 00:13:45,360
Michael: Oh, I like it.

279
00:13:46,160 --> 00:13:47,080
Markus: Nulls distinct.

280
00:13:47,160 --> 00:13:48,980
So yeah, it was introduced with
15.

281
00:13:49,280 --> 00:13:54,100
So now you can choose for a unique
constraint if it has a nullable

282
00:13:54,140 --> 00:13:58,340
column, whether one null should rule
out other nulls or not.

283
00:13:58,780 --> 00:14:02,240
And that was now added to the new
standard of SQL we have in

284
00:14:02,240 --> 00:14:07,040
the year 2023, so last year, we
got a new standard and it's in

285
00:14:07,040 --> 00:14:07,540
there.

286
00:14:07,800 --> 00:14:11,440
But even a year before it was put
into the standard, it arrived

287
00:14:11,440 --> 00:14:12,100
in Postgres.

288
00:14:12,740 --> 00:14:13,480
Michael: Oh, nice.

289
00:14:13,500 --> 00:14:16,440
Markus: So, this is kind of the
features I try to catch up

290
00:14:16,500 --> 00:14:20,940
with the new stuff introduced in
the standard as well as in newer

291
00:14:20,940 --> 00:14:22,320
releases of the systems.

292
00:14:23,200 --> 00:14:23,700
Michael: Awesome.

293
00:14:23,800 --> 00:14:27,280
I watched a talk of yours, I think
it was about 5 years ago now,

294
00:14:27,280 --> 00:14:27,600
though.

295
00:14:27,600 --> 00:14:31,560
So I've only heard your opinions
of like the previous SQL standards.

296
00:14:31,560 --> 00:14:32,940
not the latest one.

297
00:14:32,980 --> 00:14:35,040
What were your main thoughts on 
2023?

298
00:14:36,420 --> 00:14:39,090
Markus: So 2023, it took quite 
a long while.

299
00:14:40,160 --> 00:14:41,700
The one before was 2016.

300
00:14:42,500 --> 00:14:44,320
So that's longer than intended.

301
00:14:44,340 --> 00:14:46,770
So that there's a target of like 
4 to 5 years, something like 

302
00:14:46,770 --> 00:14:46,920
that.

303
00:14:46,920 --> 00:14:47,440
Oh, really?

304
00:14:47,440 --> 00:14:47,940
Cool.

305
00:14:48,060 --> 00:14:53,240
The big thing in the latest release 
of the standard is the query 

306
00:14:53,240 --> 00:14:57,340
language for querying graphs in 
a more native syntax.

307
00:14:58,180 --> 00:15:03,080
So if you know the Cypher query 
language, then you get the idea.

308
00:15:03,540 --> 00:15:08,360
It's basically about drawing with 
SQL a path through a graph 

309
00:15:08,360 --> 00:15:09,880
that you want to match on.

310
00:15:10,580 --> 00:15:15,660
And there is now a standard of 
its own next to SQL, which is 

311
00:15:15,660 --> 00:15:17,700
basically independent of SQL.

312
00:15:17,700 --> 00:15:21,980
So it's a new language, a new international 
standard that defines 

313
00:15:23,000 --> 00:15:27,240
the query language, generally the 
language to work with such 

314
00:15:27,240 --> 00:15:30,280
databases, like how to insert something 
into a graph and how 

315
00:15:30,280 --> 00:15:31,340
to query the graph.

316
00:15:31,720 --> 00:15:36,980
So this is kind of independent 
of SQL, except that the same people 

317
00:15:37,800 --> 00:15:40,740
who do the SQL standard also do 
this new standard.

318
00:15:41,520 --> 00:15:45,660
And obviously, we would like to 
have that capability also in 

319
00:15:45,660 --> 00:15:46,160
SQL.

320
00:15:47,100 --> 00:15:51,280
So additionally, besides that entirely 
new standard, there is 

321
00:15:51,280 --> 00:15:55,840
now a new part of the SQL standard 
because the SQL standard is 

322
00:15:55,840 --> 00:15:59,440
actually made up of several parts 
and now there's a new part 

323
00:16:00,140 --> 00:16:05,100
so that we can use the query functionality 
part of the new standard 

324
00:16:05,340 --> 00:16:07,580
inside SQL in our queries.

325
00:16:08,260 --> 00:16:11,160
Basically in the FROM clause, we 
have match operations and we 

326
00:16:11,160 --> 00:16:15,360
can use that SQL-art kind of querying 
directly in SQL.

327
00:16:16,720 --> 00:16:20,300
And that's most certainly the killer 
feature of 2023.

328
00:16:21,660 --> 00:16:25,340
But of course, it may take a while 
until vendors catch up.

329
00:16:25,640 --> 00:16:28,820
There are some commercial implementations 
are available, but

330
00:16:28,820 --> 00:16:30,520
they predate the standard.

331
00:16:31,400 --> 00:16:34,900
Therefore, they don't implement 
the standard down to the last 

332
00:16:34,900 --> 00:16:35,400
sentence.

333
00:16:35,940 --> 00:16:40,460
So I'm curious how this evolves 
over time and if the standard 

334
00:16:40,460 --> 00:16:44,280
can still have the power to unify 
these languages.

335
00:16:45,040 --> 00:16:47,300
That's something that we have to 
watch out for.

336
00:16:48,160 --> 00:16:52,360
But besides that one mega topic, 
and that's probably the reason 

337
00:16:52,360 --> 00:16:53,600
why it took so long.

338
00:16:53,940 --> 00:16:56,880
Besides that, we have got many 
small things like the null handling 

339
00:16:56,880 --> 00:16:57,900
that I was mentioning.

340
00:16:58,260 --> 00:17:03,460
One really nice thing is that we 
can now put into literal numeric values.

341
00:17:03,820 --> 00:17:09,100
So if you write a number like 1000, 
you can now put in the underscore 

342
00:17:09,280 --> 00:17:12,040
as a separator for let's say thousands 
of digits.

343
00:17:12,040 --> 00:17:14,760
So that you can easily spot when 
it's a million or a billion 

344
00:17:14,760 --> 00:17:17,540
what it is, so that you can make 
it more readable.

345
00:17:18,180 --> 00:17:21,380
You see, it doesn't change anything 
what you can do with the

346
00:17:21,380 --> 00:17:21,880
language.

347
00:17:21,960 --> 00:17:24,220
It just makes it nicer and more
maintainable.

348
00:17:25,580 --> 00:17:30,140
And there are also, in respect
to JSON, there are also some additions,

349
00:17:30,140 --> 00:17:32,060
actually quite fundamental additions.

350
00:17:32,540 --> 00:17:35,780
Now, the standard has an actual
JSON type.

351
00:17:36,880 --> 00:17:38,640
It didn't have one before.

352
00:17:39,520 --> 00:17:43,000
The 2016 standard introduced a
lot of JSON functionality.

353
00:17:43,380 --> 00:17:45,880
Most importantly, the JSON query
language.

354
00:17:46,680 --> 00:17:49,700
That's a query language, the JSON
parse language, but it didn't

355
00:17:49,700 --> 00:17:51,440
introduce a type for JSON.

356
00:17:51,780 --> 00:17:56,940
And now with 2023, they have added
the type for JSON, along with

357
00:17:56,940 --> 00:17:59,280
some other simplifications and
extensions.

358
00:18:00,040 --> 00:18:04,320
So there are quite a few small
enhancements that just make everyday's

359
00:18:04,480 --> 00:18:05,340
life easier.

360
00:18:05,980 --> 00:18:10,300
But the big thing is I think that
the property graph language.

361
00:18:11,440 --> 00:18:12,440
Michael: Yeah, makes sense.

362
00:18:12,500 --> 00:18:16,160
I imagine that's gonna be a heck
of a lot of work for folks to

363
00:18:16,160 --> 00:18:16,660
implement.

364
00:18:17,040 --> 00:18:17,740
But well,

365
00:18:17,740 --> 00:18:19,780
Markus: If it's gonna be implemented
at all.

366
00:18:19,840 --> 00:18:20,140
True.

367
00:18:20,140 --> 00:18:24,280
Because the SQL standard, it has
the core functionality which

368
00:18:24,280 --> 00:18:28,500
every product is supposed to implement
and by testing it, I know

369
00:18:28,500 --> 00:18:30,180
it's not working out.

370
00:18:30,520 --> 00:18:33,280
And on top of that, it has optional
features.

371
00:18:33,900 --> 00:18:36,940
And all the advanced features are
basically optional features.

372
00:18:37,740 --> 00:18:42,040
So vendors can choose to either
implement it or not, but if

373
00:18:42,040 --> 00:18:44,740
they implement it, then they should
follow the standard.

374
00:18:45,720 --> 00:18:46,260
Michael: Got it.

375
00:18:46,260 --> 00:18:46,760
Okay.

376
00:18:47,380 --> 00:18:50,200
And I think I read somewhere, but
maybe this is misguided or

377
00:18:50,200 --> 00:18:54,820
outdated, that vendors can claim
that they support SQL if they

378
00:18:54,820 --> 00:18:56,300
support SQL 92.

379
00:18:57,740 --> 00:19:02,380
But how does it work for the non-optional
features for the core?

380
00:19:03,520 --> 00:19:04,020
Yeah.

381
00:19:04,480 --> 00:19:04,980
Markus: True.

382
00:19:05,740 --> 00:19:07,480
The short answer is not at all.

383
00:19:08,480 --> 00:19:08,980
Okay.

384
00:19:09,520 --> 00:19:09,583
Yeah.

385
00:19:09,583 --> 00:19:12,100
And that's part of the story behind
modern SQL and the charts

386
00:19:12,100 --> 00:19:12,780
you mentioned.

387
00:19:13,940 --> 00:19:15,820
I think a standard is nice.

388
00:19:15,820 --> 00:19:16,080
Okay.

389
00:19:16,080 --> 00:19:20,040
But if nobody cares about it, then
it doesn't have any power

390
00:19:20,080 --> 00:19:21,540
to unify the language.

391
00:19:22,020 --> 00:19:27,080
But as soon as you see the differences
and the mistakes the products

392
00:19:27,180 --> 00:19:31,500
have, then there is a motivation,
then there's an outside pressure

393
00:19:31,720 --> 00:19:34,420
to actually comply with the standard.

394
00:19:35,020 --> 00:19:39,280
And I can see it, I've actually
noticed it today, that when I

395
00:19:39,280 --> 00:19:42,940
publish something on my website,
the vendors are looking there.

396
00:19:43,180 --> 00:19:43,680
Michael: Nice.

397
00:19:43,680 --> 00:19:46,560
Markus: And they are questioning
when I say, okay, it doesn't

398
00:19:46,560 --> 00:19:49,540
work in that product, then they
ask, hey, I thought this is supposed

399
00:19:49,540 --> 00:19:50,580
to work, why not?

400
00:19:50,580 --> 00:19:54,840
And then I can show them the test
case I have and that it doesn't

401
00:19:54,840 --> 00:19:55,340
work.

402
00:19:56,200 --> 00:20:01,940
And some of them fix those issues
quite quickly, like in 2 weeks.

403
00:20:02,780 --> 00:20:03,280
Michael: Brilliant.

404
00:20:04,860 --> 00:20:06,180
Markus: So sometimes it just happens.

405
00:20:06,180 --> 00:20:10,640
I publish a new article mentioning
10 systems and 1 of them is

406
00:20:10,640 --> 00:20:12,380
behaving differently, maybe wrong.

407
00:20:12,440 --> 00:20:15,240
And just coincidentally, 2 weeks
later, there's a new release

408
00:20:15,240 --> 00:20:16,980
by that system that behaves correctly.

409
00:20:17,160 --> 00:20:19,440
So that happened, yeah, that happens.

410
00:20:19,440 --> 00:20:22,360
Michael: Would you like to give
any kudos or praise to the folks

411
00:20:22,360 --> 00:20:24,640
that are managing to turn those
around so quickly?

412
00:20:24,720 --> 00:20:26,780
Markus: In that case, it was the
H2 database.

413
00:20:27,340 --> 00:20:28,860
The H2 database, yeah.

414
00:20:28,860 --> 00:20:31,560
On the other hand, of course, there
are systems that have longer

415
00:20:31,560 --> 00:20:34,840
cycles and then it takes longer
or maybe happens never.

416
00:20:34,900 --> 00:20:38,760
But on the other hand, if I look
at Oracle, if I may talk about

417
00:20:38,760 --> 00:20:39,340
Oracle briefly.

418
00:20:39,340 --> 00:20:40,460
Michael: Yeah, yeah, please do.

419
00:20:40,640 --> 00:20:44,600
Markus: The latest release, the
23 release, finally fills the

420
00:20:44,600 --> 00:20:46,380
gaps that are there for decades.

421
00:20:46,400 --> 00:20:48,940
Like they finally introduced a Boolean
type.

422
00:20:49,300 --> 00:20:52,200
They have ignored it for so many
decades and have just said,

423
00:20:52,200 --> 00:20:54,180
okay, make it a char 1, why?

424
00:20:54,440 --> 00:20:56,400
No, and that's fine.

425
00:20:56,640 --> 00:20:59,940
And now ultimately, there's movement
coming in there.

426
00:21:00,480 --> 00:21:01,720
Even from Oracle.

427
00:21:02,180 --> 00:21:03,080
Even from Oracle.

428
00:21:03,340 --> 00:21:06,820
The whole industry has actually
much more movement than like

429
00:21:06,820 --> 00:21:08,280
10 years before.

430
00:21:08,600 --> 00:21:12,320
Like that MySQL is getting Window
function support and recursion

431
00:21:12,360 --> 00:21:16,220
support and all of that was something
I didn't expect actually.

432
00:21:16,820 --> 00:21:22,360
When I was starting with modern
SQL in 2015, it was rather static.

433
00:21:23,200 --> 00:21:25,940
There were a few products doing
the modern stuff and there were

434
00:21:25,940 --> 00:21:27,780
a few products not doing the modern
stuff.

435
00:21:27,780 --> 00:21:29,280
These were the two things.

436
00:21:30,060 --> 00:21:35,200
And I did not expect the non-modern
stuff to catch up.

437
00:21:35,200 --> 00:21:37,860
It was more like, yeah, well, they
will stick there forever.

438
00:21:38,080 --> 00:21:43,240
But then ultimately, they started
to get a lot of dynamics into

439
00:21:43,480 --> 00:21:45,300
the language support that they
offer.

440
00:21:46,260 --> 00:21:48,180
Michael: Nice, and we all benefit,
right?

441
00:21:48,420 --> 00:21:49,900
And thanks for bringing up Oracle.

442
00:21:50,020 --> 00:21:51,180
I think they're a great case.

443
00:21:51,180 --> 00:21:54,000
I think they often are one of the
first to support some of these

444
00:21:54,000 --> 00:21:57,800
newer things, even if they have
had in the past some of these

445
00:21:57,800 --> 00:21:58,300
gaps.

446
00:21:58,540 --> 00:22:01,920
It definitely gets brought up quite
a lot when people choose,

447
00:22:01,920 --> 00:22:05,640
like when people mention why they
chose PostgreSQL, standard

448
00:22:05,640 --> 00:22:10,220
compliance and things working as
expected, which aren't exactly

449
00:22:10,320 --> 00:22:13,860
the same thing, but quite close
to the same thing, gets brought

450
00:22:13,860 --> 00:22:17,860
up quite often as one of the reasons
they trust Postgres over other,

451
00:22:17,880 --> 00:22:21,680
at the time, open-source databases,
but even yeah, commercial

452
00:22:21,740 --> 00:22:22,240
databases.

453
00:22:23,220 --> 00:22:26,820
So it is important and we all benefit
ultimately if everybody

454
00:22:26,920 --> 00:22:29,660
can work together on these things.

455
00:22:30,300 --> 00:22:31,900
If it works as expected.

456
00:22:32,660 --> 00:22:33,400
That's cool.

457
00:22:33,480 --> 00:22:36,180
Other than your site, of course,
being very important in the

458
00:22:36,180 --> 00:22:39,900
shift, what else do you think might
have helped with this change?

459
00:22:41,140 --> 00:22:46,360
Markus: I think the MySQL move
is actually pretty important because

460
00:22:46,560 --> 00:22:51,780
now the most commonly used database
actually in the world, besides

461
00:22:51,820 --> 00:22:54,780
SQLite of course, but even SQLite
has moved.

462
00:22:55,600 --> 00:22:59,440
Those systems supporting these
features means these are not exotic

463
00:22:59,540 --> 00:23:01,260
features of a few vendors.

464
00:23:02,080 --> 00:23:05,520
I think as soon as this is there
in the MySQL documentation,

465
00:23:06,180 --> 00:23:11,160
then it becomes more official being
SQL rather than just being

466
00:23:11,160 --> 00:23:13,200
Postgres or Oracle or whatever.

467
00:23:13,860 --> 00:23:18,480
I think the fact that MySQL 8
got all of some of these modern

468
00:23:18,480 --> 00:23:22,120
features makes quite a big difference
how people look at these

469
00:23:22,120 --> 00:23:22,620
features.

470
00:23:23,440 --> 00:23:25,540
Michael: Yeah, that was a huge,
huge release.

471
00:23:25,920 --> 00:23:29,640
I remember MySQL, it was one of the
8 releases that included and

472
00:23:29,640 --> 00:23:33,200
this is not related to SQL, well
not related to SQL standard

473
00:23:33,200 --> 00:23:36,180
at all, but they got explain, analyze as well, which was a

474
00:23:36,180 --> 00:23:38,500
big step forward in the things
I look at.

475
00:23:38,500 --> 00:23:40,780
So yeah, huge, huge release.

476
00:23:41,400 --> 00:23:45,060
And we're going to get hate mail
for this, but perhaps Oracle

477
00:23:45,060 --> 00:23:47,940
deserves some credit for that now
being the stewards of the MySQL

478
00:23:48,900 --> 00:23:50,140
codebase and project.

479
00:23:50,280 --> 00:23:53,740
Markus: Yeah, I was not expecting
that when MySQL ended up at

480
00:23:53,740 --> 00:23:54,220
Oracle.

481
00:23:54,220 --> 00:23:58,980
I was rather expecting it like
being just come over to the real

482
00:23:58,980 --> 00:24:01,920
database like this kind of thing.

483
00:24:02,060 --> 00:24:03,200
But that didn't happen.

484
00:24:03,200 --> 00:24:06,880
And if we now look at MariaDB,
which is basically the continuation

485
00:24:07,240 --> 00:24:12,180
of the old team and at MySQL, then
we can see, well, they diverge

486
00:24:12,280 --> 00:24:13,600
quite a lot.

487
00:24:14,340 --> 00:24:19,140
And while MySQL was rather focusing
on getting the basics right.

488
00:24:19,940 --> 00:24:22,940
MariaDB is more focusing on getting
features in.

489
00:24:24,860 --> 00:24:28,580
And on the long run, I think the
basics are worth it.

490
00:24:28,580 --> 00:24:31,320
And I think we can see that with
Postgres because in Postgres

491
00:24:31,320 --> 00:24:34,640
the basics were right from the
beginning, I would say, at least

492
00:24:34,640 --> 00:24:37,980
most of them, and that pays off
on the long run.

493
00:24:38,480 --> 00:24:42,760
So I'm also happy to see MariaDB
being still there.

494
00:24:43,260 --> 00:24:47,220
It's another competitor, especially
against MySQL in the open

495
00:24:47,220 --> 00:24:47,980
source world.

496
00:24:48,340 --> 00:24:52,780
But on the long run, let's see
how long they can survive.

497
00:24:54,100 --> 00:24:56,680
Michael: Yeah, and as users, we
benefit from the competition,

498
00:24:56,820 --> 00:24:57,080
right?

499
00:24:57,080 --> 00:25:01,660
Like, obviously as community members,
I'm definitely very biased,

500
00:25:02,020 --> 00:25:04,800
but I do like, I want there to
be competitors.

501
00:25:04,900 --> 00:25:08,220
It would be, I don't think Postgres
as a project would benefit

502
00:25:08,220 --> 00:25:11,540
from being the only relational
database standing in 20 years

503
00:25:11,540 --> 00:25:12,040
time.

504
00:25:12,560 --> 00:25:15,480
Historically, I don't think that's
been the place where most

505
00:25:15,480 --> 00:25:18,940
innovation happens or the most
like customer benefit happens.

506
00:25:18,960 --> 00:25:23,000
So, yeah, I love to see it and
appreciate your work on this as

507
00:25:23,000 --> 00:25:23,260
well.

508
00:25:23,260 --> 00:25:23,760
So I did.

509
00:25:23,760 --> 00:25:24,960
I didn't want to minimize that.

510
00:25:24,960 --> 00:25:27,880
I do think you have had an impact,
but it's cool to see other

511
00:25:27,880 --> 00:25:28,940
reasons as well.

512
00:25:28,940 --> 00:25:30,700
Markus: Of course, there are many
reasons for that.

513
00:25:30,700 --> 00:25:30,860
Yeah.

514
00:25:30,860 --> 00:25:33,460
You also see the noSQL movement,
remember?

515
00:25:33,740 --> 00:25:34,240
Yeah.

516
00:25:34,780 --> 00:25:38,700
In the meanwhile, most of the noSQL
vendors that are still there

517
00:25:39,520 --> 00:25:44,680
have some QL, whether they claim
it is SQL or it is just some

518
00:25:44,680 --> 00:25:46,000
similar QL.

519
00:25:46,440 --> 00:25:50,540
Then we have new vendors that start
up with new SQL dialects,

520
00:25:50,540 --> 00:25:54,280
you know, DuckDB and so on, that
also bring fresh ideas into

521
00:25:54,280 --> 00:25:55,120
the SQL dialect.

522
00:25:55,120 --> 00:25:57,880
I think some of the ideas they
are bringing up into their own

523
00:25:57,880 --> 00:26:02,720
dialect are actually worth thinking
about bringing into the standard

524
00:26:02,720 --> 00:26:05,460
and maybe other products like in
either way.

525
00:26:06,660 --> 00:26:09,160
So, as I said, there's a lot of
dynamics at the moment.

526
00:26:09,640 --> 00:26:11,580
Michael: Let's go back to what
you would like to see.

527
00:26:11,580 --> 00:26:15,000
Let's say if somebody's like, well,
you know what, I do mostly

528
00:26:15,060 --> 00:26:18,060
just use the SQL I learned when
I first learned.

529
00:26:18,120 --> 00:26:21,380
Yeah, okay, maybe I do a bit of
JSON stuff here and there.

530
00:26:21,600 --> 00:26:24,720
I've learned, I've had to use a
lateral join or window function

531
00:26:24,720 --> 00:26:27,820
once, but you know, that's about
it.

532
00:26:28,520 --> 00:26:29,720
What should they be doing?

533
00:26:29,720 --> 00:26:31,980
How could they go about learning
more?

534
00:26:32,600 --> 00:26:36,100
Markus: Well, this is pretty much
the same for the last, not

535
00:26:36,100 --> 00:26:37,280
20, but 10 years.

536
00:26:37,680 --> 00:26:42,600
The most undervalued feature of
modern SQL is and always was

537
00:26:42,660 --> 00:26:43,680
window functions.

538
00:26:45,300 --> 00:26:51,340
If you don't master window functions
yet, then well, stop listening

539
00:26:51,340 --> 00:26:54,480
now, we will wait for you and learn
window functions, yeah?

540
00:26:54,640 --> 00:26:55,820
And then continue listening.

541
00:26:56,100 --> 00:26:59,940
So this is really the single most
important and powerful feature.

542
00:27:00,660 --> 00:27:04,000
That's definitely my recommendation
if you are not into that

543
00:27:04,000 --> 00:27:04,660
yet.

544
00:27:04,740 --> 00:27:11,020
Other than that, let me give you
1 problem I see more often nowadays,

545
00:27:11,980 --> 00:27:15,780
to change the mind about how to
use new features, because you

546
00:27:15,780 --> 00:27:17,960
we're mentioning JSON as well.

547
00:27:18,140 --> 00:27:18,640
Yeah.

548
00:27:19,020 --> 00:27:24,600
I see JSON used in a way that
I'm not sure it's the best way.

549
00:27:24,840 --> 00:27:25,740
Michael: Okay, great.

550
00:27:26,140 --> 00:27:26,880
Markus: Quite often.

551
00:27:27,780 --> 00:27:32,980
And the way I'm explaining it is
that data types like JSON or

552
00:27:32,980 --> 00:27:38,860
JSONB, or even like Boolean, yeah,
they have several use cases,

553
00:27:38,860 --> 00:27:40,420
actually 3 use cases.

554
00:27:41,340 --> 00:27:44,000
A data type can be used in the
persistent schema.

555
00:27:45,060 --> 00:27:48,220
It's in a great table, you can see
a data type.

556
00:27:48,760 --> 00:27:53,660
But also important, a data type
can be used during the transformation.

557
00:27:54,760 --> 00:27:56,460
And I'll give you an example there.

558
00:27:56,460 --> 00:28:00,980
Like the Boolean type, we use most
of the time just during the

559
00:28:00,980 --> 00:28:01,480
transformation.

560
00:28:02,220 --> 00:28:05,140
Because in the WHERE clause, in
the ON clause, in the WHEN clause

561
00:28:05,140 --> 00:28:08,560
of case, and so on, and so on,
and so on, we have Boolean expressions

562
00:28:08,560 --> 00:28:12,380
and we're using Boolean values
just during the transformation.

563
00:28:13,080 --> 00:28:15,260
It's not coming from the base table
from the disk.

564
00:28:15,260 --> 00:28:19,120
It's not going up to the client
through the wire, just for the

565
00:28:19,120 --> 00:28:19,600
transformation.

566
00:28:19,600 --> 00:28:22,100
That's the second use case for
data types.

567
00:28:22,440 --> 00:28:24,840
And the third use case is the API
layer.

568
00:28:24,840 --> 00:28:27,560
So meaning how do we communicate
with the client?

569
00:28:28,500 --> 00:28:31,200
And of course we think about a
tabular representation.

570
00:28:31,320 --> 00:28:33,980
Yeah, of course, of course, of
course, and maybe a normalized

571
00:28:34,160 --> 00:28:34,540
1.

572
00:28:34,540 --> 00:28:39,560
But here, I say, no, the normalization
is fine for the disk,

573
00:28:39,880 --> 00:28:42,940
but it's not so much fine for the
API layer.

574
00:28:43,860 --> 00:28:45,640
And here I love JSON.

575
00:28:46,100 --> 00:28:46,600
Yeah.

576
00:28:46,720 --> 00:28:50,360
Because if you think of a tag system,
you have some entities

577
00:28:50,360 --> 00:28:52,280
and you can add tags on them.

578
00:28:52,660 --> 00:28:56,500
Then obviously the normalized way
to do it is to have 2 tables.

579
00:28:57,360 --> 00:29:00,560
I still think that's a good approach
in many, many cases.

580
00:29:01,120 --> 00:29:05,400
But on the API layer, if you join
them, you will get multiples

581
00:29:05,440 --> 00:29:07,720
of the entity data, which is meaningless.

582
00:29:07,720 --> 00:29:08,940
That doesn't make sense.

583
00:29:09,120 --> 00:29:13,520
But here comes basically the JSON
array_agg() and you aggregate

584
00:29:13,620 --> 00:29:18,500
the tags into a JSON field that
you then give to the client API.

585
00:29:19,280 --> 00:29:22,860
And then you just keep the 1 row
of the entity and have the list,

586
00:29:23,360 --> 00:29:26,920
a list, yeah, of text there.

587
00:29:26,980 --> 00:29:30,040
And here you see, this is where
I personally think that that's

588
00:29:30,040 --> 00:29:31,940
the most important use case for
JSON.

589
00:29:32,240 --> 00:29:34,300
It's not about storing JSON on
disk.

590
00:29:34,300 --> 00:29:37,760
Yeah, that can make sense sometimes
as well, no question.

591
00:29:38,040 --> 00:29:38,540
Yeah.

592
00:29:38,560 --> 00:29:43,020
But the main use case I see for
JSON is on the API side.

593
00:29:43,480 --> 00:29:43,860
Michael: Yeah.

594
00:29:43,860 --> 00:29:46,360
Or for what it was originally intended,
right?

595
00:29:47,380 --> 00:29:48,200
Markus: Yeah, maybe.

596
00:29:48,420 --> 00:29:51,060
But what I see the people when
they say, yeah, we're using JSON,

597
00:29:51,060 --> 00:29:54,120
what they mean is they store it
on disk.

598
00:29:54,280 --> 00:29:54,820
Michael: Yes, yes.

599
00:29:54,820 --> 00:29:55,540
Good point.

600
00:29:55,960 --> 00:29:56,460
Awesome.

601
00:29:56,580 --> 00:29:58,840
Well, is there anything you'd point
people at?

602
00:29:58,840 --> 00:30:00,960
Have you given a talk on anything
like that before?

603
00:30:00,960 --> 00:30:02,940
Any good materials for people?

604
00:30:02,960 --> 00:30:05,740
Markus: Not so recently, you know,
there was this corona

605
00:30:05,740 --> 00:30:06,240
break.

606
00:30:07,160 --> 00:30:08,540
So no conferences.

607
00:30:09,010 --> 00:30:12,620
I'm still not talking a lot in
public like there was another

608
00:30:12,620 --> 00:30:16,120
podcast recorded a German one like
a few months back, but that's

609
00:30:16,120 --> 00:30:16,620
it.

610
00:30:16,640 --> 00:30:21,040
So follow my blog, modern-sql.com
and also use the indexloop.com.

611
00:30:22,060 --> 00:30:24,160
This is where I publish my updates.

612
00:30:24,800 --> 00:30:29,920
And what you can expect is basically
these matrices, who can

613
00:30:29,920 --> 00:30:33,580
do what, use cases for the features
that I test.

614
00:30:33,580 --> 00:30:35,100
That's of course also important.

615
00:30:35,140 --> 00:30:37,860
What is the benefit of knowing,
okay, that feature does work

616
00:30:37,860 --> 00:30:40,580
in that system, but I don't know
what to use that feature for.

617
00:30:40,920 --> 00:30:44,340
Yeah, well, that's what you can
expect on a more or less regular

618
00:30:44,340 --> 00:30:47,620
basis like once a month or twice
a month, something like that.

619
00:30:48,060 --> 00:30:50,880
Michael: Nice, and I think I just,
might have even been this

620
00:30:50,880 --> 00:30:54,960
morning, was reading your half
year update as well, which was

621
00:30:54,960 --> 00:30:55,460
cool.

622
00:30:55,680 --> 00:30:57,680
So yeah, highly recommend those
resources.

623
00:30:57,840 --> 00:31:01,010
We've shouted them out a few times
on the podcast in previous

624
00:31:01,010 --> 00:31:05,140
episodes, and your book as well,
SQL Performance Explained was

625
00:31:05,140 --> 00:31:05,640
fantastic.

626
00:31:05,660 --> 00:31:06,920
So thank you for that.

627
00:31:07,040 --> 00:31:10,260
Actually on that note, when's Modern
SQL the book coming?

628
00:31:10,680 --> 00:31:14,800
Markus: Yeah, well, I started in
2015 with the intention to make

629
00:31:14,800 --> 00:31:15,640
it a book.

630
00:31:15,740 --> 00:31:16,420
Michael: Oh really?

631
00:31:16,500 --> 00:31:18,460
Markus: Yes, that was the original
intention.

632
00:31:18,940 --> 00:31:21,860
So when I did the other book, the
SQL Performance Explained,

633
00:31:22,080 --> 00:31:25,680
it also started as a blog and then
was later on put into a book.

634
00:31:25,680 --> 00:31:28,340
And the blog is still available
for free on use-the-index-loop.com.

635
00:31:29,540 --> 00:31:33,540
And the idea was to do the same
thing with Modern SQL, but then

636
00:31:33,540 --> 00:31:36,040
happened what I've already told,
something which I thought was

637
00:31:36,040 --> 00:31:38,200
static became very dynamic.

638
00:31:38,560 --> 00:31:39,060
Michael: Yeah.

639
00:31:39,340 --> 00:31:43,260
Markus: And now, yeah, it's hard
to catch up, but there is progress

640
00:31:43,340 --> 00:31:45,580
and there will be a book.

641
00:31:45,900 --> 00:31:46,300
Nice.

642
00:31:46,300 --> 00:31:50,140
I'm still working on it, I'm using
the draft as material in my training.

643
00:31:50,740 --> 00:31:55,040
So it is getting better but there
is no definite timeline yet

644
00:31:55,040 --> 00:31:55,540
for it.

645
00:31:55,540 --> 00:31:58,640
Michael: As should always be the
way with things in the future.

646
00:31:58,660 --> 00:32:00,060
Well, good luck with that.

647
00:32:00,060 --> 00:32:01,920
Thank you so much for coming on.

648
00:32:01,920 --> 00:32:03,500
It's been a pleasure having you.

649
00:32:03,580 --> 00:32:04,540
Best of luck.

650
00:32:04,700 --> 00:32:06,880
Markus: Yeah, thank you and thanks
for having me.

651
00:32:07,200 --> 00:32:07,860
Bye-bye.