1
00:00:00,000 --> 00:00:03,920
So I think it's a lot easier to
develop when your queries are

2
00:00:03,950 --> 00:00:05,059
co located with your components.

3
00:00:05,381 --> 00:00:08,611
So every component is responsible
for getting the data it needs.

4
00:00:08,892 --> 00:00:11,492
And this makes your app like
super composable, right?

5
00:00:11,492 --> 00:00:16,172
Like you can add and remove components
to your app or your component tree.

6
00:00:16,442 --> 00:00:19,202
And because they're responsible
for getting their data, like, you

7
00:00:19,202 --> 00:00:21,672
don't have to do any additional
wiring or prop drilling.

8
00:00:21,985 --> 00:00:24,135
Welcome to the localfirst.fm podcast.

9
00:00:24,465 --> 00:00:27,375
I'm your host, Johannes Schickling,
and I'm a web developer, a

10
00:00:27,375 --> 00:00:30,325
startup founder, and love the
craft of software engineering.

11
00:00:30,865 --> 00:00:34,855
For the past few years, I've been on a
journey to build a modern, high quality

12
00:00:34,855 --> 00:00:36,875
music app using web technologies.

13
00:00:37,055 --> 00:00:41,235
And in doing so, I've been falling down
the rabbit hole of local-first software.

14
00:00:41,755 --> 00:00:44,755
This podcast is your invitation
to join me in that journey.

15
00:00:44,868 --> 00:00:48,538
In this episode, I'm speaking to Matt
Wanlaw, a prolific local-first tool

16
00:00:48,538 --> 00:00:52,758
builder who's behind projects such
as Vlcn, CR-SQLite, and Materialite.

17
00:00:53,098 --> 00:00:56,948
Most recently, Matt also joined
Rocicorp to work on their new product.

18
00:00:57,378 --> 00:01:02,278
In this conversation, we go deep on
his projects covering CRDTs, SQLite,

19
00:01:02,328 --> 00:01:04,028
and incremental view maintenance.

20
00:01:04,262 --> 00:01:08,532
Before getting started, also a
big thank you to Rocicorp and

21
00:01:08,532 --> 00:01:10,052
Expo for supporting this podcast.

22
00:01:10,472 --> 00:01:12,212
And now my interview with Matt.

23
00:01:13,188 --> 00:01:14,868
Hey Matt, welcome to the show.

24
00:01:14,878 --> 00:01:15,588
How are you doing?

25
00:01:16,338 --> 00:01:16,748
Good.

26
00:01:16,808 --> 00:01:17,098
Yeah.

27
00:01:17,098 --> 00:01:17,648
Thanks for having me.

28
00:01:17,932 --> 00:01:20,102
I'm super excited to have you on the show.

29
00:01:20,322 --> 00:01:23,572
You and I have been collaborating
on a few projects now over the

30
00:01:23,572 --> 00:01:25,002
course of the last few years.

31
00:01:25,312 --> 00:01:29,072
Most notably, you've been
working on CR-SQLite and also

32
00:01:29,072 --> 00:01:30,852
Materialite most recently.

33
00:01:31,332 --> 00:01:35,182
But for those of you who don't
know Matt would you briefly

34
00:01:35,182 --> 00:01:36,332
mind introducing yourself?

35
00:01:36,747 --> 00:01:41,037
Yeah um, Matt Wonlaw, and I guess
I've been involved in the local-first

36
00:01:41,057 --> 00:01:42,927
community for about two years now.

37
00:01:43,257 --> 00:01:46,247
It originally started with me trying
to resurrect an ancient project

38
00:01:46,247 --> 00:01:50,817
of mine and trying to add, like,
multiplayer and sync between devices.

39
00:01:51,047 --> 00:01:54,663
And I saw, like, that's actually a pretty
hard problem, and there weren't, like,

40
00:01:54,663 --> 00:01:59,293
many off the shelf solutions, so then I
just, like, dove into this , and that's

41
00:01:59,293 --> 00:02:02,143
where all these projects like CR-SQLite,
Materialite and other things came from.

42
00:02:02,330 --> 00:02:06,780
Yeah, and I think this is also how
you and I have met quite a while ago.

43
00:02:07,050 --> 00:02:10,270
And before that, you've been
working at Facebook for many years.

44
00:02:10,300 --> 00:02:13,560
And even before that, I think you
had your first engineering role

45
00:02:13,777 --> 00:02:18,107
working on some software that by
definition had to be very local.

46
00:02:18,747 --> 00:02:20,247
There was no local-first yet.

47
00:02:20,537 --> 00:02:25,027
Would you mind telling us a little bit
more about what those first apps were that

48
00:02:25,027 --> 00:02:26,427
you've been working on professionally?

49
00:02:26,900 --> 00:02:31,510
Yeah, so like my first job out of
college was at a defense contractor

50
00:02:31,580 --> 00:02:34,960
called Lockheed Martin, but essentially
what we were doing was we were building

51
00:02:35,187 --> 00:02:37,327
software for submarines, so like U.

52
00:02:37,327 --> 00:02:37,457
S.

53
00:02:37,457 --> 00:02:41,497
Navy submarines, and yeah, obviously a
submarine has no internet connection,

54
00:02:41,497 --> 00:02:45,657
everything has to be self contained, but
even beyond that, so the software I built

55
00:02:45,677 --> 00:02:50,327
was for, Like these thick clients for the
workstations that sonar operators used.

56
00:02:50,927 --> 00:02:54,377
And, you know, there, there are servers
on the sub, you know, servers that

57
00:02:54,377 --> 00:02:58,307
provide storage and some that provide
the signal processing and whatnot.

58
00:02:58,907 --> 00:03:02,907
But like an extra requirement was, you
know, a sub is in a hostile environment.

59
00:03:03,007 --> 00:03:03,997
Things can break.

60
00:03:04,117 --> 00:03:07,687
So the work, workstations need to be able
to work even if all the servers were down.

61
00:03:08,027 --> 00:03:10,567
So, you know, they need to
hold the last 24 hours of data.

62
00:03:10,947 --> 00:03:12,667
So the sonar operators can, you know.

63
00:03:12,977 --> 00:03:16,027
If every other system's down, they
can at least work with the last

64
00:03:16,027 --> 00:03:17,357
24 hours of data that they had.

65
00:03:18,177 --> 00:03:23,120
So that, you know, I guess was my
first brief introduction to local

66
00:03:23,120 --> 00:03:26,250
apps that eventually do need to do
some sort of syncing once connections

67
00:03:26,250 --> 00:03:28,960
are re established and, you know,
reconcile the changes that were made.

68
00:03:29,360 --> 00:03:30,490
That's fascinating.

69
00:03:30,510 --> 00:03:35,970
Have you learned some patterns back then
that you still see out in the wild today?

70
00:03:36,020 --> 00:03:39,380
Or do you think the approach that
you've taken back then is entirely

71
00:03:39,380 --> 00:03:40,740
different compared to today?

72
00:03:41,115 --> 00:03:44,355
Uh, Back then the data
was very well segmented.

73
00:03:44,395 --> 00:03:46,355
So there was like never really conflicts.

74
00:03:46,885 --> 00:03:48,965
But event sourcing was the main pattern.

75
00:03:49,515 --> 00:03:52,525
So just have an event log of
everything that has occurred.

76
00:03:52,935 --> 00:03:54,820
And when connectivity is
reestablished, do you know.

77
00:03:55,040 --> 00:03:57,930
Play that event log forward
from, you know, the time you lost

78
00:03:57,930 --> 00:03:59,170
connection to the current time.

79
00:03:59,597 --> 00:04:00,037
Yeah.

80
00:04:00,077 --> 00:04:04,247
I mean, event sourcing, I feel
like is still a super active topic.

81
00:04:04,257 --> 00:04:09,287
I'm actually looking into it much more
closely myself for the use case of

82
00:04:09,307 --> 00:04:14,277
building Overtone, and I'm also exploring
to embrace it a bit more for like syncing

83
00:04:14,277 --> 00:04:16,147
and migration strategy for LiveStore.

84
00:04:16,737 --> 00:04:20,587
But I also think that there might
be a renaissance for event sourcing.

85
00:04:20,597 --> 00:04:26,187
We had it for a while with Redux in a
much more self contained space, but I

86
00:04:26,207 --> 00:04:30,917
think there's a, that's an interesting
topic for research and production as well.

87
00:04:31,317 --> 00:04:34,087
Curious to hear your thoughts
on event sourcing and how it

88
00:04:34,087 --> 00:04:35,417
could fit into local-first.

89
00:04:36,431 --> 00:04:37,451
I mean, I really like it.

90
00:04:37,848 --> 00:04:41,488
, you have a source of truth,
like of all the facts, all

91
00:04:41,488 --> 00:04:42,498
the things that ever happened.

92
00:04:43,068 --> 00:04:46,885
Whereas, you know, a regular
app or state or database that

93
00:04:46,885 --> 00:04:48,025
doesn't do event sourcing.

94
00:04:48,715 --> 00:04:51,435
You're losing all the
things that happen, right?

95
00:04:51,435 --> 00:04:54,855
And you just have this snapshot of
what things are right at this instant.

96
00:04:55,355 --> 00:04:56,875
So yeah there's no way to rewind.

97
00:04:57,305 --> 00:05:01,395
There's no way to, merge other people's
changes if they did come in the past.

98
00:05:01,975 --> 00:05:05,841
, so I guess event sourcing, you know,
probably fell out of favor in some

99
00:05:05,841 --> 00:05:08,668
ways because, you know, storage
requirements, like are you really

100
00:05:08,668 --> 00:05:09,748
going to store all the event logs?

101
00:05:10,088 --> 00:05:12,718
And then also, you know,
processing those event logs.

102
00:05:12,758 --> 00:05:16,398
You do eventually need to process
them into some you know, snapshot of

103
00:05:16,398 --> 00:05:17,858
state so your app can work with it.

104
00:05:18,428 --> 00:05:20,888
So I think, you know, people seeing
that, oh, I have to process the events

105
00:05:20,888 --> 00:05:23,718
and turn it into like these tables
anyway, like why even do the event logs?

106
00:05:23,738 --> 00:05:24,968
Why not just mutate in place?

107
00:05:25,395 --> 00:05:30,426
Yeah, I think in case anyone of the
listeners is interested in local-first

108
00:05:30,426 --> 00:05:32,354
and event sourcing, please get in touch.

109
00:05:32,354 --> 00:05:37,768
I think this is a very interesting
topic to research a little bit  After

110
00:05:37,798 --> 00:05:43,568
building those apps for like submarines
which sounds super fascinating, you've

111
00:05:43,568 --> 00:05:49,288
at some point also decided to move
on and join Facebook nowadays, Meta.

112
00:05:49,738 --> 00:05:52,568
I'm curious to hear more what
you've been working on there and

113
00:05:52,568 --> 00:05:57,688
whether there was any intersection
with your local-first topics now.

114
00:05:58,072 --> 00:05:58,322
Yeah.

115
00:05:58,322 --> 00:05:59,261
So Meta,

116
00:05:59,484 --> 00:06:04,384
, I guess, like, it shaped the way I
approached the local-first problems after

117
00:06:04,384 --> 00:06:07,914
I left, but the problems there were very
, , not related to local-first at all.

118
00:06:08,084 --> 00:06:11,344
So I guess, yeah, what I worked on,
it was like three different teams, but

119
00:06:11,344 --> 00:06:12,694
they all kind of shared the same theme.

120
00:06:12,940 --> 00:06:16,130
And a lot of what I was doing, we had a
system called Download Your Information.

121
00:06:16,130 --> 00:06:19,962
So this is like any Facebook user
can go click a button and get

122
00:06:19,962 --> 00:06:21,392
an archive of all their data.

123
00:06:21,922 --> 00:06:25,442
So we'd have to like crawl
your entire Facebook graph

124
00:06:25,682 --> 00:06:27,537
and find Everything you own.

125
00:06:27,597 --> 00:06:29,037
So we call it the ownership graph.

126
00:06:29,293 --> 00:06:33,273
And then another thing one of my team's
built was the deletion framework.

127
00:06:33,463 --> 00:06:35,313
So this is like you delete your account.

128
00:06:35,523 --> 00:06:38,873
So go delete every post you've ever
uploaded, every like you've ever

129
00:06:38,873 --> 00:06:43,043
made, every comment, every like ad
you've ever run every message you've

130
00:06:43,043 --> 00:06:46,013
ever sent across all the products,
Instagram, Messenger, Facebook.

131
00:06:46,053 --> 00:06:46,943
What's I guess not WhatsApp.

132
00:06:47,205 --> 00:06:50,225
And then, yeah, so that, that was
like traversing this deletion graph.

133
00:06:50,308 --> 00:06:54,248
And then a third product, like,
variation on all these themes was

134
00:06:54,258 --> 00:06:55,958
like investigations and human review.

135
00:06:56,648 --> 00:06:59,168
So, like, obviously there's
lots of abuse on Facebook.

136
00:06:59,705 --> 00:07:03,050
And we had to have a way for, like,
every product that's ever created, like,

137
00:07:03,640 --> 00:07:08,180
Can we make their content reviewable
by like a human review team at Meta?

138
00:07:08,430 --> 00:07:12,217
So this is like, yeah, trying to fan out
from the content and traverse the graph

139
00:07:12,227 --> 00:07:13,767
of things associated with it for review.

140
00:07:14,317 --> 00:07:18,617
So I guess like, all this like
graph walking was very, goes

141
00:07:18,627 --> 00:07:22,471
very hand in hand with query
languages and also schematization.

142
00:07:23,151 --> 00:07:27,068
So, I guess I mentioned like
three different areas of like

143
00:07:27,108 --> 00:07:28,288
quote unquote compliance, right?

144
00:07:28,338 --> 00:07:32,018
Deletion, download your
information, and content review.

145
00:07:32,658 --> 00:07:35,478
Well, if you're a product team
at Meta, you don't want to

146
00:07:35,488 --> 00:07:37,668
be slowed down with having to
integrate into all these systems.

147
00:07:37,668 --> 00:07:38,588
You don't have to learn about them.

148
00:07:38,588 --> 00:07:39,638
You just want to like ship your product.

149
00:07:40,263 --> 00:07:43,923
So like our key focus was how
can we make it so developers can

150
00:07:43,923 --> 00:07:45,673
just make their product and not
have to understand these things.

151
00:07:46,303 --> 00:07:48,923
So we kind of, we built it
into the schema language.

152
00:07:49,153 --> 00:07:53,553
So as you're defining your schema for
your product, you can say like for a

153
00:07:53,553 --> 00:07:57,083
given edge, like this is an ownership
edge, or this is like a deletion edge.

154
00:07:57,483 --> 00:08:01,383
And then just like by declaratively
specifying all this stuff all these

155
00:08:01,383 --> 00:08:02,723
other systems could just work.

156
00:08:03,333 --> 00:08:03,643
And.

157
00:08:03,978 --> 00:08:06,658
Yeah, eventually when I pivoted to
local-first problem, I was like, wow, this

158
00:08:06,658 --> 00:08:12,098
like developer experience of everything
being declarative and the schema layer

159
00:08:12,098 --> 00:08:16,038
down, taking care of it for you maybe
we can do this for local-first too.

160
00:08:16,258 --> 00:08:17,438
I couldn't agree more.

161
00:08:17,448 --> 00:08:21,438
Did you eventually land on something
where you got those declarative

162
00:08:21,448 --> 00:08:24,408
benefits for some local-first problems?

163
00:08:24,696 --> 00:08:28,986
Yeah, so, the project I ended up
working on was called CR-SQLite,

164
00:08:29,496 --> 00:08:32,576
so I was, like, taking these
ideas and adding them to SQLite.

165
00:08:32,806 --> 00:08:36,836
So, yeah, SQLite's already, you know,
a relational database, and it's already

166
00:08:36,836 --> 00:08:40,486
pretty declarative in terms of you
can define your schema for your table,

167
00:08:40,486 --> 00:08:43,806
and you can specify an index, and you
can specify foreign key relations, and

168
00:08:43,806 --> 00:08:45,116
the database manages all that for you.

169
00:08:45,796 --> 00:08:49,536
But it didn't have any primitives
for collaboration, right?

170
00:08:49,566 --> 00:08:54,451
So, like What if I give you a copy of
my database and I have a copy of my

171
00:08:54,451 --> 00:08:57,661
own database and we both go offline
and we both make a bunch of changes?

172
00:08:58,071 --> 00:09:00,961
What if we want to like sync our
changes together or merge our DBs?

173
00:09:01,561 --> 00:09:04,791
So what I started adding to
SQLite was this concept of CRDTs.

174
00:09:05,481 --> 00:09:09,971
So you could say for a given table,
what type of CRDT should we modeled as?

175
00:09:10,531 --> 00:09:15,231
Like if it's a grow only set, or if
it's like an add or move set, like

176
00:09:15,291 --> 00:09:17,101
technical terms in CRDT literature.

177
00:09:17,706 --> 00:09:21,076
And then for specific columns
in the table, you could also say

178
00:09:21,076 --> 00:09:24,256
like, what sort of CRDTs should
be used to merge those columns.

179
00:09:24,946 --> 00:09:30,583
So yeah, the idea was any developer who
has an app backed by SQLite, they can

180
00:09:30,583 --> 00:09:33,083
just go in and make some schema changes.

181
00:09:33,553 --> 00:09:36,328
And then that app can become
collaborative by allowing merging

182
00:09:36,328 --> 00:09:38,213
databases with other collaborators.

183
00:09:39,027 --> 00:09:39,397
Got it.

184
00:09:39,427 --> 00:09:44,867
And I think the parallel there is that
many things about SQL can be also seen

185
00:09:44,907 --> 00:09:49,427
as rather declarative and the schema
modeling, et cetera, those is where

186
00:09:49,687 --> 00:09:53,487
I think what we gain with local-first
software is the the collaborative

187
00:09:53,487 --> 00:09:56,017
nature of the apps working together.

188
00:09:56,557 --> 00:09:57,947
And I think this is where you.

189
00:09:58,307 --> 00:10:02,657
Brought some of the nice experiences
that you've seen at Meta, where there's

190
00:10:02,667 --> 00:10:06,937
such a strong engineering culture for
having a great developer experience,

191
00:10:07,337 --> 00:10:11,717
bringing that to the new topic that
you were interested in, local-first.

192
00:10:12,440 --> 00:10:14,970
Meta had a really interesting
philosophy when I got there.

193
00:10:15,640 --> 00:10:18,430
I think like one of my first questions
I asked was like, oh, how do we like

194
00:10:18,490 --> 00:10:21,840
make sure people use our stuff that
our teams, our security infrastructure?

195
00:10:22,340 --> 00:10:25,237
And they're building things to make
sure Meta's code was more secure.

196
00:10:25,277 --> 00:10:25,987
That was my first team.

197
00:10:26,170 --> 00:10:29,803
And like, security at Lockheed
was always Everybody hated it.

198
00:10:29,803 --> 00:10:31,743
It was like, nobody wants to do it.

199
00:10:31,743 --> 00:10:33,753
And there's always this
like mandated thing.

200
00:10:33,983 --> 00:10:35,253
So I got to Facebook.

201
00:10:35,253 --> 00:10:37,853
I'm like, how are we, I'm on
security for, how are we going to

202
00:10:37,853 --> 00:10:38,973
make sure people use our stuff?

203
00:10:39,493 --> 00:10:40,493
And it was interesting.

204
00:10:40,503 --> 00:10:45,213
Like the manager at the time was just
like, yeah, like we can't do anything.

205
00:10:45,303 --> 00:10:46,490
Facebook's bottom's up.

206
00:10:46,490 --> 00:10:47,120
We can't do anything.

207
00:10:47,120 --> 00:10:48,260
Make sure anybody uses our stuff.

208
00:10:48,610 --> 00:10:53,870
The only way we can do it is make the
safe way, the default, and easiest way.

209
00:10:53,880 --> 00:10:55,930
Like, people are going to
choose the secure way because

210
00:10:55,930 --> 00:10:56,840
we've made it the easiest.

211
00:10:57,320 --> 00:10:59,570
And like, there's no other way
we can make them choose it.

212
00:10:59,980 --> 00:11:03,620
So yeah, there's a huge, always a huge
focus on everything we did from DevX.

213
00:11:03,640 --> 00:11:07,310
So, that like, developers
would just pick that solution.

214
00:11:07,390 --> 00:11:09,370
Because it was easiest, not necessarily
because it was secure, just because

215
00:11:09,370 --> 00:11:10,360
it was the easiest thing around.

216
00:11:10,460 --> 00:11:11,800
And then it was also secure.

217
00:11:12,048 --> 00:11:13,668
Which is like the nice added benefit.

218
00:11:14,085 --> 00:11:19,115
So after having worked for many years
at Facebook, you've later then started

219
00:11:19,115 --> 00:11:24,325
working on CR-SQLite, but you didn't
arrive at CR-SQLite immediately.

220
00:11:24,415 --> 00:11:28,555
I think you were for a little
bit also still working on

221
00:11:28,565 --> 00:11:32,145
bringing back an app you've been
working on before called Strut.

222
00:11:32,455 --> 00:11:36,815
So I'd be curious to hear a little
bit more about that and particularly

223
00:11:36,825 --> 00:11:41,205
given that you worked on it quite a
bit before you joined Facebook and then

224
00:11:41,295 --> 00:11:43,495
you started working on it again after.

225
00:11:43,845 --> 00:11:47,165
So tell me more about the app and
how that led you to local-first.

226
00:11:47,462 --> 00:11:47,792
Yeah.

227
00:11:47,792 --> 00:11:51,572
So it was my first, I guess,
web app I was building.

228
00:11:51,802 --> 00:11:54,282
So yeah, at Lockheed, I was doing
all these like Java rich clients,

229
00:11:54,812 --> 00:11:56,192
and I wanted to get into web dev.

230
00:11:56,392 --> 00:11:58,912
So I started building this
thing called Strut.io.

231
00:11:58,932 --> 00:12:00,112
It's a presentation editor.

232
00:12:00,682 --> 00:12:03,732
And I knew people's devices,
you know, they have storage and

233
00:12:03,732 --> 00:12:06,772
compute, and the browser had,
you know, some storage back then.

234
00:12:07,352 --> 00:12:09,302
And I didn't want to, like,
run servers for people.

235
00:12:09,732 --> 00:12:13,512
I just wanted them to be able to
work and edit presentations if

236
00:12:13,512 --> 00:12:17,332
they're online or offline and not
have to sign up for an account.

237
00:12:17,372 --> 00:12:19,402
Like you go to the site and you
can immediately start doing stuff.

238
00:12:20,032 --> 00:12:25,132
So yeah, like I guess without knowing it,
I was building a presentation editor that

239
00:12:25,132 --> 00:12:27,382
was, you know, local-first in some ways.

240
00:12:28,002 --> 00:12:33,098
And at that time, it was like 2011, 2012,
there was, you know, I guess the prequel

241
00:12:33,098 --> 00:12:37,328
to the local-first movement, there's
like remoteStorage.io and ownCloud.

242
00:12:37,798 --> 00:12:40,358
And I started getting involved with
those, but then, yeah, eventually

243
00:12:40,358 --> 00:12:43,818
I got the Facebook offer and I
just put Strut on the shelf to be

244
00:12:43,818 --> 00:12:45,228
resurrected sometime in the future.

245
00:12:45,515 --> 00:12:49,265
And then, yeah, when I was leaving
Meta or Facebook, yeah, I didn't

246
00:12:49,265 --> 00:12:50,455
know exactly what I was going to do.

247
00:12:50,605 --> 00:12:54,675
So I figured, oh, I can just resurrect
Strut.io real quick and you know,

248
00:12:54,695 --> 00:12:55,915
it'll bring in some side income.

249
00:12:56,427 --> 00:12:59,547
In the early days before I abandoned
it, it had like 10, 000 monthly actives.

250
00:12:59,737 --> 00:13:02,997
So I was like, Oh maybe a decade
later we can get back to that again.

251
00:13:03,597 --> 00:13:06,757
But yeah, as I resurrected it, kind
of the landscape had changed a bit.

252
00:13:06,837 --> 00:13:12,240
Like people had a lot more devices and
also multiplayer was an expectation where

253
00:13:12,240 --> 00:13:13,840
like, You know, Google Slides, right?

254
00:13:13,840 --> 00:13:15,310
You can have multiple
people on the same deck.

255
00:13:16,000 --> 00:13:19,180
So I was, you know, looking into how I
was going to solve that problem for Strut.

256
00:13:19,800 --> 00:13:23,943
And, you know, rather than I guess I've
been building frameworks for so long at

257
00:13:23,943 --> 00:13:29,603
Meta, like, rather than making this a one
off thing in Strut to solve the problem.

258
00:13:29,848 --> 00:13:31,768
I was like, Oh, like, can I
solve it at a lower level?

259
00:13:32,218 --> 00:13:36,928
And that's when I went down the rabbit
hole of CRDTs investigating SQLite,

260
00:13:36,968 --> 00:13:39,268
adding stuff to SQLite and so on.

261
00:13:40,195 --> 00:13:45,585
So the data architecture you had for
the first version of Strut around, like,

262
00:13:45,975 --> 00:13:50,785
yeah, before Facebook, you mentioned
like around 2012 to when you've left

263
00:13:50,785 --> 00:13:53,065
Facebook and what you've arrived at.

264
00:13:53,350 --> 00:13:54,120
Very different.

265
00:13:54,460 --> 00:13:59,240
So I'm curious, like how you went
from like whether you took rather

266
00:13:59,240 --> 00:14:05,040
some smaller steps or some much bigger
steps altogether to rethink the data

267
00:14:05,040 --> 00:14:06,840
architecture and how you went about that.

268
00:14:07,460 --> 00:14:11,730
Yeah, I mean, I guess beyond just
collaboration, I wanted to support

269
00:14:12,340 --> 00:14:14,560
Yeah, arbitrarily large presentations.

270
00:14:15,230 --> 00:14:19,620
So right, the original version of Strut,
it would load everything off disk and

271
00:14:19,620 --> 00:14:23,160
into memory, and then when you're closing
the tab, or periodically, it would save.

272
00:14:23,855 --> 00:14:27,575
A big dump of memory to disk and yeah,
I think that works fine, probably

273
00:14:27,575 --> 00:14:29,665
for the vast majority of cases.

274
00:14:29,905 --> 00:14:33,895
But I wanted to be able to handle
presentations with hundreds of slides

275
00:14:33,895 --> 00:14:38,505
and tons of binary content or like images
and videos and all sorts of things.

276
00:14:39,065 --> 00:14:42,865
So like at first I was like, okay, how do
I like lazily load this stuff off of disk?

277
00:14:43,110 --> 00:14:46,770
And, you know, as you're starting
to implement that in your own data

278
00:14:46,770 --> 00:14:49,740
model, you're like, Oh, well, this
is kind of like a database, which,

279
00:14:50,240 --> 00:14:53,620
you know, knows how to page in
and out things in and out for me.

280
00:14:54,170 --> 00:14:58,357
So I think, you know, that was also an
indication that, okay, like, something

281
00:14:58,417 --> 00:14:59,807
more like a database would be good.

282
00:15:00,297 --> 00:15:02,327
And then I don't know how I
found that SQLite was better.

283
00:15:04,167 --> 00:15:08,147
I guess I saw James Long's
Absurd-SQL and Actual Budget.

284
00:15:08,657 --> 00:15:12,687
That's when I first realized that, yeah,
like, I mean, I'd always known Wasm was

285
00:15:12,687 --> 00:15:16,467
a thing but that like SQLite was being
compiled to Wasm and could run in the

286
00:15:16,467 --> 00:15:21,407
browser and it was actually faster than
IndexedDB when you were, I guess there's

287
00:15:21,407 --> 00:15:25,990
some caveats there, but SQLite would store
to IndexedDB, but it would store pages.

288
00:15:26,140 --> 00:15:27,580
So it would store four kilobyte chunks.

289
00:15:28,110 --> 00:15:29,450
IndexedDB, when you're like.

290
00:15:29,772 --> 00:15:32,432
Reading things in and out
individually, like single objects

291
00:15:32,432 --> 00:15:33,952
by key value is like super slow.

292
00:15:34,382 --> 00:15:38,525
And the fact that like, okay, you're
using SQLite and reading in chunks

293
00:15:38,525 --> 00:15:42,425
rather than individual objects, like
it would be faster than IndexedDB.

294
00:15:42,595 --> 00:15:45,725
Like if you change your IndexedDB
access to be paged, okay, it

295
00:15:45,725 --> 00:15:46,745
could be faster in SQLite.

296
00:15:48,020 --> 00:15:51,560
SQLite, you can still do point queries and
be faster than a point query in IndexedDB.

297
00:15:52,020 --> 00:15:55,060
So that was something else that turned me
on, like, oh, maybe I should use SQLite

298
00:15:55,060 --> 00:15:58,040
in the browser rather than IndexedDB,
so I don't have to implement all this

299
00:15:58,170 --> 00:16:00,040
crazy stuff to make IndexedDB go fast.

300
00:16:00,740 --> 00:16:04,100
And then, you know, there's all the
problem of, like, collaboration still

301
00:16:04,540 --> 00:16:09,500
and my justification for picking SQLite
then was, you know, not only is SQLite

302
00:16:09,500 --> 00:16:13,910
running the browser now, if I ever want
to take this app and make it native

303
00:16:14,020 --> 00:16:18,660
well, SQLite, Everything I build for
collaboration in the SQLite, like it'll

304
00:16:18,660 --> 00:16:24,120
work on native, it'll work on desktop
it'll work in the cloud if you want

305
00:16:24,120 --> 00:16:28,510
some like cloud peer or something rather
than like a JavaScript only solution.

306
00:16:29,350 --> 00:16:34,820
So you made the decision, SQLite is the
way forward, SQLite on top of IndexDB or

307
00:16:34,830 --> 00:16:39,490
on top of like some persistence mechanism,
but you now also need collaboration.

308
00:16:39,510 --> 00:16:42,660
You looked a little bit into
CRDTs, but you still decided

309
00:16:42,970 --> 00:16:46,860
instead of going CRDTs first in
terms of the developer experience.

310
00:16:46,860 --> 00:16:52,530
You still want to give a developer
yourself SQLite as the primary way to

311
00:16:52,530 --> 00:16:57,450
think about the developer experience,
but you started to use CRDTs or you

312
00:16:57,450 --> 00:17:02,490
wanted to use CRDTs as a implementation
detail to bring into SQLite.

313
00:17:02,500 --> 00:17:03,670
So how did you go about that?

314
00:17:03,690 --> 00:17:08,490
How did you solve that puzzle of making
SQLite collaborative out of the box?

315
00:17:09,285 --> 00:17:12,965
Yeah, so SQLite has a bunch of
mechanisms for extending it.

316
00:17:13,242 --> 00:17:15,832
You can make new virtual tables.

317
00:17:16,208 --> 00:17:18,918
yeah, a virtual table
is essentially a table.

318
00:17:18,928 --> 00:17:24,483
It presents itself as a table in SQLite,
except that The implementation of that

319
00:17:24,483 --> 00:17:28,473
table is any arbitrary code you want to
write, so C code, Rust code, whatever.

320
00:17:28,760 --> 00:17:32,930
You can make functions, new functions,
like a mat, you know, like min or max

321
00:17:32,930 --> 00:17:34,220
or whatever function you want to define.

322
00:17:34,690 --> 00:17:37,960
You can make new virtual file
systems, so it's pretty extensible

323
00:17:38,120 --> 00:17:39,440
from the plugin side of things.

324
00:17:39,807 --> 00:17:43,990
And yeah, one of the teams I had worked
with at Meta, they built this thing called

325
00:17:44,020 --> 00:17:48,347
osquery way back, I think, You know,
some other company may have seen osquery

326
00:17:48,367 --> 00:17:52,820
now, but it basically lets you query all
details about your OS through SQLite,

327
00:17:53,090 --> 00:17:55,490
and they did that through virtual tables.

328
00:17:55,683 --> 00:17:57,723
So yeah, for adding CRDTs to SQLite,

329
00:17:57,990 --> 00:18:01,200
it was kind of a combination of
a bunch of special functions.

330
00:18:01,805 --> 00:18:06,185
That like when you create a
table so, plus special functions

331
00:18:06,185 --> 00:18:07,145
and special virtual tables.

332
00:18:07,645 --> 00:18:11,055
So you can like, create a table, and
then you can call a function that

333
00:18:11,055 --> 00:18:16,295
converts that table to a CRR, or
Conflict-Free Replicated Relation.

334
00:18:16,735 --> 00:18:19,575
So just a fancy name for
a table that's a CRDT.

335
00:18:20,215 --> 00:18:22,875
Or you could use this like virtual
table syntax where you're like, I want

336
00:18:22,875 --> 00:18:30,222
to create this table as a CRR and then
you list all the columns and in the

337
00:18:31,012 --> 00:18:34,632
columns, so when you're creating a
virtual table, like you can customize that

338
00:18:34,632 --> 00:18:36,452
syntax all you want in your extension.

339
00:18:37,022 --> 00:18:40,062
So I could like have a
slot for custom data types.

340
00:18:40,712 --> 00:18:42,402
So you can say this is a string.

341
00:18:42,402 --> 00:18:46,082
And then like another addition to
the data type is like the CRDT type.

342
00:18:46,702 --> 00:18:50,832
So it's not only text, it's like a
perryText, or it's like, not just an

343
00:18:50,842 --> 00:18:53,062
int, it's an int that's a counter.

344
00:18:53,512 --> 00:18:56,132
Or it's not just a date, it's
a date that slash right wins.

345
00:18:56,812 --> 00:19:00,052
Yeah, and then and then I guess the
last thing, right, like, it's great

346
00:19:00,062 --> 00:19:06,042
to have a database that's, you know, a
CRDT but how do you merge efficiently?

347
00:19:06,632 --> 00:19:11,942
. There had been some earlier work, actually
by a guy named Iver and we actually worked

348
00:19:11,942 --> 00:19:14,882
together pretty early on, on CR-SQLite.

349
00:19:15,222 --> 00:19:20,572
He did his master's thesis on essentially
adding CRDTs to SQLite but some of the,

350
00:19:20,572 --> 00:19:24,042
like, Downsides of like that original
approach was there's, there was no

351
00:19:24,042 --> 00:19:25,512
way to incrementally merge databases.

352
00:19:25,562 --> 00:19:29,312
You had to literally send the
entire copy of the SQLiteDB to

353
00:19:29,312 --> 00:19:32,032
somebody else, and it would like
scan every row and merge them all.

354
00:19:32,622 --> 00:19:34,972
So yeah, one of the key problems
we had to solve was like how do we

355
00:19:34,972 --> 00:19:41,267
incrementally sync And that was done
essentially like implementing a global

356
00:19:41,277 --> 00:19:45,557
version that gets tagged on every row
and a virtual table that knows how

357
00:19:45,557 --> 00:19:50,927
to find all the rows past the given
version and just send you those to sync.

358
00:19:51,290 --> 00:19:58,150
Was there any prior art to combining
a relational database and CRDTs or

359
00:19:58,150 --> 00:20:00,280
another mechanism to synchronize?

360
00:20:00,923 --> 00:20:05,523
Yeah, so there's Iver's work, and
then, I guess, James Long's work on

361
00:20:05,603 --> 00:20:08,968
well, Absurd-SQL, but I guess Yeah, I
don't know if he ever gave a name to

362
00:20:08,968 --> 00:20:10,728
the layer above it that did the CRDTs.

363
00:20:10,948 --> 00:20:15,518
But yeah, James work, if I recall
correctly, was all in JavaScript.

364
00:20:15,798 --> 00:20:17,668
So it wasn't built into SQLite itself.

365
00:20:17,688 --> 00:20:20,088
You couldn't just run it
anywhere SQLite could run.

366
00:20:20,308 --> 00:20:21,898
It had to be in a JavaScript environment.

367
00:20:22,508 --> 00:20:23,858
And he took a different approach.

368
00:20:24,108 --> 00:20:29,058
I think he used hybrid
logical clocks and Merkle DAG.

369
00:20:29,058 --> 00:20:31,798
So yeah, I guess we started
getting some trade offs of like,

370
00:20:32,118 --> 00:20:33,658
Strut was the first use case.

371
00:20:34,028 --> 00:20:37,328
Which actually informed a lot of design
decisions for CR-SQLite and why I

372
00:20:37,328 --> 00:20:41,658
didn't want to use something like James
Long's approach or, I don't know other

373
00:20:41,808 --> 00:20:42,968
solutions that were available at the time.

374
00:20:43,818 --> 00:20:44,078
Yeah.

375
00:20:44,078 --> 00:20:48,138
I'm curious to hear more about
your your judgment of the trade

376
00:20:48,138 --> 00:20:51,868
offs and what made it a good versus
not so good fit for your use case.

377
00:20:52,442 --> 00:20:52,582
Yeah.

378
00:20:52,582 --> 00:20:55,612
So I think James's approach yeah, I hope
I'm remembering everything correctly.

379
00:20:55,632 --> 00:20:58,772
This was like a couple of years ago,
but when I looked into it, every

380
00:20:58,932 --> 00:21:01,042
single message for all time was stored.

381
00:21:01,602 --> 00:21:03,442
So this is, you know, kind of
like event sourcing, right?

382
00:21:03,942 --> 00:21:08,962
But like, one of my key requirements,
like, I was using Strut to drive

383
00:21:08,962 --> 00:21:10,302
requirements for CR-SQLite.

384
00:21:10,922 --> 00:21:14,372
And I wanted the ability for people
to be offline as long as they wanted.

385
00:21:14,782 --> 00:21:18,172
For it to be able to run on
constrained, low memory devices.

386
00:21:18,937 --> 00:21:22,307
And so those two in combination with
the fact that it's like a presentation

387
00:21:22,307 --> 00:21:26,247
editor where you can like, you can drag
and drop components, you can spin them,

388
00:21:26,247 --> 00:21:31,667
rotate them type, like I didn't want
every single event for every keystroke

389
00:21:31,667 --> 00:21:34,877
and every drag and all these things
to be logged for all time, right?

390
00:21:35,427 --> 00:21:37,317
Especially if you're offline indefinitely.

391
00:21:37,637 --> 00:21:41,527
So if you're offline for like, I
don't know, a couple days editing

392
00:21:41,527 --> 00:21:43,877
a really big presentation, you
come back offline, I want it to

393
00:21:43,887 --> 00:21:45,107
be able to sync almost instantly.

394
00:21:45,797 --> 00:21:50,527
So the approach I chose was the set of
CRDTs that allow you to collapse history.

395
00:21:51,147 --> 00:21:56,617
So, yeah, if you've been off The
database size never grows more than

396
00:21:56,617 --> 00:21:58,467
a constant factor of the base data.

397
00:21:59,057 --> 00:22:01,737
So no matter how much you're editing,
you're always going to be within a

398
00:22:01,737 --> 00:22:04,067
constant factor of the regular DB size.

399
00:22:04,687 --> 00:22:07,127
I don't know, I think that constant
factor was like three or four times,

400
00:22:07,677 --> 00:22:08,757
I don't recall the exact number.

401
00:22:09,277 --> 00:22:12,237
Whereas the other systems, if you're
the longer offline, the more and

402
00:22:12,237 --> 00:22:15,067
more the data grew which was, you
know, not a trade off I liked.

403
00:22:15,537 --> 00:22:18,490
Yeah, I guess the downside to my
approach is there's a few less

404
00:22:18,510 --> 00:22:21,980
transactional guarantees and you
can't, you know, rewind history.

405
00:22:22,550 --> 00:22:24,720
But if somebody really wanted
rewinding history, I was like, Oh,

406
00:22:24,750 --> 00:22:26,250
they could do it in user space.

407
00:22:26,350 --> 00:22:27,680
Like, that's not the
problem I'm trying to solve.

408
00:22:28,127 --> 00:22:29,187
That makes perfect sense.

409
00:22:29,217 --> 00:22:34,697
And I think I really like, you focusing
on the use case that you're most familiar

410
00:22:34,697 --> 00:22:38,537
with Strut and the requirements around
that,  Since we're dealing with like

411
00:22:38,537 --> 00:22:43,293
distributed systems and programming
is hard enough, that you can't really

412
00:22:43,423 --> 00:22:46,043
focus on all use cases at the same time.

413
00:22:47,043 --> 00:22:50,913
And this way you design for
the use case that you know of.

414
00:22:51,183 --> 00:22:55,203
And I think this is where you have like
a canvas app or something like that

415
00:22:55,213 --> 00:22:57,653
where you move things around constantly.

416
00:22:58,033 --> 00:23:02,683
If you work for multiple days, you
don't really care exactly about

417
00:23:02,693 --> 00:23:06,273
the change you've made over like
five seconds, three days ago.

418
00:23:06,493 --> 00:23:10,703
You just care about, okay, this is
the state it is in right now and that

419
00:23:10,723 --> 00:23:13,173
other users converge to the same thing.

420
00:23:13,643 --> 00:23:19,378
Whereas I think a different set of
requirements might be for a chat app

421
00:23:19,728 --> 00:23:24,618
where you might not send around, like,
which sort of keystrokes someone has

422
00:23:25,198 --> 00:23:27,088
taken to arrive at the final message.

423
00:23:27,088 --> 00:23:30,158
You send the final message, but
then those final messages, you

424
00:23:30,158 --> 00:23:32,868
might actually care about, okay,
this came at this point in time.

425
00:23:32,868 --> 00:23:35,828
One message does rarely
override another one.

426
00:23:35,958 --> 00:23:39,238
So this is, I think, where
that might be a better fit.

427
00:23:39,418 --> 00:23:44,378
And that makes also sense that in James
use case with Actual Budget that's all

428
00:23:44,378 --> 00:23:46,203
about, like, historic transactions.

429
00:23:46,223 --> 00:23:49,393
You want exactly those
like records and time.

430
00:23:49,633 --> 00:23:52,643
There's probably not so many,
and even if there are many,

431
00:23:52,793 --> 00:23:53,973
they're all kind of worth it.

432
00:23:54,513 --> 00:23:55,633
They need to be accurate.

433
00:23:56,263 --> 00:23:59,643
So I think that makes perfect sense, and
I think there will be like many different

434
00:23:59,643 --> 00:24:01,613
solutions for many different use cases.

435
00:24:02,133 --> 00:24:06,128
And I think there's even, you know, an
approach how you could apply a bit of like

436
00:24:06,128 --> 00:24:08,728
that compaction, even to event sourcing.

437
00:24:08,778 --> 00:24:12,527
I think there's like a concept
called subsumption, I think.

438
00:24:12,947 --> 00:24:16,707
And that allows you to also like
flatten some of those historic events.

439
00:24:16,997 --> 00:24:20,347
So that's super interesting
topic of research, I think.

440
00:24:20,813 --> 00:24:23,813
. Yeah, I know there for certain
CRDT algorithms and certain

441
00:24:23,823 --> 00:24:25,473
event sourcing things, like.

442
00:24:26,068 --> 00:24:29,858
Yeah, there are ways to compact
history, but it did seem like, still

443
00:24:29,858 --> 00:24:31,978
a pretty evolving area of research.

444
00:24:32,508 --> 00:24:35,238
I think now it's a lot more
understood especially in the,

445
00:24:35,238 --> 00:24:37,738
like, list based CRDT space.

446
00:24:37,878 --> 00:24:41,318
Sure, like, I don't know, I guess,
Yjs kind of pioneered a lot of

447
00:24:41,318 --> 00:24:44,508
that work of, like, run length
encoding, everything, and then stuff.

448
00:24:45,188 --> 00:24:47,908
Yeah, I guess what I want to
quickly mention, so there's, when

449
00:24:47,908 --> 00:24:50,952
I was first getting started, there
was a community called Braid.org

450
00:24:51,452 --> 00:24:53,622
which I got involved with early on.

451
00:24:54,132 --> 00:24:56,932
And yeah, I don't know, that was
an amazing community in terms

452
00:24:56,932 --> 00:24:59,182
of like learning and being open.

453
00:24:59,322 --> 00:25:03,232
So I just want to call them out, like
they helped me a ton to understand

454
00:25:03,312 --> 00:25:07,132
all these CRDT algorithms and when
history can be compacted, when it can't.

455
00:25:07,827 --> 00:25:11,847
What kind of weird circumstances you
get into if you know you're emerging

456
00:25:11,847 --> 00:25:14,517
changes peer-to-peer versus, you
know, client server and whatnot.

457
00:25:14,987 --> 00:25:15,437
Yeah.

458
00:25:15,527 --> 00:25:19,657
There this entire community, I'm
not yet a part of, but given that

459
00:25:19,657 --> 00:25:24,517
this year I'm also looking much more
into implementing, syncing this.

460
00:25:24,577 --> 00:25:28,347
I feel like there's a lot I have
to learn about and I would love

461
00:25:28,352 --> 00:25:31,627
to also have some folks from that
community here on the podcast.

462
00:25:32,297 --> 00:25:33,892
Yeah they're trying to add.

463
00:25:34,087 --> 00:25:40,947
Syncing to the HTTP protocol, so
essentially adding a new verb rather

464
00:25:40,947 --> 00:25:46,070
than get, put, and post to allow you to
merge documents or resources together.

465
00:25:46,602 --> 00:25:47,972
That sounds fascinating.

466
00:25:48,002 --> 00:25:49,672
How far along is that effort?

467
00:25:49,672 --> 00:25:52,595
They are part of IETF now.

468
00:25:52,955 --> 00:25:56,385
I don't remember if they're a research,
there's like two separate types of groups.

469
00:25:56,385 --> 00:25:59,448
There's like research groups and
maybe there's standards groups.

470
00:25:59,448 --> 00:26:00,528
I think those are the two.

471
00:26:01,108 --> 00:26:03,858
Yeah, I don't know if they're
still in the research phase or if

472
00:26:03,858 --> 00:26:04,953
they're on the standards track yet.

473
00:26:05,263 --> 00:26:06,113
Very interesting.

474
00:26:06,163 --> 00:26:07,623
I have to look more into that.

475
00:26:08,153 --> 00:26:14,823
So, besides choosing the right CRDTs
to fit into SQLite through virtual

476
00:26:14,823 --> 00:26:21,063
tables I think this, like, just picking
SQLite to run in the browser quite a

477
00:26:21,063 --> 00:26:25,363
few years ago, that's probably you've
must have been running into a bunch of

478
00:26:25,503 --> 00:26:27,663
challenges and issues along the way.

479
00:26:27,683 --> 00:26:31,823
So I'm curious to hear first getting
sort of like a broad overview of

480
00:26:31,823 --> 00:26:34,923
the different challenges you were
running into and then going into them.

481
00:26:35,310 --> 00:26:39,620
Yeah, I think I wasted months of my
life, like fighting SQLite issues.

482
00:26:39,980 --> 00:26:44,230
Cause yeah, it was really early where
the official SQLite WASM project

483
00:26:44,350 --> 00:26:45,500
had kind of like just started.

484
00:26:46,040 --> 00:26:50,587
And I guess some of the first
problems I ran into was.

485
00:26:51,058 --> 00:26:54,938
I guess compiling an
extension into the WASM build.

486
00:26:55,478 --> 00:26:57,818
So yeah, it's one thing to
write a SQLite extension.

487
00:26:57,958 --> 00:27:02,528
It's another thing to get
it to run correctly in WASM.

488
00:27:03,288 --> 00:27:06,708
It wasn't so hard when the extension
was originally written in C, but like,

489
00:27:06,858 --> 00:27:10,578
I don't know, like trying to code review
contributions in C and make sure there's

490
00:27:10,578 --> 00:27:13,068
no like memory issues was a nightmare.

491
00:27:13,178 --> 00:27:16,952
So I eventually like bit the bullet,
started learning Rust and re implemented

492
00:27:16,982 --> 00:27:21,282
the extension in Rust, which I think
was, yeah, it became like a third

493
00:27:21,392 --> 00:27:24,952
or two thirds less code and just way
faster to write after doing that.

494
00:27:24,962 --> 00:27:25,882
I think it was worth the time.

495
00:27:26,477 --> 00:27:29,677
But that introduced this whole new
complication of like getting the

496
00:27:29,677 --> 00:27:33,937
Rust compiler toolchain to compile an
extension that can be linked to the

497
00:27:33,967 --> 00:27:39,237
SQLite compiler toolchain that then goes
through Emscripten to build a Wasm bundle.

498
00:27:39,697 --> 00:27:41,807
And let me see what else was there.

499
00:27:42,447 --> 00:27:45,317
Transactions was an interesting thing.

500
00:27:45,647 --> 00:27:49,577
So the inner, the originally
the interfaces to SQLite

501
00:27:49,617 --> 00:27:51,137
in the browser were async.

502
00:27:51,647 --> 00:27:55,492
And if you ever had Well, I guess
there's two builds of SQLite.

503
00:27:55,522 --> 00:27:57,362
The official SQLite build
was always synchronous.

504
00:27:57,462 --> 00:28:02,212
There's this unofficial build, which I
think is better, called wa-sqlite, which

505
00:28:02,212 --> 00:28:03,432
all the interfaces were asynchronous.

506
00:28:03,942 --> 00:28:07,732
But it had these interesting caveats,
like if you await two calls to

507
00:28:07,772 --> 00:28:09,372
SQLite at once, it would deadlock.

508
00:28:09,578 --> 00:28:13,738
So you had to make sure like there's
only one Call to SQLite ever at once.

509
00:28:14,298 --> 00:28:16,148
So, you know, working
around that challenge.

510
00:28:16,265 --> 00:28:20,915
Transaction interleaving was another,
so like if it's asynchronous and you

511
00:28:20,915 --> 00:28:24,165
start a transaction and you have all
the statements in the transaction, okay.

512
00:28:24,175 --> 00:28:26,995
You, you began the transaction
and then you like do a read,

513
00:28:26,995 --> 00:28:27,825
you're waiting the read.

514
00:28:28,165 --> 00:28:30,895
Well, while you're waiting in the read,
you delegate control of the event loop,

515
00:28:31,255 --> 00:28:35,795
so somebody else could come in and,
you know, kick off a new task, right?

516
00:28:36,235 --> 00:28:38,605
A new statement to do
a write or something.

517
00:28:39,105 --> 00:28:41,625
Well, now  that's somebody else's
transaction that's trying to run while

518
00:28:41,625 --> 00:28:44,585
your other transaction is running, so
you get this, like, weird interleaving.

519
00:28:44,615 --> 00:28:48,195
So, yeah trying to lock out transactions
in the JavaScript layer that was

520
00:28:48,195 --> 00:28:49,375
another annoying thing to deal with.

521
00:28:49,450 --> 00:28:55,598
And then, yeah, this is not related
to WASM, but just getting SQLite

522
00:28:56,288 --> 00:28:59,098
with the extension built for all
the platforms I wanted to target.

523
00:28:59,331 --> 00:29:04,261
So, like, eventually Expo wants to
include it in their Expo SDK, so figuring

524
00:29:04,261 --> 00:29:09,051
out how to do Android and iOS builds in
addition to, you know, the WASM build.

525
00:29:09,886 --> 00:29:11,796
And Mac and Linux builds.

526
00:29:12,325 --> 00:29:18,045
Those sound like quite the buffet of
different challenges that some sounds

527
00:29:18,325 --> 00:29:23,815
familiar to me as well as I think some
you've been pioneering a tad earlier.

528
00:29:23,855 --> 00:29:26,975
And by now, some of those
are maybe a bit more.

529
00:29:27,150 --> 00:29:31,240
A well trodden path, but I
think there's probably even more

530
00:29:31,260 --> 00:29:32,900
that we don't understand yet.

531
00:29:33,670 --> 00:29:36,630
One more, I know one more interesting
one I think more interesting than

532
00:29:36,630 --> 00:29:41,680
the other ones is since SQLite
in the browser, at least the one

533
00:29:41,680 --> 00:29:43,430
I'm using, is IndexedDB backed.

534
00:29:43,860 --> 00:29:48,200
IndexedDB has a very high cost
for opening a transaction which

535
00:29:48,200 --> 00:29:49,290
I was unaware of at the time.

536
00:29:49,850 --> 00:29:51,620
So yeah, there's a lot of work.

537
00:29:51,862 --> 00:29:55,642
And anytime you open a SQLite transaction,
it would open an IndexedDB transaction.

538
00:29:56,339 --> 00:30:00,669
So there's a lot of work making sure
that, like, for reads Yeah, normally

539
00:30:00,669 --> 00:30:03,109
you just issue reads, you wouldn't
necessarily batch them into a transaction.

540
00:30:03,119 --> 00:30:05,479
Like normally, you know, you'd batch
your writes into a transaction.

541
00:30:06,139 --> 00:30:09,609
But yeah, to make SQLite in the browser
go fast if you're doing a whole bunch

542
00:30:09,609 --> 00:30:14,439
of reads, like making sure a transaction
was open automatically all the reads

543
00:30:14,439 --> 00:30:16,839
would happen and then the transaction
would close once all the reads are done.

544
00:30:16,959 --> 00:30:20,989
So it's like way of batching that was like
a 10x speed up in the browser for SQLite.

545
00:30:21,408 --> 00:30:26,808
So once you've solved all of those
performance challenges and also like

546
00:30:26,858 --> 00:30:30,638
correctness, syncing challenges,
et cetera, then it's also needs

547
00:30:30,638 --> 00:30:34,738
to be competitive somehow with
the typical developer experience

548
00:30:34,758 --> 00:30:39,518
you have in a React app or in a
different framework you're using.

549
00:30:40,238 --> 00:30:41,638
How did you go about designing that?

550
00:30:42,132 --> 00:30:47,415
Yeah, so, yeah, I've developed a set of
hooks, so, I don't know, it's, I guess

551
00:30:47,415 --> 00:30:51,349
like the React integration was probably
the least amount of, I don't know,

552
00:30:51,359 --> 00:30:52,349
I'd say it's the least amount of work.

553
00:30:52,389 --> 00:30:55,079
Maybe not once I like think about
all the strict code problems and

554
00:30:55,079 --> 00:30:58,669
stuff but yeah, essentially the
DevEx was like a use query hook.

555
00:30:59,159 --> 00:31:03,079
So I think it's a lot easier to
develop when your queries are

556
00:31:03,109 --> 00:31:04,219
co located with your components.

557
00:31:04,540 --> 00:31:07,770
So every component is responsible
for getting the data it needs.

558
00:31:08,051 --> 00:31:10,651
And this makes your app like
super composable, right?

559
00:31:10,651 --> 00:31:15,331
Like you can add and remove components
to your app or your component tree.

560
00:31:15,601 --> 00:31:18,361
And because they're responsible
for getting their data, like, you

561
00:31:18,361 --> 00:31:20,831
don't have to do any additional
wiring or prop drilling.

562
00:31:20,831 --> 00:31:26,501
And yeah, they're not going to break
other components because suddenly,

563
00:31:26,881 --> 00:31:28,261
Some data dependencies are missing.

564
00:31:28,931 --> 00:31:29,771
So I really like that, yeah.

565
00:31:29,831 --> 00:31:33,831
Co located queries, so every component
you could use this hook called UseQuery.

566
00:31:34,491 --> 00:31:39,051
And in that UseQuery, you just
literally write your SQL that fetches

567
00:31:39,051 --> 00:31:40,671
the data required by that component.

568
00:31:41,221 --> 00:31:44,301
And that SQL could be as complex
or as simple as you wanted.

569
00:31:44,891 --> 00:31:49,181
So, you know, like, select from issue
where id equals issue id or something

570
00:31:49,181 --> 00:31:52,731
simple or, you know, something
complex like select star from issue,

571
00:31:52,741 --> 00:31:57,721
order by modify, join labels join
owner, and yeah, all these things.

572
00:31:58,251 --> 00:32:03,661
So you've managed to tame most challenges
with SQLite in the browser, making sure

573
00:32:03,661 --> 00:32:08,021
it works in Wasm, making sure that your
own extension is working, et cetera.

574
00:32:08,441 --> 00:32:14,451
And SQLite can be super, super fast
for most workloads, but once your

575
00:32:14,461 --> 00:32:19,211
cardinality grows, et cetera, your
tables get really big then queries can

576
00:32:19,221 --> 00:32:21,716
also start Taking a little bit longer.

577
00:32:22,136 --> 00:32:26,516
And that is because SQLite always kind
of, besides a little bit of caching,

578
00:32:26,756 --> 00:32:31,066
always starts with your results for
your queries from scratch, kind of.

579
00:32:31,456 --> 00:32:35,166
And you know, I've been chatting about
this for quite a while back then.

580
00:32:35,466 --> 00:32:37,856
This is how I know about
all of this context.

581
00:32:37,886 --> 00:32:39,636
And that has led you to look into.

582
00:32:40,006 --> 00:32:43,436
incrementally maintaining the
query results and the views.

583
00:32:43,756 --> 00:32:46,756
So I'm very curious to hear more
what you've learned back then

584
00:32:46,766 --> 00:32:48,896
and which path this led you down.

585
00:32:49,486 --> 00:32:49,936
Yeah.

586
00:32:50,196 --> 00:32:50,336
Yeah.

587
00:32:50,336 --> 00:32:53,286
So I guess, you know, I was talking
earlier about each component

588
00:32:53,336 --> 00:32:56,326
fetches its own data, has this
use query hook, or just, you know,

589
00:32:56,326 --> 00:32:57,926
specifies a SQL query it's running.

590
00:32:58,456 --> 00:33:02,096
So when you're developing like a rich
client side application like this,

591
00:33:02,646 --> 00:33:04,796
ideally your database is reactive, right?

592
00:33:04,796 --> 00:33:06,596
So any, anytime some data changes.

593
00:33:07,146 --> 00:33:09,066
It's like you, you work
on Overtone, right?

594
00:33:09,076 --> 00:33:09,756
The music app.

595
00:33:10,316 --> 00:33:16,386
If somebody presses play, then, you
know, all the components that care about

596
00:33:16,386 --> 00:33:18,166
play state need to immediately update.

597
00:33:18,906 --> 00:33:22,591
And yeah, as you're saying in SQLite, if
you're doing this somebody changes some

598
00:33:22,591 --> 00:33:27,588
state, well, you know, SQLite doesn't have
any, you know, facilities for reactivity.

599
00:33:27,648 --> 00:33:29,368
It has a few but they're not very good.

600
00:33:29,868 --> 00:33:32,388
Like they don't give you many fine
grained notions about what changed

601
00:33:32,388 --> 00:33:33,618
or like what queries it impacts.

602
00:33:34,168 --> 00:33:37,678
So what you're left doing is rerunning.

603
00:33:38,098 --> 00:33:41,598
Essentially from scratch the queries
that could have been affected.

604
00:33:42,268 --> 00:33:46,018
So, like, the SQLite change notifications,
they'll tell you the row ID that

605
00:33:46,018 --> 00:33:48,818
changed, but not the contents, and
they'll tell you, like, the table.

606
00:33:49,228 --> 00:33:52,528
So it's usually not enough to figure
out exactly which queries to invalidate.

607
00:33:52,879 --> 00:33:55,619
And yeah, so you rerun tons
of queries throughout the app.

608
00:33:56,209 --> 00:34:00,079
And those are all rerunning from
scratch, so like, if you have, like,

609
00:34:00,079 --> 00:34:03,779
for your track list, you're showing a
few hundred tracks, and that track list

610
00:34:03,779 --> 00:34:05,589
has to do a number of joins, right?

611
00:34:05,589 --> 00:34:08,919
Has to, for a track, has to join the
album to get the album title, has to

612
00:34:08,919 --> 00:34:11,949
join the artist to get the artist names.

613
00:34:12,569 --> 00:34:13,689
Yeah, I don't know if
there's any other joins.

614
00:34:13,719 --> 00:34:18,119
Maybe there could be joins about,
like, like status, or favorite status,

615
00:34:18,289 --> 00:34:19,539
or something about the track, right?

616
00:34:20,209 --> 00:34:25,049
And, yeah, rerunning that join to
select, you know, 200 some items.

617
00:34:25,509 --> 00:34:28,789
And yeah, maybe they just want to
sort, and sorting it doing that from

618
00:34:28,789 --> 00:34:32,909
scratch every time somebody mutates
something severely limits I guess how

619
00:34:32,909 --> 00:34:34,149
fast you can interact with the app.

620
00:34:34,749 --> 00:34:38,499
And yeah, I'd worked, tried a
bunch with like, okay, you know,

621
00:34:38,499 --> 00:34:41,539
maybe a purely in memory SQLite,
and that gets you pretty fast.

622
00:34:42,099 --> 00:34:45,119
But yeah, somebody was saying,
oh, like, The goal for a reactive

623
00:34:45,119 --> 00:34:46,419
database is it's memory fast.

624
00:34:47,099 --> 00:34:50,669
And then when I started, like, I
don't know, that idea, a lot just

625
00:34:50,669 --> 00:34:54,049
helped my brain and I started like
actually benchmarking, you know,

626
00:34:54,729 --> 00:34:57,399
if I'm just observing a value
in JavaScript, how fast is that?

627
00:34:57,809 --> 00:35:02,699
And if I'm like, Observing a
query in SQLite and updating the

628
00:35:02,699 --> 00:35:03,909
row and re running the query.

629
00:35:03,909 --> 00:35:04,679
Like, what is that?

630
00:35:04,689 --> 00:35:08,109
And I have a observable notebook
somewhere that like compares all this

631
00:35:08,109 --> 00:35:09,589
and like the difference was massive.

632
00:35:09,989 --> 00:35:13,079
So like just seeing how far I was
from memory fast, like started

633
00:35:13,089 --> 00:35:15,589
making me a bit sad on SQLite.

634
00:35:15,819 --> 00:35:20,919
And also like benchmarking the original
strut and the new one, which use

635
00:35:20,919 --> 00:35:22,569
SQLite and these use query hooks.

636
00:35:23,099 --> 00:35:27,239
Yeah, like the original one I could
throttle my Chrome, you know, but in

637
00:35:27,249 --> 00:35:31,769
the dev tools you can like downgrade
your CPU right to like 6x slower or

638
00:35:31,769 --> 00:35:35,449
something and I could go all the way down
to the slowest possible and everything

639
00:35:35,449 --> 00:35:40,839
was buttery smooth, but in the new
one, like if I downgraded it too much,

640
00:35:40,849 --> 00:35:42,289
like you could see some visible lag.

641
00:35:42,349 --> 00:35:43,069
SQLite

642
00:35:45,129 --> 00:35:49,399
is fundamentally built in this
request response style, right?

643
00:35:50,389 --> 00:35:55,079
This is the era of the LAMP stack
where, you know, databases request

644
00:35:55,079 --> 00:35:56,129
response made sense, right?

645
00:35:56,189 --> 00:35:59,839
A user would go to the website
it would do a query against the

646
00:35:59,839 --> 00:36:01,099
DB and it would render, right?

647
00:36:01,099 --> 00:36:03,179
There was no, like, rich interactivity.

648
00:36:03,179 --> 00:36:04,549
It was always a full
page refresh every time.

649
00:36:04,639 --> 00:36:05,919
So request response made sense.

650
00:36:06,449 --> 00:36:09,169
Yeah, but for these rich apps, request
response no longer makes sense.

651
00:36:09,169 --> 00:36:13,859
And trying to fit a request response
sort of DB designed around that,

652
00:36:13,859 --> 00:36:15,789
it into this reactive scenario.

653
00:36:15,819 --> 00:36:19,239
And I think maybe one day
somebody can get there.

654
00:36:19,289 --> 00:36:23,309
But I think right now it didn't seem
to make much sense to me and the juice

655
00:36:23,309 --> 00:36:24,539
didn't seem to be worth the squeeze.

656
00:36:25,149 --> 00:36:29,829
So I started, I guess my love affair with
SQLite has slowly been coming to a close.

657
00:36:30,249 --> 00:36:32,799
And I started, you know,
investigating other projects.

658
00:36:33,369 --> 00:36:36,249
Yeah maybe one, one day
port it if there's time or.

659
00:36:36,723 --> 00:36:39,383
porting some of these projects
back to SQLite and contributing

660
00:36:39,616 --> 00:36:42,456
incremental data flow, differential
data flow to them or something.

661
00:36:43,146 --> 00:36:45,026
But yeah, I guess I should describe
what these projects are, right?

662
00:36:45,719 --> 00:36:51,499
, So I like the idea of queries, being able
to like, Issue a complex declarative query

663
00:36:52,229 --> 00:36:56,669
against some set of relational data to
get back the data you need for your view.

664
00:36:56,853 --> 00:37:00,403
So I wanted to like, how can
I , incrementally maintain an

665
00:37:00,413 --> 00:37:01,963
arbitrary query against some data?

666
00:37:02,213 --> 00:37:06,268
So like that track list example
you query the, Tracks and their

667
00:37:06,278 --> 00:37:09,378
artists, and their albums, and the
play state, and all these things.

668
00:37:10,058 --> 00:37:13,358
How can we, when somebody does
it right, rather than re running

669
00:37:13,358 --> 00:37:17,738
the query to get the track list
it knows exactly which queries.

670
00:37:17,938 --> 00:37:22,298
Should be invalidated by that write,
and rather than re running them,

671
00:37:22,298 --> 00:37:26,784
knows exactly how to patch up the
rows that are impacted by the write.

672
00:37:27,193 --> 00:37:31,680
So, you know, at first this problem
seems absurdly complex and hard.

673
00:37:32,100 --> 00:37:34,730
And you're like, Oh, yeah, I
want to like invalidate and

674
00:37:34,860 --> 00:37:36,870
patch up any arbitrary SQL query.

675
00:37:37,171 --> 00:37:39,171
But yeah, I started
reading some of the papers.

676
00:37:39,171 --> 00:37:40,641
There's like a paper called DBSP.

677
00:37:41,111 --> 00:37:43,071
It's a differential data flow paper.

678
00:37:43,761 --> 00:37:47,101
I think Materialize is based on that.

679
00:37:47,781 --> 00:37:49,321
And then, yeah, started implementing it.

680
00:37:49,361 --> 00:37:52,738
And then I realized like, I don't know,
this, once you've read it and started

681
00:37:52,738 --> 00:37:56,738
implementing it and you realize like,
this is very similar to other stream

682
00:37:56,738 --> 00:38:02,205
processing stuff , like, and other query,
you know, builders I've done,, right,

683
00:38:02,245 --> 00:38:06,686
where rather than, you know, you create
a series of operations and rather than

684
00:38:06,686 --> 00:38:11,696
them asking a database for the data,
they're like taking streaming data in.

685
00:38:12,123 --> 00:38:17,210
So yeah, like some of the work I'd done
at Meta was on real time abuse systems

686
00:38:17,570 --> 00:38:19,730
and that was all streaming systems.

687
00:38:20,320 --> 00:38:22,840
So I was like, oh, like this
problem isn't that mysterious.

688
00:38:22,840 --> 00:38:24,780
Like I've done stuff like this before.

689
00:38:25,040 --> 00:38:26,370
It seems tractable.

690
00:38:26,566 --> 00:38:27,356
So I, you know.

691
00:38:27,801 --> 00:38:32,811
went full on, decided like, I think we,
I think I can implement this yeah, based

692
00:38:32,811 --> 00:38:36,431
on prior experience, based on having
done query languages before so yeah,

693
00:38:36,481 --> 00:38:43,391
MaterialLite was born which is bringing
differential data flow to JavaScript, so

694
00:38:43,391 --> 00:38:48,861
you can compose, filter, map, reduce, and
a join operator to create these pretty

695
00:38:48,871 --> 00:38:52,761
rich queries and have them reactively
updated anytime there's a write.

696
00:38:53,115 --> 00:38:53,435
Right.

697
00:38:53,455 --> 00:38:57,775
And so for those of the listeners who
have not yet built their own database

698
00:38:58,165 --> 00:39:04,105
and maybe are familiar with like SQL
where in SQL, you have like select from

699
00:39:04,105 --> 00:39:08,515
where, and all of those and JavaScript
or other programming languages who you

700
00:39:08,515 --> 00:39:10,905
might've like, Map, filter, et cetera.

701
00:39:11,175 --> 00:39:13,145
They might have different
names, but they're sort of

702
00:39:13,145 --> 00:39:14,825
like conceptually very similar.

703
00:39:15,345 --> 00:39:20,175
And this is where you're basically just
now like trying to recreate the same

704
00:39:20,175 --> 00:39:25,635
semantics that we have from SQL, where you
can say select star from this where so,

705
00:39:25,735 --> 00:39:31,830
and you can basically have like an array
where it now say .Map .Filter, et cetera,

706
00:39:32,280 --> 00:39:38,420
and where you flip the trade offs from
before you owned in SQLite, you might get

707
00:39:38,420 --> 00:39:40,650
this query just once and then it's done.

708
00:39:40,690 --> 00:39:45,630
You no longer are interested in like
subsequent changes and then like, smaller

709
00:39:45,650 --> 00:39:49,950
updates, but in JavaScript where our
app stays warm, we click the button.

710
00:39:49,950 --> 00:39:51,950
Something's changed slightly.

711
00:39:52,190 --> 00:39:54,940
You want to change the trade
offs quite significantly.

712
00:39:55,270 --> 00:39:57,030
And that has led you to Materialite.

713
00:39:57,448 --> 00:40:02,518
Yeah, so like just to build off on like
how SQL concepts map to like filter,

714
00:40:02,528 --> 00:40:08,018
map, and reduce, or like map represents
select or function application, like

715
00:40:08,138 --> 00:40:12,178
you're selecting some columns, well
that's mapping the original object to some

716
00:40:12,198 --> 00:40:14,158
other object with a subset of the fields.

717
00:40:14,838 --> 00:40:19,188
Filter is the same as where Reduce
is what you use for any aggregate

718
00:40:19,188 --> 00:40:20,988
function, like group by, sum, count.

719
00:40:21,608 --> 00:40:24,078
Yeah, and I guess one other
operator would be concat.

720
00:40:24,218 --> 00:40:28,548
So if you want to do or, you just,
you fork your stream that you filtered

721
00:40:28,548 --> 00:40:32,808
or mapped over, and then after you've
filtered it, you concat the streams back

722
00:40:32,808 --> 00:40:34,328
together, and then you run a distinct.

723
00:40:34,638 --> 00:40:38,798
So you can do an or and then join,
joins a special thing where essentially.

724
00:40:39,343 --> 00:40:43,013
You kind of maintain what you've
seen from one stream and then as you

725
00:40:43,023 --> 00:40:46,653
see results from another stream, you
link them together based on some key.

726
00:40:46,943 --> 00:40:51,407
At first, it like, Seemed like a large
barrier, but then I don't know, once you

727
00:40:51,407 --> 00:40:53,787
get into it and you realize, Oh, like
I've done all this before and filter,

728
00:40:53,797 --> 00:40:58,216
map, reduce, like these have direct
analogs in SQL, like it's not so bad.

729
00:40:58,713 --> 00:41:04,193
So we've so far like touched
the most common usage of SQL.

730
00:41:04,598 --> 00:41:08,818
But SQL has, or SQLite, Postgres,
et cetera, but like sticking with

731
00:41:08,818 --> 00:41:12,818
SQLite for a moment there's quite
a bit more than like the common

732
00:41:12,828 --> 00:41:15,148
like select from where, et cetera.

733
00:41:15,148 --> 00:41:18,948
There's like various kinds of
aggregations, various kinds of ways

734
00:41:18,948 --> 00:41:22,898
to combine queries, sub queries,
joins, various kinds of joins.

735
00:41:23,298 --> 00:41:27,418
Is the system that you've came up
with MaterialLite, is that on feature

736
00:41:27,418 --> 00:41:30,668
parity on what SQLite is able to do?

737
00:41:31,118 --> 00:41:35,158
Or are there some parts where you
say, okay, that's out of scope because

738
00:41:35,158 --> 00:41:40,318
there's years worth, maybe decades
worth of optimization have gone into

739
00:41:40,318 --> 00:41:43,918
SQLite into making that fast and
that's out of scope for Materialite.

740
00:41:44,358 --> 00:41:44,638
Yeah.

741
00:41:44,658 --> 00:41:47,768
So there's no window functions
and there's no recursive queries.

742
00:41:48,343 --> 00:41:52,593
And base Materialite is
strictly for streaming.

743
00:41:52,823 --> 00:41:57,896
So, like, if you're processing a
stream of writes or events, like

744
00:41:57,936 --> 00:42:01,570
Materialite is a great fit because
it's gonna, you know, patch up your

745
00:42:01,570 --> 00:42:03,530
query results as those events come in.

746
00:42:04,060 --> 00:42:09,576
But say you have a table of a
million items and you want to

747
00:42:09,596 --> 00:42:11,936
query it from scratch, right?

748
00:42:11,966 --> 00:42:15,056
Like you didn't see the events,
you couldn't maintain any queries.

749
00:42:15,136 --> 00:42:16,996
You're just like, I want to
know what's in this table.

750
00:42:17,476 --> 00:42:21,196
So, so you're going to have to run that
query from scratch, even in Materialite.

751
00:42:21,246 --> 00:42:23,766
But for Materialite, what that means
running a query from scratch is like

752
00:42:24,016 --> 00:42:27,966
setting up the Dataflow pipeline and
then feeding literally every row,

753
00:42:27,966 --> 00:42:29,546
all million rows into that pipeline.

754
00:42:30,096 --> 00:42:34,246
Which, yeah, since everything's
implemented with, like, filter and map,

755
00:42:34,306 --> 00:42:38,056
where, like, it's opaque, like, when
you do a filter, you provide a lambda.

756
00:42:38,056 --> 00:42:38,596
It's opaque.

757
00:42:38,656 --> 00:42:42,536
That's what fields you're filtering on
which makes this from scratch case hard

758
00:42:42,536 --> 00:42:45,891
because, like, Oh, if I don't know what
fields you're filtering on, then I don't

759
00:42:45,891 --> 00:42:49,171
know if I can apply some sort of index
that you might have against this data.

760
00:42:49,771 --> 00:42:54,575
So, yeah, I guess after working
on Material 8 and getting it to

761
00:42:54,575 --> 00:42:56,945
work pretty well, like, okay,
that was the next problem.

762
00:42:57,415 --> 00:43:01,231
And it's like, we need an actual
query language because in a

763
00:43:01,231 --> 00:43:02,521
query language is declarative.

764
00:43:02,571 --> 00:43:05,081
You express, like, This is the
field name I'm filtering on.

765
00:43:05,081 --> 00:43:07,371
This is the operator being
used and this is the value.

766
00:43:07,781 --> 00:43:12,691
So the engine knows exactly where the
fields are ordering by the tables you're

767
00:43:12,761 --> 00:43:14,421
using the fields you're filtering on.

768
00:43:15,011 --> 00:43:19,281
And once you have that knowledge, you
can also make that like first query

769
00:43:19,291 --> 00:43:23,781
case fast, where I guess you're doing
the traditional database thing of like

770
00:43:24,091 --> 00:43:25,531
figuring out what indices to pick.

771
00:43:26,151 --> 00:43:29,411
I don't think we'll ever, Be as
fast as SQLite for that case.

772
00:43:29,851 --> 00:43:32,828
But I think, you know, For an
app like, for a rich client

773
00:43:32,838 --> 00:43:35,145
like, Overtone or Strut, right?

774
00:43:35,735 --> 00:43:37,375
, Most of your queries are
subscriptions, they're not this

775
00:43:37,375 --> 00:43:38,815
like from scratch Sort of thing.

776
00:43:38,935 --> 00:43:42,796
Like, You're gonna subscribe, you're
gonna like, You're gonna like, Set up

777
00:43:42,796 --> 00:43:45,896
a query that subscribes to the slide
list, set up a query that subscribes

778
00:43:45,896 --> 00:43:47,656
to the component list on a slide.

779
00:43:48,206 --> 00:43:51,776
And then as writes happen, you just
want to update those rather than,

780
00:43:52,016 --> 00:43:55,646
yeah, always, like, rather than
having to have first class support

781
00:43:55,656 --> 00:43:57,486
for running a query from scratch.

782
00:43:58,226 --> 00:44:00,896
So, yeah, so that latter thing
needs to be just fast enough, but

783
00:44:00,896 --> 00:44:03,996
it doesn't need to be, like, Yeah,
the fastest thing in the world.

784
00:44:04,446 --> 00:44:08,756
I totally agree that there's a very
interesting, different kinds of trade

785
00:44:08,756 --> 00:44:14,046
off that should be explored and should be
built that makes so much more sense for a

786
00:44:14,046 --> 00:44:20,016
live active application where everything
stays up to date and then changes

787
00:44:20,016 --> 00:44:25,981
quite minimally as opposed to request
response, which is how SQLite works today.

788
00:44:26,258 --> 00:44:31,178
That being said,  you can still get quite
a bit of mileage out of SQLite today.

789
00:44:31,368 --> 00:44:36,438
If you are a little bit more mindful
about like how you work with it.

790
00:44:36,748 --> 00:44:42,228
So some patterns that I found to Makes
SQLite work for me with Overtone.

791
00:44:42,568 --> 00:44:48,428
Is that what makes SQLite the slowest
in a browser context is shoveling data

792
00:44:48,458 --> 00:44:54,548
from within the SQLite database into
JavaScript land and back and forth as

793
00:44:54,548 --> 00:44:59,198
this is where you need to cross the
memory boundaries from memory staying

794
00:44:59,198 --> 00:45:04,218
within Wasm to how it's being like
marshaled into JavaScript objects.

795
00:45:05,043 --> 00:45:11,433
And what I found as a pattern that helps
quite a lot is just keeping that small.

796
00:45:11,453 --> 00:45:15,893
So if you need like a, and this is where
I think you can apply quite a few of the

797
00:45:15,903 --> 00:45:20,503
tricks that you've now mentioned and you
embrace with Materialite, you can actually

798
00:45:20,513 --> 00:45:23,353
bring back and layer on top of SQLite.

799
00:45:23,453 --> 00:45:26,803
Probably still not quite
as fast as Materialite.

800
00:45:27,098 --> 00:45:31,268
But probably fast enough to make a use
case like Overtone work within that

801
00:45:31,268 --> 00:45:36,138
paradigm as well, that when the list
changes, instead of like, getting a

802
00:45:36,138 --> 00:45:40,378
new list of 5, 000 tracks for that
playlist, every time those 5, 000

803
00:45:40,428 --> 00:45:45,648
tracks, you just get a change in terms
of like, Oh, this new track was added.

804
00:45:46,088 --> 00:45:52,108
And getting that SQLite, I think can
also be done with a few tricks namely

805
00:45:52,108 --> 00:45:58,498
also like a  temporary table where you
can save the previous result of your

806
00:45:58,538 --> 00:46:01,558
query before and after the change.

807
00:46:01,588 --> 00:46:04,858
And then you query that in a diffing way.

808
00:46:05,278 --> 00:46:08,748
And then you still need to
sort of like work with that

809
00:46:08,808 --> 00:46:10,878
diff information in JavaScript.

810
00:46:11,198 --> 00:46:14,468
But this is how you can work
around that performance challenge.

811
00:46:14,788 --> 00:46:16,898
But your system with Materialite.

812
00:46:17,163 --> 00:46:20,303
All of that is absorbed from
you, which is very attractive.

813
00:46:20,363 --> 00:46:24,233
Yeah, and I think right, so Materialite,
right now, it can be backed by

814
00:46:24,563 --> 00:46:28,453
an in memory collection, or it's
just some little interface that's

815
00:46:28,453 --> 00:46:29,633
called a source that you implement.

816
00:46:30,143 --> 00:46:33,893
So I think, Yeah, SQLite could be
that source, and then that would

817
00:46:33,893 --> 00:46:35,603
solve the like, first query problem.

818
00:46:36,163 --> 00:46:39,913
So I just compile the Materialite
queries to SQL queries.

819
00:46:40,136 --> 00:46:43,876
and then the incremental updates are
maintained by Materialite going forward.

820
00:46:44,083 --> 00:46:46,833
Yeah, obviously you have to like,
listen to the transaction failures

821
00:46:46,833 --> 00:46:48,163
and somehow roll those back.

822
00:46:48,603 --> 00:46:51,903
But yeah, I think it's doable to
like, Marry those two together and

823
00:46:51,913 --> 00:46:53,803
have kind of the best of both worlds.

824
00:46:54,373 --> 00:46:56,793
And I know there's like
a project called GRDB.

825
00:46:56,793 --> 00:47:02,313
I think it's like a iOS SQLite wrapper
which adds a bunch of reactivity.

826
00:47:02,673 --> 00:47:04,143
it's not like completely general.

827
00:47:04,203 --> 00:47:06,983
It doesn't support like completely
arbitrary SQL queries, at

828
00:47:06,983 --> 00:47:07,993
least in an efficient way.

829
00:47:08,033 --> 00:47:11,083
But I think it gets you close enough
with the queries it does support.

830
00:47:11,463 --> 00:47:15,100
One more thing about like, moving
from the request response style to

831
00:47:15,100 --> 00:47:17,030
like, queries being subscriptions.

832
00:47:17,610 --> 00:47:19,910
It gives you these really
cool opportunities to optimize

833
00:47:19,910 --> 00:47:21,630
queries at a global level.

834
00:47:22,230 --> 00:47:24,960
So like, if all your queries
are subscriptions, you can see

835
00:47:24,960 --> 00:47:26,150
what subscriptions are open.

836
00:47:26,740 --> 00:47:29,960
And then you can start seeing
which queries are either

837
00:47:29,960 --> 00:47:34,610
identical and deduplicate or
which queries share operations.

838
00:47:34,860 --> 00:47:37,270
So like, maybe you have a whole bunch
of queries that all do the same join.

839
00:47:37,830 --> 00:47:39,320
While in the streaming system.

840
00:47:39,685 --> 00:47:44,425
You can optimize that so that join happens
once and then the results of the join are

841
00:47:44,425 --> 00:47:45,995
fanned out to all the queries that use it.

842
00:47:46,375 --> 00:47:48,755
So yeah, you can do some really neat
tricks at a global level once you

843
00:47:49,065 --> 00:47:50,645
model your queries or subscriptions.

844
00:47:50,814 --> 00:47:52,404
Yeah, that's super interesting.

845
00:47:52,434 --> 00:47:58,444
And a few of those things I'm exploring
for LiveStore myself where they're

846
00:47:58,444 --> 00:48:00,984
basically the problem of reactivity.

847
00:48:01,239 --> 00:48:05,619
And efficiency, trying to keep the
performance as good as possible,

848
00:48:05,649 --> 00:48:10,419
but also squeezing that into a frame
budget since LiveStore and Overton

849
00:48:10,429 --> 00:48:14,109
all run within the main thread,
which I'm not sure that's probably

850
00:48:14,109 --> 00:48:15,949
also the way how MaterialLite works.

851
00:48:16,519 --> 00:48:20,739
There is Interesting to also see it
through a lens of like a different

852
00:48:21,059 --> 00:48:27,299
paradigm shift evolution in the JavaScript
front end world which is about signals.

853
00:48:27,649 --> 00:48:32,859
And signals is also about like, I think
a more lighter weight reactivity system

854
00:48:32,869 --> 00:48:35,129
that is all about being composable.

855
00:48:35,159 --> 00:48:39,389
And and I think those worlds are
now finding, like, a way together

856
00:48:39,539 --> 00:48:44,849
within live store with the prior work
with Jeffrey and Nicholas on Riffle.

857
00:48:45,129 --> 00:48:50,379
This was like some early implementations
based on a paper called Adapt On.

858
00:48:50,859 --> 00:48:55,559
And from my understanding all of like
the research that has gone into adapt on.

859
00:48:56,049 --> 00:48:59,999
And sort of like in parallel,
what's been evolving as signals.

860
00:49:00,289 --> 00:49:01,759
Those are actually the same thing.

861
00:49:02,139 --> 00:49:07,099
And I think now we can do some really
interesting work on combining the

862
00:49:07,109 --> 00:49:13,284
power that like more substantial state
management primitives, Like a database

863
00:49:13,364 --> 00:49:17,354
gives us or something like material
light gives us and bring that together

864
00:49:17,684 --> 00:49:23,544
with something like a signal system that
also kind of out of the box address the

865
00:49:23,554 --> 00:49:27,754
thing you've just mentioned, which is
reusing parts of the computational graph.

866
00:49:28,234 --> 00:49:29,864
The only thing signals is missing.

867
00:49:29,864 --> 00:49:31,224
I'm surprised nobody's done this yet.

868
00:49:31,514 --> 00:49:34,334
Is Incremental computation
against collections, right?

869
00:49:34,334 --> 00:49:37,554
So you think of like, I have an array,
and I map the array, then I filter

870
00:49:37,554 --> 00:49:39,714
the array, then I reduce the array.

871
00:49:39,764 --> 00:49:42,134
That's like creating a new
copy of the array each time.

872
00:49:42,434 --> 00:49:45,484
And if you like, add one element
to the array, you like, rerun all

873
00:49:45,484 --> 00:49:47,174
those against the full array again.

874
00:49:47,214 --> 00:49:48,924
So it's kind of like the query
from scratch case, right?

875
00:49:49,181 --> 00:49:52,276
Whereas Materialite, right, You
add an element to the array, well,

876
00:49:52,286 --> 00:49:55,016
it's only going to run the filter
on that one element and the map

877
00:49:55,016 --> 00:49:58,226
on that one element rather than
running against the full array.

878
00:49:58,286 --> 00:50:01,066
So there's some interesting benchmarks
maybe I can share with you after of

879
00:50:01,066 --> 00:50:05,506
like, yeah, incrementally maintaining
a map filter reduce pipeline

880
00:50:05,526 --> 00:50:07,426
against an array of a million items.

881
00:50:07,606 --> 00:50:10,366
Yeah, so it'd be really cool if like
the signals implementers would even

882
00:50:10,386 --> 00:50:12,276
implement this sort of stuff eventually.

883
00:50:12,392 --> 00:50:17,772
So even though your love affair with
SQLite has for now come to an end.

884
00:50:18,132 --> 00:50:22,962
I think the problems and goals
you're chasing after are still very

885
00:50:22,992 --> 00:50:27,442
similar to what I think is feasible
with SQLite, just maybe with like

886
00:50:27,442 --> 00:50:29,012
a different performance footprint.

887
00:50:29,372 --> 00:50:32,162
I think with SQLite you
also get the benefit.

888
00:50:32,552 --> 00:50:35,982
of just that being like a very
familiar and trusted thing that's

889
00:50:35,992 --> 00:50:40,292
like been proven and battle
tested for literally decades now.

890
00:50:40,632 --> 00:50:44,032
And even though you said like you
were running into some performance

891
00:50:44,052 --> 00:50:49,362
issues SQLite is still so ridiculously
fast that it's very feasible to do

892
00:50:49,362 --> 00:50:52,972
write those applications in the,
in a browser context, et cetera.

893
00:50:52,972 --> 00:50:57,592
And I think even if you now on
a modern CPU straddle your your

894
00:50:57,592 --> 00:50:59,972
CPU, it's still manageable.

895
00:51:00,652 --> 00:51:04,762
But I think there's multiple
paths to arrive at the same goal.

896
00:51:05,202 --> 00:51:11,282
And so as my understanding is that the
project Materialite opened a couple of

897
00:51:11,282 --> 00:51:16,972
really interesting follow up conversations
for you which is now opening a new

898
00:51:16,972 --> 00:51:18,942
chapter for your professional life.

899
00:51:19,212 --> 00:51:20,702
So do you want to tell
us a bit more about that?

900
00:51:21,367 --> 00:51:25,167
Yeah, I guess, yeah, Materialite
is my least known project,

901
00:51:25,207 --> 00:51:26,867
least complete project.

902
00:51:27,307 --> 00:51:32,787
But yeah, it did pick up some interest
by Rocicorp specifically Aaron, who I

903
00:51:32,787 --> 00:51:35,157
met at the local-first conference in St.

904
00:51:35,157 --> 00:51:35,467
Louis.

905
00:51:36,147 --> 00:51:39,317
And yeah, we're talking about, you know,
the model of subscript, those queries as

906
00:51:39,317 --> 00:51:42,967
subscriptions and like having a reactive
app and I think he's the one that said

907
00:51:42,967 --> 00:51:44,707
like, oh, things need to be memory fast.

908
00:51:45,207 --> 00:51:49,757
And yeah, so, so I started working on
MaterialLite and we did some explorations

909
00:51:49,767 --> 00:51:54,267
of like, if you powered one of their
demo apps with MaterialLite, like  they

910
00:51:54,267 --> 00:51:56,837
have a linear clone that's a demo
app and they want to see like, Oh,

911
00:51:56,837 --> 00:51:58,557
how many issues can we scale this to?

912
00:51:58,567 --> 00:52:01,387
If we like use this
differential data flow approach.

913
00:52:02,097 --> 00:52:02,347
Yeah.

914
00:52:02,347 --> 00:52:04,617
And they really liked kind
of the initial results.

915
00:52:05,197 --> 00:52:07,670
And yeah, we worked together, I guess.

916
00:52:08,089 --> 00:52:10,439
quite a bit over the last
four slash six months.

917
00:52:10,499 --> 00:52:12,699
There was like a two month
break for me on paternity leave.

918
00:52:13,249 --> 00:52:18,669
And yeah, so I'll be starting a job with
Rocicorp here pretty soon as a partner.

919
00:52:18,709 --> 00:52:19,799
Yeah, I'm super excited about that.

920
00:52:20,569 --> 00:52:21,459
That's amazing.

921
00:52:21,509 --> 00:52:23,349
I mean, we had Aaron on the show.

922
00:52:23,379 --> 00:52:27,639
I think that the second show for
localfirst.fm, and there were so many

923
00:52:28,009 --> 00:52:33,539
deep insights there by Aaron, and I'm a
big fan of the products that's already

924
00:52:33,539 --> 00:52:39,879
been built there, Replicache, Reflect,
and sounds like your work with Materialite

925
00:52:40,149 --> 00:52:44,039
is also going to be incorporated
into making the products even better.

926
00:52:44,479 --> 00:52:49,959
So I'm curious to hear more of your
thoughts on you know, what do you feel

927
00:52:49,959 --> 00:52:54,919
like where's local-first going with the
products you're working on with Rosicorp?

928
00:52:55,330 --> 00:52:58,990
Local-first provides a superior
DevX because you're no longer

929
00:52:59,000 --> 00:53:02,940
worried about APIs of, like, how
do I get data from the server?

930
00:53:03,370 --> 00:53:07,870
Like, a sync engine, or your database is
solving, you have a sync engine solving

931
00:53:07,870 --> 00:53:11,300
that for you, right, where you're just
coding against local data rather than

932
00:53:11,310 --> 00:53:14,280
setting up all these weird REST APIs
that you have to call to get the data.

933
00:53:14,620 --> 00:53:19,207
And yeah, I think that's the key value
of local-first and the main thing

934
00:53:19,207 --> 00:53:20,897
that Rocicorp is pushing forward.

935
00:53:21,927 --> 00:53:24,367
It's like making that sync
engine experience as best, as

936
00:53:24,697 --> 00:53:25,657
good as it can possibly be.

937
00:53:25,950 --> 00:53:29,970
So you're now joining Rocicorp where
you're building Replicache and Reflect,

938
00:53:30,000 --> 00:53:34,674
but the local-first space has grown
quite significantly to a point that

939
00:53:34,674 --> 00:53:36,860
can be almost a bit disorienting.

940
00:53:36,920 --> 00:53:41,380
For newcomers who are trying to first
figure out what does it mean to build

941
00:53:41,380 --> 00:53:45,630
the app local-first and then also
choose the right tool for the job.

942
00:53:45,910 --> 00:53:49,880
Can you provide a little bit of guidance
which how someone should think about

943
00:53:49,880 --> 00:53:51,770
use cases and picking the right tool?

944
00:53:52,200 --> 00:53:52,590
Yeah.

945
00:53:52,770 --> 00:53:56,970
So I guess maybe when you're somebody,
when the first question somebody

946
00:53:56,970 --> 00:54:00,720
asks, should ask themselves is,  "Do
they want servers involved at all?"

947
00:54:01,110 --> 00:54:01,920
Is the first question.

948
00:54:02,620 --> 00:54:06,470
So, like, if you want all sync to
go through a central authoritative

949
00:54:06,470 --> 00:54:10,430
server, and you want that server to
be able to override decisions made

950
00:54:10,430 --> 00:54:15,650
by the client apply data integrity
rules revert changes made by a client

951
00:54:16,210 --> 00:54:18,080
Replicash is a very good choice for that.

952
00:54:18,730 --> 00:54:22,887
If you want something, Where, you
know, you don't want to preclude the

953
00:54:22,887 --> 00:54:27,737
ability for peer, for nodes to sync
peer to peer, or maybe the data they're

954
00:54:27,737 --> 00:54:29,177
syncing is end to end encrypted.

955
00:54:29,687 --> 00:54:33,087
I think something like
CR-SQLite actually shines there.

956
00:54:33,137 --> 00:54:37,387
CR-SQLite lets you merge in any
topology you want, whether it be

957
00:54:37,417 --> 00:54:39,047
peer to peer or hub and spoke.

958
00:54:39,477 --> 00:54:44,307
So we've seen a couple of companies that
are, have like end to end encryption,

959
00:54:44,397 --> 00:54:48,617
where they use CR-SQLite to sync,
or they have literally a pub sub

960
00:54:48,617 --> 00:54:51,977
channel, where all peers just publish
their messages, and any peer who

961
00:54:51,977 --> 00:54:53,717
receives the messages integrates it.

962
00:54:54,167 --> 00:54:57,640
In those use cases and then I guess
if you're really looking for, like,

963
00:54:57,814 --> 00:55:01,834
if your data model is like document
based, I guess going back to syncing,

964
00:55:01,874 --> 00:55:05,584
so not just CR-SQLite for peer to
peer, but also Yjs and AutoMerge

965
00:55:05,594 --> 00:55:06,804
is capable of peer to peer sync.

966
00:55:07,384 --> 00:55:09,304
They're also, of course, if you
can sync peer to peer, you can

967
00:55:09,304 --> 00:55:10,394
sync through central server too.

968
00:55:11,134 --> 00:55:16,204
If you're looking for, you know, you
just want to support a collaborative text

969
00:55:16,204 --> 00:55:22,414
field collaborative text doc and you don't
need, you know, ways to query over your

970
00:55:22,414 --> 00:55:26,754
documents, like you're fine just storing
this doc in a blob in your Whatever your

971
00:55:26,754 --> 00:55:30,984
storage is you know, Yjs is a very good
choice because it's primarily targeting

972
00:55:30,984 --> 00:55:33,924
it at text editing, collaborative
text but they don't have, you know,

973
00:55:33,924 --> 00:55:38,624
facilities like built in for persistence
or querying and these sorts of things.

974
00:55:38,867 --> 00:55:42,667
So I haven't kept super up to date with
AutoMerge, but from what I've seen if

975
00:55:42,667 --> 00:55:47,187
you want like, the ability to fork and
look at different portions of history.

976
00:55:47,277 --> 00:55:50,647
So like, say people are editing a
document somebody wants to fork their

977
00:55:50,647 --> 00:55:55,177
document off, make a bunch of changes
and merge the fork back in I mean, I see

978
00:55:55,177 --> 00:55:57,657
AutoMerge talks about this concept a lot.

979
00:55:57,997 --> 00:56:02,647
So I think if it's not there already,
AutoMerge would be your best bet for some.

980
00:56:03,267 --> 00:56:04,297
for that sort of behavior.

981
00:56:04,877 --> 00:56:05,937
Yeah, then there's ElectricSQL.

982
00:56:06,837 --> 00:56:11,347
So that is like a, you know,
Postgres centric solution.

983
00:56:11,737 --> 00:56:17,294
So if your database is Postgres and you're
okay with, like, merging based on CRDT

984
00:56:17,294 --> 00:56:18,564
rules You know, that's a good choice.

985
00:56:18,564 --> 00:56:21,664
So yeah, so I say like your
back end is Postgres and you're

986
00:56:21,664 --> 00:56:23,094
okay like using CRDT rules.

987
00:56:23,104 --> 00:56:24,444
Like yeah, what does that mean?

988
00:56:24,764 --> 00:56:27,534
So a really interesting thing about
Replicache is it can work with any

989
00:56:27,534 --> 00:56:30,507
back end, pretty much any back end.

990
00:56:30,507 --> 00:56:34,514
So Postgres, MySQL, all sorts of things.

991
00:56:34,784 --> 00:56:39,634
And the way this works is that it uses
sort of a rebase model where , the

992
00:56:39,634 --> 00:56:41,474
developer, defines their mutations.

993
00:56:41,729 --> 00:56:47,119
And those mutations get run in a specific
order on the server and clients when

994
00:56:47,119 --> 00:56:49,949
they receive updates from the server,
any mutations they have outstanding.

995
00:56:50,324 --> 00:56:52,384
They apply the server updates
and any mutations that are

996
00:56:52,384 --> 00:56:53,974
outstanding, they rebase on top.

997
00:56:54,524 --> 00:56:59,724
And this is like super flexible in that
you can write any, like, if you want to

998
00:56:59,724 --> 00:57:03,034
change merge and conflict behavior, well,
you just write your mutations however

999
00:57:03,034 --> 00:57:04,654
you like to get the behavior you want.

1000
00:57:05,044 --> 00:57:09,474
Whereas something like CR-SQLite
or electricsql to get the merge

1001
00:57:09,474 --> 00:57:12,124
behavior you want, you have
to pick specific CRDT types.

1002
00:57:12,314 --> 00:57:15,064
And it, it seems a little bit
less flexible since you can't

1003
00:57:15,064 --> 00:57:16,274
write like a custom mutation.

1004
00:57:16,724 --> 00:57:19,854
But ElectricSQL does seem to have a
pretty good end to end story of like,

1005
00:57:19,874 --> 00:57:22,964
you set up your Postgres DB, that
schema gets replicated down into a

1006
00:57:22,964 --> 00:57:27,114
local SQLite DB, and as you make schema
changes upstream, they flow downstream.

1007
00:57:27,664 --> 00:57:32,270
So in terms of like, Having a
good story between two specific

1008
00:57:32,270 --> 00:57:34,220
databases it seems pretty good.

1009
00:57:34,647 --> 00:57:36,577
And then I guess the
last thing is PowerSync.

1010
00:57:37,017 --> 00:57:43,249
So PowerSync, uh, works on a different
model that's not rebase or CRDTs.

1011
00:57:43,419 --> 00:57:47,879
So with PowerSync, a client will
accumulate changes, and if a

1012
00:57:47,879 --> 00:57:50,529
client has local changes, it won't
take any changes from the server.

1013
00:57:50,929 --> 00:57:53,009
So in the rebase model, if a
client has changes, it'll take

1014
00:57:53,009 --> 00:57:55,369
changes from the server and then
replay its local changes on top.

1015
00:57:56,059 --> 00:57:59,329
Uh, but in the PowerSync model, the client
has local changes, it won't take any

1016
00:57:59,329 --> 00:58:02,449
server changes until those local changes
have been incorporated by the server,

1017
00:58:02,469 --> 00:58:03,799
and then it can take server changes.

1018
00:58:04,479 --> 00:58:07,869
Um, So yeah, to me, I've never built an
app with PowerSync, but this seems like

1019
00:58:07,869 --> 00:58:12,149
it'd be a lot higher latency, uh, for
the syncing, because if you have any

1020
00:58:12,199 --> 00:58:14,809
outstanding local changes, you simply
cannot take a change from the server.

1021
00:58:15,142 --> 00:58:18,412
Whereas rebase model, if you have
outstanding changes, it's fine.

1022
00:58:18,452 --> 00:58:21,082
You get the server changes, you
just replay your local changes

1023
00:58:21,082 --> 00:58:22,082
on top of the server changes.

1024
00:58:22,285 --> 00:58:26,475
That's a really great overview,
and there's many other tools that

1025
00:58:26,485 --> 00:58:31,035
more more tools than we have time
to exhaustively can cover here.

1026
00:58:31,325 --> 00:58:34,925
But I think that's a really great
orientation, particularly with Asking

1027
00:58:34,925 --> 00:58:38,735
yourself the first question, like where
should the authority originate from?

1028
00:58:38,975 --> 00:58:40,535
Do you want to have a server?

1029
00:58:40,665 --> 00:58:43,495
That's, I guess, more traditional
how web apps work today.

1030
00:58:43,495 --> 00:58:48,765
And that can simplify things a lot because
in that server you can enforce if there's

1031
00:58:48,765 --> 00:58:54,295
like a merge conflict or if there's
like maybe a user has set some data in

1032
00:58:54,295 --> 00:58:59,535
a way That might be not compliant with
the app that you want to build or for

1033
00:58:59,555 --> 00:59:02,075
other simplification reasons, et cetera.

1034
00:59:02,105 --> 00:59:04,965
So I think picking that,
where does the authority live?

1035
00:59:04,985 --> 00:59:06,055
Is there an authority?

1036
00:59:06,375 --> 00:59:09,085
And , which role should a server play?

1037
00:59:09,325 --> 00:59:11,215
I think that's a really important one.

1038
00:59:11,555 --> 00:59:15,005
And then also the two kinds of
like ways to go about syncing or

1039
00:59:15,215 --> 00:59:20,965
at least two ways going more about
certain spacing, everything on CRDTs.

1040
00:59:21,375 --> 00:59:27,115
Or going with a event rebase
model, our Replicash is working.

1041
00:59:27,485 --> 00:59:32,615
I'm curious how someone who has used
neither and might have not really

1042
00:59:32,645 --> 00:59:37,215
a lot of intuition for what are the
implications of choosing one or the other.

1043
00:59:37,465 --> 00:59:42,015
Can you provide a little bit of an
intuition which kind of app use cases

1044
00:59:42,015 --> 00:59:43,916
are a good fit for one or the other?

1045
00:59:44,372 --> 00:59:44,772
Yeah.

1046
00:59:44,842 --> 00:59:47,052
So with CRDTs, you don't.

1047
00:59:47,492 --> 00:59:53,342
Have code on the right path, like
the CRDT algorithm is going to

1048
00:59:53,342 --> 00:59:55,882
determine, I guess I should say you
don't have code on the merge path.

1049
00:59:55,982 --> 00:59:58,402
The CRDT algorithm is going
to determine how things merge.

1050
00:59:58,452 --> 01:00:02,252
And you're just picking a set of CRDT
algorithms where something like Replicash,

1051
01:00:02,302 --> 01:00:03,802
where you write custom mutators.

1052
01:00:04,077 --> 01:00:08,317
You can write a set of mutators that run
on the client, and you can write mutators

1053
01:00:08,377 --> 01:00:11,487
with the same name, same args, but do
something totally different on the server.

1054
01:00:11,947 --> 01:00:16,737
Which, one it's, since you're writing code
in the mutation, it's giving you control

1055
01:00:16,747 --> 01:00:19,927
over, kind of, how things merge and sync.

1056
01:00:20,367 --> 01:00:24,317
And two, since The code doesn't have
to be identical on every node because

1057
01:00:24,417 --> 01:00:26,287
the server is the authoritative answer.

1058
01:00:26,584 --> 01:00:30,564
You could do something in the server
mutation like check permissions or,

1059
01:00:30,634 --> 01:00:33,907
you know, set last modified times
or something from the server's

1060
01:00:33,917 --> 01:00:35,457
time rather than the client's time.

1061
01:00:35,632 --> 01:00:40,916
Yeah, but I guess the one downside to that
is  a CRDT, like, you can sync messages in

1062
01:00:40,936 --> 01:00:46,326
any order, like, yeah, if you have message
A, B, and C, if you sync them B, C, A,

1063
01:00:46,326 --> 01:00:50,506
or C, B, A, like, you always get the same
state, which means that you don't have to

1064
01:00:50,506 --> 01:00:52,236
have a single server being the authority.

1065
01:00:52,236 --> 01:00:56,466
You can have every peer be on equal
footing and essentially have no server

1066
01:00:56,816 --> 01:01:00,529
whereas like this, you know, Replicache
model, you do have to have, Someone

1067
01:01:00,529 --> 01:01:04,629
that is that authority that's going to
provide the total ordering of messages.

1068
01:01:04,979 --> 01:01:09,579
And to me, it sounds like at least
the latter, the rebasing model with

1069
01:01:09,589 --> 01:01:15,109
that authority living in a server, for
example, that is a bit easier to get

1070
01:01:15,129 --> 01:01:20,939
into less of a radical shift of like
thinking, changing the way, how you

1071
01:01:20,949 --> 01:01:25,729
think about data modeling, et cetera,
and it's probably the easier one to

1072
01:01:25,729 --> 01:01:30,624
get into and might be a better catch
all solution for many app use cases.

1073
01:01:30,934 --> 01:01:33,684
Like if you look at yeah, I think
it's the easier, it's like, it's more

1074
01:01:33,684 --> 01:01:38,324
familiar and if you have an existing app
that you're trying to add multiplayer

1075
01:01:38,324 --> 01:01:41,824
to Like, yeah this can work on your
existing data model without much

1076
01:01:41,824 --> 01:01:44,114
changes or with little to no changes.

1077
01:01:44,144 --> 01:01:47,774
Whereas, oh, you want to switch to CRDTs,
like you might have to make a new data

1078
01:01:47,774 --> 01:01:50,904
model specifically for this collaboration
thing and migrate your stuff over.

1079
01:01:51,304 --> 01:01:51,654
Right.

1080
01:01:51,734 --> 01:01:55,384
And I mean, the advice I would
give on top of that is that.

1081
01:01:55,404 --> 01:02:00,064
There are now like so many interesting
and easy to use options out there,

1082
01:02:00,064 --> 01:02:04,354
whether it's Replicache, whether it's
AutoMerge or others just maybe try

1083
01:02:04,374 --> 01:02:09,184
building the same mini version of your
app in different technology stacks,

1084
01:02:09,194 --> 01:02:14,274
see what works best for you, and you'll
develop that intuition just by yourself.

1085
01:02:14,634 --> 01:02:16,554
And I think that's a
really fun thing to do.

1086
01:02:16,979 --> 01:02:17,509
Cool.

1087
01:02:17,779 --> 01:02:21,209
Matt, thank you so much for sharing
about like all those different

1088
01:02:21,209 --> 01:02:22,849
projects you've been involved with.

1089
01:02:22,899 --> 01:02:26,779
I'm really excited to see what
you'll be building at Rosicorp.

1090
01:02:27,089 --> 01:02:29,559
Is there anything else you want
to share with the audience now?

1091
01:02:30,129 --> 01:02:31,029
Yeah, just stay tuned.

1092
01:02:31,029 --> 01:02:31,819
We're building the next.

1093
01:02:32,104 --> 01:02:34,484
Kind of iteration of
Replicache and Reflect.

1094
01:02:34,664 --> 01:02:39,701
And , it was exciting enough to make
me like drop everything I was doing

1095
01:02:39,701 --> 01:02:44,921
and join the team and yeah, go back
to, I yeah, work, working for somebody,

1096
01:02:44,921 --> 01:02:49,041
I guess, rather than just, you know,
following whatever research interests,

1097
01:02:49,091 --> 01:02:50,391
wherever they took me from day to day.

1098
01:02:50,891 --> 01:02:53,461
So yeah, it's, I think what they're
building is super compelling and

1099
01:02:53,471 --> 01:02:57,301
You'll hear more, you know, over
the summer and coming months.

1100
01:02:57,944 --> 01:02:58,964
That's incredible.

1101
01:02:59,024 --> 01:03:03,104
I'm really excited to hear more about
what you all have been working on.

1102
01:03:03,614 --> 01:03:06,314
And yeah, thank you so much
for coming on the show.

1103
01:03:06,814 --> 01:03:06,974
Yeah.

1104
01:03:06,974 --> 01:03:07,614
Thanks for having me.

1105
01:03:08,690 --> 01:03:11,230
Thank you for listening to
the localfirst.fm podcast.

1106
01:03:11,470 --> 01:03:15,200
If you've enjoyed this episode and haven't
done so already, please subscribe and

1107
01:03:15,200 --> 01:03:16,960
leave a review wherever you're listening.

1108
01:03:17,330 --> 01:03:18,760
Please also tell your friends about it.

1109
01:03:18,770 --> 01:03:22,340
If you think they could be interested
in local-first, if you have feedback,

1110
01:03:22,380 --> 01:03:26,120
questions or ideas for the podcast,
please get in touch via hello at

1111
01:03:26,120 --> 01:03:31,900
localfirst.fm or use the feedback form on
our website, special thanks to Expo and

1112
01:03:31,900 --> 01:03:33,950
Crab Nebula for supporting this podcast.

1113
01:03:34,350 --> 01:03:35,110
See you next time.