Postgres FM

Nikolay and Michael discuss sharding Postgres — what it means, why and when it's needed, and the available options right now. 
Here are some links to some things they mentioned:


What did you like or not like? What should we discuss next time? Let us know via a YouTube comment, on social media, or by commenting on our Google doc!


Postgres FM is brought to you by:
With special thanks to:

Creators & Guests

Michael Christofides
Founder of pgMustard
Nikolay Samokhvalov
Founder of Postgres AI

What is Postgres FM?

A weekly podcast about all things PostgreSQL

[00:00:00] Nikolay: Hello, hello, this is Postgres FM episode number 58, and my name is Nikolay. Together with me is Michael. Hi, Michael.

[00:00:09] Michael: Hello, Nikolay.

[00:00:11] Nikolay: And we are going to talk about sharding today. Sharding, sharding. Two big experts on sharding are here and going to discuss sharding.

[00:00:20] Michael: Yeah, definitely not an expert here, but this did actually come up last week as a topic — somebody suggested partitioning and sharding as a topic for a monthly blogging event, which I'll link up in the show notes. But yeah, we've done an episode on partitioning. I thought it was a really good one; I really enjoyed it and learned stuff, and I think we got some good feedback on it. This one, though, is sharding, and I think maybe that is a good place to start: what's the difference? I tried defining it, you gave me some feedback. How do you define them?

[00:00:53] Nikolay: Well, partitioning is when we split tables, but we remain on a single node. I mean, we might have standby nodes as well, but if we count only primary nodes — where we have read-write access — it's only a single node. But once we do a similar split involving multiple primary nodes, we should call it sharding instead of partitioning.

And the name, I guess, came from some game in the past.

[00:01:27] Michael: Oh really?

[00:01:28] Nikolay: Yeah, it's from some online game, and they called parts of the world "shards." I should not lie — I was not prepared to explain the etymology here — but part of my brain says it comes from gaming, where shards were parts of the game world. Basically, if we move this concept to databases, we have database sharding.

[00:01:56] Michael: I really like your definition. It's the one I tried to go with in my blog post. So partitioning is at the table level — splitting what's logically one table into multiple tables — and sharding is the same, but at the database level. It's splitting what can logically be thought of as one database, but is actually, behind the scenes, multiple databases. So that makes sense to me, except — and I think it's worth warning listeners — lots of blog posts out there, and definitions even on Wikipedia and other places that are normally quite accurate, often use the word partitioning in places that I think would more accurately be described as sharding.

So it is confusing out there.
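The distinction can be sketched in SQL. All names here are made up for illustration: partitioning keeps every physical table on one primary, while one (of several) ways to push a partition onto another primary is attaching it as a `postgres_fdw` foreign table — a sketch, not a full setup (a user mapping and the remote table are also required).

```sql
-- Partitioning: one logical table, several physical tables, ONE primary.
CREATE TABLE measurements (
    ts     timestamptz NOT NULL,
    device int          NOT NULL,
    value  double precision
) PARTITION BY RANGE (ts);

CREATE TABLE measurements_2023_q3 PARTITION OF measurements
    FOR VALUES FROM ('2023-07-01') TO ('2023-10-01');

-- Sharding: the same logical split, but a partition lives on ANOTHER
-- primary node, here reached through the postgres_fdw extension.
CREATE EXTENSION postgres_fdw;
CREATE SERVER shard1 FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'shard1.example.com', dbname 'app');
CREATE FOREIGN TABLE measurements_2023_q4
    PARTITION OF measurements
    FOR VALUES FROM ('2023-10-01') TO ('2024-01-01')
    SERVER shard1;
```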

[00:02:43] Nikolay: In the CAP theorem, the official terminology is network partitioning, right? And network partitioning there has a negative connotation. By our naming we'd have to call it "network sharding," which doesn't make sense. Also, there's some confusion around vertical versus horizontal — well, I don't understand why there's confusion, but I see people try to say "vertical sharding,"

which is actually just a vertical split — maybe a functional split of a database into multiple databases. For example, you see a weak connection between two groups of tables. We could say clustering here as well, and actually we can apply some machine learning approaches — basically k-means — to try to automatically detect good options for a vertical split of a database into sets of tables with weak connections. Weak means almost no foreign keys between them. And also dynamic relations, when we have two tables involved in one query — that's also a connection, though it's not static like a foreign key; it's dynamic. So I used machine learning to try to help people understand how to split databases vertically. Vertically means: okay, users are here, products are there, right? A connection between them exists, but we don't use it often. Well, maybe it's not a good example, but I think it's clear — we have different data in different parts, on different primary nodes. Horizontal means we basically take our table and cut it, split it, with a horizontal line, right?

[00:04:32] Michael: I think the partitioning case is actually simpler to explain. So horizontal partitioning is at the row level: we are taking some rows from the table and putting them in a different table. And vertical would be column-based: we're splitting the table based on the columns into two different tables. Now it's the same for sharding, right? It's still kind of columns and rows. In the vertical case we're taking columns — as a whole table — and putting them somewhere else, and in the horizontal case we're taking row-level data, so we have the same tables across different nodes, but different rows in each one.

[00:05:11] Nikolay: Vertical, column level. Okay. If you studied database theory, it should be called a projection — when you keep only a limited set of columns and the other columns go to a different table. You probably have a one-to-one relation between them — relationship, sorry. I agree. And if you do it at the server level, it's just a functional split; we should call it a functional split. This is a normal approach to scale your system if you prefer microservices, or just mid-size or big services — an approach where you put some functionality in one database and other functionality in another database. They have some connection, of course — maybe they have some common part and you need to take care of replication — but the majority of the data is very different in nature. Versus row level, which means we have the same sets of tables but different data, because one part of the data goes to one node and a different part goes to another node.

So how do we do that? Let's talk.

[00:06:27] Michael: Well, do you want to do "how"? Or — I was thinking maybe we could start with "why." Like, why even —

[00:06:32] Nikolay: Why is simple. Well,

[00:06:36] Michael: I think there are a few reasons, to be honest.

[00:06:39] Nikolay: there's only one big reason and we know it: performance.

[00:06:44] Michael: Well, scale, right? Like, when you say performance, it's, um —

[00:06:48] Nikolay: related.

[00:06:49] Michael: — we're talking about that. I like it. I think it's the main reason you'd ever want to shard: hitting a max. Let's say you are super keen to stay on Amazon RDS Postgres, and there's a max size that you can provision, and on some basis you're getting close to maxing that out. Maybe your CPU is going up, maybe your ingest rate — there's some level at which you're scared: we can't continue to grow this vertically, we can't scale up anymore. Is that what you're —

[00:07:22] Nikolay: You just used the word "vertically" in a different sense. I mean vertical scaling, when we just add more resources: more CPU, more memory, better disks. Right.

[00:07:35] Michael: That's what I meant. What I meant is if we get to a point where we can't do that anymore, we need to think about, sharding.

[00:07:42] Nikolay: No, not necessarily. Again — I will repeat — you used "vertical scaling," where the word vertical is in a different sense compared to what we just discussed when we talked about columns. You used vertical in the sense of hardware resources. So I mean, if we cannot add more CPU and RAM —

[00:08:07] Michael: Mm-hmm.

[00:08:07] Nikolay: — vertical scaling — we probably still can split our system vertically in terms of tables and columns, and have different parts of our schema on different primary nodes.

[00:08:18] Michael: Yeah, I agree. But I thought we were counting that as a, as a type of sharding.

[00:08:23] Nikolay: Uh, I don't like to call it sharding at all, but I see people do it — "vertical sharding." Okay. For me it's just a functional split, or going to a microservices architecture or something. And some of the problems this approach has are similar to normal sharding. For me, normal sharding is horizontal sharding.

[00:08:48] Michael: Yeah. Okay.

[00:08:48] Nikolay: When we don't specify vertical or horizontal, we usually mean horizontal sharding: the same schema everywhere, but different data. Right?

[00:08:59] Michael: I completely agree. Same with partitioning: if you don't hear somebody say horizontal or vertical, they mean horizontal. Mm-hmm.

[00:09:06] Nikolay: "Vertical partitioning," also — it's not common. Some people use it; again, it's just an attempt to have some unified terminology for everything. But okay. So: we cannot live with one primary node. We are saturated. We tried everything — offloading read-only queries to standby nodes; we reduced query frequency, probably with some caching, with some optimization — and we still see that we are growing, and soon one node will not be enough. It's a very painful situation. Very scary, especially for a CTO and so on, to suddenly hit the ceiling.

And of course, in this case people usually choose one of two directions as the main one: again, a vertical split, or sharding right away. But sometimes they need to mix — if it's a really large project, you bet on one approach but still need to apply the other one as well.

[00:10:09] Michael: Yeah.

[00:10:10] Nikolay: Right? So if you like microservices — and microservices is a bigger-than-technological topic; it's not just technological, it's organizational: you need to change management and how teams are organized, how they choose technology (some teams might choose not Postgres but something else, and so on), so it's bigger than just a technical discussion — but if you choose microservices, probably you don't need sharding, right? The microservices approach either solves this problem of saturation and the inability to scale, or it just postpones it so much that you have, like, five years or so, right?

[00:10:56] Michael: Maybe. The thing I don't understand about microservices is: what if you've got one thing that's very difficult to split logically, and that has the heaviest ingest rate of everything? You could easily have quite a small team looking after one huge node in the microservice infrastructure. So I suspect you could still hit this — you know, if you've got a load of IoT sensors or something, you could get a lot of data very quickly. So, yeah, maybe. I think it's —


[00:11:25] Nikolay: I cannot fully agree here. Like, if you are, for example, e-commerce — and a huge one, a leader or a continental leader —

[00:11:34] Michael: Mm-hmm.

[00:11:34] Nikolay: for example — and I have a few examples, Postgres-based, which I worked with directly or just learned about from discussions with the people involved. So if you choose microservices in e-commerce — and most e-commerce systems somehow tend to choose the microservices approach.

They love it — I mean engineers, backend engineers, and managers; they usually choose microservices. In this case, it's very hard to imagine that one of the services — usually you have some related to registration, inventory, orders, and so on; many, many services in typical e-commerce — it's hard to imagine that one of them will require sharding right away. You need to grow a lot to see the need for sharding in one of those services. You should be really huge.

[00:12:23] Michael: I agree, for e-commerce. I think there is some sharding — or things that get called sharding — that is horizontal but has a different primary driver, I think. And that's analytics queries.

[00:12:38] Nikolay: Uh, okay. Okay — I silently reduced the topic to OLTP, as usual for me,

[00:12:45] Michael: Yeah. Okay.

[00:12:46] Nikolay: but I agree — OLAP, analytical systems can have a lot of data, and sharding there is usually definitely a good thing to have. So I agree with you here, definitely.

[00:13:02] Michael: So, a couple of reasons at least for wanting this. But yeah, I think "how" is probably a good thing to move on to — like, what are our options?

[00:13:12] Nikolay: Right — how. So, unlike the MySQL world, where Vitess exists, we don't have Vitess in Postgres, and attempts to port Vitess to Postgres failed — I know a few, and the developers of Vitess announced that they don't pursue this goal anymore. But we have a few options. Citus, first of all. And again — I made a joke about us being big experts, because I consider myself not an expert in sharding at all. I reviewed Citus and played with it a couple of times in the past, but that was before Microsoft decided to open source everything.

So I always considered only the open source part, because I didn't want to have vendor lock-in to Azure. But right now we have an interesting situation: they open sourced everything, so it's worth reviewing once again — especially one very important feature for large and growing projects: rebalancing without downtime.

It's super important, because you never know which node will grow — it's hard to predict. So you need tools to change the distribution of data in the future, but without huge downtime. And this feature was originally only in the paid version of Citus, but now it's open source as well. But I lack up-to-date knowledge.
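As a rough sketch of what this looks like in open source Citus — the table and key names here are invented, and the rebalancing functions shown are from Citus 11.x and later, where the online rebalancer was open sourced:

```sql
-- On the coordinator node, with the Citus extension available:
CREATE EXTENSION citus;

CREATE TABLE events (
    tenant_id bigint NOT NULL,
    event_id  bigint NOT NULL,
    payload   jsonb,
    PRIMARY KEY (tenant_id, event_id)
);

-- Hash-distribute shards of the table across worker nodes by tenant_id:
SELECT create_distributed_table('events', 'tenant_id');

-- Online shard rebalancing, formerly a paid-only feature:
SELECT citus_rebalance_start();
SELECT * FROM citus_rebalance_status();
```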

If we talk about OLTP — I talked to the Citus team a couple of times and asked: please provide cases of pure OLTP, like heavily loaded systems with OLTP workloads. But everything they provided — and that was two, three, four years ago — looked like HTAP to me, not OLTP. You understand what —

[00:15:06] Michael: So hybrid — hybrid transactional-analytical.

[00:15:09] Nikolay: For example, some search and analytical engine for videos or something, where it's okay — only a limited number of users, and they are motivated to wait a couple of seconds. In that case it's okay to have some latencies. But in OLTP we usually have only dozens of milliseconds,

or just hundreds of milliseconds, but not a second. A second kills our traffic and people go away, so we cannot allow waiting that long. And when I benchmarked it myself, Citus was not behaving very well for OLTP. But then I got some responses on Twitter from the developers saying I did something wrong — so very likely I did something wrong. I mean, I was trying to measure latency overhead. This is my favorite test for such systems, because the biggest problem for performance when splitting is not that it's difficult; it's that you have something that decides which shard to go to — a router, right? It adds overhead, because it needs to parse the query, and parsing a query and so on requires time. So this adds overhead, and for me, in the OLTP case, it's unacceptable to have an overhead of more than a — a millisecond. A millisecond is quite big already.

[00:16:29] Michael: what were you seeing?

[00:16:31] Nikolay: I don't remember details.

[00:16:32] Michael: okay.

[00:16:33] Nikolay: it was, a few years ago.

[00:16:36] Michael: Imagine if it was — and I'm not saying it is — imagine if it was tens of milliseconds. You would deem that completely unacceptable.

[00:16:41] Nikolay: Yeah, definitely, because that's already very close to the human perception threshold, which we know is 200 milliseconds, and we probably need multiple queries to serve one HTTP request. So I cannot accept that overhead at all, because I know my backend engineers will add milliseconds of their own — they know how to add more milliseconds. So I cannot allow this proxy, this middleware, to have significant overhead. But again, my benchmarks very likely were not ideal, given that feedback, and I never tried again. So probably it's time to benchmark once more and see the overhead of the Citus proxy.

[00:17:30] Michael: And you're a few years out of date, right? They open sourced it a little while ago, so you're at least that much out of date. And I think the latest version included schema-based sharding, which seems quite interesting for some OLTP-type splits. Yeah, exactly. So there —

[00:17:46] Nikolay: It's quite vertical, right?

[00:17:47] Michael: if you, if you, if you were considering, um, If, if someone's considering sharding and they're on Postgres, they're gonna look at Citus, they should look at Citus, but there are other options as well. Right? Like that's the

[00:18:01] Nikolay: I would revisit this decision right now for a heavily loaded OLTP system. But again, benchmarks are needed once more, and probably some tuning and so on. And I still don't know very good examples of OLTP systems which use Citus — but I'm not closely watching.

[00:18:23] Michael: I looked at some of their customers, and some of them are very analytics-heavy — like Algolia for search, and Heap. I believe Microsoft uses them internally for quite a few things. But Freshworks stood out to me as potentially — that's a help desk ticketing system, right? Sure, they need search as well, but a large part of that is OLTP, so that —

[00:18:46] Nikolay: Interesting. Yeah, that's interesting. But in the analytical area, let's look at Hydra, which recently released 1.0 — congrats to the Hydra team. Interesting column-based storage, and they inherited the columnar store code from Citus, so it's interesting.

[00:19:04] Michael: I looked at the Hydra website in preparation, expecting them to have something around sharding. But I think they've forked Citus to do the other parts of what Citus does, like the analytical processing. But no — no mention of sharding anywhere on the Hydra website.

[00:19:18] Nikolay: Uh, maybe at some point, at some level, you don't really need it if you have a column store and vectorized processing. Ah, yeah — I'm not an analytics guy, so honestly I don't fully understand what we're talking about here.

[00:19:35] Michael: Let's move on, because we do have some really interesting cases, writeups, and blog posts from recent OLTP companies like Notion and Figma, who both blogged relatively recently — in the last couple of years — about sharding Postgres.

[00:19:50] Nikolay: In what I call application-server sharding, which I also implemented myself a few times. This is what you usually choose because you don't have proper tooling — and maybe Citus is proper tooling for OLTP; again, maybe someone has good experience or knows about some good experience, and I would like to know — please comment anywhere, on Twitter or YouTube or anywhere. But application-server sharding, or application-level sharding — I prefer "application-side sharding," because it's sharding on the application side. This is challenging and requires an effort.

Definitely. And usually it's quite easy to split and so on, but you need to think about several things, as usual. The first is rebalancing: you will definitely need it in the future, and you'll need to do it without downtime. The second is how to avoid distributed transactions. For example, it's an absolutely bad idea to open multiple connections to different primary nodes and work with them in some messy way.

You start a transaction, then you work with a different connection, and it has its own transaction. If you do that — if you absolutely need it — you need two-phase commit, 2PC. But it's slow, so you cannot have thousands of TPS on 2PC at all; it's impossible. It's very slow because it has its own huge overhead. So usually, in an OLTP context, we try to avoid 2PC unless absolutely needed,
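Postgres does expose the building blocks for this natively (it requires `max_prepared_transactions > 0` on every shard). A sketch, with invented table and transaction names, of moving 100 units between rows that live on two different shards:

```sql
-- On shard A:
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
PREPARE TRANSACTION 'tx_42';   -- phase one: durable, survives disconnect

-- On shard B:
BEGIN;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;
PREPARE TRANSACTION 'tx_42';

-- Phase two: only after BOTH prepares succeeded, on each shard:
COMMIT PREPARED 'tx_42';
-- If either prepare fails, instead run on the other shard:
-- ROLLBACK PREPARED 'tx_42';
```

The extra round trips and the coordinator logic needed to resolve in-doubt transactions are exactly the overhead being discussed here.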

[00:21:27] Michael: Mm-hmm.

[00:21:27] Nikolay: right? And finally, you also need to take care of the router, and it should have small overhead — one millisecond is good. Probably that's it, at a high level.

[00:21:41] Michael: Yeah, I think the Notion blog posts in particular — there are actually two blog posts from Notion: one from 2021, where they initially did this and shared a huge amount of detail on preparation, how they chose things, how they were preparing for resharding — uh, later rebalancing, sorry.

And then there's a follow-up blog post from this year, talking about resharding with no downtime — or with, I think, less than a second of noticeable impact on users, which is incredibly impressive. They might have even used the term "partition key" or something. They chose to shard based on — what's it called — workspace, because people aren't ever looking for information from two workspaces at the same time, so you don't have that same problem.

[00:22:30] Nikolay: Yeah, that's actually the same as with partitioning. It's very difficult, and I've seen cases where people spend a lot of time trying to find the ideal partitioning key — or sharding key, in this case — or a set of keys. Because unlike partitioning, where we can, for example, choose one key and it's enough,

here we also need to think about how the write workload will be distributed among multiple nodes. For example, we know Timescale Cloud — unfortunately not open source — basically has sharding as well, and there you need two keys. One is time-based, a timestamp, but it's not enough. Why? Because if you use only a timestamp, you'll have a hotspot: one shard will be receiving most of the writes all the time. So for balancing you need a second level — a second key, basically, as part of a composite key. So, for example, a workspace ID could work here as well.
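The same idea can be shown at the partitioning level with stock Postgres DDL (names invented): hash on the balancing key first, so writes spread across slices, then range on time within each slice.

```sql
CREATE TABLE metrics (
    workspace_id uuid        NOT NULL,
    created_at   timestamptz NOT NULL,
    payload      jsonb
) PARTITION BY HASH (workspace_id);

-- One of, say, four hash slices — itself partitioned by time,
-- so no single slice becomes the write hotspot:
CREATE TABLE metrics_h0 PARTITION OF metrics
    FOR VALUES WITH (MODULUS 4, REMAINDER 0)
    PARTITION BY RANGE (created_at);

CREATE TABLE metrics_h0_2023 PARTITION OF metrics_h0
    FOR VALUES FROM ('2023-01-01') TO ('2024-01-01');
```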

[00:23:32] Michael: Timescale multi-node is a really good point, actually. I was looking it up, and it looks like you can self-host it — so even if it's not open source, it's a bit different.

[00:23:40] Nikolay: It's interesting news to me. Okay,

[00:23:42] Michael: According to the — yeah, I was reading something. I also thought that wasn't true until I checked the docs.

[00:23:48] Nikolay: what is the license?

[00:23:50] Michael: I hadn't checked, sorry,

[00:23:51] Nikolay: They have two licenses: Apache, and the Timescale License, which is not open source.

[00:23:57] Michael: very much doubt it's not, I don't think they're doing anything on the Apache one anymore. I think that's their old, stuff, but anyway, let's not guess.

Uh, I'll link up the docs and people that are interested can look into it themselves.

[00:24:08] Nikolay: This episode is full of guessing. But let's jump to the main topic. You need to take care of these things, and — as usual when you are architecting something — you need to think about how much data you'll have in five years, and how you will approach rebalancing with minimal cutover time. So you probably need to involve logical decoding, logical replication.
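A sketch of a split using built-in logical replication — table names and the connection string are hypothetical:

```sql
-- On the old primary: publish only the tables that should move out.
CREATE PUBLICATION orders_split FOR TABLE orders, order_items;

-- On the new primary (schema created there beforehand):
CREATE SUBSCRIPTION orders_split
    CONNECTION 'host=old-primary dbname=app user=replicator'
    PUBLICATION orders_split;

-- Once replication lag is near zero: stop writes to these tables,
-- repoint the application at the new node, DROP SUBSCRIPTION there,
-- and drop the moved tables on the old node.
```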

It's improving in the latest Postgres versions. But it was a surprise for me in one case — it was not about sharding, it was about a vertical split of an acquired big system — and I was considering logical replication right away to perform the split.

But the split was into two parts, like 50/50. And in that case it's easier to use just physical replication and then drop the tables on both sides which are not needed, because it's easier to set up and has fewer bottlenecks and so on. So in some cases: just physical replication, and then you drop the tables you don't need. And also a load balancer and lag detection are interesting there. So let me add one more item, which is quite a huge item. Whether you go microservices or sharding, you're going to have many more nodes, and in this case, on the operational side, you need to be better:

auto-failover, backups, provisioning, balancing — everything should work much better, more reliably. And that means you need to simplify. If you rely on managed services, that's probably also okay, but you need to trust them a hundred percent and so on. But if you manage Postgres yourself, before increasing the number of primaries you need to

unify. For example, the naming convention — it takes a lot of time if you have different naming schemes for hosts for different parts of your system. And then you design some very good tool which, for example, performs a minor or major upgrade or something, and you bump into the issue of deviations. So you need to simplify and unify everything, because you are going to have many more nodes now.

[00:26:28] Michael: Yeah, it's a great point. It's not cheap to add — sharding adds a lot of complexity — so that's a nice point, to simplify.

[00:26:35] Nikolay: It's not just about sharding; it's about growing the fleet in terms of clusters. You have more clusters, so you need unification and simplification and so on.

[00:26:46] Michael: So I actually just thought of — thinking about the other options out there — I do remember hearing and reading up on EDB's product in this area, called Postgres Distributed. Now, that raises the point of a different use case for sharding: one of the big advertised features of that product is being able to keep data local to a region, for example. So like, if you want to keep —

[00:27:13] Nikolay: Bidirectional replication


[00:27:15] Michael: — yeah, that's the new —

[00:27:17] Nikolay: Let's detach this topic, because it's very different and very specific. It's not sharding.

[00:27:22] Michael: Well, but by our definition — one logical database split into multiple physical databases — it kind of is.

[00:27:31] Nikolay: But then you need to replicate in both directions and so on. This is based on the claim that replaying logical changes is easier than the initial apply of changes. So far I haven't tested it myself — I saw it only in the BDR documentation — but I don't think it's so; I think it's more of a marketing claim. I didn't see benchmarks. So let's keep bidirectional replication aside completely and return to it in a separate episode — we also don't have time for it right now. I also want to mention two different tools.

First, PgCat, for example — if you want to shard yourself, it already offers a simplified approach, because it provides sharding. Originally it provided sharding in explicit form: the application needs to say, in a comment, "this needs to be executed, and I know on which shard." So it's basically just a helper tool, and you need to take care of a lot of things yourself.

But I saw they improved it, and some kind of automatic routing is already there. And the overhead is quite low — I tested it long ago, maybe a year ago, and the overhead was not bad at all for OLTP. It's written in Rust, so quite performant. Interesting — I would look at it, and it's being developed at quite a good pace.
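As an illustration of this explicit style — PgCat's exact syntax has changed between versions, so treat this as a sketch and check its docs:

```sql
-- Manual routing: pin the session to a shard by index...
SET SHARD TO '1';
SELECT * FROM orders WHERE id = 42;

-- ...or hand the pooler a sharding key to hash, in a comment:
/* sharding_key: 42 */ SELECT * FROM orders WHERE id = 42;
```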

And another is SPQR.

[00:29:02] Michael: Mm-hmm.

[00:29:02] Nikolay: Also under development. I watch changes in both repositories, and I see a lot of development happening in both projects. And this project was developed with the idea of having more automated sharding tooling, similar to VTGate in Vitess.

[00:29:21] Michael: Yeah. Would you call that — do we need another name for it? Is it almost like pooler-level? If we've got application-level sharding, is this pooler-level?

[00:29:29] Nikolay: Well, well, yes. In this case we can distinguish application-level sharding — application-side sharding, when backend engineers are responsible for basically almost everything — from cases where additional software can be put in, in a transparent fashion. Yes, we still need to take care of rewriting some queries, because we don't want to deal with multiple shards at the same time, or often.

But we can distinguish at least two types of this middleware that helps us with sharding. The first is the VTGate style — or this SPQR style, or PgCat — where something lightweight is placed in between, and it doesn't include Postgres code, or at least not the majority of Postgres code. In this case the tool needs to parse queries — Postgres queries. Postgres is very, very advanced, so it's a challenging task; the grammar is huge. And the other approach is placing a whole Postgres node in between. In this case, I think the latency overhead is quite significant. And this is what PL/Proxy at Skype, developed more than 15 years ago, was doing.

And it's quite an interesting approach. I'm not sure about the overhead, but it has the limitation that you need to rewrite all your queries in the form of server functions, because it's kind of a language proxy — a language similar to the map/reduce approach. But the overhead question is interesting, because at the same time Skype was definitely OLTP, and the overhead requirements were quite strict. It's, again, similar to PgQ — a lost art, or lost knowledge,
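The classic example from the PL/Proxy documentation gives the flavor of that "rewrite everything as server functions" constraint — every remote query becomes a function whose body is nothing but routing (the cluster name and function here are illustrative):

```sql
-- Runs on the proxy node; 'userdb' is a configured cluster of shards.
CREATE FUNCTION get_user_email(i_username text)
RETURNS text AS $$
    CLUSTER 'userdb';
    RUN ON hashtext(i_username);  -- pick the shard by hashing the key
$$ LANGUAGE plproxy;
```

The proxy forwards the call to a function of the same name and signature on whichever shard the hash selects.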

[00:31:19] Michael: Well, I was thinking we could end on where we think the future's going a little bit, and I think that's a really interesting case. I do think we're maybe at the point where the Postgres-based startups that started in the last 10 years or so are really at the scale that YouTube was at when they started needing this — and they started building Vitess. And I wonder if maybe that's what we're starting to see with the likes of PgCat: some of these companies that have been built on Postgres really needing some better tooling here for their own use cases. Or, you know, as you said before, the bazaar of different options: lots of people will build their own tooling, and maybe one of them will emerge like Vitess did for MySQL, and we'll have that in 10 years' time. It'll be in the same place, maybe.

[00:32:13] Nikolay: I don't consider PgCat a sharding solution. It's more like a pooler, but with a very lean approach to development — they decided, why not have this, for example? And unlike Pgpool, they don't aim to solve every problem completely. For example, this explicit approach, where you, as a backend engineer, put in a comment which shard to execute on.

So you are responsible for routing — it's manual routing, right? It works quite well; it's a simple feature, why not? So I think, with Postgres, a very good sharding solution has somehow never existed. And Vitess has many features which are usually overlooked.

For example, asynchronous flow of data between different parts of the system. And if we consider a huge project, we shard different parts differently. So basically we already split into services, right? We need to shard users in one way, and products or orders in a different way. In this case we basically have two vertically split parts, and only then do we split them horizontally.

But in this case we want to avoid, uh, dealing with multiple nodes. So why not having some kind of mineralized views, which will, will, would bring data asynchronously with some delay. But quite reliably so we could, uh, work with, uh, local. So I mean, if you imagine some charting system with a, with ability to have, uh, different charting keys for different parts of schema, plus you add materialized views with ability to be incrementally updated.
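Stock Postgres has materialized views, but only whole-view refresh; incremental maintenance is what extensions such as pg_ivm aim to add. A minimal sketch of the built-in mechanism, assuming the remote data is exposed locally as a foreign table:

```sql
-- A local, read-only copy of remote order data, e.g. pulled in
-- via a foreign table (postgres_fdw) from another shard or service.
CREATE MATERIALIZED VIEW local_orders AS
SELECT order_id, user_id, total
FROM remote_orders;  -- assumed to be a postgres_fdw foreign table

-- A unique index is required for REFRESH ... CONCURRENTLY,
-- which avoids blocking readers during the refresh.
CREATE UNIQUE INDEX ON local_orders (order_id);

-- Periodically (cron, pg_cron, etc.) bring the copy up to date.
-- Note: this recomputes the whole view; it is not incremental.
REFRESH MATERIALIZED VIEW CONCURRENTLY local_orders;
```

This gives exactly the trade-off Nikolay describes: local reads with some staleness, instead of synchronous cross-node queries.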


[00:34:02] Michael: Yeah.

[00:34:03] Nikolay: and the ability to be incrementally synchronized between nodes, plus also a global dictionary, because sometimes you need some global dictionary, and the ability to synchronize everything. All of this is something Vitess already has in the MySQL world, but somehow we don't have it, right?

And I think it's possible to build it from bricks. It'll take time, all these things, but somehow a full-fledged solution doesn't exist. Again, I'm lagging in terms of my Citus understanding, but for me there is a requirement on the overhead, and if it is not passed I'm already looking in a different direction. Right now I would revisit Citus, revisit PgCat and SPQR, test them for my case, and then go with application-side sharding once again, unfortunately.

[00:34:59] Michael: Last question from me. I think I saw...

[00:35:02] Nikolay: Just to have the complete picture: there is also a third, maybe fourth, type of sharding, when we don't shard the database itself and don't have middleware, but decide to split our system into multiple systems.

When you talk about sharding, you usually talk about a database split. But what if we split the whole system? For example, we had only one website, but now we have 10 websites. They have, for example, a unified authentication subsystem, so you have one login that works on each of those 10 sites.

And this can probably solve the problem you mentioned of having data closer to your customers. For example, you could have one website for one country and another website for a different country. Maybe you have a hundred websites; they have a single login system, and they have different databases.

In this case you also split horizontally, right? But you split not at the database level, but at the whole-system level. Each system has everything, separately. What about this approach?

[00:36:10] Michael: Yeah. And by the way, it's not just about speed. It's not just about latency of having data close to users, it's also about residency. Like privacy...

[00:36:18] Nikolay: Mm. A law.

Yeah, Yeah. but this approach is interesting for SaaS systems, maybe like notion,

[00:36:28] Michael: Yeah, well, interestingly, they didn't. There are a lot of benefits to keeping it as a single application, right?

[00:36:35] Nikolay: simplicity of management.

[00:36:37] Michael: Or like, let's say I am part of one company in the US and part of a different company in Europe. I can log into both workspaces within the same UI very easily. So I, I...

[00:36:53] Nikolay: You can do it even if it's two different applications. You just need a single authentication system. It's possible to do.

So you use one login, as with Google services: you use one login for many, many different services, right? It works similarly here. We can have different systems; they are all the same, if we forget about legal details. But you log into all systems using a single login. It works everywhere, seamlessly,

[00:37:23] Michael: Yeah, I don't know. I don't know if it would be quite as easy, like,

[00:37:29] Nikolay: It's not easy.

[00:37:29] Michael: for example. Okay.

[00:37:31] Nikolay: It's not, it's not easy. But you can grow a lot using this approach. Scalability is infinite.

[00:37:39] Michael: I think I saw a talk by GitLab about splitting systems out. I think there might be a recorded version of it, but I suspect you know more about it than me. Which route did they go?

[00:37:53] Nikolay: Well, a disclaimer: they are still my customers.

[00:37:56] Michael: Yeah.

[00:37:56] Nikolay: So I would recommend checking their documentation. It's very transparent; most things are public. They split vertically first, and it's obvious for systems like GitLab, because if you think about it, they have a lot of functionality, but one kind of functionality is quite autonomous: it's related to CI/CD pipelines.

And this is exactly what was moved to a different database: CI/CD data.

[00:38:22] Michael: Yep.

[00:38:23] Nikolay: It's coupled very, very well inside that part, but it's not that connected to other parts. Oh, by the way, check out what they needed to do with foreign keys when they performed this split, because they needed to create a new concept.

I forgot the name, but basically it's an asynchronous foreign key check between two databases. It's quite an interesting concept, which helps you preserve at least something when you move data to different nodes. And this experience is probably useful for other systems as well.
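The idea behind an asynchronous cross-database foreign key check can be sketched roughly like this (a hypothetical minimal version, not GitLab's actual implementation, and all table and function names are made up): record parent deletions in a log table on one database, and have a background worker clean up orphaned children on the other.

```sql
-- On the "main" database: log deleted parent keys, since a real
-- foreign key cannot span two databases.
CREATE TABLE deleted_projects_log (
  project_id bigint NOT NULL,
  deleted_at timestamptz NOT NULL DEFAULT now()
);

CREATE OR REPLACE FUNCTION log_project_delete() RETURNS trigger AS $$
BEGIN
  INSERT INTO deleted_projects_log (project_id) VALUES (OLD.id);
  RETURN OLD;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER projects_delete_log
AFTER DELETE ON projects
FOR EACH ROW EXECUTE FUNCTION log_project_delete();

-- On the CI database, a background job periodically reads the log and runs:
--   DELETE FROM ci_builds WHERE project_id = ANY (<logged ids>);
-- Referential integrity is restored eventually, not at commit time.
```

The trade-off is exactly what Nikolay describes: you give up synchronous enforcement but keep "at least something" of the foreign key guarantee across nodes.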

And right now I think they're going in a direction I probably shouldn't discuss, because it's still under development, but you can check their website, documentation, and open issues. A lot of the information is open.

[00:39:17] Michael: Yeah, I mean, foreign keys are notoriously difficult. I think that's something Vitess is still working on, or might be soon, even.

[00:39:25] Nikolay: I forgot the name, unfortunately, but there is some name GitLab invented. They introduced a new concept for it,

[00:39:35] Michael: We can find it and link it up. Right.

[00:39:37] Nikolay: yep.

[00:39:38] Michael: Awesome. Thanks so much, Nikolay. Thanks everybody, and catch you next week.

[00:39:44] Nikolay: Bye-Bye.