Tractable

In this episode of Tractable, Orb CTO and Co-founder Kshitij Grover and Gwen Shapira, Co-founder of Nile, explore the cutting-edge world of multi-tenancy and serverless Postgres offerings for modern SaaS applications. This podcast episode delves into Nile's innovative approach to virtual tenant databases, the complexities of managing mixed data blocks, and the challenges of building a database offering tailored to diverse customer needs. Gwen uncovers insights into the decision-making process of building on Postgres and the relentless focus on reliability for ongoing operations.

What is Tractable?

Tractable is a podcast for engineering leaders to talk about the hardest technical problems their orgs are tackling, whether that's scaling products to deal with increased demand, racing towards releases, or pivoting the technical stack to better cater to a new landscape of challenges. Each Tractable podcast is an in-depth exploration of how the core technology underlying the world's fastest growing companies is built and iterated on.

Tractable is hosted by Kshitij Grover, co-founder and CTO at Orb. Orb is the modern pricing platform which solves your billing needs, from seats to consumption and everything in between.

Kshitij Grover [00:00:05]:
Hey, everyone. I'm Kshitij, cofounder and CTO here at Orb. Welcome to another episode of Tractable. Today, I have with me Gwen Shapira. Gwen is the cofounder of Nile, which is a serverless Postgres offering for modern SaaS applications. Previously, Gwen was an engineering leader and architect at Confluent leading the cloud-native Kafka team. Gwen, really excited to have you. Thank you for coming to the show.

Gwen Shapira [00:00:28]:
It's so exciting to do this with you. Thanks for inviting me.

Kshitij Grover [00:00:32]:
We're gonna have quite a rich conversation. Before we dive into Nile and the architecture, I'd love to just hear a little bit about your background and what you've done leading up to it, and then I'm sure we'll incorporate that into the rest of the chat as well.

Gwen Shapira [00:00:48]:
My background has been in software engineering, especially in the data space. I'm old. Many years ago, I took a job as a software engineer, and then the data architect left, and I was asked to fill in. And the company hired someone more senior. I learned a lot from them. And, basically, from there on, I just did data, starting with Oracle and then a bit of MySQL.

Gwen Shapira [00:01:22]:
And then, when big data became a thing, I started doing Hadoop, and from there streaming data with Kafka and Confluent. And now we are doing serverless Postgres at Nile. I guess my entire career is all databases all the time.

Kshitij Grover [00:01:42]:
That's what's interesting. I think you've seen a lot of those trends at, honestly, some of the leading companies at the time that those trends were really popular. And I wanna dig more into Nile as a point because that's what you're currently working on. As I understand it, Nile's core architecture is predicated and based on this idea of native multi-tenancy. So maybe we can start there. Can you give us some background on what that means? What is native multi-tenancy? And maybe we'll start with the use case of why that is so important in B2B software. And then we can talk about, at a technical level, what implications that has.

Gwen Shapira [00:02:19]:
Starting with the obvious: if you're a B2B SaaS, you have customers. And those customers, they may be small, they may be big, you may have a mix of sizes, but they're critical to everything around your business, around your service, around everything that you provide. And they're going to be quite central to your data model as well. And there are tons and tons of companies like that. One may say even most of them. And then it comes as a surprise once you start thinking about it. Like, how come our databases don't know about this core idea of a business?

Gwen Shapira [00:03:03]:
And this is a thing that was around from the beginning of databases, like, 40 years. And for 40 years, or in my case, I'm not that old, so maybe for 20 plus years, through all those databases, all those different data systems, every single time the question is: do we want to give a database to each customer, which gives us a lot of flexibility and a lot of isolation and good security? Or do we want to do multi-tenancy and just put all customers in the same database? And if they grow, we'll shard it out, essentially. And every one of those has pros and cons. If you have bigger customers, you probably want to dedicate more resources to them. They're more sensitive. They may ask you to do more things like, please restore my database.

Gwen Shapira [00:03:52]:
And if you have tons of customers, you probably don't really want to run 100,000 different databases. So you don't do that. But then if you have a database per customer, you cannot really run large reports. So you have to balance a bunch of trade-offs. And when you get started, you don't even quite know what the trade-offs will be. I saw so many companies say, I'll go after very large enterprise companies, but guess what? They couldn't, so they went and sold to 5,000 small companies.

Gwen Shapira [00:04:21]:
You probably want to design around those concerns. And when you talk to a lot of companies that serve customers, which again is pretty much everyone, you keep running into these things where there are massive incidents around, I tried to do something for one customer, but then everything blew up. We had one customer grow really large or have more traffic than we expected, and suddenly everything was on fire. It's very obvious to us that there is a need to give developers the model that they like, which is the multi-tenant one, shove all customers into the same database, but still deliver the kind of functionality that provides isolation, that provides scalability, that provides predictable performance, security, privacy, all those things. You really get the best of both worlds. So this is basically what Nile is trying to do: to minimize the trade-offs, give a really nice developer experience, and give all the capabilities you would get from isolated customers.

Gwen Shapira [00:05:37]:
And also all the capabilities and cost savings you would get from the multi-tenant model. And we call it virtual tenant databases. That's our key idea. We put multiple small databases with a lot of isolation deep inside Postgres. So our entire architecture is derived from the idea that, if you could, you would really want 1,000 different databases. But you don't want the trade-offs involved in doing it, so we are going to give you both.

Kshitij Grover [00:06:09]:
And one thing I wanna talk about is how this differs from a journey a company might go through on their own. Right? So lots of companies, as you were just saying, maybe they start out with one, let's call it a Postgres database, and they vertically scale it until they run out of instance sizes. And then they realize, we have to now shard this database. Maybe their data model already is shard aware. Maybe it's not. They have to rearrange their business logic. Now they have 2 or 3 or 10 or 100 databases.

Kshitij Grover [00:06:37]:
There's now a bunch of management overhead on the DevOps side of thinking about multiple databases. But also, oftentimes, if you're getting to this place in a path-dependent way, maybe there's a bunch of business logic or application layer code that also has to think about these databases being separate, distinct entities. So tell me, how would something like Nile, having used this as the starting point, differ from a setup where you've gone through the sharding process and maybe you've solved the scalability and isolation problem? What is the distinction there?

Gwen Shapira [00:07:13]:
So that's exactly what we're trying to solve because we've all been through this journey. So you start with basically your multi-tenant model: I'll put a tenant ID on everything. First of all, we auto-shard it for you. Without doing anything, you will actually run on a bunch of physical machines and get all the CPUs. This is straight out of the box.

Gwen Shapira [00:07:38]:
The second thing is that we actually force you into a specific programming model where, when you make the connection, you specify what tenant you're connecting to. And we also give you the option to specify which user this connection is for, and we have an SDK that kind of takes it all the way from the browser, the user identity. And then we even validate that this user is allowed to get a connection to this virtual tenant database. We do the routing for you to the correct shard, obviously. The data is isolated the moment that you're saying, I'm connecting to this tenant.

Gwen Shapira [00:08:18]:
No matter what query you write, you will not see any data that belongs to anyone else. So it saves you from a whole variety of testing and mistakes that can be made and so on. So this is useful. And then as you grow, we shard things for you. Now sometimes you get customers with special requirements. Let's say you have someone who really needs to be on their own device, or they already tell you they're going to need vast amounts of memory, or whatever requirements they have.
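
A minimal sketch of the connection model described here, written in Python with in-memory data. Every name in it (`Router`, `TenantConnection`, `connect`, `select`) is hypothetical and chosen for illustration; this is not Nile's SDK or its actual implementation. It only shows the shape of the idea: the caller names a tenant and a user, access is validated, the request is routed to a shard, and the tenant filter lives in the connection rather than in the query.

```python
from dataclasses import dataclass, field


@dataclass
class TenantConnection:
    tenant_id: str
    _rows: list[dict]

    def select(self, predicate=lambda row: True) -> list[dict]:
        # The tenant filter is applied by the connection, not by the query,
        # so no predicate can return another tenant's rows.
        return [r for r in self._rows
                if r["tenant_id"] == self.tenant_id and predicate(r)]


@dataclass
class Router:
    """Maps tenants to shards and checks user access (all names hypothetical)."""
    tenant_to_shard: dict[str, str]
    user_tenants: dict[str, set[str]]
    # shard name -> rows stored on that shard; each row carries a tenant_id
    shards: dict[str, list[dict]] = field(default_factory=dict)

    def connect(self, tenant_id: str, user_id: str) -> TenantConnection:
        # Validate that this user may open a connection to this tenant.
        if tenant_id not in self.user_tenants.get(user_id, set()):
            raise PermissionError(f"{user_id} may not access {tenant_id}")
        shard = self.tenant_to_shard[tenant_id]
        return TenantConnection(tenant_id, self.shards[shard])


router = Router(
    tenant_to_shard={"acme": "shard-1", "globex": "shard-1"},
    user_tenants={"alice": {"acme"}},
    shards={"shard-1": [
        {"tenant_id": "acme", "invoice": 1},
        {"tenant_id": "globex", "invoice": 2},
    ]},
)

conn = router.connect("acme", "alice")
rows = conn.select()  # an unfiltered query still sees only acme's rows
```

Both tenants share a shard in this example on purpose: the isolation comes from the connection scope, not from physical placement.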

Gwen Shapira [00:08:50]:
They want to be in a data center in Singapore, for that matter. All of this is basically either a single click or a single SQL command, update tenant set region equals, and that's it. So basically, we're saving you from having to build Nile yourself, which we've all built several times in our careers and feel like it's time to stop doing again and again.

Kshitij Grover [00:09:20]:
One thing that I find interesting is when you build something yourself, one advantage that you have is that you understand the trade-offs you've made. So one thing I'm curious for your opinion on: is there a point where this multi-tenancy setup might get too magical or abstract away too much? And I'll give you some examples. I don't know if Nile does this today, but let's say you go so far as to have, I don't know, different encryption standards on different databases or even, like, different indexes because the workloads for each tenant are different. One concern I have is that might end you in a place where it's hard to reason about predictable performance because you don't necessarily have the mental model of, this database performs like this in this workload, in this test, because Nile has done some magic for us under the hood so that maybe the queries are optimized for that. So I'm curious.

Kshitij Grover [00:10:10]:
Do you think that's a problem or a concern, or do you think trying to take away or abstract this tenancy concept, even in the face of those potential challenges, is still net very beneficial?

Gwen Shapira [00:10:22]:
So this is something that we ran into a lot when we ran Kafka as a service at Confluent.

Kshitij Grover [00:10:29]:
Okay.

Gwen Shapira [00:10:30]:
Because some people don't really care about internals. They are small teams trying to move fast. They prefer not to think a lot about Kafka or not to think a lot about how Postgres is going to scale for them and how to tune and optimize and so on. And then a lot of magic is absolutely the right amount of magic for them. On the other side of the spectrum, we absolutely ran into customers for whom it was a problem, especially when my team shipped autoscaling.

Gwen Shapira [00:11:00]:
And a lot of customers were like, I do not want you to autoscale for me. I have my own algorithms. I have my own experience. I know when I want to scale, and it will scale when I tell it to. And if I don't tell it to, it should not scale. And we understand that.

Gwen Shapira [00:11:15]:
And I always think that even if you absolutely don't want it to autoscale and you want to control everything, we can still make it easier on you. Make it a nice update tenant set capacity or something, and it will not be a whole operation that takes weeks before a big event. So we also hope to help those people, and then there are definitely some people who are DBAs. They're really attached to actually running their own database and making every single decision. And some of them are incredibly good at it, and they've been in the company for a long time. They know exactly what they're doing. The company will never replace someone like that. And this is fine.

Gwen Shapira [00:11:58]:
There are so many other Postgreses in the world. Nile does not have to be for people who don't derive value from Nile.

Kshitij Grover [00:12:05]:
That makes sense. I like that. It sounds like Nile gives you the tools to do something like this safely, but you can still tweak the defaults such that you don't have to opt in to every single piece of functionality or every single auto-optimization under the hood. It's interesting to start with multi-tenancy as the kind of root of the architecture. Did you have a sense

Kshitij Grover [00:12:20]:
of other different points that you thought were equally as important in modern B2B SaaS? I'm curious. Was it your background at Confluent, and maybe your cofounder's, that motivated the multi-tenancy specifically, or did you think through, okay, exhaustively, what part of the programming model is broken, and multi-tenancy just fell out as, okay, this is really the thing?

Gwen Shapira [00:12:46]:
Yeah, it was definitely a journey. So when we started out, the only thing we knew is that Confluent had about 3 times more people working on SaaS, on the cloud infrastructure, than it had working on Kafka, which was the core product. And even though we were the people who hired them, and we knew that they were all needed, it didn't feel exactly right that this is the state of the world. So we're like, we have to find a way to make SaaS easier. It cannot possibly require hundreds of people over years to build a SaaS product. And then when we started, we played with a lot of ideas.

We started with kind of a workflow manager, and we had some ideas from Kubernetes on how to integrate it with a control plane. This seemed pretty promising, but we had a hard time getting it off the ground as a business, I would say. Like, we were struggling with product-market fit. People understood Kubernetes in different ways. They didn't quite know what we were talking about. And then one day we talked to, actually, one of our early design partners, and we explained something that we're doing for maybe the 100th time. And then he said, oh, so I basically need to use you as a database.

Gwen Shapira [00:14:09]:
And we're like, yes, exactly. You need to store all this information here, and then you get all this value about data access. And you could see the light bulb go off. And so we went back from that meeting, and I still remember, like, me, Ram, and also Norwood, who is our technical founder. We were like, so we're like a database. Yes? And building a database has so many advantages. Right? Because you can enforce things at the layers that nobody can really get around, like nobody can accidentally leak tenant data. We can solve all those problems we're seeing other companies have around sharding, around what if I need to restore data to one tenant in this multi-tenancy model.

Gwen Shapira [00:14:58]:
Like, stuff just started clicking into place. If we do this, oh, those different regions, we can put tenants in different regions and that's a big deal for a lot of people. So, yeah, it all coalesced together. So I guess it was both our experience, but also things came into place. And also building a database has the huge advantage for a company that your customers will not want to build a database themselves. They may choose a different Postgres, which is fine, but they're not going to say, nice idea, I'm gonna do it myself.

Kshitij Grover [00:15:30]:
And not only does every company need some sort of database, but I think to your point, for every company it is very easy to resonate with this problem of, I have end customers. I need to organize their data potentially differently, and so this is something I know I will run into at some point, at least in the happy path successful outcome. So let's dig one step deeper into the technical architecture. I was reading more about Nile's architecture under the hood. So I noticed that you push, and you mentioned this briefly, you push the multi-tenancy story all the way to the page level in Postgres.

Kshitij Grover [00:16:03]:
So let's talk about that. And maybe just for background or context, let's say I'm just using regular Postgres, single machine deployment. Where does my data live in that world? And then how is this different in something like Nile?

Gwen Shapira [00:16:15]:
So basically, when you do create table, the table gets placed on a default tablespace. A tablespace is basically a bunch of files on disk. The file has blocks. Your table is basically a series of blocks in this file. Every row fits into one or more of those blocks. And then when you do select, the database goes to the file via an index or something similar, picks the correct blocks, puts them into memory, and then they stay in memory as cache for as long as you use them. But it's essentially the same blocks of data in memory or on disk.

Gwen Shapira [00:16:56]:
And if you do multi-tenancy as people normally do, then you are very likely to have single blocks that mix data from a bunch of tenants. This is almost taken for granted. And obviously you'll need a bunch of conditions to make sure that you're not selecting someone else's data. So, where this, where that. And then because everything is mixed up in the same physical blocks and those are the blocks that get backed up and everything, if you need to restore a tenant, it's a bunch of pain. How do you do that? If you need to move the data of one tenant to somewhere else, that's gonna be a bit of a problem. So for Nile, we basically rearchitected the concept of tablespaces, essentially.

Gwen Shapira [00:17:45]:
And we make sure that when you create a table, it's actually not necessarily a single file. Basically, every time you insert a tenant's data into that table, we make sure that you have at least one file for that tenant for the table, and we make sure that your data gets stored both in memory and on disk with the idea that your data is 100% fully isolated.
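
The difference between the two layouts can be illustrated with a toy model in Python. This is only a sketch under stated assumptions: blocks are modeled as small lists of `(tenant, value)` rows, `PAGE_SIZE` and the function names are made up for the example, and nothing here reflects how Nile's storage manager is actually implemented.

```python
from collections import defaultdict

PAGE_SIZE = 2  # rows per block; tiny on purpose (Postgres pages are 8 kB by default)


def shared_layout(rows):
    """Conventional multi-tenant table: rows fill blocks in arrival order."""
    return [rows[i:i + PAGE_SIZE] for i in range(0, len(rows), PAGE_SIZE)]


def per_tenant_layout(rows):
    """Hypothetical per-tenant layout: each tenant gets its own list of blocks."""
    by_tenant = defaultdict(list)
    for row in rows:
        by_tenant[row[0]].append(row)
    return {tenant: shared_layout(rs) for tenant, rs in by_tenant.items()}


def mixed_blocks(blocks):
    """Count blocks that hold rows from more than one tenant."""
    return sum(1 for block in blocks if len({tenant for tenant, _ in block}) > 1)


rows = [("acme", 1), ("globex", 2), ("acme", 3), ("globex", 4)]
shared = shared_layout(rows)        # every block mixes acme and globex rows
isolated = per_tenant_layout(rows)  # each tenant's blocks hold only its rows
```

In the shared layout, restoring or moving one tenant means picking its rows out of blocks that also hold other tenants' data; in the per-tenant layout, a tenant's blocks can be handled as a unit.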

Kshitij Grover [00:18:16]:
Got it. And so that's interesting because how should I think about the change that Nile has made to Postgres? Is it true that this is a change to the storage layer and not the query engine, or am I thinking about that wrong? Like, at what level of abstraction have things changed?

Gwen Shapira [00:18:35]:
We actually changed things in quite a few levels of abstraction. First of all, the storage layer changed. We have our own storage manager.

Kshitij Grover [00:18:43]:
Okay.

Gwen Shapira [00:18:44]:
And then at the query layer, we actually have not changed a ton, but we are nearing a point where we'll have to. For example, let's say that you connect as a developer and you want to run a report across many tenants. You actually need to convert select star from all the data into a lot of smaller queries that get distributed to all those separate virtual tenant databases. So it's a bit like when you select from a partitioned table, essentially, and it has to go over all the partitions. So that's a good way to think about it. So that's part of it. And then even before that, we have a proxy and a router to make sure that your query gets routed correctly. And then we have an entire transactional system to handle.

Gwen Shapira [00:19:35]:
We don't do distributed transactions because we assume that most transactions will belong to a single tenant. But when it comes to DDL, which in Postgres is transactional, we do have to make sure that you can do a transactional create table and create indexes and all those things. So we actually have a distributed engine that is plugged in as an extension to the hooks that Postgres uses both for DDLs and for transactions.
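
The cross-tenant report mentioned in this answer, where one logical query becomes many smaller per-tenant queries whose results are merged, can be sketched as follows. This is an illustrative scatter-gather in Python over in-memory data; `tenant_dbs`, `tenant_total`, and `cross_tenant_report` are hypothetical names, and the source does not describe how Nile's engine actually plans or executes such queries.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical virtual tenant databases: tenant -> list of invoice amounts.
tenant_dbs = {
    "acme": [100, 250],
    "globex": [75],
    "initech": [],
}


def tenant_total(tenant):
    """The per-tenant sub-query: runs against a single virtual tenant database."""
    return tenant, sum(tenant_dbs[tenant])


def cross_tenant_report(tenants):
    """Fan one logical query out to every tenant, then merge the partial results."""
    with ThreadPoolExecutor() as pool:
        partials = dict(pool.map(tenant_total, tenants))
    return partials, sum(partials.values())


per_tenant, grand_total = cross_tenant_report(sorted(tenant_dbs))
```

This mirrors the partitioned-table analogy: each sub-query touches exactly one partition, and only the merge step sees more than one tenant's results.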

Kshitij Grover [00:20:05]:
And then that's interesting. So if I use an instance of Nile today, I sign up for your SaaS offering and I make a query, that query needs to get routed to the right physical instance, and that routing or maybe just even that lookup table, is that metadata store, let's call it, hosted on your end? To what extent is this a fully managed offering versus I can see some of those internals and play around with them? Give me a sense of how much of that is abstracted away from me.

Gwen Shapira [00:20:37]:
The placement of tenants is very abstracted away from you at the moment. We do want to expose some decisions. So for example, region is something you'll want to decide about. Whether the tenant is placed with other tenants or you want to give this tenant some dedicated compute power, this is something that you get to decide about. But the placement itself is something we abstract, with the idea that when you need more capacity, we believe what you'll want to do is tell us, "Hey, I want more capacity," and not necessarily decide which tenant you want to move to use that new capacity.

Kshitij Grover [00:21:13]:
It's all quite interesting because I think in a lot of ways, most technology seems to be moving in this direction where you only want someone to have to think about a piece of complexity when it actually matters to them rather than upfront. I'm sure you saw this at Confluent where, as you were saying, when you're starting with Kafka, maybe you wanna think of it as just message in, message out. And then maybe you wanna think about partitioning or repartitioning, and then maybe you wanna think about multiple consumer groups or whatever the axis of complexity that's increasing is. So I'm curious, has there been anything, as you've been developing Nile, where you've been surprised, oh, people care about this, and I thought we could abstract it away, or maybe even the opposite? Has anything surprised you in that sense?

Gwen Shapira [00:21:58]:
Not as much in what people care about. Like, maybe even the reverse. Like, usually, we think people will care about stuff that they end up barely using. Like, I thought people used explain plans a lot more than they actually do.

Gwen Shapira [00:22:13]:
Most people don't even care about transactional guarantees. And we lost so many nights of sleep over that, and then so many people don't care about it. So I would say the reverse is the case. I'm trying to think if we had cases where people cared about things more than we expected. And I think we care about things a lot, probably more than our customers.

Kshitij Grover [00:22:39]:
That's one thing that I am guessing is very hard about building a database offering, which is every single customer you have must care in very different ways about parts of the product. As you're saying, maybe a lot of people don't care about transactional guarantees. Maybe some people really care about region placement. That's gonna be a strong feature request from them, maybe for compliance reasons. Some people care a lot about, I don't know, performance, and so you might really optimize things like cache locality or shard placement or something like that. That just seems like fundamentally a pretty hard problem to tackle as a small company trying to productionize this database. So how do you think about navigating that, just as like a product exercise?

Gwen Shapira [00:23:30]:
So our core belief is that if we build the correct building blocks and do a really good job on the infrastructure, other stuff will just magically work, essentially. And in many ways, this has proven itself out. So if you build distributed DDLs correctly, it makes growing to other regions way easier. You have the correct transactional capabilities to even go long distance. If you do a bunch of hacks, usually, you will need to rewrite your hacks every time product discovers something. We really hired a team of really deep experts with tons of experience, and they built the best architectures that they knew based on, I don't know, probably 100 collective years of experience building databases.

Gwen Shapira [00:24:27]:
And a lot of those things, where some people would care about X, they already took into account. I remember one really good example. I went to one of our top engineers and asked, I don't understand why you are doing this transaction thing the way you're doing it in the distributed transaction model. And he said, Gwen, one of the biggest problems in databases is that someone takes a lock on a table and then goes to have coffee. And in the 5 minutes in which they're getting coffee, effectively production is down.

Gwen Shapira [00:25:03]:
I'm going to do it this way, with a lock timeout. We're going to avoid the someone-going-to-get-coffee problem. And it's fantastic, and so Nile solves the someone did something and went to get coffee and now what do you do problem. So I think that really having those experienced people on the team is such a key to anticipating a wide range of customer problems that a single founder will definitely never anticipate alone, and then tackling them very early on from the architecture level.
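
The "went to get coffee" problem and the timeout idea can be shown with a small Python sketch. It uses an ordinary `threading.Lock` as a stand-in for a table lock; the helper names are hypothetical, and the source does not say how Nile implements its timeouts. Postgres itself exposes a `lock_timeout` setting that bounds how long a statement waits for a lock, which is the same general idea.

```python
import threading
import time

table_lock = threading.Lock()  # stand-in for a lock on a table


def with_lock_timeout(work, timeout_s):
    """Run work() under the table lock, giving up if the lock is not free in time."""
    if not table_lock.acquire(timeout=timeout_s):
        raise TimeoutError("could not acquire table lock; try again later")
    try:
        return work()
    finally:
        table_lock.release()


def coffee_break(duration_s):
    """Someone takes the lock and walks away for a while."""
    with table_lock:
        time.sleep(duration_s)


holder = threading.Thread(target=coffee_break, args=(0.5,))
holder.start()
time.sleep(0.1)  # let the holder grab the lock first

try:
    with_lock_timeout(lambda: "migrated", timeout_s=0.1)
    outcome = "ran"
except TimeoutError:
    outcome = "timed out"  # the waiter fails fast instead of queueing indefinitely

holder.join()
after = with_lock_timeout(lambda: "migrated", timeout_s=0.1)  # lock is free again
```

The design choice is that a blocked operation reports failure quickly and can be retried, rather than silently stalling everything queued behind it.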

Gwen Shapira [00:25:37]:
That makes a lot of sense.

Kshitij Grover [00:25:38]:
I'm imagining one thing that helps somewhat with this, although I'm curious how much mileage you get out of it, is the fact that you're still building on Postgres. So maybe tell us a little bit about that decision. I could imagine a world, and I don't know if you all ever thought about it, where you build this more from scratch. Right? Everything is from scratch. Multi-tenancy is baked into the query engine, and you don't, you know, use the Postgres query engine at all. Was that ever a consideration, or how much was Postgres being the starting point just obvious to you, especially now that Postgres is quite popular?

Gwen Shapira [00:26:14]:
I would say we haven't, for one second, considered building it from scratch. What we did consider for a while is to build on MongoDB, which is again another really popular data store for us, and we thought that maybe this would be a good bet. But Postgres architecture is really amazing. And there is something about the ways that they made it extensible, with a lot of hooks in the product, and the ways that it felt like the product evolves very responsibly. Like, they've been evolving for 35 years. Tons of use cases.

Gwen Shapira [00:26:49]:
They have really good project governance around that. The moment we saw what Postgres lets us do, we were like, this is amazing. We have to do it.

Kshitij Grover [00:26:59]:
That's interesting because it sounds like it's a combination of technical architecture as well as community adoption and, like you're saying, project governance. So across the board, I'm guessing this is why Postgres is such a popular database to build on top of or build on, because you have all of these things coming together.

Gwen Shapira [00:27:17]:
Yes. Exactly.

Kshitij Grover [00:27:19]:
The other obvious consideration when you're building a database is reliability. Everyone cares about this. If people are gonna depend on you in production, it's a big burden to bear. How has that been part of the journey so far? Is that something that, like you answered the other topic, is it every incremental architecture decision just has to take it into account in some deep way, or do you think there's some overarching way you think about reliability separate from evolution and future development and other parts of your road map?

Gwen Shapira [00:27:49]:
For us, it's, as you said, a very basic requirement. It's not exactly something you can just tack on. And this was, again, something that was really drilled into me at Confluent. We had an amazing VP of engineering called Ganesh. And I think in every meeting, he kept saying, nobody cares about all your features if Kafka is down. And the same is true for databases.

Gwen Shapira [00:28:15]:
It doesn't matter how multi-tenant we are. If we're not available, it's going to be a bit of a problem. This is the number one feature of a data store. And there are a lot of different ways that people think about database reliability. So it's things like, first of all, especially since we are doing our own storage manager, we actually need to make sure that every single committed byte is persisted to disk forever and ever. So this is obviously a big part of the game. And then there is: a machine crashed because AWS is having trouble, a zone crashed, a region is having trouble.

Gwen Shapira [00:28:53]:
So there's all those things. And then there are things like, I did something as a customer that is destabilizing. Can Nile handle these things? There are things like planned maintenance. We're going to upgrade the version of Postgres. Can we do it with zero downtime? Can we move a tenant from one place to another with zero downtime? So these are all things with various degrees of difficulty. Some of them are well known and well solved.

Gwen Shapira [00:29:22]:
Some of them are a lot more challenging. Zero downtime maintenance in Postgres is challenging. So this is an ongoing journey to get there.

Kshitij Grover [00:29:32]:
And why is that? Is that just because Postgres needs a hard restart when you do, let's say, a version upgrade, and there's, you know, no way to get around that, so to speak?

Gwen Shapira [00:29:42]:
The hard restart is not the biggest part of it because you can imagine us moving tenants to another database, doing a restart, and moving everyone back. It's more the transactional guarantees. That basically means that in order to do certain things, like move someone from place to place, you need to make sure that there is no transaction in progress.

Kshitij Grover [00:30:02]:
I see.

Gwen Shapira [00:30:02]:
And in order to guarantee that there is no transaction in progress, at some point, you have to basically take a lock and make sure that no new transaction can start while you're flushing blocks to disk and moving things around. So how fast you can do it is definitely an interesting challenge to work with.
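
The sequence described here (stop admitting new transactions, wait for in-flight ones to finish, then flush and move) can be sketched with a small gate in Python. The `TenantGate` class and its method names are hypothetical, written only to illustrate the ordering constraint; the source does not describe Nile's actual mechanism.

```python
import threading


class TenantGate:
    """Tracks in-flight transactions and can briefly block new ones (hypothetical)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._in_flight = 0
        self._closed = False

    def begin(self):
        with self._cond:
            while self._closed:            # new transactions wait during a move
                self._cond.wait()
            self._in_flight += 1

    def commit(self):
        with self._cond:
            self._in_flight -= 1
            self._cond.notify_all()

    def move(self, do_move):
        with self._cond:
            self._closed = True            # stop admitting new transactions
            while self._in_flight > 0:     # wait for running ones to finish
                self._cond.wait()
            try:
                do_move()                  # flush blocks, relocate the tenant
            finally:
                self._closed = False       # reopen the gate
                self._cond.notify_all()


gate = TenantGate()
events = []
gate.begin()  # one transaction is already running

mover = threading.Thread(target=gate.move, args=(lambda: events.append("moved"),))
mover.start()

events.append("commit")
gate.commit()  # the move can only proceed once this transaction is done
mover.join()
```

The time the gate stays closed is the window in which the tenant cannot start new work, which is why how fast the move completes matters so much.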

Kshitij Grover [00:30:24]:
One thing that kind of reminds me of is I'm wondering about what all the building blocks of Nile today are. Right? So we talked a little bit about the storage layer. What kind of cloud components are you using under the hood? Are you doing continuous backup to S3? How do you do scale to 0? Tell me a little bit about that overall kind of shape of how Nile is deployed.

Gwen Shapira [00:30:42]:
It's not very fancy, so it's almost embarrassing to talk about in a way. But we are running on top of Kubernetes, and most of the automation is homegrown. And as an engineering manager, there's always the challenging point where you ask your team to look into different operators, and maybe there is something we can borrow and build on. And they spend 2 weeks looking at all the operators, and then they come and say, we have to write our own. And then you're like, did you actually do all the research, or do you just want to write your own operator because it looks like more fun than using someone else's? But in this case, they actually made a good case for it: a lot of what other operators do assumes a lot of things about the architecture, about how you do replication, about how you do log shipping. And the moment you separate the compute into its own layer and you have a different way of moving logs from place to place and all that, they are not as useful as one may hope. So you have to write a lot of this infrastructure.

Gwen Shapira [00:31:54]:
And the fun of a startup is that you always have too much to do with too few people to do it with.

Kshitij Grover [00:32:00]:
I'm not sure everyone would be super familiar with what exactly it means to separate compute and storage. So I'm curious if you can talk a little bit more about that and what it means for a database to scale to 0, especially maybe in the context of multitenancy.

Gwen Shapira [00:32:14]:
Earlier, we mentioned the basic structure: you can have blocks in memory and then blocks on the disk. And scale to 0 basically means that you no longer have blocks in memory. You still have blocks on the disk. They're not going anywhere. You no longer use any compute power, essentially, and you no longer have the memory. And this is different from the separation of compute and storage, which is basically being able to have the physical blocks on the disk in one place, and then spin up different computes and have them start with blocks in memory without actually having to copy data.

Gwen Shapira [00:33:08]:
They simply connect to this source of data and can start running queries and using compute if they weren't before.
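
The distinction drawn above can be sketched in a few lines. This is a toy model under stated assumptions, not how Nile is built: `SharedStorage`, `ComputeNode`, and `Tenant` are hypothetical names, blocks are plain dictionary entries, and "compute" is just an object holding an in-memory cache.

```python
class SharedStorage:
    """Durable blocks that live in one place, independent of any compute."""

    def __init__(self, blocks):
        self._blocks = dict(blocks)

    def read_block(self, block_id):
        return self._blocks[block_id]


class ComputeNode:
    """Stateless compute: its only state is a cache of blocks in memory."""

    def __init__(self, storage):
        self.storage = storage
        self.cache = {}

    def read(self, block_id):
        # Pull the block from shared storage on first use; no bulk copy.
        if block_id not in self.cache:
            self.cache[block_id] = self.storage.read_block(block_id)
        return self.cache[block_id]


class Tenant:
    def __init__(self, storage):
        self.storage = storage
        self.compute = None  # scaled to zero until the first query

    def query(self, block_id):
        if self.compute is None:
            # Cold start: attach fresh compute to the existing storage.
            self.compute = ComputeNode(self.storage)
        return self.compute.read(block_id)

    def scale_to_zero(self):
        # Drop compute and memory; blocks on disk are untouched.
        self.compute = None
```

Scaling to zero only discards the `ComputeNode`; because storage is separate, the next query can attach a new node and read the same blocks without any data having been copied.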

Kshitij Grover [00:33:17]:
I think one thing I find interesting about these sort of cloud-first architectures, like the one you're describing, is you can do a lot of interesting orchestration stuff and even just, like, cloud cost optimization stuff in the SaaS model. But how disruptive is it, or how much does it affect the, like, local testing environment? Obviously, your database is something that you're running in maybe your local sandbox, maybe on your Mac, maybe even, like, a CI/CD environment. So how do you think about that story with something like that?

Gwen Shapira [00:33:48]:
This is a great question because it's something that I, as a founder, was very much blindsided by. Because we started developing it as a cloud service only, essentially. And we're like, yeah. If you want a dev database, they are extremely cheap. Just create a dev database and, yeah, go to town. Why do I need to worry about local development experience? Yeah. I think it took me less than a month to basically do some pair programming with one of the engineers on my team.

Gwen Shapira [00:34:22]:
And then one of the things he did was spin up a local Postgres. And I'm like, this is how you're testing? Why don't you test against our dev environment in the cloud? And he's like, "Oh, it's fine. I prefer to test that way," and I never got a good answer on why he prefers to test that way. But literally every developer I talk to prefers it that way. So there is clearly something to be said for running a Docker container locally.

Gwen Shapira [00:34:49]:
And maybe it's the ability to start it, stop it at will, or something along those lines that developers really like, and apparently you cannot really move them away from that. And we are working on a single container image so developers can keep doing that. We don't think it's gonna be particularly good for production. You should still use our cloud service for production. But basically, you cannot really change how customers want to do their job. It's their job.

Kshitij Grover [00:35:22]:
And it sounds like today, if I'm using Nile, I can still use the cloud offering for my dev database. I can also use a Postgres instance locally running on Docker, but having a Nile-specific image will make sure that the programming model is fully consistent. Is that right?

Gwen Shapira [00:35:38]:
It has a lot of benefits. So if you're actually using our SDK, which kind of allows you to take the identity of the user all the way from the browser to the database, then you have to have our container because you actually need our REST API services, essentially. If you don't, then I think probably a normal Postgres will be okay. Honestly, personally, I wouldn't trust it because I want to test on the same systems that production will be on. But I think, theoretically, it should be okay.

Kshitij Grover [00:36:11]:
Let's move to a maybe new topic that is hot these days, which is AI. Everyone's talking about AI. I know Nile, in fact, launched with support for embeddings at scale, and I think I wanna learn a little bit more about how you think Nile as a product fits into generative AI applications, maybe the RAG use case specifically. I know you all have published some content around that, but give me the kind of AI intersection with Nile.

Gwen Shapira [00:36:38]:
Honestly, with RAG we felt really lucky because, as a use case, most use cases involve ingesting a lot of customer documentation and then letting customers do some kind of semantic search on it. Or maybe graph search, but similar things. And it turns out that the one thing that you really don't want is accidentally mixing up different customers' data. And a lot of people are still learning about things like vector indexes and so on. Vector indexes have massive scalability challenges. And suddenly you don't really need one vector index for all your tenants. You can get a vector index per tenant, and it's going to be transparent.

Gwen Shapira [00:37:24]:
And guess what? We support it out of the box, and you don't have to do anything. And we didn't have to do anything either. It just worked. That's what I mean by saying that if you get the architectural building blocks right, it's amazing what happens next. So essentially, it landed on us as a use case.

Gwen Shapira [00:37:41]:
And, obviously, we talk mostly to, and sell mostly to, new companies. Yeah. And it won't surprise you that a lot of them are very interested in these use cases. So, yeah, we were a bit lucky in that we just give some things that everyone wants, which is privacy between tenants, applied to RAG, and scalability, applied to RAG. And we didn't have to do a lot to have it.
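
The per-tenant index idea from this answer can be shown with a small in-memory sketch. It is an illustration only: `TenantVectorStore` is a hypothetical class, it does a brute-force cosine-similarity scan rather than using a real vector index, and it assumes non-zero embedding vectors. The point is that a search only ever looks at one tenant's documents, so each index stays small and data cannot leak across tenants.

```python
import math
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


class TenantVectorStore:
    """Hypothetical store that keeps a separate document list per tenant."""

    def __init__(self):
        self._by_tenant = defaultdict(list)

    def add(self, tenant_id, doc, embedding):
        self._by_tenant[tenant_id].append((doc, embedding))

    def search(self, tenant_id, query, k=1):
        # Only this tenant's documents are ever considered.
        docs = self._by_tenant.get(tenant_id, [])
        ranked = sorted(docs, key=lambda d: cosine(query, d[1]), reverse=True)
        return [doc for doc, _ in ranked[:k]]


store = TenantVectorStore()
store.add("acme", "acme pricing", [1.0, 0.0])
store.add("acme", "acme onboarding", [0.0, 1.0])
store.add("globex", "globex secrets", [1.0, 0.0])
results = store.search("acme", [1.0, 0.1], k=1)
```

Here `results` contains only documents belonging to `acme`, even though `globex` holds an equally similar embedding.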

Kshitij Grover [00:38:08]:
And I'm curious, is that something where people are using Nile for just their RAG use case, or is it that they're using it for mixed workloads and just, like, everything needs to be tenant-aware or tenant-isolated, so obviously RAG also needs to be tenant-isolated? How does that mix look?

Gwen Shapira [00:38:23]:
We haven't seen anyone use Nile just for RAG, and I will be very surprised if I see it. I will not object, but it will be a bit surprising. And the people who use Postgres for vectors pretty much arrived at the decision because they're like, I don't want to deal with other databases, other clients. I just want to keep on using the thing I'm used to and now put some new type of data in it, and that's it. So everyone is using it for their entire SaaS.

Kshitij Grover [00:38:53]:
And actually, maybe related to that, let me ask you a broader question, which is, as an industry, like, how should we be thinking about specialized databases versus general-purpose databases like Postgres? Right? There's a bunch of offerings that are very specific to vectors, or even things like Elasticsearch, which is very specific to certain use cases. What is your perspective on this? Are those offerings going to persist? Is Postgres gonna dominate maybe 5, 10 years from now? Pick your timeline. What is the future of the database industry, so to speak?

Gwen Shapira [00:39:27]:
I think they will. So there's a few things. First of all, there are use cases that are not your actual live customer-facing data. So Elasticsearch will always be fantastic. Everyone uses, as far as I remember, Beats to get their data to Elastic, and I don't see this going anywhere. Logs are going to go to Elastic forever. Same thing for specialized data warehouses or big data systems for the warehouse kind of stuff. I think this is forever.

Gwen Shapira [00:39:59]:
For customer-facing data, for me, it almost seems like if you're asking this question, keep on using Postgres. The moment you need something else, you will likely know it. You will know that, yes, I know exactly why I'm going to need this NoSQL system. I need Cassandra for reason x. pgvector is no longer good enough. I need Chroma DB for reasons. Right.

Gwen Shapira [00:40:26]:
If you don't know any reason to use something else, I'm having a hard time. How will you even choose? How will you even evaluate?

Kshitij Grover [00:40:34]:
Awesome. This was a great conversation. I think maybe the last thing I wanna end on is: what are you most excited about that you're building at Nile today? And it could be a feature release or some technical architecture problem you're solving. Let's wrap on that.

Gwen Shapira [00:40:49]:
It's a problem being a founder because the thing I'm most excited about, I'm least allowed to talk about because we haven't made a big announcement yet. I'm going to have to decline. So it's wait for a big announcement.

Kshitij Grover [00:41:04]:
Either way, that sounds exciting. Thank you so much for being on the show. I appreciate your time. I appreciate all the technical details you were able to dive into.

Gwen Shapira [00:41:13]:
Thank you so much. It's been a really fun conversation for me. Awesome.