Watch this episode on YouTube:
https://youtu.be/41YJAflfnP4
Transcript
Jeremy: Hi everyone! I'm Jeremy Daly, and this is Serverless Chats. Today I'm chatting with Rafal Wilinski. Hey Rafal, thanks for joining me.
Rafal: Hi, thanks for having me.
Jeremy: So you are the creator of Dynobase and an independent AWS consultant. Can you tell the listeners a little bit about yourself and what Dynobase is all about?
Rafal: Yeah, sure. As you mentioned, I'm the founder of Dynobase, a professional graphical interface for DynamoDB, and also right now an independent AWS consultant, mostly focusing on serverless solutions. I've been deeply passionate about AWS, and mostly serverless, since, I think, 2016, because I attended the first Serverless Conf back in London, and that's why I became so excited about this whole field. I'm running my own blog which is called Servicefull, because we had so many discussions about what serverless is and how bad a name that is for a paradigm of technology.
So I decided to actually steal the term coined by Patrick Debois, which is Servicefull, because full of services. I'm writing about serverless, about cloud, and I recently merged that with my own page, but you can still go there. It's going to redirect you. Before going all in into AWS, I was actually making mobile games. I've made a few of them, one of them became even quite popular, which was called Voxel Rush. But my parents were saying that making mobile games isn't a real job, so that's why I transitioned into making web and cloud, and now I'm here. Less than a year ago, I started Dynobase, which was trying to solve my problems with UX and UI with DynamoDB, but I guess we'll talk about it a little bit later.
Jeremy: Right. Yeah. And actually, that's why I want to talk to you about today is Dynobase. So you and I have been communicating for quite some time. You know I'm a huge fan of DynamoDB. I love just the scale of it. I love what you can do with it. Rick Houlihan opened up, I think, everyone's mind or minds with what you can do in regards to relational structures in there and how you can access data in different ways in the single table design stuff, which is quite fascinating. But I really want to get into Dynobase, what it does, what's the purpose of it, but maybe we just start at the beginning. Why did you create Dynobase?
Rafal: Yeah, sure. Actually, I think there is quite an interesting story behind it, because it all started more than a year ago when I was working at X-Team. I was working on quite a big project for the educational space. I was working as an AWS DevOps engineer, I was setting up infrastructure, and it was all set up on containers. But we received a new requirement to create a fully real-time community platform. Our architect, Raynard, who is a great friend and also an engineer, approached me and said, "Hey, I know that you're super interested in serverless. I know that you've contributed some pieces to the Serverless Framework, and you write about it. So maybe we can try evaluating all those new tools that AWS released, like AppSync, Amplify, DynamoDB, and try to create something on top of that."
And that sounded super good because I could finally use serverless technologies at my day job. So we immediately rushed into evaluating those tools. We watched a lot of re:Invent sessions, including Rick Houlihan's about single table design, and it was mind blowing and charming at the same time. But when we were evaluating those tools, we realized that if we would like to use AppSync, we probably have to use Amplify, and if we go with Amplify, we can't go with single table design, because if you're using Amplify, you're creating a GraphQL schema, which is then translated to separate DynamoDB tables, which doesn't work well with single table design. So we decided to use the Serverless Framework and DynamoDB single table design to create everything. And we rushed into the implementation without having thought at all about testing processes or about debugging, because we decided to learn as we go, since we had that opportunity.
And when we started the implementation, obviously, there were a lot of bugs and a lot of mistakes. Our single table design schema was designed very well, but it was a new field for us, so we obviously made a lot of beginner mistakes. While we were doing that, we started checking a lot of data, a lot of records inside DynamoDB just in the console, because we needed to check if a specific record was inserted correctly, if the data inside the database was actually good, or we wanted to modify some records manually. And while I was doing that, I realized that I was spending definitely too much time switching between browsers, between regions, between AWS accounts, because, for instance, you can't have two separate AWS accounts open, two separate AWS consoles, in one browser. It was actually a pain. The same for regions. So you had to kind of hack your browser. You had to have many browsers open and it was just super messy. There were no bookmarks. There was no history. Scanning speed was quite bad.
So I decided, I'm an engineer, I don't like wasting my time fighting with software. I like automation, I like solving stuff. So I decided to hack something together really quickly using React and Electron, a tool which was going to allow me to query in an easier fashion. Yeah. And I asked a few of my friends who are also working with DynamoDB, who are engineers, if they shared the same pains as I did, and it turned out that a lot of people actually have the same problem with DynamoDB: DynamoDB itself is super good. It's a super powerful database. It enables things that you could never imagine before. But the way that you access it, the way you modify records directly when you debug it, is not super good, especially if you're working with the local version of DynamoDB. You can't easily access what's inside.
And once I had that confirmation that it's not only my own problem, I initially wanted to open source the solution. But I felt that, hey, I had been working in the open source space for the past five or even more years, I've committed so many lines of code, I've contributed so many things to the Serverless Framework and to other tools, and I haven't received any money for it. It was a greedy approach, but, you know. So I decided to turn it into a product and try creating a business out of it, because it seemed that many people would pay for solving their problem. And when I realized that it could actually work, I had this moment of big excitement and I gained a lot of power just to work tirelessly, even after my day job, just to finish this UI. I think I did the first prototype in a month.
It took me something like 100 hours. I was about to release it. I announced on Twitter to everyone like, "Hey, in a week, I'm going to show you something really good I've created ... to contract with myself and with my audience that DynamoDB is about to get much, much easier." And before releasing the alpha version, while browsing Twitter casually one day, I realized that AWS just released NoSQL workbench.
Jeremy: Right.
Rafal: And the moment I saw NoSQL workbench, I was devastated, because I'd spent the last month or two, countless tireless hours, and so many nights working on something that I truly believed in. And then comes AWS, with some of the world's best engineers, working on the same idea at the same time and probably killing your idea, because if AWS is doing the same thing as you, they probably made it better, right? So even without downloading the software, even without starting it, the day that NoSQL workbench appeared, I was devastated. But the next day I decided, "Hey, maybe let's try, maybe let's download it, maybe let's run it." And one thing I realized is that NoSQL workbench appeared to be a super good tool for designing your data model, for designing your single table design, for doing all the things that you do before actually jumping into implementation and actually working in development.
My pains were a little bit different, though. We already had our single table design set up. We only had problems with inspecting the data that was already in the table. And I felt that NoSQL workbench was actually replicating many of the quirks and weird things from the AWS console. So, after trying it out I said, "Hey, it might sound similar, but these are definitely two different products, solving different use cases." And that feeling gave me enough confidence to actually push forward and release the alpha. And I released it, I think, a few days after NoSQL workbench, which might sound a little bit weird, you know, AWS released theirs, I released my own, but, hey. And their tool is free; mine is paid, so this is even more weird.
And I released it, and after I released it, I let it go. I said, "Hey, I can finally take a breath. Now I can just rest and see how the money is flowing." And you know, guess what? The product wasn't super polished. And not many people are interested in buying it when you have a free alternative, right? I gradually started to implement fixes and changes, because obviously it was not a super polished product at the very beginning. But somewhere around the beginning of this year, one of the customers approached me saying, "Hey, you've created quite a promising piece of software, but I think you can do much more when it comes to productization of it, when it comes to the growth, to the sales, to the strategy." And he proposed quite a ridiculous thing, which is, "I can help you with that, maybe we can become co-founders." And without actually thinking too much, I jumped on a call with this guy and he seemed fine.
So I decided, "Hey, what's the worst thing that can happen? I can only just waste some time, but there's so much to gain." And we decided to actually collaborate. We signed a contract which was less than one page, saying that all the expenses and all the revenues are split 50/50. I take care of code, he is taking care of growth, and we started working. We designed the application from scratch. We started thinking about both power users and users who don't even know what a GSI is.
Jeremy: Sure.
Rafal: And we released version 2 and I couldn't be more satisfied with what we've created so far. I'm super happy about the state of Dynobase. It's not only helping me, it's helping a lot of my friends. There are already hundreds of engineers using Dynobase. There are enterprises and teams saying, "Hey, this is great." And yeah, it was a super great thing to make, and every single dollar I make on Dynobase is much more satisfying than even $1,000 you make consulting at your day-to-day job. And that's how it's rolling right now.
Jeremy: Right. Awesome. Well, I mean, that's a great story. I mean, and that's one of those things, too, where I mean, I'm a big fan of open source. And I've done the same thing. I've put a lot of code in open source. And it's great to see people benefit from it. And you wouldn't be a true AWS user if they didn't rebuild something that you already built, too. So that is a common tale. You implement something and then AWS comes up with something that just solves it for you. But, so, I think what's really interesting about what you're doing with this product, again, is you're taking a different tack from what I think NoSQL workbench is doing. Because it's very much design specific and I think you have a lot of those tools as well within Dynobase. But what about the differences between the console, because exploring data is just a super pain. So what am I going to see differently if I'm using Dynamo, sorry, if I'm using the DynamoDB console versus using Dynobase?
Rafal: Yeah, sure. So I think the first thing, which I mentioned during my story, is that working with multiple regions, multiple accounts, and multiple tables is much easier, because we have actually tried to replicate the experience that you get in a regular internet browser. You can really easily switch the profiles, switch the region, switch the table, you can open multiple tabs, you can take a look at 20 of your tables across many regions, without switching contexts, logging out, logging in. You know, it's a pain. So that's the first thing: you can access your data much faster.
The second thing is that we know that many people are not super aware of DynamoDB specifics, for instance, indexes. So we decided to abstract away the concept of indexes, LSIs, GSIs, and stuff like that. The way you query data inside Dynobase is like this: for instance, you would like to get a user with an email johndoe.com, let's say that's an example. And as a DynamoDB beginner, you don't know if that field is actually already indexed, because the table was probably provisioned by someone else. You don't know how it works. So the only thing that you know is that you would like to see all the records with the associated email, johndoe.com. And in the DynamoDB console, you have to switch between query and scan. And I don't even know what the difference between query and scan is. I have to choose some kind of index. I don't know what any of that is. So what Dynobase does is automatically figure out if we can use query instead of scan for the given attributes, for the things that you're looking for, because it's much faster.
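As a rough illustration of the query-versus-scan choice described here, the following is a minimal sketch. It is not Dynobase's actual algorithm, which the conversation does not spell out. It only encodes the DynamoDB rule that a Query needs an equality condition on the partition key of the table or of one of its indexes. The `KeySchema` class, the `plan_read` function, and the `email-index` name are hypothetical and exist only for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class KeySchema:
    """Partition key plus optional sort key of a table or index (hypothetical helper)."""
    partition_key: str
    sort_key: Optional[str] = None
    index_name: Optional[str] = None  # None means the base table


def plan_read(search_attrs, schemas):
    """Pick Query when some partition key is among the searched attributes, else Scan."""
    best = None
    for schema in schemas:
        if schema.partition_key not in search_attrs:
            continue  # Query requires an equality condition on the partition key
        uses_sort = schema.sort_key in search_attrs
        # Prefer a schema whose sort key can also be used in the key condition
        if best is None or (uses_sort and not best[1]):
            best = (schema, uses_sort)
    if best is None:
        # No usable key: every searched attribute becomes a filter on a full Scan
        return {"operation": "Scan", "index": None,
                "key_attrs": [], "filter_on": sorted(search_attrs)}
    schema, uses_sort = best
    key_attrs = {schema.partition_key}
    if uses_sort:
        key_attrs.add(schema.sort_key)
    return {"operation": "Query", "index": schema.index_name,
            "key_attrs": sorted(key_attrs),
            "filter_on": sorted(set(search_attrs) - key_attrs)}


# Assumed example table: base keys pk/sk plus a GSI keyed on email
schemas = [KeySchema("pk", "sk"), KeySchema("email", None, "email-index")]
print(plan_read({"email"}, schemas))  # plans a Query on the assumed email-index
print(plan_read({"age"}, schemas))    # no matching key, so it plans a Scan
```

In a real tool, the key schemas would come from the table description rather than being hard-coded, and the remaining `filter_on` attributes would be turned into a filter expression.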
You can enter many attributes and we can find what's the fastest way to actually get that data. Once you have those fields filled in, we are telling you, "Hey, we will be using query instead of scan to find this data because it's much easier." Also, in the AWS console, you are capped to 100 items and you have to click the arrow to get the next page. And it's also slow. If you're running a scan, it can sometimes take even hours. Our solution is much faster simply because we use a different algorithm, we use different pagination settings, and we are using, sorry, scan segments, and it makes this whole thing much, much faster. The last thing is editing the data, because in the DynamoDB console, you have to open this weird modal, and in Dynobase you can also edit the data in a similar fashion, but you also have the same editor that you see in Visual Studio Code.
It's a great IDE. So when you click to edit an item, you see the same editor you see in VS Code, and you can edit the JSON directly. But you can also click on an attribute, double click it, change the value, and save it. We are also doing what most database tools are doing, in that we are not immediately committing all the changes to the database. It's kind of a dry run while you're making modifications to your data. Once you've made modifications to, for instance, 10 records or 10 entities, you can then decide to save them, and I think it's just much safer. Yeah, and we also have a variety of other tools that the DynamoDB console does not have. We have a history of queries, we have bookmarks, we are generating the code, because sometimes you don't know how to write those expression attribute values, expression attribute names. I remember the first day I started messing with the DynamoDB SDK, I had no idea how to write a single filter expression, right? And I think that's the problem many developers have. They know how to write SQL, but they don't know how the API works.
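To make the point about expression attribute names and values concrete, here is a minimal sketch of how a simple equality filter can be expressed with the DynamoDB parameters `FilterExpression`, `ExpressionAttributeNames`, and `ExpressionAttributeValues`. The `build_filter` helper and the sample attributes are hypothetical, and this is not the code Dynobase generates. The values are plain Python values, which is the shape assumed for boto3's resource-level `Table.scan`; the low-level client would need typed values instead.

```python
def build_filter(conditions):
    """Turn {"attribute": value} equality conditions into DynamoDB filter parameters."""
    names, values, clauses = {}, {}, []
    for i, (attr, value) in enumerate(sorted(conditions.items())):
        name_ph, value_ph = f"#n{i}", f":v{i}"
        names[name_ph] = attr      # "#" placeholder avoids clashes with reserved words
        values[value_ph] = value   # ":" placeholder carries the literal value
        clauses.append(f"{name_ph} = {value_ph}")
    return {
        "FilterExpression": " AND ".join(clauses),
        "ExpressionAttributeNames": names,
        "ExpressionAttributeValues": values,
    }


# "status" is a DynamoDB reserved word, so it must go through a name placeholder
params = build_filter({"status": "active", "email": "john@example.com"})
print(params["FilterExpression"])  # prints "#n0 = :v0 AND #n1 = :v1"
```

Assuming a boto3 `Table` resource named `table`, the result could be passed as `table.scan(**params)`.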
Jeremy: Right.
Rafal: Yeah.
Jeremy: So, yeah. So, that I mean, that's one of those things to where I just find that to be super interesting. Where it's like, there's all these little tiny features that can make a product better. And they're simple things, like the query history. That's one thing that drives me nuts, is that I'll be in the console, and I'll search for some ID, and then I'll get something back. And then you can't just open a new window with another thing or try to use a new tab to open something new. It's all just embedded. It's really tough to use sometimes. And then you have to go back and make all these different queries. And then you might have to go back, find something, change a record, or whatever.
And it usually involves opening, again, multiple windows and things like that. So I love how that just speeds up that development time or that debugging time, as you said. So, you mentioned this idea of writing queries and that you have the ability to generate the code for them, and the NoSQL workbench does as well, if you've sort of gone through a more, I think, more lengthy process. But I think just in general, it is not super easy for someone to just write a query. And that's just one of the challenges that I think developers face. So what other challenges do you see DynamoDB developers facing as they're trying to build out a solution?
Rafal: I think there is just one big challenge. And that challenge is interconnected with many smaller challenges. The challenge is definitely to change your way of thinking from relational thinking, because many developers are used to working with Postgres, with MySQL. Then you go to the Dynobase, well, to the DynamoDB space, and there are no joins. You can't cross reference tables, and you probably hear some people saying crazy things, like that you should put all your entities into one table. And it's crazy, right? The first time I heard that I should put all the data inside one table, it was really crazy. So there's an educational gap. I think this is a challenge for developers. And there is this requirement that you need to think differently. You need to unlearn what you've learned already about relational databases. And yeah, we think that educational resources are still needed. We still need tools, and there is a massive gap to be closed. Yeah.
Jeremy: Yeah. I mean, it's not just the education either. I think that there's material out there, whether it's Rick Houlihan's, videos or whether it's Alex DeBrie's book now, there's some training courses. Certainly it's an investment to learn DynamoDB and do that. But what about the tooling? I mean, obviously Dynobase is one tool, NoSQL workbench is another tool. Are more tools needed do you think for people to really embrace DynamoDB?
Rafal: That's a good question. I think NoSQL workbench was definitely needed. Because if you are aiming to create a single table design, it definitely makes it a little bit easier for you. More tooling, I think that, yeah, we still need more tooling because I kind of treat DynamoDB as a low level service. I mean, in the future I would like to see some kind of abstraction over the single table design, because if you're inspecting the table and see all the different entities in one table, it kind of feels wrong, even after working with DynamoDB. I would like to see some kind of abstraction which is separating all that.
We need better tooling. And I think that also your modules are also solving that problem. For instance-
Jeremy: Trying to.
Rafal: ... DynamoDB Toolbox. It's a brilliant tool, which is mapping the attributes and yeah, it's super useful. And I think you also wrote on Twitter once that there is always a need for more resources and for more tools, because the same phrase rephrased in a different way can resonate with the person that is reading it, can finally click for that person. So the more resources we have, the more tools we have, the more freedom we have. And yeah, that's only a good thing, right?
Jeremy: Right. Yeah. I actually think that's a... I love this idea of repackaging, even if it's the same content, but just slightly a different way or a tool that works a different way. And as you mentioned, the DynamoDB toolbox is only for Node.js, right? It's just JavaScript. So that solves a specific problem for a specific group of people. But there's Python utilities, there's Java utilities, there's other things and there needs to be more because not everybody likes to deal with the same levels of abstraction. So I totally agree with that. So where is Dynobase going, though? I mean, what's the roadmap look like? Are you planning on doing some of these other abstractions or what's on the roadmap for you guys?
Rafal: Yeah, sure. So we think we have kind of a mission of closing this gap between how great DynamoDB is and how few developers know about it. We are aiming to solve that issue with two things, which are tooling and education. When it comes to education, we are constantly repackaging the content. I mean, we have great sessions by Rick Houlihan, we have a great book about DynamoDB. And it's great. But as I mentioned before, I think things that are repackaged are also kind of beneficial. Maybe just this diagram will work for someone, maybe this sentence said differently will work for someone. So we are constantly writing guides, we are making educational resources to make sure that more people understand it and more people use it, and, yeah, so that's when it comes to resources.
When it comes to the tooling, we also think that the educational gap can be partially closed by tooling. Imagine you use DynamoDB for the first time. I think you're going to approach a huge cliff, because everything is super different. So what we are aiming to do with Dynobase is abstract away the complexities and different things about DynamoDB and just let people start working with DynamoDB, and then figure out all the details later. For instance, you are now able to query the data first and only then learn what a GSI is. You can query the data and you can also generate code that is ready to be pasted into your application. Once you have code that is following the best practices, that is working, you can just go back and see how it's working.
Actually, that's the way I got into programming. I started modifying some source files from games, maybe some configuration values. And that's how I learned programming. So I think that when Dynobase is generating code to query or to scan, it's also helping because it lets you use the database without actually knowing what's happening inside. It's definitely important to know how the database works, but at the very beginning, we can make the process a little bit easier for the people. One thing that we also identified is that most of the back end developers are already familiar with SQL or "S-Q-L." And you cannot use that language with DynamoDB. So what we have on our roadmap and which is the biggest challenge for us is to enable querying DynamoDB with SQL. And that way, we can just make people use DynamoDB easier. And yeah, hopefully it will drive bigger adoption.
Jeremy: Yeah, well, that'd be really interesting, too, is just if you could take some sort of T-SQL parser and give it auto complete, and then be able to start typing things in and then maybe even make the suggestions of where if you wanted to query the data this way here might be your optimal GSIs, or here might be optimal way to store the data and so forth. I think that's one of the tools that I would love to have, where basically just like, copy my ERD into some system and then have it do some thinking for me and come back and say, "Okay, so here are the entities that we want to create from this, here's the relationship between the entities and so forth."
But the other thing, and this is, again, maybe to the education side of it, where I think you've got a lot of developers who think that working with DynamoDB and, maybe I should take a step back, because I think I want to ask this question a little bit differently, the challenges of working in the cloud and working with serverless, this is something I think that is very, very new to a lot of developers. I've interviewed a lot of developers in my day, especially a lot of young kids coming out of school and no offense, calling them kids, but they're kids, and they come out of school and they know nothing about the cloud, or they know very, very little about it, like "Oh, I used Firebase one time."
But they've never developed anything on the cloud and serverless is this foreign concept to them because, again, their professors are still teaching them how to develop on servers. You know what I mean, and that level of thinking and that sort of on prem type of thinking. So what are some of those challenges that you see, maybe from people just going to the cloud and using serverless technologies?
Rafal: Oh. So there's... There are definitely many things. Because if you are using cloud and if you're using serverless, you definitely need to understand IAM. And that's a really hard thing to get: all the policies, roles, users, groups, maybe even some SSO, and there are a lot of things that you need to wrap your head around. And there are many other primitives inside the cloud that are working with serverless. You cannot just think that I will have a Lambda function because it's serverless. Many people think that serverless is actually just Lambda. No, there is a series of services that are working together and there is also a whole foundation on top of that. There's this distributed way of thinking, I think, where you don't have this local, where you don't have that drive, where you don't have networks. It requires so much more.
Jeremy: Right. Yeah. No, I totally agree. I mean, that's, I think there's just that step from this idea of writing everything monolithically to being able to separate all these whole pieces, and having those things work together. So, let me change the subject a little bit, because I think having your experience and knowledge of DynamoDB will be helpful here and get some insight into a question that seems to come up all the time where people say, "Well, serverless is great for spiky workloads or workloads that barely run, or run every once in a while." And so that argument I don't necessarily agree with because I think that serverless works great because sometimes you have spiky workloads, sometimes things just roll straight and you have predictable traffic. And I think that's a smart way to build an application so you don't have to think about the underlying infrastructure.
That being said, there's a similar argument that is against DynamoDB, where they will say, "Well, DynamoDB is overkill for a small little project. So if I'm just building a little side app or small project, internal or something like that, I'll just spin up a MySQL database. And I'll just write it that way. Because it's not going to get a lot of traffic." I disagree with that because I like the fact that with DynamoDB, I don't have to think about database backups. I don't have to think about scalability, if for some reason it does, maybe I need to do a bulk load of data into it, and maybe the database isn't powerful enough for whatever those reasons are, but what are your thoughts on that scale argument? I mean, do you think that people even if they're building something small should default to DynamoDB or should they be using something else until they get to a point where maybe they need that scale and that NoSQL back end to handle massive amounts of traffic?
Rafal: I totally agree with you. I think that they should already go to DynamoDB because I feel like this is the database of the future. And if you're, for instance, starting a small business, a small startup, I think there are two things that you definitely would not like to care about. I think they're bureaucracy and maintenance, and DynamoDB is the thing which lets you set up your database and forget it, you're done. You don't have to care about maintenance, patching, security, tuning the performance, checking everything works, making it highly available and stuff like that.
You get that out of the box. And it's going to be future proof because AWS engineers will take care of that and your database probably will upgrade itself many times throughout the project and you don't have to do anything about it, you just have a reliable data store. Yeah, and this way, when you don't have to care about all those things, you can focus on those things that are making you differentiate on the market. You can innovate, you can build, you can focus on application logic. And I think that's the core of innovation, that we don't have to do the grunt work, we can be just creative. And serverless and the cloud makes this creativity easier.
Jeremy: Right. Yeah. And actually, I think that, for me, the biggest sort of pro to using something like DynamoDB is, if you are not using it, it costs zero dollars. You know what I mean? And so if you set up, even if you spin up an Aurora Serverless database cluster and I think you can do one ACU now, but it still costs me $30 a month or something like that to keep that constantly running. And granted, you can shut it down and have it sleep and some of those things and certainly save yourself money that way.
But it's the same argument I think with spinning up an EC2 server in order to write a Node.js app or Python Flask app or something like that, where it just seems like if you're getting barely any traffic, or you're experimenting, you're trying all these other things, that that cost argument is huge. I mean, I can build as many DynamoDB tables as I want to and it's most likely in that free storage, which is awesome, right? So I'm not paying anything, even to store data, I think you get 20GB of storage for free or something like that, which is insane.
So, yeah, I really like that idea of just being able to do these things very quickly and very easily, and very cheaply because in the past, I would spend, I mean, I remember this in the days before, I mean, I would be spinning up multiple EC2 instances. You'd have a SQL database running or MySQL database running and as soon as you put that into production, you couldn't be running just one, right? You had to have some sort of replication there. Then you're always worrying about that, you're thinking about failover, and all that kind of stuff. All of that stuff goes away when you start using serverless applications.
Rafal: Yep, that's true.
Jeremy: All right, great. So let's move on to another topic that I think kind of ties into this. And that's this idea of, again, changing your thinking from relational databases to DynamoDB. So what are the mental shifts that developers have to make in order to go from writing T-SQL and just saying select star from whatever to dealing with NoSQL queries and really the limitations that are added to the types of queries they can run?
Rafal: Yeah, sure. So I think that there are actually two challenges. The first one is that we are taught to always normalize the data. We have the second normal form, the third normal form, we aim to de-duplicate the attributes, the data, to store them in separate tables in MySQL or Postgres, and you have to unlearn that. DynamoDB works totally differently if you want to use it efficiently with one table. And that's just simply hard. If people have been using relational databases for the past 10 years and someone says, "That's totally different, you shouldn't do that here," it requires a lot of effort. But the second thing is that it requires you to be involved in the creation of the application a little bit earlier, because if you're going to use DynamoDB, you need to know the access patterns, because the access patterns are actually shaping your data models.
And if you'd like to take care of designing the data model responsibly, you need to be involved in the business process. You need to understand the client, because if you understand the client, you can build your access patterns accordingly. Maybe you can interact with them, maybe you can suggest some kind of change. Because once you've committed to the data model, or if there are requirements passed to you from the top, maybe you will realize some time after going that route that you cannot change something, you cannot alter some decisions. So I think being a cloud engineer, as opposed to a software engineer, requires you to have this broader knowledge and to take broader responsibility. And actually, cloud allows us to have more responsibility in the business process because we no longer have so much responsibility in maintaining those underlying services and tools. So yeah, I think it's good and it's fun to be involved in business and in shaping those access patterns.
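One way to picture how access patterns shape the data model is the following in-memory sketch of a single table holding two entity types. Everything in it is an illustrative assumption: the `PK`/`SK` naming convention, the sample items, and the `query` helper, which only mimics the behavior of a DynamoDB Query (exact match on the partition key, optional prefix match on the sort key, results ordered by sort key). It does not call DynamoDB.

```python
# Hypothetical items: user profiles and orders stored together in one table
TABLE = [
    {"PK": "USER#1", "SK": "PROFILE", "name": "Ada"},
    {"PK": "USER#1", "SK": "ORDER#2020-03-17", "total": 12},
    {"PK": "USER#1", "SK": "ORDER#2020-01-05", "total": 30},
    {"PK": "USER#2", "SK": "PROFILE", "name": "Bob"},
]


def query(pk, sk_prefix=""):
    """Mimic a Query: exact partition key, optional begins_with on the sort key."""
    hits = [item for item in TABLE
            if item["PK"] == pk and item["SK"].startswith(sk_prefix)]
    return sorted(hits, key=lambda item: item["SK"])  # Query returns sort-key order


# Access pattern 1: a user together with all of their orders, in one request
everything = query("USER#1")

# Access pattern 2: only a user's orders, oldest first thanks to the date in the SK
orders = query("USER#1", "ORDER#")
print([o["SK"] for o in orders])  # prints ['ORDER#2020-01-05', 'ORDER#2020-03-17']
```

The keys are chosen because of the two access patterns, not the other way around. A pattern that was never planned for, such as "all orders above a certain total across all users", has no key to lean on here, which is the kind of late surprise described above.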
Jeremy: Yeah, I totally agree with you, because when we think about modeling data for a SQL database, it's usually just, okay, well, what fields do we need for this particular entity, and then we can always join them afterwards. So deciding on those access patterns is important. But the other thing, I think, where the shift needs to be made is, NoSQL might be right for you or NoSQL might not be right for you, depending on what your application is, right? And I know Rick Houlihan talks a lot about this idea that if you're building something where your queries are changing all the time and then you move to NoSQL and you say, "Okay, well I want to be able to select star from this or want to be able to join this..."
Which you obviously can't do joins, not the traditional way anyways in SQL, but that developers will become disappointed if they put data into DynamoDB and then realize they don't have that query flexibility. So what are your thoughts on, how do we tell developers that? Because it's really hard, I think, for some of them to grasp. It's like, "Well, if it's a database, I should be able to query it." So what's that advice that we give those developers about when they should choose NoSQL?
Rafal: So I think that there is a general answer to that question, because so many times I see developers rushing into implementation without properly researching the topic and without properly knowing the requirements and the limitations of the technology. And that also applies to this specific problem. You need to know what the limitations are from the technology perspective. And you need to know what the requirements are from the business perspective. If you immediately rush into implementation, you can realize that, "Hey, I've made some bad decisions and it's not going to end well." You're probably going to hack some things and it's going to end badly. I've seen that and I've been put in projects like that before. So yeah, lesson learned. Take your time, and spend more time on research.
Jeremy: Right. Yeah. Because I think that's the other thing. It just hits people in the face if they implement something in NoSQL and then they're like, "Well, why can't I do X or why can't I do Y?" So what about ERDs, right? Building your entity relationship diagrams and things like that. That's still something we want people to do before they jump into a NoSQL design?
Rafal: So I think that they're not going anywhere. We still need those for productive discussions, for working on the application layer, for proficient communication. It's just this concept of translating that into how it's actually going to be stored in DynamoDB. Only that part is different. And I think we actually need some kind of better, I don't know, spreadsheets, abstractions, tools to visualize how those things are evolving from ERDs to different shapes and forms, how the data is structured and stored and then translated back to a business domain.
Jeremy: Yeah, totally agree.
Rafal: So that's another tool that we can build.
Jeremy: Yeah, that'd be great, right? That's what I said. I mean, I would love that ERD input tool that just spits out, "Hey, here's how you want to structure your DynamoDB table, your NoSQL table." So, all right. So another thing that I think comes up a lot, especially with single table design, is this idea of, well, how many entities do you put into a single table? So if you're building some really large application, are we putting, you know, hundreds of different entity types in the same table? And I always say, "No, we want to use a separate table for each microservice." So what are your thoughts on that?
Rafal: I think it all depends on the project and all the things that are specific to your use case, to your requirements. I think many microservices can definitely interact with one single table. Because thanks to really granular IAM policies, you can, for instance, restrict the access of one Lambda function to only specific DynamoDB records inside a table using, I think, leading keys and attributes or something like that. You can say that this Lambda function has access to this grand single table with all the entities, but it can only interact with the entities whose key begins with, for instance, I don't know, payment or invoice or something like that. So it's definitely doable. Also, I think there is a sentence in the AWS documentation saying that the best-designed applications require only a single table. So it kind of contradicts that, but it also contradicts what Amplify is doing. I think AWS doesn't have one singular statement on that. And it changes case by case.
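The kind of restriction Rafal describes is usually expressed with the `dynamodb:LeadingKeys` IAM condition key, which constrains the partition key values a caller may touch. Below is a minimal sketch in Python, assuming a hypothetical table ARN and a hypothetical `INVOICE#` partition key prefix. The `leading_key_allowed` helper is only a local approximation of the wildcard match for illustration; it is not how IAM itself evaluates policies.

```python
import json
from fnmatch import fnmatchcase

# Hypothetical policy for one Lambda function's role. The account ID, region,
# table name, and the "INVOICE#" prefix are all made up for this example.
INVOICE_ONLY_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["dynamodb:GetItem", "dynamodb:Query", "dynamodb:PutItem"],
            "Resource": "arn:aws:dynamodb:us-east-1:123456789012:table/AppTable",
            "Condition": {
                # Every partition key in the request must match the pattern.
                "ForAllValues:StringLike": {
                    "dynamodb:LeadingKeys": ["INVOICE#*"]
                }
            },
        }
    ],
}


def leading_key_allowed(partition_key: str) -> bool:
    """Local approximation: does the key match any allowed wildcard pattern?"""
    condition = INVOICE_ONLY_POLICY["Statement"][0]["Condition"]
    patterns = condition["ForAllValues:StringLike"]["dynamodb:LeadingKeys"]
    return any(fnmatchcase(partition_key, pattern) for pattern in patterns)


# The policy document as it would be attached to the function's role.
policy_json = json.dumps(INVOICE_ONLY_POLICY, indent=2)
```

With a policy along these lines, a function can share the single table with other services but only reach items whose partition key starts with the invoice prefix. Restricting which attributes can be read or written would use the separate `dynamodb:Attributes` condition key, which is not shown here.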
Jeremy: Yeah. Well, I mean, it also depends on how you define application, right? So I mean, if you have an application that has a payment service and a user service, things like that, each one of those services could be considered a separate application and you'd be storing the data differently that way. I mean, certainly what you don't want to do, at least, I guess, as more of a best practice from a microservices perspective, is you don't want to be storing data across bounded contexts in the same table or in the same database. You want to keep those separate so that one service can't update data in another service without using a formal contract through an API or some other method to do that.
All right, so what about some of the patterns though, that you can build off of that? So I mean, we know we've got DynamoDB streams. So if you are building separate tables for individual microservices or individual applications, obviously you need to be able to potentially share some data back and forth. But what are some of the patterns that that sort of allows you to implement?
Rafal: So we can definitely use event sourcing and command query responsibility segregation, because thanks, for instance, to DynamoDB streams, you can react to the changes that are pushed to the DynamoDB tables. And actually, I think that DynamoDB streams are also solving some of the problems of DynamoDB. For instance, there is always this analytics requirement in all the projects, where you sometimes need to aggregate some value. In DynamoDB, if you want to, for instance, sum the value of all the items inside a table, it's not going to end well because you need to run a scan through all the records and probably merge it, reduce it, run some really complicated process. Thanks to DynamoDB streams, you can aggregate the value just in time and always have that attribute, that updated value. And you don't have to run the query on demand, you can always have the result of some kind of aggregation whenever you want it. It requires a little bit of work and it requires a little bit of education. And there is also a change in thinking required, but it's definitely doable.
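A minimal sketch of that just-in-time aggregation, in Python: a function that takes a DynamoDB stream batch in the shape Lambda receives it and computes the net change to a running sum. The `amount` attribute and the sample event are hypothetical, and the sketch assumes the stream is configured to include both new and old item images. Writing the result back to an aggregate item is left out.

```python
from decimal import Decimal
from typing import Any, Dict


def amount_of(image: Dict[str, Any]) -> Decimal:
    """Read the hypothetical numeric 'amount' attribute from an item image."""
    return Decimal(image.get("amount", {}).get("N", "0"))


def net_change(event: Dict[str, Any]) -> Decimal:
    """Sum the effect of every insert, modify, and remove in one stream batch."""
    total = Decimal("0")
    for record in event.get("Records", []):
        change = record.get("dynamodb", {})
        # INSERT has only NewImage, REMOVE has only OldImage, MODIFY has both,
        # so new minus old covers all three cases.
        total += amount_of(change.get("NewImage", {}))
        total -= amount_of(change.get("OldImage", {}))
    return total


# Hypothetical batch: one insert (+10), one change from 10 to 25 (+15),
# and one removal of an item worth 5 (-5).
sample_event = {
    "Records": [
        {"eventName": "INSERT",
         "dynamodb": {"NewImage": {"amount": {"N": "10"}}}},
        {"eventName": "MODIFY",
         "dynamodb": {"OldImage": {"amount": {"N": "10"}},
                      "NewImage": {"amount": {"N": "25"}}}},
        {"eventName": "REMOVE",
         "dynamodb": {"OldImage": {"amount": {"N": "5"}}}},
    ]
}

delta = net_change(sample_event)
```

In a real handler the delta would typically be applied to a single aggregate item with an atomic update, so reading the total becomes one key lookup instead of a scan. Because a failed batch can be retried, a production version would also need to guard against counting the same record twice.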
Jeremy: Yeah, no, and one of the great patterns that I really like, too, is this idea of just using DynamoDB streams to take the data and put it into an Aurora Serverless database as well. Because you can just use a small instance, and if there's too much pressure on the database, then obviously that can back off because DynamoDB streams will just build up. I mean, I wouldn't use it for translating something like clickstream data into a MySQL database, but certainly for applications that are just create, read, update, delete type stuff, it's a very cool way to have that extra data there for you with multiple things you can do with it. And like you said, I mean, you can push that off into EventBridge or do some sort of event sourcing with it. So very, very cool stuff. So another thing about education, and you've mentioned education many, many times. One of the things that you're doing on top of Dynobase is you have a DynamoDB newsletter. Can you tell us about that?
Rafal: Yeah, sure. So each week, we are gathering some interesting articles and videos, and we will probably also be sharing some live sessions from AWS. And that's also kind of part of our mission, to share the good content. So we decided to start a DynamoDB newsletter something like 20 weeks ago, I guess, somewhere around then. And yeah, you can sign up and we'll deliver to your inbox the best resources we can find, so you don't have to spend all day on Twitter like I do. And yeah, feel free to join.
Jeremy: Awesome. Well, I am a subscriber. I love the newsletter because again, I like reading great content about DynamoDB, and unless you are just trolling Twitter all day, it is very hard to find that. So that aggregation of that data is very, very helpful, and sometimes I take some of those articles and I put them in my newsletter, so thank you for sourcing those for me as well. Anyways, Rafal, thank you so much for taking the time to talk to me today and obviously for Dynobase. So if people want to find out more about Dynobase or more about you, how do they do that?
Rafal: Just go to dynobase.dev, that's the homepage of our product. If you want to approach me, I think the best way is just to find me on Twitter. It's @rafalwilinski and I'm pretty sure it's going to be included in the description of this podcast because it's difficult to spell for non-Polish people. And yeah, just go to dynobase.dev or Twitter. And that's it.
Jeremy: All right, awesome. I will get all that in the show notes so they will be able to spell your name. Thanks again, Rafal.
Rafal: Thank you.