Data in the Wild

Today, we’re joined by Anthony Eden, co-founder of DNSimple. He talks to us about the difference between DNS and domain registration, what ephemeral data is, and more.
Chapters
  • (00:35) - What is DNSimple?
  • (01:24) - DNS versus domain registration
  • (02:44) - DNSimple’s tech stack
  • (04:05) - Example of ephemeral data
  • (10:06) - Tall tale about the missing model
  • (32:19) - Anthony’s data modeling tip
  • (36:35) - Tall tale about the server that almost caught fire


Sponsor
This show is brought to you by Xata, the only serverless data platform for PostgreSQL. Develop applications faster, knowing your data layer is ready to evolve and scale with your needs.

About the Hosts
Queen Raae wrote her first HTML in 1997 after her Norwegian teachers encouraged her to take the new elective class. 

Around the same time, Captain Ola bought a Macintosh SE for his high school with the proceeds from the school newspaper he started.

These days, you’ll find them building web apps live on stream and doing developer marketing work for clients. They are both passionate about the web as a platform and the joy of creating your own thing.

Visit queen.raae.codes to learn more.

Creators & Guests

Host
Benedicte (Queen) Raae 👑
🏴‍☠️ Dev building web apps in public for fun and profit 👑 Helping you get the most out of @GatsbyJS📺 Streams every Thursday: https://t.co/xaLy43cqMI
Host
Ola Vea
A piraty dev who also help devs stop wrecking their skill-builder-ship ⛵. Dev at (https://t.co/8m50kyT981) & POW! w/👑 @raae & Pirate Princess Lillian (8) 🥳🏴‍☠️
Guest
Anthony Eden
Founder of DNSimple, vendor of duct tape, purveyor of UDP packets. He/him. Mastodon: https://t.co/OZlBruTRQO
Editor
Krista Melgarejo
Marketing & Podcasts at @userlist | Originally trained in science but happily doing other stuff in SaaS & tech now

What is Data in the Wild?

Learn from your favorite indie hackers as they share hard-earned lessons and tall tales from their data model journeys!

Brought to you by Xata — the only serverless data platform for PostgreSQL.

[00:00:00] Ola: Welcome to Data in the Wild! Discover data model tips and tricks used by our favorite indie hacker devs, with Queen Raae! I'm your host, Captain Ola Vea and this podcast is brought to you by Xata, the serverless data platform for modern web apps.

[00:00:22] Benedicte: And today's guest is the great and powerful Anthony Eden, co-founder of DNSimple.

[00:00:29] Welcome to the show, Anthony.

[00:00:31] Anthony: Thanks. Thanks for having me on.

[00:00:33] Benedicte: Oh, you're a great guest to have on. So before we get into your experience with data modelling for DNSimple, could you tell us a little bit about the problem that DNSimple solves?

[00:00:47] Anthony: Sure. So it was founded 13 years ago, really because at that time there weren't any. We were in between phases and there was a lot of opportunity to make a DNS provider that would be good for developers.

[00:00:59] And so that's what I did. I essentially made the DNS provider that I wanted. And within six months after founding, then I started doing domain registrations as well, because that was also painful at the time. And that was just the origin and we've developed it over the years to be something where we focus on automating all kinds of things around domain management. Through APIs, through various tooling built into our web app, things like that.

[00:01:24] Benedicte: So what's the difference between DNS things and domain registration?

[00:01:30] Anthony: So DNS is the operational side. So when you think about any type of website or email service or any other service where it actually has to be found on the internet, something has to translate the names that we use into the IP addresses and that's the foundation of DNS.

[00:01:44] So DNS is 24/7, 365 operationally intensive. You have to have really good redundant systems that stay online. Domain registration is kind of the intellectual property side. That's where you get into, okay, I'm going to register a name and then it becomes, at least while you hold it, your property that you can then use with the DNS system in order to get traffic to your properties or send out emails or whatever it might be.

[00:02:12] And so there's, they're often mixed together. Some people say DNS and they really mean both systems.

[00:02:19] I kind of like to mentally separate them because I feel like they're such different beasts. One is extremely administrative heavy. All the domain registration side is very, it's got a lot of policies and procedures and things like that.

[00:02:32] And on the other side, it's very much tech heavy. The DNS operation side is very, essentially keeping things running all the time. It means working really hard to have good systems.

[00:02:43] Ola: Yeah.

[00:02:44] So could you quickly run through your tech stack?

[00:02:48] Anthony: Sure. So on the web application side, we run Rails, say that fast three times.

[00:02:56] And we run the latest version of Rails. We have a Postgres database for our data store. It's pretty standard, really. We use a lot of gems that help enhance things, but the core stack is Rails and Postgres.

[00:03:11] Now on the DNS side, we're currently running an Erlang name server that we've been running for, I'd say about 10 years now, nine or 10 years that we developed specifically for our needs.

[00:03:22] We're starting to sprinkle in some other name servers as well. We may be moving things around a little bit to try different authoritative name servers.

[00:03:31] And then we also use Go quite a bit in between. So when we're talking about moving data to the various edges, because we run an Anycast network with points of presence around the world.

[00:03:43] And so you have to push data out to those authoritative name servers. And there we lean in heavily on Go as our language of choice there. And that's kind of the, that's most of it. We also do use Redis as well for sort of ephemeral data as well, in addition to Postgres for our more long term data storage.

[00:04:05] Benedicte: What would be an example of ephemeral data?

[00:04:08] Anthony: So for example, short term session data for rate limiting API calls would be a good example of ephemeral data. If you have a Redis instance go down, it's not that big of a deal if you reset the counter for somebody. So it, we try to only store long term permanent data in the Postgres database and the stuff that's going to be accessed very often and that can go away, we just shunt that into our Redis database usually.
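A minimal sketch of the kind of rate-limit counter described here, assuming a fixed window per API token. The `EphemeralStore` class is a hypothetical in-memory stand-in for a Redis-like store (increment plus expiry), and the limit and window values are made up for illustration; it is not DNSimple's implementation.

```python
import time


class EphemeralStore:
    """In-memory stand-in for a Redis-like store: counters with a time to live."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._data = {}  # key -> (count, expires_at)

    def incr(self, key, ttl_seconds):
        """Increment a counter, starting a fresh window if the old one expired."""
        now = self._clock()
        count, expires_at = self._data.get(key, (0, 0.0))
        if now >= expires_at:
            count, expires_at = 0, now + ttl_seconds
        count += 1
        self._data[key] = (count, expires_at)
        return count

    def flush(self):
        """Simulate the instance going down and losing everything."""
        self._data.clear()


def allow_request(store, api_token, limit=3, window_seconds=60):
    """Fixed-window rate limit: at most `limit` calls per window per token."""
    return store.incr(f"rate:{api_token}", window_seconds) <= limit


store = EphemeralStore()
results = [allow_request(store, "token-a") for _ in range(4)]
print(results)  # [True, True, True, False]
store.flush()
print(allow_request(store, "token-a"))  # True: the counter was simply reset
```

Losing the store only resets the counters, which is why this kind of data can live outside the long-term database.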

[00:04:35] Ola: So let's get into the data modeling then. Yeah. What data model has changed the least for you since you launched?

[00:04:45] Anthony: So I was thinking about this. It's a tough one because all of, we've been around for 13 years. So all of our data models have gone through a significant number of changes. Funny enough, I think when I was looking, we have some data models around DNS templating.

[00:04:59] So these are things that our customers can use to create a template that they can then apply to immediately create a series of DNS records, right? A whole bunch of DNS records on a particular zone. And looking into it, I think that has changed the least since we've started. Probably because that feature has changed the least since we've started.

[00:05:20] Now, when I get into the tall tale, I'll talk a little bit about how that's going to have to change a little, but that's probably the one I would say. There's a couple around that, let's call it like DNS template and DNS template record or DNS record template, something like that.

[00:05:35] Benedicte: So that would be the act of storing the template and then using the template.

[00:05:40] Anthony: Correct. Exactly. And it just hasn't changed much even though the record model that it gets applied to has changed a bit over time. I think the templates themselves, they just, they haven't, because they don't really contain that much information.

[00:05:53] They're just kind of like a template that you can use. So they don't need to go through a bunch of changes.

[00:05:57] Benedicte: Oh, that's cool. That's a new one. It's a new one. I haven't heard that one before.

[00:06:03] So last time, I think a lot of people have mentioned that they're kind of, their user model has changed very little, but then what it's kind of connected to in like plans and subscriptions and stuff has changed a lot.

[00:06:15] Has that been true for you?

[00:06:18] Anthony: Our user model has changed over the years, primarily because we shifted. When we first built it, the first design had users attached to everything. And then we introduced the notion of an account. And so we started saying an account is going to be the entity that is connected to billing and subscriptions, things like that, and users actually represent an authentication, a person.

[00:06:37] So it's the credentials, if you will. And then that, and then as we added more features around how users can get credentials. So adding things like two factor auth, adding things like multi factor auth, then moving over to hardware keys, those types of things required regular and steady changes to the user model.

[00:06:55] And in addition, over time, we've just kind of enhanced the user model to make it better for our internal usage. So we have data bits on there that you'll never see, but they're very important for us as well. So that's one of the reasons why for us, it's actually changed fairly steadily over the years, sometimes in big booms.

[00:07:12] So when we built that, when we linked up the accounts concept and we created a many to many relationship between the two, that required some significant overhauling. And then it kind of slowed down a little bit again and stabilized.

[00:07:25] Benedicte: So would you agree with Monica, one of our earlier guests? She said, always make it an array.

[00:07:32] Anthony: In terms of?

[00:07:35] Benedicte: In terms of like, or many to many. Like always make it a many to many because it will happen and at that point, the work you'll have to do to change it from like a one to one, it's easier to just make it many to many from the beginning.

[00:07:51] Do you find that to be true or is there any pitfalls of doing that?

[00:07:56] Anthony: I'm always reluctant to over design things. I'm reluctant to build something that I may need five years from now because I don't know, it just seems like over engineering is the source of a lot of code that we end up throwing away in the long run. Granted in this particular case, a relationship.

[00:08:16] One thing I will absolutely agree with is that users should represent an entity, a person that is connecting and the credentials around that, the things around that. Not, it shouldn't be related to your billing. It shouldn't be related to account management. Those really should be designed from the beginning as two separate models.
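One way to picture the split between users and accounts is sketched below. All class and field names (`User`, `Account`, `Membership`, `Directory`, `plan`, `role`) are hypothetical, chosen for illustration; the only things taken from the conversation are that users hold credentials, accounts hold billing, and the two are linked many to many.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    """A person who signs in: credentials only, no billing."""
    email: str
    has_two_factor: bool = False


@dataclass
class Account:
    """The entity that owns billing and subscriptions."""
    name: str
    plan: str = "free"


@dataclass
class Membership:
    """Join record that makes users and accounts many to many."""
    user: User
    account: Account
    role: str = "member"


@dataclass
class Directory:
    memberships: list = field(default_factory=list)

    def add(self, user, account, role="member"):
        self.memberships.append(Membership(user, account, role))

    def accounts_for(self, user):
        return [m.account for m in self.memberships if m.user is user]

    def users_for(self, account):
        return [m.user for m in self.memberships if m.account is account]


alice = User("alice@example.com", has_two_factor=True)
bob = User("bob@example.com")
personal = Account("Alice personal")
agency = Account("Agency", plan="team")

directory = Directory()
directory.add(alice, personal, role="owner")
directory.add(alice, agency, role="owner")
directory.add(bob, agency)

print([a.name for a in directory.accounts_for(alice)])
print([u.email for u in directory.users_for(agency)])
```

Changes to sign-in (two factor, hardware keys) then touch only `User`, while plan changes touch only `Account`.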

[00:08:35] Benedicte: Yeah. I think that's where it really. That's one place where you can like look a little bit ahead. Like you can get away with just having a user model and not an account and all of that.

[00:08:47] But with everyone we've talked to, like that quickly changes. And it's one of those things that will save you some headaches fairly quickly, if you thought it through a little bit.

[00:08:58] But that leads us over to what models has changed the most for you?

[00:09:06] Anthony: So it's a pair of models and their interrelationship, and they are also the source of my tall tale.

[00:09:14] So we talked about just a little bit ago, the difference between the operational concepts of DNS and the IP concepts of domain registrations.

[00:09:22] Benedicte: Yes.

[00:09:23] Anthony: The models that represent those two things have changed, I think, the most over the years. So one is the zone, and that represents the operational DNS side, and the other is the domain, and that represents the registration side, and those are kind of core models inside of our system.

[00:09:39] So every time we've touched relationships with other things, every time we've created new bits of data that we need to store on those things, they've had to change. And unfortunately, being a core model, it means changing them usually has impacts that we didn't even foresee. But at this point, and I'll get more into it in the story later, we've mostly stabilized. We have one more kind of big change and that's what I'll talk about.

[00:10:06] Benedicte: Should we just get into the tall tale since it feels like a lot of things are connected to this tall tale?

[00:10:11] Anthony: I mean, if you want to, we can. I'm ready.

[00:10:13] Benedicte: Let's do it.

[00:10:15] Anthony: Well, I've already given kind of the backstory, right?

[00:10:17] DNSimple started, we talked about this beforehand, but for the audience now, that's hearing this recording in the future, DNSimple started with just zones. So just the notion of DNS. And then about six months afterwards, I started putting in the notion of domain registrations. And if I recall correctly, I believe in the very beginning, I had a domain model that was actually the zones.

[00:10:42] So that was the kind of core of things. Then later on, we had to start separating those two notions. So we had the zone and we had the domain. And I think that the challenge we've always faced is we never could figure out what tied these two things together. Which one, from a customer point of view, is the entry point into these two?

[00:11:04] What is the thing that ties them together? And so that leads me to the missing model that we still haven't implemented to date, that we constantly work around, but that's coming up this year. There needs to be something in the middle that I think is the name itself, right? So both zone and domain have names in them.

[00:11:23] They have an actual, like a dot version of the name, like example.com, and both retain that as strings inside their models. And what we've come to realize over the last few years, and just haven't had the chance to implement, is that there's a third model that our customers think of, and that is the domain name.

[00:11:41] That is actually that thing that has that string representation. And that model is super interesting because if we do it right, that thing might not be unique. It might be that one account holds the right to have that name operational at any given time, but that historically another account could have had it, right?

[00:12:02] So now we have to start thinking about how do we have a model where only one is allowed to be active and so that we can't have name takeovers and things like this. So I think by, at some point later in this year, we'll have a kind of a little triangle there where we have a domain name that's the entry point that everybody sees.

[00:12:17] And from that, if you go to the operational side, you'll be working on the zone model. And if you go to the intellectual property side, you'll be working on something that might still be called domain, or it might be called something like domain registration. So we'll see. So that is my tale: you go 13 years with a mediocre model and still go a long way.

[00:12:38] But at some point you're going to have to fix the damn thing. This year is the year that we're going to have to fix the damn thing.
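The "only one active holder at a time, but history is kept" rule can be sketched as follows. This is an illustration under assumptions, not the model DNSimple plans to build: `DomainNameClaim`, `NameRegistry`, and the integer account ids are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DomainNameClaim:
    """One account's hold on a name; old claims are kept as history."""
    name: str
    account_id: int
    active: bool = True


class NameRegistry:
    def __init__(self):
        self.claims = []

    @staticmethod
    def _normalize(name):
        return name.lower().rstrip(".")

    def active_claim(self, name) -> Optional[DomainNameClaim]:
        name = self._normalize(name)
        for claim in self.claims:
            if claim.name == name and claim.active:
                return claim
        return None

    def claim(self, name, account_id):
        """Refuse a second active claim, which is what blocks name takeovers."""
        current = self.active_claim(name)
        if current is not None:
            raise ValueError(f"{name} is already active on account {current.account_id}")
        new_claim = DomainNameClaim(self._normalize(name), account_id)
        self.claims.append(new_claim)
        return new_claim

    def release(self, name, account_id):
        current = self.active_claim(name)
        if current is None or current.account_id != account_id:
            raise ValueError("no active claim for this account")
        current.active = False


registry = NameRegistry()
registry.claim("example.com", account_id=1)
try:
    registry.claim("Example.com.", account_id=2)
except ValueError as error:
    print(error)
registry.release("example.com", account_id=1)
registry.claim("example.com", account_id=2)
print(len(registry.claims))  # 2: one historical claim, one active
```

In a relational database the same rule is often enforced with a uniqueness constraint that applies only to active rows, so the check does not depend on application code alone.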

[00:12:45] Benedicte: The tale of the missing model is

[00:12:47] Anthony: The tale of the missing model. It's funny though, because in hindsight it's with missing models. You're like, "Oh, of course! It's so obvious that needs to be there," but it took us a lot of years to the point where we really needed it to be there.

[00:13:02] And so now we're finally going to bite the bullet and make the investment to make it be there. I think.

[00:13:08] Ola: Yeah.

[00:13:09] Benedicte: So you said something interesting there. You said like what the user perceived it to be.

[00:13:15] Do you think it's been missing because you've been looking at it from the technical side?

[00:13:21] Anthony: There's a pretty strong chance that as engineers, we took an engineering point of view and said, "Okay, these are the models that make sense to us in terms of how we're using the models." But maybe what we didn't look at is how do our customers see this?

[00:13:37] Right now, I mean, we kind of separate the two inside of our interface, but we don't. We merge them together. It's very convoluted how you get through from either the registration side or the zone side. And the relationships of other things to the domain and zone is also fairly convoluted, at least in my point of view.

[00:13:57] And I want to make it, we want to make it better. And so we, we're kind of just at the point where we really feel like this third model, this thing that just represents the name itself is kind of the key to unlock the door to a simpler way of thinking about how, not just the backend, but our front end works as well for our customers.

[00:14:17] Benedicte: Cool. Because then that would be the entry point. When you, when the user signs in, they would kind of land on a page that would get most of its data from that new data model or like from that data model and through to the other ones.

[00:14:32] Anthony: I think more importantly is that that model becomes the identifier. So right now you drop into DNSimple and you get onto a page where it lists all of your domains once you go into a particular account.

[00:14:43] But what are you actually listing there? Are you looking at the operational side? Are you looking at the registration side? You're not really looking at either of them, or you might look at either one of them depending on where your mind currently is, what you're trying to think of right now. But at the heart of it, there's still a list of names.

[00:15:00] And so that's the part that we want to be able to say, this is where everything starts from.

[00:15:04] Benedicte: Yeah.

[00:15:05] Anthony: And now here, you know, you might want to focus on the IP side and go to the domain registrations, or you might want to focus on the operational side and go over here.

[00:15:13] And I think one of the biggest pieces of software or services that we use that kind of does a really good job at this is Airtable. Airtable has this notion of, I mean, they're kind of a data model expressed as a SaaS, right? But they have this really great concept of views that are super malleable. And I think that we're starting to see the value of that concept where the operational side is one view, the registration side is a different view.

[00:15:40] And maybe there are different types of views within that. You know, maybe sometimes you're thinking, I just want to know about the, which of my names are going to expire soon? What is the temporal aspect of each of those names? But then other times you might think, what is the, how is it related to the operational side?

[00:15:57] Where is it delegated to? Like who's actually running the DNS for it? And so we're starting to latch onto this idea of these views as being maybe the way forward for the customers to switch. But again, at the core of it, there's always a list and that list, at least in our domain space, is the name. Domain name.

[00:16:17] Benedicte: Okay. So this will simplify the way you think about it internally, but then also simplify it for your customers.

[00:16:24] Anthony: That is the hope.

[00:16:26] Benedicte: And hopefully also make the front end code simpler.

[00:16:30] Anthony: Yeah, that I'm not going to just go, that's a little overboard for me. I don't know.

[00:16:34] Benedicte: That's a little overboard.

[00:16:36] Anthony: No, our front end code is already pretty simple.

[00:16:38] So we stuck to a very classic model for the web where the vast majority of it is just on the page. And we only use JavaScript for enhancements most of the time.

[00:16:51] Benedicte: I wasn't sure because you're on Ruby, like I'm in more of the SPA world, so I was,

[00:16:55] Anthony: I'm sorry.

[00:16:56] Benedicte: Well, it's all I know. It's all I know.

[00:17:01] Maybe I should go back and do those Ruby courses that I never did. But we'll see.

[00:17:10] Do you want to follow up, Ola?

[00:17:11] Ola: Yes! So if you could time travel back in time, what would you undo or change in your data model or data models?

[00:17:20] Anthony: I would have absolutely, if I could travel back in time, I would have the notion of a name as this as sort of like, that's the heart, a domain name is kind of the core thing.

[00:17:30] And then I wouldn't. I wouldn't take that name and push it out necessarily into zones and domain registrations. I might have to, that's the other thing too, like there are some complications, which means variations, might have to exist nonetheless. So I'll give you an example of this in the domain registration side when you're dealing with internationalized names.

[00:17:53] So these are ones with non-ASCII characters. You have a Unicode version of that, which the customer sees, and then you have what they call a Punycode version of it, which converts it into an ASCII version. And so you have to materialize that, like, you have to have both those versions in essence.

[00:18:13] And so then the question becomes, do you materialize that into one of the, on the lower level models, like a domain registration or what have you? And it's challenging in that sense. There's a lot of. It's a lot of weird stuff. I mean, I guess that's what happens when you have the name of the thing as being the core of your model, both inside and outside of your system.
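The two forms of an internationalized name can be shown with Python's built-in `idna` codec. This is a small illustration only: the codec implements the older IDNA 2003 rules, the helper names `to_ascii` and `to_unicode` are made up here, and `bücher.example` is just a sample name.

```python
def to_ascii(name: str) -> str:
    """Convert a Unicode domain name to its ASCII (Punycode) form."""
    return name.encode("idna").decode("ascii")


def to_unicode(name: str) -> str:
    """Convert an ASCII (Punycode) domain name back to Unicode."""
    return name.encode("ascii").decode("idna")


unicode_name = "bücher.example"
ascii_name = to_ascii(unicode_name)
print(ascii_name)              # xn--bcher-kva.example
print(to_unicode(ascii_name))  # bücher.example
```

Because each form can be derived from the other, a model can store one and compute the second, or store both; that is the materialization question raised above.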

[00:18:35] Benedicte: So I have to ask though, cause I am not a hundred percent sure on that. So what do you mean by a zone?

[00:18:40] Anthony: Okay. So a zone is a collection of DNS records that are used for operations. So normally a zone includes what they call a start of authority. That's the SOA record.

[00:18:53] And this says, this server is going to be the authoritative DNS for this set of records. Now it's a tree. So inside of that zone, you can also say, and we're going to delegate a sub zone out somewhere else. And so then that might have its own zone and its own start of authority and it might own part of it.

[00:19:17] So that's how DNS works at its core. You start from the root, which has nothing. And then you have, for example, a TLD zone. So that's com, and the com zone, all it really has, the vast majority, is just name server records that say, go to these name servers to get this second level domain, go to these to get this second level domain.

[00:19:35] So you know, streamyard.com is going to have one, two, three, four, five NS records, which is where people will then go to find that out. And we actually have a webcomic at HowDNS.Works that explains all of this really well.
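The walk from the root down through delegations can be sketched with a toy lookup. The zone data below is hypothetical and hard-coded, the name server names are placeholders under `.example`, and the function ignores caching, glue records, and everything else a real resolver handles.

```python
# Hypothetical delegation data; real zones hold far more records.
ZONES = {
    ".": {"delegations": {"com.": ["a.tld-servers.example."]}},
    "com.": {
        "delegations": {
            "example.com.": ["ns1.dns-host.example.", "ns2.dns-host.example."]
        }
    },
    "example.com.": {
        "delegations": {},
        "records": {"www.example.com.": "192.0.2.10"},
    },
}


def resolve(name):
    """Walk from the root, following NS delegations until a zone answers."""
    zone = "."
    path = [zone]
    while True:
        data = ZONES[zone]
        answer = data.get("records", {}).get(name)
        if answer is not None:
            return answer, path
        # Find the delegated child zone that the name falls under.
        child = next(
            (z for z in data["delegations"] if name == z or name.endswith("." + z)),
            None,
        )
        if child is None:
            return None, path
        zone = child
        path.append(zone)


answer, path = resolve("www.example.com.")
print(answer)  # 192.0.2.10
print(path)    # ['.', 'com.', 'example.com.']
```

Each step only knows who to ask next, which is what lets responsibility be spread across many operators.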

[00:19:53] Benedicte: We'll put that in the description.

[00:19:55] Anthony: Yeah.

[00:19:55] Benedicte: Whenever I get into this and I like read up on it, and I've done that several times and my mind is wiped on that, it amazes me that it even works.

[00:20:06] Like there's just a bunch of servers, like asking other servers, where should I go? Like, and it all happens so fast. And I'm like, every time I'm like, how? How can this work?

[00:20:19] I don't know if you can answer this question alone. Or if we're all amazed.

[00:20:25] Anthony: I mean if you look in my bio in many places I say I'm a vendor of duct tape.

[00:20:31] Because basically what I do is try to keep this stuff running all the time from our little corner of the world. But I think the reason that it works is because the original protocol designers did a really good job of creating something that would stand the test of time. And they said, we're going to do this.

[00:20:47] We know that it needs to grow over time. Cause they had confidence that the internet was, they already saw the growth rates. They knew that there was going to be this need to spread the responsibility out amongst a lot of different entities. And so they built that into the protocol to make what is essentially a distributed or a federated protocol.

[00:21:07] And I think that is the reason that it works. They made decisions that made sense at the time. So they make choices that just ended up working correctly for a long, long time.

[00:21:21] Now, even nowadays, the one other thing you'll know about the DNS set of rules is there are a lot of them. So all of the rules are defined in RFCs. So internet RFCs, which are kind of this open way of describing protocols, and the DNS RFCs are very extensive and there are constantly new ones coming out. And so the evolution continues even in a protocol that's been around now for, well, it's been around since what, the eighties, so we're talking 40 plus years.

[00:21:57] And it just doesn't stop. People keep making improvements to it, adding more features to it. And that's just, I think why it works. Cause there's still people that are passionate about making it work and constantly willing to evolve it.

[00:22:11] Benedicte: So do you know, is there. Going back to our questions about data models, do you know if there's like any choices that the people who are deep into this would've loved to have undone? Or have changed 30 years ago?

[00:22:25] Anthony: I have no doubt that there are a bunch. So when you think about an RFC, the way that the data models typically work, at least in the DNS space, is they describe packets, right?

[00:22:35] So if you go way back in time, you can probably see some of the packet diagrams that represent, say, the first data models of what a DNS packet would look like.

[00:22:45] And you'll find fields in there that were never used, right? They had some idea that they were going to be there and they just were never really used. So I'm sure some of the original designers, I can't speak for them, but I have no doubt that they said, "Wow, I wish we'd have done this one little thing different.

[00:22:59] Or maybe lots of little things different."

[00:23:02] Benedicte: So there's this, that, so I guess it's like the same thing with HTML and how things have like evolved, there is the kind of the written what's written in the proposal and what is possible. And then there's the kind of evolved way of how we use it.

[00:23:19] Anthony: Yes.

[00:23:20] Benedicte: Like you said, like there are fields there, but nobody ever started using them. So in essence nobody should start using them, but if they go back to that original description, it's there. But if they start using it, they'll mess up potentially. Kind of the, oh, I can't cut, cut the word. Like the common way of using it.

[00:23:39] What's the word for that?

[00:23:40] Anthony: Well, they can break compatibility, right?

[00:23:42] Benedicte: Yeah. So the kind of the best practices was the word I was looking for. Like there are best practices that probably has evolved on top of what is correct, correct. Technically correct.

[00:23:52] Anthony: Absolutely. And it's a testament to the IETF, which is the organization that oversees sort of the creation and standardization of these protocols. That they too have over the years evolved as an organization to ensure that the protocols are reviewed by a lot of people from the community.

[00:24:13] So I think one of the unsung heroes, or not one of them, but the group of them, are the reviewers of IETF submissions. They go through these things with fine tooth combs and they will go for months, like back and forth, trying to figure out something that will work without breaking backward compatibility.

[00:24:32] That includes both the kind of data that you're sticking in these packets as they're going, as well as just the wording around how you describe things so that future readers of it will hopefully get a clear view of it, because a lot of the original RFCs were vague in certain areas. And so people would implement things in ways that were unspecified and it worked, but at the same time, it leads to a lot of potential trouble down the road.

[00:25:03] Benedicte: Oh yeah, absolutely.

[00:25:03] Anthony: We are definitely getting off the data model track if you want to.

[00:25:06] Benedicte: No, but I feel like this is also data modeling, right? Just in a different, it's not saved in a database. Oh, well, I guess it is. I guess it is saved, you know, all of these name servers will have some internal storage where they store where they're going to send people that land on their name servers.

[00:25:24] Anthony: So I would say there are three different potential data models for every one of these servers, right? You have its internal model, so however it's going to represent it. You have an external representation. So for example, most of the RFC specs talk about this as a text file, right?

[00:25:43] You might've heard it as a zone file and that's a representation of data usually as a list of records. And then you have the third one, which is the representation as a packet, because that has to be scrunched down into the smallest possible thing that can be sent over the network so you get those fast responses. And how you transform through those three data models can have a significant impact on the performance of your system.

[00:26:05] Because if you're slow to transform when the model changes, then your customers are like, why do I have to wait 24 hours for something to change, right? And if you have problems translating from the internal model into the packet model, well, now the response times might be slow.

[00:26:26] So the efficiency of data models in the DNS life cycle is actually a fairly important part of any DNS server. And the really good ones do a very good job of kind of making it fast throughout the entire chain.
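The three representations can be put side by side for a single A record. This is a simplified sketch: the dictionary layout of the internal model is an assumption, the record values are sample data, and the wire encoding shown is one uncompressed resource record rather than a full DNS message.

```python
import socket
import struct

# 1. Internal model: whatever structure the server finds convenient.
record = {"name": "www.example.com.", "ttl": 3600, "type": "A", "value": "192.0.2.10"}


def to_zone_file_line(rec):
    """2. External text model: one line in zone file style."""
    return f"{rec['name']} {rec['ttl']} IN {rec['type']} {rec['value']}"


def encode_name(name):
    """Wire-format name: each label prefixed by its length, ending in a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def to_wire_a_record(rec):
    """3. Packet model: an uncompressed A resource record as bytes."""
    rdata = socket.inet_aton(rec["value"])  # 4 bytes for an IPv4 address
    # TYPE=1 (A), CLASS=1 (IN), TTL, RDLENGTH, all big-endian.
    fixed = struct.pack("!HHIH", 1, 1, rec["ttl"], len(rdata))
    return encode_name(rec["name"]) + fixed + rdata


print(to_zone_file_line(record))
wire = to_wire_a_record(record)
print(len(wire), wire.hex())
```

The text line is easy for people to read and edit, while the packet form is compact; the cost of moving between them is the transformation work described above.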

[00:26:40] Benedicte: Like DNSimple.

[00:26:41] Anthony: Well, like our server under the hood, yes but. And we invest a lot of time in trying to make sure that those things stay speedy.

[00:26:51] Even as zones get really big. So if you're dealing with a zone that has maybe a hundred records, that's not that big of a deal, but change that when you have a zone that has a million records, and now you have to figure out how you're going to take that data, transport it to all of the edges in one form, store it inside the server in another form, and then translate the you know, search through it to find the data you want to give out the packet that you want.

[00:27:16] And all that has to happen in milliseconds times basically.

[00:27:20] Benedicte: But so what would make a zone be, I mean, I'm guessing, okay, .com and stuff like that, that would be that big. And then the next level down would be the top level domain.

[00:27:32] Anthony: Mm-Hmm.

[00:27:33] Benedicte: So that would be, so for .com, that's the top level domain, dot com is the top level.

[00:27:41] And then you have the domain because you wouldn't be doing any name servers for the top level.

[00:27:46] Anthony: We could, I mean, there are companies out there that are contracted out to do that work. We don't operate any TLD right now, any TLD name servers, but we could.

[00:27:56] Benedicte: Yeah. So I guess those, yeah. So those would be really big, but what would make like a domain, I guess, get a really big zone?

[00:28:08] Anthony: Oh, so an example that I've seen is when, one example is, it's going to seem really weird, but it's, it actually is appropriate is Minecraft hosting providers. Okay. Why? Because Minecraft, you can publish an SRV record. That can be used to detect any given Minecraft server and Minecraft servers have to be run, you don't run them in a shared environment.

[00:28:34] They just, they soak up too much CPU and memory. And so they have lots of instances running, and every one of those instances is mapped to an IP address and mapped to a server, an SRV record, maybe a port and IP address. So that is an example of a service that could end up with lots of records in it, especially one of the bigger ones.
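
To make the one-record-per-instance point concrete, here is a minimal Python sketch of how a hosting provider might generate SRV records. The `_minecraft._tcp` prefix is the service name Minecraft clients look up; the zone `mc-host.example`, the customer labels, hosts, ports, TTL, priority, and weight are all hypothetical values chosen for illustration, not anything described in the episode.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SrvRecord:
    """One SRV record: a service name that points at a target host and port."""
    name: str
    ttl: int
    priority: int
    weight: int
    port: int
    target: str

    def to_zone_line(self) -> str:
        # Zone-file presentation: name TTL class type priority weight port target
        return (f"{self.name}. {self.ttl} IN SRV "
                f"{self.priority} {self.weight} {self.port} {self.target}.")


def minecraft_srv(customer: str, zone: str, target: str, port: int) -> SrvRecord:
    """Build the SRV record a client would look up for customer.zone."""
    return SrvRecord(
        name=f"_minecraft._tcp.{customer}.{zone}",
        ttl=300, priority=0, weight=5, port=port, target=target,
    )


# Hypothetical instances: (customer label, host running the instance, port).
instances = [
    ("alice", "node1.mc-host.example", 25565),
    ("bob", "node1.mc-host.example", 25566),
    ("carol", "node2.mc-host.example", 25565),
]

records = [minecraft_srv(c, "mc-host.example", host, port)
           for c, host, port in instances]
zone_lines = [r.to_zone_line() for r in records]

for line in zone_lines:
    print(line)
```

Every instance adds at least one line to the zone, which is why a provider with many thousands of instances ends up with a very large zone.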

[00:28:57] I don't think anybody comes close to what the TLD name servers do because they have a lot. If you think com has 179 million, I think, domains in it. And then for every one of those, you have probably at least two NS records. Well, you start to get the idea of the file size. But even smaller providers, if they're providing a hosted service for others, that's where we usually see those zone sizes start to balloon up pretty quickly.

[00:29:20] So that would also be, so it's not necessarily on their domain then, that they could have like several domains that they have to know. Because that could, it would be

[00:29:30] their sub domain. So they might, one, one technique might be for every customer that we have, we're going to give them a unique third level domain underneath.

[00:29:38] Well, now you have one per customer. And so it's, if you have a hundred thousand customers, you have a zone with a hundred thousand minimum in it, plus whatever other records that you need.

[00:29:48] Benedicte: And oh my God, so all of, cause this is, you know, this is where I love the mother web, where I can get like preview deploys.

[00:29:55] You know, with Netlify or Gatsby or any of those, like you, you just push a branch and then you get a preview deploy. And then that is deployed both to the, to maybe sometimes it's even deployed to like two or three different subdomains. Cause it's got the preview URL and then maybe it's got a, like a test URL and then something else.

[00:30:15] So I would guess those would blow up the size pretty fast, like a Netlify type service.

[00:30:20] Anthony: If you were doing it in such a way that you, there are tricks around how to deal with this because DNS does come with a wildcard record. And so what you can do is instead of handling that name routing inside of DNS, you can push it down into the web servers in essence, right?

[00:30:38] And then, so they could get every single name coming into one web server, but that doesn't scale. So now what you might do is chunk it up. You take a third level name that represents, say, a load balancer, and then you're giving people fourth level names underneath those, right? So again, this is one of the neat designs of the DNS system is the fact that names can be, you can prepend things to them and you can chain them in essence to create a tree, thus improving the scalability because now you can put the wildcard record potentially on things like that.

[00:31:10] The fourth level name or the third level. So there's lots of ways to handle it, to try to address some of those zone size issues.
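
The wildcard-per-load-balancer idea can be sketched in a few lines of Python. This is a simplified illustration, not how DNSimple's server works: the zone contents, names, and addresses are hypothetical, and real DNS wildcard matching has additional rules (see RFC 4592) that this lookup ignores.

```python
from typing import Optional

# Hypothetical zone data: two load balancers, each covered by one wildcard.
# Four records serve any number of fourth-level preview names.
ZONE = {
    "lb1.previews.example": "192.0.2.10",
    "*.lb1.previews.example": "192.0.2.10",
    "lb2.previews.example": "192.0.2.20",
    "*.lb2.previews.example": "192.0.2.20",
}


def resolve(name: str, zone: dict[str, str] = ZONE) -> Optional[str]:
    """Return the address for name: exact match first, then the closest wildcard."""
    name = name.rstrip(".").lower()
    if name in zone:
        return zone[name]
    labels = name.split(".")
    # Replace leading labels with "*" one level at a time, closest parent first.
    for i in range(1, len(labels)):
        candidate = "*." + ".".join(labels[i:])
        if candidate in zone:
            return zone[candidate]
    return None


print(resolve("feature-x.lb1.previews.example"))
print(resolve("pr-42.lb2.previews.example"))
print(resolve("unknown.previews.example"))
```

The zone stays the same size no matter how many preview names exist. Deciding which site a given name belongs to moves down to the load balancer or web server behind each address.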

[00:31:17] Benedicte: And that's the type of things that I then would use your server or could use your service.

[00:31:21] Anthony: Absolutely. Absolutely. We have plenty of customers that use us because they can use our API to inject those records really quickly and they can sort of change their topology on the fly if they need to.

[00:31:32] And if they're scaling at any moderate speed or greater, they're going to want a little bit more control over their name topology. And eventually you get to a point where the DNS really isn't the best way to do it. You might have one layer in there, but you're going to have to push it down to the routing stack, into where the load balancers are.

[00:31:51] Benedicte: I know you, yeah, you were like, this is not data modeling, but I feel like it's a little bit of data modeling and also just interesting for anyone building for the web because we all kind of get into these things.

[00:32:03] And then at least I'm like shying away. Like, I don't want, like, let's just have the service deal with this, but then you do hit issues and it's nice to know a little bit about the backbone of our beloved internet.

[00:32:19] Ola: Yeah. So do you have a tip or a trick to offer our audience about data models?

[00:32:28] Anthony: I would say that practice like anything else, I think in a lot of cases you have to, if you build more small things that use small data models, then you get more practice about what works and what doesn't work.

[00:32:44] And I think that includes, so there's a very distinct separation between transactional data models and analytic data models. I think it's good to practice both of those. I love, I think one of the things that really helped me in my career and understanding a little bit about data modeling better than I did when I first started was getting into data warehousing and understanding what the difference is between an analytical data model and a transactional model.

[00:33:14] And so I would suggest reading up on those types of things. I think it's super interesting and super useful because they're very different. They solve very different purposes. And you might find that there's an area that you've never touched that's super interesting to you. Like, I did a deep dive for,

[00:33:29] I think three years into data warehousing and all of that and wrote software around that just cause it was so much fun. It was so much fun looking at data in a different way than the basic, you know, transaction model where everything was normalized, I believe. Yes. Or do you know, I can never remember even today.

[00:33:48] 26 years of being a developer. I can never remember the difference between normalization and denormalization when it comes to data models. It's terrible. Of course, I haven't been doing it for a little while.

[00:33:59] Benedicte: So give us the highlight. What's the difference?

[00:34:01] Anthony: Well, so in one case, you're going to have lots of little models that are related by their keys.

[00:34:08] And so for example, typical transactional models are going to have lots of these little models that have relationships between them. Whereas in an analytic model, you're going to flatten things into much wider, longer tables, for certain bits. So for example, you have the notion of facts and dimensions.

[00:34:32] And facts are going to be, they're going to have some data points in them, and then a whole bunch of relations to their dimensions. And then dimensions are going to have one primary key and then a whole bunch of details that you can filter in different ways. And that's why it was always super fun to play around with because it was just very different between the two models.

[00:34:52] And some people get really good at translating between the two models. And some people just get really good at using one model or the other to accomplish what they want to do. I just, my tip is read, practice, just experiment with it and learn what's already out there because most likely somebody's already done what you're thinking of and has probably written, there's somebody out there that's written up a pretty good guide about how to do certain things.
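
The facts-and-dimensions layout Anthony describes can be sketched with Python's built-in `sqlite3` module. The tables, columns, and rows below are hypothetical, made up only to show the shape: a fact table with a few numeric data points and keys into its dimensions, and dimension tables with one primary key plus details to filter and group by.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimensions: one primary key plus descriptive details.
    CREATE TABLE dim_customer (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        country     TEXT NOT NULL
    );
    CREATE TABLE dim_date (
        date_id INTEGER PRIMARY KEY,
        year    INTEGER NOT NULL,
        month   INTEGER NOT NULL
    );
    -- Fact: numeric data points plus a key into each dimension.
    CREATE TABLE fact_registration (
        customer_id   INTEGER NOT NULL REFERENCES dim_customer(customer_id),
        date_id       INTEGER NOT NULL REFERENCES dim_date(date_id),
        domains       INTEGER NOT NULL,
        revenue_cents INTEGER NOT NULL
    );
""")
conn.executemany("INSERT INTO dim_customer VALUES (?, ?, ?)",
                 [(1, "Acme", "NO"), (2, "Globex", "US"), (3, "Initech", "US")])
conn.executemany("INSERT INTO dim_date VALUES (?, ?, ?)",
                 [(20240101, 2024, 1), (20240201, 2024, 2)])
conn.executemany("INSERT INTO fact_registration VALUES (?, ?, ?, ?)",
                 [(1, 20240101, 2, 3000), (2, 20240101, 1, 1500),
                  (3, 20240201, 4, 6000), (2, 20240201, 1, 1500)])

# Analytic query: aggregate the facts, sliced by details from the dimensions.
rows = conn.execute("""
    SELECT c.country, d.month, SUM(f.domains), SUM(f.revenue_cents)
    FROM fact_registration AS f
    JOIN dim_customer AS c ON c.customer_id = f.customer_id
    JOIN dim_date AS d ON d.date_id = f.date_id
    GROUP BY c.country, d.month
    ORDER BY c.country, d.month
""").fetchall()

for row in rows:
    print(row)
```

A transactional model for the same business would typically split this into more, smaller tables tuned for single-row reads and writes. The layout above is tuned for summing and slicing.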

[00:35:18] Benedicte: Do you have any tips to any data model books that you've read?

[00:35:22] Anthony: Oh, the ones I've read are probably all out of date by now. There was the whole Kimball series. So for data warehousing, the sort of Bibles of that field were from Kimball and they're really good. I think they're still valid today.

[00:35:37] They're just probably maybe a little technically tired now, considering that the world has changed and we have access to a lot more computing power than we did, you know, 15 years ago or 20 years ago.

[00:35:51] Benedicte: But it feels like a lot of the basis for kind of data and databases and data model, especially transactional data models.

[00:36:00] It's not that like it's the same. It's just new, it's new databases, but the concept of how we model so that we can do what we need to do with our software, that kind of piece is the same. But I feel like a lot of that gets lost in the conversation about, you know, technically, which database is the fastest or like which.

[00:36:24] Anthony: Implementation. Yeah.

[00:36:26] Benedicte: It doesn't really, you know, like if you end up making really bad choices on top of your database, it's, you know,

[00:36:35] Anthony: I have one more tall story. I have, can I have a second tall story?

[00:36:38] Benedicte: Oh yes, of course!

[00:36:40] Anthony: So back in the, in 1999, I was working for one of the first seven registrars during the deregulation of the dotcom domain. So this was like long, long ago.

[00:36:52] And it was, I was, it was me and the CEO and that was it really, we were trying to do this thing. And we had our server, we had one server to run this business on, and it was sitting on the floor in his art studio. Cause he was also, he ran an art gallery. And we had, I built a Java, like a JSP Java application that was running on this.

[00:37:17] And we, when they opened the doors so that we could register dot com domains, we started getting a lot of traffic. And that, and a few days in, and the server started getting hotter and hotter, and it literally started smoking at one point. Our site was crawling. And, and so I went to somebody that I knew, I was like, I don't understand what's going wrong here.

[00:37:41] Like, the site's super slow. It doesn't like it, the machine's, like, about to catch fire. What am I doing wrong? He's like, have you put an index on the database? And I said, what's an index?

[00:37:56] So this is '99 and I'm like, I'm just, I have no idea what I'm doing. I'm like this, lo and behold, fine. He's like, here's how you do an analyze. And this was MySQL at the time, is what we were using. And I figured out one or two tables that needed that index.

[00:38:13] And then boom, it's all like, everything just went from taking seven seconds for a page to respond down to, you know, like seven milliseconds type thing.

[00:38:22] I mean, just, it was crazy. What a difference a few key indexes made. And I think that the point to this is that those of us who have gotten used to using things like web frameworks like Rails or whatever it might be from the various, you know, from the various languages you're using, those, the migrations that you get built into that now take care of so much of that for you that you don't realize that there, there was a time when we didn't have the tooling to do that automatically and it was really easy to screw up your indexes.

[00:38:57] And not do them in the right places you need. And so, and I think it's important to say that we, the whole internet have really benefited from these notions of web frameworks that bring the data modeling and give us sort of patterns to follow. And we can poo poo and we can get mad at some framework authors because they do one thing or the other.

[00:39:20] But as a whole, it has made our jobs so much easier to have a framework that kind of helps us with those data models and do a better job of doing just the basic things and creating a quality model.

[00:39:32] Benedicte: But I feel like the missing index is still, maybe you're not missing that primary index, but I've seen SaaSes grow and having issues and that being the issue.

[00:39:46] Anthony: Oh yeah, I think it's still really important to have a basic knowledge about how to determine how long the query is. Like what the execution plan is for a query, because anything that's more complicated than looking at the primary key, you might be surprised how your database sees things versus how you think it should work and understanding how to, like, how to show yourself a query plan, to see the performance in whatever your database of choice is is very valuable and every developer should know how to do that.
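
As a small illustration of checking a query plan before and after adding an index, here is a sketch using Python's built-in `sqlite3` module and SQLite's `EXPLAIN QUERY PLAN`. The `domains` table, its columns, and the index name are hypothetical. The story above involved MySQL, which, like PostgreSQL, has its own `EXPLAIN` statement with different output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE domains ("
    "id INTEGER PRIMARY KEY, name TEXT NOT NULL, owner_id INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO domains (name, owner_id) VALUES (?, ?)",
    [(f"site{i}.example", i % 100) for i in range(10_000)],
)

QUERY = "SELECT id FROM domains WHERE name = ?"


def plan(sql: str, params: tuple) -> str:
    """Return SQLite's plan for sql as one string (the 'detail' column of each row)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return "; ".join(row[3] for row in rows)


# Without an index on name, SQLite has to scan the whole table.
before = plan(QUERY, ("site42.example",))

# Add the missing index, then ask for the plan again.
conn.execute("CREATE INDEX idx_domains_name ON domains (name)")
after = plan(QUERY, ("site42.example",))

print("before:", before)
print("after: ", after)
```

The exact wording of the plan varies between SQLite versions, but the first plan should report a scan of `domains` and the second a search using `idx_domains_name`.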

[00:40:19] Benedicte: That's a good ending note.

[00:40:22] Where can folks find out more about you and DNSimple?

[00:40:25] Anthony: All right. So DNSimple is pretty easy. We're at DNSimple.com. It's pretty straightforward. We also have a presence on a variety of social medias, from Mastodon, Twitter, Instagram, all over the place, Facebook, so on and so forth.

[00:40:43] Me personally, you can find me on the DNSimple site in the About Us page, or you can find me on LinkedIn as aeden. So A E D E N, and feel free to reach out if you ever want to get in touch. Happy to help.

[00:40:59] We are hiring. I'll tell you what, it's a challenge right now to hire good people. So we're always looking for good folks. We have an open position for a key account manager and we have an open position for somebody in our security infrastructure and performance team. So that's, they're the kind of the operators of our hardware and managed infrastructure. So yeah, we have those two and we're always looking for good folks, if you want to join us.

[00:41:28] We'll probably be opening a position here in the near future for another web dev and lots of different things.

[00:41:34] Benedicte: Cool. Thank you so much for sharing your data model stories with us. I really enjoyed the missing data model and also the server that was catching fire on the art studio floor. That's what we're here for.

[00:41:50] So thank you for that, Anthony.

[00:41:51] Ola: Welcome back to Data in the Wild. Discover more data model tips and tricks. Ahoy!

[00:42:00] Benedicte: Ahoy!