Data in the Wild | The What, Why, & How of Data Modeling with Dan Singerman

Today, we’re joined by Dan Singerman, founder and director of Reason Factory. He talks to us about why you should consider the user’s mental model when creating software, why it’s important to have consistent vernacular across your system, and more.

Chapters
(00:36) - Dan’s background and what his consultancy does

(04:03) - What is a model?

(06:45) - The challenges with modeling

(11:03) - Tall tale about the issue encountered with a music streaming platform’s data model

(23:25) - Tips on effective data modeling

(38:36) - The importance of consistent vernacular

(44:41) - Why you should fix your domain model before launch

Sponsor

This show is brought to you by Xata, the only serverless data platform for PostgreSQL. Develop applications faster, knowing your data layer is ready to evolve and scale with your needs.

About the Hosts

Queen Raae wrote her first HTML in 1997 after her Norwegian teachers encouraged her to take the new elective class.

Around the same time, Captain Ola bought a Macintosh SE for his high school with the proceeds from the school newspaper he started.

These days, you’ll find them building web apps live on stream and doing developer marketing work for clients. They are both passionate about the web as a platform and the joy of creating your own thing.

Visit queen.raae.codes to learn more.

Creators & Guests

Host

Benedicte (Queen) Raae 👑

🏴‍☠️ Dev building web apps in public for fun and profit 👑 Helping you get the most out of @GatsbyJS📺 Streams every Thursday: https://t.co/xaLy43cqMI

Host

Ola Vea

A piraty dev who also help devs stop wrecking their skill-builder-ship ⛵. Dev at (https://t.co/8m50kyT981) & POW! w/👑 @raae & Pirate Princess Lillian (8) 🥳🏴‍☠️

Guest

Dan Singerman

Parent, programmer, inverts the Y-axis, makes silly word games on the Internet. The Last Jedi is the best Star Wars movie, don’t @ me.

Editor

Krista Melgarejo

Marketing & Podcasts at @userlist | Originally trained in science but happily doing other stuff in SaaS & tech now

What is Data in the Wild?

Learn from your favorite indie hackers as they share hard-earned lessons and tall tales from their data model journeys!

Brought to you by Xata — the only serverless data platform for PostgreSQL.

[00:00:00] Ola: Welcome to Data in the Wild! Discover data model tips and tricks used by our favorite devs with Queen Raae! I'm your co-host, Captain Ola Vea, and this podcast is brought to you by Xata, the serverless database built for modern development.

[00:00:23] Benedicte: Yeah. And today's guest is the great and powerful Dan Singerman.

[00:00:28] Welcome to the show, Dan.

[00:00:30] Dan: Thank you. That's slightly overegging it. But yeah, I'll take that.

[00:00:36] Ola: But maybe you could tell us a bit about your background.

[00:00:39] Dan: Yep. So I've been a software developer for a long time. Probably about 25 years now. Mainly building software on the web, mainly working on software that has a web front end and a relational database somewhere in the back.

[00:00:57] I spent, I've worked in lots of different industries, worked in publishing industries, sports data, social networks, e-commerce. And for most of the last 10 to 15 years, I've been running my own consultancy where I help generally non-technical startup founders who want to start a new business get from zero to one with their sort of like business ideas and for something to do, you know, in terms of online business.

[00:01:28] Ola: Yeah, we met in Athens. We were on the workation in Athens and we met you there at. Well, I met you at the restaurant. It was this roof restaurant.

[00:01:41] Dan: Yeah, no, it was The Rails SaaS conference run by Andrew Culver, who asked me to speak there. I don't do a lot of speaking at conferences.

[00:01:50] Benedicte: You don't? Because I was just, I was blown away by like the thoughtful process you shared. And we've already started with Data in the Wild and I was like this, "we need to have this person on the show and like talk us through the what, the why and the how," that you shared.

[00:02:06] Dan: Okay. Well, that's very kind.

[00:02:10] Yeah, I mean I might do a bit more conference talk in the future but, I'm really someone who creates software and sits at a desk and writes code, and it's very different.

[00:02:21] Standing up in a room of a hundred people and going, "Please listen to my ideas. They're really interesting, honest," and, you know, doing a talk for like 45 minutes as well, that is quite a challenge. It was quite a challenge. It's more effort than you might imagine or someone who hasn't done it might imagine to think about how you actually structure the talk, put it together, write your slides, then the practicing, and then obviously the actual delivery of it as well.

[00:02:47] Ola: But I think that's why the talk was interesting to us because you actually make software and because some of the people that are speaking, they're basically marketers.

[00:02:59] Dan: Yeah. Yeah.

[00:02:59] I think there's certainly some people who do a lot of speaking and that's mainly what they do. Whereas, you know, I guess that's someone who's actually speaking about their sort of like day to day lived experience is slightly different to that.

[00:03:12] Benedicte: You get a bit more, more kind of depth into, to real world problems or like things that have actually come up instead of like hypothetical things that could come up.

[00:03:21] Dan: Yeah, exactly.

[00:03:22] Benedicte: Yeah. And as I alluded to, well, I was really happy when he saw that you used that what, why, and how structure, because he had a card that you might want to show for the people watching. We'll put a picture of it in the description.

[00:03:37] The what, why, and how of data modeling, I guess is what we're talking about today. 'Cause we've talked to a lot of people through the course of this series and, you know, everybody has started out with good intentions, but all of them have had issues in some regard when it comes to data modeling.

[00:03:54] So we thought like, "let's bring it back and just let's get started with what is data modeling in your words?"

[00:04:02] Dan: Yeah. Okay.

[00:04:03] So I think first of all, it's interesting to talk about what's a model before we talk about what a data model is.

[00:04:09] So a model is generally an abstraction of a real life system or, yeah let's say real life system that you aren't capturing all the details of reality because that's impossible. But what you want to try to do is actually capture a version of it that is useful in however you want to use it.

[00:04:34] So one good example of this is Newton's model of gravity.

[00:04:39] So in, you know, 17th century or whatever it was, Newton allegedly, I don't know how true this is, but the apple lands on his head and he saw all things fall down and he started thinking about his model of gravity and he came up with a bunch of equations that described how lots of things in the planetary systems moved and that was Newton's theory of gravity. And that was a great model and explained an awful lot of stuff.

[00:05:08] But then in the early 20th century, I think it was, someone noticed that the movement of Mercury wasn't consistent with Newton's model of gravity. So Newton's model of gravity is useful because it does explain a huge amount of stuff, but it doesn't explain everything.

[00:05:25] And then you had Einstein who came along, and his theory of gravity included the ability to deal with supermassive objects. So the reason that Newton's model of gravity doesn't work for Mercury is because it's very near the Sun, and the Sun is a supermassive object which distorts lots of things that Newton's model of gravity doesn't predict.

[00:05:52] But Einstein's model of gravity does predict it. Einstein's model of gravity is a lot more complex, but it sort of describes some things that Newton doesn't. That doesn't mean that Newton's model wasn't useful, but only up to a certain point. And there's a quote that I really like by a famous statistician called George Box which is: "all models are wrong, but some are useful."

[00:06:16] So when you're trying to come up with a data model for a piece of software you're trying to create, you shouldn't try and go: "what is the absolute, deep level of detail of trying to capture reality," because you're going to get that wrong. What you need to do is: "what's the useful version of that for what we're trying to do here?"

[00:06:35] Benedicte: Because you want to be thoughtful, you want to be able to capture some of your potential future needs, but you also want to land on something.

[00:06:45] Dan: Yeah, so I think there's great danger in trying to do too much too early as well, because one of the problems which I'll probably get into is data models, as opposed to almost every other part of the software stack, is that once you have a live system, and you finally need to change it, it's a lot harder to change the data model than almost anything else.

[00:07:08] Let me just come back to the models as well, definition.

[00:07:12] So I think another important thing to bear in mind in models is that we're, you know, we all deal with models every day. So I have a mental model of what's going on in this podcast. I can see you two. My brain doesn't know all the things that the software is doing to sort of transmit these two moving images and sound and it being broadcasted on the internet and stuff like that.

[00:07:38] So you know, I can't understand all that detail, but I've got a mental model of how it is working. And obviously it allows us to have this conversation, but my mental model could be wrong or incorrect. Or, you know, I'm kind of like, yeah, able to sort of like see into the rooms you're having this conversation in and I can imagine what might be off camera.

[00:08:01] I might be completely wrong about that, but that is also my mental model.

[00:08:04] Benedicte: Probably!

[00:08:08] Dan: Yeah. I'm going to be on camera today, so I'm going to push all the rubbish forward into the room so that can't be seen.

[00:08:16] So yeah, it's also important to consider people's mental models when creating software. Because there's a principle of user experience which is called the principle of least astonishment. Which is when you're actually working with any computer system, if you astonish the user, then they're going to find the system harder to use.

[00:08:36] You want to build a system that is consistent with how the user expects it to work when they try to take an action, and that is based on their mental models. So if your software has the modeling, so let's take someone who works in a particular domain, I'll take a really boring one like accountancy.

[00:08:56] An accountant will have a bunch of mental models about how accountancy works in their head. And if you're building accountancy software, you're going to have to build some data models in the accountancy software to make your software work. But if the mental models of the user using the software don't match the models that you've built in your system, it's going to be very hard not to make a user experience where there's a mismatch between those things.

[00:09:26] So yeah, if you've got like an accountant has an idea of like how an invoice works and it's got, you know, it's got line items and tax. And I don't know why I've chosen such a boring example, but there we go.

[00:09:38] Benedicte: Well, every small business owner knows about this, though.

[00:09:41] Dan: You've got line items and tax and quantities. But if how you've defined a thing in the database doesn't match up with that, then either you've got to make some complex abstraction layer between your data model and your user interface, which you don't want because complexity is bad. Or, your user interface is going to match your data model, not the user's mental model, which is also going to be surprising and not be useful to them.

[00:10:07] So, you know, it's very important when you're talking about modeling to consider mental models in that as well.

[00:10:12] Ola: Yeah. So that's kind of the why here on my little card: the user will suffer if you don't do your job.

[00:10:26] Dan: Yeah. Well, so it's important because if you don't get your data modelling right, there's lots of deep holes you can get yourself into. The user experience hole is just one of them.

[00:10:38] As I said, the closer the mental model of your users matches the model of your software, the less complexity you have to bridge that gap. But it's not just about user interface in terms of what button I press and what appears on the screen and stuff like that.

[00:10:56] There's, you know, as we know, user experience is a much wider area. And let's talk about a concrete example, which I mentioned in my Rails SaaS talk.

[00:11:09] So I used to be a user of Spotify. Before I had kids, I used Spotify. I'd listen to the music I'd want to listen to. I'd quite often go on a run and have Spotify playing in my ears when I was doing a bit of exercise.

[00:11:24] And then when my kids got to an age where they thought, "I don't want to listen to daddy's music, I want to listen to my own music," I thought, "Great. I'll get the Spotify family plan and we can all listen to our own music."

[00:11:36] And we've also got Amazon Echoes here so we've got an Amazon Echo in, you know, several rooms downstairs and the kids have Echoes in their room upstairs. Let's not get into privacy issues about that.

[00:11:50] Let's just assume that's fine. That's a different conversation.

[00:11:54] But generally it's good for, you know, listening to music. But one problem we found using the Spotify family account, so you can connect it to Amazon Echo, and that works quite well, and I can speak to my Echo. I won't do it now because I've not stopped playing a stream, but I could go, "hey Alexa," and I don't use Alexa because of this precise reason, I go, "hey Alexa, play blah," and it will start playing.

[00:12:14] But the problem we found was, if I was playing the music me or my wife wanted to listen to downstairs and a child upstairs told their Alexa to play a different tune on their own echo, the stream downstairs would stop and the other song would appear upstairs. And it seemed impossible to actually have different Echoes playing different streams, even though we were paying for the family account, which should allow six simultaneous streams.

[00:12:42] So I thought there must be a way around this. So I searched the internet and like Spotify, like, you know, community message boards and stuff like that. And there were loads of people complaining going, "This is ridiculous. I can't believe your software design is so bad. Even though I'm paying for six streams, I can't have two simultaneous ones."

[00:12:59] And it seems almost certainly that is probably a data model issue. They've probably, I mean there might be other issues going on as well, but fundamentally it's probably a data model issue. Which somewhere in their early software design, they said, "Right, we don't want people to be able to have one account and stream all over the place because that means people won't buy more accounts. It's going to be bad for our business."

[00:13:20] Somewhere baked into the technology is a concept that each account can only have one stream. And then if you try to stream on a different device, the first one turns off. And Spotify were, you know, they're founded like 2011 I think. So it was first launched in 2010 and it wasn't until 2014 that they had family accounts.

[00:13:42] So obviously in that four year period they realized, you know, people aren't just going to be individuals. They're also going to want to like listen in family groups and they introduced a family plan. But you still have the problem when you integrate with Amazon Echo that you're integrating just one particular account and you can't actually stream more than one song at a time.
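The difference Dan is guessing at can be sketched in a few lines of Python. This is purely illustrative and assumes nothing about how Spotify is actually built: `SingleStreamAccount`, `FamilyAccount`, `max_streams`, and the device names are all hypothetical. The first class bakes "one stream per account" into the model; the second tracks streams per member, so one person starting a song only replaces their own stream.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class SingleStreamAccount:
    """Hypothetical early design: the account itself owns the one active stream."""
    active_device: Optional[str] = None

    def play(self, device: str) -> None:
        # Starting playback anywhere replaces whatever was playing before.
        self.active_device = device


@dataclass
class FamilyAccount:
    """Hypothetical alternative: each member of the plan owns their own stream."""
    max_streams: int = 6
    active_by_member: Dict[str, str] = field(default_factory=dict)

    def play(self, member: str, device: str) -> None:
        is_new_listener = member not in self.active_by_member
        if is_new_listener and len(self.active_by_member) >= self.max_streams:
            raise RuntimeError("stream limit reached for this plan")
        # Only this member's previous stream is replaced.
        self.active_by_member[member] = device


solo = SingleStreamAccount()
solo.play("kitchen-echo")
solo.play("kids-room-echo")   # the kitchen stream is cut off

family = FamilyAccount()
family.play("parent", "kitchen-echo")
family.play("child", "kids-room-echo")   # both streams keep playing
```

An integration that links a device to the account rather than to a member would behave like the first class even when the plan itself allows six streams.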

[00:14:07] And so I searched the boards about this and there was like, loads of people complaining, going, "This is ridiculous. I've paid this much a month, you know. Surely you can make software that works properly," and stuff like that.

[00:14:19] And then there was one message on the board going, "Apple Music works." So we've changed to Apple Music.

[00:14:26] Benedicte: And they lost a customer.

[00:14:27] Dan: Yeah, exactly. So now, you know, my children can listen to their songs upstairs. We can listen to what we want downstairs. And I've never actually tested if it actually cuts off at six streams, but you know, that's what you want. If you're paying for six streams, you should get six streams.

[00:14:40] And you know, someone's data modelling mistake 13 years ago should be something you should solve. But the problem is it's really hard to solve data. I mean, can you imagine how many users Spotify have? It's, I think it's like billions now. Yeah, no, it's half a billion users I think they've got.

[00:15:00] So imagine how much data you have and if you want to start migrating the data to a different structure, that's a huge problem to solve. And you know, these complaints to the message board go back five years. So it's not like Spotify aren't aware of the problem, or that they don't have enough engineers to solve it, if they cared enough.

[00:15:18] The problem is a really hard problem to solve, because data migrations are hard.

[00:15:22] Benedicte: Yeah. So it's probably so hard that they don't see the business case for it. Like, yes, they'll lose customers, but they're not losing enough customers.

[00:15:29] Dan: There's probably a ticket somewhere, which says, fix this. And every now and then a developer picks it up and goes, "that's really hard. I'm going to put it back. I'll find something that's probably more important and easier to do."

[00:15:40] Benedicte: Or like quite a few developers. You know, maybe slightly junior, who are like, "well, this should be easily fixed." And then spend two or three weeks figuring out that this is not easily fixed and let's put it back on the queue.

[00:15:53] Dan: So I imagine that the code fix to make it right is probably not that hard. The hard part is rolling it out to half a billion users. But you know, we don't, and the thing is, it's a very easy situation to get in since we've all done it.

[00:16:09] I've done it, you go, "right, I'm going to make a new product." We want a user to log in. So you make a user table and you put your user data there with like email and password. And, you know, whatever, you know, name and whatever properties you want to do about your user. And then you go, right. Let's say you build a SaaS product or something like a SaaS product.

[00:16:29] So you go, my user has certain stuff to do with my SaaS. If it's like a sort of images product, you'd go, "right, a user has many images" and on the image table, you'd give a user ID and that's what it belongs to. And then you probably get going and then you might have a few hundred users. And you go, "this is great. I've built a SaaS. Hooray! I can build a SaaS product. I'm great."

[00:16:47] And then one day someone goes, "Can my friend log into my account" and then you go, "No," because you haven't built that. And then you go, "Oh no! We need to create the concept of a team or a workspace or something like that." So then you make a new model, which is the join table between the user and whatever the assets are. And then you think, "that's great. Okay. Now we've got teams logging in. I'm great at building software. I can now have multiple people logging into one account."

[00:17:15] And then, sometime later someone goes, "I'm already a member of this account, but I want to be a member of that account now as well." And then you go, "Well, it doesn't do that. You've got to sign up with a different email to do that." Which you'd think would be something that big companies do well, but they don't.
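The end state of that progression (user, then team or workspace, then one login belonging to several workspaces) can be sketched as a small schema. This is a minimal illustration using Python's built-in `sqlite3`, not anything from Dan's own projects: the table names `users`, `workspaces`, `memberships`, and `images`, the helper `workspaces_for`, and the sample data are all assumptions made for the example.

```python
import sqlite3

SCHEMA = """
CREATE TABLE users (
    id    INTEGER PRIMARY KEY,
    email TEXT NOT NULL UNIQUE
);
CREATE TABLE workspaces (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
-- Join table: one row per (user, workspace) pair, so a single login
-- can belong to any number of workspaces.
CREATE TABLE memberships (
    user_id      INTEGER NOT NULL REFERENCES users(id),
    workspace_id INTEGER NOT NULL REFERENCES workspaces(id),
    PRIMARY KEY (user_id, workspace_id)
);
-- Assets hang off the workspace, not the individual user.
CREATE TABLE images (
    id           INTEGER PRIMARY KEY,
    workspace_id INTEGER NOT NULL REFERENCES workspaces(id),
    filename     TEXT NOT NULL
);
"""


def workspaces_for(conn: sqlite3.Connection, email: str) -> list:
    """Return the names of every workspace the given login belongs to."""
    rows = conn.execute(
        """
        SELECT w.name
        FROM workspaces AS w
        JOIN memberships AS m ON m.workspace_id = w.id
        JOIN users AS u ON u.id = m.user_id
        WHERE u.email = ?
        ORDER BY w.name
        """,
        (email,),
    ).fetchall()
    return [name for (name,) in rows]


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute("INSERT INTO users (id, email) VALUES (1, 'dan@example.com')")
conn.executemany(
    "INSERT INTO workspaces (id, name) VALUES (?, ?)",
    [(1, "Client A"), (2, "Client B")],
)
conn.executemany(
    "INSERT INTO memberships (user_id, workspace_id) VALUES (?, ?)",
    [(1, 1), (1, 2)],
)
print(workspaces_for(conn, "dan@example.com"))  # ['Client A', 'Client B']
```

Starting with `images.user_id` and moving to this shape later means rewriting every existing row, which is the migration cost the conversation keeps returning to.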

[00:17:34] So I got into that exact situation with BrowserStack because I worked for multiple clients and multiple of my clients used BrowserStack. I was using BrowserStack quite happily with one client.

[00:17:45] And then another client said, "Oh, can you log into our BrowserStack and look at this particular issue here?" So I did. I logged in, I think I signed in with Gmail, with Google. So it's frictionless. At no point was I warned that this would have a negative effect on my account, but I signed into BrowserStack, another client's BrowserStack, with my email address and I looked at that stuff and then a few days later I wanted to look at my first client's stuff in BrowserStack and it wasn't there.

[00:18:12] It had just completely gone.

[00:18:13] Benedicte: It was gone?

[00:18:14] Dan: It was gone. Yeah.

[00:18:16] Benedicte: Oh, okay.

[00:18:19] Dan: And I sent them an email going, "This has happened. Is there any way I can access both BrowserStack accounts with the same email address?" And they went, "No, no. You can't do that. If you want to access two different accounts, you have to have two different email addresses to log in with."

[00:18:36] Now BrowserStack, a massive company, they've got like half a billion in funding. They've got over 500 employees. You'd think they'd solve all of that.

[00:18:46] And the thing is, that's more of an annoyance than the Spotify situation. That was actually making me look bad to my clients. So I had to go back to my first client and go, "Can you invite this different email address because I didn't do anything wrong, but now I can't get into this stuff until you invite me again."

[00:19:05] Benedicte: But that's interesting because I work for a company called Outseta and it's like a membership SaaS that you can use to create your membership software.

[00:19:14] And we've solved that, or they've solved that. But for our smaller customers, they have a hard time then realizing that an email isn't an account. Like one email can be connected to several accounts.

[00:19:27] So a lot of our support, and their support back to the end user, is explaining that we have that possibility, and therefore that for their users, even though they've signed up with one email, that's not the account. They need an account in addition to their user login.

[00:19:46] Dan: Yeah, I mean it certainly does create some user experience issues because obviously, for most SaaSes, the vast majority of people are only ever going to be logged into one account or workspace or whatever you want to call it.

[00:19:57] Benedicte: Yeah. But when you need that, it's like it needs to be baked in because it's hard to just sprinkle on top. But I think people are so trained in a way that they need to use different emails. They kind of just expect this not to work at this point because so many companies have issues with it.

[00:20:13] Dan: Yeah, well, exactly.

[00:20:14] It's actually really, I mean you're right. Quite often people just assume, "well, I need to log into something different. I'll use a different email address." Fortunately, in Gmail nowadays, you've got the plus thing. So it's easy to create different email addresses, but it's still annoying because you have to remember all the different email addresses you use.

[00:20:30] And also that doesn't work with sign in with Google. Part of that case is that it's like a really frictionless sign in. Click the invite link. Click "Sign in with Google". Great. I'm in now. My other stuff is gone. And there was like no warning along the way going, "you might lose some other stuff if you do this."

[00:20:49] So, yeah, that wasn't a great experience.

[00:20:51] Benedicte: Before we kind of got into the examples here, you were saying that yes, user interface is one of the black holes you can get into. And you mentioned data migration. I'm guessing that's a black hole.

[00:21:02] Dan: Yeah.

[00:21:03] Benedicte: And what are some of the others? We don't have to do all of or chat a lot about each, but like mention some of the other black holes.

[00:21:11] Dan: Okay. So another thing that you could really bite yourself with is if you build your product and you expose a data model by an API, and then some people start writing software against your API. Because it's one thing to make an abstraction between a human user interface and your data model when things change underneath it, but it's much harder to change an API.

[00:21:35] So for example, imagine you'd, it's a fairly simple case, but imagine you have the case where you had the single user problem and you haven't created workspaces and memberships and you expose that in your API. And you go, "right, I'm going to get all my images via API."

[00:21:52] So you look them up by the user ID and they come back saying, you know, this is the user ID of all my things. And then one day you go, "actually, they don't have a user ID anymore. They've got a membership ID." But you've got all these people who've written software against having a user ID. So you've got to go, "actually, maybe we need to redundantly show the user ID there, even though it isn't the foreign key anymore," which you probably can do and isn't that hard.

[00:22:19] But then if you then change your data model even further, you might get to a situation where you actually can't even create a redundant user ID there. And then you're in a situation where you're saying to people, "Sorry, you know all that lovely software you built against our API? It's going to break."

[00:22:34] So not only are you creating problems for yourself and changing your own software, you're actually going to ask your customers to have to change their software as well, which is never a situation which you want to be in.

[00:22:46] And it's not even like you can easily do a different version of the API because what's quite common is you go, "right, this is API version 1. Oh, we've got a new version of the API coming out, version 2. We're going to deprecate API version 1 over time or whatever." But if your fundamental data model is changing in a way that you actually can't even create the redundant data accurately. I mean, you could probably do it inaccurately in many cases, but that's obviously not a good thing either.

[00:23:12] Then you might be in a situation where you actually can't even migrate people to a new API because the data just doesn't exist to maintain the old one anymore which is, you know, obviously a terrible situation to be in if you have any sort of API driven business.
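A short sketch of the API problem described above, under stated assumptions: the functions `legacy_user_id` and `serialize_image_v1`, the field names, and the idea that an image now belongs to a workspace with a list of member ids are all hypothetical. While a workspace has exactly one member, the old `user_id` field can still be filled in redundantly. Once it has several, there is no accurate value left to return.

```python
from typing import List, Optional


def legacy_user_id(workspace_member_ids: List[int]) -> Optional[int]:
    """Best-effort value for the old `user_id` field.

    Returns the single member's id when the workspace has exactly one
    member, and None when no accurate answer exists.
    """
    if len(workspace_member_ids) == 1:
        return workspace_member_ids[0]
    return None


def serialize_image_v1(image: dict, workspace_member_ids: List[int]) -> dict:
    """Render an image in the shape that v1 API clients were built against."""
    user_id = legacy_user_id(workspace_member_ids)
    if user_id is None:
        # The old contract cannot be honoured accurately any more.
        raise ValueError("image has no single owning user; v1 shape unavailable")
    return {"id": image["id"], "filename": image["filename"], "user_id": user_id}


image = {"id": 7, "filename": "logo.png", "workspace_id": 3}
print(serialize_image_v1(image, [42]))
```

Picking an arbitrary member instead of raising would keep old clients running, but it is the "inaccurate" redundant data Dan warns about.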

[00:23:25] Ola: So what kind of approach can we use to get it right then?

[00:23:30] Dan: So I have an approach that largely works for me. Well, I'll qualify this by saying firstly, your mileage may vary because you know, it's a difficult thing to do to get right. And secondly, even I often still make mistakes now. I'm very far from infallible.

[00:23:49] What you want to do is try to minimize your mistakes and try and make sure that you. What's also really important is if you don't have to make a decision, don't make the decision.

[00:24:01] Often when building software and if it's like an MVP or an early stage piece of software, you might go, "right, I need one of these, one of these, one of these. But I'm actually not going to build an end to user interface now because we're not actually going to get to that point in like version one or whatever."

[00:24:15] What's often useful is don't actually make your data models concrete until you need to. Because if you start making data models concrete and they turn out to be wrong, then that's a problem you've got to change. So absolutely minimize how much you build.

[00:24:32] Another thing to bear in mind is you never know less about the problem space you're working in than when you start the project. The moment you start the project and you start actually building the software, you will learn more stuff. So that's another good reason to sort of like do as little as possible until you get those learnings.

[00:24:52] So yeah, so what I do is, and the scenario here is me talking to someone with the domain knowledge. I mean, you can introspect and do it on yourself, but the process is largely the same.

[00:25:04] What you want to do is, turn what you're trying to build into a list of user stories. I'm sure most people know what user stories are. It's a sort of like high level software development tool to sort of capture what a piece of software should do. And that's a very simple structure, which is as a type of user.

[00:25:24] So as an admin, I want to do an action. I want to moderate the content so that (the reason), you know, inappropriate content doesn't appear on my site. And what you want to do is, and generally I'll do this like in a workshop with my clients, which for most of the MVPs we build takes pretty much a whole day. And it can be quite intense because you've got to,

[00:25:51] Benedicte: Oh wow. That was super interesting because that just shows you like how much time you should actually spend on this. Thank you for that tidbit. Yeah.

[00:26:00] Dan: It's all right. So, I mean, obviously it's different if you're doing it for your own software and you've got to introspect what you know about things. But generally if I've got to discuss with a founder or a couple of founders the software they want built, I'll book a whole day, we'll sit in a room. We'll go through firstly, at a high level, what they want the system, the software to do. Try and understand their business and their goals and their customers. And then, at a certain point we start going, "right, I've got enough background to try and start actually capturing what we're going to build here."

[00:26:30] And that's when we try to capture the entire scope of the MVP in user stories. And sort of write them on sticky notes, put them on the wall. And often, for even a fairly straightforward MVP, you can end up with like 80 or 100 user stories or something crazy like that. And the point here is not just to define the MVP. What you want to do is find out what they.

[00:26:58] So, startup founders always have a grand vision of where they want to get to. We all know that. It's like, "you know, I'll build something that half the world will use and it'll be great and it'll make this much money and it will do all these like thousands of things, because I've thought of a thousand things I can help people with my software." And you always say, "well, that's the grand vision. That's great. But what's step one?" And then you've got to sort of like bring the grand vision back to what's the MVP? What's the first thing you put into market that users are going to use?

[00:27:31] But what's important here, particularly in the context of data modeling, is you don't want to only write the user stories for the MVP. You actually want to try and capture as many user stories of that grand vision. And, you know, if the grand vision is crazy, which it often is with startup founders, you kind of have to put in some sensible limits.

[00:27:50] Benedicte: But if we go back to Spotify, like I imagine their grand vision was like the whole world listening, streaming music.

[00:27:57] Dan: Yeah, they were probably one of the few startup founders who pretty much got that.

[00:28:02] Benedicte: And now they are there and I am guessing when they did this exercise, or they probably didn't.

[00:28:08] Dan: There were probably engineers that just started coding, which is how easy it is.

[00:28:13] Benedicte: Or maybe, maybe they were a little bit like in thinking, like you said introspective like, "we have songs, people want to mix them into playlists." And you know, a little bit like that.

[00:28:23] But if they'd gone through this exercise and looked more broadly, they might have had a user story that said something about families.

[00:28:30] Dan: Yeah, I mean, maybe when they were doing that first version they said, "we're going to have family accounts at some point." And someone said, "we'll worry about that later." I don't know. That's quite possible. That's a very likely scenario.

[00:28:43] Ola: And then you say that they should not make a choice to stop them from doing the family account, right? Because that's probably what they did.

[00:28:51] Dan: So this is really tricky. This is really tricky. And like I say, I don't have all the answers here.

[00:28:59] So what should have happened, in my estimation, is if they were in a room designing a data model and someone said, "will we ever do a family account?" As long as someone, I expect they didn't have that conversation come up because you know, that's a sort of like, not a year one problem.

[00:29:18] Benedicte: Probably single dudes in their 20s making this software.

[00:29:22] Dan: Yeah, so they probably said, "we don't want an account to be able to play more than one stream at a time because that affects our revenue model." So they probably just focused on that and they didn't even come up with a family account idea.

[00:29:34] But if they had, the right thing to have done would have been to say, "we're not going to build the family account now, but does this affect our fundamental data model? What might happen in the future just to make sure that we get our fundamental structures right now?"

[00:29:49] So this is exactly what I try to do with my customers, which is we'll look at what the MVP is going to do now, which is a relatively small piece of software, but we treat all the other user stories we've discovered about the grand vision. And, you know, maybe not the grand vision, but anything that's likely to happen in the next couple of years and say, "is this evidence towards information we need to understand about how we build the data model?"

[00:30:12] And the Spotify family account is a perfect example of, you know, "we're not going to do it now, but does it have an effect on what we, what decisions we make now?" Because in the Spotify case, it clearly does.

[00:30:23] Benedicte: I like this because I've had a hard time articulating that in teams where it's like, I know we're going to be like small. We're going to be, we're going to make the decisions. Like you said now, like don't make a data model that you don't need anymore.

[00:30:36] But then I haven't been able to kind of verbalize it correctly. And I don't know if I still will be able to, but it's like, but I'm always like, but like, "I can see in the like, I've heard some manager talk, like I know where this is going and we need to take this into account because otherwise we're gonna, we're gonna like. I can see a problem. I just can't a hundred percent tell you it's going to be a problem, but I can like sense it."

[00:31:00] Dan: Yeah. I mean in that scenario, I mean it's difficult and obviously working in a company on an ongoing basis, where your situation in software is somewhat different to, we've got a completely blank sheet and we're starting from zero and we're going to design the whole thing now.

[00:31:14] But I still think it's useful thinking, "right, this is what we have. What are we trying to build now? What do we expect to have to build on top of this part of the software over the next, you know, one, three, five, however many years you want to look ahead." And I think it's worth looking at those user stories.

[00:31:35] 'Cause what you want to do is your user stories will mention nouns. And your nouns are likely to be mapped to entities in your data model.

[00:31:49] So going back to my really boring accountancy example. You're going to have an invoice model. Invoice is a noun. You're going to have customers, you're going to, and you know, a customer will have many invoices, and so forth.
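A minimal sketch of the invoice example, assuming a relational database (here SQLite through Python's standard library). The table and column names are hypothetical and chosen only for illustration; the point is that "a customer will have many invoices" becomes a foreign key on the invoices table.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE customers (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE invoices (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        total_cents INTEGER NOT NULL
    );
""")

conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Acme Ltd')")
conn.executemany(
    "INSERT INTO invoices (customer_id, total_cents) VALUES (?, ?)",
    [(1, 12000), (1, 4500)],
)

# "A customer will have many invoices": follow the foreign key back.
invoice_totals = [
    total
    for (total,) in conn.execute(
        "SELECT total_cents FROM invoices WHERE customer_id = ? ORDER BY id",
        (1,),
    )
]
print(invoice_totals)
```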

[00:32:01] And then you're, so if you go through all your user stories, you can identify the nouns that frequently occur. You can kind of group the user stories about common nouns together. And then you can think about how do those user stories about those particular nouns indicate the relationships with other nouns.
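The grouping step could be sketched roughly as follows. The user stories and the list of candidate nouns are invented for illustration, and the plural check is deliberately naive (it only looks for a trailing "s"); in practice this is usually done by hand with sticky notes rather than code.

```python
import re
from collections import defaultdict

# Hypothetical user stories and candidate nouns, for illustration only.
user_stories = [
    "As a bookkeeper I can create an invoice for a customer",
    "As a bookkeeper I can list all invoices for a customer",
    "As a customer I can pay an invoice",
    "As a bookkeeper I can add a customer",
]
candidate_nouns = ["invoice", "customer", "bookkeeper"]

stories_by_noun = defaultdict(list)  # noun -> stories that mention it
plural_seen = set()                  # nouns that appear in plural form

for story in user_stories:
    words = re.findall(r"[a-z]+", story.lower())
    for noun in candidate_nouns:
        plural = noun + "s"  # naive pluralisation, good enough for a sketch
        if noun in words or plural in words:
            stories_by_noun[noun].append(story)
        if plural in words:
            plural_seen.add(noun)

for noun in candidate_nouns:
    hint = "appears as a plural" if noun in plural_seen else "singular so far"
    print(f"{noun}: {len(stories_by_noun[noun])} stories, {hint}")
```

A noun that shows up as a plural next to another noun (here, "all invoices for a customer") is a hint that the relationship between the two is one to many.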

[00:32:21] And what's really important is whether something is a plural or not because quite, you know, you might think, again, it goes back to that user case. A user has many entities in your SaaS product, but you assume that the entity only has one user.

[00:32:41] And then one day you go, "Oh no! My friend wants to log in now." So you realize that's a many to many. And when you're actually designing your data model, one of the most important things is to go, "is this thing that looks like a single now ever likely to be a plural?" Because that totally informs how you design your data model.

[00:32:59] And then going back to the example you were just mentioning, when you're going, "how do I do this now and get it right?" it's like, well this is the software here and now, this thing might be a single now, but is there any set of future requirements we consider likely where it's going to be a plural? If it's a plural, then you know, make it that way in the data model. It's a one to many or maybe a many to many relationship between what you're building.
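A minimal sketch of modelling the "my friend wants to log in" case as a many to many from the start, again assuming SQLite and hypothetical table names (users, accounts, memberships). With a join table, adding a second user to an account is a new row rather than a schema change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE users    (id INTEGER PRIMARY KEY, email TEXT NOT NULL UNIQUE);
    CREATE TABLE accounts (id INTEGER PRIMARY KEY, name  TEXT NOT NULL);
    -- Join table: one row per (user, account) pair.
    CREATE TABLE memberships (
        user_id    INTEGER NOT NULL REFERENCES users(id),
        account_id INTEGER NOT NULL REFERENCES accounts(id),
        PRIMARY KEY (user_id, account_id)
    );
""")

conn.execute("INSERT INTO users (id, email) VALUES (1, 'me@example.com')")
conn.execute("INSERT INTO accounts (id, name) VALUES (1, 'My account')")
conn.execute("INSERT INTO memberships (user_id, account_id) VALUES (1, 1)")


def members_of(account_id):
    """Return the emails of every user attached to the account."""
    rows = conn.execute(
        "SELECT u.email FROM users u "
        "JOIN memberships m ON m.user_id = u.id "
        "WHERE m.account_id = ? ORDER BY u.id",
        (account_id,),
    )
    return [email for (email,) in rows]


before = members_of(1)

# The thing that looked single becomes plural: a new row, not a migration.
conn.execute("INSERT INTO users (id, email) VALUES (2, 'friend@example.com')")
conn.execute("INSERT INTO memberships (user_id, account_id) VALUES (2, 1)")
after = members_of(1)

print(before, after)
```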

[00:33:23] And I actually think it is useful to go, rather than just sit there and go thinking, "I'm thinking, is this ever a plural?" You could actually write down the user stories of the things you think might realistically happen.

[00:33:37] Not, I mean, again, you don't want to do it to a ridiculous extent and try and consider every single possible thing that could possibly happen, but if you consider the things that you consider likely to happen, then capture those in your stories, look at your nouns, see what the user stories imply about the relationships between nouns, and you might actually go, "Actually, yes. I can foresee a scenario where we're going to have to actually have more than one of these. Hence, why don't we build that into a data model now, even if we're not actually going to use that now. Because at the point that thing happens, we'll already have the data model correct. We won't have to do something like a nasty migration."

[00:34:13] Benedicte: Because then the user interface could be the one, and that's much easier to build often, like on the front end. And then when you get to the point where you need a list of things in the user interface, your data model will account for that. And you don't have to do a big migration or do a big job to make that happen.

[00:34:31] Dan: Exactly. And you know, for example, I use Rails quite a lot and that has an association called "has many", which is fairly obvious. A user has many things. But you can also have a "has one", which is the exact same underlying data model, with the foreign key in the table that you'd have the "has many" to, but you call it "has one". And then it just creates a singular one of them. And if you ever need to change your "has one" into a "has many", you change your Rails code, but you don't have to change your database.
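This is not Rails code; it is a rough Python and SQLite sketch of the same idea, with hypothetical table names and helper functions. Both the "has one" and the "has many" reading use the same tables and the same foreign key on the child table; only the application code that reads them differs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    -- The foreign key lives on the child table in both readings.
    CREATE TABLE addresses (
        id      INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES users(id),
        city    TEXT NOT NULL
    );
""")
conn.execute("INSERT INTO users (id, name) VALUES (1, 'Dana')")
conn.execute("INSERT INTO addresses (user_id, city) VALUES (1, 'Oslo')")


def address_of(user_id):
    """'Has one' reading: return a single city, or None."""
    row = conn.execute(
        "SELECT city FROM addresses WHERE user_id = ? ORDER BY id LIMIT 1",
        (user_id,),
    ).fetchone()
    return row[0] if row else None


def addresses_of(user_id):
    """'Has many' reading: same table, same foreign key, every row."""
    rows = conn.execute(
        "SELECT city FROM addresses WHERE user_id = ? ORDER BY id",
        (user_id,),
    )
    return [city for (city,) in rows]


single = address_of(1)

# Going plural later only needs new rows and the second accessor.
conn.execute("INSERT INTO addresses (user_id, city) VALUES (1, 'London')")
many = addresses_of(1)

print(single, many)
```

One caveat under these assumptions: if a unique index had been added on `addresses.user_id` to enforce the one to one, that index would have to be dropped before a second row could be inserted.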

[00:35:00] Benedicte: Ooh, interesting.

[00:35:01] Dan: I'm pretty sure Laravel does the same thing. I'm pretty sure most, you know, modern frameworks with an ORM do that sort of thing now.

[00:35:08] So, you know, particularly when you've got a one to one. So in Rails, when you've got a one to one, on one side it's belongs to and on the other side it's has one. It's very important to think about which one of those is most likely to become plural and make sure you get the belongs to has one the right way around because that serves you a lot better at the point that thing becomes a plural in the future.

[00:35:28] Benedicte: So if we go to Spotify again, you have the kind of a user has many playlists. But a playlist may have many users because they now have shared playlists.

[00:35:37] Dan: Yes, exactly.

[00:35:38] Benedicte: And that would be many to many.

[00:35:39] Dan: Yes, yes, exactly. The many to many case is. Or a slightly even better many to many with very close example is songs and playlists.

[00:35:50] So I have many playlists and a song appears on many playlists. So a playlist is a many to many between a user and a song.
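One way to sketch the songs and playlists example, assuming SQLite and invented table names. Here each playlist belongs to a user, and a hypothetical `playlist_songs` join table records which songs sit on which playlist, so a song can appear on many playlists and a playlist can hold many songs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name  TEXT NOT NULL);
    CREATE TABLE songs (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE playlists (
        id      INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES users(id),
        name    TEXT NOT NULL
    );
    -- Join table: one row per song slot on a playlist.
    CREATE TABLE playlist_songs (
        playlist_id INTEGER NOT NULL REFERENCES playlists(id),
        song_id     INTEGER NOT NULL REFERENCES songs(id),
        position    INTEGER NOT NULL,
        PRIMARY KEY (playlist_id, position)
    );
""")
conn.execute("INSERT INTO users VALUES (1, 'Listener')")
conn.executemany("INSERT INTO songs VALUES (?, ?)", [(1, "Song A"), (2, "Song B")])
conn.executemany(
    "INSERT INTO playlists VALUES (?, ?, ?)",
    [(1, 1, "Morning"), (2, 1, "Evening")],
)
conn.executemany(
    "INSERT INTO playlist_songs VALUES (?, ?, ?)",
    [(1, 1, 1), (1, 2, 2), (2, 1, 1)],
)

# A song appears on many playlists...
playlists_with_song_a = [
    name
    for (name,) in conn.execute(
        "SELECT p.name FROM playlists p "
        "JOIN playlist_songs ps ON ps.playlist_id = p.id "
        "WHERE ps.song_id = ? ORDER BY p.id",
        (1,),
    )
]

# ...and a playlist holds many songs, in order.
songs_on_morning = [
    title
    for (title,) in conn.execute(
        "SELECT s.title FROM songs s "
        "JOIN playlist_songs ps ON ps.song_id = s.id "
        "WHERE ps.playlist_id = ? ORDER BY ps.position",
        (1,),
    )
]

print(playlists_with_song_a, songs_on_morning)
```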

[00:36:02] Benedicte: Ooh, I like it. It's a good way of thinking about it.

[00:36:04] Ola: I think it sounds very useful to talk to somebody outside of the startup because you quickly create a little bit of a bubble there.

[00:36:17] Dan: Absolutely. Yeah.

[00:36:19] Ola: And then you can come in and kind of pop the bubble and you're like because it's not automatic that talking to your users, because everybody knows that you should talk to the users.

[00:36:28] But the users, they don't really like Ford said, "they don't really want a car, they want a faster horse," in a way. So they're not.

[00:36:37] Dan: Yeah, that's true. That's something that Henry T. Ford famously said, but didn't.

[00:36:43] Benedicte: That's a good quote.

[00:36:46] Ola: Yeah. So when they're talking to you who are experienced in this, you can kind of nudge them in the right direction.

[00:36:54] Dan: Yeah, I mean I think talking, particularly if you're doing a startup, talking to users is incredibly important. But you also, right, you don't want the users to create the solutions for you because the users aren't the experts in creating software that solves their problems.

[00:37:09] They want to just have their problems solved. And you know, if their horse is too slow, they want it to be faster. I mean, that's a slightly different conversation, but I think when talking to users, you don't say, you don't ask them, "what they want to do?" Or rather you don't ask them "what they want the software to do?" You ask them, "what are they trying to achieve?"

[00:37:28] And then by asking them what they're trying to achieve, then you understand what the problems you're solving for them are, not what they want the software to do. And once you've actually understood what the problems the software should solve for your users are, then you can turn them into user stories.

[00:37:47] And I still think it's worth going back to the user to validate your user stories. So you know, I'm not going to come up with another abstract example because a lot of people find it too abstract.

[00:37:59] But you know, you talk to a user and they say, this is my problem, you go away and think, "actually, I think a good software solution for this problem is this," and then you should actually then validate that again with your users. And hopefully, if you're good at communicating what you're trying to achieve, they go, "Great! That's a much better solution for my problem than the faster horse, can I have one of those?"

[00:38:18] And then you still have the user stories and then you can do the nouns and look at the relationships and hopefully get your data model as right as possible.

[00:38:27] Benedicte: So what we're saying is like finding the nouns, we need to find the nouns. That's the big takeaway.

[00:38:36] Dan: Another thing on the subject of nouns is that something that's really hard, but really important is to come up with a consistent project vernacular.

[00:38:46] Every single big software project I've ever worked on has different words for the same stuff. Whether it's in code, or whether it's in the UI, or whether it's in support documents, or whether it's just people in the business talking to each other about things. I mean, you know, in an obvious case, you know, in some parts of the code, something might be called an image. In other parts of the code, it might be called a picture. And then

[00:39:13] Benedicte: An asset.

[00:39:14] Dan: Oh, asset. Yes. Although asset is slightly different. Asset could be like a superset of other things.

[00:39:19] Benedicte: Yeah. But I've seen images, but I've seen images being assets because it was an image, but then suddenly it was extended to be video also, but it's still called an image, but it's now an asset.

[00:39:30] Dan: One thing that's super important for a multitude of reasons is to have a consistent vernacular. So if something is going to be called an image, it should be called an image, you know, all through the code, all through the business. I mean, you don't have to put it in your user interface, if it's a technical term. But it probably helps, because then when someone phones up and goes, I've got a problem with this image, you're not going to go, "do they mean this, or do they mean that?"

[00:39:53] And one thing you can do with your nouns and your user stories is like, collect nouns together that are similar and go, actually these are the same thing. And we'll treat them the same and we'll get, this is the name we'll call this thing within this system, within this business, within this whole thing we're creating. And then ideally you call your database table that thing as well.
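Collecting similar nouns under one agreed name could be sketched as a small glossary. Everything here (the terms, the aliases, and the helper names `canonical_term` and `table_name`) is hypothetical; the pluralisation is naive and only meant to show the table being named after the agreed term.

```python
# Hypothetical glossary: agreed term -> words people use for the same thing.
GLOSSARY = {
    "image": {"image", "picture", "photo"},
    "customer": {"customer", "client"},
}

# Reverse lookup: any alias -> the agreed term.
CANONICAL = {
    alias: term for term, aliases in GLOSSARY.items() for alias in aliases
}


def canonical_term(word):
    """Map a word to the agreed term; unknown words pass through unchanged."""
    key = word.strip().lower()
    return CANONICAL.get(key, key)


def table_name(word):
    """Name the database table after the agreed term (naive plural)."""
    return canonical_term(word) + "s"


print(canonical_term("Picture"), table_name("client"), canonical_term("invoice"))
```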

[00:40:16] Benedicte: Yeah. So that's when you take the image and the video and the audio and you call it assets.

[00:40:21] Dan: Yes. Well, like I say, you've got a hierarchy there which might. You can call things more than one thing if there's a different meaning. So, you know, video does mean something different to asset. Asset is a collection of different types of stuff. And in certain businesses, maybe that difference isn't important. You can just call it assets.

[00:40:41] But I'm sure in other businesses, you know, that distinction is important. So you do want to have that hierarchy. But obviously hierarchies are something that's slightly different to relational databases. So, you know, as a rule your mileage may vary.
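Relational tables have no built-in notion of a hierarchy, so it has to be modelled somehow. One common approach, sketched here with SQLite and a hypothetical `assets` table, is a single table with a `kind` column that records which type of asset each row is. This is an illustration of one option under those assumptions, not a recommendation from the conversation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE assets (
        id   INTEGER PRIMARY KEY,
        kind TEXT NOT NULL CHECK (kind IN ('image', 'video', 'audio')),
        url  TEXT NOT NULL,
        duration_seconds INTEGER  -- NULL for images
    );
""")
conn.executemany(
    "INSERT INTO assets (kind, url, duration_seconds) VALUES (?, ?, ?)",
    [
        ("image", "https://example.com/cover.png", None),
        ("video", "https://example.com/intro.mp4", 90),
    ],
)

# Where the distinction does not matter, treat everything as an asset.
asset_count = conn.execute("SELECT COUNT(*) FROM assets").fetchone()[0]

# Where it does matter, filter on the kind column.
video_urls = [
    url for (url,) in conn.execute("SELECT url FROM assets WHERE kind = 'video'")
]

print(asset_count, video_urls)
```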

[00:40:55] Ola: I have an example. We have this weather page in Norway and it's made by well one of the entities that's made it, they are the metrologists or what do you call it? Like the weather guys, the weather.

[00:41:11] Dan: Meteorologists. Yeah.

[00:41:12] Ola: Yeah.

[00:41:13] Benedicte: Meteorologists.

[00:41:14] Ola: And rain for them, in their vernacular, is green. It's green. Rain is green. And there were these two guys, actually there were just two guys making the page initially. And then the one guy he was like, "No, you can't have rain as green because people won't understand it." Yeah in the user interface. So the vernacular was too inside in a way.

[00:41:42] And then he went out and tested it on the subway. He brought a piece of paper and he had like green stuff and like, "so what is this?" And they were like, "grass?"

[00:41:58] In the end he convinced them that they should use like use blue

[00:42:07] Benedicte: In the user interface.

[00:42:08] Ola: And he got super unpopular. But yeah.

[00:42:15] So have you ever run into that kind of thing where you kind of trying to convince them that, you know, "your vernacular here, you gotta change it. You gotta change it." Has that happened?

[00:42:26] Dan: No, not really, because I mean, as an app started helping people, it doesn't matter, I mean, you've got two, you've got two slightly different concerns here.

[00:42:37] One is your internal communication, whether that's just in the code or within your business or whatever. And then you've got your user experience and how you present that to the outside world.

[00:42:51] It's better if, it's easier if they're the same, but the most important thing is just the consistency. And if you've got one entity that's called three different things throughout your code base and three different things in your support documentation. And then you go and you say, "well, I want to organize a meeting about images," and someone goes, "do you mean pictures and so forth?"

[00:43:15] You know, it just creates friction in all the communication. So I don't have a strong view on you know, it's not the case that your internal vernacular has to be the same as what's in your user interface, but it does make stuff easier if it is.

[00:43:32] Benedicte: And last question on the nouns. I feel like you often come up in situations where you have the same noun for two things.

[00:43:41] Dan: Yes.

[00:43:41] Benedicte: Would you then recommend just like prefixing it or would it be better to like find a separate name?

[00:43:48] Cause if you go back to accounting, there are accounts in accounting and then there's accounts for the user accounts that have multiple users, for instance.

[00:43:58] Dan: Okay. So there's a famous quote which includes the fact that naming is hard.

[00:44:03] Benedicte: Yes!

[00:44:04] Dan: I'm not great at naming, so I'm not going to give you advice on that. What I will say is you've got to agree what your names mean and be consistent. But in terms of product names, I can't help you with anything on that.

[00:44:18] Benedicte: No. Okay. That's fine. Is there anything more we need to know about the how?

[00:44:25] We were doing user stories, and we tried to look to the future. So we kind of know what's coming so we don't limit the user stories to just the MVP or what we're doing now. And then we collect the nouns from the user stories and then we see how they relate together.

[00:44:41] Dan: Yeah. No, I think that's fair. And I think the other thing is it's never going to be easier to fix your domain model than before you launch. So if you're in a situation where you have the process where you start making a database. You haven't launched yet, and you start writing some code.

[00:44:57] And I've done this myself. You get to a point where you go, "Actually, this data model doesn't feel right. I'm already having to code hoops here to jump through the fact that these two things aren't related correctly," or something like that.

[00:45:09] Then you should fix it before you launch. Because if you launch and you have got real user data in it and you've got to do a migration or you've exposed it by API, it's going to be a much harder problem to fix then.

[00:45:20] And I know the feeling. You've got your idea, you're excited about getting it live, and, you know, it feels like a little bit of a road bump just having to go, "oh, now that I've written like, you know, 300 lines of code, I realise that these two things are related in a way I didn't realise." And I'm like, "Oh, can I be able to like, migrate my database down and change it and do it up and do all these other things."

[00:45:41] And I'd say 99.9 percent of the time, you should because if you're hitting roadblocks before you launch, it's going to be massively more painful after you launch.

[00:45:49] Benedicte: Do you have a lived experience of that happening?

[00:45:52] Dan: Of not fixing the data model and then it biting me later?

[00:45:56] Benedicte: Yes.

[00:45:57] Dan: Probably the first three times I didn't bother doing the user membership model right. It took me more than one time to like, you know, learn that lesson properly.

[00:46:07] Benedicte: I feel like that is the biggest lesson from this season of Data in the Wild is spend some time on your user versus workspace versus teams thing.

[00:46:18] Dan: And the thing is, it's a solved problem. People don't have to keep solving a problem.

[00:46:23] I mean, if you want to use a SaaS starter kit, like something like Bullet Train, that's got that baked in from day one, so you don't have to build it from the ground up.

[00:46:31] Or if you don't want to use some sort of SaaS starter kit, just do the research to find out what the standard solution to this is. Because it's a solved problem. People don't have to solve it again. You’ve just got to take the small problem that’s in your software.

[00:46:42] Benedicte: Cool! So where can folks find out more about you?

[00:46:46] Dan: I don't have a huge online presence, but I guess my website is reasonfactory.com. And I'm on Twitter, @dansingerman.

[00:46:54] Benedicte: And we'll put links to that in the description.

[00:46:57] Thank you so much for coming on and sharing your process with us. I felt that it was very illuminating and I will actually try to do this the next time.

[00:47:07] Dan: Well, let me know how much it works because as I say, you know, your mileage may vary, but even if the takeaway is just, I need to be careful about this bit, then I think that's probably worth having.

[00:47:20] Benedicte: But if you're also just having some sort of process. So even if you're doing this by yourself, but setting aside some time to think short term and long term and like finding that you are not blocking your future vision is super valuable. And you can do that and still kind of be fast and break things and all of those fancy things they say.

[00:47:45] Dan: Absolutely. I think it's not saying you've got to do tons of design and get it right. It's certainly not. I'm certainly not advocating big design upfront. It's more just probably put 20 percent more thought into this area than you otherwise might have done.

[00:48:00] Benedicte: Yeah.

[00:48:00] Ola: So this has been the last episode of Data in the Wild. Thank you for listening and we'll see you around the interwebs!