Data Matas

Stop Coding, Start Diagramming: How to Build Data Platforms That Deliver

If you're rushing to hire a data engineer before you have a clear business question, you’re doing it backwards.

I'm joined by Teddy Bernays (Freelance Data Engineer) to unpack his "business first" approach. Teddy shares his journey and explains why simplicity and a solid plan always beat the latest tech stack.

His top advice: "Find the problem you want to solve first. Is data the answer? Only then should you start building."

In this episode, we cover: 
▶️ Why you should hire a Data Analyst before a Data Engineer 
▶️ The "Diagram First" rule for technical projects 
▶️ How to escape the painful world of legacy spreadsheets 
▶️ Finding freelance clients in the real world (get off LinkedIn!) 
▶️ Using AI to finally solve your documentation problems

What is Data Matas?

A show to explore all data matters.

From small to big, every company on the market, irrespective of their industry, is a data merchant. How they choose to keep, interrogate and understand their data is now mission critical. 30 years of spaghetti-tech, data tech debt, or rapid growth challenges are the reality in most companies.

Join Aaron Phethean, veteran intrapreneur-cum-entrepreneur with hundreds of lived examples of wins and losses in the data space, as he embarks on a journey of discovering what matters most in data nowadays by speaking to other technologists and business leaders who tackle their own data challenges every day.

Learn from their mistakes and be inspired by their stories of how they've made their data make sense and work for them.

This podcast is brought to you by Matatika - "Unlock the Insights in your Data"

Teddy Bernays (00:00.258)
need that with new technology but a good old fashioned clap never hurt, will simplify the edit.

Aaron Phethean (00:03.789)
Yeah, exactly. Okay, so you've clicked record, I've clicked record.

Teddy Bernays (00:10.975)
Yes, I have a beautiful clip.

Teddy Bernays (00:16.191)
and extend.

Aaron Phethean (00:16.399)
Awesome. Well, Hello, Teddy and welcome to the show. I'm really, really looking forward to this discussion. We were chatting off air about who might be listening and who we could help. So we'll dive into that in a second. But before we even do, why don't you introduce yourself, tell us a little bit about you and then we'll talk about who we think might benefit from this episode.

Teddy Bernays (00:38.338)
Yeah, sure. I like to introduce myself as a data engineer, working as a freelancer and a Google Cloud trainer. And that's what I do most of the time. And I would start like that. And then I think I could go on with your questions.

Aaron Phethean (00:54.947)
Yeah. Okay. So, well, let's, let's, I'll usually like I said, try to do a one take. I feel like my intro is then like, let's do another start and then go from there. I'll do the intro, but then I'll explain who might be listening straight away and then you do. Okay.

It's quite funny, normally there's like, once we get started after a few minutes, it's like smooth as anything. So this is, think we're gonna get rid of this early part and then we'll be straight into the smooth part. Okay.

Hello Teddy and welcome to the show. I'm absolutely delighted to have you on and really looking forward to discussing everything that a data consultant brings in the early stages. Plus we were chatting off air earlier about what it might be like for someone internal starting a data platform project and understanding how that works. I think there were a couple of other people we were kind of chatting about. Obviously the leader who needs the data has a...

this interaction and then someone who's just joining and thinking about coming into the data community, they can get a lot out of this episode as well. So hopefully if you're listening and that sounds like you, this one's for you. So Teddy, why don't you just tell us a little bit about you and how you got into data yourself.

Teddy Bernays (02:14.062)
Oh, interesting. I started as a data analyst actually. I think I wanted to, before I was working more in audio as we discussed previously, and I wanted to do something with IT. I mean, I was always, as a young man, interested in video games and computers, more on the hardware side. And then I said, okay, let's go. I don't want to be afraid anymore of the terminal commands.

So I was meditating between data and networking and I don't know, I started with data and was my, okay, it clicked, you know? And I think I became a data engineer as a frustrated data analyst where I wanted to go further than just using Python with pandas and visualization packages. So I wanted to go a little bit further, like go even more under the hood.

Aaron Phethean (02:49.326)
Yeah.

Teddy Bernays (03:11.466)
And I think now I'm satisfied. okay, I found... Yeah, yeah, okay. Now I can... There is so much now I can do without going to learning new crazy stuff. There is already enough now with those toolbox.

Aaron Phethean (03:13.871)
You've gone far enough, you think?

Aaron Phethean (03:27.757)
Yeah, plenty of tech to keep you busy. Maybe wind back to something you briefly mentioned there. you're sound, audio technician, you're actually still actively recording and doing things. Tell us a little bit about that and how does that relate to data?

Teddy Bernays (03:38.134)
Yeah.

Teddy Bernays (03:42.986)
Okay, if we have to find one bridge, I would say now in modern audio you have a lot of digitalization and like you use a lot of networking, that would be the bridge, you know? But I think I'm just interested about...

Aaron Phethean (03:55.939)
Yeah, yeah, yeah.

Teddy Bernays (04:02.296)
Techniques and how it's made underneath because okay. I think I started as a musician would make more sense You know like in my 20s being a musician and then recording yourself your own EP and all our music then sound technician and live sound mixer, etc, etc and Yeah, I think yes sound move through pipes data as well you can find also like an analogy here

Aaron Phethean (04:09.529)
Mm-hmm.

Aaron Phethean (04:28.909)
Yeah, yeah, yeah. think if I have to continue the analogy, which may not be perfect, it feels a lot like in data, there's a whole collection side of the problem that you tend to interface with someone. And then there's this whole producing, you know, turning into something beautiful or it's got a good kind of insight. And I actually think that's kind of the lost art of if you're just in data, just in the analytics side.

and you don't think about that or you don't think about the processing, you kind of lose the overall whole picture. So yeah, maybe there is like a real link there. Well, maybe slightly related. If you're doing the mixing and the music, you're presumably a bit more like your own boss, you know, you've got to find your clients, you've got to find your, you know, the people that you work with.

If we apply that to data and you as a consultant, how did you get started with the consultancy and finding clients and how do you start a conversation?

Teddy Bernays (05:32.046)
Yeah, that's a good question. I think I followed the path. I was already a freelancer, so it felt natural to keep doing what I do, sound or data, you know. At the end of the day, it was just a structure to declare your taxes, you know, it's what it is. But how did I start to find clients? I think I'm lucky enough to live in a big city, so I started to meet people because LinkedIn is fine. I think we met...

through LinkedIn, by the way. So it's amazing. But meeting founders of startups like that, you're going to some meetups. And I find it interesting. In addition to some platforms, we could go a little bit more in depth on the way of doing that. But I would say don't forget about meeting real life people. If you're good at communicating, if you're good at meeting people.

Aaron Phethean (06:00.761)
Yeah.

Aaron Phethean (06:18.788)
Yeah.

Teddy Bernays (06:28.366)
I mean, you can even find jobs, like, going out in a bar, you know, or in a park.

Aaron Phethean (06:32.067)
Yeah, but I think that often I find you bumping into interesting people all the time. All you have to do is ask, you know, what's going on in your world? What's interesting for you? And yeah, be surprised to hear me. Yeah.

Teddy Bernays (06:38.562)
Exactly.

Teddy Bernays (06:44.088)
We rent, my brother visited me not so long ago and we rented a bike and the guy like needed like a platform to, have some SQL database, but I don't understand this. I said, let's talk later. I mean, can help you. How do you find clients just being out there and meet people? That would be, that would be the best because you're, we are so many online and you could be, you don't, you don't master that. You don't know exactly where you are.

Aaron Phethean (06:57.455)
Hahaha!

Aaron Phethean (07:02.774)
Yeah, yeah, yeah.

Aaron Phethean (07:11.075)
Mm-mm.

Teddy Bernays (07:13.102)
how your profile will be seen. Even if you reach out to people, may be, lot of people reaching out to people.

Aaron Phethean (07:18.893)
Yeah, yeah. So maybe without leading the witness too much, how do you think the person sitting in a job in a company, perhaps they're in an analytics role or you're almost in that data space already for the company? What would you advise them to do? What's the kind of way that they might want to get into data more in their own?

Teddy Bernays (07:39.79)
If you want to get more data in the company, find a problem you want to solve first. And is data the answer? If you already asked yourself the question and said yes, okay, then we can start thinking about the data platform. And if you are a stakeholder and you wonder what is the first thing to do: hire an analyst. You know, I don't even say hire an engineer. No, no, hire an analyst because it will be...

the first translation of your business needs into a data problem. Engineering afterwards, you can, but hire a data analyst first. They will define that with you, will help you to define how we can translate that.

Aaron Phethean (08:28.823)
And probably the guy who's doing that role today, your best customer, is using a spreadsheet, is already doing some analysis of some sort. And the engineer is bringing scale, but you need to understand the problem. Yeah. I totally get that. And how does the conversation generally start for you? Like at what point does a company bring you in and say, Hey, we actually know we need you.

Teddy Bernays (08:46.668)
Yeah, exactly.

Teddy Bernays (09:00.59)
It's even interesting because I was thinking of that and I would define two categories. There are some people who don't even know that they need a data platform and you're the one mentioning it. So we'll start from there: they come and they state the problem they think it is. We have inconsistencies in our data dashboard, or it's too slow, because they are plugging in connectors, as a classic, you know, connecting directly to the sheets as you mentioned. So they don't even know that they need a platform. They come to me, I do an audit and say, actually, it's gonna be

Aaron Phethean (09:19.364)
Yeah.

Teddy Bernays (09:29.802)
more painful than what you thought in the first place. And some say, after this audit, we're not ready for it. I remember a glasses company, the business is going well, but they came to me and they showed me a screenshot, every week sharing their, whatever it was. But that's how they communicate, through sheets and screenshots by mail. I was like, okay. So they were not ready.

person that I interact with, but not the stakeholder, unfortunately. So there is this first category. Tell them that they need a data platform. And there is a second category that they know they need a data platform or they need a unification. Most of the time I notice like, they identify that we have inconsistency, we have different sources, but we don't know how can we mix pears and apples and oranges together.

Aaron Phethean (10:11.769)
Right. Yeah.

Aaron Phethean (10:24.271)
I think I get that to try and drill into it a little bit more. In the first camp, they're not really used to reporting in general. So they're not driving decisions from reporting, at least that's what I heard. And the second camp, they've got used to the idea of reporting, but now they really want to be able to correlate a report from one pile of business with another. Does that sort of sum it up?

Teddy Bernays (10:48.334)
Yeah, and I think sometimes it's kind of late because they bootstrapped something data-wise. And we mentioned, you and I, already a couple of times, the classic: building a Google Sheet or Microsoft Excel spreadsheet, and it's building up over the years. Like, the guy, Richard, is now retired, but he built it with macros on top, and the whole company is using it. Even our bank system is based on Excel spreadsheets. So it's just like...

Aaron Phethean (11:09.688)
Yeah.

Aaron Phethean (11:16.559)
you.

Teddy Bernays (11:17.134)
It's almost too late when you realize that you need to modify it. But it's not easy. when you hire anything, it's like the time that you hire data engineer. Maybe you don't want them in your team overall because your business is not only data driven because you're doing your business. Overall, you just want to improve it. That's what I had like.

Aaron Phethean (11:30.415)
Mm-mm.

Teddy Bernays (11:44.814)
client they were selling in Germany like some pergolas and then they have a data that came to me like wanted to build some ingestion pipeline and they just want to improve their business you know

Aaron Phethean (11:57.968)
Yeah, yeah. My kind of go-to model of the point at which they need it is to think about whether they can look around and count everything. If they can sort of gut feel it, they probably don't need a data platform. It's probably too early. When it gets beyond that, too many things to count, too many widgets, unit cost becomes an issue for them or something specific, then okay, great, now it's time to start.

I guess moving on to what a day in your typical, you know, working with a client looks like, tell us what it looks like to try and build something for someone. How does the process generally work? And then what's your kind of, maybe, exit point from that kind of typical engagement?

Teddy Bernays (12:48.142)
I would, it differs, it depends if they already have a platform or not. It will change. And that's why I like to work with startups, let's say greenfield, they have nothing, because you can then be what I call more efficient. You're not dependent, in your innovation as an engineer, on the tools that they're already using even though you don't think they're appropriate.

Aaron Phethean (13:17.455)
Hmm.

Teddy Bernays (13:18.616)
So if we talk about that, I will always think about before we build any data platforms, like what? By having this data platform or this dashboard, because they mentioned usually they mentioned dashboards, you know, because that's what they can visually, yeah, can see. That's why, but when they say dashboard, they say, I want a unified platform that can collect all my sources and give me reliable information.

Aaron Phethean (13:36.313)
see.

Teddy Bernays (13:47.214)
So that's the translation. And from that, I say, what will make your business better? What information will you get before we build anything from this dashboard? How can you help to make your business better? So when it's data driven, it's easier because the data is really at the core of the business. When it's not, you can define that. I will always start from there. When I talk to an analyst, it's easier because they know better the data than me.

They know their business better, so I have the perfect person to talk to. And they probably have defined even more precisely that some stakeholder that they don't necessarily know exactly. When they have plat... Yeah.

Aaron Phethean (14:22.831)
Yeah, yeah.

Aaron Phethean (14:29.133)
They might not call them the same things as us, but they're already starting to think in terms of things that they're counting or things, dimensions they describe, what they slice it by. And they're probably already starting to think of these things, so that just makes sense once you start a report. Yeah.

Teddy Bernays (14:44.462)
Yeah. So, but then for companies that already have something, they come with like a pre, they know the tools they want to use and they come to you to connect the dots, I would say. Say, we already established, because our data is, I don't know, they are already on Google Cloud, they already have a BigQuery data warehouse, and they just need to connect something or...

Aaron Phethean (15:00.309)
Mm-hmm.

Teddy Bernays (15:13.162)
Optimize the platform, or they want their warehouse to be more cost effective. So it really depends on what they need at this moment. But to try to answer your question properly: yeah, I will try to define really precisely the benefits that they will get, even from an optimization, you know.

Aaron Phethean (15:27.321)
Yeah.

Teddy Bernays (15:42.924)
You optimize it. Why do you optimize? Because it's too slow, because it costs you too much, because the frequency is not correct.

Aaron Phethean (15:47.961)
Yeah.

Aaron Phethean (15:52.048)
Yeah, so what exactly what benefit do you derive makes tons of sense? It feels like the second case where they want to solve a very specific problem

In some ways it's good the problem is already defined, but it does limit the usefulness that we can actually have for them because they think the problem is this one small part of the system, whereas if someone has stood back, and hopefully someone has in that case, there might be a sort of larger issue that how things work together rather than just the one...

Teddy Bernays (16:25.636)
they have legacy, they have a system working, it's too old fashioned, you know, way of doing it. So you have to bring like a newer technology. Yeah. So it could be that because they want, they know that it has been bootstrapped before, they use a technology that is not supported anymore. So they know they have to move towards more cloud oriented or even though, yeah, I think.

Aaron Phethean (16:30.137)
Yeah.

Aaron Phethean (16:45.039)
Yeah.

Aaron Phethean (16:51.203)
Yeah, exactly.

I guess that then probably, when we think about the types of problems you're solving and the way you go about it, it made me think of this quite funny phrase here in the UK. I think it's actually generally the Irish who are sort of credited with saying it, that if you wanted to get to your destination, well, you wouldn't start from here. And this is like, you know, the idea that you know where you need to get to, but, you know, it's quite a problematic place you start.

from I guess when you go into a company that's got a bit of a mess and got some legacy and you know they're thinking about the benefits what does the project look like how do you how do you actually think about this sort logical phases and how to help

Teddy Bernays (17:32.654)
Mm.

Teddy Bernays (17:44.814)
That's a good question. So let's say we already defined why they need to do that. I like to model everything. I use a good old diagram. That's my trick, always. When I didn't do that, sometimes I was like, damn, I should have, because I'm losing time rewriting some code or redoing some,

some dashboard graphs and start from a diagram because it will help you to explain better what you have in mind and once you want reorganize the potential mess that they have. Because they have a way of functioning and you see like it's a nonsense. So you have to propose a schema, the famous diagram, see, okay, look, that's what I see. And then you plug it afterwards. But I don't even touch any line of code at this moment.

Aaron Phethean (18:18.735)
Mm-mm.

Aaron Phethean (18:26.829)
Yeah.

Aaron Phethean (18:43.193)
Yeah, I really, like that.

Teddy Bernays (18:44.14)
But I like to, I really like this visualizing tool because it helps you to be sure of what you're doing and it helps your client to understand what is your intention there. And you can have different level of, I lost the word.

Aaron Phethean (18:58.159)
Mm-mm.

Aaron Phethean (19:10.061)
detail.

Teddy Bernays (19:10.386)
Yeah, yeah, thank you. Of scale. So I will start with a wider one from A to B. From their source to, usually, the dashboard or the analytics platform. And then we can go, okay, let's zoom in here. This first part is the data lake, and let's see how we can organize the data lake because our API is giving us that or because we want to...

Aaron Phethean (19:24.015)
Yeah.

Aaron Phethean (19:32.047)
Mm-hmm.

Teddy Bernays (19:39.904)
some transformation before or not for some reason.

Aaron Phethean (19:43.847)
Yeah, definitely love that advice. Every time we spend more effort on the requirements, the actual implementation ends up being quite straightforward and simple. Time and again, every time you dive straight into implementation, you think you know what you're doing, you end up with rework, you end up taking longer than if you just thought about that.

Teddy Bernays (20:11.47)
But you know, and you mentioned some people that they are in the company, probably you want to start from something like in data related, maybe inside a company. It looks like sometimes as engineer, we are afraid to do simple thing, because it's like, we need to do complicated. We need to use at least three different tools, one an ingestion tool, or like, I need to build this and that's like, hold on, hold on, what are you trying to do? If it's simple, if you can do it simple, do it simple. That's your...

And then you can spend most of your time, like, as we say, like create your diagram, create a nice modelization that represent the business needs properly, to do steps clearly, document it well. I like to document the why I choose this over that as well, because you're, as a freelancer, most of the time you won't stay. You can have long contract, okay?

Aaron Phethean (21:01.145)
Yeah, yeah.

Teddy Bernays (21:10.264)
six months, one year, even more sometimes, but I always think like tomorrow there will be either another me or another colleague. So they need to understand why at this moment I took that decision, even though it was maybe wrong, but I can justify why I did it. And I always think like my colleague will be thankful for that or my future me will be thankful to do that. you did that for this reason. Wrong, but at least you explained why.

Aaron Phethean (21:20.099)
Yeah.

Aaron Phethean (21:28.237)
Yeah, I like it.

Aaron Phethean (21:38.327)
Yeah, it was the best information, the best choice with the information at the time. Like that, that is definitely, you know, that idea of like thinking about your future me. I think like that all the time too. It's like when I'm, when I'm writing tests or when I'm writing down, know, a kind of capturing of what I expect the output to be or what the work.

Teddy Bernays (21:50.126)
Hmm.

Teddy Bernays (21:57.902)
in

Aaron Phethean (21:58.511)
I'm kind of taking it from the point of view of I'm going to forget almost immediately why I did this. let alone someone else who's got no context.

Teddy Bernays (22:03.702)
Mm-hmm.

Teddy Bernays (22:09.678)
I mean, you know, like, I don't know if you built a pipeline, you built a Python pipeline of your own and you come back to it: why did I write this function again? What is it for exactly? So documentation at every level, at the code level, at a higher level. I mean, I love Markdown for that. Markdown is, I think, a language that a lot of people know.

Aaron Phethean (22:28.706)
Yeah.

Teddy Bernays (22:36.942)
That's the language I use for the README in your GitHub repo. Use that, document that well. Notion is based also on Markdown. I love personally using Obsidian. And I think you can transfer a lot. It's a nice infrastructure, the way it's structured. Let's use it.
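To illustrate the "document the why" habit at the code level, here is a minimal sketch in Python. Everything in it is hypothetical and not from the conversation: the function `deduplicate_orders`, the field names `order_id` and `updated_at`, and the behaviour of the source feed are assumptions chosen only to show where the reasoning can live.

```python
def deduplicate_orders(rows):
    """Return one row per order, keeping the most recently updated version.

    Why this exists (hypothetical scenario): the source feed re-sends an
    order every time it changes, so the raw data holds several rows per order.

    Why keep the latest row rather than the first: the reports built on top
    of this need the current status, not the status at creation time.

    Known trade-off: the history of changes is dropped at this step. If that
    history is ever needed, read it from the raw data instead.
    """
    latest = {}
    for row in rows:
        current = latest.get(row["order_id"])
        # Keep the row with the highest updated_at for each order_id.
        if current is None or row["updated_at"] > current["updated_at"]:
            latest[row["order_id"]] = row
    return list(latest.values())
```

The docstring answers the "why did I write this function again?" question directly, so a colleague (or a future you) does not have to reconstruct the decision from the code alone. The same notes can be copied into a Markdown README at the higher level.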

Aaron Phethean (22:41.112)
Thank you.

Aaron Phethean (22:46.638)
Yeah.

Aaron Phethean (22:59.085)
Yeah.

again, fantastic advice. If you've got nothing, or if you've got something even, the sooner you start documenting and having a place that's kind of living, and I'm a technology guy, so I'm sort of like the closer it is to the code, the better, because it reflects reality. Documentation that's out of date is sometimes more painful than having nothing. But that kind of idea of having some kind of repository of where you can go to discover things. I also find cards and creating

Like if you think of them as a really useful documentation to go back to, suddenly there's a lot less pain involved in just creating things where you think, this is not really necessary, it's just overhead. Yeah, exactly.

Teddy Bernays (23:42.062)
People are lazy to read documentation then. But we have no excuse. We have no excuse because we can use LLMs. They're formidable for that. Just, like, help them: you say, explain, I wrote that code with this and that and that. Write the documentation for me. And I think this is super useful because documentation is mandatory. The fact that you're writing it, not really. So give it to your LLM, the same way you could give it to your LLM when you have a long documentation, to give you some

Aaron Phethean (24:05.742)
Yeah.

Teddy Bernays (24:11.618)
heads-up that will help you to find what you're looking for.

Aaron Phethean (24:14.188)
Yeah, I'm actually really personally excited about that particular future where with the right context available, whether it was cards or issues or historical, like this is what happened type documents, well then an LLM is able to say, well, what's the best action here? And so you're actually almost treating the LLM like a dynamic documentation for now based on everything.

Teddy Bernays (24:30.487)
Mmm.

Teddy Bernays (24:34.316)
Yeah? Yeah, I love that. Yeah.

Aaron Phethean (24:38.798)
That then, you know, I think we're kind of getting sort of used to the idea of how our project starts, how someone might kind of get into data. I suppose what advice would you give to a company, maybe in the middle of their kind of, you know, data project, they're kind of thinking about outcomes that, know, what would the kind of company should they be aiming for when they're working with a team? So I'm thinking more like the stakeholder point

view now, what do they need to be asking of the consultant or someone internally?

Teddy Bernays (25:17.688)
So if I get it right, you have a company now, they have already a data team, and they want to ask help for a consultant. Is it your question?

Aaron Phethean (25:23.182)
Mm-mm.

Aaron Phethean (25:27.724)
Yeah, I suppose. I'm thinking like that. So maybe they're feeling like something is not going quite right. Maybe they're feeling like really like a company to be more led by the data in there. And they're not, you know, thinking like, what does this what does the person who thinks data teams can be doing more for them need to be asking in order to get the company and them along along the journey? If we swivel around and say, they're the boss.

Teddy Bernays (25:33.836)
Mm-hmm.

Teddy Bernays (25:39.768)
Mmm.

Teddy Bernays (25:53.23)
It's like, okay, like what today the team would ask today's stakeholder to see that they are useful or how they can improve the business. Yeah.

Aaron Phethean (26:02.638)
Yeah, I guess, actually, I'm always like the other way, like, what would you tell the stakeholders? Because I'm thinking now, like, if we're talking to the, if we're used to being data people, and we're not quite getting, like, why we're not seeing the right interaction with the company, what do we see from the other point of view? Like, we think about the person on the outside of the data team that could be asking of us.

Teddy Bernays (26:11.939)
Hmm.

Teddy Bernays (26:26.03)
Mmm

Teddy Bernays (26:29.422)
I hope that they are listening to the data team. You say when you have a recommendation from the data, but you have experienced people, they know the business and they kind of have resistance to follow what the data says. So is it like a data team issue or is it management or direction of...

Aaron Phethean (26:32.27)
Yeah.

Aaron Phethean (26:44.354)
Yeah.

Aaron Phethean (26:51.842)
Yeah, interesting. Really interesting, actually.

Teddy Bernays (26:53.102)
decision-making. So what I suggest, what I do also, that's why I always ask, how can I improve your business? Or what am I doing that I'm improving your business right now? And if it's providing you a dashboard that is not even seen or not that often, is it really helping the business? So maybe the data in your example could focus on like...

Aaron Phethean (27:05.208)
Yeah.

Teddy Bernays (27:22.85)
The work that we are doing is like helping the decision making, if it's the case. If it's a cost issue, and that's where I'm like to try to focus, the data team can also ask themselves, are we optimizing our storage and compute? Because we can sum up, in my opinion, can sum up data engineering for the people that don't.

Aaron Phethean (27:27.683)
Mm-mm.

Teddy Bernays (27:48.334)
know exactly what it is to, it is your work to know how data is stored and how data is computed. So from that, you can then develop like how the security is employed, how the transformation of your data you're making, that's why computation and the storage. So, yeah.

Aaron Phethean (27:55.694)
Yeah. Yeah, yeah, yeah.

Aaron Phethean (28:08.792)
That's interesting. So if I play that back to you, the stakeholder needs to be engaging and asking the data team questions that lead to a business decision. yeah, that's great advice. And there's this whole bit you touched on whether they trust them enough or trust the data enough. And I say them because sometimes the data is trustworthy, but for whatever reason, they've lost trust in the individuals. So there's this conversation there.

Teddy Bernays (28:34.894)
Mmm.

Aaron Phethean (28:37.526)
And then, like if you think about the kind of outcome of optimizing things and running things efficiently, I wonder if you've seen really good examples of teams that, data teams, that measure themselves well or measure their utilization well.

Teddy Bernays (28:57.582)
I'm not sure I've seen that, like a data team having this measurement, but I've seen healthy teams being very autonomous: each individual is really autonomous in their task. And when they meet, they meet for a reason.

So they won't over meet, they will take decision and then try to have markers of their impact. So if we decide to rewrite that code, if you decide to change the data storage location, whatever, just trying to find an example here.

implement a little benchmark on one. So it could be, as I said, in code, it could be a function that measures how fast your data has been ingested or things like that. So on the human side, I would say a lot of trust and autonomy, and reduce the moments of meeting, and try to introduce small benchmarking when they introduce new code.

then they can do kind of A/B testing. So the benefit of code is that we can do that relatively easily. And I think there is no emotion in that. It's not related to our experience. We put a hypothesis and then say, okay, it's relevant or it's not for what we try to achieve.
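As an illustration of the small benchmark described here, a minimal sketch in Python might look like the following. The helper `measure_ingestion` and the two ingestion functions are hypothetical stand-ins, not code from any project mentioned in the episode; a real pipeline would write to storage rather than to an in-memory list.

```python
import time
from typing import Callable, List


def measure_ingestion(ingest: Callable[[List[dict]], int], rows: List[dict]) -> dict:
    """Time one ingestion run and report rows, seconds and rows per second."""
    start = time.perf_counter()
    ingested = ingest(rows)
    elapsed = time.perf_counter() - start
    return {
        "rows": ingested,
        "seconds": elapsed,
        "rows_per_second": ingested / elapsed if elapsed > 0 else float("inf"),
    }


def ingest_row_by_row(rows: List[dict]) -> int:
    # Hypothetical "current" implementation: handles one row at a time.
    sink = []
    for row in rows:
        sink.append(dict(row))
    return len(sink)


def ingest_in_batch(rows: List[dict]) -> int:
    # Hypothetical "candidate" implementation: copies all rows in one pass.
    sink = [dict(row) for row in rows]
    return len(sink)


if __name__ == "__main__":
    sample = [{"id": i, "value": i * 2} for i in range(100_000)]
    # Run both versions on the same input so the comparison is like for like.
    for name, fn in [("row_by_row", ingest_row_by_row), ("batch", ingest_in_batch)]:
        result = measure_ingestion(fn, sample)
        print(f"{name}: {result['rows']} rows in {result['seconds']:.4f}s")
```

Running both versions against the same sample gives the team a number to discuss instead of an opinion. A single run is noisy, so in practice it is worth repeating the measurement several times before deciding whether the new code is relevant for the goal.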

Aaron Phethean (30:44.43)
Yeah, that I see in the really successful people, that is one of the things I see a lot, is that they really thought about the outcome, they held themselves to account by measuring it, and like you said, then they told people and that kind of breeds that kind of autonomy and that trust and they know that they're going to get that result from that person that...

It stays on track. It's not gonna be perfect every time. Sometimes you'll invest in an area and there's no outcome. But the sooner you find that out, that kind of conversation can continue.

Teddy Bernays (31:15.822)
Exactly, that's what I wanted to add. When I think about a healthy team: allow yourself to be wrong. Allow yourself to... It could be complicated for some people, because it costs time and then it's like, no, no, no, no. When we try to do something, you will almost never end up with the right solution straight away. There will be a trial and error process. So you need to allow it, be positive about that process.

Otherwise, if you react negatively to that process, you won't allow people to try things, to be curious. And when you talk about optimization, there's no exact science. I noticed some stakeholders might not get that, because they don't know, they don't understand. So they need to trust your expertise and the way you explain to them that it's not deterministic sometimes.

Aaron Phethean (32:10.894)
In fact, I've heard it said that it takes up to three times: you get it wrong twice, and the third time you know the optimal way to do it. It's like every problem is unique. You see that in software and data. It's really hard to know what to do. Experience can help, but when every problem is unique, it's challenging.

Teddy Bernays (32:31.98)
Yeah, and I think it's interesting for anyone who wants to be a freelancer: be ready to jump between different projects. I would say when you start, it's better to focus on a domain that you know. It's really important as an analyst, and it's also the case as an engineer. What does it mean, if you have knowledge of a certain industry or a certain, like, I don't know,

to quote the example of the sports statistics that I work on: I know it pretty well. It really helps because you already understand a part of it. So when you are provided with the data, you understand things that will help you when you code or when you move your data. A really useful example: because I was working on some rugby data, I know that I shouldn't have more than seven games

Aaron Phethean (33:02.648)
Mm-hmm. Yeah.

Aaron Phethean (33:16.686)
Mm-mm.

Teddy Bernays (33:27.938)
per week. So my data should reflect that. It's a little example, but you can apply it a lot. It will help you to get it right when you're doing that.
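
Editor's note: a domain rule like "no more than seven games per week" can be turned into a small data check. The sketch below is a minimal illustration, assuming each game is represented only by its date and that "week" means ISO calendar week; the function name and the sample dates are hypothetical.

```python
from collections import Counter
from datetime import date
from typing import Dict, Iterable, Tuple

MAX_GAMES_PER_WEEK = 7  # the domain rule quoted in the conversation


def weeks_over_limit(game_dates: Iterable[date], limit: int = MAX_GAMES_PER_WEEK) -> Dict[Tuple[int, int], int]:
    """Return {(iso_year, iso_week): game_count} for every week that breaks the rule."""
    counts = Counter(tuple(d.isocalendar()[:2]) for d in game_dates)
    return {week: n for week, n in counts.items() if n > limit}


# Made-up sample: eight games in ISO week 1 of 2024 (one date is duplicated,
# as might happen after a faulty reload), and two games in week 2.
sample = [date(2024, 1, day) for day in range(1, 8)]
sample += [date(2024, 1, 7), date(2024, 1, 8), date(2024, 1, 9)]

suspicious = weeks_over_limit(sample)
print(suspicious)  # {(2024, 1): 8}
```

A check like this does not say what went wrong, only that the data contradicts what the domain allows, which is usually the cue to look for duplicates or a bad join.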

Aaron Phethean (33:31.438)
Mm-mm.

Aaron Phethean (33:38.905)
Yeah. I think that's really showing your analyst experience. That domain knowledge is so crucial to communicating well and to understanding, almost naturally, what makes sense when you're looking at something. Yeah, I really like that advice. Okay.

Teddy Bernays (33:44.526)
.

Aaron Phethean (33:59.171)
Couple of things before we think about wrapping up. We talked a little bit about the technology and the temptation to just choose technology for technology's sake. I wonder if you could scratch that itch for a second and think about what is the future of technology that companies could look forward to, and what perhaps you're doing that's on the cutting edge, or that you're seeing more people do in the industry.

Teddy Bernays (34:05.857)
Mm.

Teddy Bernays (34:15.17)
Hehe.

Teddy Bernays (34:20.608)
Yes. You know, I'm proud of us. We barely talked about tools, because I need to... There is a thing about talking about tools and technology, and I will answer your question anyway, but I think it's way too much.

Aaron Phethean (34:27.021)
Yeah, I know, I love it.

Teddy Bernays (34:37.962)
As engineers we like talking about them, they're like our toys, you know, so it's tempting to follow trends. But remember, most of the time people want to sell something behind it. That's why, and to answer your question, I have focused more on things that are reproducible with code, not being too vendor dependent. And as I said, I like to work with startups because they don't come...

Aaron Phethean (34:48.985)
Mm-hmm.

Aaron Phethean (35:00.771)
Yeah, interesting.

Teddy Bernays (35:07.662)
that much with, oh, we have this tool, we need to use it because our team is already kind of used to it. And I like to apply this: process over people over tools. So the process matters the most. So as I said, if you need to build a simple pipeline from one to three data sources, you don't need to set up a whole Spark cluster and blah, blah, blah. It's nonsense, okay? So as you have more freedom, I like to be kind of efficient. And it's true that now,

Aaron Phethean (35:15.417)
Mm.

Teddy Bernays (35:38.282)
I like to ingest more and more with dlt, from dltHub. I used to work more with custom Python code, but dlt is for Python, and the way you can ingest, because there are kind of connectors that work with code, basically, or you work with the CLI, I find it really fast and practical,

even if it's tempting to use something like Airbyte that has a nice interface and blah, blah. And at the end of the day, what I like is that you can run it on a cloud function, for instance, your dlt pipeline. And in terms of efficiency, as I said, if you don't have too much data, you can even run it on GitHub, you know? Like basically, use the virtual machine on GitHub. So this is what I care about. And then...

Aaron Phethean (36:15.502)
Mm-hmm.

Aaron Phethean (36:23.618)
Yeah, the action is in the front.

Teddy Bernays (36:32.014)
to keep on technology, I'm also a bit of an advocate of DuckDB. I know it's really popular, and there's also the cloud warehouse version that works on it, MotherDuck, this American company that builds a cloud warehouse on top of DuckDB. And for transformation, dbt, because it uses SQL. SQL is really powerful, and it gives this, like, if you are a software engineer, you know how to structure code.

Aaron Phethean (36:36.823)
Hmm.

Aaron Phethean (36:43.629)
Yeah.

Teddy Bernays (37:00.462)
It gives this approach to data analysts, or to what are now called analytics engineers: a structure that is really healthy for organizing and modeling your data when you have to do transformations. Because I consider building a pipeline to be kind of like building a software application. So if we want to talk about tools and technology, that's kind of my little trio. Right. Yeah.

Aaron Phethean (37:11.406)
Mmm.

Aaron Phethean (37:19.8)
Yeah.

Aaron Phethean (37:27.246)
That's your kind of go-to. And I think that's one thing I find quite interesting in the stack of tools. If we think about data ingestion, we are a vendor of a technology, and we're similar to dlt or Fivetran, that kind of job, yeah. And it's really interesting you touched on that kind of completely code or completely UI, or...

I definitely think my mindset is both, because there's a kind of progression to mastery. We're based on Meltano, an open technology that has connectors. But then our platform runs in the cloud and manages it. And with that kind of idea, as an engineer, I definitely see situations where both are really important. I see the same thing happening in the transformation layer.

Teddy Bernays (37:57.976)
Mm-hmm.

Aaron Phethean (38:18.575)
dbt is really popular because it's managing it like code. And then actually you speak to some customers who value the tooling, the UI-driven kind, what they see as a kind of acceleration because more people can understand it, or the joins are simpler and more straightforward. I personally feel, and I wonder about your opinion,

Teddy Bernays (38:21.912)
Mm-hmm.

Teddy Bernays (38:27.062)
Hmm.

Aaron Phethean (38:40.609)
in a UI, it quickly runs out of steam in terms of efficiency. It gets to a point where it's like, I can't do what I need to do. And it's interesting that your stack is generally more code based. Is that your experience, or is that just what you've always used? Why have you ended up in that kind of space?

Teddy Bernays (38:59.156)
No, no, no. No, that's a good question. I want to keep in mind that I want to put the tools away from my mind. I want to focus, and that's why I'm a bit tired of this overthinking only about tools, even though they're great. But as engineers, remember, we need to work on the modeling of the data, the architecture, our pipeline, the abstraction level.

Where do I put this abstraction level? Do I store it in a lake or in a warehouse? Is my warehouse in the lake as well? So those questions, those modeling questions, are fundamental, but they cannot be sold, because it's not a product, it's a skill, it's a technique. And there's the way you write your Python code. That's why I recently did some benchmarking, because I wanted to know, should I use this function or that one? Like a more Pythonic way of ingesting some of, how is it called?

when you have like... Anyway, it doesn't matter, I lost my footing a bit right now. But your question was...

Aaron Phethean (40:08.186)
Generally talking about the progression of tools. To maybe start you on a journey of a thought, I found that none of your stakeholders care about the tools. They do care if there's some kind of cost that shows up; that tends to be something they care about, maybe finance cares about it all of a sudden. But actually, we are the users of the tools, sure.

Teddy Bernays (40:30.402)
Mmm.

Aaron Phethean (40:31.567)
But actually, what they care about is how long it took to get there, the freshness of it, the correctness of it. No one actually cared which tool it was. Their concern is much more related to the business at hand and the problem. And I think as long as we respect that, the tools can then be the best tool for the job. That's kind of the thing.

Teddy Bernays (40:46.478)
Absolutely.

Teddy Bernays (40:53.614)
I think it's a good point, what you say, because when they come to you and say, we have this stack, it's because they don't want their whole team to retrain on another stack, on another tool. It takes some time to relearn something, even though there is some crossover between data warehouses. BigQuery, Redshift, Snowflake, they're kind of the same. But I think it's because we don't want to relearn, we don't want to set up

everything from the ground up, like all the authorization and so on. It takes time, and it's a part of the job that is not fun. I don't know what you think about it, but I don't like it. I don't like setting up. It's annoying. That's also why I'm code oriented, because I can create templates. I like templating things, and that's why I want to have a very code-based approach. And yeah, you asked me before if I always used that. No, before I was...

Aaron Phethean (41:25.763)
Yeah.

Aaron Phethean (41:31.927)
Yeah, yeah, exactly. Yeah.

Aaron Phethean (41:39.373)
Yeah.

Teddy Bernays (41:50.126)
Because the clients I was working with were more Google oriented. That's why I decided to specialize in Google products, because I wanted to answer the questions: how do we do this and how do we do that? Because we are already stuck around Google. I remember the first time: what is a service account? I was like, oh, it's a... For people who manage a service account, it's not fun. So you need to dig a little bit into that.

Aaron Phethean (42:01.994)
Yeah, yeah, we get it.

Aaron Phethean (42:07.554)
Yeah.

Aaron Phethean (42:15.15)
No, and I find that as a vendor, probably, I sometimes feel like I think quite differently to other vendors, because I'm thinking, actually, I fully appreciate that our technology is a good fit for the people I had in mind, but it's not a good fit for everyone. So, you know, they're like the person pitching a technology and hoping it's the perfect fit for them. Actually, first of all, I'm just trying to discover whether it is, you know, is this the situation they have? And then we go from there.

I wonder, in wrapping up, we've touched on quite a few different things in your consultancy and the idea of how projects are well run or not, and technology. I wonder what final advice you might have for someone coming into data or trying to get into consultancy. Kind of like, with all your wisdom and experience, what's your word of wisdom for them?

Teddy Bernays (43:04.993)
Mmm.

Teddy Bernays (43:09.038)
Yes, okay, I like this. Let's say you know nothing about data and you have to focus on one thing. As we say in French, the good old jar makes the best jam. So be good at SQL. One thing: when you start to learn things, you see roadmaps of learning this and learning that. Just be good at SQL, because you can do almost everything with SQL

as a data specialist. Pick PostgreSQL, because that's the best. That's the best you can have. And your skill will be transferable. That's if we want to talk a little bit about technology. Then be curious, but don't listen to all the new trends. So be curious to...

That's the way you will maybe improve your SQL query and then just be a little bit better and optimize your warehouse without taking a new tool. And I was like, we don't want a new tool, we just want to be good. And I know this could be a bit frustrating because it is not the new tea of something. It could be. And yeah.

Aaron Phethean (44:28.271)
Yeah, yeah, yeah. I absolutely love that little quote. The old jar makes the best jam. And SQL is the old jar. It's still everywhere. It's still incredibly powerful. Yeah, that's a great piece of advice.

Teddy Bernays (44:42.986)
And everyone is saying that there is this new technology that is better and... Except for text, the good old-fashioned tables solve a lot of problems. That's how SQL databases are structured. So you can do so much, just be good at it. And focus on simplification. When you start to build, and we talked about it before, build your diagram, make it simple, understandable, because you will be...

Aaron Phethean (44:53.07)
Yeah.

Teddy Bernays (45:13.166)
You will bring value first in your architecture, in the modeling of your data, because you have different sources. How do you unify those things? That's the key to me as an engineer.
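
Editor's note: as a small illustration of "be good at SQL" and of unifying different sources, the sketch below merges two differently shaped tables with plain SQL. It is a sketch under assumptions: it uses Python's built-in `sqlite3` as a stand-in (Teddy recommends PostgreSQL), and the table names, column names, and rows are all hypothetical.

```python
import sqlite3

# In-memory database so the example leaves nothing behind.
conn = sqlite3.connect(":memory:")

# Two hypothetical sources describing the same thing with different column names.
conn.executescript(
    """
    CREATE TABLE shop_orders (order_id INTEGER, customer TEXT, amount REAL);
    CREATE TABLE web_orders  (id INTEGER, email TEXT, total REAL);

    INSERT INTO shop_orders VALUES (1, 'ana@example.com', 20.0), (2, 'ben@example.com', 35.0);
    INSERT INTO web_orders  VALUES (10, 'ana@example.com', 15.0), (11, 'cy@example.com', 50.0);
    """
)

# Unify both sources into one shape first, then aggregate over the unified model.
UNIFIED_REVENUE = """
WITH all_orders AS (
    SELECT customer AS customer, amount AS amount, 'shop' AS source FROM shop_orders
    UNION ALL
    SELECT email, total, 'web' FROM web_orders
)
SELECT customer, COUNT(*) AS orders, SUM(amount) AS revenue
FROM all_orders
GROUP BY customer
ORDER BY revenue DESC, customer
"""

revenue_by_customer = conn.execute(UNIFIED_REVENUE).fetchall()
for customer, orders, revenue in revenue_by_customer:
    print(customer, orders, revenue)
```

The query uses only standard SQL (a common table expression, `UNION ALL`, `GROUP BY`), so the same statement should carry over to PostgreSQL or a cloud warehouse with little or no change.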

Aaron Phethean (45:25.295)
Yeah, actually just that socializing, helping people understand more of the whole domain, is valuable in itself. You know, no technology required, just to help people understand. Yeah. Yeah, Teddy, fabulous advice. It's been an absolute pleasure having you on the show. I really hope someone is out there thinking, that is me. That is just perfect for me. And I think we've nailed it. So thank you. Thanks for coming on.

Teddy Bernays (45:39.534)
Thank you. See you.

Teddy Bernays (45:48.428)
I hope so.

Yeah, pleasure, and if people want to connect, just... I'm not a big social media user, except LinkedIn, so just text me there and talk. Throw me rotten tomatoes if you disagree, I'll be happy, but come with arguments.

Aaron Phethean (46:09.231)
I love that too, yeah. Thank you. We met and we no doubt will have many more conversations, but almost anyone, I find, is open to that. If it's genuine, like, I need to think about this, have you seen it, do you have some experience, can you help? People love to help. So yeah, just ask. That's cool. Right. Thanks. Thanks very much.

Teddy Bernays (46:24.27)
Yeah.

Teddy Bernays (46:31.192)
Such a great work, Twinn. Thank you. Take care.