Philippe Gamache  0:00  
What's up guys, welcome to the Humans of Martech podcast. His name is Jon Taylor. My name is Phil Gamache. Our mission is to future-proof the humans behind the tech so you can have a successful and happy career in marketing.

Philippe Gamache  0:26  
What's up everyone. Today we have the pleasure of sitting down with Kevin Hu, co-founder and CEO at Metaplane. Kevin did his undergrad in physics at MIT and later collaborated with his biologist sister, assisting her in analyzing five years' worth of fish behavior data. This experience inspired him to further his research and earn a master's degree at MIT in data visualization and machine learning. He also completed a PhD in Philosophy at MIT, where he led research on automated data visualization and semantic type detection. His research was published at several conferences like CHI, SIGMOD, and KDD, and he was also featured in The Economist, The New York Times, and Wired. In 2019, Kevin teamed up with former HubSpot and Appcues engineers to launch Metaplane, which initially set out to be a product focused on customer success and designed to analyze company data for churn prevention. But after going through YC, the company pivoted slightly to build data-analytics-focused tools. Today, Metaplane is a data observability platform powered by ML-based anomaly detection that helps teams prevent and detect data issues before the CEO pings them about those weird revenue numbers on the BI dashboards. Kevin, thank you so much for your time today. Appreciate it.

Kevin Hu  1:43  
Great to talk with you both. I feel bad for putting it out there, saying that I was doing my research. You guys definitely do your research, like, are you with the feds or something? There's nothing I can say on top of that. I appreciate that really flattering intro.

Philippe Gamache  1:56  
This episode is brought to you by our friends at Knak. Launching an email or landing page in your marketing automation platform shouldn't feel like assembling an airplane mid-flight with no instructions. But too often, that's exactly how it feels. Knak is like an instruction set for campaign creation, from establishing brand guardrails and streamlining your approval process to Knak's no-code, drag-and-drop editor that helps you build emails and landing pages. No more having to stop midway through your campaign to fix something simple. Knak lets you work with your entire team in real time and stops you having to fix things mid-flight. Check them out at knak.com, that's K-N-A-K, and tell them we sent you. This episode is brought to you by our friends at RevenueHero. I can't think of anything worse than finding out a lead waited a week for a response from sales. That's why we recommend RevenueHero. It's the easiest way to qualify leads based on form values or enriched data and route them to the right sales rep. Their product is packed with a bunch of behind-the-scenes superpowers that ensure qualified leads are assigned to the right reps, following your custom round-robin rules and sending key data back to your CRM. That means more qualified meetings for your reps. We all know they want more of those. But more importantly, no more waiting time for your potential customers. They back all of this up with the best product support out there, offering 24/5 support on Slack Connect for all customers, no matter your pricing plan. So if you want to 3x your conversions with the same traffic, go to revenuehero.io and tell them we sent you. Your sales team will thank you for it. Yeah, thankfully, you've been on a couple of other podcasts and did some interviews around, so I was able to mash up some of those intros together. But yeah, super interesting guy. I know that.
One thing I wanted to start off by asking you about is that during your MIT studies, your advisor and mentor was the great César Hidalgo, who put together one of my favorite TED Talks ever. I was a TED Talk nerd for several years and watched a bunch of them. And I still remember his talk in 2018 on why we should automate politicians with AI agents. And this is back in 2018, way before ChatGPT was popular. So I'd love to hear you talk to us about the impact and influence of having such an esteemed mentor, who's recognized not only for AI/ML, but also for innovative methods for visualizing complex data and making that way more approachable for a broader audience.

Kevin Hu  4:32  
That was an amazing talk. I am not surprised that it's evergreen. I feel like every four years it's highly relevant, at least the insight. César was probably the most important teacher in my life. You mentioned that I got into the data world because of work with my sister, and some context there: she was studying animal behavior, but after years of collecting all of this experimental data, she didn't learn R or MATLAB and was kind of stuck. And it's a little bit ridiculous to me that the people who understand what the data means, who understand what questions to ask, are kind of bottlenecked because they don't know how to write code. It's kind of arbitrary. So that's one part of why I went into grad school. The other part is because I started working as an undergrad with César. And I appreciate him to this day, because as an undergrad I was working on running analyses for one of his grad students. And he calls me into his office and gives me a copy of Steven Pinker's book, The Blank Slate, and says, "Okay, Kevin, you know how to do that analysis. But can you learn to ask the questions that make the analysis meaningful? Read through this book, and we'll talk about it every week." So he really invested in, and I wasn't the only one, every single one of his students. That's when I thought, okay. You know, I feel like there are so many paths to take at that point in one's life, and we've all been at that point, but I will not regret choosing a good boss. And I think that I did. He was a great boss.

Unknown Speaker  6:21  
Very cool. Now you're

Jon Taylor  6:22  
kind of in that position at Metaplane, right? As CEO and co-founder, you have a fascinating background and multiple degrees from prestigious universities. And like we've been talking about, in academia you got to work with Olympic-gold-medal-caliber scientists and peers. How has your academic background influenced your approach as an entrepreneur? And if you went back to academia, what lessons from the startup world would you bring back?

Kevin Hu  6:51  
I mean, the highest-level thing, and that's a great question, because my answer changes all the time. But one answer is always the same, which is that academia is a six-year slog with an uncertain return, as is a startup. You don't know what you're going to get at the end of the day, that's for sure. I think one thing that I've learned from academia, which has really been helpful in a startup, is framing things as hypotheses, right? Of course, now that we're working in industry, we want to move the needle for our business, and that's what determines the worthiness of our efforts at the end of the day. But if we take a marketing example, right: what is our correct ICP? What is our messaging for the ICP, and the right channel combined with the messaging for the ICP? These are things that are very hard to determine in advance. You can have a guess, and maybe the guess is correct because you have so many reps at this. But I feel like you can really empirically and methodically test your way into what works for now, and the answer might change in the future too. That kind of empirical approach was really useful from academia. And a third is a respect for the past. This is a bone that I have to pick with people in the startup world: we all think that we're doing things for the very first time, right? The truth is that most of what we work on is not that new. People have tried this before. Maybe the timing is right now, maybe they made mistakes in the past. But being forced to cite previous work in a paper kind of puts you in a mindset of respecting previous efforts and building on top of them, which I think is missing in a lot of startup discussions, and which at least we try to do at Metaplane.

Philippe Gamache  9:06  
Very cool. I love the hypothesis philosophy that you're bringing back. We had a previous guest on the show, Britney Muller, former SEO scientist at Moz. And she said that one of her bones to pick is that we don't teach marketers enough about statistical significance and the whole domain of statistics, because marketers play with data all the time to make decisions, and we're not well equipped with that academic background about how to make those decisions. So I love your point there. And in an interview that you did with TechCrunch, you said that every day executives and marketers are making decisions based on data that is incorrect. I think Jon and I and our listeners will all agree that we've been there, we've all done that, especially the moment where you realize, "Oh shit, that wasn't the correct data, and we did that thing because of it." I'm curious as to what's the balance between investing time in accurate data, making that data fresh and perfect to the best of your ability, versus, especially for early-stage startups, just doing things to grow the company. Because, you know, getting perfect data is easier said than done. So I'm curious about your take on that balance.

Kevin Hu  10:27  
You're right to point out that it's very stage dependent. I would say that early-stage companies not only have to fight to survive, because you're dead by default, but also you don't have the data to make a really quantitative decision, right? So focusing on data quality when the data that you have is potentially incomplete at best, I'm not sure that that's such a great investment. I would say that data quality, and how much you should invest in it, should be defined by the needs of the business. So if you're using data for quarterly board meetings, that kind of suggests that, okay, maybe the data doesn't have to be super fresh. It can be updated every quarter if you're meeting every quarter. And maybe it doesn't have to be correct to the dollar if you're making decisions that are, you know, talking about millions, tens, hundreds of millions of ARR. However, that's really the foundation of the pyramid when it comes to data use cases. Let's go one level higher. Say this B2B SaaS company is also sending out daily emails based on PQLs: someone comes in and tries out the product, and you want to send them something about a feature they didn't use, to get them to book a demo or come back into the product. Well, now your data has to be correct to the day. If you're missing some customers that should have been PQLs, that's revenue and growth left on the table. If you're sending emails that are incorrect, saying, "Hey, I saw that you used this feature," but really they didn't come in that day, then you're starting to burn a little bit of trust with a customer. Now we're in this world where we have real-time automations being triggered by the data within the warehouse. Let's say that you have an e-commerce company, and you want to send out emails with discounts every time a user abandons their cart in your Shopify store.
Okay, if that abandoned-cart email automation is down because the data is delayed, because your ETL ingestion was delayed, that's potentially a lot of money left on the table, right? So the need to invest in data quality is really dictated by the frequency of the data use case and how important it is. Is this a revenue-driving use of data or not?
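
To make the point about use-case frequency concrete, here is a minimal Python sketch of a freshness check. The use-case names and staleness limits are hypothetical values chosen for illustration, and the approach is an assumption, not how Metaplane or any specific tool works: each use case declares how stale its data may be, and a single late load is evaluated against all of them.

```python
from datetime import datetime, timedelta

# Hypothetical freshness requirements per use case (illustrative values only).
MAX_STALENESS = {
    "quarterly_board_report": timedelta(days=90),
    "daily_pql_emails": timedelta(days=1),
    "abandoned_cart_emails": timedelta(minutes=15),
}

def is_fresh(use_case, last_loaded_at, now):
    """Return True if the data is recent enough for the given use case."""
    return now - last_loaded_at <= MAX_STALENESS[use_case]

def stale_use_cases(last_loaded_at, now):
    """List every use case whose freshness requirement is currently violated."""
    return [name for name in MAX_STALENESS if not is_fresh(name, last_loaded_at, now)]

now = datetime(2024, 1, 10, 12, 0)
last_load = now - timedelta(hours=6)  # e.g. a delayed ETL run
print(stale_use_cases(last_load, now))  # ['abandoned_cart_emails']
```

The same six-hour delay is harmless for the board report and the daily emails but breaks the near-real-time automation, which is the sense in which the use case sets the quality bar.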

Jon Taylor  13:19  
I have a follow-up, because you touched on this earlier when you were talking about The Blank Slate, right, which is what questions to ask. We're going to talk so much about the mechanics of the data and how to keep track of whether your data flows are going. But there's a component around data of being data overloaded. You can look in these systems and see data forever, like Neo in The Matrix, but making sense out of it can be a real challenge. In your role as CEO, how do you prioritize data? How do you prioritize which data points are most important?

Kevin Hu  13:54  
Oh, that is... you know, it's hard to know what you don't know, for sure. And a lot of times, in reality, for us at least, it comes down to relying on experts and what they have seen work in the past. That kind of tells us where to shine the spotlight within this dark basement of data to collect, right? You hire, like, a head of demand gen, and then they come in and say, "Okay, well, we don't know the effectiveness of this type of campaign on this specific medium. Let's just start collecting some information on that. Oh, and by the way, because you're going to be linking to some gated content on your website, you've got to put in UTM parameters." I would guarantee you that so many of those steps would have been missed if you didn't have someone with the expertise to do that. So I've found that it's very human and use-case driven to determine what kind of data to start collecting and the most effective way to start collecting it. But on the flip side, that's where the data team comes in. If we look at how a company evolves, the first person to put a data stack together typically doesn't have data in their job title. It might be a growth marketer, it might be a sales engineer with an interest who has used Looker in the past, or a software engineer. They'll develop the data stack to meet their needs, but they have another job, right? So then when the use cases start piling up, because you've hired that expert demand gen marketer, you hired someone in RevOps, you hired a fractional CFO, that's where the data team comes in. And they can start being much more intentional about how to design a data model that meets all these use cases.

Philippe Gamache  16:01  
Very cool. That tracks a lot with some of the topics we've had on the show. And when Jon was talking about being Neo in The Matrix with all the data coming at you, I feel like there's one thing that Metaplane does, and maybe we can unpack this: this idea of anomaly detection, helping you parse through all of this potential data and figuring out things that aren't working. So maybe we can break down this concept of anomaly detection and some of the use cases for marketing ops teams. You know, I'm in a role like that today. I work very closely with my counterpart on the data team. We built out a composable data stack powered by Census and Redshift, and we're using Airbyte for ETL. One of the examples that you walked through on the Super Data Science Podcast is that you had a B2B SaaS customer who had a product engineer who changed some of the code in the data collection process, and it ended up skewing a lot of the behavioral activity data that made its way into Amplitude or whatever. It would have been impossible to catch this unless you were looking at charts every single day. Usually you find out about this either by the CEO saying, "Hey, why is this dashboard completely out of whack?" Or you start looking at the reports of the emails that went out, and you sent 200 emails that day when in the previous weeks you didn't even top 10. So maybe you can unpack that for us, because I feel like there are so many martech and marketing use cases there, right?

Kevin Hu  17:37  
Oh, shout out to Census and Airbyte, who are amazing partners and also users of Metaplane. Incredible tools. You're right. Well, I should have said this at the start: shout out to you both, shout out to your audience. I have so much respect for what marketers do, especially how you have to balance very qualitative thinking with very quantitative, data-driven thinking, as well as thinking about big-picture entire markets and, you know, the nitty-gritty about specific people sometimes. And I think the most that data can do, which is already a lot, is to either help you determine what you should do, where it's just one ingredient, right? There is no machine that you can turn on and be told everything. Or it can actually take action and remove some of the toil of the day to day, like instead of going in and sending 100 emails, it can be automated. And within that spectrum of data as a decision-making tool and data as an action-taking tool, data quality comes in, because the data is not correct by default. What does that mean? It means that a lot of companies think of data as the ground truth. We collect it from source A and source B and source C, and this is all correct. We munge it together, and this is the data model. And we put an automation on top of that, and that's no biggie, right? I would actually say that it's the reverse: we need to start with how the business is using data, and then reverse engineer. What are the dashboards or tools that we need to meet those use cases? What's the data that we need to go into those tools? And then what's the standard of data quality that we need? This will give us, one, a goal for us to hit, and two, a way to know what the data quality actually is, because otherwise it's a little bit like playing Whack-a-Mole.
You gave the example that we talked about, the product engineer who changed some code in data collection. Let's say they changed an event name in the CDP that they're using for front-end analytics. And then a key user event that you have a lot of automations based on just goes completely wacky, where you get zeros. This is just one of, I would say, 100 ways, even more ways, that data could go wrong. We often think of data anomaly detection as an intruder alert on the front door of your house. Okay, data can go wrong in this way, and we want to know when there's an anomaly. In reality, your house has like 100 doors, and they're swung open all the time. And people know that there's no one in your house, and you have a bunch of valuables in the window, right? You can't be looking out for everything all the time, especially with so many unknown unknowns, which is where anomaly detection comes in. So that was a very long-winded way to say that data can go wrong in many different ways, because there are so many people touching it.
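
As a rough illustration of the renamed-event scenario, the Python sketch below flags days where an event's volume collapses relative to its recent history. The function name, the 7-day window, and the 20% threshold are all assumptions made for this example; it is a deliberately simple stand-in, not the method any particular observability tool uses.

```python
from statistics import median

def volume_drops(daily_counts, window=7, min_ratio=0.2):
    """Return indexes of days whose count falls below min_ratio * trailing median."""
    flagged = []
    for i in range(window, len(daily_counts)):
        baseline = median(daily_counts[i - window:i])
        # Skip events that were already silent; otherwise flag a collapse.
        if baseline > 0 and daily_counts[i] < min_ratio * baseline:
            flagged.append(i)
    return flagged

# Daily counts for a key user event; the event is renamed upstream on day 8,
# so the old name suddenly reports zero.
counts = [980, 1010, 995, 1023, 1001, 990, 1015, 1007, 0, 0]
print(volume_drops(counts))  # [8, 9]
```

Using the trailing median rather than the mean keeps the baseline stable for a few days after the break, so the second zero is still flagged.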

Jon Taylor  21:17  
I think this is a fascinating topic. As you went through that, the big use case that came to me, and I've said this a few times on the podcast, is that I just went through this last year with the end of Universal Analytics and the beginning of Google Analytics 4. I got to find all these Google Tag Manager instances that hadn't been touched by anybody for, you know, a decade. And people were shocked at how much the data was broken. Like, "That wasn't working for two years? How did I not know this?" kind of thing. One of the things that I talk about with my clients, at least, is the idea of fragile setups, right? I imagine that a lot of your conversations at Metaplane with your customers are around anomaly detection. To pivot that into anomaly prevention: what are data teams, you know, top-tier data teams, doing to make sure that their setups are future-proof? And how can marketers help steward these conversations as the use-case endpoint experts?

Kevin Hu  22:15  
That, you know, that's great. The GA migration is a fantastic example of reimagining or reassessing where the data comes from. The most important thing, from my perspective, is that there's a feedback loop between the people who produce the data, or at least are very close to the production of data, and the people who use it for some business use case. You know, there's the joke: why do doctors have bad handwriting? Because they never have to read their own handwriting, right? Pharmacists have to read their handwriting. The handwriting just goes out, the scripts never come back in. And so when we talk about the product engineer who's changing an event name upstream, they're just trying to do their job, right? And it's hard for them to know what the downstream consequences are. And this is not exclusive to marketing, right? For everyone in the company, if we can hook together these loops where we see the consequences of the data that we produce, then there's a natural mechanism for becoming better over time, or at least we're aware of its current state, instead of, unfortunately and very understandably, being like, "Shoot, we're about to migrate, and this hasn't been working for two years."

Jon Taylor  23:51  
It was a painful time, but I won't relive it too much longer on this podcast. You know, you talked about closing these feedback loops. One of the things that Phil and I have in common is that we worked together. We met at a data visualization and dashboarding company many years ago. And so some of what you're talking about, to me, seems like: well, look, if you build these dashboards, people will come and take a look at the dashboard. Then they'll see if the data is good or not good, and they'll have additional questions. But for a lot of people, data can be super intimidating, even with this layer of storytelling and data visualization. Data visualizations kind of seek to bridge this gap between data insights and human realization, but they often fall short, at least in my experience talking to customers in my past life. What lessons do you have to share with our audience around data storytelling and how to make this more effective for folks?

Kevin Hu  24:46  
That is such a good question. I'm still figuring it out all the time. I'm very curious what has worked for you both. For me, I think of data storytelling as not different from regular storytelling. It is a specific case of regular storytelling, with all the things that we can learn from what makes a compelling story, right? Having a hero's journey, having a hook. I think that applies to data storytelling too. But the cardinal rule of knowing your audience probably stands out above everything else. No matter how perfect we make a dashboard, or frame the words around the dashboard, if they're not interested, there's a limit to how much you can get someone interested. But if someone is interested, now we have some data-visualization-specific frameworks to use, and there are two that I really like. One is from Tufte. I think it's The Visual Display of Quantitative Information, the classic Tufte book, where he talks about the data-ink ratio. So when you make a data visualization, how much ink is being used to actually display information, versus how much ink is being used in total on the piece of paper, or more commonly the dashboard? The lower that number is, in other words the more cruft there is, the more overwhelming it can be, right? This is what you were saying before about being Neo in The Matrix: oh, there are so many numbers and bars and colors flying everywhere. Let's keep it super simple to start, to hook people in, and then ease in some complexity. And I love referring to what's called Shneiderman's mantra. It's a fancy name for a simple concept: first, provide the overview, right? Then you can zoom and filter, go into different parts, and then get details on demand. It's like when you're playing video games and you have a mini-map. First, we show you the whole world. Then you go in a little bit, then go in a little bit more.
We fall into this trap of throwing the matrix at people because we want to show people that zoomed-in version all the time. And I get it, because making these decisions about what you want to show first, and starting simple, these are tough decisions to make sometimes, right? It's the equivalent of saying less when you're writing copy, right? The whole, you know, "I didn't have enough time to write a shorter letter." I didn't have enough time to make a simpler graph. What have you both seen?

Philippe Gamache  27:49  
I agree. I love the analogy with the video game, of discovering the map step by step as opposed to starting with the full map and throwing that at people. One thing that's worked for me, especially in the process of going from a request to build a new dashboard or tweak an existing dashboard, is that I like to prototype it with the requester, at least from the marketing ops side. Maybe iterative storytelling from the request side is helpful, in having a prototype. It doesn't matter where you do it. There are similarities with the approach of, instead of saying you want all of this map, all of this data, here's where we start from, and we kind of zoom out from there. But yeah, I really like your analogy. This episode is brought to you by our friends at Customer.io. Were you sold on an old legacy marketing automation platform that is still struggling to update its user interface? I've done a tour of duty with all the major marketing automation platforms, and while many are definitely similar, Customer.io is the most intuitive and beautiful platform. I'm talking about the industry's top visual workflow builder to design and implement your unique messaging strategy. Powerful A/B testing features inside your workflows, not just on subject line sends. Holdout testing functionality to see the incremental impact of your messages. Cue draft mode, so you can QA messages and conditions in production with real users before anything is sent. Copy workflow items, so you don't have to repeat the building process again, and monitor campaigns, tests, and key list membership growth from your personalized dashboard. The icing on the cake: marketers using Customer.io have seen a 20% increase in conversion rates from strategic messaging. So stop using clunky old tools and adopt a multi-channel approach that creates joyful interactions with your customers.
Start a free trial without a credit card at customer.io and tell them we sent you. This episode is also brought to you by our friends at Census, the number one data activation and reverse ETL platform, loved by Activision, Canva, Sonos, Notion, and more. As you might know, I'm pretty opinionated that the future of martech is composable, and that the single source of truth for your marketing data should be your data warehouse. Census helps marketers solve an age-old marketing problem: getting real-time, complete access to your customer data without needing to write a line of code. Also, if you want your own face as a Humans of Martech style image, we're doing a fun monthly raffle with Census for a personalized t-shirt. Enter to win at getcensus.com/humans. I want to ask you: I've chatted with some folks at previous companies about data observability, and the one common theme that jumped out is this idea of avoiding false positives in anomaly detection. How do we avoid excessive alerts? If I was to use something like dbt tests to tell me when my endpoint is failing, I have to instruct the dbt test to tell me if this is one or if it's zero. But oftentimes, at WordPress.com in my previous life, we would get flooded with a ton of noise and false positives from the tools, so much that many folks just ended up muting those channels completely. So I'm curious how you guys at Metaplane are taking a different approach to avoiding these false positives?

Kevin Hu  31:29  
You're right, no one wants to be the tool that cries wolf. We exist to help companies trust data, and hopefully save time along the way. If we create confusion and cost you a lot of time, what are we doing, right? And when I say us, it could be Metaplane, it could be any of the tools in this category. But this is our North Star. The tough part about anomaly detection in our space, let's call it business data quality, is that there are some peculiarities that make it difficult to use off-the-shelf anomaly detection libraries, or very tried-and-true, by-the-book anomaly detection methods, right? Both of them. Say, Prophet is in category one, and [unclear, possibly SARIMA] is in category two. They are very popular and very good, but not necessarily for our space. Because in our space, for one, data is usually not being loaded in real time, right? The real world is happening in real time, but the data might be coming in every hour or every 15 minutes. It's not like temperature. It's not like a stock ticker. If your data is being loaded every 24 hours, it doesn't matter if you check the number of rows every 24 hours or every hour or every minute. You have to account for that. There are some specifics about our domain, like the number of rows in a table tends to increase. The number of rows in a table could double, whereas the S&P 500, if it doubled or if it halved, right? Not going to happen. Very unusual. And when it does double or halve, we have to have models that adjust immediately. Like, the first alert might be okay, the second alert might be tough. And the reason I list these examples is because, for us, we have put a lot of effort into building models from scratch. I'm not saying they're perfect. I'm not saying that they're very complex. But I am saying that we've tried to be very thoughtful about how we craft models for our space, for the kinds of issues that you might tend to see in a business setting.
And the type of appetite you might have for different types of alerts based on your workflows. To be super, super concrete: none of these off-the-shelf models will let you adjust your baseline immediately. If you have a million active users, but then your company acquires another company and now you have 2 million users, right? With basically every type of off-the-shelf model, it's going to take a while to adjust to the new baseline, to do a mean shift, so to speak. We had to develop our own types of models to do it instantly.
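
The baseline-shift idea can be sketched in a few lines of Python. To be clear, this is a toy written for illustration under stated assumptions, not Metaplane's model: the function name, the trailing-median baseline, the 25% tolerance, and the `patience` rule for accepting a new level are all hypothetical choices.

```python
from statistics import median

def detect_with_rebaseline(values, window=5, tolerance=0.25, patience=2):
    """Flag points far from the trailing median. After `patience` consecutive
    flagged points, treat the new level as the baseline and stop alerting."""
    history = list(values[:window])  # points that define the current baseline
    pending = []                     # consecutive out-of-range points
    alerts = []
    for i in range(window, len(values)):
        value = values[i]
        baseline = median(history[-window:])
        if abs(value - baseline) <= tolerance * abs(baseline):
            history.append(value)
            pending = []
            continue
        alerts.append(i)
        pending.append(value)
        if len(pending) >= patience:
            history = pending[:]     # accept the shift: restart from the new level
            pending = []
    return alerts

# Active users roughly double after an acquisition on day 5.
users = [1_000_000, 1_010_000, 990_000, 1_020_000, 1_000_000,
         2_000_000, 2_010_000, 2_020_000, 1_990_000]
print(detect_with_rebaseline(users))              # [5, 6]
print(detect_with_rebaseline(users, patience=1))  # [5]
```

With `patience=2` the jump produces two alerts and then the detector settles on the new level. A plain rolling median over the same window would keep alerting until most of the window had filled with post-acquisition values.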

Philippe Gamache  34:51  
Very cool. And the models that you guys are building also adapt to historical events from the company, right? And that's really the secret sauce compared to dbt tests: this idea that we don't have to say yes or no for some of these things. I'm not going to pretend to be an expert on dbt tests here, this is totally data land. But I'm trying to put myself in the shoes of a marketing ops person who's listening and thinking, "Hey, I wonder how my data team is thinking about data observability? I don't think we have something like Metaplane. I may have heard of dbt tests." How would you help a marketing ops person pitch Metaplane versus something like dbt tests, which has basic checks for data quality where you can give simple yeses or nos? An example that we have at our company is ensuring that every single customer we have on eligibility files has a unique ID, and dbt tests tell us when a customer doesn't have a unique ID. But Metaplane is more advanced, right? It's not yes or no. You're noticing when things are out of whack. There are the models that you're creating, but there's also the history, specifically of our data. Some of the examples that you have on your site are significant deviations and alerting on campaign performance. Maybe we have an automated campaign that's down for some reason because someone changed the name of the front-end trigger event, right? Or making sure that the data is fresh. We talked a lot about real-time data there. So yeah, maybe walk folks through the difference in how Metaplane is used for more advanced detection.

Kevin Hu  36:46  
The examples that you gave were excellent ones, of, like, enforcing a unique ID on our customers, versus something that might be more business focused. And the way that I would distinguish the two is between the known unknowns and the unknown unknowns. Some things, you know, kind of have to be true, right? Customers having unique IDs, having dates that exist only, right, like, not in the future, having transaction values that are positive, right? These are some rules that kind of have to be true. In which case, it's really, really useful to have dbt tests, or Great Expectations is another great library, so that you encode these rules upfront. And then you have a clear yes or no answer on whether your data is correct. The challenge comes in when you have these unknown unknowns. So, you know, one example might be click-through rates for different campaigns and different geos. What is a good click-through rate? I'm not sure. Right, it might depend on the campaign, it might depend on the geo, it might change over time. So not only would it take a lot of effort to write down the rules, but you'd have to keep updating the rules too. Right, this is where some of the anomaly detection based on historical data comes in. And it is very important: one, because the data is evolving over time, but two, because the data issues can occur anywhere. What we found is, sometimes if you introduce a new data source, or you introduce a new data transformation, it's not enough to know that you have the central vein of data that is being monitored by anomaly detection, because the moment you join this other side in, right, you open up a whole new can of worms for what might go wrong. So it's important to keep the barrier to entry very low to adding more and more monitors, but to also have them have a high signal-to-noise ratio.
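
The distinction above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the dbt or Great Expectations API and not Metaplane's models: `rule_checks`, `is_anomalous`, the row fields, and the z-score threshold of 3 are all hypothetical. The first function encodes rules that "kind of have to be true" and returns a clear yes or no per rule; the second judges a click-through rate only against that segment's own history.

```python
from datetime import date
from statistics import mean, stdev


def rule_checks(rows, today):
    """Rules known upfront: each one yields a clear pass (True) or fail (False)."""
    ids = [r["customer_id"] for r in rows]
    return {
        "unique_id": len(ids) == len(set(ids)),
        "no_future_dates": all(r["date"] <= today for r in rows),
        "positive_amounts": all(r["amount"] > 0 for r in rows),
    }


def is_anomalous(history, latest, z_threshold=3.0):
    """No fixed rule: flag `latest` only if it deviates from its own history."""
    if len(history) < 2:
        return False  # not enough history to judge
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return latest != mu
    return abs(latest - mu) / sigma > z_threshold


# Illustrative transactions: the duplicated customer_id fails the uniqueness rule.
rows = [
    {"customer_id": "c1", "date": date(2024, 1, 5), "amount": 20.0},
    {"customer_id": "c2", "date": date(2024, 1, 6), "amount": 35.5},
    {"customer_id": "c2", "date": date(2024, 1, 7), "amount": 12.0},
]
checks = rule_checks(rows, today=date(2024, 1, 31))

# Illustrative click-through rates per (campaign, geo) segment.
ctr_history = {
    ("spring_sale", "US"): [0.031, 0.029, 0.030, 0.032, 0.028],
    ("spring_sale", "DE"): [0.012, 0.011, 0.013, 0.012, 0.012],
}
us_flag = is_anomalous(ctr_history[("spring_sale", "US")], latest=0.012)
de_flag = is_anomalous(ctr_history[("spring_sale", "DE")], latest=0.012)
```

The same click-through rate of 0.012 is flagged for the US segment and passes for the DE segment, which is why a single hand-written threshold would not work across campaigns and geos.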

Jon Taylor  39:24  
Switching gears a little bit: we can't avoid this topic in the year of 2024, AI. We've been wrangling with this a ton on the podcast and talking to many guests around this, and I particularly find our marketing operations guests are feeling a little vindicated right now, because they've been, you know, I used to be in marketing operations. I remember being like, we've got to clean datasets, we've got to prioritize these picklists, we need to make sure that data is clean. And we know that AI is very much steered by the types of inputs that it has. If you want a good output from AI, it better have a good input. I feel like in our conversation around data observability, there's a huge trend here that people should be paying attention to. What do you think that marketers should be paying attention to in their data? And why do you think data observability is going to play an important role with the future of AI and marketing?

Kevin Hu  40:16  
I think marketers should pay attention, because what makes an AI use case special for a business is typically not the model. Right? There's a handful of companies in the world that can train and deploy a truly unique model, right? We're not even talking about the expertise needed, we can also talk about the money needed. Now, most companies don't have $100 million to dump into that. But what makes an LLM application special is the data that you have, right, the data on your leads, on your customers, on your business. And when these algorithms are put into use, right, we can think of many marketing applications, but uses across the business, we don't even know how poor data quality can affect the output. Because many AI models are very non-deterministic. We don't know: you go in there with a screwdriver and poke a bunch of holes, and what's the result? But we know from firsthand experience that LLMs are at risk of hallucination, and that humans, at least from my experience, have a tendency of saying, like, "Oh, this passes the smell test. Okay, we'll throw it in there." Right, the combination of being prone to hallucinations, and humans wanting to kind of extract themselves from the loop, makes it very, very dangerous, especially when there is not a human in between what this very complex machine produces and what your customers see. Well, I don't have a great solution to that. And I'm not trying to be a fearmonger. But I would say that data professionals, speaking of learning from the past, have decades of experience on data governance and data quality. Metaplane and what we do, we're not new, right? Companies have been trying to address data quality for the past 50, 60 years. And there have been a lot of best practices built up around that, and these best practices, thank you for, you know, pushing on it in your previous roles, will also have an impact on the new LLM world as well.

Philippe Gamache  42:47  
Very cool. I think that speaks a lot to, like, marketing ops playing a role in that, like, internal AI council, or being, like, the stewards, or continuing to be the stewards, of clean data. And, you know, oftentimes, marketing ops teams will be, not the gatekeepers, but the team you have to go through to implement this new shiny object, the new LLM that you saw, or the new copy assistant that you want to, like, roll out. And so arming these marketing ops folks with this understanding and more tools to feel confident about the quality of the data, the quality of the inputs that are going into this system, because we've talked a lot with other guests about this idea of what is going to be the role of the marketer in, like, 5, 10 years, when a lot of the martech is AI powered, and more than just, like, machine learning, but we have LLMs that are powering a lot of this stuff. We think about, like, financial sectors or healthcare sectors, like my current startup: screwing up an email that's written by AI that has PII that's wrong, or financial data that's wrong, like, the regulatory aspects of that are only going to be amplified. So I love the mission of Metaplane. And I think it speaks to the future of data management, but also the role that marketing ops can play in, like, chatting with your data team and saying, like, hey, maybe, you know, there's nothing wrong with it, because, like, I know dbt does a lot more than just dbt tests. But, like, instead of just checking for zeros and ones and constantly having to, like, evaluate whether we need to update the zeros and ones, trusting tools like Metaplane to come in with, like, bespoke models that can ingest a lot of our historical data is, I feel like, a key part of that future of enabling LLMs.

Kevin Hu  44:48  
As I mentioned, I'm not a marketer. I'm not a world-class marketer, but I know some, and I just find it so hard to imagine a future where the ChatGPT that I use is going to replace, you know, what makes these folks so special and so good at their jobs. Because it's not about producing words, right? It's not about any particular action, but it's like this, you know, holistic understanding of what's the right way to educate people about a specific value prop that you're offering. And I would say, now, I'm not speaking about world-class marketers in particular, probably every single person in your audience knows more about that than me. But speaking about data as an ingredient to get an edge, and one of the pieces that turns someone into being world class, is that when it comes to data quality, the question is not whether data quality issues occur, right, they will occur. The question is, do you want to know about it before your users do? And do you want to get better at it over time? The answer: we kind of have to say yes to both of those. Now we're starting to build up some muscle and some core competency.

Philippe Gamache  46:08  
Very cool, Kevin. I appreciate all the shout-outs for marketers out there; it means a lot coming from someone with your academic and your technical background. So really appreciate it. We ask everyone this question on the show to kind of wrap things up. You're a co-founder, a CEO, an academic, a frequent traveler, and now you go between a couple of different cities. You're also big into cycling, learning a bunch of new languages, and also a Chinese food aficionado. One question we ask all of our guests is, how do you remain happy and successful in your career? How do you balance all the things that you have going on while staying happy?

Kevin Hu  46:50  
That's a good question. You know, I oscillate between two different ways of thinking about myself, and we know, like, self-perception plays such a big role in happiness. And I go between thinking of myself as, like, a lizard, and as a complex emotional being. So when I think about, you know, Kevin the lizard, it's like, that I get enough sleep at night, that I'm drinking water, that I move my body today plays a big role, right? But when we think of ourselves as complex beings, with histories and tendencies and relationships, something that I've always found makes me happy is focusing on learning. At a startup, at any job, you both know, it's like, we can't really control a lot of stuff that happens, right? Maybe we have a conflict with someone, maybe we don't get the results that we want from a project. But I feel like we can control whether or not we've learned about something. And for me, I always like having little projects from month to month that I'm trying to learn a lot about. So this is, you know, it's great timing, because one of the projects that I'm trying to learn a lot about right now is marketing campaigns. Right? How do you design a marketing campaign? How long does it last? How specific does it have to be? How do you emphasize different channels? This is, like, 101 for you both and your audience. But for me, it's like a whole new world. And I get a lot of fun from that.

Philippe Gamache  48:39  
And I think you're selling yourself short a little bit there. You said at the start of the show too, like, "I don't know that much about marketing." But you're on the show, you talked shop, you dropped a lot of the acronyms that we're all big fans of, the wells, the IC. So you fit right in there. But yeah, Kevin, feel free to reach out anytime, happy to walk you through some campaign stuff, some use case stuff. Really appreciate your time being on the show. I think more marketing ops folks and data teams should be thinking about data observability as we enter this beautiful age of AI. Appreciate your time, and thanks so much.

Kevin Hu  49:19  
I appreciate you guys. If you TM that's me, almost big crowd. I'll just say, like, anyone listening to this, go to your data team and say "slowly changing dimension." They'll be very happy to hear about it.

Jon Taylor  49:32  
trade doctrine. That was one day. Thanks again

Philippe Gamache  49:45  
Folks, thank you so much for listening this far. We really appreciate you being here. Just wanted to call out two things before we go. Number one, the best way to support the show is by signing up for our newsletter on humansofmartech.com. We send you a quick email every Tuesday morning letting you know what episode just dropped. We include our favorite takeaways, so if you don't have time to listen to that one, no pressure, we have you covered with some learnings anyway. And number two, proceeds from sponsors this year have allowed us to venture into video. We recently launched a YouTube channel where we publish full-length episodes. So if you want to see our radio faces, check that out. That's it for now. Really appreciate you listening again. Thank you so much.

Transcribed by https://otter.ai