00:00:00 Dr Genevieve Hayes Hello and welcome to Value Driven Data Science, brought to you by Genevieve Hayes Consulting. I'm your host, Dr Genevieve Hayes, and today I'm joined by Dr Torri Callan to discuss using data science to live better for longer. Torri is the data scientist at Australian health tech startup 00:00:21 Dr Genevieve Hayes UARE, as well as working as a data scientist with fintech startups. 00:00:26 Dr Genevieve Hayes [unclear] He's spent the past five years setting up AI and automated risk management for leading finance companies in Australia. Torri, welcome to the show. 00:00:37 Dr Torri Callan Hi, thanks for having me. 00:00:39 Dr Genevieve Hayes We all want to live long, happy and healthy lives. 00:00:43 Dr Genevieve Hayes And in the age of technology, it comes as little surprise that people are turning to data to do just that. 00:00:51 Dr Genevieve Hayes Between smartwatches, Oura rings, and fitness apps like Strava, we're all generating massive quantities of personal health and fitness data each day, sometimes literally in our sleep. 00:01:05 Dr Genevieve Hayes But that data is only valuable if it can be converted into useful insights, and that's something that a lot of startups are now looking to. 00:01:14 Dr Genevieve Hayes Right. 00:01:15 Dr Genevieve Hayes One such startup is UARE, spelled U-A-R-E, 00:01:22 Dr Genevieve Hayes which, as I mentioned before, Torri, you're the data scientist for. Now, for listeners who haven't come across UARE before, can you give us an overview of what it does? 00:01:36 Dr Torri Callan UARE was born out of the desire to make sense of the wealth of data, or deluge is a better word, of personal data that you get from your fitness and health trackers. 00:01:49 Dr Torri Callan I think for those listeners who are familiar with 00:01:52 Dr Torri Callan Strava, Garmin, Oura, Whoop, Apple, 00:01:56 Dr Torri Callan you look through the menu of those apps and you get a lot of numbers thrown at you. 
So for most people, I think it's probably too much. 00:02:06 Dr Torri Callan In some ways you have a lot of numbers and a lot of data thrown at you, and it's difficult to contextualise all of that, to make sense of 00:02:16 Dr Torri Callan where that should be, and I think what's really challenging is what people need to actually do about it. So what we're attempting at UARE is to 00:02:26 Dr Torri Callan develop a broad-scope, holistic view of an individual that can actually contextualise all of your information and give you recommendations and feedback as to what you need to do to improve overall health, 00:02:42 Dr Torri Callan wellbeing and longevity. 00:02:44 Dr Genevieve Hayes What sort of recommendations does it produce? 00:02:47 Dr Torri Callan The key thesis of the business is that a lot of the drivers of longevity and overall health are how long you're spending on exercise and activity. And then there's this key thing that we think is important, 00:03:01 Dr Torri Callan that's, as far as we can tell, fairly unique, in that your functional performance is actually what drives quite a lot of overall health 00:03:11 Dr Torri Callan and then, as you get into your old age, how well you live in your old age. I suppose it's not a unique idea in the health space because 00:03:19 Dr Torri Callan there are plenty of practitioners who have spoken about the need to stay fairly fit and healthy, but what we're trying to do is 00:03:25 Dr Torri Callan bring some recommendations around how well you're performing 00:03:30 Dr Torri Callan as well as how much you're doing, and then also trying to contextualise that with, at some point, rest and sleep, making sure that that base is covered. 00:03:40 Dr Torri Callan Some of the insights we're working on at the moment, and keep in mind 
00:03:43 Dr Torri Callan this is very much a work in progress, have been around trying to give a baseline and a comparison of your performance in any particular activity 00:03:53 Dr Torri Callan compared to what we think is achievable. So for someone of your age and your gender, for example, if you went for a 5 kilometre run, 00:04:01 Dr Torri Callan and for myself, at age 30 and as a male, 00:04:06 Dr Torri Callan we know that the fastest that someone can run that is roughly 12, I think 12 and a half minutes. 00:04:12 Dr Torri Callan And so we know that if I go and do that in 23 minutes, I might be at, let's say, 50% of my overall achievement level that I could actually pursue if I wanted to. And so what we could do is we could give that a score 00:04:27 Dr Torri Callan of 50, out of 100, sorry. Or what we might want to do is we say, well, actually the 12 and a half minutes, I think it's Joshua Cheptegei who set that world record, 00:04:37 Dr Torri Callan that's achievable by someone who's specialised in running and running for that particular distance. But what we actually know is that for someone who wants to be really fit and healthy into their old age, 00:04:49 Dr Torri Callan they want to be really fit and healthy across a number of different domains in terms of 00:04:54 Dr Torri Callan time and in terms of modal domains, so not just being a really good runner, but being able to swim well, being able to ride a bike well, being able to lift well in the gym. 00:05:03 Dr Torri Callan And so I don't have exact numbers here, I'm approximating, but maybe 17 minutes for that performance is what we think is actually really necessary for you to try and achieve. So what we can then say is, hey, you've done this 5 kilometre run, 00:05:19 Dr Torri Callan we've looked at the heart rate that you did it at, and we saw that actually you were sitting at what we might call a Zone 3 or Zone 4. So we know that if you went as hard as you could 
00:05:32 Dr Torri Callan for that particular distance, you might actually be able to do it in 21 minutes or 20. 00:05:38 Dr Torri Callan And so we think that you're actually achieving three quarters of what your overall performance level is. So that's, I mean, that's one domain of insights. But what we also do at the same time is we say, well, because you've done this activity, 00:05:53 Dr Torri Callan you might not be at 100% of your achievement level, but that's fine because you're accumulating this volume of training over time, and we know that's really useful in the way you build an aerobic base and the way you improve your cardiorespiratory system. 00:06:07 Dr Torri Callan We also know it's really valuable just in terms of moving the body. There's all this data around how regular exercise and activity improves a lot of health outcomes and longevity when you look at the totality of someone's lifespan. 00:06:22 Dr Torri Callan And so we can also say, well, you've actually done a really good job in terms of achieving the activity minutes for this week and you've had a really nice mixture of moderate and low intensity and then a high intensity amount of work, both of which are quite important. And then we can summarise that and say, well, actually you're 00:06:42 Dr Torri Callan achieving really well because you're tracking 00:06:45 Dr Torri Callan upwards in terms of your performance, so it's getting better and you're doing the work to make your performance get better. 00:06:51 Dr Torri Callan And even if you are well away from the performance level because of lifestyle factors, because you've been sedentary for most of your life, you're doing the right thing. And so we can give you some really valuable feedback on, 00:07:05 Dr Torri Callan yeah, the accumulation of minutes, which is a really positive aspect. 
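The scoring idea described above can be sketched in a few lines. This is only an illustration, not UARE's actual method: it assumes the score is the simple ratio of a benchmark time to the runner's time, capped at 100, and the function name `performance_score` is hypothetical. The times are the approximate figures quoted in the conversation, so the results land near, not exactly on, the "50%" and "three quarters" mentioned.

```python
def performance_score(actual_minutes: float, benchmark_minutes: float) -> float:
    """Score a timed effort out of 100 against a benchmark time.

    Assumption (not from the source): the score is benchmark / actual,
    scaled to 100 and capped at 100, so faster efforts score higher.
    """
    if actual_minutes <= 0 or benchmark_minutes <= 0:
        raise ValueError("times must be positive")
    return min(100.0, 100.0 * benchmark_minutes / actual_minutes)


# Approximate figures quoted in the conversation, in minutes for a 5 km run.
world_record_5k = 12.5        # open men's world record, roughly
generalist_target_5k = 17.0   # target for someone fit across many domains
recorded_run = 23.0           # the run as actually recorded
estimated_best = 21.0         # estimated all-out time, inferred from heart-rate zone

# Recorded run against the world record.
print(round(performance_score(recorded_run, world_record_5k)))

# Estimated best effort against the more realistic generalist target.
print(round(performance_score(estimated_best, generalist_target_5k)))
```

Scoring the estimated best effort rather than the recorded time is what lets an easy Zone 3 run still reflect what the person could do, which is the distinction drawn in the discussion.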
00:07:10 Dr Genevieve Hayes So if you had someone who had spent most of their life sitting on the couch watching television, who suddenly decided that they were going to try and run that 5 kilometres, now obviously it would not be possible to go from the couch to 5K in a single day, so it would provide positive feedback on the fact that just doing anything is better than doing nothing. 00:07:31 Dr Torri Callan One of the really exciting things as a data scientist is actually coming into this product really, really early on and thinking about a lot of the personalization 00:07:42 Dr Torri Callan and different personas from the start, which I find quite unique compared to other products I've worked on, where there's a really stable feature set 00:07:53 Dr Torri Callan for a single cohort or a single view of someone, and then if you want to talk about personalization or recommendation systems 00:08:02 Dr Torri Callan or kind of an intelligent way of serving up a product, you have to kind of tack that on top of what's already there. 00:08:09 Dr Torri Callan So a lot of what 00:08:11 Dr Torri Callan we've discussed and thought about and are starting to implement is having almost personalised streams depending on the characteristics of someone who comes to you and what life 00:08:22 Dr Torri Callan stage they're at. So one example of a personalization strategy we can employ, which we're working on at the moment, is 00:08:29 Dr Torri Callan you have an individual who was potentially active in their teens and 20s, and maybe has reached their 40s and 50s and hasn't been as active for the last 20 or so years. 00:08:41 Dr Torri Callan You know, work, life gets in the way, that sort of thing happens to most of us. And so what we can do is we can design key product 00:08:50 Dr Torri Callan flows and key bits of information where we might not get into the whole performance aspect, cause that's just not relevant for those people. 
But what 00:08:57 Dr Torri Callan we can do is we say, 00:08:58 Dr Torri Callan well, if you started doing half an hour of activity a week, this is gonna have this percentage impact on your longevity. We know for someone... 00:09:08 Dr Torri Callan I don't have the numbers off the top of my head, so hopefully people will forgive me for approximating, but we might know the longevity of someone who's in their mid 40s based on where they live. Let's say their life expectancy is 80, and we might be able to see that 00:09:22 Dr Torri Callan if someone in that situation increased their activity levels to half an hour a week, by setting some goals to say we actually want you to try and hit 150 minutes of activity in the week, and then beyond that start to get into ideas around how efficiently you're moving and how well you're moving. 00:09:43 Dr Torri Callan One key bit that we've been able to 00:09:45 Dr Torri Callan do so far is actually look at 00:09:47 Dr Torri Callan like incidental activity. So 00:09:50 Dr Torri Callan Apple Health, I think, does a pretty reasonable job. 00:09:55 Dr Torri Callan If you go into what they serve up, they'll give you information about how many steps you've done and how many flights of stairs you're doing, and give you a comparison of how well you might be moving. 00:10:05 Dr Torri Callan But a lot of other fitness trackers tend to ignore what you do in between intentional fitness activity. 00:10:12 Dr Genevieve Hayes So Strava, you pretty much have to start an activity in order to record it. 00:10:18 Dr Torri Callan And what tends to happen with Strava is it becomes quite performative, because 00:10:24 Dr Torri Callan Strava is really, like, it's a popular product, but I think it motivates people to display activities in a really performative way. 
00:10:33 Dr Torri Callan So what gets shown is how far you went in a particular activity, how far you ran, how far you swam, how far you rode, and then how quickly you did it. There's almost a disincentive 00:10:45 Dr Torri Callan for people to do, like, easy activities. I think a really common bit of feedback, you'll hear this from cycling groups quite a lot, is that once people get onto Strava 00:10:54 Dr Torri Callan they'll want to do every ride as hard and fast as they possibly can because that looks better when it gets onto the app. But what we actually know is that a lot of 00:11:05 Dr Torri Callan training mileage is really useful at a really low level. You almost want to sit at a comfortable pace where you could have a conversation like we're having now while you're doing an activity, because that's what allows your body to accumulate some good aerobic cardiorespiratory fitness over many years. 00:11:27 Dr Torri Callan That's why you see a lot of endurance athletes, so Ironman athletes and Tour de France cyclists, peak in their early 30s. It's because they've had almost 10, maybe 15 years of doing a lot of training volume. It tends to be the case that once you've done all of that work, you can actually 00:11:47 Dr Torri Callan get really, really 00:11:48 Dr Torri Callan fit in a short amount of 00:11:49 Dr Torri Callan time. So the focus 00:11:51 Dr Torri Callan on trying to move really hard and move really quickly in a short amount of time 00:11:58 Dr Torri Callan is really useful, but can actually be detrimental if that's all you're doing, and I think this is where tech kind of gets into the loop and actually starts to impact people's behaviour, but not in an intentional manner. 00:12:11 Dr Torri Callan And I think that's one area in which we know that we can have a pretty big impact. So what we've started to work on is just saying to people, 
00:12:18 Dr Torri Callan the activity minutes are what matter, like accumulation of time, movement, spending time outside, trying to do it with other people, because we know that social contact and being in a group of like-minded people is really valuable and really impactful on your health and wellbeing. 00:12:35 Dr Torri Callan And that's the sort of feedback that we want to give in a more intentional way. So, 00:12:39 Dr Torri Callan going back to our earlier example, like 00:12:42 Dr Torri Callan for someone who's been sedentary, we might not show a performance level and we might not show a lot of information around pace or heart rate data or try to give a really high level analysis of someone's activity. All we might do is we might say, 00:12:57 Dr Torri Callan for a run, you ran this far, you ran for this amount of time and you added this amount of time onto 00:13:04 Dr Torri Callan the total activity you need for the week. So you're now, let's say, another half hour closer to your goal. 00:13:09 Dr Torri Callan And because you're that much closer to hitting your goal for this week, and you've done the last, let's say, six weeks of hitting your activity minutes goal, we know that it's going to have 00:13:21 Dr Torri Callan a small but pretty valuable impact on your life expectancy. 00:13:25 Dr Torri Callan We can give you some feedback to say if you keep going then the impact on your longevity and life expectancy is going to keep going up and up. 00:13:34 Dr Genevieve Hayes One thing I've found, cause I've used Strava in the past, is that it gets very intimidating after a while, 00:13:38 Dr Genevieve Hayes for all of those reasons that you just mentioned, that, you know, it feels like if you're not really going to be pushing it, then you're just going to embarrass yourself. 00:13:47 Dr Genevieve Hayes But I can see that what you're describing would be really motivating, because the prize is however many minutes or days that you're going to get extra at the end of. 
00:13:57 Dr Torri Callan Yeah, exactly. And I think we need to be focused on who the person is. And one thing I always keep in the back of my mind is we're not treating people as a source of data, we're not treating people as the activities they do or the device that they have. We're trying to see 00:14:13 Dr Torri Callan people as people 00:14:15 Dr Torri Callan and trying to give them feedback based on where they are in their life and what they're trying to achieve. 00:14:22 Dr Torri Callan And so for most people, all we want to do is say you've added another half hour to your activity and we just want you to keep going and keep doing that. 00:14:31 Dr Genevieve Hayes As a data scientist, how do you manage to wrap your head around the idea that the data in front of you is not just data points? 00:14:39 Dr Genevieve Hayes Those data points actually connect to real human beings, because I know that's something that a lot of data scientists struggle with. 00:14:46 Dr Torri Callan The key is having a life outside of data science. For mine, I have a pretty, let's call it robust, training volume through the week. I have for most of my life, and so I've just sort of continued doing that as I started to work. 00:15:00 Dr Torri Callan And so it's easy for me to think about, especially in this context, because it's so close to what I tend to look at day-to-day for fun and out of interest anyway. It's really easy for me to think about that as a person first and then a data scientist second. 00:15:19 Dr Torri Callan And so I 00:15:19 Dr Torri Callan think maybe the key for 00:15:22 Dr Torri Callan people who are potentially newer to the field 00:15:26 Dr Torri Callan or potentially fairly deep down a specialty is, because it can become quite engrossing, and I think quite a lot of data scientists have somewhere between obsessive and really interested personalities, 
00:15:39 Dr Torri Callan it becomes really easy to know a lot about a narrow kind of set of tools and techniques and different ways of looking at things, 00:15:49 Dr Torri Callan and it's easier to miss the full context. I don't wanna be too prescriptive here and say this is the best way of doing things. 00:15:57 Dr Torri Callan I think certain approaches are really valuable in their time and place. If I were trying to reverse engineer this approach for other people, I would suggest try and have that view outside of 00:16:10 Dr Torri Callan data science first. And my background was in statistics and this is something that, 00:16:15 Dr Torri Callan because it's a bit more of a mature field, was perhaps taught a lot better. And the two questions that I've always really liked repurposing, I think it was Don Rubin who came up with them, the statistician, 00:16:29 Dr Torri Callan 30-odd years ago. 00:16:31 Dr Torri Callan But he would always ask people, well, if you had no data at all, what would you do? 00:16:37 Dr Torri Callan Maybe before you come to me with an analysis or an experiment you want to run or a model you want to build, what would you do if you had absolutely no idea what was happening? 00:16:46 Dr Torri Callan And then the second question he'd ask is, well, what if you had all the data available? So not just this small experiment or a simple model, but every bit of data you could possibly want? What would you do? 00:16:58 Dr Torri Callan And what he's trying to, what we're trying to feel for with those two answers, it's like, what's the baseline? 00:17:03 Dr Torri Callan What's the default set of things that we're going to do if we don't know much, if we have a lot of uncertainty 00:17:10 Dr Torri Callan and, we never have no data, but if we don't have a lot of rich and full information about a particular situation? 00:17:20 Dr Torri Callan And then with the second question, you're trying to understand, well, what would 
00:17:23 Dr Torri Callan happen if you 00:17:24 Dr Torri Callan had more information, if you had all of that data that you wanted, what's actually going 00:17:28 Dr Torri Callan to change? So to bring it back to our example, 00:17:31 Dr Torri Callan like if I knew almost nothing about someone, 00:17:36 Dr Torri Callan I'd be reasonably 00:17:37 Dr Torri Callan confident in saying that they need to do 00:17:40 Dr Torri Callan at least a little bit of physical activity every day and, if they spend time trying to get better at the physical activity they're doing, that they're going to feel better, 00:17:51 Dr Torri Callan feel happier, live better and live for longer. 00:17:56 Dr Torri Callan And so if that's my baseline, 00:17:58 Dr Torri Callan then, as we gather more information from people's devices, all we're trying to do is make that advice a little bit more, 00:18:07 Dr Torri Callan I suppose, high fidelity. We're trying to give a little bit more precision to the information we're trying to give back to people. 00:18:13 Dr Genevieve Hayes What you were saying before, what happens if you have no data, that's actually a good segue into another question I was going to ask you. Since you're working at a startup, there must have been a point where you had absolutely no data at all. How did you cope with that situation where you were trying to build a product from absolutely nothing? 00:18:34 Dr Torri Callan Absolutely nothing's relative, right? It's nice to have the big database with millions or billions of rows of data, but actually when it comes to physical activity and human performance, there's quite a bit of information out there in the world. 00:18:50 Dr Torri Callan The starting point for us was actually just looking at a set of world records, 00:18:55 Dr Torri Callan so starting with open division for running across a bunch of different time domains. 
00:19:01 Dr Torri Callan I think we looked at the five kilometre, 10 kilometre, the half marathon and the marathon. 00:19:06 Dr Torri Callan We compared that between women and men to get a sense of the difference between the two divisions. And then we started to compare that across Masters divisions as well, so 00:19:20 Dr Torri Callan we might know the marathon world record time for open men, but what does that look like for over 50s and over 60s and over 70s? What that did for us, because that information is fairly available, is it actually gives us a pretty good understanding of what will happen for people as they 00:19:38 Dr Torri Callan get older and what's the difference for men and women when it comes to physical performance. 00:19:43 Dr Torri Callan And I think this is an understanding for everyone that comes through, like, 00:19:49 Dr Torri Callan what does the best possible level of performance look like? 00:19:53 Dr Torri Callan I think I mentioned 00:19:54 Dr Torri Callan this right at the top, but 00:19:56 Dr Torri Callan we're not necessarily saying that everyone needs to be achieving world record times, because I think that's probably outside the scope of health and longevity 00:20:05 Dr Torri Callan once you get beyond a certain point of functional performance. 00:20:08 Dr Genevieve Hayes And you'd also have, with the people who are really pushing it, they're also going to do some damage to their body. So it's possibly not something you want to encourage. 00:20:17 Dr Torri Callan Potentially. I think there are advantages to 00:20:21 Dr Torri Callan trying to train at a fairly high level. 00:20:24 Dr Torri Callan Maybe this is just my personality speaking, because I tend to be more of a generalist in most things. I'd say, 
00:20:30 Dr Torri Callan I'd suggest that a lot of data scientists would be generalists in a lot of aspects as well, but I'd find it more interesting to try and be good at a number of different things than to be really good at one thing when it comes to physical activity. 00:20:44 Dr Torri Callan Like trying to be a really, really good runner 00:20:47 Dr Torri Callan is to me not as interesting as trying to be a moderately decent runner and to be fairly competent 00:20:54 Dr Torri Callan when I go to the gym and 00:20:55 Dr Torri Callan to be, I'm fairly modest at swimming, but kind of at least be better than what I was a year or two ago. 00:21:03 Dr Torri Callan Yeah, maybe on the totality, and this is something we're still weighing up, maybe the best version for everyone is to be really good, or fairly good, decently good, at quite a few different things. 00:21:17 Dr Torri Callan And one interesting way to challenge yourself is just to find different activities to go be good at. 00:21:23 Dr Genevieve Hayes So what you're talking about with your starting point being looking at a lot of those world records, where did you start with regard to longevity? Were you looking at, say, actuarial tables to look at the life expectancies and things like that? 00:21:39 Dr Torri Callan Yeah, well, 00:21:40 Dr Torri Callan the starting point for us was to actually look at the World Health Organization's recommendations on physical activity. Off the top of my head, it's about 150 to 300 minutes a week of physical activity that is going to maximise the impact on longevity. 00:21:59 Dr Torri Callan And beyond that point, what you want to do is start to optimise the mix of moderate and intense activity time. So 00:22:09 Dr Torri Callan intense, I mean, it is a little bit relative, but we've used anything over 75% of your max heart rate as an intense activity, 00:22:18 Dr Torri Callan and something below that is what we call moderate. 
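The 75% threshold and the weekly minutes target described above could be combined roughly as follows. This is a sketch under stated assumptions, not the product's implementation: it assumes one heart-rate sample per minute of recorded activity and a known maximum heart rate, and the names `WeeklySummary`, `summarise_week` and `minutes_to_goal` are hypothetical. The 150-minute goal is the lower end of the range quoted from memory in the conversation.

```python
from dataclasses import dataclass

INTENSE_FRACTION = 0.75      # "anything over 75%" counts as intense
WEEKLY_GOAL_MINUTES = 150    # lower end of the 150 to 300 minute range quoted


@dataclass
class WeeklySummary:
    moderate_minutes: int
    intense_minutes: int

    @property
    def total_minutes(self) -> int:
        return self.moderate_minutes + self.intense_minutes


def summarise_week(heart_rates_bpm: list[int], max_heart_rate: float) -> WeeklySummary:
    """Split a week of activity into moderate and intense minutes.

    Assumption: each entry is the average heart rate for one minute of activity.
    """
    threshold = INTENSE_FRACTION * max_heart_rate
    intense = sum(1 for hr in heart_rates_bpm if hr > threshold)
    return WeeklySummary(
        moderate_minutes=len(heart_rates_bpm) - intense,
        intense_minutes=intense,
    )


def minutes_to_goal(summary: WeeklySummary, goal: int = WEEKLY_GOAL_MINUTES) -> int:
    """Minutes still needed this week; zero once the goal is met."""
    return max(0, goal - summary.total_minutes)


# Example week: 100 easy minutes at 120 bpm and 40 hard minutes at 150 bpm,
# for someone whose maximum heart rate is taken to be 190 bpm.
week = summarise_week([120] * 100 + [150] * 40, max_heart_rate=190)
print(week.moderate_minutes, week.intense_minutes, minutes_to_goal(week))
```

Reporting only the remaining minutes, rather than pace or heart-rate detail, matches the simpler feedback discussed earlier for people returning from a sedentary period.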
I suppose one interesting thing when you're doing this for the first time is you make all these approximations and first cuts that seem reasonable, because I can think, as I was talking through that, of quite a few things that we could improve there. 00:22:33 Dr Torri Callan But there's a lot of peer-reviewed literature and practitioners in the health and wellness space who speak about the value of staying physically active and give kind of guidelines of what they think is reasonable. 00:22:46 Dr Torri Callan And if you sample from enough of them, then you can start to triangulate between all the different recommendations 00:22:53 Dr Torri Callan out there and get an idea of the common threads. And it does seem to be the case that when it comes to longevity, you have, 00:23:02 Dr Torri Callan I guess, the baseline, which is what you'd see in an actuarial table, and then you have shifts upwards or downwards from there based on how sedentary or active you are. 00:23:13 Dr Torri Callan We've also started looking into data around resting heart rates and what happens during sleep and what happens at rest, and whether or not including that information is useful for people. 00:23:26 Dr Torri Callan I tend to say it probably is. So if you can then start to give recommendations around trying to 00:23:34 Dr Torri Callan pay off some sleep debt and make sure you have enough time spent sleeping, and potentially quality, if the data is available to give recommendations on that, then we think that's going to be 00:23:44 Dr Torri Callan useful as well. 00:23:45 Dr Torri Callan There's a few challenges, again coming back to the idea that if we want tech to be in this loop, 00:23:53 Dr Torri Callan you don't want tech to be driving behavioural change in an unintentional way. 00:23:59 Dr Torri Callan So trying to measure as many things as possible and trying to get someone to optimise them all at once, it's probably going to be overwhelming for most people. 
00:24:10 Dr Torri Callan And even if it's not overwhelming, I think, 00:24:12 Dr Torri Callan I've certainly experienced this with my own data, you can spend a lot of time and energy trying to work out what it is you need to be doing next, and I think where a platform like ours can be really useful is you can just 00:24:29 Dr Torri Callan distil all of that down into the thing that we know is most high leverage and then borrow information from what other people have been able to do 00:24:38 Dr Torri Callan and say, well, this is actually what 00:24:40 Dr Torri Callan you need to be working on next. 00:24:41 Dr Genevieve Hayes So just give one recommendation rather than a whole list of recommendations. 00:24:46 Dr Torri Callan Yeah, exactly. So you give one recommendation instead of trying to give 10 metrics that we think are really useful and tell you, not even tell you that they need to be optimised, just imply that they need to be as high as they can because 00:25:00 Dr Torri Callan they're all scores from zero to 100, or they're all plotted on graphs that go from 00:25:05 Dr Torri Callan sort of low to high and give you a comparison week to week. So 00:25:09 Dr Torri Callan trying to say like, hey, we actually know that the most high leverage thing you need to do at the moment 00:25:15 Dr Torri Callan is sleep more, because we can see you're sleeping 4 hours a night and we know that you need to be around 8 or 9. 00:25:22 Dr Torri Callan Well, for most people anyway. 00:25:22 Dr Genevieve Hayes It sounds like a lot of the data science that you're doing here would be based predominantly on statistics, is that 00:25:30 Dr Genevieve Hayes right? 00:25:31 Dr Torri Callan It's a little bit circular because my approach to data science has always been based very heavily in statistical modelling. 00:25:37 Dr Torri Callan That's where I did my PhD and my undergrad studies, so my approach has always been fairly heavily grounded in the field. 
00:25:46 Dr Torri Callan One thing I've had to learn over time is to get really good at engineering practices. I'm still not good at them. 00:25:51 Dr Torri Callan Get better, I should say. Get better at a lot of engineering practices, cause 00:25:55 Dr Torri Callan when you study for your PhD, you can get away with a lot of crappy spaghetti code. It does what you need it to do, but if you change one thing, it'll break. 00:26:04 Dr Genevieve Hayes I'm embarrassed about the code that I wrote for my PhD. 00:26:08 Dr Torri Callan Yeah, it's crazy when you think about how little rigour goes into writing academic code, I think. And there's no kind of version control. 00:26:18 Dr Torri Callan I used to kind of number scripts from one up to whatever I needed and run them all in sequence, 00:26:24 Dr Torri Callan rely a lot on local files, local environment variables. Nothing was ever a function. Yeah, you do a lot of bad things. Then when you have to try and write code that sits in a production 00:26:34 Dr Torri Callan environment and has to talk to other systems, 00:26:37 Dr Torri Callan you very quickly kind of work out what 00:26:39 Dr Torri Callan you need to do 00:26:40 Dr Torri Callan on that front. 00:26:41 Dr Genevieve Hayes What does your tech stack look like? 00:26:44 Dr Torri Callan One of the really fascinating bits that I've worked on is trying to get a lot of the data processing done in a really short amount of time. In other places I've worked, 00:26:54 Dr Torri Callan a lot of 00:26:55 Dr Torri Callan the tools you rely on have been kind of batch tools that run 00:27:01 Dr Torri Callan to process a whole set of data all at once, and those tools can be really useful even if you want to do really short or really low latency batch processing, so you want to run something every 5 minutes. 
00:27:12 Dr Torri Callan But what we've been trying to do is have something that runs almost as soon as someone syncs or uploads data from a particular device, and particularly for activity scores or activity processing, we want to actually have that 00:27:29 Dr Torri Callan in the moment. So as soon as you sync your device, you'll get that feedback and that information, and then try to propagate all that information through the different metrics and recommendations that we're trying to give, based primarily on a lot of SQS queues, so a lot of [unclear] on SQS that are tied together. 00:27:50 Dr Genevieve Hayes An AWS special? 00:27:52 Dr Torri Callan Fabulous service. It'll just add messages to a message bus and then... 00:27:56 Dr Genevieve Hayes It's sort of like Kafka. 00:27:58 Dr Torri Callan Yeah, very similar. Like I mentioned with my background, I had to learn what all of this is 00:28:04 Dr Torri Callan as I've gone along, because I've found writing the actual code to do data processing fairly straightforward, but trying to get it to talk to an SQS queue and then pass information back to a database or to another SQS queue is 00:28:17 Dr Torri Callan a little bit more challenging, but that's all part of the fun. So yeah, we rely on that because that allows us to do a lot of processing in as close to real time as possible. 00:28:26 Dr Torri Callan And then most of the processing at the moment is done with pandas, and then we're sort of slowly introducing a lot of things that can be done just with really basic SQL, because I think that's going to be quite scalable over time and really easy to read. 00:28:41 Dr Torri Callan I don't know how prevalent this is, but I have seen it for some people. There's a default temptation to reach 00:28:47 Dr Torri Callan for the tool or the package that will answer all your problems all the time. 
So using pandas for data processing, or using Spark because it'll scale to millions of rows all at once. 00:28:59 Dr Torri Callan But I've actually found that writing quite a lot of things in SQL makes it really, really easy to work out what it is you're trying to do, and really easy to onboard new people if they need to be working on it, and really easy to 00:29:13 Dr Torri Callan debug, handle errors, add filters, you know, all the things that it 00:29:17 Dr Torri Callan was built for. So 00:29:19 Dr Torri Callan yeah, quite a few things I've introduced just with really basic SQL, except 00:29:23 Dr Torri Callan perhaps some of the more complicated processing, which I've used a bit of pandas for. 00:29:28 Dr Genevieve Hayes That's really interesting 'cause I've actually spoken to quite a few people who say that they use a lot of SQL for their data science. 00:29:36 Dr Torri Callan Yeah, I think it's almost a foundational skill in data science, writing SQL. Writing SQL almost in your sleep, being able to come up with a query that will answer something, 00:29:45 Dr Torri Callan is almost more valuable than trying to come up with the really cool bit of code or the nice model that will answer something. It's valuable because you can just stack it on top of other queries and 00:29:56 Dr Torri Callan see the lineage and see what's happening 00:29:59 Dr Torri Callan to different bits of the code. So I'd imagine most people are across this, but if you're not, that's probably one recommendation I'd give: just get really 00:30:07 Dr Torri Callan comfortable answering as much as you can in SQL. 00:30:09 Dr Genevieve Hayes And so you don't use any statistical packages on top of that? 00:30:14 Dr Torri Callan Most of the stats is done offline. I sort of came up with a few models or a few kinds of calculations. I'll give one example. 00:30:24 Dr Torri Callan We use someone's maximum heart rate.
00:30:27 Dr Torri Callan To indicate if the activity they've been doing is moderate or intense, and what you'll have is a time series of people's heart rate data. 00:30:37 Dr Torri Callan And you'll also have their age and gender. 00:30:41 Dr Torri Callan What we can do is we could look at just your heart rate data over time and take the maximum of that. 00:30:47 Dr Torri Callan And that could be your max 00:30:48 Dr Torri Callan heart rate. 00:30:49 Dr Torri Callan But obviously there are sampling bias issues where if, for example, you are someone that only ever wears a Garmin watch when you're going 00:30:58 Dr Torri Callan for an easy run, 00:30:59 Dr Torri Callan and you don't often do high-intensity activity, we're not going to actually see your true 00:31:04 Dr Torri Callan max heart rate. 00:31:05 Dr Torri Callan On the other side, there are all these population measures of what your max heart rate should be, I think just based on age. 00:31:12 Dr Torri Callan The very common one is that 220 minus your age is roughly what your 00:31:17 Dr Torri Callan maximum heart rate should be. There are updated versions of that, which we use, which don't round up to 220 and use slightly different coefficients. 00:31:28 Dr Torri Callan I don't know exactly what it is off the top of my head, but 00:31:31 Dr Torri Callan it's similar in that way. 00:31:33 Dr Torri Callan And so on one hand, we could see what your max heart rate is just based on what we get from your device. On the other side, we get 00:31:41 Dr Torri Callan what we think it is from the population. What I then did is work on a model that just tried to combine those two estimates, if you think of 00:31:53 Dr Torri Callan them as estimates, 00:31:54 Dr Torri Callan and weighted them by 00:31:55 Dr Torri Callan how often someone was using their device. 00:31:58 Dr Torri Callan Let's say
Someone who used their device literally 24/7. 00:32:02 Dr Torri Callan We don't need an approximation of what the Max heart rate is. We can just read it straight off what the device is telling us. 00:32:10 Dr Torri Callan And on the other side, if you have someone that wears their device for one or two hours every week. 00:32:16 Dr Torri Callan We're not going to apply a lot of weight to the Max we see from their device. We're just going to look at what the population thinks. 00:32:22 Dr Genevieve Hayes So this is sort of a credibility theory approach if you ever come across that before. 00:32:27 Dr Torri Callan I don't know the terminology I've come across like a weighted regression or yeah, I think in like survey sampling. 00:32:32 Dr Torri Callan You'd probably do something similar where you kind of weight different surveys by either how guess how credible they are, or how much data you have. 00:32:41 Dr Genevieve Hayes I have an insurance background and credibility theory was originally developed for workers compensation. 00:32:48 Dr Genevieve Hayes And the idea is that it's basically a weighted average of industry experience and individual experience in order to calculate a workers compensation premium for a company. 00:33:00 Dr Genevieve Hayes So it sounds like you're doing a similar thing, whereas but instead of combining the individual experience for an organisation. 00:33:08 Dr Genevieve Hayes You've got the individual experience of a person and instead of industry experience, you've got general population. 00:33:17 Dr Torri Callan Yeah, I think the methods are many, but the concepts are few right? Like I think you I think you end up seeing the same thing across many different domains, which is kind of the nice bit about working in the field. 00:33:28 Dr Torri Callan It does sound like it does sound pretty close to what we're trying to achieve. We're just trying to weight different sources of data by how often we're seeing it. And then yeah, again. 
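The weighting idea discussed here, a blend of the device-observed maximum and a population estimate that leans on the device more as wear time increases, can be sketched as follows. This is a minimal illustration under stated assumptions, not the model used at the startup: the weight is assumed to be simply the fraction of the week the device is worn, and the population estimate uses the commonly quoted 220 minus age rule mentioned above rather than the updated coefficients, which are not given. The function names are hypothetical.

```python
HOURS_PER_WEEK = 24 * 7


def population_max_hr(age):
    """Commonly quoted population estimate of maximum heart rate (220 minus age)."""
    return 220 - age


def blended_max_hr(observed_max, age, hours_worn_per_week):
    """Weighted average of the device-observed max and the population estimate.

    Assumption: the weight on the device reading is the fraction of the week
    the device is worn, clipped to the range [0, 1].
    """
    weight = min(max(hours_worn_per_week / HOURS_PER_WEEK, 0.0), 1.0)
    return weight * observed_max + (1.0 - weight) * population_max_hr(age)
```

With this choice of weight, someone who wears the device 24/7 gets the device reading back unchanged, while someone who never wears it gets the population estimate, which matches the two extremes described in the conversation. Any other increasing function of wear time would fit the description equally well.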
00:33:37 Dr Torri Callan How useful we actually think it is. 00:33:40 Dr Genevieve Hayes You know, as a data scientist, I hate merging data that's been collected from multiple sources because it never quite meshes properly. 00:33:49 Dr Genevieve Hayes Is that problematic for you, given the fact that you'd be using data that was collected from multiple devices? 00:33:56 Dr Torri Callan Yeah, it's really problematic and it's the most significant challenge we're trying to address at the moment. 00:34:02 Dr Torri Callan You'll have some devices that have a really rich data set. 00:34:07 Dr Torri Callan So if you go for a run, it'll tell you the latitude and longitude of where it 00:34:11 Dr Torri Callan was, so we 00:34:12 Dr Torri Callan can give you kind of GPS data. 00:34:14 Dr Torri Callan It will tell you the pace that you're running at any given point in time. It'll tell you elevation changes. 00:34:19 Dr Torri Callan And so we can give you a lot of feedback about that particular activity. So if you did a really hilly run, 00:34:26 Dr Torri Callan we might be able to adjust the pace for the given gradient back to a certain performance level, because we know that, 00:34:34 Dr Torri Callan you know, at a certain elevation change, a 6 minute kilometre is actually similar to a 4 minute kilometre on the flat. So 00:34:40 Dr Torri Callan there's a lot of things we can do there, but then for other devices you don't get much information at all. 00:34:46 Dr Torri Callan You might just get an activity name and then the amount of time and that's it. I think the corner we kind of got painted into was trying to treat it all as the same and then have edge case handling if things were missing, and 00:35:03 Dr Torri Callan what I'm more open to now is actually handling different devices differently based on the data you have available, and then trying to make the UX and UI really
00:35:15 Dr Torri Callan Individualised based on what data is available as well. Again in that example, like you go for a run with a Garmin and you've. 00:35:22 Dr Torri Callan Got a heart rate monitor. 00:35:24 Dr Torri Callan What you might see in the front end of the app might show. 00:35:28 Dr Torri Callan How far you went, how quick you went, how quick you would have gone for the same run in that same in like flat conditions? 00:35:35 Dr Torri Callan How much time you spent resting versus moving, and then we might be able to categorise if it was an interval versus a time trial or a straight run, we can give you some feedback on your heart rate zones where you sat. 00:35:47 Dr Torri Callan And then for example, you do it with a different device and it just tells us, let's say distance and time. 00:35:53 Dr Torri Callan Instead of showing an empty screen with a bunch of empty data. 00:35:57 Dr Torri Callan Or pretending that we have more data than we do and trying to like bulk it out with what's available, where we're heading is we're just gonna have a view which just says you went for a run and you added 60 minutes to activity, minutes goal, and you're that much closer to the goal you were. 00:36:14 Dr Torri Callan Trying to achieve. 00:36:16 Dr Genevieve Hayes So there's something behind the scenes that selects a particular visualisation to show people based on what a data is available. 00:36:23 Dr Torri Callan Yeah, it's it's having like the data product really tightly integrated with the rest of the. 00:36:30 Dr Torri Callan So that based on what data we have available, we can actually decide what goes into the front end and how that's visualised and kind of customise to some extent the appearance of it. Do you have? 00:36:42 Dr Torri Callan Yeah, all these different options in the UX to say. 
00:36:47 Dr Torri Callan This activity just contributed to the activity goal, and then maybe for this activity we're going to show these fields of information, and then we're sort of starting to explore different customizations that you could do for different activities. So 00:37:00 Dr Torri Callan for example, if you go on a run with a lot of different hills, we might want to show you 00:37:05 Dr Torri Callan how much elevation change there was and how we think that run compared to what you would have done on the flat. 00:37:13 Dr Torri Callan Whereas if you go on a run and it's fairly flat, along the beach, 00:37:17 Dr Torri Callan and there's no elevation, there's not much point showing you how much elevation you had, but maybe you'd be more interested in your overall pace or your heart rate or something like that. So 00:37:28 Dr Torri Callan one of the nice bits about 00:37:30 Dr Torri Callan being the data scientist very early on in the piece is that you can kind of work on adding all of the data products in, 00:37:39 Dr Torri Callan like, have it really tightly coupled with what's in the back end and what's in the front end, and almost have it as part of the product rather than this nice add-on that comes later, way down the line. 00:37:51 Dr Genevieve Hayes How early into the game did you join the startup? 00:37:53 Dr Torri Callan Yeah, it was more or less from conception. I knew the cofounders through Manly Surf Club. They approached me as they were sort of formulating the idea and looking for some really early investment. 00:38:04 Dr Torri Callan And I helped out in bits and pieces in an ad hoc manner, just in trying to test out the idea and trying to see if we could get a little bit of data 00:38:14 Dr Torri Callan to actually formulate what they were thinking about. 00:38:16 Dr Torri Callan And then, yeah.
00:38:17 Dr Torri Callan Over time, as they gathered a bit of investment just a little bit of advising on how to set up the team and yeah, how to how to kind of lay out the back end and the front end of the thing and then. 00:38:28 Dr Torri Callan At certain times I've had to come in and actually write some code to to do a lot of the activity processing, and as it's all evolved to sort of spending a little bit more time on kind of guiding the product and saying seeing where it can, where the product and how it. 00:38:45 Dr Torri Callan Gets displayed, can actually fit with the data we have available. 00:38:47 Dr Torri Callan And I think we're now at a point where it's a little bit more mature because we have an idea of what it looks like as. 00:38:53 Dr Torri Callan An MVP? Mm-hmm. 00:38:55 Dr Torri Callan And we can now see where all these personalization and intelligent recommendation can actually feed into. 00:39:02 Dr Torri Callan The way in which the product is presented. 00:39:04 Dr Genevieve Hayes To change the topic a bit, in addition to your work in the startup space, you've also completed a PhD in statistical and mathematical modelling. 00:39:14 Dr Genevieve Hayes Now one of the things you mentioned to me when we first spoke was that a lot of your PhD work involved the use of Bayesian methods. 00:39:23 Dr Genevieve Hayes And that you saw these methods as raising the level of rigour of statistics. Given my own statistical background, I'm very interested in learning more about your thoughts on this matter. 00:39:36 Dr Genevieve Hayes But before we go down this path, could you provide our listeners with an overview of what's involved in Bayesian methods? 00:39:44 Dr Genevieve Hayes In case they've never come across them before. 00:39:48 Dr Torri Callan A number of different approaches we could make into this. 00:39:51 Dr Torri Callan Topic but what I might do is I. 00:39:53 Dr Torri Callan Might give an overview. 
00:39:54 Dr Torri Callan Of how I kind of came into. 00:39:57 Dr Torri Callan Bayesian methods, I think. I think most people have a. 00:40:01 Dr Torri Callan Really broad level understanding of what Bayes theorem is and then what I think gets presented is like this is what Bayes theorem is. 00:40:08 Dr Torri Callan This is how it falls out from the laws of conditional probability and and this is how you might actually use it, but that doesn't quite match up to what you guess used in practise, but the way which I I. 00:40:21 Dr Torri Callan Came across it and was taught with through the use of hierarchical regression modelling or what some people call multi level regression models. 00:40:30 Dr Torri Callan So these are regression models where you have data collected in clusters or groups where a particular parameter might be may or may not be. 00:40:42 Dr Torri Callan Influenced by the presence of a group, so do canonical example that gets used is you want to look at the impact of. 00:40:50 Dr Torri Callan Whether a student had breakfast before they did an exam on their exam scores, that's not quite canonical, but. 00:40:57 Dr Torri Callan Get the idea we wanna have. We wanna have some idea of if this if something a student did was influential on their exam scores or achievement levels at school. 00:41:07 Dr Torri Callan But what we wanna do is then account for the variability you'd get within classes, because different teachers obviously have different impacts on students. 00:41:18 Dr Torri Callan And the variability that you'd get within schools as well? 00:41:23 Dr Torri Callan And So what you would do if you imagine a typical regression model, it might be an intercept term plus a coefficient of the variable that you want to that you want to make some sort of inference about. 00:41:37 Dr Torri Callan So in this example it might be the average. The intercept would be the average exam score, and then the coefficient would be. 
00:41:44 Dr Torri Callan Did that student have breakfast that morning? And in a multilevel model what you'd do is you'd have an intercept for every class. 00:41:52 Dr Torri Callan So let's say there's a hundred classes within your data set, and let's say there's roughly 10 students in every class, although it doesn't have to be exact, it 00:42:02 Dr Torri Callan could vary. What you'd do is then you'd add on the average performance level within each class before you then try and decide the value of 00:42:12 Dr Torri Callan the coefficient that you're interested in. 00:42:14 Dr Genevieve Hayes OK, so it's a staged approach. 00:42:17 Dr Torri Callan Yeah, I mean, that's how I think 00:42:19 Dr Torri Callan of it now. What you actually do is obviously you estimate all this at once, which is sort of what leads me into Bayesian 00:42:26 Dr Torri Callan approaches. What you'll often have with this sort of clustered data is you'll have a lot of groups 00:42:34 Dr Torri Callan with not a lot of information, so you might only have one or two students in a lot of classes just because of the way in which you've sampled data, and then you might have some classes with a lot of students where it is easy to make that kind of inference. So what you will end up doing is you'll say 00:42:51 Dr Torri Callan the distribution of all of these class level intercepts is governed by some kind of normal distribution. Where I found it really valuable 00:43:01 Dr Torri Callan was that it allows you to fit a much larger class of models for a certain set of data, and a bit that's really interesting is you can actually start to think about models in a generative way. 00:43:16 Dr Torri Callan And if you have an idea of what the generative process is for a certain set of data, you could think about writing that down. 00:43:25 Dr Torri Callan And then if you put
00:43:27 Dr Torri Callan Prior distributions over all the parameters in whatever that data generating process is, you would be able to fit it to a certain set of data. 00:43:35 Dr Torri Callan And so it's a lot more powerful and. 00:43:37 Dr Torri Callan Flexible because you can. 00:43:39 Dr Torri Callan For example, use nonlinear functional forms if you know that there are certain physical constraints on something you observe. So if if a certain set of data is sort of bounded below, its. 00:43:51 Dr Torri Callan Zero and bounded above. 00:43:53 Dr Torri Callan At another threshold that can be that can be information that you incorporate into your model, you can also have these monotonicity constraints, so you can say, well, I'm certain that this function starts at a value and ends at the value, and it's monotonic in between that and I don't know exactly what values it goes between, but this is my. 00:44:14 Dr Torri Callan Rough guess, this is my approximation which I applied for a prior distribution. 00:44:18 Dr Torri Callan And yeah, it it opens you up to a much larger class of modelling and allows you to be a little bit more. 00:44:26 Dr Torri Callan Prescriptive in what model you write, and I think it does. 00:44:30 Dr Torri Callan Ultimately, allow you to be a little bit more nuanced. 00:44:32 Dr Torri Callan In your inference. 00:44:34 Dr Genevieve Hayes And how have you managed to make use of these techniques either in your PhD work or in your work as a data scientist? 00:44:41 Dr Torri Callan PhD with some. There's an interesting one. My my personal favourite chapter in that we were looking at the impacts of sort of an IVF treatment and some other characteristics of pregnancy on birth weight. Essentially birth weight is highly governed by the gestational age of the pregnancy. So. 00:45:01 Dr Torri Callan Pregnancies that go to. 
00:45:03 Dr Torri Callan Let's say 20 or 30 weeks obviously tend to have premature babies that are quite small, and then for babies that go to full term you're more likely to see a higher birth weight baby. And there was this idea out in the literature that actually IVF, 00:45:21 Dr Torri Callan or assisted reproductive technologies, so more than IVF, 00:45:25 Dr Torri Callan was actually impactful on 00:45:27 Dr Torri Callan birth weight. They would tend to reduce the average birth weight by a certain amount. 00:45:33 Dr Torri Callan And so when you're doing this kind of analysis, we had a bunch of data that we wanted to use to investigate that. 00:45:41 Dr Torri Callan What we wanted to do was say, well, 00:45:44 Dr Torri Callan once you adjust for gestational age, do you see the same thing? 00:45:48 Dr Torri Callan And so the typical way you'd do this if you were doing regression modelling is you would add in a term for gestational age and then you'd add in a term for usage of ART in the pregnancy or not. 00:45:59 Dr Torri Callan And then you might add another bunch of terms for other characteristics of the mother that you're interested in. 00:46:06 Dr Torri Callan But where we were able to apply this kind of thinking, I suppose, to go back a step: if you're going to do that sort of model, you obviously want gestational age to be a non-linear term, because the impact of an extra week of gestation is not the same going from, let's say, 29 to 30 as it is going from 39 to 40 weeks of pregnancy. 00:46:28 Dr Torri Callan And there are lots of techniques you can use to get a non-linear term. 00:46:31 Dr Torri Callan You can use splines or Gaussian processes, or polynomial terms, or there's a bunch of different options. 00:46:38 Dr Torri Callan But where?
Where I was able to apply a Bayesian approach with a kind of non-linear parameterized model was that we actually had an explicit model of gestational age and birth weight, and that was parameterized, I think, with a logistic curve with four parameters. 00:46:58 Dr Torri Callan A logistic curve is like an S shaped curve. It starts at the lower asymptote, 00:47:03 Dr Torri Callan it ends at the upper asymptote and looks like an S in between. 00:47:07 Dr Torri Callan And the four parameters roughly govern where the lower asymptote is, where the upper asymptote is, where the midpoint of that shape is, and then how steep the shape is in between. 00:47:19 Dr Torri Callan And then what you can do as you fit this model is you can have a regression on each 00:47:25 Dr Torri Callan parameter of that logistic curve. 00:47:28 Dr Torri Callan So instead of having an overall ART effect, you could say, well, this is the effect of assisted pregnancies on pregnancies that go to full term, because that applies to the upper asymptote of the S curve. And this is the effect of assisted 00:47:47 Dr Torri Callan technologies on pregnancies in determining how quickly 00:47:52 Dr Torri Callan birth weights will increase between early term and full term pregnancies. That's what it actually does. It actually gives you a lot more information to conduct your inference on. So instead of looking at one parameter 00:48:06 Dr Torri Callan and then looking at 00:48:08 Dr Torri Callan the significance of that parameter, and I won't go down that rabbit hole of significance and P values, but 00:48:14 Dr Torri Callan instead of looking at one parameter and then conducting all of your scientific inference on that one parameter, you have a set of parameters that you can then investigate for marginal differences and decide on the whole if there is something you can observe. 00:48:30 Dr Genevieve Hayes It's actually really interesting.
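The curve described above can be written down directly. The sketch below is a minimal illustration of a four-parameter logistic mean function with a separate additive effect of assisted reproduction (ART) on each parameter. It shows only the functional form: the priors, the likelihood and the MCMC fitting are left out, and the function names, the additive form of the effects and every number used with it are assumptions for illustration, not values from the study.

```python
import math


def logistic4(x, lower, upper, midpoint, steepness):
    """S-shaped curve rising from `lower` to `upper`, centred on `midpoint`."""
    return lower + (upper - lower) / (1.0 + math.exp(-steepness * (x - midpoint)))


def expected_birth_weight(gest_age_weeks, art, base, art_effect):
    """Mean birth weight for a given gestational age.

    `base` holds the four curve parameters for non-ART pregnancies and
    `art_effect` holds an additive shift for any of them. `art` is 0 or 1.
    Each curve parameter gets its own small regression on the ART indicator.
    """
    params = {
        name: value + art * art_effect.get(name, 0.0)
        for name, value in base.items()
    }
    return logistic4(gest_age_weeks, **params)


# Placeholder numbers chosen only to make the example run.
base = {"lower": 500.0, "upper": 3500.0, "midpoint": 32.0, "steepness": 0.3}
art_effect = {"upper": -100.0}  # hypothetical shift in the upper asymptote only

difference_at_term = (
    expected_birth_weight(40, 1, base, art_effect)
    - expected_birth_weight(40, 0, base, art_effect)
)
```

Because the effect here sits on the upper asymptote, the gap between the two curves grows towards full term and is small for very early births, which is the kind of parameter-specific statement the conversation describes. In a Bayesian fit, each entry of `art_effect` would be an unknown with its own prior and posterior.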
It's something that I'd like to have a go at in my own work, and I can imagine a lot of other listeners would want to give it a try. 00:48:40 Dr Genevieve Hayes For anyone who is interested in applying these sorts of techniques in their work, where would you recommend they begin? 00:48:47 Dr Torri Callan Statistical Rethinking is a really good textbook to start with. I think it's by Richard McElreath, I don't know if I pronounced his name right. But Statistical Rethinking in your favourite search engine will get you to the right place. 00:49:00 Dr Torri Callan He's written that textbook as an introduction to statistics, but it's through a Bayesian workflow. So instead of teaching what a t-test is and 00:49:10 Dr Torri Callan then teaching what 00:49:10 Dr Torri Callan an OLS regression is, he'll just teach you the basics through 00:49:15 Dr Torri Callan Bayesian methods, and so I think it's a really good place to start. The textbook kind of shows you 00:49:22 Dr Torri Callan where MCMC comes from and how certain choices of prior distributions impact the end result of a posterior under certain conditions, and then also the different classes of models that often get used. 00:49:36 Dr Torri Callan So there's a kind of lengthy introduction to hierarchical models similar to the one I gave 00:49:41 Dr Torri Callan at the top, but 00:49:43 Dr Torri Callan a lot more professionally done, and then there's an introduction to the idea of Gaussian processes, which are a really valuable tool, and then 00:49:51 Dr Torri Callan a sort of introduction to nonlinear models, 00:49:54 Dr Torri Callan what I call nonlinear fully parameterized models, which is sort of the example that I just gave before. So I think I'd start there. I think if 00:50:03 Dr Torri Callan online learning is more your flavour rather than textbooks, I use the Stan language, which is a Bayesian statistical modelling language.
You can write models that can then be called by either R or Python, so 00:50:19 Dr Torri Callan it can be used from either option for the same given model. What you'll do is you'll write a given model in Stan and then you'll pass data across either from R or Python, compile the model and estimate it, and then gather the posterior samples back in whatever language you were using before to analyse. 00:50:36 Dr Genevieve Hayes So do you call it from Python or R in order to get the model to run? 00:50:40 Dr Torri Callan Yeah, exactly. So you'll call the model and you'll pass it a set of data, and you'll specify in the model what data you're expecting. 00:50:48 Dr Torri Callan And so Stan will take that set of data, take the model you've written, apply its version of MCMC to sample the posterior, or generate a bunch of posterior samples from the model, and then it will pass that back. 00:51:02 Dr Torri Callan If you're working in R, it will pass that back to R and you'll have a data frame of posterior samples 00:51:08 Dr Torri Callan that you can work with. 00:51:10 Dr Torri Callan And you can almost analyse it as you would any other data set. 00:51:13 Dr Genevieve Hayes That's interesting. And is there some sort of Python or R library that connects the two languages, like PyStan or something? 00:51:20 Dr Torri Callan Yeah, exactly. So PyStan and RStan will have connections to some stable version, and then there are 00:51:28 Dr Torri Callan packages called CmdStanPy and CmdStanR, and they will actually speak to the most recent release of the Stan language 00:51:36 Dr Torri Callan if you want to use more up-to-date features. And if you start looking into the language resources, there's documentation on basic model building and basic usage of the language, and if your 00:51:49 Dr Torri Callan preference is to learn 00:51:53 Dr Torri Callan the theory of things as you go, which is
00:51:56 Dr Torri Callan really mine at the moment, then that's actually a really nice way to start, because there's enough information there that you can pick up on the fundamental ideas of Bayesian models at the same time as learning how to write them in the language. 00:52:08 Dr Torri Callan And there are wrappers. Well, there's definitely a wrapper in R called brms, and that will allow you to write 00:52:15 Dr Torri Callan Bayesian models, but as a regression model, if you're more familiar with that syntax. 00:52:20 Dr Genevieve Hayes So just use a standard regression model syntax. 00:52:24 Dr Torri Callan Yeah, exactly. And it's highly flexible because it can import a few different functions from other packages that allow you to use splines or Gaussian processes if you want to do some sort of non-linear modelling. And I've used it before if you have 00:52:42 Dr Torri Callan an idea about a functional form that you want to write out yourself. I don't know the exact terminology, I call it a nonlinear, fully parameterized model, so something like a logistic curve, where 00:52:55 Dr Torri Callan you want to do a regression of a certain Y variable and you want to do it on this logistic curve of your X variable. 00:53:01 Dr Torri Callan You can just write that out in normal regression syntax and it will do all the estimation for you, and there's a nice set of functions 00:53:08 Dr Torri Callan that kind of 00:53:09 Dr Torri Callan wrap around the output you get from the model. That will allow you to conduct a lot of your inference pretty easily and 00:53:17 Dr Torri Callan quickly. I think there's an equivalent in Python, but I'm not 100% sure of the name. 00:53:23 Dr Genevieve Hayes I want to go off and have a go at this after this episode's finished. 00:53:26 Dr Torri Callan I highly recommend it. Like it will... Yeah, I think it's.
I think it forces you to be a little bit more considered and precise in how you do your modelling and, like I said, it opens you up to a lot more generative modelling, which I think... 00:53:43 Dr Torri Callan I know it's your background. I think if you've studied regression models and statistics for long enough, 00:53:51 Dr Torri Callan you start to see where typical regression models are a little bit of a 00:53:56 Dr Torri Callan Procrustean bed: you try and fit your data and your question into an answer of, does my regression model give me a significant coefficient, and if so, then my theory is correct. 00:54:07 Dr Torri Callan So this does kind of open you up to be a little bit more considered and perhaps creative in how you do your modelling, which I really like. 00:54:15 Dr Genevieve Hayes Is there anything on your radar in the AI, data and analytics space that you think is going to become important in the next three to five years? 00:54:23 Dr Torri Callan Building systems that can use data to give real, 00:54:26 Dr Torri Callan intelligent feedback, and having that really tightly coupled to a product, is going to be a 00:54:34 Dr Torri Callan real key feature 00:54:36 Dr Torri Callan of a lot of the AI and data space, and I think being able to do that 00:54:41 Dr Torri Callan en masse across a lot of different businesses will be valuable. Obviously the leaders like Google and Facebook and a few others have been able to do this pretty well. 00:54:52 Dr Torri Callan But for the teams that don't have a large analytics stack, it's been a little bit more difficult. I think the challenge most data scientists have faced 00:55:02 Dr Torri Callan is that across the business there's a number of business requirements that they need to address. So even though they're highly capable and technically gifted, you end up answering finance's questions on how much revenue we had last month and then.
00:55:20 Dr Torri Callan You want to answer marketing's questions of how many people did you sign up, and you have to crawl before you walk before you run. 00:55:27 Dr Torri Callan But what I 00:55:28 Dr Torri Callan forecast happening is that that will become easier and easier to solve because of the amount 00:55:33 Dr Torri Callan of tools available. 00:55:35 Dr Torri Callan And it kind of opens up room for the next stage of technical sophistication. 00:55:42 Dr Torri Callan And I think what that looks like is being able to come up with, not necessarily predictions from machine learning, but being able to come up with 00:55:53 Dr Torri Callan different sources of insights that can actually be utilised by a product in almost an operational manner. So, you know, we spoke about quite a few examples 00:56:02 Dr Torri Callan that we're starting to look at with UARE, but 00:56:06 Dr Torri Callan there are other cases you can think of for operational type requirements. I know churn modelling has been a really staple example for a lot of data science teams, but this is not necessarily just talking about prediction of who's likely to churn, but also giving a recommendation of where that should go. So 00:56:25 Dr Torri Callan if someone's likely to churn, you might want to place that person into whatever messaging service your customer service team is using, and actually build out that system end to end. Or you might want to have some 00:56:36 Dr Torri Callan notification go out in the app, or you might want to actually modify the UX and the UI of the app that you're working on, how that actually gets displayed, based on some prediction you generate. So it's almost about building an end to end system that uses the data you have available.
00:56:57 Dr Torri Callan And makes the prediction, which I think most people have their heads around, but then building a system around that to actually do something with it, because I think most of the data we have available is valuable only in as much as you can make decisions off the back of it. 00:57:17 Dr Genevieve Hayes And what final advice would you give to data scientists looking to create business value from data? 00:57:23 Dr Torri Callan I think being curious about what you're working on and trying to bring some level of passion to what you're doing. Almost, as a data scientist, you're a service for other people's issues, and taking on that service role, where you can solve a lot of people's problems with the tools and insights you have available, sets you up in a really nice place. As I'm speaking, I'm reflecting on this. 00:57:52 Dr Torri Callan I've had a few successes and quite a few failures at doing this, so I think the successes have been where I've been able to work with someone and find that particular pain point, or potentially an unknown unknown, in that they're unaware of what isn't working for them but you are aware of it, and being able to bring that to someone is really valuable. 00:58:18 Dr Torri Callan And I think, as I've gone on, it's not just about bringing in a set of insights and it's also not about bringing in a set of predictions. It's about working on what you can actually build and thinking about it, from a tech perspective, almost as a product manager, to say, well, if I were in charge of the product, this is what you could actually build, and bringing that to someone. 00:58:41 Dr Torri Callan I think the
00:58:43 Dr Torri Callan successes that I've had in my career have always come from being able to work really closely with a product manager or a product owner and being able to be very prescriptive in what can actually be built. 00:58:56 Dr Torri Callan So instead of just giving them a base level of insight or a base level of recommendations, give them the option of a few different things you can build. 00:59:06 Dr Torri Callan And I think if you were very engaged in doing that for someone, you'd want to be fairly passionate and fairly interested in the product and in being able to solve those problems for the business you're working in and for the people who are using it. 00:59:22 Dr Genevieve Hayes So it's a product-driven approach to problem solving. 00:59:26 Dr Torri Callan It's my bias, to be fair. I'm sure you'd get different answers from different people, but my perspective is that you have almost a change-management, operational approach to product development: you can look at a problem in a product or in an operational part of a business and make some really foundational changes to how that's done. 00:59:53 Dr Genevieve Hayes For listeners who want to learn more about you or get in contact, what can they do? 00:59:59 Dr Torri Callan LinkedIn is probably the best way. I'm not too big on the socials or blog writing, even though I've thought about doing it plenty of times. But yeah, LinkedIn: if you search my name, it'll come up. Otherwise, happy to share an email. 01:00:13 Dr Genevieve Hayes And I'll put a link to the UARE homepage in the show notes for this episode. 01:00:21 Dr Genevieve Hayes Hey, thank you for joining me today. 01:00:23 Dr Torri Callan Yeah, pleasure. Thanks for having me. 01:00:26 Dr Genevieve Hayes And for those in the audience, thank you for listening.
I'm Dr Genevieve Hayes, and this has been Value Driven Data Science, brought to you by Genevieve Hayes Consulting.