Hi. I'm David Keyes, and I run R for the Rest of Us. You may think of R as a tool for complex statistical analysis, but it's much more than that. From data visualization to efficient reporting to improving your workflow, R can do it all. On this podcast, I talk with people about how they use R in unique and creative ways.
David Keyes:Well, I'm delighted to be joined today by Nicola Rennie. Nicola is a lecturer in health data science based within the Centre for Health Informatics, Computing, and Statistics at Lancaster University in the UK. Her research interests include applications of statistics and machine learning to health care, communicating data through visualization, and understanding how we teach statistical concepts. Nicola also has experience in data science consultancy, collaborates closely with external research partners, and is coauthor of the Royal Statistical Society's best practices for data visualization guide. She can often be found at data science meetups, presenting at conferences, and is the R-Ladies Lancaster chapter organizer.
David Keyes:Nicola, welcome, and thanks for joining.
Nicola Rennie:Thank you very much for having me. I'm very happy to be here.
David Keyes:Well, we learned a little bit just now about kinda what you do now, and I wanna come back to that in a second. But I'm curious kinda how you got into R in the first place.
Nicola Rennie:So I think the first time I ever used R was about 9 or 10 years ago in an undergraduate statistics class, and I hated it so much. I'd never done any sort of programming before, and I very much felt like I was sort of thrown into the deep end of it, and it just didn't click at all. And I think, for the rest of my degree program, I actually mainly used Python before I finally came back to R maybe about, sort of, 6 years or so ago. And I think that was more when I was working with data, and I found the sort of data wrangling processes in R a little bit easier. And that was sort of during my PhD program, and I did my PhD in partnership with industry.
Nicola Rennie:And I actually saw people using R in industry, and it was sort of, this is how we actually do things in the real world with R, and it all just sort of started to click a little bit there. So I think, yeah, originally 9 or 10 years ago, but actually in practice about 6 years ago was probably when I started properly using R.
David Keyes:That's interesting because you hear so much, you know, especially as Posit, the company, has kinda made the shift into Python. You hear a lot about people going the opposite direction, but it's interesting that it sounds like you actually had worked with Python first, but then moved into R because you felt it was better suited to working with data. It's just a contrast to what you hear so much these days. I wonder if you could give me just kind of an overview of what your day to day use of R looks like. I know you do a bunch of things from teaching to data visualization, which we'll talk about in a bit.
David Keyes:But, yeah, on a day to day basis, what does your usage of R look like?
Nicola Rennie:So I'm a lecturer in health data science. So in terms of the sort of data processing side of it, you know, I'm doing stuff like processing and wrangling health related data. A lot of that comes with applying statistical models, some machine learning as well, and then things like making data visualizations for papers or reports or presentations. What I use R the most for at the moment is actually for developing teaching materials in combination with R Markdown or Quarto. So I make things like lecture slides, lecture notes, course websites, sort of simulated data for teaching as well.
Nicola Rennie:Because I work with health data, most of the time, you can't actually share that with students, so you have to sort of simulate data, and R is really helpful for that. The other thing I really like about using R for this is that I can make parameterized reports in Quarto. So I make parameterized tutorials for students. So I can make a version for students that doesn't have the answers in it, and then I can make a version for myself that has the full solutions in it as well. And it means I know that the code that I'm giving to students actually works as well when you're sort of making live documents every week, which is really nice.
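For readers following along in text, here is a minimal sketch of what simulating shareable teaching data in R might look like. It is not Nicola's actual code: the variable names, sample size, and effect sizes are all made up for illustration.

```r
# Hypothetical example: simulate a small patient data set for teaching.
# All names and numbers here are illustrative assumptions, not real data.
set.seed(2024)  # fixed seed so every student gets the same data

n_patients <- 200

patients <- data.frame(
  patient_id = seq_len(n_patients),
  age = round(rnorm(n_patients, mean = 55, sd = 12)),
  treatment = sample(c("control", "intervention"), n_patients, replace = TRUE)
)

# Blood pressure depends on age and treatment, plus random noise
patients$systolic_bp <- round(rnorm(
  n_patients,
  mean = 120 + 0.4 * (patients$age - 55) -
    5 * (patients$treatment == "intervention"),
  sd = 10
))

head(patients)
```

Because the relationships are built in by hand, the person teaching knows what a model fitted to the data should roughly recover, and nothing in the file comes from a real patient.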
Nicola Rennie:And then I actually...
David Keyes:Can I ask you about that?
Nicola Rennie:Yeah. Absolutely.
David Keyes:Sorry to interrupt you, but I'm very curious about that. So you make, say, a Quarto document, and I assume you put all the code in that has the solutions. And then the parameterization of it involves rendering a version where you, like, set echo equals false or something like that so that it doesn't show the code. Is that accurate?
Nicola Rennie:Yes. Pretty much. So I have a parameter that's, like, hide or show answers. And then I actually use conditional displays in Quarto. So you can sort of say, you know, if the parameter is true, then show this content or hide this content.
Nicola Rennie:Because that means that, rather than just using echo, I can also put sentences and sort of explanations for why I'm using code in the same sort of hidden or shown section.
David Keyes:Sure. That's really interesting. I mean, I've heard about parameterization. We do a lot of parameterized reporting, but never in that context. That's a fantastic use case that I never would have thought of. But I interrupted you.
David Keyes:You were talking about other things that you do with R on a daily basis, so I'll let you continue with that list.
Nicola Rennie:Yeah. So the other thing I do, that you kinda mentioned, is sort of data visualization. Some of that is mainly for fun and for hobbies, but other parts of it are sort of with the Royal Statistical Society's best practices for data visualization guidance. That is also all built in Quarto, and most of the examples in there are built with R as well. And part of that was developing R packages so that people who are writing for publications don't have to know all the ins and outs of R code to make their plots look a particular way, you know, building those helper functions so that people can implement the styling that we're asking them to as easily as possible.
David Keyes:You're clearly very into data visualization overall. I'm curious kinda where that interest has come from.
Nicola Rennie:So I think for me, at first glance, most people will probably see the sort of PhD in statistics and assume that because I work with data a lot, I've probably been making histograms and whatnot for a long time. But I think that's actually not really how I got into data visualization. So actually, I think at the time when I started working more on visualization, I was looking for more of a creative outlet. So if you go all the way back to high school, my favorite subjects were art and music, not math and computer science. So I kind of got to the point where, you know, I started doing a lot of statistics and a lot of data analysis, and I wanted a more creative outlet.
Nicola Rennie:But I also wanted to get better at programming. So I was looking for something that sort of satisfied both of those things where I could do something reasonably creative or artistic and sort of design focused, but also learn how to use R a little bit more in a way that wasn't just working with the same data that I was working with every single day. Because I felt like when I was working with R, you know, you see the same data set all the time. I know exactly how to process this, but it's sort of, well, if you throw me some other data that I've never seen before, how do I process that and find something in it? So it was a combination of those two things, I think, that actually got me properly into visualization.
David Keyes:Yeah. That makes a lot of sense. And I think, you know, you mentioned before that people assume it came from doing your PhD in statistics, but I think a lot of people kind of naively assume that people who are good at statistics are also good at data visualization, and those are 2 very, very different skill sets. So I've always been struck that you're, I don't know if I wanna say rare, but the relatively rare person who is able to do both of those things well. I don't know your work on the statistics side, but on the data viz side, you do really nice work there, which is unusual.
David Keyes:I'm curious, like, why do you use R? I mean, you mentioned before, for example, you came to R from Python. So what makes R well suited to doing the type of high quality data visualization that you like to do? And maybe if you want to even contrast it a bit with, like, Python. I don't do Python myself, but, you know, there's a lot of discussion of it.
David Keyes:I'm curious why you choose to do that work in R.
Nicola Rennie:So I think there's sort of 2 parts to this question. The first part in my head is why would you choose to use a programming language to make visualizations at all? You know, why would you choose to write code when you could do something that's more click and drag? If you're working in industry or you're working for a company, quite a lot of the time, you are making similar plots sort of every month or every year. And if you're using a programming language, then once you've made it once, you've essentially made it every time you might possibly need it.
Nicola Rennie:So it does sort of stop the repetitive elements of visualization. And I think if you're sort of more on the academic side, then it is just more reproducible as well. You know, when you come back to work 6 months later, you can rerun it without having to remember
David Keyes:Right.
Nicola Rennie:Exactly what you did. And then like you say, there are choices of programming languages. It's not necessarily default to R. And I think for me, there's a couple of reasons why I prefer R. One of them is just the wide range of packages that exist.
Nicola Rennie:Initially, when I started out in R, you know, you'd get maybe, like, 90% of the way into data visualization, and then there'd be the sort of final touches you wanted to make, sort of adding annotations or sort of logos or that kind of thing. And it always initially felt, okay, I might just have to edit this outside of R for a little bit. But I think in the last few years, those extra packages that help you get that last 10% have really come a long way. And now I can pretty much make my entire visualizations in R because of those extra packages.
Nicola Rennie:When I've tried to do the same thing in other languages, it's always sort of felt like sort of hacky solutions to try and get exactly what I wanted within a programming language. But within R, there's such a wide range of packages, and I think the community support as well for data visualization in R is really broad. There's a lot of people doing really interesting things. It's also very easy to sort of build your own styling packages. So, you know, if you do want to implement the same styling for every single plot you do, then you can make a package and it sort of adds a couple of lines to your code and it does it all for you.
Nicola Rennie:It also does interface with other languages or other formats reasonably well. So there will be some people that, no matter how nice your plot is, they still want it in an Excel spreadsheet or they still want some sort of data, and it's reasonably easy to export those into other formats as well. So I think it does sort of give you that combination of reproducibility and overall customizability with the option to put it into different formats.
David Keyes:Yeah. That's great. I'm curious. You talked about packages that you use for that kinda last 10%. Are there particular packages that you would highlight that have made it possible for you to do everything in R?
Nicola Rennie:So I think for me, my sort of favorite add on package is probably ggtext. So for working with text in R and in ggplot2, it's really nice because it does those sort of small things that I wanted to do sort of really easily. So things like automatically wrapping your text so it fits in the area of the plot rather than having to manually add line breaks to plots, doing things like sort of custom annotations with different colored text. You know, if you want to replace a legend with some colored text in a paragraph.
Nicola Rennie:Again, it makes it really easy, and you can add sort of HTML and markdown text to comments, and it just seems to work. So that's been really nice for doing even stuff like adding icons to plots because it processes HTML code into ggplot2. You can do that very quickly and easily without actually having to think about sort of icons or images or things like that.
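As a rough illustration of the two ggtext features described here (automatic text wrapping, and colored text standing in for a legend), a sketch along these lines should work. The data set, colors, and wording are assumptions chosen for the example; this is not code from the episode.

```r
library(ggplot2)
library(ggtext)

# Built-in mpg data, restricted to two drive types for a simple comparison
plot_data <- subset(mpg, drv %in% c("f", "4"))

p <- ggplot(plot_data, aes(x = displ, y = hwy, colour = drv)) +
  geom_point(show.legend = FALSE) +  # legend is replaced by coloured text
  scale_colour_manual(values = c("f" = "#1b9e77", "4" = "#d95f02")) +
  labs(
    title = "Fuel economy falls as engine size grows",
    # HTML spans and markdown bold are rendered by ggtext
    subtitle = paste0(
      "Highway miles per gallon for ",
      "<span style='color:#1b9e77;'>**front-wheel drive**</span> and ",
      "<span style='color:#d95f02;'>**four-wheel drive**</span> cars. ",
      "This subtitle wraps to the plot width without manual line breaks."
    ),
    x = "Engine displacement (litres)",
    y = "Highway miles per gallon"
  ) +
  theme_minimal() +
  theme(
    plot.title.position = "plot",
    # element_textbox_simple() wraps the text and parses markdown/HTML
    plot.subtitle = element_textbox_simple(margin = margin(t = 5, b = 10))
  )

p
```

The key piece is `element_textbox_simple()` in the theme: swapping it in for the default subtitle element is what enables both the wrapping and the inline styling.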
David Keyes:Yeah. That's great. So I'm curious. What do you recommend, you know, if someone is thinking, hey, like, I've seen the type of data viz that you do or other people do in R, and it looks really great, and I'd like to do that.
David Keyes:What do you recommend for someone who wants to get into making that type of data viz in R?
Nicola Rennie:So I think if you've never really done much data visualization in R, the easiest thing is just to start doing it. So that is essentially how I got started. So I sort of stumbled across TidyTuesday, where you get a new data set every single week. People make visualizations. You see what other people have found with the same data set, or you can see the code of how they made their visualizations.
Nicola Rennie:So it was a sort of, like, constant influx of new data and new ideas, which was really good. The other thing I think is quite useful is looking at things like news articles and thinking, okay, what are the technical aspects of how I actually remake the same plot in R, or are there better ways of visualizing that same data, and doing it that way, sort of doing a new and improved version of the news article visualizations that you see. And then in terms of sort of thinking about how you improve your data visualizations once you've gotten started a little bit, I think looking at what other people do with the same data is really useful. I know that I personally have a list of sort of ggplot packages and functions that I want to work through and a list of sort of different types of visualizations I'd like to work on. So what I tend to do when I first start looking at data is actually sit down with a pen and paper and sort of sketch out some ideas before you sort of jump into wrangling data and making plots to support the actual idea.
Nicola Rennie:What's the story you're trying to tell with the data, and thinking about it from that side.
David Keyes:Yeah. You know, that's interesting. I mean, so many people I've talked to who do really good data viz, when I ask them, okay, how do you make a data viz in R?
David Keyes:Their answers are actually way less about R than you would expect. They'll say things like, look at news articles. Or I was talking to, for my book, Georgios Karamanis, who recommended looking outside of any kind of data viz things. You know, he's also a photographer, and he talked about, like, looking in nature and seeing kind of color palettes and that kind of thing. And your, you know, example of not sitting down and getting into the code right away, but sketching things out with pen and paper, I think, is also a really good example of how the kind of final product in R, you know, comes after a whole process. It's not the first thing that you're doing, which is really interesting.
Nicola Rennie:Yeah. Definitely. I mean, R is sort of, I mean, it's a tool that you're using to build it. It's not necessarily the starting point. You know, it is essentially something that can draw lines and rectangles and circles.
Nicola Rennie:And you just start figuring out how do those shapes fit together on a page to tell a story.
David Keyes:Yeah. Which is interesting. You can correct me if my understanding of ggplot is wrong, but my understanding is ggplot was developed as a way to kind of, like, quickly, you know, make plots to do kind of exploratory data analysis. And, you know, obviously, it's been around long enough at this point that it's extremely mature and people now use it to make, you know, high quality production ready plots. So it's interesting to see that kind of trajectory over time.
David Keyes:Cool. Well, the specific topic that I brought you on to talk about was making data viz for mobile devices because I think, especially as more and more media and just anything is consumed on mobile devices, it's important to think about that. And you wrote a blog post that we'll link to in the show notes. But I was also hoping to have you kind of put your screen up and just walk through an example and give some tips because I think you have some really good thoughts on how you can make data viz that works well on mobile as well as desktop, of course.
David Keyes:Hey. David here. Just wanted to let you know that at this point in the conversation, we switched to a screencast. Now obviously, showing code doesn't work very well in an audio podcast, so if you wanna see the rest of this conversation, check out the video version of this podcast on YouTube. You can find a link to that in the show notes.
David Keyes:Great. Well, this is really helpful, Nicola. So thank you again for joining us, and thank you for talking about designing data viz on mobile.
Nicola Rennie:Thank you very much for having me. I really enjoyed chatting.
David Keyes:That's it for today's episode. I hope you learned something new about how you can use R. Do you know anyone else who might be interested in this episode? Please share it with them. If you're interested in learning R, check out R for the Rest of Us.
David Keyes:We've got courses to help you no matter whether you're just starting out with R or you've got years of experience. Do you work for an organization that needs help communicating effectively with data? Check out our consulting services at rfortherestofus.com/consulting. We work with clients to make high quality data visualization, beautiful reports made entirely with R, interactive maps, and much, much more. And before we go, one last request.
David Keyes:Do you know anyone who's using R in a unique and creative way? We're always looking for new guests for the R for the Rest of Us podcast. If you know someone who would be a good guest, please email me at david@rfortherestofus.com. Thanks for listening, and we'll see you next time.