Hi. I'm David Keyes, and I run R for the Rest of Us. You may think of R as a tool for complex statistical analysis, but it's much more than that. From data visualization to efficient reporting to improving your workflow, R can do it all. On this podcast, I talk with people about how they use R in unique and creative ways.
Speaker 1:I'm delighted to be joined today by Christine Parker. Christine is the solo data wrangler and map crafter for the Community Broadband Networks team at the Institute for Local Self-Reliance. Her work guides and supports the team's education and advocacy work related to Internet access and digital equity across the United States. Christine uses R regularly for data wrangling tasks and creating geographic datasets that she can visualize in maps. Christine, welcome, and thanks for joining me.
Speaker 2:Hi. Nice to be here. Thank you.
Speaker 1:So you and I met because we were in speaker training for the 2024 posit::conf. And your talk was particularly interesting to me because it was about how you used R as part of the development process for a map that showed access to the Affordable Connectivity Program. And we'll talk more about that in a minute. I'll just mention that it was a pandemic-era program to help people get access to affordable Internet, you know, at a time when, obviously, everybody was working and going to school from home. So before we get into that program and kind of how the R work that you did fit into it, I'm curious if you can just give me kind of a brief sketch of your background and specifically kind of how you got into R.
Speaker 2:Sure. So I went to grad school at the University of Illinois at Urbana-Champaign, and there I wound up getting both my master's and PhD in natural resource and environmental science. And I specifically focused on bird behavior, so I've had a major context change since then. But during my PhD, I really got into using R in more advanced ways. Initially, during my master's years, any kind of stats courses that were available were only teaching SAS, but there were other students that were using R, and it became like a sort of peer pressure, like, these are the cool kids using R kind of situation.
Speaker 2:Like, I could hear them commiserating about their struggles with learning and using R, and I was like, well, I wanna be a part of that club. It also seemed to make more sense because it's open source and free to use, and looking forward into, you know, career stuff, it made sense to work with something that I could use on my own and practice outside of academia. Yeah. And then, yeah, in my PhD work, I was studying wild turkey behavior, and we put little GPS backpacks on wild turkeys. And it collected, like, a bit less than research...
Speaker 1:Sorry. I'm just imagining little turkeys walking around with backpacks on. I'll let you continue, but is that actually what happened?
Speaker 2:Kind of. It's not like what you would consider a standard backpack, but it's like a little black sensor that's not bigger than a standard cell phone at this point in time. And it has, like, this elastic cording that straps over their wings like a backpack would on your arms.
Speaker 1:Okay.
Speaker 2:And it holds it on their back. And so that is communicating with satellites, and it collects location data every 2 hours, and then it's also continuously collecting movement information, and so it generates huge datasets. Yeah. So it became really useful for me to get into the, like, spatial data side of working with R. So yeah.
Speaker 1:Interesting. Okay. So sorry, I interrupted you. I was distracted by turkey backpacks, so please continue.
Speaker 2:So, yeah, I was influenced by these other students using R. And because there were no courses on it, it was largely self-taught and, like, meeting with other students when I found out, you know, this person was using this package, so they're the quasi expert around here. And then I also found out about the swirl package, which I found immensely helpful. It's like an interactive teaching package that you can get for R, and it walks you through, like, really basic stuff, but they have kind of more complicated advanced packages now, and they're really helpful. So I always recommend that for new folks, like when I have interns.
Speaker 2:And over time, it just progressed, and I learned new things, and here I am today.
Speaker 1:So how did you move then from studying wild turkeys to working at the Institute for Local Self-Reliance, doing Internet access work?
Speaker 2:So I mentioned that the turkeys had these GPS units on them, and it was a massive, like, spatial geographic dataset. And working with those kinds of data and creating maps was really interesting to me and a lot of fun. And when I finished my PhD and was, like, looking at the job market, I started to consider that focusing on that side of my skill set would probably be a better path forward career-wise than being kind of fixed in my, like, ecology context. And so I wasn't, like, closed off to ecology, but I really wanted to stay on this mapping, GIS, and spatial data wrangling track.
Speaker 1:So you moved to the Institute for Local Self-Reliance. Can you talk to me about, generally speaking, what your job there entails and how R fits into it?
Speaker 2:Sure. Like I mentioned, I'm in the Community Broadband Networks Initiative. And so we focus on advocating for locally owned and accountable Internet networks, and also advocating for affordable and reliable Internet access. Initially, when I started there, I was most frequently working with this massive dataset that's managed by the FCC, or the Federal Communications Commission, and this dataset is now known as the Broadband Data Collection. At the time when I started, the data were aggregated at a census block level, which is a really small geography that the Census Bureau uses, and R could handle that.
Speaker 2:I could pull in the whole dataset for the whole country. And, by the way, rewind. This dataset represents claims of where any Internet service provider could offer a particular kind of Internet service. And so there's all kinds of information about, like, the type of service and technology used, but they have to update this twice a year.
Speaker 2:At the time, it was, like, a 20 gig size dataset, and R could handle the whole thing. It would really bog down. Sure. But I was able to do everything I needed it to. And then a couple years ago, they made this wonderful change and converted the dataset into a location level dataset, which is great because it's much more granular, and you can really get some more nuanced insights about connectivity across the country.
Speaker 2:It's massive now, and R is like, no, no thank you. So I have been using R a little bit less lately, only because I do a lot of the bigger wrangling elsewhere. I have a PostgreSQL database that I kind of keep everything in and do some overarching, like, wrangling there and then bring it into R. And so I use R as, like, kind of the fine tuning and creating, like, tables and plots and things like that. After being at the Posit conference, I learned about DuckDB, and I'm really interested in using the wrapper that they have for R.
Speaker 2:Unfortunately, I guess they don't have a Windows binary yet, so it's not working for me yet. But I look forward to that day. Yeah. Yeah.
Speaker 1:Hi. David here. Did you know that R for the Rest of Us does consulting work? We help organizations to communicate more effectively and efficiently with beautiful parameterized reports, interactive websites, and custom R packages. Learn more about how we can help your organization at rfortherestofus.com/consulting. So one thing that you've worked on in your time at the Institute for Local Self-Reliance is this Affordable Connectivity Program. Can you describe its origins and kind of how you began to do work on that program?
Speaker 2:Yeah. So when the program came about, it had a bank account of, like, $14,000,000,000, but it had no timeline. Like, no one knew when it was meant to end. There were no, like, checkpoints along the way. We just knew there was this funding, and there were, like, eligibility characteristics, so a household could be eligible based on their income and a bunch of other ways.
Speaker 2:But there was no good information other than that, really. And I started to hear a lot of questions from my team, and then we started to get questions externally about the program. And because it's a federal program, there is a website where you can get access to all of the enrollment and claims data. So the providers would submit claims to the FCC based on, like, how many customers they have using this program. The FCC has a company that manages all this data and puts it on this website, and it is in many different formats, much to my dismay.
Speaker 2:And that's ultimately what gave me the idea to create a dashboard, because there were many different, like, geographies of data available, and there were different timelines for when these datasets were updated. And it was just really confusing to try and explain to anyone what any of this actually meant without summarizing it and creating some sort of a resource for folks to use that was more straightforward, because pointing folks to this, like, oh, you could look up your ZIP code in this CSV file, maybe, is not really particularly helpful. And the advocacy folks that we work with, especially those that are out in communities and helping people get signed up and dedicating their time and money to that kind of an effort, don't have the time to go and track down, like, what enrollment looks like in their city because it would just be a real hassle.
Speaker 1:And so the work that you did, was that about showing what programs are available for people who wanted to take advantage of them, or was it showing how many people are taking advantage of the programs, or possibly both?
Speaker 2:Yeah. So we wanted to highlight where enrollment was occurring and also, like, the actual rate. And the actual, like, eligibility was an important factor in all of this, and there ended up being kind of some, like, different variations of how you calculated that depending on who you talk to. And it became really important to be able to demonstrate, because you could look at enrollment across the country, but without knowing, like, how many folks were actually eligible, it's not a terribly helpful number.
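The point about raw enrollment being unhelpful without an eligibility denominator can be sketched in a few lines of base R. This is a minimal illustration, not the script discussed in the episode: the data frame `acp`, its column names, the ZIP codes, and every number in it are made up.

```r
# Hypothetical ZIP-level counts; every value here is invented for illustration.
acp <- data.frame(
  zip      = c("00001", "00002", "00003"),
  enrolled = c(1200, 300, 4500),
  eligible = c(4000, 2500, 5000),
  stringsAsFactors = FALSE
)

# Enrollment rate: the share of eligible households that have enrolled.
acp$enrollment_rate <- acp$enrolled / acp$eligible

# Sort ascending so the areas with the most room to grow come first.
acp <- acp[order(acp$enrollment_rate), ]
print(acp)
```

In this made-up table, ZIP 00003 has the most enrolled households, yet 00002 is the one that stands out once eligibility is taken into account, because only 12 percent of its eligible households are enrolled.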
Speaker 1:Were you the one, or did other folks at your organization have the idea to make this dashboard?
Speaker 2:Yeah. It was kind of a collaborative effort to go forward with that. And the dashboard, I think, had been kind of a tentative suggestion at the time because we really hadn't done anything quite like that yet. So it was exciting to see it actually blow up like this.
Speaker 1:Yeah. And, well, not to cut to the chase, but in the end, it got a ton of views, and, we'll come back to this, but you ended up speaking to folks at the White House about the work that you had done. So can you talk just at a high level about the role that R played? Because I know the dashboard itself, the presentation layer, was actually done with Tableau. But talk about the role that R played before getting to Tableau.
Speaker 2:Yeah. So R was really helpful in a couple of big ways. One was taking all these different data sources and combining them, cleaning them, rearranging them, and then ultimately summarizing them into the different values or elements that were displayed in that dashboard. And the other way was pulling data from the Census Bureau, because rather than, like, going to their website and then also navigating another federal website, there's a great package called tidycensus that allows you to get access to those data in a pretty straightforward and easy way. So those are kind of the big lifts that R helped me with on this project.
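As a rough picture of the combine, clean, and summarize workflow described here, the sketch below joins two small tables on a shared ZIP code column using only base R. It is an assumption-laden stand-in, not the project's code: both data frames, their column names, and all values are hypothetical. In a real workflow the household counts might come from the tidycensus package's `get_acs()` function, which needs a Census API key, so a hand-made table is used here to keep the example runnable.

```r
# Hypothetical enrollment records, one row per ZIP code and month.
enrollment <- data.frame(
  zip      = c("00001", "00001", "00002", "00002"),
  month    = c("2023-01", "2023-02", "2023-01", "2023-02"),
  enrolled = c(100, 120, 40, 55),
  stringsAsFactors = FALSE
)

# Hand-made stand-in for a Census table of households per ZIP code.
households <- data.frame(
  zip        = c("00001", "00002"),
  households = c(1000, 500),
  stringsAsFactors = FALSE
)

# Clean: keep only the most recent month for each ZIP code.
latest <- enrollment[enrollment$month == max(enrollment$month), ]

# Combine: join the two sources on the shared ZIP code column.
combined <- merge(latest, households, by = "zip")

# Summarize: compute the percentage a dashboard might display.
combined$pct_enrolled <- round(100 * combined$enrolled / combined$households, 1)
print(combined)
```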
Speaker 1:Yeah. And it seems like, you know, having looked at the code, which we'll have you show in just a moment, a big part of this was being able to combine multiple data sources, like you said, bringing in data from the Census Bureau alongside other data and having it all in one place. It seems like that was a major benefit of doing all the data wrangling in R.
Speaker 1:Hey. David here. Just wanted to let you know that at this point in the conversation, we switched to a screencast. Now obviously, showing code doesn't work very well in an audio podcast, so if you wanna see the rest of this conversation, check out the video version of this podcast on YouTube. You can find a link to that in the show notes.
Speaker 1:So what's the value in doing this in R versus, you know, I imagine an Excel user could look at that and be like, well, I could do that same thing in Excel. Why do it in R? What did that get you?
Speaker 2:I have seen people work with these data in Excel, and my justification for doing this in R, even though sometimes it would take me a little longer to, like, troubleshoot code and things like that, is that it's repeatable. And I'm not altering the data necessarily in any way. I'm just bringing them in, and then I edit the code in the script, but I'm not actually altering the numbers in the cells as I'm going along. And I'm not a proficient Excel user, so maybe there are more nuanced ways to go about it these days. But ultimately, it's much easier to repeat running the script than having to go in and manually type in and edit cell level things.
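The repeatability argument can be made concrete with a small base R sketch. It assumes a hypothetical function name, `summarize_enrollment()`, and a made-up raw file written to a temporary path so the example runs on its own; none of this is taken from the actual project.

```r
# Hypothetical summary step: read a raw file and return totals per state.
# The raw file is only read, never modified.
summarize_enrollment <- function(path) {
  raw <- read.csv(path, stringsAsFactors = FALSE)
  totals <- aggregate(enrolled ~ state, data = raw, FUN = sum)
  totals[order(totals$state), ]
}

# Write a small made-up raw file so the sketch is self-contained.
raw_path <- tempfile(fileext = ".csv")
write.csv(
  data.frame(state = c("IL", "IL", "MN"), enrolled = c(10, 15, 7)),
  raw_path,
  row.names = FALSE
)

first_run  <- summarize_enrollment(raw_path)
second_run <- summarize_enrollment(raw_path)

# Same input and same code give the same output on every run.
identical(first_run, second_run)
```

When the source publishes an updated file, the only thing that changes is the path passed to the function. The transformation logic stays in the script rather than in hand-edited cells.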
Speaker 1:Because you also did tell me that you had to rerun this regularly. Right? Like, it wasn't just a one time thing. Correct?
Speaker 2:Right. Yeah. So they were updating some datasets monthly. The national data were updated each week. So our kind of happy medium was to do it, like, twice a month.
Speaker 1:Okay. Yeah. And so if you're, you know, having to do this every other week, you don't wanna have to do that manually every time. Having your R code allows you to just rerun that and have it get the updated results. Because in the end, like, what you produced was data that was then used in the Tableau dashboard. Right? Like, that was the kind of end result of that giant R Markdown document that you showed. Is that right?
Speaker 2:Yeah. Yeah. At the end of the script, I think there's a Google Sheets package or something like that. And so I would just update that every time I did this, and that would be the workbook that we'd use to feed Tableau.
Speaker 1:Cool. And I think that's a really good example of how you don't have to use R for everything. And I think sometimes people get really wrapped up on, like, well, if I can't do, you know, every single step, then maybe I won't do it in R. But you, you know, tackled one piece of this, which was the data wrangling. And, you know, for the reasons that you just mentioned, found R to be really helpful for that piece of it, and then used Tableau for the actual presentation of the dashboard. So that makes a ton of sense. Talk to me about how popular this dashboard got and how you ended up speaking to folks at the White House.
Speaker 2:Yeah. I was not expecting it to get as popular as it was. I was still really new in my role when we were working on this initially, and so I had no idea, like, that that was possible, I guess. But it started getting really widely shared in our advocacy community. So we were sharing it on our podcast, talking about it a lot.
Speaker 2:We handed out, like, little postcards at a national conference that we went to, and when we were there, a lot of people were like, oh, you guys made the dashboard. And I was like, oh, wow. What's happening? I didn't recognize that this would be something to be known for, I guess. But yeah.
Speaker 2:And people were, like, thanking us, and it was really wild, and it was really nice to, like, hear that it was so useful for people. Because like I mentioned, the folks that were trying to figure out where in their community they need to go focus their efforts to get people enrolled, they could use this to look at the ZIP code level and say, this area has really low enrollment relative to what it could be, and so we should figure out what we can do there to get people enrolled. And, you know, at the end of the day, it really felt good to be able to help with something like that. And then at the policy level, it started getting shared with, like, different offices in Congress. And then once folks finally started to push for the program to be re-funded, there were, like, letters going amongst congressmen, and this was getting cited in there.
Speaker 2:And it was in a senate committee hearing. And so, yeah, it was it was wild. Very exciting
Speaker 1:Yeah.
Speaker 2:To have something like this.
Speaker 1:I know when they were working on trying to get additional funding for this program, folks at the White House actually contacted you to to speak to them. Can you talk about what that was?
Speaker 2:Yeah. So when there was a need for, like, an actual number to pitch to Congress to re-fund this program, numbers started floating around, and the White House finally came out with a number of $6,000,000,000, and we had predicted it would actually take around $7,000,000,000 at the current enrollment rate. And I had a friend at the ACLU who's like, well, I'm telling them this isn't enough, but they are going to better listen to someone who's actually working with these data. And so would you be willing to talk with them and explain your numbers to them? And sure enough, that day, I was on a call with them, and they did not listen.
Speaker 2:But, I mean, they listened, but, you know, did they take my advice? No.
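To show how a funding estimate of this general kind can be put together, here is a back-of-the-envelope sketch in base R. It is not the guest's model, which is not described in the episode. The function `project_cost()` is hypothetical, and all three inputs (household count, monthly benefit, and number of months) are illustrative assumptions chosen only to show the arithmetic.

```r
# Projected cost = households x monthly benefit x number of months.
# A flat-enrollment simplification; a fuller model would let enrollment grow.
project_cost <- function(households, monthly_benefit, months) {
  households * monthly_benefit * months
}

# All three inputs are illustrative assumptions, not figures from the episode.
cost <- project_cost(households = 20e6, monthly_benefit = 30, months = 12)

# Express the result in billions of dollars.
cost / 1e9
```

With these assumed inputs the result is 7.2, meaning 7.2 billion dollars for one year. Even small changes to the household count or the time horizon move the total by hundreds of millions, which is why the enrollment data behind such an estimate matters.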
Speaker 1:Well, you can only do so much. Right?
Speaker 2:Right. You know. Yes.
Speaker 1:Well, that's exciting. I mean, even if they didn't take your advice in the end, just, you know, seeing the impact that it had both in the advocacy community as well as, you know, ultimately getting the attention of folks in the White House is impressive. So I'm curious, you know, now when you reflect on this, what would you say you learned through the process of creating this Affordable Connectivity Program dashboard?
Speaker 2:Well, you kind of already touched on this, but I'll just reiterate that it's okay not to do things perfectly. Like, I will jokingly, like, call out a lot of things that are wrong with this script and how I use GitHub. Like, this is available on GitHub for folks that are interested. As someone who isn't using R in, like, the most perfect way, I feel allergic to functions and other things. Like, I'm still learning, and I want folks who are also still in that learning phase to feel comfortable to, like, put their work out there even if it's not perfect.
Speaker 2:Because if you don't actually do it, it's never going to get done whether or not it's perfect. So do it imperfectly. Do it scared, and you might be really successful at it in the end. And then, you know, just also, like, we used a variety of tools because that's what we had available to us. It also allowed me not to work alone on this.
Speaker 2:I had, you know, my team members that were able to help with certain aspects of this, like designing the Tableau interface. I didn't do that part, and so that allowed us to work together, whereas if I had only done this in R, it would have been all on me. And who knows if this would actually have gone anywhere, because I would have had to learn how to make a dashboard in R.
Speaker 1:Yeah.
Speaker 2:It was daunting. Yeah.
Speaker 1:Yeah. Well, I think that's a really good place to leave it. So, Christine, thank you again for joining me and sharing about your work on this dashboard. It's really inspiring.
Speaker 2:Yeah. Happy to be here. Thank you for the invite.
Speaker 1:That's it for today's episode. I hope you learned something new about how you can use R. Do you know anyone else who might be interested in this episode? Please share it with them. If you're interested in learning R, check out R for the Rest of Us.
Speaker 1:We've got courses to help you no matter whether you're just starting out with R or you've got years of experience. Do you work for an organization that needs help communicating effectively with data? Check out our consulting services at rfortherestofus.com/consulting. We work with clients to make high quality data visualization, beautiful reports made entirely with R, interactive maps, and much, much more. And before we go, one last request.
Speaker 1:Do you know anyone who's using R in a unique and creative way? We're always looking for new guests for the R for the Rest of Us podcast. If you know someone who would be a good guest, please email me at david@rfortherestofus.com. Thanks for listening, and we'll see you next time.