No Compromises

You've inherited a legacy app. It's "running fine" in production, but when you add an error reporting tool, you see hundreds of errors, warnings, and notices logged each day. What do you do?

Check out our books and courses at masteringlaravel.io

Creators & Guests

Host
Aaron Saray
Host
Joel Clermont

What is No Compromises?

Two seasoned salty programming veterans talk best practices based on years of working with Laravel SaaS teams.

Joel Clermont (00:00):
Welcome to No Compromises, a peek into the mind of two old web devs who have seen some things. This is Joel.

Aaron Saray (00:08):
And this is Aaron. One of the things that we're, we'll call it, blessed with sometimes is legacy code and projects. We kind of talked about this before, like what are some of the steps you get started with when taking over a legacy project. Question I had for you, Joel, is you know I'm a huge fan of bug reporting but how do you get started with bug reporting when you take over a legacy project and everything just kind of appears to be broken but mysteriously working?

Joel Clermont (00:43):
Ah, boy. I have a particular project in mind but no names will be named. Yeah, because let's just focus on the PHP error reporting and not even get into the JavaScript yet. But, yeah, we like to use BugSnag, so you wire that up and it's wired up in sort of a global way where any unhandled exceptions get reported up to BugSnag. And, wow, the first time you turn it on and it's in production, or even you're like running it in staging or something, just clicking around and you're seeing hundreds of errors pop through in just normal requests, and it's completely overwhelming. So, yeah, how do we get started? Well, first you install it, but then it's sort of like divide and conquer and figure out what's noise and what's actually something that requires attention.

Aaron Saray (01:31):
Well, when you say what's noise and what requires attention, don't all bugs require attention?

Joel Clermont (01:35):
They do, Aaron, they most certainly do. But more in terms of what requires immediate attention. Because to me, let's see, I would kind of think of two specific things that would make something rise to the level of requiring immediate attention. Number one, is this a user-facing error? Like, for sure if that's happening, if somebody's clicking and it's not doing what they expect it to do, or even worse they're seeing some sort of ugly error message on the screen or in the console, to me that would require more attention than some warning on the backend that's saying, "Oh, this might not be doing what you would expect it's doing." The other thing is, even if it's not user-facing, if something isn't getting written to the database or is just behaving incorrectly, well, that would certainly require attention too. And that's a little harder to sort out when you have those 500 error messages in front of you, but it does have levels: warning, notice, info, exceptions. So that's kind of a way I sort of triage it too, is by starting at the most severe and working your way down from there.
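The triage order Joel describes (user-facing or data-affecting problems first, then most severe to least) could be sketched roughly as follows. This is a minimal Python illustration, not BugSnag's API: the `ErrorReport` class, its fields, and the `SEVERITY_RANK` ordering are all assumptions made up for the example.

```python
from dataclasses import dataclass

# Hypothetical ranking: a lower number means "look at this sooner".
SEVERITY_RANK = {"exception": 0, "warning": 1, "notice": 2, "info": 3}


@dataclass
class ErrorReport:
    """Illustrative stand-in for one reported error (not a real BugSnag type)."""
    message: str
    severity: str
    user_facing: bool = False
    affects_data: bool = False


def triage(reports):
    """Order reports so urgent ones come first, then by severity."""
    def sort_key(report):
        urgent = report.user_facing or report.affects_data
        # Unknown severities sort after all known ones.
        rank = SEVERITY_RANK.get(report.severity, len(SEVERITY_RANK))
        return (0 if urgent else 1, rank)

    return sorted(reports, key=sort_key)
```

In practice the `user_facing` and `affects_data` flags would come from reading the error and, as Aaron notes next, from asking the stakeholder what matters most.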

Aaron Saray (02:42):
I think one of the things is if you're taking over a legacy project, you may not know what's most important about that project's success either.

Joel Clermont (02:48):
Sure, yeah.

Aaron Saray (02:48):
So one of the things I've done is you take your list and you have a general idea that like, "Fatal errors are things I want to fix." But then you have all these different warnings and things like that. I look at that and then I'll kind of get an idea of what that might be affecting and then talk to the stakeholder and say, "Here are some things that may be going wrong, which one of these are most important so I know which ones to kind of tackle first."

Joel Clermont (03:11):
Sure. Yeah, that's important context to have.

Aaron Saray (03:14):
I think that can be difficult because a stakeholder can be like, "Well, everything's working fine." And then that can kind of lead you to the conversation, "While it appears it's working fine, this is why we have maintenance." Your car kind of appears to be working fine until the engine seizes up. If you're not changing your oil, if you're not a mechanic, you don't know things are going wrong until that last minute when everything kind of goes wrong. So you kind of mentioned you get your bug reporting in there, you do a form of triage, and then you mentioned there's different levels. Do you get to that point where you try to fix all of the errors? Or do you get to a point where you look at certain errors and you just say, "I'm going to mute those"?

Joel Clermont (03:58):
I feel like this question is a trap because I definitely have muted them but I know it's not good. One of the things I like about BugSnag is you can mute for a period of time. Because it can just be honestly just noise and annoying to see a bunch of warnings and notices come in that you know you're not going to do anything about at least in the next two weeks. But generally speaking, I'd prefer not to have to do that. And maybe one comment specifically about triaging, it's not just like, "Okay, have BugSnag open in one window and just start hammering out code in the other window." Like, a lot of times we will open a ticket specifically saying fix this error or a grouping of these errors. Because sometimes you fix one and like four go away, like it was one root cause leading to all of them. We won't necessarily, in the case of 300 different errors, we won't open 300 issues but we might have an issue that we're going to tackle that says, "Fix these 5 unhandled exceptions," or, "Fix the top 10 most common occurring warnings," or things like that. Just so it's tracked and it's a little less flying by the seat of your pants trying to put a fire out. Because in reality, those errors have been there for years, like if we're inheriting the code base it's not something you have to drop everything to do. It still fits into our normal scheduled work routine.
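Picking "the top 10 most common occurring warnings" for a ticket amounts to counting occurrences per distinct error. A rough Python sketch, assuming each occurrence has already been exported as an `(error_class, location)` pair; that shape and the `top_errors` name are hypothetical, not something BugSnag provides:

```python
from collections import Counter


def top_errors(occurrences, n=10):
    """Return the n most frequent errors as (error, count) pairs.

    `occurrences` is assumed to be an iterable of (error_class, location)
    tuples, one per time the error was reported.
    """
    return Counter(occurrences).most_common(n)
```

Grouping by class and location is only an approximation of "one root cause"; several distinct entries may still disappear together once the underlying problem is fixed.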

Aaron Saray (05:21):
Yeah. I don't know what kind of fantasy land you work in where you fix one error and four go away. My experience with legacy code has been I fix the one error and then I push that out to production, and then it turns out I get three more errors further down the script. I'm like, "Oh, how would I have known that?" So I don't trust you now.

Joel Clermont (05:39):
No, it does happen, Aaron. It definitely happens. But you're right and that's actually bad when we fix something that we saw in BugSnag, like we know it's a bug, but the client doesn't think anything's wrong. And then yet when we fix it, now they see something not working. That's horrible. Like, try to avoid that. But it is also a good reason why you want to communicate why you're doing this and when you're doing this with the client so it's not just blindsiding them. Like, "Oh, you're just in there messing around and you broke something." It's like, "No, we're doing this important work and it is possible we might break something because this thing is so mangled and uncontrolled right now." Sometimes when you get it into shape, you do actually cause some issues along the way but we do try to minimize that.

Aaron Saray (06:22):
So I'm just following through the steps of this legacy app. You put it in bug reporting like BugSnag, you triage the issues, you start fixing some of them, you mute some of them until you have a chance to take care of them. Are you ever getting to a point where you've just knocked out all the errors and there's no more errors in BugSnag?

Joel Clermont (06:40):
Yes. I mean-

Aaron Saray (06:42):
So that's the goal?

Joel Clermont (06:42):
Yeah, I think so. I think that's a reasonable place to be. Because if you sort of accept the fact that there's always going to be errors then at a certain point you stop noticing anything. Now, I'm not saying you can get from 300 to zero in the first two weeks of work. It may be kind of a long-term goal, you have to fit it in with the priorities. But, yeah, to me a greenfield project that's the approach we take, right? If any error is reported, we're going to go look at it, we're going to fix it, even if it is a notice or just a warning that's not actually causing a problem. So I think, yeah, a legacy project you want to get it to the same state.

Aaron Saray (07:16):
Cool. Yeah, I agree with that because I think bugs and errors erode trust. And really as programmers that's one of our number one commodities actually is the trust in the application that we build. Not only that our client trusts that we're building stuff, but then the user of the software can trust that the software's going to work properly as well.

Joel Clermont (07:35):
Well, and actually just kind of a little bit of a tangent but related to trust and bug tracking. The nice thing about a tool like BugSnag is it not only tells you an error happened, but if the user was logged in, it captures, you know, what's their user ID and what's their email address. You know, some basic things like that. So I've had circumstances where you're kind of new to a legacy project and you see those errors come in and you see it's the client or one of their kind of main users of the system. And you can reach out to them, "Hey, I just saw you got this error. Could you give me a little detail about what happened? Did you see an error on the screen or?" And it's like a magic power. They're like, "What? How did you know? I get that error every Tuesday when I do this, but I just kind of like... I have a way of working around it." So it actually really builds trust having some visibility and reaching out to the person and asking them for more information. Or even saying, "Hey, I saw you got this error, just so you know I fixed that now. You shouldn't run into that again."

Aaron Saray (08:34):
Cool. I think the last thing I kind of wanted to mention about using something like BugSnag, and this is kind of a last resort if you don't want to mute things or you have specific errors that you know aren't supposed to be errors, is they give you the ability to kind of intercept that before it goes to BugSnag. So one of the issues I've noticed, especially when I'm upgrading certain libraries, third parties will throw an exception that's not caught and it's actually perfectly fine. I don't want that as a bug, you know?

Joel Clermont (09:07):
Yep.

Aaron Saray (09:07):
This job has failed and that's perfectly fine that the job has failed, please don't report it each time. So you can write things in your bootstrapping code into the BugSnag handler to say if it's this particular type of exception or remove this information for like security or add additional context and information as well. I mean, that's another thing too. You talked about tracking the user. If it's a new to us legacy app, we might have maybe more stuff in the session that we want to track and we just kind of put it into the error reporting as they happen. Because I would like to know what... I can't think of a particular example. But, like what group membership they had or what step along the progress were they, if the URL doesn't properly say it or, you know things like that. You can also add that in.

Joel Clermont (09:56):
Yeah. The one example I thought of that where we did this recently was like a PDF rendering library that I think hadn't been updated since PHP 5 days. So it was throwing a bunch of warnings or... I can't remember if they were warnings or notices. But it was like, "We're never going to fix this, we're eventually just going to replace this library." We could have muted it but I think they were all different enough because of the dynamic nature of the PDF generation that muting it on the BugSnag side didn't work but intercepting it on the client side or on the server side did the trick. I think that's a great tip.
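The interception idea from the last few turns (skip certain exception types, strip sensitive data, attach extra session context before the report is sent) might look something like the sketch below. It is written in plain Python and deliberately does not call any real BugSnag function; `before_report`, `ExpectedJobFailure`, `SENSITIVE_KEYS`, and the `group`/`step` session keys are all hypothetical names chosen for illustration.

```python
class ExpectedJobFailure(Exception):
    """Hypothetical exception type that is fine to see and should not be reported."""


# Assumed configuration, for illustration only.
IGNORED_EXCEPTIONS = (ExpectedJobFailure,)
SENSITIVE_KEYS = {"password", "token"}


def before_report(exc, metadata, session):
    """Decide whether and how an exception gets reported.

    Returns None to drop the report, otherwise the metadata to send.
    """
    # 1. Known-harmless exception types are dropped entirely.
    if isinstance(exc, IGNORED_EXCEPTIONS):
        return None

    # 2. Remove anything that should not leave the server.
    cleaned = {k: v for k, v in metadata.items() if k not in SENSITIVE_KEYS}

    # 3. Add context from the session that the URL alone does not show.
    cleaned["group"] = session.get("group")
    cleaned["step"] = session.get("step")
    return cleaned
```

A real integration would register a function like this with the error reporter's own callback hook in the application's bootstrapping code; the exact hook depends on the client library in use.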

Aaron Saray (10:36):
You notice how we're sort of like conditioned that certain things in context are weird but then in a different context they're perfectly fine? Or, things like that.

Joel Clermont (10:47):
I'm going to need an example, Aaron.

Aaron Saray (10:49):
Okay. My example is, you see a guy walking through your neighborhood with a crowbar. Like, okay, I think that's kind of weird. That's a person that might want to break in or something. But you see the same guy walking through your neighborhood with hands full of crowbars, like three in each side, you're a lot less like, "Oh, well, that's just odd." So a person with one crowbar is an attacker, with six, "Oh." Does not compute, I just don't get that.

Joel Clermont (11:20):
Does the threat diminish because there's no possible way they could wield all six at once? You're like, "This guy's less of a threat"?

Aaron Saray (11:28):
No, I just think it... Imagine you have just a plain clothes person going through the yards and looking at people's windows or something. You're like, "Oh, they're a prowler." But put on a high vis vest and you're like, "Oh, then they must be supposed to be there."

Joel Clermont (11:43):
They work for the utility company.

Aaron Saray (11:44):
Yeah. But it doesn't say that, there's no utility trucks around. It's like if I wanted to break into someone's house, I would just definitely put on high vis vests and stuff and carry your own six crowbars.

Joel Clermont (11:54):
You know, the other magic thing you can hold that makes you look official is a clipboard, right?

Aaron Saray (11:59):
A clipboard, yeah.

Joel Clermont (12:01):
I had a friend who liked exploring buildings in the city and he could get so far into a building with just a clipboard. It was kind of amazing. No harm was done but he wanted to be at the top floor to look out of the window or something. But, yeah, high vis vest and a clipboard. I don't know about the crowbars though, Aaron. I think a guy walking with six crowbars would still get like a second look.

Aaron Saray (12:24):
Yeah. I had to participate in a security attempt one time at a place I was working. And one of the things I did to get into the back door where there were smokers is I walked up with three cases of water. I could barely hold them and I'm like stumbling through the door. I'm like, "Hey guys, can you get the door?" And they're like, "Yep." It was all security badge stuff, but I just had three cases of water. And they're like, "Oh, well, no one would try to break into here with three cases of water." Guess what? A thirsty person would.

Joel Clermont (12:52):
That's right, they would. They're going to be sweating.

Aaron Saray (12:58):
What if you wanted to learn from us but you just didn't want to hear our voice anymore?

Joel Clermont (13:03):
Well, then you can hear it in your own voice when you're reading one of our books, which you can get at masteringlaravel.io.