Episode #84: Serverless Compute at the Edge with Tyler McMullen



On this episode, Jeremy chats with Tyler McMullen from Fastly about the future of compute at the edge, what it means for things like data replication, security, and observability, what are the current limitations, whether it's competitive or complementary to public clouds, and much more.

Show Notes

About Tyler McMullen

Tyler McMullen is CTO at Fastly, a global edge cloud platform, where he is responsible for evolving the system architecture and the company’s technology vision. He leads a team of experienced technology innovators focused on internet scale, and working on future-facing, ambitious projects and standards. As part of the founding team at Fastly, Tyler built the first versions of Fastly’s Instant Purging system, API, and Real-time Analytics. Prior to joining Fastly, Tyler worked on large scale web applications, text analysis, and performance. He can be found debating about edge computing, networking, and distributed systems all over the world.

Fastly: fastly.com
Email: Tyler@Fastly.com

Watch this episode on YouTube: https://youtu.be/3F5COSkQlf0

Transcript

Jeremy: Hi, everyone! I'm Jeremy Daly and this is Serverless Chats. Today, I'm chatting with Tyler McMullen. Hey, Tyler. Thanks for joining me.

Tyler: Hey, Jeremy. Nice to see you.

Jeremy: So, you are the CTO at Fastly. I'd love to know a little bit about your background, and what Fastly does.

Tyler: I'll start with what Fastly does. Fastly is an edge cloud platform. What that ends up meaning is that we help people to move their content, as well as their logic, their actual programs, out to run on the edge of the network. The whole goal of that is to make things much faster for your users, better user experience, as well as much more resilient.

It's actually a super exciting place to be, in my opinion. We founded Fastly, oh, wow, 10 years ago now, maybe more. I can't remember off the top of my head now, but it's been a while. I remember getting into it specifically because Artur, who was our CEO and our primary founder, came to me and he was like, "I have this idea. It's a content delivery network, but it's more like an edge computing network." I was working at a startup at the time. I said, "That sounds extremely exciting." As a distributed systems nerd, that was just, oh, man! It's catnip to me.

Jeremy: Right.

Tyler: So, for the last 10 years it's continued to be exciting. That's how it got started there.

Jeremy: Awesome. What about your background?

Tyler: My background is, I was just a kid who taught myself to program, and got started working when I was about 16 years old, and just never stopped. I skipped the whole college thing and hopped from startup to tech company to startup.

Jeremy: Awesome. So, I'm a huge fan of serverless. Again, I do a serverless podcast, so it's probably quite obvious to people. But one of the things that I am absolutely fascinated with is the idea of serverless computing at the edge, which is one of these things that Fastly is doing. I think that there's a possibility that this could be the future of serverless computing. No more data centers, or things like that, or regions. It's just right at the edge, and as close as possible, that we could get to the user that is actually interacting with this stuff. So obviously, a huge challenge, lots of things that need to be done to make that happen. But I think what would be great for the listeners is if we just take a step back and explain exactly what we mean by compute at the edge.

Tyler: Sure, sure. It's actually a great question, because this is something that keeps coming up. For years, I have been trying to explain exactly what is edge computing. The problem is that everybody has a different opinion as to what exactly it means. I think that the ultimate problem is that depending on who you talk to, that person is familiar with or working on one particular line. One particular edge, effectively, of that network.

So, if you're talking to someone who works at a telecom, they're going to talk about 5G, and how it needs servers inside of cell towers, effectively. Meanwhile, you talk to a traditional ops person, talk to an ops person from the '90s. The way that they think about the edge is actually the edge of their own network. It's kind of the border between their autonomous system and the rest of the network, the rest of the internet. You talk to me, we're going to talk about metro area data centers, as well as even more narrow ones.

Anyway, the list goes on and on. So to me, I think it's actually kind of, the problem is, in my opinion, in the word. The problem is the word "edge," because it implies a line. It implies a specific point within the network, and I don't think that's actually true. Because if you think about all of these different places that we're talking about having computation, they all have really important similarities in their models. The point is that it's not the client. It's not actually the person that you're interacting with. It's also not within your own specific data center. It's not within your core computing.

Everything in between there has a certain set of problems. It means that you don't necessarily have direct access to a database. It means that you probably have to think about doing things in a little bit more of a stateless way. It means that you need to think about doing things at high performance. So, I think that when we talk about edge computing, what we're really talking about is computing in the middle. It's between your data center and your actual client.

Jeremy: Yeah. I think about it a lot. I try to look at it like a CDN. I think of something like a Cloudflare, or even CloudFront with AWS, where they have all these points of presence all over the world. Generally, even Akamai, and some of these other ones that have been around for a really long time, thinking about, you store some sort of static asset somewhere at the edge. It's a .pdf that people can download, or it's an image that loads faster, or what's been really cool happening now is a lot of the stuff with Jamstack, where they're putting HTML, pre-rendered HTML pages on the edge. So, things are just loading insanely fast.

But the idea of finding somewhere to do compute, where actually you can run some sort of business logic. That business logic might be as simple as saying, "Do I route them to the login page, or do I route them to a sign in page?" Or whatever it is, I route them somewhere differently. But the logic could be much more complex, as well. That's what's interesting to me is, if you think about it as a CDN, but with compute, then that unlocks a lot of really powerful use cases.
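As an illustration of the simple routing decision Jeremy describes, here is a minimal Python sketch. The `Request` type, the `/account` rule, and the `session` cookie name are all hypothetical; this is not Fastly's or any CDN's API, just the shape of the logic that could run at a point of presence without calling the origin.

```python
from dataclasses import dataclass, field


@dataclass
class Request:
    """Hypothetical, stripped-down view of an incoming request."""
    path: str
    cookies: dict = field(default_factory=dict)


def route(request: Request) -> str:
    """Pick a destination path at the edge, without contacting the origin."""
    # Assumed rule for illustration: /account pages need a session cookie.
    if request.path.startswith("/account") and "session" not in request.cookies:
        return "/login"
    return request.path


print(route(Request("/account/settings")))
print(route(Request("/account/settings", {"session": "abc"})))
```

The first call is sent to `/login`, the second passes through unchanged.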

So, I'm just curious where you see edge computing, maybe a mixture of what we just talked about, some sort of hybrid of the definition, where you see edge computing integrating with what you think of as the traditional CDN.

Tyler: Oh. That's not where I thought you were going with that question. That's really cool. No, this is great. I think it's the mirror of it. You talk about a CDN, you're talking about moving the content. Now we're talking about moving the logic that generates the content. So, the integration there I think is actually going to end up being, for a lot of folks, super tight. It's actually, in my opinion, going to be pretty hard to have a proper, widely used, edge compute network without actually having a CDN attached to it.

I think there's a bunch of different reasons for that. One of them is that almost by its own definition, you're going to end up running the same code repeatedly. If we're talking about HTTP, like a website of some kind, or an API of some kind, you're going to be loading the same things repeatedly. Realistically, that's how the internet works. There tends to be a tail, a spike and a tail for how content is accessed on the internet.

When we're talking about putting servers out at the edges of the network, we're almost certainly talking about a limited resource of some kind. If you're talking about, say big data like machine learning, where you need a large amount of compute power to do it. You're not doing that at the edge of the network. You're not learning, you're not doing training models at the edge of the network.

The reason for that is because it's a lot more expensive to have servers in downtown Tokyo than it is to have them in the middle of the desert in Utah, for instance. So, coming back to it, ultimately, you're going to need to be doing quite a bit of caching. You're going to need to store data so you're not having to repeat the same things over, and over, and over again. I think to me, that's one of the key reasons why the two are almost inseparable, in my opinion.
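Tyler's point that edge compute needs caching next to it can be sketched as a small time-to-live cache in front of an expensive origin call. Everything here (`EdgeCache`, `fetch_origin`, the 60-second TTL) is an assumption made for illustration, not a description of how Fastly's cache works.

```python
import time
from typing import Callable, Dict, Tuple


class EdgeCache:
    """Tiny TTL cache standing in for the cache at one point of presence."""

    def __init__(self, fetch_origin: Callable[[str], str],
                 ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self._fetch = fetch_origin
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, str]] = {}  # key -> (expiry, value)
        self.origin_calls = 0

    def get(self, key: str) -> str:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: no trip back to the origin
        self.origin_calls += 1
        value = self._fetch(key)
        self._entries[key] = (now + self._ttl, value)
        return value


cache = EdgeCache(lambda key: f"rendered:{key}", ttl_seconds=60)
cache.get("/home")
cache.get("/home")
print(cache.origin_calls)  # 1: the second request was served from the cache
```

Because popular content follows the "spike and a tail" pattern he mentions, even a small cache like this absorbs most of the repeated work.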

Jeremy: Right. Yeah. I like the idea of, again, the caching aspect of it, of being able to cache those static assets, whether they're HTML. With compute added to it, there's a lot that you could do to those static files that were cached, where you wouldn't need to make those home runs, and you wouldn't need to do that. You could use things that were local to that particular CDN, or that particular POP.

Anyway, I find that fascinating. But I think there are a lot of different use cases, and I'd be really interested to hear from you. What are some of the use cases that you see people doing with compute at the edge? Maybe what are some of the ones that will eventually open up?

Tyler: Yeah, yeah. I think this is similar to any other new technology that comes out. You're going to have the initial use cases, which we're going to think are really cool. Then eventually, in a couple years, you're going to get the ones that are actually the real killer use cases that we didn't even think of yet.

So, a lot of the initial ones are really simple. They're simple things that make a big difference in end user perception of performance. For instance, instead of having to go all the way back to your centralized data center for every piece of data, what if I actually have 90% of that data, because it's static data, that's already sitting at the edge of the network. Now I just have to go grab that 10%. Or maybe I can feed you some of that while gathering the remaining stuff.

A lot of people think about, I'm thinking about how to put this. A better way to put this is, imagine running a GraphQL server that runs at the edge. You get one request, which actually fans out to multiple different requests. Most of them are already cached, so you're dealing with a much smaller amount of latency, a much smaller amount of variability in latency, I think most particularly.
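The fan-out Tyler describes can be sketched roughly as follows. This is not a GraphQL server; `resolve`, the field names, and `fetch_origin` are hypothetical stand-ins that only show the idea of answering most fields from the local cache and batching the misses into one origin call.

```python
from typing import Callable, Dict, List, Tuple


def resolve(fields: List[str],
            cache: Dict[str, str],
            fetch_origin: Callable[[List[str]], Dict[str, str]]
            ) -> Tuple[Dict[str, str], List[str]]:
    """Answer one request that fans out to several fields.

    Returns the assembled result and the list of fields that missed the cache.
    """
    result: Dict[str, str] = {}
    misses: List[str] = []
    for name in fields:
        if name in cache:
            result[name] = cache[name]
        else:
            misses.append(name)
    if misses:
        fetched = fetch_origin(misses)  # one batched round trip for all misses
        cache.update(fetched)
        result.update(fetched)
    return result, misses


def fake_origin(names: List[str]) -> Dict[str, str]:
    # Hypothetical origin: in practice this is the slow, far-away call.
    return {name: f"origin:{name}" for name in names}


edge_cache = {"user": "cached:user", "menu": "cached:menu"}
data, missed = resolve(["user", "menu", "cart"], edge_cache, fake_origin)
print(data)
print(missed)
```

Only `cart` goes back to the origin here, which is where the lower and less variable latency comes from.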

You also see quite a bit of page rendering at the edge, in my opinion. A lot of that static data is already there, so why send down two different responses? Why send down multiple different responses? Let's just smash it together, right there at the edge, and send it down. Longer term, I think we're going to see all sorts of wild stuff. One of the ones that we worked on internally, just as a prototype, as a little idea, is actually games at the edge. What if you could use an edge compute network to do not only matchmaking of games, but to actually store the state of an ongoing game?

So, one of our little Hack Day projects that we had was doing a multiplayer version of Doom that ran at the edge. It's actually fast. It works. It's really cool to be able to get a bunch of people together to play Doom, and have all of the state actually just sitting there at the edge, ready to go. You can get much closer to a real time type of environment than you could typically, with a traditional game network.

Jeremy: Right. Yeah. I love some of those ideas. One of the things you said about, you maybe can request, 90% of the things you need are local or cache, and then you have to go and get that other 10%. I think about asynchronous processes that you could kick off where you could, say a user goes to a particular page. Then you could say, "The likely place they're going to go next is going to be X page," or something like that. So, now you could preemptively fetch pages, and make sure those are loaded into the cache for things like that.

Now of course, with the GraphQL example, that's an interesting use case because, think about the complexity of that request, to knowing when to fetch it from local cache versus when to fetch it from a home run, and things like that. So that opens up a lot of interesting challenges there.

Tyler: Yeah, yeah. No, fully agree. The other one I wanted to bring up is actually security/compliance/privacy-related things. That's one of the hardest things for us to deal with, and we keep seeing the ramifications of this not being done particularly well in our industry. But imagine being able to have a single layer of your network to be able to do all of that, to be able to say, "Actually, that's a password. I've seen that it's a password. That definitely can't be printed out in plain text within this page."

Or being able to confirm that certain data isn't leaving a particular layer of the network. It gives you a single point, I say single point, but it's actually spread across the world. But it gives you a single deployment point for you to be able to say, "This is our last chance. This is the point before you actually get to the end user." I think it's going to end up being a really powerful security tool for people.
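The "last chance before the end user" check Tyler describes might look something like this minimal sketch, which masks sensitive fields in a JSON response body. The key names in `SENSITIVE_KEYS` and the `[REDACTED]` marker are assumptions for illustration; a real deployment would need a much more careful policy.

```python
import json

# Hypothetical list of field names that must never leave the edge in clear text.
SENSITIVE_KEYS = {"password", "ssn"}


def redact(value):
    """Recursively replace the values of sensitive keys in parsed JSON."""
    if isinstance(value, dict):
        return {
            key: "[REDACTED]" if key.lower() in SENSITIVE_KEYS else redact(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [redact(item) for item in value]
    return value


def filter_response(body: str) -> str:
    """Run on every JSON response just before it is sent to the client."""
    return json.dumps(redact(json.loads(body)))


print(filter_response('{"user": {"name": "sam", "password": "hunter2"}}'))
```

Because the filter is deployed once and runs in every location, it acts as the single deployment point he mentions even though it is physically spread across the world.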

Jeremy: So, speaking about all around the world. This is one of the things that is really interesting about edge computing, or even CDNs. The idea of replicating this to these points of presence, these POPs that are all around the world. The question I have for you, because I've read all of this stuff. I am not an edge expert in any way, shape, or form, but I try to read up on this stuff because I find it fascinating.

One of the things I've seen is companies like Verizon, and even AWS, partnering with other people, doing the 5G thing, putting compute or POPs on the cell phone towers. I guess my question for you is, how close do we actually need to get to the customer? Because that's pretty insane if you can do, again, we'll get to the data piece in a minute. But if you're able to do compute and pull data from the cell phone tower that's a mile down the street versus having to route it to somewhere on the West Coast of the U.S., or North Virginia, or something like that.

Tyler: Sure. Yeah. That is wild. I have actually heard some even wilder ones, where there was one person pitching me on the idea of putting servers inside of light poles in neighborhoods. I'm like, "Why?" So, there are undoubtedly going to be use cases where that kind of thing is actually useful. The trouble with it, though, is that there's going to be such limited computing power in these places, such limited storage in these places. In order to make it worthwhile for you to use these, whatever you're doing has to be something that is not just for one particular user there. This has got to be something where it is actually so popular and so important, that it is worth it for you to spread this to, say tens of thousands or hundreds of thousands of locations around the world.

My argument is that metro area, city layer, city area, basically being within 15 milliseconds, 10 to 15 milliseconds of users, is plenty for the vast majority of use cases. Now, I could be proven wrong about that 10 years down the line, 15 years down the line, when we come up with some wild new use cases that require you to be within 100 microseconds of where your end user is. But that's not what we're seeing right now. We'll undoubtedly see some specific use cases where this is very valuable. But that's, to me, not the most important form of edge computing, and that's why. Because you can find use cases for something like what we're developing with Compute@Edge for nearly any site. You can use this to make nearly any site faster.
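To put the latency figures in this answer into distances, here is a rough back-of-the-envelope sketch. It assumes light in optical fiber covers about 200 km per millisecond (roughly two-thirds of the speed of light in a vacuum) and ignores routing detours, queuing, and processing time, so real budgets are tighter than these numbers suggest.

```python
# Approximation: signal speed in glass fiber, about two-thirds of c.
FIBER_KM_PER_MS = 200.0


def max_one_way_km(round_trip_ms: float) -> float:
    """Upper bound on distance to a server for a given round-trip time."""
    return round_trip_ms / 2 * FIBER_KM_PER_MS


for rtt_ms in (15, 10, 0.1):
    km = max_one_way_km(rtt_ms)
    print(f"{rtt_ms} ms round trip -> at most about {km:.0f} km away")
```

Under these assumptions a 10 to 15 millisecond round trip allows a server on the order of a thousand kilometers away, while 100 microseconds allows only about 10 km, which is the gap between a metro-area data center and a cell tower.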

It's going to be a lot harder, in my opinion, for the cell tower layer ones. There's actually a bunch of other reasons why it gets concerning, as well. One of the things that, people trust Fastly quite a lot to be able to handle their private data. We hold TLS keys for a lot of our customers. So, it's really important for us to have incredibly strong security, to be able to keep that sort of thing safe, so that your connections can't be snooped on. It's a lot harder to keep 100,000 locations safe than it is to keep the number of locations that we have safe. You can't make as many strong guarantees about the security of a cell tower as you can about a heavily guarded data center that is nearby in your town.

Jeremy: I guess one of my questions is, when I see people wanting to do those things, like putting them in the cell phone tower, it sounds really cool. There are probably use cases for that.

Tyler: Oh, yeah.

Jeremy: The idea of self-driving cars, for example, that maybe need to ping a network, or something like that. Or the remote surgery, although I don't know how much that would use edge networks. But things where maybe 15 milliseconds isn't enough. Do you see there being use cases where there's some extremely low latency that's needed?

Tyler: Sure. It's certainly possible. I have very mixed feelings about the whole self-driving car pinging the network thing. To me, if it requires, if your car requires the network to move safely, oh, man! We are going to have some real troubles in the future. Again, I think there's definitely going to be some use cases. I don't see them at the moment, though. Maybe that's lack of imagination on my part, but we'll see what happens.

Jeremy: Anyway, I do think that there are probably use cases. Not necessarily to drive, but for traffic updates, or if there's accidents. Things that would potentially, although, again, 15 milliseconds is pretty fast.

Tyler: Exactly. That's exactly where I go back to, as soon as people bring those things up. I'm like, "You really need it in half a millisecond rather than 10?"

Jeremy: Probably very true. All right, awesome. So, the other thing I think that happens with this, and again, you mentioned securing hundreds of data centers, or hundreds of cell phone towers, or these smaller POPs, or whatever. That gets really difficult from a security standpoint, sure. But what about just from a, I guess for building applications. How does the idea of now moving compute to the edge, how does that affect the future of distributed applications?

Tyler: Yeah. This is a great topic, and I think that no one really knows the answer to this yet. We're working on this. I think there's going to be a few stages to this. The first one is kind of where we're at now, where people think of the edge as, it's a proxy of some kind. You think of it the same way as you might think about, "I'm going to put some logic into my NGINX server that will run across all of my microservices that are behind it," or something like that, or, "...into my ELB," or something.

I think that over time, what we are going to see, and we're already starting to see this actually, with some of the ideas that are coming out now is, people starting to think about the edge as part of their application. In the same way, and here's why I believe this. In the same way that people now think about the client as part of their application, that's not how we thought about the client a long time ago. I was a developer back in the '90s. I remember how we thought about browsers. The browser was the dumb thing. It was essentially a dumb terminal. You would do all of the rendering, all of everything, back at the server layer. The browser was just there so that the user had something pretty to look at. Over time, that's not how we think about the client anymore. Front end development is real development. It's just as hard and just as serious as back end development these days.

Jeremy: It might be harder.

Tyler: Yeah. No, you could definitely make that argument. You'd probably be right.

So, I think that we're kind of in the early stage of that with edge computing at the moment, where people still think about it as, "I can run little bits of code there." But at a certain point ... let me step back. What I really want people to think about with this is about where is the most efficient, advantageous place for me to run this code. For some things, that's on the client. For some things, that's in a data center somewhere. Some places, it's in a database somewhere. But there's going to be a large swath of things where the edge is actually the correct answer to that. Where if you're not needing to do, in some ways, I actually want to think about it as, being as close to the user as possible should be the default. If you can run something on the client, and it's a powerful enough client, it makes a ton of sense to just run that code there, because it's right next to the user.

So, unless there's a strong reason not to, moving things as close to them as we can, I think is actually going to be a pretty important development over the next few years.

Jeremy: Yeah. No, I totally agree. My concern is, and it's less of a concern, and more of, we don't know yet is, how does this affect how we've learned to build distributed applications over the course of the last five to 10 years? It was always, you start off building monoliths, and then we get into the cloud, and then we start building distributed systems, and we're getting better and better at that. Then all of a sudden, we're hyper distributed systems now because we want to replicate our applications closer to the client.

So I have a whole list of things that this affects and I would love to go through it. Let's start with your existing code base. What does this mean for your existing applications? What do you think a migration to edge computing even looks like?

Tyler: So again, I think this initially is going to look like ... Okay. Think about a traditional application architecture. I came from the Ruby world back in the day. It's been a long time since I've done any Ruby on Rails or anything like that, but that's where I came from. A lot of times, we would have what we referred to as middleware, things that would be running between the actual web server itself and the core business logic. So, if you think about it like that, you go, "Those would probably be the easiest things for me to try moving out to the edge." If there's just a thing that is completely stateless, that is processing a request, and modifying it, transforming it along the way, that's a really easy thing to move out there.
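A stateless request transform of the kind Tyler calls easy to move is sketched below. The example, stripping tracking parameters from a URL before it reaches the application, is chosen for illustration and is not from the conversation; the parameter names in `TRACKING_PARAMS` are assumptions.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical set of query parameters the application never needs to see.
TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign"}


def strip_tracking(url: str) -> str:
    """Stateless middleware step: drop tracking parameters from a URL."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in TRACKING_PARAMS
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


print(strip_tracking("https://example.com/a?utm_source=x&id=7"))
# https://example.com/a?id=7
```

The function depends only on its input, with no database or session lookup, which is what makes it safe to run in any location.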

I think that when we start thinking about the architecture of React apps for instance, there's going to be some really easy wins with that, with server-side rendering and things like that, that don't actually need to be inside of the core application. It doesn't need direct access to the database, for instance, to be able to do it.

But over time, I think the limiting factor is of course going to be the lack of direct access to your database, or the lack of a really strong, stateful system to work with. I think that what's going to end up happening with that is that we're going to have to modify the way that we think about our data. Suddenly, man, this is something I've been thinking about for a long time. It's a really tough problem. We have good solutions for what to do with problems, with architectures that cannot have a strongly consistent database, that cannot have a strongly consistent distributed system and instead, have to be eventually consistent because of the distribution mechanisms happening.

The problem is that they don't naturally fit the way that we as humans think about problems. They're all these eventually consistent ideas which, you have to imagine multiple different things happening concurrently, and how these things merge together or don't merge together, and things happening in different orders in different places. They're tough to get our heads around, I think. So, I think one of the things that's going to have to happen is that we're going to end up having to develop almost use case specific versions of these things. You can imagine, say, I don't know, a sessions store that exists at the edge of the network. Even that is actually complicated. You think about that as one of the more simple things that a web application can do is just, "Okay, I have some data that goes along with the session. Easy enough."

When we're talking about the edge of the network, we're talking about thousands of servers. We're talking about a user that may be actually in motion.

Jeremy: Like driving. Yes, I was thinking the same exact, or on an airplane.

Tyler: Exactly. So, you could be connecting to one data center, you could be connecting to one server in one data center, even. Then next request, you're somewhere else. So, that data actually has to move along with you, for something like a session store. There's going to be other use cases where that isn't the case, and we have other constraints that come up. I think that's going to be the trickiest part of this whole thing, though. I spent the last three, four years working on our Compute@Edge product, this specific version of our Compute@Edge product. I had commented to someone recently that I thought I was doing the hard part. Turns out the hard part is actually going to be the state.

Jeremy: Totally.

Tyler: But ultimately, again, coming back to what I was saying before, how we develop applications is going to have to change, I think, in order to take full advantage of this kind of system. I think that it's worth it, though. Again, coming back to the idea of, what if you had the ability to say, "I want to deploy this piece of code to the place in the network where it is most efficient to run, where it has everything that it needs, and nothing it doesn't, and is as close to the user as it can be."

That's a pretty powerful concept to me. I think that, in addition to the state opening things up, if we can find ways to decompose our applications into smaller components, that's going to make a big difference here, as well. Moving an entire application, wholesale from inside of your data center to outside of your data center, that's a big ask. But if I can say, "My application is actually composed of these 16 things or these 32 things," cool. I can pick and choose the ones that actually belong here, and that are communicating amongst each other, and have minimal communications across the wire, that's going to be really cool, if we can get to that stage.

Jeremy: Yeah. I wanted to ask you about the data, because that's one of the things where I can understand and I can wrap my head around building small, reusable components that can be deployed to the edge, because I do a lot of that with Serverless. Where again, you're building small bits of compute. You're separating those things. You're understanding how each one of those things interacts differently. You have to understand how you communicate between functions if that's something you need to do. So, that's something that makes a lot of sense to me. The state aspect of it, though, is just really, really hard to wrap your head around. Because even if you're doing something like a DynamoDB global replication, or something like that, you pick which regions you want it to replicate to, but it replicates all your data.

In the example that you gave of the session store, if I log in, and I'm in Boston, Massachusetts, and then I drive down to Providence, or something like that, and then make my way down to somewhere in Connecticut, I've now passed multiple metro areas that my data has to follow along with me. But under normal circumstances, maybe one, maybe my closest POP is fine. But you certainly don't want to replicate data from a user in Dublin to a user in New York City, if that user's never going to be near that POP. So, understanding what data needs to get replicated, understanding when you purge data, and things like that, even what regions you might want to replicate to, again, compliance, security, all kinds of reasons like that. That's just a really, that is the hard part, I think. I totally agree with you on that.

Tyler: I think so. Yeah, yeah, yeah. There's going to be some use cases where replicating it all over the world is actually the right answer.

Jeremy: Right. For certain things, sure.

Tyler: For certain things, right. I don't think that's going to be the common case, though. For instance, exactly as you were talking about, having a session store for one user that's replicated all over the world makes no sense. It shouldn't actually be replicated anywhere, until, of course, that user moves in some way. So, there's definitely going to have to, if I immediately start thinking about, obviously as an engineer, I immediately start thinking about, "How would I actually do that?"

Jeremy: How would you solve that?

Tyler: Right. It's almost certainly going to require some collaboration with the client, where the client remembers where it was, and can tell wherever it connects to, "By the way, I used to be over here." So, now your local one for wherever you are now, can then go, "Oh, okay, cool. Let me go back and get the state that was associated with this user."
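The client-assisted handoff Tyler outlines can be sketched as follows. The `Pop` class, the in-memory `network` registry, and the hint returned to the client are all hypothetical; a real system would also have to handle failures, concurrent requests, and security of the hint.

```python
from typing import Dict, Optional, Tuple


class Pop:
    """One point of presence holding session state only for nearby users."""

    def __init__(self, name: str, network: Dict[str, "Pop"]):
        self.name = name
        self.sessions: Dict[str, dict] = {}
        self.network = network  # stand-in for POP-to-POP communication
        network[name] = self

    def handle(self, session_id: str,
               previous_pop: Optional[str] = None) -> Tuple[dict, str]:
        # If we have never seen this session, use the client's hint to pull
        # the state from wherever the user was connected before.
        if (session_id not in self.sessions
                and previous_pop and previous_pop != self.name):
            old_state = self.network[previous_pop].sessions.pop(session_id, None)
            if old_state is not None:
                self.sessions[session_id] = old_state
        state = self.sessions.setdefault(session_id, {"requests": 0})
        state["requests"] += 1
        # The client stores the returned POP name and sends it next time.
        return state, self.name


network: Dict[str, Pop] = {}
boston = Pop("boston", network)
providence = Pop("providence", network)

state, hint = boston.handle("session-1")
state, hint = providence.handle("session-1", previous_pop=hint)
print(state, hint)
```

The session lives in exactly one place at a time and moves only when the user does, which matches the point that replicating it everywhere makes no sense.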

State itself is also going to be, what does state actually mean? Are we talking about a MySQL database? Are we talking about a MongoDB? There are actually multiple ways to think about this. So for instance, one of my favorite systems that has been developed in the last 10 years is something referred to as Microsoft Orleans. Are you familiar with this by any chance?

Jeremy: I'm not, no.

Tyler: Entirely fair. There's no reason you should be. It was the system that ran the matchmaking and users for Halo 2; Halo 2 or Halo 3, one or the other. I can't quite remember now. One of the ideas that they introduced in there was the concept of durable actors. They introduced the concept of durable actors. The whole idea here was that every user, every individual player had a program that was running for them at all times. So, if they're not connected what happens is, that program gets serialized and stored. As soon as they reconnect, we just break that program, in its paused state, out of storage and they're right back where they started.

There's just so many really cool ideas for this. So, you could effectively, if we were going to go down some path like that, you can imagine, essentially you have a program for you on some particular website that has been running, maybe for years at that point. It just keeps getting reconstituted whenever you log back in. There's definitely going to be some interesting ideas that come out of the next few years. We're working on some of them, anyway.
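The serialize-on-disconnect, restore-on-reconnect idea Tyler describes can be sketched in a few lines. This is not the Orleans API (Orleans is a .NET framework); `PlayerActor`, `ActorRuntime`, and the JSON encoding are hypothetical and only show the lifecycle.

```python
import json
from typing import Dict


class PlayerActor:
    """Per-user program whose state survives between connections."""

    def __init__(self, player_id: str, wins: int = 0):
        self.player_id = player_id
        self.wins = wins

    def record_win(self) -> None:
        self.wins += 1

    def to_bytes(self) -> bytes:
        return json.dumps({"player_id": self.player_id, "wins": self.wins}).encode()

    @classmethod
    def from_bytes(cls, data: bytes) -> "PlayerActor":
        return cls(**json.loads(data))


class ActorRuntime:
    def __init__(self) -> None:
        self.active: Dict[str, PlayerActor] = {}  # actors currently in memory
        self.storage: Dict[str, bytes] = {}       # stand-in for durable storage

    def connect(self, player_id: str) -> PlayerActor:
        if player_id not in self.active:
            saved = self.storage.pop(player_id, None)
            # Restore the paused actor if one exists, otherwise start fresh.
            self.active[player_id] = (
                PlayerActor.from_bytes(saved) if saved else PlayerActor(player_id)
            )
        return self.active[player_id]

    def disconnect(self, player_id: str) -> None:
        actor = self.active.pop(player_id)
        self.storage[player_id] = actor.to_bytes()


runtime = ActorRuntime()
runtime.connect("player-1").record_win()
runtime.disconnect("player-1")            # serialized and stored
print(runtime.connect("player-1").wins)   # 1: right back where they left off
```

From the caller's point of view the actor appears to have been running the whole time, even though it only occupies memory while the user is connected.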

Jeremy: There's going to need to be. Because again, that's one of those things where I'm just like, "How does it work?" I think what that opens up to as well is, the data piece of it is one thing, and the security piece, and compliance, that's another thing.

But what about operations teams? Even in a serverless world, some people talk about no ops. That's not really a thing. You still need to understand your infrastructure. You still need to worry about security and compliance, and do all those things that operations people need to do. How are operations people going to start dealing with thousands of POPs around the world? How does it affect them?

Tyler: Oh, man. No, this is tricky. Again, I think this is one of those ones where we are still in the early days of edge computing, because we don't have, or not many folks have great answers to these questions yet. There's definitely going to be new patterns that come out. There's going to have to be new patterns, because the things that we do now aren't going to work when we're talking about something like this. What does it mean to be able to attach a debugger to a program that is running inside of a server in Mongolia?

Maybe that's actually possible. We have a prototype of something like that working. But is that actually the right way to do it? Or do we just fall back on printf debugging? How does observability work inside of this?

Jeremy: Right. I was going to ask the same thing. Again, you think about that, it's hard enough to observe distributed applications when they're running in one region, in one data center. Spread that out across the world, what does that look like?

Tyler: Right, right. Not only that, but coming back to what I said about being able to break applications into components to be able to run them across multiple different layers of the network. So, if your application is broken into 16 different components, good luck observing that at the moment. So, this is going to require a lot of work from us, and a lot of work from any other Edge Cloud provider that comes onto the scene. But we're going to require, we've already developed some integrations with folks like Datadog, and Honeycomb, and so on, being able to feed data directly back down to them.

But it's also going to be about, if I have multiple components, if I have multiple hops that are happening here, I want to be able to see what's happening between these different places. Where did this request go wrong? Where did it get routed to the wrong place? Where did the data get corrupted, or something like that? I think that's actually going to come back to distributed tracing. It's something that, we all know this. This isn't a new concept. But I think it's going to be so much more important than it was a few years ago. It was a novelty, I think, for a lot of companies, for a lot of people who are working on it. I don't think it's going to be a novelty anymore.
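To make the tracing point concrete, here is a minimal sketch, assuming each component records a span tagged with a shared trace ID so that the first failing hop can be found afterwards. The helper names (`new_trace`, `run_hop`, `first_failure`) and the component names are invented for illustration; real tracing systems carry much more context than this.

```python
import uuid


def new_trace():
    """Start a trace: one shared ID plus an ordered list of recorded spans."""
    return {"trace_id": uuid.uuid4().hex, "spans": []}


def run_hop(trace, component, handler, payload):
    """Run one component and record a span tagged with the shared trace ID."""
    span = {"trace_id": trace["trace_id"], "component": component, "ok": True}
    try:
        result = handler(payload)
    except Exception as exc:
        span["ok"] = False
        span["error"] = str(exc)
        result = None
    trace["spans"].append(span)
    return result


def first_failure(trace):
    """Return the name of the first component whose span failed, if any."""
    for span in trace["spans"]:
        if not span["ok"]:
            return span["component"]
    return None


def route(payload):
    # Hypothetical first hop: succeeds and annotates the request.
    return {**payload, "routed": True}


def load_profile(payload):
    # Hypothetical second hop: simulates corrupted data.
    raise ValueError("corrupt record")


trace = new_trace()
step1 = run_hop(trace, "edge-router", route, {"user": "u1"})
step2 = run_hop(trace, "profile-store", load_profile, step1)
failed_at = first_failure(trace)
```

Because every span carries the same trace ID, the question "where did this request go wrong?" becomes a lookup over the recorded spans rather than a search across separate logs.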

Jeremy: No, it'll just be table stakes for cloud computing.

Tyler: Yeah, exactly.

Jeremy: So, the other thing, again, observability and being able to debug is one thing. But what about the overall developer experience, or just global deployment? I know a lot of these edge networks now, you deploy one place, it automatically replicates, and that makes a lot of sense. With CDNs, it's pretty easy. You just publish to the origin, and then everything picks up from there.

So, those types of global deployment strategies, how are those going to be similar with compute? Then, mix in the data aspect of it and say, "How do I know this node of my compute can access this type of data, or can't?" That seems like that's a pretty hairy problem, as well.

Tyler: Yeah, that's definitely a hairy problem. I think it's not actually that dissimilar, though, to having to do a big deploy, trying to do a big deploy onto a big cluster of machines as it exists today. Now, you may have 1,000 app servers for some companies out there. You already have to deal with the fact that some of them are always going to be out of date. Some of them are going to be broken. Some of them, even just when you're doing a deploy, there's going to be this wave that goes through the whole thing.

So, I don't actually think it's that dissimilar. I think we actually, this is possibly the one place where we do have the tools to be able to do it right now.

Jeremy: But what about canary deployments, and rollbacks, and things like that? That's certainly, I guess you can just roll back by redeploying, essentially.

Tyler: Sure.

Jeremy: But it does seem like there is more tooling, and more thought that still, a little bit of thought that needs to be put into this, probably.

Tyler: Yeah. No, that's definitely true. In some ways, I think we actually have a fun advantage with this. You want to do canary deploys? We could actually start rolling out your application slowly throughout the entire network. You put it on one, let it run for a minute, put it on two more, and then let it epidemically spread. Sorry to reference epidemics at the moment, but let it spread throughout the network that way.
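One way to picture the "one, then two more, then spread" rollout is the sketch below. It assumes each wave adds up to `fanout` new POPs for every POP already running the new version, and that the rollout stops at the first unhealthy wave. The function names, the `fanout` parameter, and the health check are illustrative assumptions, not a description of Fastly's deployment system.

```python
def rollout_waves(pops, fanout=2):
    """Split POPs into waves: one canary first, then each wave adds up to
    `fanout` new POPs per POP that already runs the new version."""
    waves = []
    deployed = 0
    remaining = list(pops)
    while remaining:
        size = 1 if deployed == 0 else deployed * fanout
        wave, remaining = remaining[:size], remaining[size:]
        waves.append(wave)
        deployed += len(wave)
    return waves


def deploy(pops, healthy, fanout=2):
    """Deploy wave by wave, stopping at the first wave with an unhealthy POP."""
    done = []
    for wave in rollout_waves(pops, fanout):
        done.extend(wave)
        if not all(healthy(pop) for pop in wave):
            return done, False  # caller would roll these POPs back
    return done, True


pops = [f"pop-{i}" for i in range(10)]
waves = rollout_waves(pops)
wave_sizes = [len(w) for w in waves]  # [1, 2, 6, 1]
```

With ten POPs the waves grow as 1, 2, 6, and then the single remaining POP, so a bad build is caught while it is still on only a handful of locations.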

The developer experience question, though, that you brought up is such an interesting one. This is something that we talk about internally, quite a bit. We had an internal engineering summit a few months back. At the end of my personal talk that I was giving in there, I brought up a couple things that I'm worried about, that I'm like, "We don't necessarily know how to do this thing yet, and I think it's super important."

One of those is the developer experience for it. It's one thing to be able to say, "Great. Three steps, you can have something deployed at the edge." But that's not really the same thing as building an entire application from scratch, or breaking apart an existing application and spreading it onto the network, and spreading it across multiple layers.

I don't think anybody has the answers to this yet. I think it's going to require some new technology to do it. So, that's something that my team inside of Fastly is working on at the moment is, especially in the WebAssembly world. Do we have the tools that we need there, to be able to take multiple different components and have them work together seamlessly, without it feeling like every hop is a new network hop for you, effectively.

Jeremy: So, I do want to talk about WebAssembly, but before we get there, a couple of other big questions. Maybe some of these are theoretical at this point. But in terms of regional compliance: picking and choosing which POPs your applications and your data replicate to. Is that something you see as a problem that you're solving at Fastly, or something that you will be solving?

Tyler: Right. I don't want to say, yet, whether or not that's something we're actively trying to solve, or will solve. But I do think it's actually a really interesting problem that is likely not going away any time soon. We've been dealing with this for a number of years, in terms of China, and European laws, as well as, we've seen pushes for this inside of the U.S., as well as inside of places like Australia. Regardless of what you specifically think about those laws, they're clearly not going away. I think that edge computing is actually in kind of an amazing place to be able to help developers solve this problem for their users, though. One of the reasons for that is because if we do have locations in all of these different places, it makes it easy for you to say, "Okay, this user's coming from there. Their data can't follow them."

Hearkening back to what we were talking about with that session earlier, maybe your data follows you all the way through the U.S. as you drive across, but then you hop on a plane and head over to Japan, and maybe it doesn't follow you over there. Okay, that's fine. It just means it's going to take you a little bit longer to get it while you're over there. You're going to have to hop back over to the West Coast of the U.S. to get it. That's something that would be really, really hard without edge computing, without something like edge computing coming in. Being able to serve users in all these different countries, it would be nearly impossible. Or at the very least, you're having to do a ton of the work yourself.
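The "data follows you until policy says it can't" idea can be sketched as a small lookup. The `ALLOWED_REPLICAS` table and the region codes are made-up examples, and `serving_region` is a hypothetical helper rather than any provider's API.

```python
# Hypothetical policy: regions that each home region's data may be replicated to.
ALLOWED_REPLICAS = {
    "us": {"us"},
    "eu": {"eu"},
}


def serving_region(home_region, pop_region):
    """Pick where a user's data is read from when a request lands at a POP.

    If policy allows the data to live in the POP's region, serve it locally;
    otherwise fall back to the home region, which costs a longer round trip.
    """
    allowed = ALLOWED_REPLICAS.get(home_region, {home_region})
    return pop_region if pop_region in allowed else home_region


at_home = serving_region("us", "us")  # served from the local POP's region
abroad = serving_region("us", "jp")   # falls back to the U.S.
```

In the travelling-user example, the request from Japan still succeeds; it is just answered from the home region, so it takes a little longer.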

So, I think this is something that edge computing is poised to be able to solve for people, but I don't want to say much more than that at the moment.

Jeremy: I think it's interesting because what you potentially get there is a little bit of compute. Even if it's that small piece of compute that says whether or not a particular file can load, or a particular document can load based off of region. It's just much more accurate, or it seems more accurate than trying to guess people's location from their IP address, for example. Especially where people are using proxies, and things like that, where some of these other things are harder to fool.

So, I think that's really interesting. Then I guess one of the other questions I have around this, too is, we always get this question of vendor lock-in, no matter what you're doing. I'm using AWS, so if I'm on Serverless and AWS, I'm using Lambda, I'm locked into Lambda. To some extent that's true, but you're also locked into MySQL, or Mongo, or some of these other things that you're going to have to do some work to migrate.

But I find it interesting with the idea of edge compute, where if you start spreading around compute to all these different places in the world, what if certain edge networks have better coverage in certain areas, and you want to use multiple edge networks. Intercommunication between them, or interoperability, is this something where you see maybe standards developing around this, so that not everybody's doing something different? Where there could be some way for maybe multiple networks to talk to one another?

Tyler: Oh, yeah. I'm so glad that you bring this up, because this has been the basis of our strategy in this area. We recognize the fact that building out an edge compute network is not something you can just do by yourself. We are one player in the space. I think we're the best player in the space, but we're going to have to be able to work with each other.

Even going back to what I was saying with the different layers of the network, when we're talking about, what if I can move a piece of computation to where it runs best. Whether that's on the client, or on the server, or it's somewhere in between. That is almost certainly going to require some sort of standard way of being able to have a piece of computation, a program, and being able to run it in multiple different locations and expect the same results. So, this is why we have spent so much time on the standards around WebAssembly. I'm undoubtedly going to keep coming back to WebAssembly until we talk about it.

But there's WebAssembly itself. There's WASI, which is the WebAssembly system interface, which is where we are putting a lot of the effort on this standardization thing. There are already multiple different companies that are using that at the moment. My personal favorite one of these is actually Shopify. Shopify has an early product that they have put out where you can run scripts of some kind within, essentially working on your shop, itself. That thing is actually using WebAssembly. It's using some of the software that we wrote, and it's also using that WebAssembly system interface.

So, in theory anyway, I can't say 100% that this is the case at the moment, but it will be soon if it's not. You could have a piece of software that runs on the Shopify platform, that will also run in the Fastly platform, that will also run in your browser, that will also be able to run in the server, as well. So to me, I think that standards are going to be super important for this, and that's why we're putting so much effort into that.

Jeremy: So, let's talk about WebAssembly, then. We can go to some of the other topics later. So, WebAssembly. Again, maybe if people aren't fully aware of what that is, why don't you give them a quick overview of what exactly WebAssembly is.

Tyler: Yeah, sure. So, WebAssembly is something that was developed for browsers, actually. So, it was kind of a response to, if you think back to, some of your listeners might be familiar with Native Client that existed in Google Chrome back in the day. The whole idea with this is that it was for running existing C and machine code applications inside your browser. It was used for games and various other things.

Some other folks came out with asm.js. asm.js was almost a response to that, being able to say, "Okay, that's cool and fast. But what if we could make JavaScript really fast? What if we could make it so you've compiled that C application down to a JavaScript program, with just a few little tweaks in it, and it would be nearly native speed?" Then WebAssembly was essentially a response to that, and being able to say, "Okay, that was neat, and so was Native Client, but what if we made a standard way of doing this? What if we made a specific machine code-like language, that we can compile and we can run at near native speed, and can run in every browser?"

So, that's what WebAssembly was designed for initially. However, it turns out, it's actually great for things outside of the browser, as well. At its core, what it really is, is a super fast, super lightweight, super secure way of, I don't know, cross-platform language. So, if you have multiple different languages that can target this one, and you have a compiler that works for it, suddenly you have a platform that works across multiple languages and multiple different servers.

Jeremy: Right. I think back to, you said you'd been working on web in the '90s, so you're just as old as I am. Remember Java applets?

Tyler: Oh, yeah.

Jeremy: So, WebAssembly is like that, but not terrible. It's very cool. It actually works this time.

Tyler: That's the goal.

Jeremy: I just remember how bad Java applets were, and everybody wanted to do them.

So, WebAssembly is one of those things now where, again, compiling down to native, runs extremely fast. I've heard a lot of people talk about browsers being those dumb clients, like you mentioned earlier, but you have all this compute power running on your laptop. Why not use some of it?

Tyler: Yeah.

Jeremy: When you have to use something like JavaScript in order to do it, you run into all kinds of limitations. But with WebAssembly, it basically opens up that operating system in a way that you can use the full power of it to do a lot of work there. But then that same program, or some variation of it, runs at the edge. It runs in a data center. It can run anywhere. So, that's just fascinating to me.

Going back to this, you recently acquired, or Fastly recently acquired the Mozilla team that created WebAssembly, right?

Tyler: I wouldn't say acquired. We hired them.

Jeremy: Or you hired them, sorry.

Tyler: But yeah, one of the people on that team is Luke Wagner, who is one of the co-creators of WebAssembly, yeah. This was the team that was primarily working on their WebAssembly out of the browser projects. So, they're responsible for Cranelift and Wasmtime. If you've been working with WebAssembly, those are nearly ubiquitous at this point. You've probably heard of them. You've probably used them.

When we started chatting with them, we were working with this team to create the Bytecode Alliance a couple years back. We've been collaborating with them for a long time. So, when we started talking, we realized that we're actually working toward exactly the same goal. So, when the Mozilla layoffs happened, they were happy to hop over and continue, actually, doing the same work that they were doing before, but now targeted at the edge, instead of at a more central location.

Jeremy: Awesome.

Tyler: Yeah, they're a fantastic team. I think we are super lucky to get them.

Jeremy: So now that you've brought that team in, I'm assuming that WebAssembly is going to be a big part of Fastly moving forward.

Tyler: I think that would be a pretty good bet, yeah. Yeah, yeah, yeah. That's kind of a fun story in itself. This started out as just a couple people working on it over a holiday break a few years back. You hire one person, two people. Now, we have probably one of the largest WebAssembly teams that exists out there, as well as one of the most experienced, probably the most experienced WebAssembly team out there, at this point.

So, it is simultaneously very exciting to me, to be able to really get things done. Fastly, historically, wasn't a language company. Historically, we're not a company that produces compilers and so on. So, now we have turned into a world class place to be able to work on those sorts of problems. But at the same time, I think that there's a lot of responsibility that comes along with that. We've hired up quite a few people who work in the WebAssembly world, and who are responsible for the future of WebAssembly. I have no desire for it to turn into Fastly WebAssembly. WebAssembly needs to exist on its own.

Even for our own benefit, WebAssembly has to be a strong community, has to be a really solid piece of technology. It can't just be for Fastly. That wouldn't make sense for anybody.

Jeremy: Right. I think that's super interesting, and good for Fastly for picking up that project. Again, I see that as being another exciting step. For serverless computing, as well, just the ability, again, to get that speed that we're looking for.

Tyler: Sorry, can I hop in there for a second?

Jeremy: Yeah.

Tyler: To me, you mentioned that for the speed in particular, and I totally agree with that. I think to me, the thing that's actually more exciting than that is the ability, coming back to what we were talking about before, is the ability to be able to take a program and run it wholesale, just bop, bop.

Whether it's on your Serverless platform that is running in a central cloud, whether or not it's in a regional area, or whether or not it's in a cell tower somewhere, or in the browser; that is such an exciting thing to me. Please continue.

Jeremy: No, I totally agree. Right. So, let's jump back to edge computing for a minute here. One of the things that I think is interesting is the amount of compute you can currently do at the edge. It is very, very small. I think 50 milliseconds, depending on which cloud it's on, or which edge provider.

So, what are some of those limitations that you're going to see? I guess edge computing versus your traditional, which is kind of crazy now, but you think about distributed computing, we're talking about hundreds of data centers, probably, anyway. But edge computing versus the data center approach, or the cloud approach. What are some of those limitations that you're going to see? Do you think that it lends itself to this hybrid approach, where some of the compute is done on the edge, but then maybe the more heavy stuff is done in a cloud computing data center somewhere?

Tyler: Yeah, yeah. The limitations for a lot of the edge computing providers, including us right now, are very low. That's just a thing. 50 milliseconds isn't a tremendous amount of time. But it turns out that for a lot of these initial use cases, it's actually enough. When we're talking about doing a GraphQL request, the actual computation time involved with that should be relatively low. That's something that I think we'll see lighten up over time. You'll see those numbers start to go up as people gain operational experience with it, and as the demand, as well, for more computation time increases.

But I think you actually hit the nail on the head, with this actually being about a hybrid approach. The reason for this is that, does it actually make sense to do 30 seconds of hardcore number crunching at the edge, in someone's neighborhood, where you have had to buy up a little piece of real estate in order to put your server? Or does it make sense, in some cases, for that to actually fall back to a centralized data center, where you have those massive computers, where you have thousands of servers all running simultaneously.

I think that's going to be a theme for the next couple years, and I mentioned it earlier. Where is the most efficient place for me to run this? Do you actually really want to do 30 seconds of hardcore computation at the edge? I think that we are eventually going to see a lot of computations, a lot of the shorter computations moving out to the edge, and a lot of those heavier weight computations moving back into a centralized data center. That seems like the natural progression to me.

Jeremy: I think that some of those use cases, too, are small bits of compute at the edge, that maybe trigger larger compute jobs asynchronously, that then prepare data, or whatever it is. I think that's a potentially interesting way to think about it.

Tyler: Yeah, I think that's fair. But also, I think the other thing that comes to mind to me is that, if we are 10 milliseconds away from your end user, and you're doing, say 150 milliseconds of computation, let's say half a second. Let's say 500 milliseconds of computation.

So, now your total time to do this is 10 milliseconds to get to you, 500 milliseconds to do the computation, and 10 milliseconds back. So, we're talking about 520 milliseconds. Your actual data center might only be 200 milliseconds away from that user. So, is the savings of time worth it in this case? I think this is going to end up being that balance that comes out.
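The arithmetic in this example can be worked through directly. The sketch below makes two assumptions that the conversation leaves open: that "200 milliseconds away" is a one-way figure like the 10 milliseconds, and that the computation takes the same time in both places. The 5-millisecond case is an added comparison point, not a number from the conversation.

```python
def response_time_ms(one_way_ms, compute_ms):
    """Total time: network out, computation, network back."""
    return one_way_ms + compute_ms + one_way_ms


# Heavy computation (500 ms), as in the example above.
edge_heavy = response_time_ms(one_way_ms=10, compute_ms=500)      # 520
central_heavy = response_time_ms(one_way_ms=200, compute_ms=500)  # 900

# Light computation (5 ms), an assumed comparison point.
edge_light = response_time_ms(one_way_ms=10, compute_ms=5)        # 25
central_light = response_time_ms(one_way_ms=200, compute_ms=5)    # 405

heavy_speedup = central_heavy / edge_heavy  # roughly 1.7x
light_speedup = central_light / edge_light  # 16.2x
```

Under these assumptions the edge saves the same 380 milliseconds in both cases, but that saving is a 16x improvement for the light request and less than 2x for the heavy one, which is the balance being described.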

Jeremy: That comes down to intelligent networks, too, being able to understand what's the closest one.

So, speaking about networks, this is a very big market. A lot of people are getting into this. Fastly's been around for quite some time, but you have Cloudflare, and Akamai, and AT&T, the mobile companies getting into it. There's a bunch of them. So, I guess maybe my question here is, who's going to, and maybe you can't answer this, but I like to think of these bigger questions here. Who's going to win? Is it the hyperscalers? Is it going to be the telco providers, like the Verizons and the AT&T? Or is it going to be more neutral providers you think, like a Fastly? I know you hope you win. Or is it just going to be a combination of everybody working together?

Tyler: I am an optimistic type of dude. I'm very Kumbaya, I guess you could say. I think that it's almost necessarily going to have to be everybody. In order for this thing to work, we are going to have to learn to work together on it. One of the things I say a lot internally is, "I don't want to compete over the basics. I don't want to compete over who is able to run WebAssembly. I don't want to compete over who is able to run an edge computing network at all." I want us all to be able to do that. Once we have all reached this level of, now users can actually use all of these different things, let's compete over the features on top of that.

Competing over the ability to do basic stuff isn't good for users. No one wants that. That's, I think, a lot of the times how we think about standardization work, as well. We want to bring some of our, I don't know if I would say competitors, but some of the folks who are also in this space, we want to bring them along with us and be able to say, "Look, we're not just developing this for us. We're developing this for everybody. Cool. You deployed this, too? Great. Let's compete over what we do with it."

Jeremy: Well, it's funny because I almost look at edge computing almost as another utility. Similar to the internet itself; even though there are companies that own pieces of the network for the internet, for the most part, it's pretty open. So, I could see cloud computing, or edge computing, becoming one of those things where everyone's participating in this, as you said, Kumbaya, creating this thing together, but then building features and applications on top of it, is sort of where the differentiation might be.

Tyler: I think there's some pitfalls in thinking about it strictly as a utility. That direction, I think is reasonable. Again, just to reiterate my point, I really want us to compete over the content, the meat of this problem, not the basics, not over the infrastructure itself.

Jeremy: Right. So then, another question is, talking about competition. We talked about the hybrid piece. Do you see edge compute as being a competitive thing to public cloud providers, where people could eventually host all their applications there? I know you mentioned that you see that, I guess, the hybrid approach.

But I'm just curious because there is this thing called fog computing. I don't know if maybe you can explain it, but where you're using a combination of all these different services; edge, potentially your own data center, public cloud, things like that; and mixing them all together. So, is this something where you think it's just going to be all these things working together? Or is there going to be a niche space for purely edge computing services?

Tyler: Oh, yeah. I think there's definitely going to be a space for just purely edge compute services. A lot of our services inside Fastly work exactly like this. We do everything we can to avoid having a centralized component for almost anything. Obviously, there are quite a few things that we have to. But actually, yeah. I think that the most common case is going to be that hybrid one. To me, edge-only is always the ideal. Honestly, if I set aside my own personal interests from this and say, honestly, client-only is really the ideal, because then there's no network involved. You don't have any of the failures associated with that. But of course, that's not realistic for most things.

Edge-only is the next best thing, but I think that realistically, hybrid is going to be where most applications land. I would have a hard time trying to define fog computing, because I remember reading about fog computing in 2002, or something like that. The idea has been around for a long time, and I'm not sure what exactly it's evolved into at this point. But it sounds like, based on what you said, similar to what we're talking about. Where ultimately, I think what developers are going to need to do is, stop thinking about their application as a thing that runs in a place, and start thinking about it as something that is actually spread across the whole network.

Again, I think we have kind of already gotten there to some extent, when people think about their client as just being part of the same application. So, it's a much shorter hop now, I think, to get to edge computing being part of it than it would have been a few years back.

Jeremy: That's a super interesting way to think of it, because that's one of the things is, wrapping your head around that your application does not run on a server somewhere anymore. It runs everywhere. You need to be hyper-aware of what that means for communication, what that means for security, what that means for reliability, for resiliency, for all these other things, observability, like we talked about. That is not an easy thing for people to wrap their head around right now.

Tyler: Yeah, yeah. I think that, oh, man. I think that, honestly, the React developers out there are going to be in one of the stronger positions to understand and really use this. Because React has this concept built in. You have the server-only components. You have the client-only components that go along with it. Why not an edge-only component? Or why not the ability to move some of those between multiple different locations?

I think back to, man, the ASP.NET days, when I was doing that, in the early 2000s. They had a couple of these concepts in there. They were almost there. Obviously, it stuck around and there are plenty of .NET developers now. So, we'll see.

Jeremy: That's an interesting point, though. Again, even early 2000s, even the 2010s, we were still just learning about microservices. Now, we're doing serverless compute, and we're doing containers, and Kubernetes, and all these other things that are just becoming ... the technology is moving so quickly.

Again, serverless, at this point, is probably five, six years old; a solid six years old, if you think back to the beginning of Lambda. But with things moving so fast, I look at edge computing as, you're still day zero, so far to go. Not to use the AWS term, but basically, you're still right at the beginning. It's a very, very new thing. So, what's that path forward? I always ask my guests, "What do you think serverless computing is going to be in five years?" I'd love to ask you, where do you see edge computing in five years?

Tyler: Yeah. So, in some ways, I actually want it to, this is, it's hard to put my finger on exactly what my answer is to this. I simultaneously believe and want it to be, obviously, way more prevalent than it is now. I think that this will be a common piece of folks' infrastructure. That said, I also believe that it won't be obvious that, that's the case. This will just be another layer in your stack. This'll just be part of what you naturally do, where you're not even thinking about it, in the same way that you're not necessarily thinking about the individual server that you spun off on AWS.

Just deploying onto Fastly, or even some other edge cloud network, a rising tide raises all ships in this case, I think. I think it will just be such a natural part of developing a high scale application that it won't even be a question. I won't necessarily have to be on podcasts explaining edge computing, because people will know what it is because they're using it.

Jeremy: Right.

Tyler: Again, I think it's also going to change how we develop applications. As we were talking about earlier, it's hard to, the entire concept of a monolith application can't work in a situation like this. We've already seen this broken down, again from the server to client division. It's going to have to be broken down even further, in my opinion. I think we'll see that in the next five years.

Jeremy: That's crazy. Well anyway, I think there's a long way to go, but you're clearly doing some amazing work over at Fastly.

Tyler: Thank you.

Jeremy: So, thank you for that. Thank you for being a guest, and taking the time to talk with me. I know I learned a lot from talking to you.

Tyler: I'm so glad.

Jeremy: Hopefully, the guests found this enlightening. I would say study up on edge computing, because that's the next thing. We'll have to start an Edge Computing Chats at some point.

Jeremy: But anyway, thanks again for being here. If people want to get ahold of you or learn more about what you're doing at Fastly, how do they do that?

Tyler: Check out Fastly.com. You can find me at Tyler@Fastly.com. I don't look at social media anymore, so there you go.

Jeremy: That's not a bad thing these days.

Tyler: I agree.

Jeremy: Well, again, Tyler, thank you so much. I really appreciate it.

Tyler: All right. Thanks for having me, Jeremy. I appreciate it.

What is Serverless Chats?

Serverless Chats is a podcast that geeks out on everything serverless. Join Jeremy Daly and Rebecca Marshburn as they chat with a special guest each week.
