Screaming in the Cloud

In-house pronunciation habits are a slight annoyance of the industry, so for now, when it comes to CNCF, we will stick with spelling it out one letter at a time. We are glad to say that Liz Rice, Chief Open Source Officer at Isovalent and chair of the CNCF's Technical Oversight Committee, does the same. She graciously does likewise with eBPF, rather than the “Ehbehpf” that opens today’s conversation.

Liz Rice spills the beans on eBPF, or extended Berkeley Packet Filter, which on its own basically means nothing. But what it actually does is allow you to run custom programs inside the kernel. Liz breaks down the varied ways that eBPF is useful, the story on how the enterprise versions of Cilium are useful, and some of the other tools Isovalent is bringing forward.

Show Notes

About Liz
Liz Rice is Chief Open Source Officer with cloud native networking and security specialists Isovalent, creators of the Cilium eBPF-based networking project. She is chair of the CNCF's Technical Oversight Committee, and was Co-Chair of KubeCon + CloudNativeCon in 2018. She is also the author of Container Security, published by O'Reilly.

She has a wealth of software development, team, and product management experience from working on network protocols and distributed systems, and in digital technology sectors such as VOD, music, and VoIP. When not writing code, or talking about it, Liz loves riding bikes in places with better weather than her native London, and competing in virtual races on Zwift.


Transcript
Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.


Corey: Today’s episode is brought to you in part by our friends at MinIO, the high-performance Kubernetes native object store that’s built for the multi-cloud, creating a consistent data storage layer for your public cloud instances, your private cloud instances, and even your edge instances, depending upon what the heck you’re defining those as, which depends probably on where you work. Getting that unified is one of the greatest challenges facing developers and architects today. It requires S3 compatibility, enterprise-grade security and resiliency, the speed to run any workload, and the footprint to run anywhere, and that’s exactly what MinIO offers. With superb read speeds in excess of 360 gigs and a 100 megabyte binary that doesn’t eat all the data you’ve gotten on the system, it’s exactly what you’ve been looking for. Check it out today at min.io/download, and see for yourself. That’s min.io/download, and be sure to tell them that I sent you.


Corey: This episode is sponsored in part by our friends at Sysdig. Sysdig is the solution for securing DevOps. They have a blog post that went up recently about how an insecure AWS Lambda function could be used as a pivot point to get access into your environment. They’ve also gone deep in-depth with a bunch of other approaches to how DevOps and security are inextricably linked. To learn more, visit sysdig.com and tell them I sent you. That’s S-Y-S-D-I-G dot com. My thanks to them for their continued support of this ridiculous nonsense.


Corey: Welcome to Screaming in the Cloud. I’m Corey Quinn. One of the interesting things about hanging out in the cloud ecosystem as long as I have and as, I guess, closely tied to Amazon as I have been, is that you learn that you are never quite able to pronounce things the way that people pronounce them internally. In-house pronunciations are always a thing. My guest today is Liz Rice, the Chief Open Source Officer at Isovalent, and they’re responsible for, among other things, the Cilium open-source project, which is around eBPF, which I can only assume is internally pronounced as ‘Ehbehpf’. Liz, thank you for joining me today and suffering my pronunciation slings and arrows.


Liz: I have never heard ‘Ehbehpf’ before, but I may have to adopt it. That’s great.


Corey: You also are currently—in a term that is winding down if I’m not misunderstanding—you were the co-chair of KubeCon and CloudNativeCon at the CNCF, and you are also currently on the technical oversight committee for the foundation.


Liz: Yeah, yeah. I’m currently the chair, in fact, of the technical oversight committee.


Corey: And now that Amazon has joined, I assumed that they had taken their horrible pronunciation habits, like calling AMIs ‘Ah-mies’ and whatnot, and started spreading them throughout the ecosystem with wild abandon.


Liz: Are we going to have to start calling CNCF ‘Ka’Nff’ or something?


Corey: Exactly. They’re very frugal, by which I mean they never buy a vowel. So yeah, it tends to be an ongoing challenge. Joking and all the rest aside, let’s start, I guess, at the macro view. The CNCF does an awful lot of stuff, where if you look at the CNCF landscape, for example, like, I think some of my jokes on the internet go a bit too far, but you look at this thing and last time I checked, there were something like 400 or 500 different players in various spaces.


And it’s a very useful diagram, don’t get me wrong by any stretch of the imagination, but it also is one of those things that is so staggeringly vast that I’ve got to level with you on this one, given my old, ancient sysadmin roots, “The hell with it. I’m going to run some VMs in a three-tiered architecture just like grandma and grandpa used to do,” and call it good. Not really how the industry has evolved, but it’s overwhelming.


Liz: But that might be the right solution for your use case so, you know, don’t knock it if it works.


Corey: Oh, yeah. If it’s a terrible architecture and it works, is it really that terrible of an architecture? One wonders.


Liz: Yeah, yeah. I mean, I’m definitely not one of those people who thinks, you know, every solution has the same—you know, is solved by the same hammer, you know, all problems are not the same nail. So, I am a big fan of a lot of the CNCF projects, but that doesn’t mean to say I think those are the only ways to deploy software. You know, things like Lambda are a really great example of something that is super useful and very applicable for lots of applications and for lots of development teams. Not necessarily the right solution for everything. And for other people, they need all the bells and whistles that something like Kubernetes gives them. You know, horses for courses.


Corey: It’s very easy for me to make fun of just about any company or service or product, but the thing that always makes me set that aside and get down to brass tacks has been, “Okay, great. You can build whatever you want. You can tell whatever glorious marketing narrative you wish to craft, but let’s talk to a real customer because once we do that, then if you’re solving a problem that someone is having in the wild, okay, now it’s no longer just this theoretical exercise and PowerPoint. Now, let’s actually figure out how things work when the rubber meets the road.”


So, let’s start, I guess, with… I’ll leave it to you. Isovalent are the creators of the Cilium eBPF-based networking project.


Liz: Yeah.


Corey: And eBPF is the part of that I think I’m the most familiar with having heard the term. Would you rather start on the company side or on the eBPF side?


Liz: Oh, I don’t mind. Let’s—why don’t we start with eBPF? Yeah.


Corey: Cool. So easy, ridiculous question. I know that it’s extremely important because Brendan Gregg periodically gets on stage and tells amazing stories about this; the last time he did stuff like that, I went stumbling down into the rabbit hole of DTrace, and I have never fully regretted doing that, nor completely forgiven him. What is eBPF?


Liz: So, it stands for extended Berkeley Packet Filter, and we can pretty much just throw away those words because it’s not terribly helpful. What eBPF allows you to do is to run custom programs inside the kernel. So, we can trigger these programs to run, maybe because a network packet arrived, or because a particular function within the kernel has been called, or a tracepoint has been hit. There are tons of places you can attach these programs to, or events you can attach programs to.


And when that event happens, you can run your custom code. And that can change the behavior of the kernel, which is, you know, great power and great responsibility, but incredibly powerful. So Brendan, for example, has done a ton of really great pioneering work showing how you can attach these eBPF programs to events, use that to collect metrics, and lo and behold, you have amazing visibility into what’s happening in your system. And he’s built tons of different tools for observing everything from, I don’t know, memory use to file opens to—there’s just endless, dozens and dozens of tools that Brendan, I think, was probably the first to build. And now there are these new generations of eBPF-based tooling that are kind of taking that legacy and turning it into maybe more, I’m going to say, user-friendly interfaces, you know, with GUIs, and hooking them up to metrics platforms, and in the case of Cilium, using it for networking and hooking it into Kubernetes identities, and making the information about network flows meaningful in the context of Kubernetes, where things like IP addresses are ephemeral and not very useful for very long; I mean, they just change at any moment.
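The attach-a-program-to-an-event model Liz describes can be pictured with a small user-space toy. This is plain Python, not eBPF: the `ToyKernel` class, the event names, and the `comm` field are all invented for illustration, and a real eBPF program would be compiled to bytecode and attached inside the kernel.

```python
from collections import Counter, defaultdict
from typing import Callable, Dict, List

# A handler receives a dict describing the event that just happened.
Handler = Callable[[dict], None]


class ToyKernel:
    """A user-space stand-in for kernel event hooks (hypothetical, not eBPF)."""

    def __init__(self) -> None:
        self._hooks: Dict[str, List[Handler]] = defaultdict(list)

    def attach(self, event: str, handler: Handler) -> None:
        # Register a custom program to run whenever `event` fires.
        self._hooks[event].append(handler)

    def fire(self, event: str, **context) -> None:
        # Run every program attached to this event, passing the event details.
        for handler in self._hooks[event]:
            handler(context)


file_opens = Counter()  # metric: number of file opens per process name


def count_file_open(ctx: dict) -> None:
    file_opens[ctx["comm"]] += 1


kernel = ToyKernel()
kernel.attach("file_open", count_file_open)

kernel.fire("file_open", comm="nginx", path="/etc/nginx/nginx.conf")
kernel.fire("file_open", comm="nginx", path="/var/log/nginx/access.log")
kernel.fire("file_open", comm="bash", path="/etc/passwd")
kernel.fire("packet_arrived", size=64)  # nothing attached, so nothing runs

print(dict(file_opens))
```

The point of the sketch is only the shape: the observing code is registered once, and every later event of that kind feeds the metric without the observed processes changing at all.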


Corey: I guess I’m trying to figure out what part of the stack this winds up applying to because you talk about, at least to my mind, it sounds like a few different levels all at once: You talk about running code inside of the kernel, which is really close to the hardware—it’s oh, great. It’s adventures in assembly is almost what I’m hearing here—but then you also talk about using this with GUIs, for example, and operating on individual packets to run custom programs. When you talk about running custom programs, are we talking things that are a bit closer to, “Oh, modify this one field of that packet and then call it good,” or are you talking, “Now, we launch Microsoft Word.”


Liz: Much more the former category. So yeah, let’s inspect this packet and maybe change it a bit, or send it to a different—you know, maybe it was going to go to one interface, but we’re going to send it to a different interface; maybe we’re going to modify that packet; maybe we’re going to throw the packet on the floor because we don’t—there’s really great security use cases for inspecting packets and saying, “This is a bad packet, I do not want to see this packet, I’m just going to discard it.” And there’s some, what they call ‘Packet of Death’ vulnerabilities that have been mitigated in that way. And the real beauty of it is you just load these programs dynamically. So, you can change the kernel on the fly and affect that behavior, and it just immediately has an effect.
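The inspect-then-decide pattern from that answer (pass it, drop it, or send it to a different interface) might be sketched like this. It is a Python illustration only; the `Packet` fields, the blocked address, the payload prefix, and the port-to-interface table are all hypothetical, and a real program would be written in C or Rust and run in the kernel.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    PASS = "pass"          # let the packet continue as normal
    DROP = "drop"          # throw the packet on the floor
    REDIRECT = "redirect"  # send it out of a different interface


@dataclass
class Packet:
    src_ip: str
    dst_port: int
    payload: bytes
    out_interface: str = "eth0"


# Hypothetical rules, invented for illustration only.
BLOCKED_SOURCES = {"203.0.113.7"}
BAD_PAYLOAD_PREFIX = b"\xde\xad"  # stand-in for a known malicious pattern
REDIRECT_PORTS = {8080: "eth1"}


def inspect(packet: Packet) -> Verdict:
    """Decide what to do with one packet, updating it if it is redirected."""
    if packet.src_ip in BLOCKED_SOURCES:
        return Verdict.DROP
    if packet.payload.startswith(BAD_PAYLOAD_PREFIX):
        return Verdict.DROP
    if packet.dst_port in REDIRECT_PORTS:
        packet.out_interface = REDIRECT_PORTS[packet.dst_port]
        return Verdict.REDIRECT
    return Verdict.PASS


ordinary = Packet("192.0.2.10", 443, b"hello")
malicious = Packet("192.0.2.11", 443, b"\xde\xad\xbe\xef")
rerouted = Packet("192.0.2.12", 8080, b"hello")

for pkt in (ordinary, malicious, rerouted):
    print(pkt.src_ip, inspect(pkt).value, pkt.out_interface)
```

Because the rules live in ordinary data, changing what counts as a bad packet is a matter of loading a new rule set, which loosely mirrors the "load these programs dynamically" point.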


If there are processes already running, they get instrumented immediately. So, maybe you run a BPF program to spot when a file is opened. New processes, existing processes, containerized processes, it doesn’t matter; they’ll all be detected by your program if it’s observing file open events.


Corey: Is this primarily used from a security perspective? Is it used for—what are the common use cases for something like this?


Liz: There’s three main buckets, I would say: Networking, observability, and security. And in Cilium, we’re kind of involved in some aspects of all those three things, and there are plenty of other projects that are also focusing on one or other of those aspects.


Corey: This is where, I guess, the challenge I run into with the whole CNCF landscape is, it’s like, I think the danger is when I started down this path that I’m on now, I realized that, “Oh, I have to learn what all the different AWS services do.” This was widely regarded as a mistake. They are not Pokémon; I do not need to catch them all. The CNCF landscape applies very similarly in that respect. What is the real-world problem space for which eBPF and/or things like Cilium that leverage eBPF—because eBPF does sound fairly low-level—that turn this into something that solves a problem people have? In other words, what is the problem that Cilium should be the go-to answer for when someone says, “I have this thing that hurts.”


Liz: So, at one level, Cilium is a networking solution. So, it’s a Kubernetes CNI. You plug it in to provide connectivity between your applications that are running in pods. Those pods have to talk to each other somehow and Cilium will connect those pods together for you in a very efficient way. One of the really interesting things about eBPF and networking is we can bypass some of the networking stack.


So, if we are running in containers, we’re running our applications in containers in pods, and those pods usually will have their own networking namespace. And that means they’ve got their own networking stack. So, a packet that arrives on your machine has to go through the networking stack on that host machine, go across a virtual interface into your pod, and then go through the networking stack in that pod. And that’s kind of inefficient. But with eBPF, we can look at the packet the moment it’s come into the kernel—in fact in some cases, if you have the right networking interfaces, you can do it while it’s still on the network interface card—so you look at that packet and say, “Well, I know what pod that’s destined for, I can just send it straight there.” I don’t have to go through the whole networking stack in the kernel because I already know exactly where it’s going. And that has some real performance improvements.


Corey: That makes sense. In my explorations—we’ll call it—with Kubernetes, it feels like the universe—at least at the time I went looking into it—was, “Step One, here’s how to wind up launching Kubernetes to run a blog.” Which is a bit like using a chainsaw to wind up cutting a sandwich. Okay, massively overpowered but I get the basic idea, like, “Okay, what’s project Step Two?” It’s like, “Oh, great. Go build Google.”


Liz: [laugh].


Corey: Okay, great. It feels like there’s some intermediary steps that have been sort of glossed over here. And at the small scale at which I kicked the tires on it, things like networking performance never even entered the equation; it was more about get the thing up and running. But yeah, at scale, when you start seeing huge numbers of containers being orchestrated across a wide variety of hosts, that has serious repercussions and explains an awful lot. Is this the sort of thing that gets leveraged by cloud providers themselves, is it something that gets built in mostly on-prem environments, or is it something that rides in, almost, user-land for most of these use cases that customers are bringing to those environments? I’m sorry, users, not customers. I’m too used to the Amazonian phrasing of everyone as a customer. No, no, they are users in an open-source project.


Liz: [laugh]. Yeah, so if you’re using GKE, the GKE Dataplane V2 is using Cilium. Alibaba Cloud uses Cilium. AWS is using Cilium for EKS Anywhere. So, these are really, I think, great signals that it’s super scalable.


And it’s also not just about the connectivity, but also about being able to see your network flows and debug them. Because, like you say, that day one, your blog is up and running, and day two, you’ve got some DNS issue that you need to debug, and how are you going to do that? And because Cilium is working with Kubernetes, so it knows about the individual pods, and it’s aware of the IP addresses for those pods, and it can map those to, you know, what’s the pod, what service is that pod involved with. And we have a component of Cilium called Hubble that gives you the flows, the network flows, between services. So, you know, we’ve probably all seen diagrams showing Service A talking to Service B, Service C, some external connectivity, and Hubble can show you those flows between services and the outside world, regardless of how the IP addresses may be changing underneath you, and aggregating network flows into those services that make sense to a human who’s looking at a Kubernetes deployment.
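The idea of rolling IP-level flows up into service-level flows can be illustrated with a short sketch. This is not Hubble's implementation or API; the pod names, addresses, and the `external` label are invented, and the lookup tables stand in for state that a real system would learn from Kubernetes.

```python
from collections import Counter
from typing import Dict, List, Tuple

# Hypothetical cluster state: which pod holds which IP right now,
# and which service each pod backs.
ip_to_pod: Dict[str, str] = {
    "10.0.1.5": "frontend-7d9f",
    "10.0.2.9": "cart-5c4b",
    "10.0.3.2": "cart-8a1e",
}
pod_to_service: Dict[str, str] = {
    "frontend-7d9f": "frontend",
    "cart-5c4b": "cart",
    "cart-8a1e": "cart",
}


def service_for(ip: str) -> str:
    """Translate an IP into a human-meaningful name."""
    pod = ip_to_pod.get(ip)
    if pod is None:
        return "external"  # not a pod we know about
    return pod_to_service.get(pod, pod)


def aggregate(flows: List[Tuple[str, str]]) -> Counter:
    """Count flows per (source service, destination service) pair."""
    counts: Counter = Counter()
    for src_ip, dst_ip in flows:
        counts[(service_for(src_ip), service_for(dst_ip))] += 1
    return counts


flows = [
    ("10.0.1.5", "10.0.2.9"),       # frontend -> one cart pod
    ("10.0.1.5", "10.0.3.2"),       # frontend -> another cart pod
    ("10.0.2.9", "198.51.100.20"),  # cart -> something outside the cluster
]

for (src, dst), n in sorted(aggregate(flows).items()):
    print(f"{src} -> {dst}: {n} flow(s)")
```

If a cart pod is rescheduled and comes back with a new address, only `ip_to_pod` changes; the aggregated view still reads as frontend talking to cart.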


Corey: A running gag that I’ve had is that one of the drawbacks and appeals of Kubernetes, all at once, is that it lets you cosplay as a cloud provider, even if you don’t happen to work for one of them. And there’s a bit of truth to it, but let’s be serious here, despite what a lot of the cloud providers would wish us to believe via a bunch of marketing, there’s a tremendous number of data center environments out there, hybrid environments, and companies that are in those environments are not somehow laggards, or left behind technologically, or struggling to digitally transform. Believe it or not—I know it’s not a common narrative—but large companies generally don’t employ people who lack critical thinking skills and strategic insight. There’s usually a reason that things are the way that they are, and when you don’t understand that, my default approach is that, oh, there’s context that I’m missing, so I want to preface this with the idea there is nothing wrong in those environments. But in a purely cloud-native environment—which means that I’m very proud about having no single points of failure as I have everything routing to a single credit card that pays the cloud providers—great. What is the story for Cilium if I’m using, effectively, the managed Kubernetes options that Name Any Cloud Provider will provide for me these days? Is it at that point no longer for me or is it something that instead expresses itself in ways I’m not seeing, yet?


Liz: Yeah, so I think, as an open-source project—and it is the only CNI that’s at incubation level or beyond, so you know, it’s a CNCF-supported networking solution; you can use it out of the box, you can use it for your tiny blog application if you’ve decided to run that on Kubernetes, you can do so—things start to get much more interesting at scale. I mean, that… continuum between you know, there are people purely on managed services, there are people who are purely in the cloud, hybrid cloud is a real thing, and there are plenty of businesses who have good reasons to have some things in their own data centers, some things in the public cloud, things distributed around the world, so they need connectivity between those. And Cilium will solve a lot of those problems for you in the open-source, but also, if you’re telco scale and you have things like BGP networks between your data centers, then that’s where the paid versions of Cilium, the enterprise versions of Cilium, can help you out. And, as Isovalent, that’s our business model to have, like—we fully support or we contribute a lot of resources into the open-source Cilium, and we want that to be the best networking solution for anybody, but if you are an enterprise who wants those extra bells and whistles, and the kind of scale that, you know, a telco, or a massive retailer, or a large media organization, or name your vertical, needs, then we have solutions for that as well. And I think one of the really interesting things about the eBPF side of it is that, you know, we’re not bound to just Kubernetes, you know? We run in the kernel, and it just so happens that we have that Kubernetes interface for allocating IP addresses to endpoints that happen to be pods. But—


Corey: So, back to my crappy pile of VMs—because the hell with all this newfangled container nonsense—I can still benefit from something like Cilium?


Liz: Exactly, yeah. And there’s plenty of people using it for just load-balancing, which, why not have an eBPF-based high-performance load balancer?


Corey: Hang on, that’s taking me a second to work my way through. What is the programming language for eBPF? Is it something custom?


Liz: Right. So, when you load your BPF program into the kernel, it’s in the form of eBPF bytecode. There are people who write eBPF bytecode by hand; I am not one of those people.


Corey: There are people who used to be able to write Sendmail configs without running through the m4 preprocessor, and I don’t understand those people either.


Liz: [laugh]. So, our choices are—well, it has to be a language that can be compiled into that bytecode, and at the moment, there are two options: C, and more recently, Rust. I’m much more familiar with writing BPF code in C, and it’s slightly limited. Because these BPF programs have to be safe to run, they go through a verification process which checks that you’re not going to crash the kernel, that you’re not going to end up in some hard loop and basically make your machine completely unresponsive. We also have to know that BPF programs, you know, will only access memory that they’re supposed to and that they can’t mess up other processes. So, there’s this BPF verification step that checks, for example, that you always check that a pointer isn’t null before you dereference it.


And if you try and use a pointer in your C code, it might compile perfectly, but when you come to load it into the kernel, it gets rejected because you forgot to check that it was non-null before.
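To make the "rejected at load time because you forgot the null check" idea concrete, here is a deliberately tiny checker. It is a toy under stated assumptions: the three operations (`lookup`, `check_not_null`, `deref`) are invented, the program is a straight line with no branches, and the real kernel verifier analyzes eBPF bytecode across every possible path.

```python
from typing import List, Tuple

Instruction = Tuple[str, str]  # (operation, pointer name), both hypothetical


def verify(program: List[Instruction]) -> List[str]:
    """Return a list of problems; an empty list means the program is accepted."""
    maybe_null = set()  # pointers that have not been checked yet
    errors: List[str] = []
    for index, (op, name) in enumerate(program):
        if op == "lookup":
            # A lookup may fail, so its result could be null.
            maybe_null.add(name)
        elif op == "check_not_null":
            # In this toy, a check guards everything that follows it.
            maybe_null.discard(name)
        elif op == "deref":
            if name in maybe_null:
                errors.append(f"instruction {index}: {name} may be null")
        else:
            errors.append(f"instruction {index}: unknown operation {op}")
    return errors


safe = [("lookup", "value"), ("check_not_null", "value"), ("deref", "value")]
unsafe = [("lookup", "value"), ("deref", "value")]

print("safe:", verify(safe) or "accepted")
print("unsafe:", verify(unsafe) or "accepted")
```

Both programs would "compile" in the sense that they are well-formed lists; only the check before loading tells them apart, which is the distinction Liz draws between compiling and being accepted by the kernel.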


Corey: You try and run it, the whole thing segfaults, you see the word ‘fault’ there and well, I guess blameless just went out the window there.


Liz: [laugh]. Well, this is the thing: You cannot segfault in the kernel, you know, or at least that’s a bad [day 00:19:11]. [laugh].


Corey: You say that, but I’m very bad with computers, let’s be clear here. There’s always a way to misuse things horribly enough.


Liz: It’s a challenge. It’s pretty easy to segfault if you’re writing a kernel module. But maybe we should put that out as a challenge for the listener, to try to write something that crashes the kernel from within an eBPF program, because there’s a lot of very smart people.


Corey: Right now the blood just drained from anyone who’s listening, in the kernel space or the InfoSec space, I imagine.


Liz: Exactly. Some of my colleagues at Isovalent are thinking, “Oh, no. What’s she brought on here?” [laugh].


Corey: What have you done? Please correct me if I’m misunderstanding this. So, eBPF is a very low-level tool that requires certain amounts of braining in order [laugh] to use appropriately. That can be a heavy lift for a lot of us who don’t live in those spaces. Cilium distills this down into something that is a lot more usable and understandable for folks, and then beyond that, you wind up with Isovalent, that winds up effectively productizing and packaging this into something that becomes a lot closer to turnkey. Is that directionally accurate?


Liz: Yes, I would say that’s true. And there are also some other intermediate steps, like the CLI tools that Brendan Gregg did, where you can—I mean, a CLI is still fairly low-level, but it’s not as low-level as writing the eBPF code yourself. And you can be quite in-dep—you know, if you know what things you want to observe in the kernel, you don’t necessarily have to know how to write the eBPF code to do it if you’ve got these fairly low-level tools. You’re absolutely right that very few people will need to write their own… BPF code to run in the kernel.


Corey: Let’s move below the surface level of awareness; the same way that most of us don’t need to know how to compile our own kernel in this day and age.


Liz: Exactly.


Corey: A few people very much do, but because of their hard work, the rest of us do not.


Liz: Exactly. And for most of us, we just take the kernel for granted. You know, most people writing applications, it doesn’t really matter if—they’re just using abstractions that do things like open files for them, or create network connections, or write messages to the screen, you don’t need to know exactly how that’s accomplished through the kernel. Unless you want to get into the details of how to observe it with eBPF or something like that.


Corey: I’m much happier not knowing some of the details. I did a deep dive once into Linux system kernel internals, based on an incredibly well-written but also obnoxiously slash suspiciously thick O’Reilly book, Linux Systems Internals, and it was one of those, like, halfway through, “Can I please be excused? My brain is full.” It’s one of those things that I don’t use most of it on a day-to-day basis, but it solidified my understanding of what the computer is actually doing in a way that I will always be grateful for.


Liz: Mmm, and there are tens of millions of lines of code in the Linux kernel, so anyone who can internalize any of that is basically a superhero. [laugh].


Corey: I have nothing but respect for people who can pull that off.


Corey: Couchbase Capella Database-as-a-Service is flexible, full-featured and fully managed with built-in access via key-value, SQL, and full-text search. Flexible JSON documents aligned to your applications and workloads. Build faster with blazing fast in-memory performance and automated replication and scaling while reducing cost. Capella has the best price performance of any fully managed document database. Visit couchbase.com/screaminginthecloud to try Capella today for free and be up and running in three minutes with no credit card required. Couchbase Capella: make your data sing.


In your day job, quote-unquote—which is sort of a weird thing to say, given that you are working at an open-source company; in fact, you are the Chief Open Source Officer, so what you’re doing in the community, what you’re exploring on the open-source project side of things, it is all interrelated. I tend to have trouble myself figuring out where my job starts and stops most weeks; I’m sympathetic to it. What inspired you folks to launch a company that is, “Ah, we’re going to be in the open-source space?” Especially during a time when there’s been a lot of pushback, in some respects, about the evolution of open-source and the rise of large cloud providers, where is open-source a viable strategy or a tactic to get to an outcome that is pleasing for all parties?


Liz: Mmm. So, I wasn’t there at the beginning, for the Isovalent journey, and Cilium has been around for five or six years, now, at this point. I very strongly believe in open-source as an effective way of developing technology—good technology—and getting really good feedback and, kind of, optimizing the speed at which you can innovate. But I think it’s very important that businesses don’t think—if you’re giving away your code, you cannot also sell your code; you have to have some other thing that adds value. Maybe that’s some extra code, like in the Isovalent example, the enterprise-related enhancements that we have that aren’t part of the open-source distribution.


There’s plenty of other ways that people can add value to open-source. They can do training, they can do managed services, there’s all sorts of different—support was the classic example. But I think it’s extremely important that businesses don’t just expect that I can write a bunch of open-source code, and somehow magically, through building up a whole load of users, I will find a way to monetize that.


Corey: A bunch of nerds will build my product for me on nights and weekends. Yeah, that’s a bit of an outmoded way of thinking about these things.


Liz: Yeah exactly. And I think it’s not like everybody has perfect ability to predict the future and you might start a business—


Corey: And I have a lot of sympathy for companies who originally started with the idea of, “Well, we are the project leads. We know this code the best, therefore we are the best people in the world to run this as a service.” The rise of the hyperscale cloud providers has called that into significant question. And I feel for them because it’s difficult to completely pivot your business model when you’re already a publicly-traded company. That’s a very fraught and challenging thing to do. It means that you’re left with a bunch of options, none of them great.


Cilium as a project is not that old, neither is Isovalent, but it’s new enough in the iterative process, that you were able to avoid that particular pitfall. Instead, you’re looking at some level of making this understandable and useful to humans, almost the point where it disappears from their level of awareness that they need to think about. There’s huge value in something like that. Do you think that there is a future in which projects and companies built upon projects that follow this model are similarly going to be having challenges with hyperscale cloud providers, or other emergent threats to the ecosystem—sorry, ‘threat’ is an unfair and unkind word here—but changes to the ecosystem, as we see the world evolving in ways that most of us did not foresee?


Liz: Yeah, we’ve certainly seen some examples in the last year or two, I guess, of companies that maybe didn’t anticipate, and who necessarily has a crystal ball to anticipate how cloud providers might use their software? And I think in some cases, the cloud providers have not always been the most generous or most community-minded in their approach to how they’ve done that. But I think for a company, like Isovalent, our strong point is talent. It would be extremely rare to find the level of expertise in, you know, what is a pretty specialized area. You know, the people at Isovalent who are working on Cilium are also working on eBPF itself, and that level of expertise is, I think, pretty unrivaled.


So, we’re in such a new space with eBPF, we’ve only in the last year or so, got to the point where pretty much everyone is running a kernel that’s new enough to use eBPF. Startups do have a kind of agility that I think gives them an advantage, which I hope we’ll be able to capitalize on. I think sometimes when businesses get upset about their code being used, they probably could have anticipated it. You know, if it’s open-source, people will use your software, and you have to think of that.


Corey: “What do you mean you’re using the thing we gave away for free and you’re not paying us to use it?”


Liz: Yeah.


Corey: “Uh, did you hear what you just said?” Some of this was predictable, let’s be fair.


Liz: Yeah, and I think you really have to, as a responsible business, think about, well, what does happen if they use all the open-source code? You know, is that a problem? And as far as we’re concerned, everybody using Cilium is a fantastic… thing. We fully welcome everyone using Cilium as their data plane because the vast majority of them would use that open-source code, and that would be great, but there will be people who need those extra features and the expertise that I think we’re in a unique position to provide. So, I joined Isovalent just about a year ago, and I did that because I believe in the technology, I believe in the company, I believe in, you know, the foundations that it has in open-source.


It’s very much an open-source-first organization, which I love, and that resonates with me and how I think we can be successful. So, you know, I don’t have that crystal ball. I hope I’m right, we’ll find out. We should do this again, you know, in a couple of years and see how that’s panning out. [laugh].


Corey: I’ll book out the date now.


Liz: [laugh].


Corey: Looking back at our conversation just now, you talked about open-source, and business strategy and how that’s going to be evolving. We talked about the company, we talked about an incredibly in-depth, technical product that honestly goes significantly beyond my current level of technical awareness. And at no point in any of those aspects of the conversation did you talk about it in a way that I did not understand, nor did you come off in any way as condescending. In fact, you wrote an O’Reilly book on Container Security that’s written very much the same way. How did you learn to do that? Because it is, frankly, an incredibly rare skill.


Liz: Oh, thank you. Yeah, I think I have never been a fan of jargon. I’ve never liked it when people use a complicated acronym, or really early days in my career, there was a bit of a running joke about how everything was TLAs. And you think, well, I understand why we use an acronym to shorten things, but I don’t think we need to assume that everybody knows what everything stands for. Why can’t we explain things in simple language? Why can’t we just use ordinary terms?


And I found that really resonates. You know, if I’m doing a presentation or if I’m writing something, using straightforward language and explaining things, making sure that people understand the, kind of, fundamentals that I’m going to build my explanation on. I just think that has a—it results in people understanding, and that’s my whole point. I’m not trying to explain something to—you know, my goal is that they understand it, not that they’ve been blown away by some kind of magic. I want them to go away going, “Ah, now I understand how this bit fits with that bit,” or, “How this works.” You know?


Corey: The reason I bring it up is that it’s an incredibly undervalued skill because when people see it, they don’t often recognize it for what it is. Because when people don’t have that skill—which is common—people just write it off as oh, that person’s a bad communicator. Which I think is a little unfair. Being able to explain complex things simply is one of the most valuable yet undervalued skills that I’ve found in this entire space.


Liz: Yeah, I think people sometimes have this sort of wrong idea that vocabulary and complicated terms are somehow inherently smarter. And if you use complicated words, you sound smarter. And I just don’t think that’s accessible, and I don’t think it’s true. And sometimes I find myself listening to someone, and they’re using complicated terms or analogies that are really obscure, and I’m thinking, but could you explain that to me in words of one syllable? I don’t think you could. I think you’re… hiding—not you [laugh]. You know, people—


Corey: Yeah. No, no, that’s fair. I’ll take the accusation as [unintelligible 00:31:24] as I can get it.


Liz: [laugh]. But I think people hide behind complex words because they don’t really understand them sometimes. And yeah, I would rather people understood what I’m saying.


Corey: To me—I’ve done it through conference talks, but the way I generally learn things is by building something with them. But the way I really learn to understand something is I give a conference talk on it because, okay, great. I can now explain Git—which was one of my early technical talks—to folks who built Git. Great. Now, how about I explain it to someone who is not immersed in the space whatsoever? And if I can make it that accessible, great, then I’ve succeeded. It’s a lot harder than it looks.


Liz: Yeah, exactly. And one of the reasons why I enjoy building a talk is because I know I’ve got a pretty good understanding of this, but by the time I’ve got this talk nailed, I will know this. I might have forgotten it in six months’ time, you know, but [laugh] while I’m giving that talk, I will have a really good understanding of that because the way I want to put together a talk, I don’t want to put anything in a talk that I don’t feel I could explain. And that means I have to understand how it works.


Corey: It’s funny, this whole “don’t give talks about things you don’t understand” idea seems like a really nouveau concept, but here we are, we’re [working on it 00:32:40].


Liz: I mean, I have committed to doing talks that I don’t fully understand, knowing that—you know, with the confidence that I can find out between now and the [crosstalk 00:32:48]—


Corey: I believe that’s called a forcing function.


Liz: Yes. [laugh].


Corey: It’s one of those very high-risk stories, like, “Either I’m going to learn this in the next three months, or else I am going to have some serious egg on my face.”


Liz: Yeah, exactly, definitely a forcing function. [laugh].


Corey: I really want to thank you for taking so much time to speak with me today. If people want to learn more, where can they find you?


Liz: So, I am online pretty much everywhere as lizrice, and I am on Twitter. I’m on GitHub. And if you want to come and hang out, I am on the Cilium and eBPF Slack, and also the CNCF Slack. Yeah. So, come say hello.


Corey: There. We will put links to all of that in the [show notes 00:33:28]. Thank you so much for your time. I appreciate it.


Liz: Pleasure.


Corey: Liz Rice, Chief Open Source Officer at Isovalent. I’m Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you’ve enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you’ve hated this podcast, please leave a five-star review on your podcast platform of choice, along with an angry comment containing an eBPF program that on every packet fires off a Lambda function. Yes, it will be extortionately expensive; almost half as much money as a Managed NAT Gateway.


Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.


Announcer: This has been a HumblePod production. Stay humble.

What is Screaming in the Cloud?

Screaming in the Cloud with Corey Quinn features conversations with domain experts in the world of Cloud Computing. Topics discussed include AWS, GCP, Azure, Oracle Cloud, and the "why" behind how businesses are coming to think about the Cloud.

Corey: This episode is sponsored in part by our friends at Sysdig. Sysdig is the solution for securing DevOps. They have a blog post that went up recently about how an insecure AWS Lambda function could be used as a pivot point to get access into your environment. They’ve also gone deep in-depth with a bunch of other approaches to how DevOps and security are inextricably linked. To learn more, visit sysdig.com and tell them I sent you. That’s S-Y-S-D-I-G dot com. My thanks to them for their continued support of this ridiculous nonsense.

Corey: Welcome to Screaming in the Cloud. I’m Corey Quinn. One of the interesting things about hanging out in the cloud ecosystem as long as I have and as, I guess, closely tied to Amazon as I have been, is that you learn that you never quite are able to pronounce things the way that people pronounce them internally. In-house pronunciations are always a thing. My guest today is Liz Rice, the Chief Open Source Officer at Isovalent, and they’re responsible for, among other things, the Cilium open-source project, which is around eBPF, which I can only assume is internally pronounced as ‘Ehbehpf’. Liz, thank you for joining me today and suffering my pronunciation slings and arrows.

Liz: I have never heard ‘Ehbehpf’ before, but I may have to adopt it. That’s great.

Corey: You also are currently—in a term that is winding down if I’m not misunderstanding—you were the co-chair of KubeCon and CloudNativeCon at the CNCF, and you are also currently on the technical oversight committee for the foundation.

Liz: Yeah, yeah. I’m currently the chair, in fact, of the technical oversight committee.

Corey: And now that Amazon has joined, I assumed that they had taken their horrible pronunciation habits, like calling AMIs ‘Ah-mies’ and whatnot, and started spreading them throughout the ecosystem with wild abandon.

Liz: Are we going to have to start calling CNCF ‘Ka’Nff’ or something?

Corey: Exactly. They’re very frugal, by which I mean they never buy a vowel. So yeah, it tends to be an ongoing challenge. Joking and all the rest aside, let’s start, I guess, at the macro view. The CNCF does an awful lot of stuff, where if you look at the CNCF landscape, for example, like, I think some of my jokes on the internet go a bit too far, but you look at this thing and last time I checked, there were something like 400 or 500 different players in various spaces.

And it’s a very useful diagram, don’t get me wrong by any stretch of the imagination, but it also is one of those things that is so staggeringly vast that I’ve got to level with you on this one, given my old, ancient sysadmin roots, “The hell with it. I’m going to run some VMs in a three-tiered architecture just like grandma and grandpa used to do,” and call it good. Not really how the industry has evolved, but it’s overwhelming.

Liz: But that might be the right solution for your use case so, you know, don’t knock it if it works.

Corey: Oh, yeah. If it’s a terrible architecture and it works, is it really that terrible of an architecture? One wonders.

Liz: Yeah, yeah. I mean, I’m definitely not one of those people who thinks, you know, every solution has the same—you know, is solved by the same hammer, you know, all problems are not the same nail. So, I am a big fan of a lot of the CNCF projects, but that doesn’t mean to say I think those are the only ways to deploy software. You know, things like Lambda are a really great example of something that is super useful and very applicable for lots of applications and for lots of development teams. Not necessarily the right solution for everything. And for other people, they need all the bells and whistles that something like Kubernetes gives them. You know, horses for courses.

Corey: It’s very easy for me to make fun of just about any company or service or product, but the thing that always makes me set that aside and get down to brass tacks has been, “Okay, great. You can build whatever you want. You can tell whatever glorious marketing narrative you wish to craft, but let’s talk to a real customer because once we do that, then if you’re solving a problem that someone is having in the wild, okay, now it’s no longer just this theoretical exercise and PowerPoint. Now, let’s actually figure out how things work when the rubber meets the road.”

So, let’s start, I guess, with… I’ll leave it to you. Isovalent are the creators of the Cilium eBPF-based networking project.

Liz: Yeah.

Corey: And eBPF is the part of that I think I’m the most familiar with having heard the term. Would you rather start on the company side or on the eBPF side?

Liz: Oh, I don’t mind. Let’s—why don’t we start with eBPF? Yeah.

Corey: Cool. So easy, ridiculous question. I know that it’s extremely important because Brendan Gregg periodically gets on stage and tells amazing stories about this; the last time he did stuff like that, I went stumbling down into the rabbit hole of DTrace, and I have never fully regretted doing that, nor completely forgiven him. What is eBPF?

Liz: So, it stands for extended Berkeley Packet Filter, and we can pretty much just throw away those words because it’s not terribly helpful. What eBPF allows you to do is to run custom programs inside the kernel. So, we can trigger these programs to run, maybe because a network packet arrived, or because a particular function within the kernel has been called, or a tracepoint has been hit. There are tons of places you can attach these programs to, or events you can attach programs to.

And when that event happens, you can run your custom code. And that can change the behavior of the kernel, which is, you know, great power and great responsibility, but incredibly powerful. So Brendan, for example, has done a ton of really great pioneering work showing how you can attach these eBPF programs to events, use that to collect metrics, and lo and behold, you have amazing visibility into what’s happening in your system. And he’s built tons of different tools for observing everything from, I don’t know, memory use to file opens to—there’s just endless, dozens and dozens of tools that Brendan, I think, was probably the first to build. And now there’s this sort of new generation of eBPF-based tooling that is kind of taking that legacy, turning it into maybe more, I’m going to say, user-friendly interfaces, you know, with GUIs, and hooking them up to metrics platforms, and in the case of Cilium, using it for networking and hooking it into Kubernetes identities, and making the information about network flows meaningful in the context of Kubernetes, where things like IP addresses are ephemeral and not very useful for very long; I mean, they just change at any moment.
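
As an illustration of the attach-and-trigger idea Liz describes, here is a minimal sketch in plain Python. It is a userspace toy, not real eBPF: the `ToyKernel` class, its `attach` and `fire` methods, and the `file_open` event name are all invented for this example, and the counter only stands in for the kind of shared data an observability tool would read.

```python
from collections import Counter, defaultdict


class ToyKernel:
    """Userspace stand-in for kernel attach points (hypothetical, not a real API)."""

    def __init__(self):
        self._hooks = defaultdict(list)  # event name -> attached handlers

    def attach(self, event, handler):
        # Loosely analogous to attaching a program to a tracepoint or function.
        self._hooks[event].append(handler)

    def fire(self, event, **context):
        # When the event happens, every attached handler runs with its context.
        for handler in self._hooks[event]:
            handler(context)


opens_by_process = Counter()  # plays the role of data shared with user space


def count_file_opens(ctx):
    # The "custom code": record which process opened a file.
    opens_by_process[ctx["comm"]] += 1


kernel = ToyKernel()
kernel.attach("file_open", count_file_opens)

# Simulated events; on a real system the kernel would generate these.
kernel.fire("file_open", comm="nginx", path="/etc/nginx/nginx.conf")
kernel.fire("file_open", comm="nginx", path="/var/log/access.log")
kernel.fire("file_open", comm="sshd", path="/etc/passwd")
kernel.fire("packet_in", length=64)  # nothing attached here, so nothing runs

print(dict(opens_by_process))
```

The point of the sketch is only the shape: code is attached to a named event, it runs each time that event fires, and the metrics fall out as a side effect.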

Corey: I guess I’m trying to figure out what part of the stack this winds up applying to because you talk about, at least to my mind, it sounds like a few different levels all at once: You talk about running code inside of the kernel, which is really close to the hardware—it’s oh, great. It’s adventures in assembly is almost what I’m hearing here—but then you also talk about using this with GUIs, for example, and operating on individual packets to run custom programs. When you talk about running custom programs, are we talking things that are a bit closer to, “Oh, modify this one field of that packet and then call it good,” or are you talking, “Now, we launch Microsoft Word.”

Liz: Much more the former category. So yeah, let’s inspect this packet and maybe change it a bit, or send it to a different—you know, maybe it was going to go to one interface, but we’re going to send it to a different interface; maybe we’re going to modify that packet; maybe we’re going to throw the packet on the floor because we don’t—there’s really great security use cases for inspecting packets and saying, “This is a bad packet, I do not want to see this packet, I’m just going to discard it.” And there’s some, what they call ‘Packet of Death’ vulnerabilities that have been mitigated in that way. And the real beauty of it is you just load these programs dynamically. So, you can change the kernel on the fly and affect that behavior, and it just immediately has an effect.

If there are processes already running, they get instrumented immediately. So, maybe you run a BPF program to spot when a file is opened. New processes, existing processes, containerized processes, it doesn’t matter; they’ll all be detected by your program if it’s observing file open events.
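
The inspect, modify, redirect, or discard choices Liz lists can be sketched as a small decision function. This is an assumption-laden model in plain Python rather than a real packet program: the `Packet` fields, the deny list, the size limit, and the port rule are hypothetical, and the verdict names are only loosely modeled on the pass, drop, and redirect outcomes that packet-processing hooks typically return.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    PASS = "pass"
    DROP = "drop"
    REDIRECT = "redirect"


@dataclass
class Packet:
    src_ip: str
    dst_port: int
    payload: bytes
    out_iface: str = "eth0"


# Hypothetical policy values, chosen only for this example.
BLOCKED_SOURCES = {"203.0.113.7"}
MAX_PAYLOAD = 1500


def inspect(pkt: Packet) -> Verdict:
    """Decide what happens to one packet: discard, steer elsewhere, or let through."""
    if pkt.src_ip in BLOCKED_SOURCES or len(pkt.payload) > MAX_PAYLOAD:
        return Verdict.DROP          # "throw the packet on the floor"
    if pkt.dst_port == 8080:
        pkt.out_iface = "eth1"       # send it out of a different interface
        return Verdict.REDIRECT
    return Verdict.PASS


bad = Packet("203.0.113.7", 443, b"hello")
steered = Packet("198.51.100.4", 8080, b"hello")
normal = Packet("198.51.100.4", 443, b"hello")

for pkt in (bad, steered, normal):
    print(pkt.src_ip, pkt.dst_port, inspect(pkt).value, pkt.out_iface)
```

Swapping in a different `inspect` function while traffic keeps flowing is the toy equivalent of loading a new program dynamically.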

Corey: Is this primarily used from a security perspective? Is it used for—what are the common use cases for something like this?

Liz: There’s three main buckets, I would say: Networking, observability, and security. And in Cilium, we’re kind of involved in some aspects of all those three things, and there are plenty of other projects that are also focusing on one or other of those aspects.

Corey: This is where, I guess, the challenge I run into with the whole CNCF landscape is. It’s like, when I started down this path that I’m on now, I realized that, “Oh, I have to learn what all the different AWS services do.” This was widely regarded as a mistake. They are not Pokémon; I do not need to catch them all. The CNCF landscape applies very similarly in that respect. What is the real-world problem space for which eBPF and/or things like Cilium that leverage eBPF—because eBPF does sound fairly low-level—that turn this into something that solves a problem people have? In other words, what is the problem that Cilium should be the go-to answer for when someone says, “I have this thing that hurts.”

Liz: So, at one level, Cilium is a networking solution. So, it’s a Kubernetes CNI. You plug it in to provide connectivity between your applications that are running in pods. Those pods have to talk to each other somehow and Cilium will connect those pods together for you in a very efficient way. One of the really interesting things about eBPF and networking is we can bypass some of the networking stack.

So, if we are running in containers, we’re running our applications in containers in pods, and those pods usually will have their own networking namespace. And that means they’ve got their own networking stack. So, a packet that arrives on your machine has to go through the networking stack on that host machine, go across a virtual interface into your pod, and then go through the networking stack in that pod. And that’s kind of inefficient. But with eBPF, we can look at the packet the moment it’s come into the kernel—in fact in some cases, if you have the right networking interfaces, you can do it while it’s still on the network interface card—so you look at that packet and say, “Well, I know what pod that’s destined for, I can just send it straight there.” I don’t have to go through the whole networking stack in the kernel because I already know exactly where it’s going. And that has some real performance improvements.
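
The shortcut Liz describes, looking at the packet early and sending it straight to the right pod, can be sketched as a lookup that skips the longer path when the destination is already known. This is a plain-Python toy under stated assumptions: the stage names in `FULL_PATH` are a deliberately simplified stand-in for the host and pod networking stacks, not an accurate list of kernel layers, and the pod names and addresses are hypothetical.

```python
# Simplified, illustrative stages a packet crosses without any shortcut.
FULL_PATH = [
    "host network stack",
    "virtual interface into the pod",
    "pod network stack",
]

# Hypothetical table of known pod addresses, kept up to date elsewhere.
pod_by_ip = {
    "10.0.1.5": "pod-frontend",
    "10.0.1.9": "pod-backend",
}


def deliver_via_full_stack(dst_ip):
    """Walk every stage, then hand the packet to whichever pod owns the address."""
    return pod_by_ip.get(dst_ip), list(FULL_PATH)


def deliver_via_early_lookup(dst_ip):
    """Look up the destination as soon as the packet arrives and skip the rest."""
    pod = pod_by_ip.get(dst_ip)
    if pod is None:
        # Unknown destination: fall back to the ordinary path.
        return deliver_via_full_stack(dst_ip)
    return pod, ["early lookup"]


for ip in ("10.0.1.5", "192.0.2.44"):
    pod, stages = deliver_via_early_lookup(ip)
    print(ip, "->", pod, "via", stages)
```

The saving in the sketch is simply the number of stages crossed; the real gain depends on the kernel, the network interface, and the workload.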

Corey: That makes sense. In my explorations—we’ll call it—with Kubernetes, it feels like the universe—at least at the time I went looking into it—was, “Step One, here’s how to wind up launching Kubernetes to run a blog.” Which is a bit like using a chainsaw to wind up cutting a sandwich. Okay, massively overpowered but I get the basic idea, like, “Okay, what’s project Step Two?” It’s like, “Oh, great. Go build Google.”

Liz: [laugh].

Corey: Okay, great. It feels like there’s some intermediary steps that have been sort of glossed over here. And at the small scale that I kicked the tires on, things like networking performance never even entered the equation; it was more about get the thing up and running. But yeah, at scale, when you start seeing huge numbers of containers being orchestrated across a wide variety of hosts, that has serious repercussions and explains an awful lot. Is this the sort of thing that gets leveraged by cloud providers themselves, is it something that gets built in mostly on-prem environments, or is it something that rides in, almost, user-land for most of these use cases that customers are bringing to those environments? I’m sorry, users, not customers. I’m too used to the Amazonian phrasing of everyone as a customer. No, no, they are users in an open-source project.

Liz: [laugh]. Yeah, so if you’re using GKE, the GKE Dataplane V2 is using Cilium. Alibaba Cloud uses Cilium. AWS is using Cilium for EKS Anywhere. So, these are really, I think, great signals that it’s super scalable.

And it’s also not just about the connectivity, but also about being able to see your network flows and debug them. Because, like you say, that day one, your blog is up and running, and day two, you’ve got some DNS issue that you need to debug, and how are you going to do that? And because Cilium is working with Kubernetes, it knows about the individual pods, it’s aware of the IP addresses for those pods, and it can map those to, you know, what’s the pod, what service is that pod involved with. And we have a component of Cilium called Hubble that gives you the flows, the network flows, between services. So, you know, we’ve probably all seen diagrams showing Service A talking to Service B, Service C, some external connectivity, and Hubble can show you those flows between services and the outside world, regardless of how the IP addresses may be changing underneath you, and aggregating network flows into those services that make sense to a human who’s looking at a Kubernetes deployment.
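
To show why mapping addresses to services makes flows readable even when IP addresses change, here is a small sketch in plain Python. It does not use Hubble's API or data model; the `aggregate` function, the service names, and the addresses are hypothetical, and any address without a known service is simply labeled `external`.

```python
from collections import Counter


def aggregate(flows, ip_to_service):
    """Count flows per (source service, destination service) pair."""
    totals = Counter()
    for src_ip, dst_ip in flows:
        src = ip_to_service.get(src_ip, "external")
        dst = ip_to_service.get(dst_ip, "external")
        totals[(src, dst)] += 1
    return totals


# Address-to-service mapping before and after the backend pod is rescheduled.
mapping_before = {"10.0.1.5": "frontend", "10.0.2.8": "backend"}
mapping_after = {"10.0.1.5": "frontend", "10.0.2.44": "backend"}

flows_before = [("10.0.1.5", "10.0.2.8"), ("10.0.1.5", "10.0.2.8")]
flows_after = [("10.0.1.5", "10.0.2.44"), ("10.0.1.5", "198.51.100.20")]

totals = aggregate(flows_before, mapping_before) + aggregate(flows_after, mapping_after)

for (src, dst), count in sorted(totals.items()):
    print(f"{src} -> {dst}: {count}")
```

Seen by raw address, the backend traffic would look like two unrelated destinations; seen by service, it is one frontend-to-backend relationship plus one external call.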

Corey: A running gag that I’ve had is that one of the drawbacks and appeals of Kubernetes, all at once, is that it lets you cosplay as a cloud provider, even if you don’t happen to work for one of them. And there’s a bit of truth to it, but let’s be serious here, despite what a lot of the cloud providers would wish us to believe via a bunch of marketing, there’s a tremendous number of data center environments out there, hybrid environments, and companies that are in those environments are not somehow laggards, or left behind technologically, or struggling to digitally transform. Believe it or not—I know it’s not a common narrative—but large companies generally don’t employ people who lack critical thinking skills and strategic insight. There’s usually a reason that things are the way that they are, and when you don’t understand that, my default approach is that, oh, there’s context that I’m missing, so I want to preface this with the idea there is nothing wrong in those environments. But in a purely cloud-native environment—which means that I’m very proud about having no single points of failure as I have everything routing to a single credit card that pays the cloud providers—great. What is the story for Cilium if I’m using, effectively, the managed Kubernetes options that Name Any Cloud Provider will provide for me these days? Is it at that point no longer for me or is it something that instead expresses itself in ways I’m not seeing, yet?

Liz: Yeah, so I think, as an open-source project—and it is the only CNI that’s at incubation level or beyond, so you know, it’s a CNCF-supported networking solution; you can use it out of the box, you can use it for your tiny blog application if you’ve decided to run that on Kubernetes, you can do so—things start to get much more interesting at scale. I mean, that… continuum between you know, there are people purely on managed services, there are people who are purely in the cloud, hybrid cloud is a real thing, and there are plenty of businesses who have good reasons to have some things in their own data centers, some things in the public cloud, things distributed around the world, so they need connectivity between those. And Cilium will solve a lot of those problems for you in the open-source, but also, if you’re telco scale and you have things like BGP networks between your data centers, then that’s where the paid versions of Cilium, the enterprise versions of Cilium, can help you out. And, as Isovalent, that’s our business model to have, like—we fully support or we contribute a lot of resources into the open-source Cilium, and we want that to be the best networking solution for anybody, but if you are an enterprise who wants those extra bells and whistles, and the kind of scale that, you know, a telco, or a massive retailer, or a large media organization, or name your vertical, then we have solutions for that as well. And I think one of the really interesting things about the eBPF side of it is that, you know, we’re not bound to just Kubernetes, you know? We run in the kernel, and it just so happens that we have that Kubernetes interface for allocating IP addresses to endpoints that happen to be pods. But—

Corey: So, back to my crappy pile of VMs—because the hell with all this newfangled container nonsense—I can still benefit from something like Cilium?

Liz: Exactly, yeah. And there’s plenty of people using it for just load-balancing, which, why not have an eBPF-based high-performance load balancer?

Corey: Hang on, that’s taking me a second to work my way through. What is the programming language for eBPF? Is it something custom?

Liz: Right. So, when you load your BPF program into the kernel, it’s in the form of eBPF bytecode. There are people who write eBPF bytecode by hand; I am not one of those people.

Corey: There are people who used to be able to write Sendmail configs without running through the m4 preprocessor, and I don’t understand those people either.

Liz: [laugh]. So, our choices are—well, it has to be a language that can be compiled into that bytecode, and at the moment, there are two options: C, and more recently, Rust. So, the C code, I’m much more familiar with writing BPF code in C, it’s slightly limited. So, because these BPF programs have to be safe to run, they go through a verification process which checks that you’re not going to crash the kernel, that you’re not going to end up in some hard loop, and basically make your machine completely unresponsive. We also have to know that BPF programs, you know, they’ll only access memory that they’re supposed to and that they can’t mess up other processes. So, there’s this BPF verification step that checks, for example, that you always check that a pointer isn’t nil before you dereference it.

And if you try and use a pointer in your C code, it might compile perfectly, but when you come to load it into the kernel, it gets rejected because you forgot to check that it was non-null before.
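
The null-check rule Liz describes can be illustrated with a toy checker. This is a heavily simplified sketch in plain Python, not the kernel's verifier: it handles only straight-line programs, and the three operation names (`lookup`, `check_null`, `deref`) and the tuple format are invented for this example. The real verifier tracks far more state and explores every path through the program.

```python
def verify(program):
    """Reject a straight-line toy program that dereferences a maybe-null pointer.

    Each instruction is an (operation, pointer_name) tuple.
    """
    maybe_null = set()
    for index, (op, ptr) in enumerate(program):
        if op == "lookup":
            maybe_null.add(ptr)          # a lookup result might be null
        elif op == "check_null":
            maybe_null.discard(ptr)      # the continuing path has proven it non-null
        elif op == "deref":
            if ptr in maybe_null:
                return False, f"instruction {index}: {ptr} may be null"
        else:
            return False, f"instruction {index}: unknown operation {op}"
    return True, "ok"


unchecked = [("lookup", "value"), ("deref", "value")]
checked = [("lookup", "value"), ("check_null", "value"), ("deref", "value")]

print(verify(unchecked))
print(verify(checked))
```

Both toy programs are well-formed, much as both versions of the C code would compile; only the one with the explicit check is accepted at load time.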

Corey: You try and run it, the whole thing segfaults, you see the word ‘fault’ there and well, I guess blameless just went out the window there.

Liz: [laugh]. Well, this is the thing: You cannot segfault in the kernel, you know, or at least that’s a bad [day 00:19:11]. [laugh].

Corey: You say that, but I’m very bad with computers, let’s be clear here. There’s always a way to misuse things horribly enough.

Liz: It’s a challenge. It’s pretty easy to segfault if you’re writing a kernel module. But maybe we should put that out as a challenge for the listener, to try to write something that crashes the kernel from within an eBPF program because there’s a lot of very smart people.

Corey: Right now the blood just drained from anyone who’s listening, in the kernel space or the InfoSec space, I imagine.

Liz: Exactly. Some of my colleagues at Isovalent are thinking, “Oh, no. What’s she brought on here?” [laugh].

Corey: What have you done? Please correct me if I’m misunderstanding this. So, eBPF is a very low-level tool that requires certain amounts of braining in order [laugh] to use appropriately. That can be a heavy lift for a lot of us who don’t live in those spaces. Cilium distills this down into something that is all a lot more usable and understandable for folks, and then beyond that, you wind up with Isovalent, that winds up effectively productizing and packaging this into something that becomes a lot closer to turnkey. Is that directionally accurate?

Liz: Yes, I would say that’s true. And there are also some other intermediate steps, like the CLI tools that Brendan Gregg did, where you can—I mean, a CLI is still fairly low-level, but it’s not as low-level as writing the eBPF code yourself. And you can be quite in-dep—you know, if you know what things you want to observe in the kernel, you don’t necessarily have to know how to write the eBPF code to do it, if you’ve got these fairly low-level tools to do it. You’re absolutely right that very few people will need to write their own… BPF code to run in the kernel.

Corey: Let’s move below the surface level of awareness; the same way that most of us don’t need to know how to compile our own kernel in this day and age.

Liz: Exactly.

Corey: A few people very much do, but because of their hard work, the rest of us do not.

Liz: Exactly. And for most of us, we just take the kernel for granted. You know, most people writing applications, it doesn’t really matter if—they’re just using abstractions that do things like open files for them, or create network connections, or write messages to the screen, you don’t need to know exactly how that’s accomplished through the kernel. Unless you want to get into the details of how to observe it with eBPF or something like that.

Corey: I’m much happier not knowing some of the details. I did a deep dive once into Linux system kernel internals, based on an incredibly well-written but also obnoxiously slash suspiciously thick O’Reilly book, Linux Systems Internals, and it was one of those, like, halfway through, “Can I please be excused? My brain is full.” It’s one of those things that I don’t use most of it on a day-to-day basis, but it’s solidified my understanding of what the computer is actually doing in a way that I will always be grateful for.

Liz: Mmm, and there are tens of millions of lines of code in the Linux kernel, so anyone who can internalize any of that is basically a superhero. [laugh].

Corey: I have nothing but respect for people who can pull that off.

Corey: Couchbase Capella Database-as-a-Service is flexible, full-featured and fully managed with built in access via key-value, SQL, and full-text search. Flexible JSON documents aligned to your applications and workloads. Build faster with blazing fast in-memory performance and automated replication and scaling while reducing cost. Capella has the best price performance of any fully managed document database. Visit couchbase.com/screaminginthecloud to try Capella today for free and be up and running in three minutes with no credit card required. Couchbase Capella: make your data sing.

In your day job, quote-unquote—which is sort of a weird thing to say, given that you are working at an open-source company; in fact, you are the Chief Open Source Officer, so what you’re doing in the community, what you’re exploring on the open-source project side of things, it is all interrelated. I tend to have trouble myself figuring out where my job starts and stops most weeks; I’m sympathetic to it. What inspired you folks to launch a company that is, “Ah, we’re going to be in the open-source space?” Especially during a time when there’s been a lot of pushback, in some respects, about the evolution of open-source and the rise of large cloud providers, where is open-source a viable strategy or a tactic to get to an outcome that is pleasing for all parties?

Liz: Mmm. So, I wasn’t there at the beginning, for the Isovalent journey, and Cilium has been around for five or six years, now, at this point. I very strongly believe in open-source as an effective way of developing technology—good technology—and getting really good feedback and, kind of, optimizing the speed at which you can innovate. But I think it’s very important that businesses don’t think—if you’re giving away your code, you cannot also sell your code; you have to have some other thing that adds value. Maybe that’s some extra code, like in the Isovalent example, the enterprise-related enhancements that we have that aren’t part of the open-source distribution.

There’s plenty of other ways that people can add value to open-source. They can do training, they can do managed services, there’s all sorts of different—support was the classic example. But I think it’s extremely important that businesses don’t just expect that I can write a bunch of open-source code, and somehow magically, through building up a whole load of users, I will find a way to monetize that.

Corey: A bunch of nerds will build my product for me on nights and weekends. Yeah, that’s a bit of an outmoded way of thinking about these things.

Liz: Yeah exactly. And I think it’s not like everybody has perfect ability to predict the future and you might start a business—

Corey: And I have a lot of sympathy for companies who originally started with the idea of, “Well, we are the project leads. We know this code the best, therefore we are the best people in the world to run this as a service.” The rise of the hyperscale cloud providers has called that into significant question. And I feel for them because it’s difficult to completely pivot your business model when you’re already a publicly-traded company. That’s a very fraught and challenging thing to do. It means that you’re left with a bunch of options, none of them great.

Cilium as a project is not that old, neither is Isovalent, but it’s new enough in the iterative process that you were able to avoid that particular pitfall. Instead, you’re looking at some level of making this understandable and useful to humans, almost to the point where it disappears from their level of awareness that they need to think about. There’s huge value in something like that. Do you think that there is a future in which projects and companies built upon projects that follow this model are similarly going to be having challenges with hyperscale cloud providers, or other emergent threats to the ecosystem—sorry, ‘threat’ is an unfair and unkind word here—but changes to the ecosystem, as we see the world evolving in ways that most of us did not foresee?

Liz: Yeah, we’ve certainly seen some examples in the last year or two, I guess, of companies that maybe didn’t anticipate, and who necessarily has a crystal ball to anticipate how cloud providers might use their software? And I think in some cases, the cloud providers have not always been the most generous or most community-minded in their approach to how they’ve done that. But I think for a company, like Isovalent, our strong point is talent. It would be extremely rare to find the level of expertise in, you know, what is a pretty specialized area. You know, the people at Isovalent who are working on Cilium are also working on eBPF itself, and that level of expertise is, I think, pretty unrivaled.

So, we’re in such a new space with eBPF, we’ve only in the last year or so, got to the point where pretty much everyone is running a kernel that’s new enough to use eBPF. Startups do have a kind of agility that I think gives them an advantage, which I hope we’ll be able to capitalize on. I think sometimes when businesses get upset about their code being used, they probably could have anticipated it. You know, if it’s open-source, people will use your software, and you have to think of that.

Corey: “What do you mean you’re using the thing we gave away for free and you’re not paying us to use it?”

Liz: Yeah.

Corey: “Uh, did you hear what you just said?” Some of this was predictable, let’s be fair.

Liz: Yeah, and I think you really have to, as a responsible business, think about, well, what does happen if they use all the open-source code? You know, is that a problem? And as far as we’re concerned, everybody using Cilium is a fantastic… thing. We fully welcome everyone using Cilium as their data plane because the vast majority of them would use that open-source code, and that would be great, but there will be people who need those extra features and the expertise that I think we’re in a unique position to provide. So, I joined Isovalent just about a year ago, and I did that because I believe in the technology, I believe in the company, I believe in, you know, the foundations that it has in open-source.

It’s very much an open-source-first organization, which I love, and that resonates with me and how I think we can be successful. So, you know, I don’t have that crystal ball. I hope I’m right, we’ll find out. We should do this again, you know, in a couple of years and see how that’s panning out. [laugh].

Corey: I’ll book out the date now.

Liz: [laugh].

Corey: Looking back at our conversation just now, you talked about open-source, and business strategy and how that’s going to be evolving. We talked about the company, we talked about an incredibly in-depth, technical product that honestly goes significantly beyond my current level of technical awareness. And at no point in any of those aspects of the conversation did you talk about it in a way that I did not understand, nor did you come off in any way as condescending. In fact, you wrote an O’Reilly book on Container Security that’s written very much the same way. How did you learn to do that? Because it is, frankly, an incredibly rare skill.

Liz: Oh, thank you. Yeah, I think I have never been a fan of jargon. I’ve never liked it when people use a complicated acronym, or really early days in my career, there was a bit of a running joke about how everything was TLAs. And you think, well, I understand why we use an acronym to shorten things, but I don’t think we need to assume that everybody knows what everything stands for. Why can’t we explain things in simple language? Why can’t we just use ordinary terms?

And I found that really resonates. You know, if I’m doing a presentation or if I’m writing something, using straightforward language and explaining things, making sure that people understand the, kind of, fundamentals that I’m going to build my explanation on. I just think that has a—it results in people understanding, and that’s my whole point. I’m not trying to explain something to—you know, my goal is that they understand it, not that they’ve been blown away by some kind of magic. I want them to go away going, “Ah, now I understand how this bit fits with that bit,” or, “How this works.” You know?

Corey: The reason I bring it up is that it’s an incredibly undervalued skill because when people see it, they don’t often recognize it for what it is. Because when people don’t have that skill—which is common—people just write it off as oh, that person’s a bad communicator. Which I think is a little unfair. Being able to explain complex things simply is one of the most valuable yet undervalued skills that I’ve found in this entire space.

Liz: Yeah, I think people sometimes have this sort of wrong idea that vocabulary and complicated terms are somehow inherently smarter. And if you use complicated words, you sound smarter. And I just don’t think that’s accessible, and I don’t think it’s true. And sometimes I find myself listening to someone, and they’re using complicated terms or analogies that are really obscure, and I’m thinking, but could you explain that to me in words of one syllable? I don’t think you could. I think you’re… hiding—not you [laugh]. You know, people—

Corey: Yeah. No, no, that’s fair. I’ll take the accusation as [unintelligible 00:31:24] as I can get it.

Liz: [laugh]. But I think people hide behind complex words because they don’t really understand them sometimes. And yeah, I would rather people understood what I’m saying.

Corey: To me—I’ve done it through conference talks, but the way I generally learn things is by building something with them. But the way I really learn to understand something is I give a conference talk on it because, okay, great. I can now explain Git—which was one of my early technical talks—to folks who built Git. Great. Now, how about I explain it to someone who is not immersed in the space whatsoever? And if I can make it that accessible, great, then I’ve succeeded. It’s a lot harder than it looks.

Liz: Yeah, exactly. And one of the reasons why I enjoy building a talk is because I know I’ve got a pretty good understanding of this, but by the time I’ve got this talk nailed, I will know this. I might have forgotten it in six months’ time, you know, but [laugh] while I’m giving that talk, I will have a really good understanding of that because the way I want to put together a talk, I don’t want to put anything in a talk that I don’t feel I could explain. And that means I have to understand how it works.

Corey: It’s funny, this whole “don’t give talks about things you don’t understand” seems like it’s really a nouveau concept, but here we are, we’re [working on it 00:32:40].

Liz: I mean, I have committed to doing talks that I don’t fully understand, knowing that—you know, with the confidence that I can find out between now and the [crosstalk 00:32:48]—

Corey: I believe that’s called a forcing function.

Liz: Yes. [laugh].

Corey: It’s one of those very high-risk stories, like, “Either I’m going to learn this in the next three months, or else I am going to have some serious egg on my face.”

Liz: Yeah, exactly, definitely a forcing function. [laugh].

Corey: I really want to thank you for taking so much time to speak with me today. If people want to learn more, where can they find you?

Liz: So, I am online pretty much everywhere as lizrice, and I am on Twitter. I’m on GitHub. And if you want to come and hang out, I am on the Cilium and eBPF Slack, and also the CNCF Slack. Yeah. So, come say hello.

Corey: There. We will put links to all of that in the [show notes 00:33:28]. Thank you so much for your time. I appreciate it.

Liz: Pleasure.

Corey: Liz Rice, Chief Open Source Officer at Isovalent. I’m Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you’ve enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you’ve hated this podcast, please leave a five-star review on your podcast platform of choice, along with an angry comment containing an eBPF program that on every packet fires off a Lambda function. Yes, it will be extortionately expensive; almost half as much money as a Managed NAT Gateway.

Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.

Announcer: This has been a HumblePod production. Stay humble.