Serverless Chats

In this episode, Jeremy chats with Joe Duffy about the biggest challenges for teams working in the cloud, how infrastructure-as-code (IaC) helps teams collaborate better, the distinction between serverless developers and cloud engineers, and a lot more.

Show Notes

About Joe Duffy

Joe Duffy is cofounder and CEO of Pulumi. Prior to founding Pulumi, Joe was a longtime leader in Microsoft’s Developer Division, Operating Systems Group, and Microsoft Research. Most recently, he was Director of Engineering and Technical Strategy for developer tools, where part of his responsibilities included managing groups building the C#, C++, Visual Basic, and F# languages. Joe created teams for several successful distributed programming platforms, initiated and executed on efforts to take .NET open source and cross-platform, and was instrumental in Microsoft’s company-wide open-source transformation. Joe founded Pulumi in 2018 with Eric Rudder, the former Chief Technical Strategy Officer at Microsoft.

Twitter: twitter.com/funcofjoe
Pulumi: www.pulumi.com/
Pulumi Twitter: twitter.com/PulumiCorp

Watch this episode on YouTube: https://youtu.be/MYOGfK9PHM8

Transcript

Jeremy: Hi everyone. I'm Jeremy Daly and this is Serverless Chats. Today I'm speaking with Joe Duffy. Hey Joe. Thanks for joining me.

Joe: Hey Jeremy. Thanks for having me.

Jeremy: You are the CEO and founder of Pulumi. Can you give the listeners a little bit about your background and tell us what Pulumi does?

Joe: Yeah, happy to. I founded Pulumi three years ago. Before that, I was an early engineer on the .NET framework at Microsoft. I was actually at Microsoft for a hearty 13 years working in and around developer tools the entire time, managing groups. I actually led the languages team before leaving, helped with the open source transformation at Microsoft, which was really cool to be a part of, and then founded Pulumi. Pulumi is a modern infrastructure as code platform that really brings everything we know and love about application development using great programming languages, great tooling, and actually brings it over to the infrastructure side of the house and really trying to help both infrastructure teams be super productive with great tools but also empower developers to use more of the cloud as part of their application architecture itself.

Jeremy: Awesome. You just mentioned there cloud engineers or the infrastructure team and then serverless developers, so I look at this and I tend to think... especially with smaller organizations... they're almost becoming one and the same. But as you get larger organizations and you start to separate that responsibility... and whether you have separate cloud teams and separate developers or different cells that do that kind of stuff, there is kind of this separation between the infrastructure dev ops people and the serverless developers. Can you explain that difference?

Joe: Yeah. And I agree. I think developers are doing more infrastructure now than they've ever done in the past and I think serverless is really forcing this issue a bit. Is a serverless function infrastructure or is it code? The line's a little blurry. But there's clear things that are still in the infrastructure domain: for example, setting up a virtual private cloud in Amazon, setting up a network, setting up a Kubernetes cluster. Even if you're going to run serverless functions within Kubernetes somebody's got to manage the cluster. Somebody's got to think about security. Somebody's got to think about monitoring. Some of that actually falls on the applications side. A lot of it falls on the infrastructure side.

I think of it as there are deep domain experts in the infrastructure space just like there are deep domain experts in the applications space. I think the magic of what we're seeing with serverless in particular is that the line is getting a little blurry. It's more of a policy decision, I like to say, than a technology decision about who does what. The infrastructure team is going to do the network because that's what they know how to do. They're experts in that. The development team probably doesn't want to become domain experts in how to set up networks. Similarly, the infrastructure team doesn't want to become domain experts in how serverless functions work, so it's better if the developers can be self-serve and really own their own destiny there. I think the tools and workflows really need to support the concept of these two disciplines working closer together going forward and I love that you used the phrase cloud engineering because that's really what I think of. It's the best of developers, the best of infrastructure engineers, really collaborating together.

Jeremy: Right. That's interesting, because I totally agree with that. Where, then, are these challenges, or what are these new challenges that you're seeing both sides face? Because as a developer, like you said, you need to get a little closer to the infrastructure. As an infrastructure person, you need to get a bit closer to the configurations that the developers are setting up. What are the challenges for the developers?

Joe: Infrastructure's hard. For developers specifically, I think no serverless function is an island. A serverless function is only interesting when it's paired with the infrastructure that triggers an event, whether that's a bucket, you want to do something every time a file gets added, or an API gateway where you're actually using serverless functions for infinite scale on the back end. It needs to be connected to infrastructure and historically what that's meant... Actually, one of the reasons we founded Pulumi was I got really excited about serverless and containers and I wanted to create a serverless application and it was great until I had to configure the infrastructure. I wrote 100 lines of JavaScript for a nice little serverless application. I'm like, "All right. I'm ready to go. What do I do next? Oh. For every 10 lines of JavaScript I have to write 100 lines of YAML." That was not pleasant.

That was one of the problems we wanted to solve, was really, "Let's just make it feel like we've got a real programming model, an application model, where we're just building serverless applications and the infrastructure is part of that." Not all of it, again, but a lot of it really should be closer to the applications. Most of the technology today doesn't have that worldview and so there's kind of a fundamental friction and mismatch.
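To make the "no serverless function is an island" point concrete, here is a minimal sketch of the function side of the bucket example: code that runs every time a file is added. It assumes the event follows the S3 notification layout (`Records[].s3.bucket.name` and `Records[].s3.object.key`); the bucket name, the object key, and the helper name `uploaded_objects` are hypothetical, and the bucket, trigger, and permissions that would deliver the event are exactly the infrastructure this code cannot express on its own.

```python
from urllib.parse import unquote_plus


def uploaded_objects(event):
    """Return (bucket, key) pairs from an S3-style event notification."""
    pairs = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        bucket = s3["bucket"]["name"]
        # Object keys arrive URL-encoded in notifications (a space becomes '+').
        key = unquote_plus(s3["object"]["key"])
        pairs.append((bucket, key))
    return pairs


def handler(event, context):
    """Lambda-style entry point, invoked once per notification batch."""
    processed = [f"s3://{bucket}/{key}" for bucket, key in uploaded_objects(event)]
    return {"processed": processed}


# Hypothetical sample event, trimmed to the fields the handler reads.
sample_event = {
    "Records": [
        {
            "s3": {
                "bucket": {"name": "example-uploads"},
                "object": {"key": "reports/q1+summary.csv"},
            }
        }
    ]
}
result = handler(sample_event, None)
```

The handler is only a few lines; everything that makes it useful lives outside it, which is the mismatch described above.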

Jeremy: Right. I think that's one of the things that you see more of now, especially with developers. They have to take the infrastructure into consideration now when they write code. They didn't use to necessarily need to worry about that. Now it's like, "My code I know is going to interact with some other piece of infrastructure," which I think is a challenge there too. So what about the infrastructure side or the dev ops people? Or I should say the ops people. What are the challenges that they face now that they've got people kind of meddling with their infrastructure?

Joe: It is a challenge to empower developers to manage more of the infrastructure, because there really is this... I think most technologies today and most teams today assume that there's this hard wall between the two sides of the house. As you pointed out, for smaller companies and companies who are born in the cloud, who are starting today, they have the advantage of not creating those silos, but for most teams there is a hard wall between them and for good reasons: because of these domain specializations and expertise. To your point, 10 years ago if I was just doing virtual machines I talked to my infrastructure team once a year when I was doing capacity planning. It's like, "Well, I need to go from three to four VMs and I need an extra database. Can I have that in a quarter or a month?" Things are so fast-paced these days the only way to keep up is to really empower the practitioners, the developers, to control their own destiny, and you need to think about security when you're doing that.

Right? We hear all the time about, oops, a bucket was open on the internet and somebody slurped up all the credit cards. Those things happen all the time. So how do infrastructure teams let their developers control their own destiny but still make sure we're compliant, we're secure, costs are under control? All those hard problems are still hard problems.

Jeremy: And you have this overlap, right? So if I'm a serverless developer and I'm writing a Lambda function, maybe I'm not thinking about concurrency limits, but if I'm on the ops side then I'm thinking, "Hold on, we only have a thousand... or maybe we bumped it up to 5,000..." Now you've got one particular function that could go rogue and consume all of my concurrency capacity. That overlap... I know there's challenges there, but how do you manage that overlap between those two competing things when they're working on the same infrastructure probably at the same time?

Joe: It is tough. It's funny, in the early days of Pulumi we actually were playing around. We created a Lambda that created Lambdas and for every Lambda it created five more Lambdas and we couldn't kill the thing fast enough to actually stop the thing from spreading. It quickly added up to like 1,000 dollars in two hours. It's a real challenge.
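The arithmetic behind that runaway is worth spelling out. The sketch below assumes the simplest reading of the story, where every function creates five more in each round; the function name and the choice of rounds are illustrative, and it says nothing about actual cost.

```python
def total_functions(fan_out, generations):
    """Count every function created, including the first, after `generations` rounds."""
    total = 0
    current = 1  # generation 0: the single function that started it all
    for _ in range(generations + 1):
        total += current
        current *= fan_out  # each function in this round creates `fan_out` more
    return total


after_five_rounds = total_functions(5, 5)   # 1 + 5 + 25 + 125 + 625 + 3125
after_eight_rounds = total_functions(5, 8)
```

With a fan-out of five, five rounds already yield 3,906 functions and eight rounds yield 488,281, which is why nobody can delete them by hand fast enough.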

There are tools out there that allow you to enforce constraints or guardrails, if you will: to say, "You need to stay within budget," or, "You need to stay within compliance: these guardrails." We offer such a tool, called policy as code. Just like there's infrastructure as code there's also policy as code. That's one tool in the tool belt that you can use to enforce these things. I think also there's just smart things you can do with setting up your accounts properly so that developers have their own sandboxes that they can play in and those are different than production. Some of these new CI/CD capabilities, where you can really test things before rolling out to production, I think are also a key element of this.
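The guardrail idea can be sketched in a few lines: rules are ordinary functions that inspect a planned resource and either pass or return a complaint. This is not Pulumi's policy-as-code API. The resource dictionaries, the field names (`public`, `reserved_concurrency`), and the limit of 100 are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Violation:
    resource: str
    message: str


def no_public_buckets(resource):
    """Rule: storage buckets must not be publicly readable."""
    if resource["type"] == "bucket" and resource.get("public", False):
        return "bucket must not be publicly readable"
    return None


def capped_concurrency(limit):
    """Build a rule: every function must reserve at most `limit` concurrency."""
    def rule(resource):
        if resource["type"] != "function":
            return None
        reserved = resource.get("reserved_concurrency")
        if reserved is None:
            return "function must set reserved_concurrency"
        if reserved > limit:
            return f"reserved_concurrency {reserved} exceeds limit {limit}"
        return None
    return rule


def evaluate(resources, rules):
    """Run every rule against every planned resource and collect violations."""
    violations = []
    for resource in resources:
        for rule in rules:
            message = rule(resource)
            if message is not None:
                violations.append(Violation(resource["name"], message))
    return violations


# Hypothetical deployment plan: one open bucket, one capped function, one uncapped.
planned = [
    {"type": "bucket", "name": "uploads", "public": True},
    {"type": "function", "name": "resize", "reserved_concurrency": 50},
    {"type": "function", "name": "fan-out"},
]
violations = evaluate(planned, [no_public_buckets, capped_concurrency(100)])
```

Here the open bucket and the uncapped function are flagged before anything is deployed, while the capped function passes. Because the rules are plain code, the infrastructure team can review and version them like any other change.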

Jeremy: Yeah. And then, like you said, infrastructure as code is sort of that common language between both sides of those. That's always tough managing it, too. I've seen that where somebody checks in something on one side, somebody code reviews it, and then they want to change it. Next thing you know, something's not working right. Certainly managing that is kind of crazy. But I also see not just serverless developers building tools or building applications for serverless, or I guess applications, but you have sometimes the ops engineers that are using it as well to write automation things.

Joe: Absolutely. I think dev ops kind of started this whole trend that really laid the foundation for these two sides of the house really coming together. That led to a lot of this automation that you're mentioning where on the operations side we often do use code to automate things, whether that's Bash scripts... that's kind of a form of code... or Python scripts. So I think infrastructure teams are used to using code to solve some of these challenges. I think what we try to do is... Okay, as you say, the infrastructure as code platform. Let's use general purpose languages so that application developers now have access to infrastructure and can do infrastructure as code. It's not intimidating. You don't have to learn a new DSL. You can use familiar tools and approaches. But then because of dev ops and because infrastructure teams are used to automation, now infrastructure as code using real languages doesn't seem so foreign and now we're speaking the same language. We have a common foundation to start from.

Jeremy: Right. Let's talk about infrastructure as code a bit more. You have a ton of experience with this and whether you're doing a DSL like CloudFormation or something like that or you're using the CDK or Pulumi or any of these other more scripting, familiar type... Now there's the new one for Kubernetes. What's it called? Like cdk8s or something like that that just came out. Whatever you're using, how much do developers need to know infrastructure now?

Joe: I think really the more you can learn the more powerful your abilities will be, honestly. If you look at the building block services, Amazon has 200 hosted services. Azure, Google Cloud, they have a large number as well. The way I think of it is you just think of those as building blocks that you can use to build more powerful software. If you want a data store, great, you've got a data warehouse at your fingertips. If you want a hosted AI, if you want to do speech recognition in your application, well that's just a service now. You can just take that building block and use it.

All of those things are infrastructure, right? So I think infrastructure has an intimidating connotation to it where it's like, "Infrastructure is like virtual machines and these networks and everything." Infrastructure these days really has become a lot more of these hosted services that developers can harness to increase the capability of software. That said, again to my previous point, don't feel like you have to go super deep in networking, like public-private subnets, route tables. Some of these things are really not at the level of abstraction that most developers are thinking of and that's okay, but I would say don't think of the word infrastructure as a frightening term. It really shouldn't be daunting. It really is a superpower that you can use in your application.

Jeremy: I think something that has changed certainly... and like you said, we're not talking about setting up VPCs or routing tables or some of that stuff. We might be talking about using an SQS queue or using blob storage or something like that. We're talking about using these components and that's great but we're no longer given the option of just saying, "We have a Kafka service running so we just use that as our service or as our event bus, and we have a database cluster set up and that is the database that we connect to." Now we're talking about setting up separate DynamoDB tables, an SNS topic, SQS queues, EventBridge: all these other things that we're just adding so many different components, and like you said, building blocks, but that's something that I think goes beyond coding now. Now we're architectural design. We're talking about how we architect these applications.

Where do developers fit in there? Because obviously you can't just write one Lambda function... I mean, you maybe could but you shouldn't... one Lambda function that does everything. You want to separate these into different building blocks and use all these different components. How much do developers now need to start thinking about architecture?

Joe: I think if you were an architect... I think senior developers always think about architecture. I think the kind of architecture just changes now. If you go on a whiteboard and you draw the diagram of the architecture and connect all these boxes, what are the boxes? In the past, they might've been monolithic applications with maybe little components within them. I'll date myself if I say what I was about to say, but COM components or J2EE... JavaBeans or whatever. It's no longer those things. Now it's these services and microservices and they connect over RPC or what have you, but many of those building blocks that when you draw that system architecture now are going to be, quote, "infrastructure." That's great because infrastructure means you don't have to build it; you can use something off the shelf. Like you say, the database, the queues, the SNS topic: you don't have to go hand code your own pub/sub system. You can just use one off the shelf and that's really powerful.

Jeremy: The other thing about architecture and assembling multiple things is in order for me to connect to DynamoDB or in order for me to connect to EventBridge, there are a lot of permissions: IAM permissions. It's the same in pretty much every cloud you use. There are going to be sets of permissions that need to be configured. That really opens up a lot of the security stuff. We know the cloud is really great at perimeter security. We don't have to worry about somebody getting in and messing around, but a poorly coded application can expose issues. If I'm using third party libraries there's all kinds of security issues. Saving the data: am I encrypting the data? Different compliance things. That's another thing. Where does infrastructure as code, developers, and ops people... where does that all come together to make sure that we've got not only things like reusability but also compliance and security?

Joe: I think that's, frankly, the hardest part of this whole transformation. I think IAM is something everybody has to think about. Security is something everybody has to think about. You can't ignore it. But there's levels of security. I think IAM is extremely fine-grained.

Jeremy: Maybe too fine-grained in some cases.

Joe: Right. It's kind of overkill for some... Like, once you get to the application tier do you really need to think about literally every fine-grain permission for this Lambda or is it okay if your infrastructure team gives you a sandbox and says, "Here's the permissions I'm comfortable giving my developers, and they can do fine-grain permissions within that box, but I'm going to give them something that's a reasonable starting point and assume that if they got everything wrong it's not the end of the world." I think that's the key for infrastructure teams and developers working together, is the infrastructure team needs to figure out, "What are the IAM permissions I'm willing to give to my developers?" And then developers can think about it, and they should think about it, but then it's not as much of a catastrophe if they get something wrong to the fine-grain details. It's almost like, imagine if in your Java application, for every single object, you had to have ACLs on your object. That would be insane. That's kind of where we are with IAM in some ways.

Jeremy: And I like this idea of infrastructure as code and CDKs for the reusability piece of it, because I feel like that's where I want to go, where it's like, "I need to build some system that processes a queue. What are all the things I have to do? What are those permissions?" And so forth. I don't want to have to write that by hand every single time. I want to pull that off the shelf and say, "This connects to the queue, this does all of the correct permissions and all that kind of stuff, and then here's my code that processes the actual data or whatever." It would be great if we could get to that point. I think we're moving there but not quite there yet.

Joe: And that's the direction that we're heading in for sure. We have lots of libraries. Actually, a lot of our customers use Pulumi to do exactly what you're saying, which is maybe a developer, they don't want to think about all of the different pieces of a Kubernetes microservice. Maybe there are some security permissions that come with that. Maybe there is an RDS database. Maybe there are some services in an EKS cluster, if they're in Amazon. A developer may just want to come up and say, "Give me a new microservice." They don't want to think about all of these pieces and if you use the code to create these abstractions... real code with infrastructure as code... then the infrastructure team can build these abstractions that have built in best practices, hand it off to developers, and not only know that it's going to be secure and reliable and cost efficient but now the developers don't have to think about every little detail of how that building block was created. I think that's definitely the direction we're heading in.
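As a rough sketch of the off-the-shelf queue processor discussed here, the function below bundles a handler with a narrowly scoped IAM policy document, so the permissions are written once by whoever owns the abstraction. The policy uses the standard IAM JSON layout and the three SQS actions a queue consumer needs; the function names, the returned dictionary shape, and the example ARN are hypothetical, and a real component would create cloud resources rather than return a plain dictionary.

```python
import json


def queue_consumer_policy(queue_arn):
    """IAM policy document allowing consumption of one specific queue only."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "sqs:ReceiveMessage",
                    "sqs:DeleteMessage",
                    "sqs:GetQueueAttributes",
                ],
                # Scoped to a single queue rather than "*".
                "Resource": queue_arn,
            }
        ],
    }


def queue_processor(name, queue_arn, handler):
    """Reusable building block: a handler plus the permissions it needs."""
    return {
        "name": name,
        "queue_arn": queue_arn,
        "handler": handler,
        "policy": queue_consumer_policy(queue_arn),
    }


def process_order(message):
    """The only part an application developer would write."""
    return f"processed {message}"


# Hypothetical ARN for illustration.
orders = queue_processor(
    "orders", "arn:aws:sqs:us-east-1:123456789012:orders", process_order
)
policy_json = json.dumps(orders["policy"], indent=2)
```

The developer supplies `process_order` and a queue; the least-privilege policy comes with the building block and never has to be rewritten by hand.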

Jeremy: Awesome. So we talked about developers and cloud engineers or infrastructure people working nicely together, and we know the dev ops thing is pretty strong. I think there are a lot of companies that have a good dev ops culture where they are working together but you see silos all the time. You always see it's like the developers over here and the infrastructure people over here and they're like, "We want to do this," and they're like, "Nope, because there's a security issue or there's some other reason why you can't do that," and I think it gets even crazier in the cloud because it's so easy to just grant people permission to something and let them do something but you've just got I guess maybe market forces is the right word that kind of prevents these teams from working together.

What are some of those... that's happening right now. What are some of the market forces that are keeping these things siloed?

Joe: I think, frankly, the tools and technologies are very different. Developers use a very different... Every day, their day job, they use a completely different set of tools than folks on the infrastructure side, so even if we want to collaborate it's actually kind of hard. And this is something that we're trying to change with Pulumi, where, okay, let's use Python, let's use JavaScript, let's use Go, let's at least speak the same language so we can start making it a policy decision, who does what, rather than one implied by technology.

I think that causes some of the silos. I think it really depends on the organization as well. I think the most disruptive companies that are transforming entire industries using the cloud as a competitive advantage have figured this out and they are figuring this out. I think that is forcing some of the larger, maybe more established, companies, let's say, to start reinventing how they're doing things. I think the literal market forces are pushing people in that direction. It is uncomfortable too. I think we've actually put more dev in the ops than we've put ops in the dev and I see now it's going in the other direction. You look at observability and application performance management and infrastructure as code. These are things that developers now are thinking about every day and even just five years ago they weren't. I think we're heading in the right direction but it's definitely an uncomfortable, difficult transformation for a lot of people.

Jeremy: And dev ops as a culture I think is really an important step. Like you said, there are a lot of companies who have figured this out and it's great because you get that agility, you get that speed, you remove all those bottlenecks, but has it... I guess my question is has dev ops also created more silos in a sense because now you've got really strict processes in place?

Joe: I think it's helped connect the operations team with the developers, unquestionably. The interesting thing is actually if you look on the infrastructure side of things I see silos within the infrastructure side of the house. Like, dev ops is sort of a different silo than sys admin, where dev ops is happy to write some code, happy to write some scripts, and sys admin maybe not so much. Maybe that's more point and click ticketing kind of stuff. I think now you're seeing the emergence of SRE and these more advanced infrastructure teams, which is even like a step beyond the dev ops approach where dev ops was kind of started 10 years ago. It's come a long way, but SRE is a relatively new practice that's actually more like software engineering than dev ops was. So each of these are slightly different factions and that does definitely cause a little bit of challenge because that just means we're speaking five different languages instead of one.

Jeremy: And I think that's part of the problem, too, where you start seeing these silos within just the organizational team. That's why a lot of companies create these cloud teams that are strictly dedicated to the cloud. But as serverless... One of those things I think maybe companies aren't ready for is giving developers more control and giving them more access. Is just this shift to serverless, like is that enough of a driver for people to be like, "Okay, serverless is going to make it faster, give us a faster time to market, more productivity from our developers, faster development cycles, or whatever"? Is that enough of a driver for these companies to change, do you think?

Joe: I think serverless on its own for some companies would be. What I see is for some companies serverless is really important to their entire strategy, their architecture. It's a naturally event-driven architecture. It's way more cost effective for them to adopt serverless. I think for other organizations it's a combination of things. I actually think containers is another forcing function for a lot of these things where building and publishing a container seems like it's something you can do without touching infrastructure until you start doing private registries and hosted load balancer services and ECS or Kubernetes. Now you need to actually consume... So that line starts to get blurry as well.

To me, it's the combination of serverless and containers combined with just the rapid pace of innovation and the fine-granularity of these services because you kind of pointed out earlier it's not these monolithic things any more. It's just lots of little pieces that you need to stitch together and that means things move a lot faster and at a very fine granularity, which is even more difficult to stay on top of. I think all of those combined together [inaudible 00:24:18] the only way to keep up with the competition, frankly... the competition being the ones that are the most innovative and have already figured this out... is to really empower developers.

Jeremy: I wonder if this on ramp of containers, which I think is great... I'm not a fan of lift and shift. I think that just transferring everything into the cloud isn't going to give you much savings other than not having to manage that infrastructure any more. Well, manage the physical infrastructure at least. But the shift to containers, I like that. I think that it's a really good on ramp. But I don't see containers replacing all of these other cloud services too, right? So even if you're building your application on containers, you're still likely going to want to use SQS and RDS and DynamoDB. You're still going to want to use those things, so even that shift to me seems like it still opens up all these cans of worms with security and everything around that.

Joe: Absolutely. We talk with customers all the time that are at various points along this journey and any time somebody says to me, "I'm going to run a MySQL database in my Kubernetes cluster and manage the persistent volumes and backups and everything on my own and I'm running in AWS," my first question is, "Why aren't you using RDS? Do you really need that level of complexity?" For some people, yes, the answer is absolutely that makes sense. For most people it makes more sense to start with the hosted service. Now, like you say, you're having to manage lots of these moving pieces and stitch them together and it is infrastructure and infrastructure as code is the way to tame that complexity and chaos.

Jeremy: I totally agree. What about developers' responsibility? We talked a bit about it and maybe learning some infrastructure and learning some security and some of those things, but how much of that falls on them now and how much should we... I understand you can put in guardrails for certain things and you can do code reviews and you can have another dev ops team or an ops team that's looking at some of these things. Maybe you have a sec dev ops or dev sec... What is it called? Anyways, you have some other team, some fancy team name, that is looking over their shoulder and trying to do this, but how much of that do you expect those people to catch and how much of that responsibility now falls on the developer?

Joe: I think the unfortunate thing is secure by default would be the ideal world to live in where it's principle of least authority, which is generally regarded as the place to start from because then if you don't need a permission you don't get it. That's not the case today. The example of S3 buckets that I mentioned, the default shouldn't be that an S3 bucket is open to the internet and Amazon is definitely going that direction by adding controls and access blocks and things like that. That's the thing that a developer needs to be careful about, is just know that the defaults aren't always secure. In fact, often they aren't, so if you assume... But in the early days you would write threat models, right? Developers think about security. It's not like we never think about security. It's just kind of a different threat model. It's a different set of concerns, but it's a very transferrable set of concerns. I think it's not entirely foreign but you have to know going in and eyes wide open that there are a lot of foot guns out there and I think when you have these sister teams like the security engineering team and the infrastructure team, dev sec ops or sec dev ops... I always forget the ordering as well.

Jeremy: I don't remember either, exactly.

Joe: Lean on them as well because ideally those teams... kind of what I was saying earlier... they would set up your environment so that you can't shoot yourself in the foot. That's easier said than done but it is possible.

Jeremy: And I think the other thing you have with serverless now is that you can launch a Lambda function that's not in some private VPC, right? It's in the general VPC. Well, it's technically in a VPC. It's in Amazon's VPC. But you can launch that function without having all of those other security things in place.

I agree with you. I think that you want to lean on other people as much as possible but I always now... more than I ever did before... when I'm building something I'm thinking about the security and I'm trying to think, "What happens if this happens and what are the worst case scenarios?" I don't know. I agree that it's good to have those people to lean on but I feel like developers maybe need to go a step further if they are building in the cloud and that might just be this new thing called the cloud developer or a cloud engineer as you said earlier. That might just be the new normal and where you need to be as a developer.

Joe: Yeah, and I think the complicated thing is the execution environment of the cloud is very different. Most developers are used to writing code that runs in one monolithic context, like on a server or on a desktop. In the cloud, your code is... especially serverless... spread across lots and lots of different servers or you may not even have the concept of a server if you're doing serverless, but that thing has permissions. That execution context has permissions and you need to think about are those the right permissions and what if somebody were able to get code to run in that context that I didn't expect and is that possible, and then you need to think about the network perimeter: your VPC example. You can access this thing? What are the APIs that are exposed? What are the capabilities of those APIs? Is there authentication? Is there authorization attached to it? How does that work?

I really think you got to think, to your point earlier, like architecture. You have to think architecture. You need to draw it on a whiteboard and think, "What is the security threat model for this overall architecture," and that's the way to go. I agree: you can't exclusively lean on a separate team, especially some companies don't have those teams, so you really need to take matters into your own hands.

Jeremy: Totally agreed. Let's move on to some tips, because I think that you have been working with a lot of companies. What are some of these processes that companies can put into place that help them adopt serverless faster by creating a better relationship between the developers and the cloud engineers or infrastructure people?

Joe: I think you have to figure out the tools and the workflows: the tools, the workflows, and the processes. It comes down to those three things. I'm biased. I think a tool and a workflow that works great for developers and infrastructure teams means that if you don't get it right on day one you can always change your mind down the road. It's not like, "You folks over there are going to use this set of tools and you over here you're going to use that set of..." Once you make that decision and people start building stuff using that it's incredibly hard to reverse that, so that's important to get right on day one.

From a process standpoint, I really do think finding some way where guardrails are in place is really critical because you don't want developers to always have to come to file tickets to get... You just want to empower them to run full speed ahead and know, and sleep soundly at night knowing, that nothing bad is going to happen. I also think by using infrastructure as code you can use familiar coding techniques like code reviews, like change management. You can actually just use code reviews and pipelines, and a lot of these CI/CD platforms these days just raise the visibility for the whole organization in terms of what code is running where, who's pushing what change, because in the event that something does go wrong you're going to need to go and find out when it happened, where did it come from, who do I go talk to. That's also important.

I think it's really the tools, the workflows, and the processes and they really need to all gel and ideally not be fundamentally different for the infrastructure team than for the developers.
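To make the guardrails idea concrete, here is a minimal sketch of an automated check that could run in a pipeline before a change is deployed. It is an illustration only: the `Resource` class, the `check_guardrails` function, and both rules are hypothetical and are not the API of Pulumi or of any other tool mentioned in this episode.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    """Hypothetical description of one declared cloud resource."""
    name: str
    kind: str  # e.g. "bucket" or "queue"
    props: dict = field(default_factory=dict)


def check_guardrails(resources):
    """Return a list of human-readable violations (empty means OK)."""
    violations = []
    for r in resources:
        # Rule 1 (assumed): storage buckets must not be publicly readable.
        if r.kind == "bucket" and r.props.get("public_read", False):
            violations.append(f"{r.name}: bucket must not be public")
        # Rule 2 (assumed): every resource carries an owner tag, so that
        # "who do I go talk to" has an answer when something goes wrong.
        if "owner" not in r.props.get("tags", {}):
            violations.append(f"{r.name}: missing 'owner' tag")
    return violations


# A proposed change set, as a developer might submit it for review.
proposed = [
    Resource("logs", "bucket", {"public_read": True, "tags": {"owner": "team-a"}}),
    Resource("jobs", "queue", {"tags": {}}),
    Resource("assets", "bucket", {"tags": {"owner": "team-b"}}),
]

problems = check_guardrails(proposed)
for p in problems:
    print(p)
```

A pipeline step could fail the build whenever the returned list is non-empty, which lets developers move quickly without filing a ticket for every change.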

Jeremy: Right. I actually really like... One of the greatest things about infrastructure as code is I remember back in the day I was uploading a Perl script to a web server somewhere that was running in a cgi-bin and I would just upload that file. Oh, I needed a new server. I'd have to go configure that server separately, set up Apache, and do all those configurations. Things got better as we moved towards things like OpsWorks or Puppet and Chef and those sort of things because it helped repeat those infrastructure deployments. But now it is so easy to spin up a new environment, especially if you're all in serverless. If you're using DynamoDB and SQS, you can spin these things up and tear them down.

I love that approach, too, where you give developers a lot of flexibility to put something out there in sort of a test or QA environment or something like that or maybe just a dev environment and then have that move through a CI/CD process, go through a code review if it needs to, and some of those other things. I really like that process because I think that that gives the developers that freedom to play around and actually get stuff up and running in that sandbox environment but then still put in those checks and have all that change management and that process management in there as well.

Just one more question on that, because I think this is something where, like we said, you've got developers thinking about architecture and then you might have ops people thinking about architecture as well. Where's the delineation of responsibilities?

Joe: It's interesting, especially with Kubernetes. One thing we're seeing is there's sort of like the infrastructure operators and the application operators and it's sort of like this natural divide is happening where there's like the base layer of any architecture and often times it's shared. Maybe it's company-wide and shared amongst lots and lots of applications. Sometimes it's maybe more fine-grained than that, but it's sort of the networking, the base security, the cluster, maybe some of the data services, encryption services.

There's this fundamentals layer that definitely the infrastructure team is, if you're in a larger company, going to be the one who manages that. Got to get that right 100%. It moves a lot less frequently. You change it occasionally but it's pretty stable. Once you get it up and running you might need to scale it up. You might need to go to new regions, things like that. Then on top of that it's all the application services and I think the application services, that's the stuff you want the developers to manage. That's serverless capabilities. It's data stores: Aurora, S3, Cosmos if you're on Azure, those level of things. And services, like load-balanced services, those really belong at the top.

Unfortunately, at the very front from a networking standpoint you sometimes have CDNs and some of the load balancers can get complicated and public subnets, so that cuts back over to the infrastructure team. But basically most of the stuff above the line should really go to the developers ideally.

Jeremy: I think that's great advice. I mean, especially it's like you don't want a developer going in there, setting up an RDS cluster with all the security groups and everything that's surrounded there. I mean, they certainly can, but if you've got a larger team and you've got somebody that can handle that that's definitely the way to go.

Another thing that Pulumi does is it deals with multicloud. I hate the term multicloud because it's one of those things where it depends on what people are trying to do. Are we trying to be cloud agnostic, which is probably a really bad idea, or are we just trying to find the best services in cloud A versus cloud B and use them all together? In your experience, because I always love hearing this feedback, how have companies that are using Pulumi and the customers you've talked to been embracing or using multicloud?

Joe: It's a pretty broad spectrum. As you say, usually trying to abstract over what makes each of the clouds special and unique is probably not a great idea but there's some areas where that works. Kubernetes is a good example where we're finally kind of agreeing on what it means to run a container in one of the cloud environments, so I think of that as almost the POSIX or Unix API of running container-based compute... but it doesn't go much beyond that.

For multicloud we see a number of things. One, as you say, each cloud has different services. You might want to use S3 in Amazon and then machine learning in Google Cloud, for example, and that's totally fine. Furthermore, it's not always just the major clouds, right? You might be using Cloudflare. You might be using Datadog, New Relic, Mailchimp. There are these infrastructure service providers that are part of the infrastructure and you need to manage the infrastructure on those as well. There are also very basic reasons. Like, we work with a customer. They were running in Azure and they got acquired by a company that runs everything in AWS. Did that company want to force them to rewrite everything just because they acquired them? No, they didn't. It wasn't the most cost-effective thing, so now they're multicloud. They didn't really plan on it but they are.

The other pattern we see is companies selling a SaaS. If I'm selling a SaaS that runs in my customer's cloud, do I want to say I can only sell to Azure customers or AWS or whatever cloud I happen to pick? Probably not. You probably want to architect it so you can be flexible and sell to customers running in all of these different clouds. That's a common pattern. I don't see many folks talking about that but we see that quite a bit with our customers.

Jeremy: Do you see any vendor lock in concerns? Is that an argument that comes up?

Joe: It does sometimes. For us it's actually part of why Pulumi's interesting. It's not tied to one particular cloud. But that's just more admitting the reality that many people have to use multiple clouds and they may have to move some day down the road. We see some people wanting to avoid lock-in, some people using it for maybe price negotiation at a very high C-level conversation. But most of the time when a CIO makes that decision or something their entire team is grumbling because it just adds so much pain and so much friction because they have to abstract over everything. I think the workflows being cloud agnostic is a good thing. Policy as code, infrastructure as code, that being consistent no matter which cloud you're going to go to is great. But once you get down to the actual building block services they're very different in each of the cloud providers and trying to abstract over them is usually a fool's errand.

Jeremy: It's the lowest common denominator thing, right? We want to pick the best service for our application and if you're trying to do something that you can duplicate across multiple clouds or with the fear that some day I might need to move this thing, I think the amount of investment you make in that is more than it would be to re-engineer it to move it to a different cloud later.

So what's next for infrastructure as code or just for serverless? We talked a lot about CDKs and being able to repackage things, reuse things like that, but that in and of itself is still kind of a problem in terms of learning curves. You still need to know all the individual services. You still need to know exactly how you're stitching these things together. Is this something where there's going to be more collaboration? Is there going to be a higher level of abstraction? What is that next step?

Joe: I totally agree. We're very early days. Pulumi, what we've done is we've taken those building blocks and we've exposed them in general purpose languages and given you an infrastructure as code platform where you can manage infrastructure reliably and you can bring that closer to your applications, and we've added some abstractions but definitely the average developer really just wants to get up and running very quickly and not have to worry about a lot of the low level details. The cloud APIs really were designed for infrastructure circa five years ago, 10 years ago. They're not really designed for great usability. You think of a developer. It's almost... my horrible aging analogies... like you go to build a Windows application. Are you going to program in C against the Win32 API or are you going to use Node.js or Python or something that you're hyper productive in?

We're still very much in the Win32 C days. I think we'll get there. I think Pulumi... one of the things we're really excited about is it gives this foundation, so we started building these higher level abstractions and it's not like a Heroku or a PaaS. I love Heroku. The challenge most customers run into is once they hit a level of complexity they say, "Now I need to abandon the platform because it's too high level." The question is, can we have that high level while still connecting to those lower level building blocks in a way that works at scale in some of the largest organizations? I think that's where we're trying to get to but it's definitely super early in that journey.

Jeremy: Yeah, and I think one of the things for me that I see a lot, especially now that everybody is working from home and you've got a lot of remote workers, is people trying to collaborate on larger blocks of infrastructure or larger applications where maybe you have 10, 15 microservices. Maybe you have 100 microservices and each one of those has 50 different services in it or Lambdas and queues and all these other things that are happening. It gets really hard to manage. Even if you're using CloudFormation or you're using the CDK or you've got a really good code repository and a good workflow for all that, you still have a lot of different things to manage. What do you see maybe being the tool of the future? Besides Pulumi, maybe. What's that tool? How are people going to be able to collaborate across the world on all these different things and organize better than just text files in a repo?

Joe: I think what we're really on the verge of is distributed computing. I think, finally, we're getting to the stage where we're moving from monolithic single computer programs. We went through concurrency with the multicore era and figured out how to do asynchronous programming, so now every language in the world supports asynchronous programming with async/await and tasks and promises and these things. Now we're about to do that with distributed computing. These fundamental concepts of having lots of little pieces that communicate with each other: our programming languages and our programming models need to better support that and it needs to be more first class. Ironically... I don't know if I should say this... I feel like we're almost converging with developers and infrastructure and if we really could figure out some of these distributed computing programming models I think they might diverge a bit because at that point developers really don't want to think about literally every small building block. They want to think about these higher level programming patterns and application models.

There are some folks that are trying to do this already and they're very exciting. I think it's going to be like a 10 year journey to get there. It took most of the 2000s, 2010s, just to figure out async, so these things take a while, but I think that's probably where we'll end up.
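For readers less familiar with the async/await style Joe refers to, the following is a minimal sketch using Python's standard `asyncio` module. The service names and delays are made up for illustration, and `call_service` only sleeps rather than making a real network call.

```python
import asyncio


async def call_service(name, delay):
    """Stand-in for a remote call; sleeps instead of doing network I/O."""
    await asyncio.sleep(delay)
    return f"{name}: ok"


async def main():
    # Start all three calls concurrently and wait for every result.
    # gather() returns results in the order the awaitables were passed in,
    # regardless of which one finishes first.
    return await asyncio.gather(
        call_service("orders", 0.03),
        call_service("billing", 0.01),
        call_service("email", 0.02),
    )


results = asyncio.run(main())
print(results)
```

The language handles the waiting and scheduling here. The open question Joe raises is whether a similarly first-class model can emerge for pieces that live on different machines.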

Jeremy: Awesome. Joe, I appreciate you being here. I do want to give you a minute just to explain what Pulumi is doing to solve this problem.

Joe: Pulumi... by choosing general purpose languages for infrastructure as code, you can build reusable abstractions. You get everything we know and love about languages, which is a great foundation... testing, for loops, functions, basic abstraction... but you can really build these reusable components. I think for serverless, the team's a bunch of ex-compiler nerds. Our CTO is one of the two original guys who founded the TypeScript project, for example, so we figured out some cool ways on how to do serverless computing where Lambdas really are lambdas in your favorite language. You don't have to do this 10 lines of code and 100 lines of YAML.

It's really exciting. As we've discussed, it's super early days, but I think we've laid a solid foundation. And it's open source. We've got a great community, support every cloud provider you can imagine: over three dozen other infrastructure providers. So very powerful. Great for developers. Also great for infrastructure teams who are trying to build that bridge between the two sides of the house.
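As a rough illustration of what "for loops, functions, basic abstraction" buys you, here is a sketch in plain Python. It deliberately does not use Pulumi's SDK: resources are modelled as ordinary dictionaries, and `queue_with_dead_letter` is a hypothetical helper invented for this example.

```python
def queue_with_dead_letter(name, max_retries=3):
    """Hypothetical reusable component: a work queue plus its dead-letter queue."""
    dlq = {"type": "queue", "name": f"{name}-dlq"}
    main = {
        "type": "queue",
        "name": name,
        "dead_letter": dlq["name"],  # failed messages are routed here
        "max_retries": max_retries,
    }
    return [main, dlq]


# An ordinary loop stamps out the same pattern for each service,
# instead of copying and pasting a block of configuration per service.
resources = []
for service in ["orders", "billing"]:
    resources.extend(queue_with_dead_letter(service))

names = [r["name"] for r in resources]
print(names)
```

Because the component is just a function, it can be unit tested, versioned, and shared like any other code.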

Jeremy: Awesome. Joe, again, thank you so much for being here. I appreciate you sharing all this knowledge. I love what Pulumi is doing. I think that, like you said, early days but hopefully things will continue to progress and people will move more towards serverless and companies will figure these things out.

If people want to get a hold of you or learn more about Pulumi, how do they do that?

Joe: Pulumi.com is one stop shopping for everything. That blue getting started button will take you to download the open source and then it's easy to go from there. Great tutorials for different clouds depending on what you want to do next. Then follow us on Twitter: @PulumiCorp. We've got a great community Slack where the whole team hangs out if you want help or talk about things. Then I'm on Twitter: @funcOfJoe. Always happy to chat with people. DM me. They're open. But definitely if you run into any questions, want any help, want to talk about anything, I'm always here to help.

Jeremy: Awesome. Thanks again. I will make sure I get all that in the show notes.

Joe: Awesome. Thanks Jeremy.

This episode is sponsored by: Amazon Web Services, Serverless Security Strategies: Under the Hood Fireside chat: https://pages.awscloud.com/AWS-Fireside-Chat_2020_FC_e06-SRV.html

What is Serverless Chats?

Serverless Chats is a podcast that geeks out on everything serverless. Join Jeremy Daly and Rebecca Marshburn as they chat with a special guest each week.
