Real World DevOps


Summary

Yan Cui, creator of the Production-Ready Serverless course and serverless consultant, joins us this episode to school Mike on serverless, talk about the real business value behind why an organization should be interested, and a whole lot of intricate details around this new paradigm.

Show Notes

About the Guest

Yan is an experienced engineer who has run production workloads at scale in AWS for nearly 10 years. He has been an architect and principal engineer in a variety of industries ranging from banking, e-commerce, and sports streaming to mobile gaming. He has worked extensively with AWS Lambda in production, and has been helping various UK clients adopt AWS and serverless as an independent consultant.

He is an AWS Serverless Hero and a regular speaker at user groups and conferences internationally, and he is also the author of Production-Ready Serverless.

Guest Links

Transcript

Mike Julian: Running infrastructure at scale is hard, it's messy, it's complicated, and it has a tendency to go sideways in the middle of the night. Rather than talk about the idealized versions of things, we're going to talk about the rough edges. We're going to talk about what it's really like running infrastructure at scale. Welcome to the Real World DevOps podcast. I'm your host, Mike Julian, editor and analyst for Monitoring Weekly and author of O’Reilly's Practical Monitoring.


Mike Julian: This episode is sponsored by the lovely folks at InfluxData. If you're listening to this podcast, you're probably also interested in better monitoring tools, and that's where Influx comes in. Personally, I'm a huge fan of their products, and I often recommend them to my own clients. You're probably familiar with their time series database, InfluxDB, but you may not be as familiar with their other tools: Telegraf for metrics collection from systems, Chronograf for visualization, and Kapacitor for real-time streaming. All of this is available as open source, and they also have a hosted commercial version. You can check all of this out at influxdata.com.


Mike Julian: Hi folks, I'm here with Yan Cui, an independent consultant who helps companies adopt serverless technologies. Welcome to the show, Yan.



Yan Cui: Hi Mike, it's good to be here.



Mike Julian: So tell me, what do you do? You're an independent consultant helping companies with serverless. What does that mean?



Yan Cui: So I actually started using serverless quite a few years back. Pretty much as soon as AWS announced it, I started playing around with it, and in the last couple of years I've done quite a lot of work building serverless applications in production. I've also been really active in writing about things I've learned along the way, so a lot of people have been asking me questions because they saw my blog, talking about some problems they've been struggling with and asking me, "Hey, can you come help me with this? I've got some questions." First of all, I like to help people, and it's something that's been happening more and more often, so in the last couple of months I have started to work as an independent consultant, helping companies who are looking at adopting serverless, or maybe moving to serverless for new projects, and who want some guidance on things they should be thinking about, and maybe some architectural reviews on a regular basis. So I've been helping a number of companies, both with workshops and with regular architectural reviews. At the same time, I also work part-time at a company called DAZN, which is a sports streaming platform, and we use serverless and containers very heavily there as well.




Mike Julian: Okay, so why don't we back up like several steps. What the hell is serverless? Just to make sure that we're all talking about the same thing. What are we talking about?



Yan Cui: Yeah, that's a good question, and I guess a lot of people have been asking the same question as well, because nowadays you see pretty much everyone throwing the serverless label at their products and services. Going by the most popular definition out there, based on what I see in talks and blog posts and in my social media circle, serverless is pretty much any technology where, one, you don't pay for it when you are not using it, because paying for uptime is a very serverful way of thinking and planning. Two, you don't have to worry about managing and patching servers, because installing daemons or agents or any form of subsidiary or support software is, again, definitely tied to having servers that you have to manage. And three, you don't have to worry about scaling and provisioning, because the system just scales the number of underlying servers on demand. By this definition, I think a lot of the traditional backend-as-a-service things out there, like AWS S3 or Google BigQuery, also qualify as serverless.



Mike Julian: Okay, so Lambda is a good example of serverless, but there's also this thing of like a function as a service and they seem to be used interchangeably sometimes. What's going on there?



Yan Cui: So to me, functions as a service describes a change in how we structure our applications, changing the unit of deployment and scaling to the function level, the functions that make up every application. A lot of the functions-as-a-service solutions, like Azure Functions or Lambda as you mentioned, will also qualify as serverless based on the definition we just talked about, and generally I find that there is a lot of overlap between the two concepts, functions as a service and serverless. But I think there are some important subtleties in how they differ, because you also have functions-as-a-service solutions like Kubeless or Knative that give you the function-oriented programming model and the reactive, event-driven model for building applications, but then run on your own Kubernetes cluster.



Yan Cui: So if you have to manage and run your own Kubernetes cluster, then you do have to worry about scaling, and you do have to worry about patching servers, and you do have to pay for uptime for those servers, even when no one is running stuff on them. The line gets blurred when you consider things like Amazon's EKS or Google's GKE, where they offer Kubernetes as a service, or Amazon's Fargate, which lets you run containers on Amazon's fleet of machines so you don't have to worry about provisioning, managing, and scaling servers yourself.



Yan Cui: At the end of the day, I think being serverless or having the right labels associated with your product is not important. It's all about delivering on business needs quickly, but having a well-understood definition of those different ideas really helps us understand the implicit assumptions we make when we talk about something. So now that everyone is calling their services or products serverless, it's really not helping anyone, because if everything is serverless, then nothing is serverless, and I really can't tell what sort of assumptions I can make when I think about your product.



Mike Julian: Right, this is the problem with buzzwords: the more you have of them, the less they actually mean and the more confused I am about what you do. So because I love talking about where things fall apart... Like serverless, it's a cool idea. I think it works really well and yet, I've seen so many companies get so enamored with it that they spend six months trying to build their application on serverless or in that model. And then a month later, they go under. I can't help but make the tie between the two: you spend all your time trying to innovate on this and at the end of the day, you didn't have any time to innovate on the product. So that's an interesting failure mode. But I'm sure there are others here, where people are adopting serverless the same way we first started adopting containers. Like, "Hey, I just deployed a container and it works on my machine, have fun." So when is serverless not a good idea? What are the pitfalls we're running into? What are people not thinking about?



Yan Cui: I think one of the problems we see all the time ... You mentioned hype. A lot of adoption happens because there's a lot of hype behind the technology and a lack of understanding of the requirements and the technical constraints you have, and you go straight into it. I think this happens all the time and that's why we have the whole hype cycle to go with it. When you are a newcomer to a new paradigm, it's so easy to become infatuated by what this product can do, and when all you have is a hammer, you start looking for nails everywhere. This happened when we discovered NoSQL. All of a sudden, everything had to be done with NoSQL. MongoDB and Redis were used everywhere to solve every possible database problem, often with disastrous results because, again, people were not thinking about the constraints and the business needs they actually had and focused too much on the tech. If anything, I think with serverless we have this great opportunity to think more about how we deliver business value quickly, rather than thinking about the technology itself. But as engineers, as technology people ourselves, you can see how easy it is to fall into that trap, and I think there are a couple of use cases where serverless is just not very good in general right now. One of them is when you require consistent and very high performance.



Yan Cui: So quite a lot has been made about cold starts, which is something that is relatively new to a lot of people using serverless, but again, it's not something that's entirely new. For a very long time, we've had to deal with long garbage collection pauses or servers being overloaded because load is not evenly distributed, but with serverless, that becomes something that's systematic, because every time a new container is spawned to run one of your functions, you get this spike in latency. For some applications, that is not acceptable, because maybe you are building a realtime game, for example, where latency has to be consistent and has to be very, very fast. If you are talking about, say, a multiplayer game needing a 99th percentile latency below 100 milliseconds, that's just not something that you can guarantee with Lambda or any serverless platform today.
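
Editor's sketch: to make the percentile point concrete, the snippet below computes a nearest-rank percentile over a list of request latencies. The latency values and the cold-start share used in the examples are made-up numbers, not measurements from Lambda or any other platform; the point is only that once cold starts exceed roughly 1% of requests, the 99th percentile becomes a cold-start latency.

```python
def percentile(latencies_ms, p):
    """Nearest-rank percentile: smallest sample with at least p% of samples at or below it.

    `p` is an integer between 1 and 100.
    """
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(latencies_ms)
    # Integer ceiling of p * n / 100, avoiding floating-point rounding.
    rank = -(-p * len(ordered) // 100)
    return ordered[max(rank, 1) - 1]


# Hypothetical traffic: warm requests take 20 ms, cold starts take 1000 ms.
two_percent_cold = [20] * 98 + [1000] * 2
half_percent_cold = [20] * 199 + [1000]

print(percentile(two_percent_cold, 50))   # median is still a warm request
print(percentile(two_percent_cold, 99))   # p99 is now a cold start
print(percentile(half_percent_cold, 99))  # below 1% cold starts, p99 stays warm
```

The median hides the problem entirely, which is why a latency budget stated at the 99th percentile is so much harder to meet when cold starts are systematic.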



Mike Julian: I worked with a company a while back that was building a realtime engine and that was a hell of a problem. We were building everything on bare metal and VMware, and then had this really nice orchestration layer running on top of Puppet. And this is a hell of a problem because as load comes up, we're automatically scaling the stuff out, except as we're adding the new nodes, latency is spiking because we're trying to move traffic over to something that's not ready for it.



Yan Cui: Yes, and with serverless, you don't have the luxury of, say, letting the server warm up first and giving it some time before you actually put it into active use. Literally, a new container is spawned on the first request that there isn't a spare one around to handle. So you always have cold starts, and you can't just say, "Okay, I'm gonna give this server five minutes to get warmed up first." Maybe it's the JVM that takes up your warmup time, so you configure your load balancer and the rest of the system to take into account the time it needs to warm up before you put it into active service. With serverless you can't do that, so where you do need consistent high performance, serverless is a really bad fit right now. I think you just touched on something else there as well, the fact that you need to have a persistent connection to a server, so there's some kind of logical notion of a server.



Yan Cui: That's, again, something that serverless is not a good fit for: if you want, say, a persistent connection in order to do realtime push notifications to connected devices, or to implement subscription features in GraphQL, for example. In those cases, you're also constrained by the fact that an invocation of a function can run for only a certain amount of time. I think that's a good constraint. It tells you that there are certain use cases that are a really good fit for functions as a service, but there are whole other cases where you just shouldn't even think about doing it. There are ways you can get around it, but by the time you do all of that, you really have to ask yourself, "Am I doing the right thing here?"



Mike Julian: Right.



Yan Cui: And I think another interesting case, and this is again something that I find is often blown out of proportion, is the cost. Sure, Lambda is cheap because you don't pay for it when it's not running, but when you have even a medium amount of load, you might find that you pay more for API Gateway and Lambda than if you just ran a web server yourself. Now that's true, but one of the things that most people don't think about enough is the personnel cost: the skill set you need to run your own cluster, to look after your Kubernetes cluster, to do all these other things associated with having a server. That is often what makes it more expensive than whatever premium you pay AWS to run your functions.



Yan Cui: However, if you are talking about a system that has got, I don't know, maybe tens of thousands of requests per second, consistently throughout the day, then those premiums on individual invocations can start to stack up really, really quickly. I had a chat with some of the guys at Netflix a while back and they mentioned that they did a rough calculation that if everything at Netflix ran on Lambda today, it would cost them something like eight times more. If you are running at Netflix scale, that is a lot of money, way more than the amount of money you would pay to hire the best team in the world to look after your infrastructure. So if you are at that level of scale and the cost is starting to rack up, then maybe it's also time to think about moving your load into a more traditional containerized or VM-based setup, where you can get a lot more out of your servers and do a lot more performance optimization than you can running them in Lambda.
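
Editor's sketch: the trade-off described above can be written down as simple arithmetic. Nothing below uses real AWS prices; every price, duration, and memory size is a parameter the reader has to supply, and the function names are hypothetical. The fixed monthly figure is meant to include both the servers and the people who look after them.

```python
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # assumes a 30-day month


def pay_per_use_monthly_cost(requests_per_second, price_per_million_requests,
                             avg_duration_seconds, memory_gb, price_per_gb_second):
    """Monthly bill when every invocation is charged per request and per GB-second."""
    requests = requests_per_second * SECONDS_PER_MONTH
    request_cost = requests / 1_000_000 * price_per_million_requests
    compute_cost = requests * avg_duration_seconds * memory_gb * price_per_gb_second
    return request_cost + compute_cost


def break_even_requests_per_second(fixed_monthly_cost, price_per_million_requests,
                                   avg_duration_seconds, memory_gb, price_per_gb_second):
    """Sustained request rate at which pay-per-use equals a fixed monthly cost.

    The pay-per-use cost is linear in the request rate, so the break-even
    point is the fixed cost divided by the cost of one request per second.
    """
    cost_at_one_rps = pay_per_use_monthly_cost(
        1, price_per_million_requests, avg_duration_seconds, memory_gb, price_per_gb_second
    )
    return fixed_monthly_cost / cost_at_one_rps
```

Below the break-even rate the per-invocation premium is cheaper than running (and staffing) your own servers; well above it, consistently, the premium is what starts to stack up.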



Yan Cui: And I think the final use case where Lambda, or serverless, is probably not that good a fit today is this: even though you get a good baseline of redundancy built in, so you get Multi-AZ out of the box and you can also build multi-region active-active APIs relatively easily, we are relying on the platform to do a lot more, and the platform service is essentially a black box to us, so there are also cases where some of the built-in redundancy might not be enough. For example, if I'm processing events in real time with Kinesis and Lambda, the state of the poller is a black box; it's something that I can't access. So if I want to build a multi-region setup whereby, if one region starts to fail, I can move the stream processing to a different region and turn it on, so an active-passive setup, then I need to access the internal state of the poller, which is not something that I can do, or I have to build a whole lot of infrastructure around it to be able to simulate that.



Yan Cui: And again, by the time I invest all the effort and do all of that, maybe I should just start with something else to begin with. Again, those are some of the constraints that I've had to think about when I decide whether or not Lambda or serverless is a good fit for the problem that I'm trying to solve. As much as I love serverless, again, I don't think it's about the technology. It's about finding ways that can deliver the business needs you have, so whatever you choose, you have to meet the business needs first and foremost, and then anything that can let you move faster, you should go with that.


Mike Julian: So all this reminds me of an image that floated around Twitter a while back, that people dubbed the "Docker Cliff." And the idea was that you had Docker at the very bottom of Dev and Prod, but to get something from Dev, like when I'm developing with Docker on my laptop, to actually put it in production, takes way more than just a container. How do you do the orchestration? How do you do the scheduling? How are you managing the network? What are you doing about deployment, monitoring, supervision, security and all this other stuff on top of it that people weren't really thinking about? And so for developers, Docker was fantastic. Like oh, hey, everything is great. It's a really nice self-contained deployable thing, except it's not really that deployable. And I'm kind of seeing that serverless is much the same way: we throw out a bunch of Lambda functions, like this is great. And immediately the next question is, "How do I know they're working? How do I know when they're not working? What's going on with them?" CloudWatch Logs is absolutely awful, so trying to understand what it's doing through there is just super painful, and the deployment model is kind of janky right now. How I've been deploying them is just a shell script wrapped around the aws-cli. I'm sure there are better ways to do it, so is there other stuff like this? Are there other things that we're not really thinking about, and what do we do about those?



Yan Cui: Yeah, absolutely. The funny thing is that a lot of the problems you talk about are things I hear from clients or from people in the community all the time: how do I do deployment, how do I do basic observability stuff. The thing is that there are solutions out there that address these to various degrees, and I think you find, as is the case with a lot of AWS services, that they cover the basic needs. CloudWatch Logs is a perfect example of that, but it does so very crudely.



Mike Julian: Right, it's like an MVP of a logging system.



Yan Cui: Yes.



Mike Julian: Every CloudWatch team, it's true.



Yan Cui: And the same goes for, I guess, CloudWatch itself as well, but the good thing is that at least you don't have to worry about installing agents and whatever to ship your logs to CloudWatch Logs. So CloudWatch Logs becomes a good staging place to gather your logs, and from there you can actually ship them somewhere else. Maybe an ELK stack, or maybe one of the managed services like [inaudible 00:18:48] Loggly or Splunk or something else. So the paradigm of doing that is actually pretty straightforward. I've got two blog posts which I guess we can link to ...
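
Editor's sketch: one common way to use CloudWatch Logs as a staging place is a subscription filter that invokes a Lambda function with batches of log events. The function receives the batch base64-encoded and gzip-compressed under `event["awslogs"]["data"]`. The sketch below only shows the unpacking step; `ship_records` is a hypothetical placeholder for whatever destination you use (ELK, a hosted service, and so on), and this is not necessarily the approach described in Yan's blog posts.

```python
import base64
import gzip
import json


def decode_subscription_event(event):
    """Unpack the base64-encoded, gzip-compressed payload CloudWatch Logs sends to Lambda."""
    compressed = base64.b64decode(event["awslogs"]["data"])
    return json.loads(gzip.decompress(compressed))


def to_records(payload):
    """Flatten a decoded payload into one record per log event."""
    return [
        {
            "log_group": payload["logGroup"],
            "log_stream": payload["logStream"],
            "timestamp": log_event["timestamp"],
            "message": log_event["message"],
        }
        for log_event in payload["logEvents"]
    ]


def ship_records(records):
    """Hypothetical placeholder: send the records to your log aggregation service."""
    pass


def handler(event, context):
    """Lambda entry point: decode the batch, forward it, and report how many events were seen."""
    records = to_records(decode_subscription_event(event))
    ship_records(records)
    return len(records)
```

Keeping the decoding and flattening separate from the shipping step makes it easy to swap the destination without touching the parsing logic.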



Mike Julian: Yeah we'll throw those in the show notes.



Yan Cui: ... In the show notes. One other thing which I think is quite important is security. Again, as developers, we are just not used to thinking about security, and I see a lot of organizations try to tackle the security problem with this hammer called VPC, as if having a VPC is gonna solve all of your problems. In fact, of every single VPC I've seen in production, none of them do egress filtering, so if anyone is able to compromise your network security, they find themselves in this fully trusted environment where services talk to each other with no authentication, because you just assume it's trusted since you're inside the VPC. But we've seen several times how easy it is to compromise the whole ecosystem by attacking the dependencies everyone [has]. I think it was last year when a researcher managed to compromise something like 14% of all NPM packages, which accounts for something like a quarter of the monthly downloads of NPM, including-



Mike Julian: Well that's gonna make me sleep well.



Yan Cui: So imagine if someone just compromised one of your dependencies and put a few lines of code there to scan your environment variables and then send them to their own backend, to harvest all these different AWS credentials and see whether or not they can do some funky stuff or Bitcoin mining with them. That is not something that you can really protect against by putting a VPC in front of things. And yet, we see people try to take this huge hammer and apply it to serverless all the same, even though when it comes to Lambda, you pay a massive price for using VPCs in terms of how much cold start you experience. My experience tells me that having a Lambda function running inside a VPC can add as much as 10 seconds to your cold start time, which basically rules it out for any user-facing APIs you have. But with Lambda, you can actually control your permissions down to the function level, and that's again something that I see people struggle with, because we don't like to think about, oh, these IAM permissions and stuff. It's difficult, it's laborious.



Mike Julian: Well you know, I think the real problem is that no one knows how IAM actually works.



Yan Cui: To be fair though, I guess I'm probably a bad example because I've been using AWS for such a long time and I'm used to the mechanics of IAM and writing the permissions and the policies, but yes, it is much more complicated than people-



Mike Julian: It is a little esoteric.



Yan Cui: Yes, definitely. And I have seen some tools now coming onto the market, I think PureSec is one of them and there are a few others, that are all looking at how we automate this process: identifying what your function needs by doing static analysis on your code to see how you're interacting with the AWS SDK, to see, oh, your function talks to this table, and then, when you deploy or during a CI/CD pipeline, noticing that, hey, your function doesn't have the right permissions, it's overly permissive. Because again, a lot of people are just using star: give my function access to everything. Which also means that if your function is compromised, the attacker can get your credentials and do everything with those temporary credentials you have. So some of these tools are going to automate away the pain that we experience as developers in figuring out what permissions our function actually needs, and then try to automatically generate those templates that we can just put into our deployment framework. And you talked about deployment being clunky right now. There are quite a lot of different deployment frameworks that take care of a lot of the plumbing and complexity under the hood. I don't know if you've ever tried to provision an API Gateway instance using CloudFormation or Terraform, it's horrendous.
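
Editor's sketch: the "just using star" problem is easy to check for mechanically. The function below scans an IAM policy document (the standard JSON shape with `Statement`, `Effect`, `Action`, and `Resource`) and flags `Allow` statements that use wildcard actions or a wildcard resource. It is a deliberately crude illustration, not how PureSec or any other tool works; those tools, as described above, analyze the function's code to work out what it actually needs.

```python
def _as_list(value):
    """IAM allows a single string or a list in most fields; normalize to a list."""
    return value if isinstance(value, list) else [value]


def find_overly_permissive_statements(policy):
    """Return (statement index, reason, offending values) for wildcard grants."""
    findings = []
    for index, statement in enumerate(_as_list(policy.get("Statement", []))):
        if statement.get("Effect") != "Allow":
            continue
        actions = _as_list(statement.get("Action", []))
        resources = _as_list(statement.get("Resource", []))
        # "*" grants everything; "service:*" grants every action on one service.
        wildcard_actions = [a for a in actions if a == "*" or a.endswith(":*")]
        if wildcard_actions:
            findings.append((index, "wildcard action", wildcard_actions))
        if "*" in resources:
            findings.append((index, "wildcard resource", ["*"]))
    return findings
```

A check like this could run in a CI/CD pipeline and fail the build when a function's role is broader than it needs to be; the table ARN and action names in any real policy would of course be your own.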



Mike Julian: It's not exactly simple.



Yan Cui: It's so, so complicated because of the way resources are organized in API Gateway. But with something like the Serverless Framework or AWS SAM or a number of other frameworks out there, I can just write a human-readable URL in one line that translates to, I don't know, maybe 100 lines of CloudFormation template code.



Mike Julian: That's awful.



Yan Cui: This is just not stuff that I wanna deal with, so there are frameworks out there that ease a lot of the burden with deployment and similar things. On the visibility side of things as well, there are quite a lot of companies focusing on tackling that side of the equation, in terms of giving you better traceability. Because one of the things we find with serverless is that people are now building more and more event-driven architectures, because it's so easy to do them nowadays.



Mike Julian: Right.



Yan Cui: And part of the problem with that is, they are a lot harder to trace compared to direct API calls. With API calls, I can easily just pass some correlation ID along in the headers, and a lot of the existing tools like Amazon X-Ray can just kick in, since they integrate with API Gateway and Lambda out of the box. But as soon as my event goes over asynchronous event sources like SNS, Kinesis or SQS, then I lose the trace entirely, because they don't support these asynchronous event sources. But there are companies like Epsagon who are now looking at that problem specifically and trying to understand how data flows through the entirety of the system, whether it's synchronous through APIs, or asynchronous through the event streams or task queues or SNS topics that you have. And there are also companies that are focusing on the cost side of things, understanding the cost of user transactions that span across this massive web of different functions, loosely coupled together through different event sources. CloudZero is one of those, I guess one of the foremost companies focusing on the cost side of the story for serverless architectures. So there are quite a lot of interesting startups focusing on various aspects of the problems that we've just described. And I think in the next six to twelve months, we're definitely gonna see more and more innovation in this space, even beyond all the things that Amazon's already doing under the hood.
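
Editor's sketch: one way to keep a correlation ID alive across an asynchronous hop is to carry it in the message itself. For SNS, that can be a message attribute: the publisher sets it (in the `DataType`/`StringValue` shape that the SNS publish call expects), and the consuming Lambda function reads it back from the event record (where it appears as `Type`/`Value`). The attribute name `x-correlation-id` and both helper functions are hypothetical; this illustrates the idea only and is not how X-Ray or Epsagon work.

```python
import uuid

CORRELATION_ID_ATTRIBUTE = "x-correlation-id"  # hypothetical attribute name


def build_message_attributes(correlation_id=None):
    """Message attributes to pass to an SNS publish call, reusing an incoming ID if there is one."""
    return {
        CORRELATION_ID_ATTRIBUTE: {
            "DataType": "String",
            "StringValue": correlation_id or str(uuid.uuid4()),
        }
    }


def extract_correlation_id(sns_record):
    """Read the correlation ID from one record of a Lambda SNS event, or None if absent."""
    attributes = sns_record.get("Sns", {}).get("MessageAttributes", {})
    attribute = attributes.get(CORRELATION_ID_ATTRIBUTE)
    return attribute["Value"] if attribute else None
```

A function in the middle of a chain would call `extract_correlation_id` on each incoming record, include the ID in its log lines, and pass it to `build_message_attributes` when publishing onward, so that logs from every hop can be stitched together by that one ID.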



Mike Julian: Yeah, that sounds like it will be awesome. This whole area still feels pretty immature to me. I know there are people using it in production. There were also people using Mongo in production while it was dropping data like crazy every day. So more power to them if they don't like data. But I like stable things. So it sounds like serverless is still maturing. It is ready, but we're still kinda working some of the kinks out? Would that be a fair characterization?



Yan Cui: I think that's a fair characterization in terms of the tooling space, because a lot of things are provided by the platform and, as I mentioned before, Amazon is good at meeting the basic needs that you have. So you can probably get by with a lot of the tools out of the box, but that also, I guess, slows down some of the commercial tooling support of the kind that comes with something like containers and Kubernetes, because there you only get so much out of the box, so that's a huge opportunity for vendors to jump in very, very quickly. At the same time, I think those innovations are happening a lot faster than people realize. Maybe one of the problems is just education, getting the information out about all the different tools that are coming into the space and making people aware of them.



Mike Julian: That's really interesting, and what I think a lot of people forget is exactly how old Docker is, because Docker was kind of in the same position as serverless, where it was really cool but it was still pretty immature. And thinking about when these things came out, now we're seeing Kubernetes, which is maturing that ecosystem further, actually in production. We know the patterns, we know how all that stuff is being deployed, we know how to manage it, we know the security. It is pretty mature, but how long did it actually take to get there? Looking at it, Docker's initial release was in 2013. That's like five years ago, which blows my mind, and Kubernetes' initial release was in 2014, four years ago. But it's only really been in the past year or two that Kubernetes has been what we'd call mature. And now we're starting to see this massive uptick of abstraction layers on top of Docker in the form of Kube. At some point, I think we're gonna see that with serverless, where it's not just like, oh, we're deploying this Lambda function and calling it a day. I think we're gonna see a lot more tooling, a lot more abstraction that brings it all together and makes it so much easier to deal with, especially at scale.



Yan Cui: Yeah, I absolutely agree, and just in terms of the dates you mentioned, the initial announcement of Lambda was in 2014, so in terms of age, it's not that much younger than Docker and Kubernetes.



Mike Julian: Wow.



Yan Cui: Where it differs is that it's a brand new paradigm, whereas with containers and Kubernetes, it's a lot easier for you to lift and shift existing workloads without having to massively restructure your application to be idiomatic for the paradigm. With Lambda, and with serverless, there is that requirement: in order to be idiomatic, there's a lot of restructuring and rethinking you need to do, because it's a mindset change. And that takes a lot longer than just a technology change.



Mike Julian: Right, yeah. We're talking about something completely new here. So it's not like, oh we'll just go implement Lambda over night and we'll call it a day. We'll just move our whole application over. It's not like when we start putting things in containers. We could actually put a thing in a container, but really all we're doing by lifting and shifting was, moving from one server to another except now it's a smaller server.



Yan Cui: Yes.



Mike Julian: We had the idea of the fat container where you had absolutely everything in a container. That is a bad idea, it's a dumb pattern. And it's going the same way with serverless, I think. You can't just lift and shift. It is a brand new architectural pattern. It requires a lot of serious thought.



Yan Cui: Yeah, and I think one of the pitfalls I see with serverless adoption sometimes is that we are so engrossed in this whole movement into a new paradigm that we just forsake all the things we've learned in the past, even though a lot of the principles still very much apply. In fact, a lot of what I've been writing about is basically how we take previous principles and apply them, adjust them and make them work in this new paradigm. The practices and patterns may have to change, because some things just don't work anymore, but a lot of principles still very much apply. Why do we do structured logging? Why do we do sampling in production? With all those things, the principles still very much apply when it comes to serverless. It's just that how we get there is different. And I think that is one of the key things I had to learn in the last couple of years. It's the same with databases: a lot of the things we learned about databases are still very much here to stay, even if we don't need the specific skill set that DBAs provide for us in the new world of NoSQL databases. When it comes to serverless, I guess that leap from looking at practices to understanding the principles behind them, why we do it and how we can apply those principles, is super important when it comes to making a successful adoption of serverless in your organization.



Mike Julian: That's an absolutely fascinating perspective, and I completely agree. What I absolutely love about it is, the principles of site reliability haven't actually changed. The principles of how we run and manage systems haven't really changed a whole lot in the past 10 years, which is fantastic. That's how it should be. We should always be looking for true principles, the stuff that's kind of the pillars of how we behave and how we look at what we work on. How we do it changes all the time, and it absolutely should, but the principles shouldn't change that much. So that's interesting, trying to apply the principles that we already know to be true, the practices that we know work, to a new paradigm. And sure, maybe some of them aren't going to apply very well and maybe we'll have to create new ones, which I'm sure will be coming out of this. But we don't have to start from scratch.



Yan Cui: No, what's that saying again? Those who don't know history are doomed to repeat it.



Mike Julian: Right, exactly. We've talked a lot about the failures and the challenges, and you keep mentioning this idea, the business case for serverless. So sell me on it. I want to deploy serverless in my company. I'm just an engineer, but I really like it, so I wanna move everything to it. I wanna do a new application in it. What should I be thinking about? How do I come up with this business case?



Yan Cui: I think the most important question there is, what does the business care about? And I think pretty much every business I know of cares about delivery and speed. As a business, you want to deliver the best possible user experience and you want to build the right features that your users actually want, but to do that, you need to be able to hit the market quickly and inexpensively, so that you can also then iterate on those ideas, and that allows you to tell the good ideas from the bad ones, and then you can double down on the good ideas and make them really great. And the more your engineering team has to do itself, then by definition, the slower you're gonna be able to move. And that's why businesses should care about serverless, because it frees the engineering teams from having to worry about a whole load of concerns. They don't need to know how the applications are hosted, and can let the real experts, the people that work for AWS, worry about that undifferentiated heavy lifting. And then that frees the brainpower that you actually have, which by the way is super expensive, to solve the problems that your users actually care about. No user cares about whether or not your application runs on containers or VMs or serverless, but they do care about when you're gonna deliver and they do care about you building the right features. And that again, that needs you to optimize for time to market and also to iterate quickly. A lot of people talk about vendor lock-in as if Amazon's gonna one day just ... worry about Amazon holding the key to your kingdom, but I think the real-



Mike Julian: That's the last thing I'm worried about.



Yan Cui: Yeah exactly, I think the biggest problem we should worry about is a competitor who can iterate faster than you, locking you out of the market altogether.



Mike Julian: Right.



Yan Cui: Yeah so I think that's why they should really really care about serverless.



Mike Julian: I agree with that. That sounds great. The biggest thing that I see with technology is, with engineers and their engineering architectural decisions, it seems that a lot of decisions are based essentially on resume-driven development. I've met a lot of engineers who say, "I built this new application in Go because I wanted to learn Go," and I'm like, "That's cool, what does the business have to say about that?" And it's like, "Well, I convinced my boss to use Go." I'm like, "No you didn't." Like your entire shop's in PHP, you basically just said PHP is shit. That was your business case. Instead, yes, we should be looking at this from the perspective of how quickly can I get this new product to market? How quickly can I ship this feature? And yeah, there might be some scenarios where switching a language or switching a framework would be useful, but I agree with you that we really should be focused significantly more on time to market and time to value. We're here to help our businesses make money, or in my case, help my business make money. But for me, I have an application that I'm writing in PHP right now. It's PHP and MySQL and it's gonna be a core facet of my own company. And most engineers would say I'm crazy for writing PHP, but the entire point is that I don't have time to deck around. I need to have this out in the market.



Yan Cui: Yeah absolutely, totally agree. And those kind of conversations, I've had quite a few of them in the past myself, and also I've heard a lot of similar arguments in terms of, oh, why should we use, for example, functional programming. And I was active in the functional programming community for quite a long time and am still a big fan of functional programming, but not for the reason that it makes your code more readable. Again, it's about moving up the abstraction ladder so that I have to do less, and it's about getting that leverage to be able to do more with less, and I think that's the argument that we should be making more, as opposed to how I like to read my code.



Mike Julian: Right, let’s take this from two different perspectives. For the people that are brand new to serverless, what can they do this week, or today, to learn more about it? And for the people that already have serverless in their infrastructure, what can they do this week to improve their situation?



Yan Cui: I think learning by doing is always the best way to get to grips with something. So if you are just starting, definitely with serverless, it's so easy to get started and play around with something, and when you're done, just delete everything with CloudFormation, it's a single button click, or if you're using the right tools, it's a single command. So definitely go build something. If you've got questions where you don't know how the platform behaves, then build a proof of concept, try it out yourself. It's super, super simple nowadays. That's how I've learnt a lot of the things I now know about serverless, just by running experiments. Come up with the question, come up with a hypothesis on how I expect things to work, or how I expect the platform to behave, do a proof of concept to answer those questions, and then again, I like to write about things so that I have a record of it afterwards but also so I can share with other people the things that I've learned.



Yan Cui: And if you've already started, and you want to take your game to the next level, I don't wanna be boasting myself, but do check out my blog. I have shared a lot of the things that I've learnt about running serverless in production, the sort of problems you run into, and addressing a lot of the observability concerns, and I also have a video course with Manning as well. Feel free to check it out, where we actually build something from scratch and apply a lot of the things that I've been talking about for the last year and a half, two years, in terms of how do you do all the basic observability things, how to think about security, VPCs and performance and so on. So all of that will be available in the podcast episode notes. Yeah, and also just go out there and talk to other people and learn from them. There are a lot of very knowledgeable people in this space already. People like Ben Kehoe from iRobot, people like Paul Johnston and Jeremy Daly, and there are quite a lot of people who have been very active in sharing their knowledge as well and their experiences. Definitely go out there, find other people who are doing this, and try and learn from them.



Mike Julian: That's awesome. So thank you so much for joining us. Where can people find more about you and your work?



Yan Cui: You can find me on theburningmonk.com, that's my blog, where I try to write actively, and you can also find me on Twitter as well. I try to share new things that I find interesting, anything I learn, and whenever I write something, I also publish it there as well. And if you don't wanna miss anything, I also have a newsletter you can subscribe to on my blog, where I try to write up regular summaries and updates on things I've been doing. And also, I'm available for doing some consultancy work if you need some help in your organization, either to get started or to tackle specific problems that you have with serverless as well.



Mike Julian: Wonderful. Well, thank you so much for joining us. And on that note, thanks for listening to the Real World DevOps podcast. If you wanna stay up to date on the latest episodes, you can find us at realworlddevops.com, and on iTunes, Google Play or wherever you get your podcasts. I'll see you in the next episode.



Yan Cui: See you guys.


What is Real World DevOps?

I'm setting out to meet interesting people doing awesome work in the world of DevOps. From the creators of your favorite tools to the organizers of amazing conferences, from the authors of great books to fantastic public speakers. I want to introduce you to the most interesting people I can find.