Don't just learn the cloud—BYTE it!
Byte the Cloud is your go-to, on-the-go podcast for mastering AWS, Azure, and Google Cloud certifications and exam prep!
Chris 0:00
Hey everyone, and welcome back for another deep dive.
Kelly 0:02
It's great to be here.
Chris 0:03
Today we're going to be talking about Amazon S3, which is such a fundamental service in AWS. I think it's something that any cloud engineer is going to run into pretty frequently in their work, and in their exams too. So to get started, can you give us a little bit of an overview of what S3 is and why it's so important?
Kelly 0:22
At a high level, S3 stands for Simple Storage Service, and it's AWS's object storage service. So instead of having a file system like you would on your computer, it's all based on objects and buckets. You have these buckets, and within each bucket you have objects, and those objects contain both your data and metadata.
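As a rough illustration of the bucket-and-object model described here, the sketch below uses a plain in-memory dictionary. It is a toy model and not the S3 API: the `StoredObject` class, the key name, and the metadata fields are all hypothetical, chosen only to show that each object is looked up by a unique name (S3 calls this the key) and carries metadata alongside its data.

```python
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    """One object: the raw bytes plus descriptive metadata."""
    data: bytes
    metadata: dict = field(default_factory=dict)


# A bucket maps a unique key (the object's name) to the object itself.
# Toy in-memory model only; nothing here talks to AWS.
bucket = {}
bucket["reports/2024/summary.txt"] = StoredObject(
    data=b"quarterly numbers",
    metadata={"content-type": "text/plain", "author": "finance-team"},
)

obj = bucket["reports/2024/summary.txt"]
print(obj.data, obj.metadata["content-type"])
```

The slashes in the key look like folders, but in this model, as in S3, the key is just one flat string.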
Chris 0:47
So it's not just the file itself; it's also information about the file. Okay, so why is this object storage model so important?
Kelly 0:57
It offers a ton of benefits. First of all, it makes S3 incredibly scalable, so you can store a massive amount of data, from tiny text files to huge video files and everything in between. It's also really durable and available. What does that mean, exactly? Durability means that your data is very unlikely to be lost: S3 is designed to withstand hardware failures, data center outages, all sorts of things.
Chris 1:28
So it's, like, really, really safe.
Kelly 1:31
Yeah, super safe. And then availability means that you can access your data whenever you need it. S3 has a very high uptime, and it's designed to be always available. So it's a very reliable service.
Chris 1:42
Okay, and so can you give us some examples of what S3 is used for, like, in the real world?
Kelly 1:48
Yeah, so there are a ton of use cases for S3. One common one is storing media files, like images, videos, and audio files. A lot of companies use S3 to store their media libraries. Another common use case is backups and archives: you can store your backups in S3, which gives you an extra layer of protection, and you can also use it for long-term archival storage. You can also host static websites on S3, so if you have a simple website that doesn't need a lot of dynamic content, you can host it directly on S3. And it's also used for a lot of other things, like big data analytics and machine learning.
Chris 2:34
Awesome. So it sounds like S3 is used by a lot of different industries and for a lot of different purposes; it's really versatile. Before we dive deeper into some of the features of S3 and how to prepare for the exam, can you tell us a little bit about where S3 fits into the overall AWS ecosystem?
Kelly 2:53
So S3 is kind of like a foundational service in AWS. It's used by a lot of other services. For example, you can use S3 to store data that's then processed by EC2 instances, or to store data that's analyzed by services like EMR or Athena. So it's really like a central data store that a lot of other AWS services can interact with. It's really important to understand S3 because it's going to touch a lot of different parts of your AWS infrastructure.
Chris 3:26
Got it. So I think we have a good understanding of what S3 is and why it's so important. Let's dive a little deeper into some of the features of S3 and how to prepare for the exam. One of the things that I always find a little bit confusing about S3 is the different storage classes. Can you explain what those are and why they're important?
Kelly 3:48
S3 offers different storage classes, which are basically different ways of storing your data. They each have different characteristics in terms of cost, performance, and scalability, so you can choose the storage class that best meets your needs.
Chris 4:05
Okay, that makes sense. So can you give us some examples of the different storage classes?
Kelly 4:09
Yeah, so the most commonly used storage class is S3 Standard, and this is the default storage class. It's designed for frequently accessed data, and it offers high durability and availability. But it's also the most expensive storage class.
Chris 4:26
Okay, that makes sense. So if we don't need to access the data very often, we can use a different storage class that's cheaper. Cool. So what are some of the other storage classes?
Kelly 4:36
Another popular one is S3 Infrequent Access, or S3 IA for short. This is designed for data that's accessed less frequently, so it's cheaper to store than S3 Standard, but you pay a fee each time you retrieve your data. And then there's also S3 Glacier, which is designed for long-term archival storage. This is for data that you rarely need to access. It's the cheapest storage class, but it has the highest latency, so it can take hours or even days to retrieve your data.
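One way to picture the choice between these three classes is as a small decision function. The sketch below is only an illustration: the function name `pick_storage_class` and the thresholds are assumptions made up for this example, not AWS guidance. The returned strings are the storage class identifiers the S3 API uses for Standard, Standard-IA, and Glacier.

```python
def pick_storage_class(reads_per_month: float, max_wait_hours: float) -> str:
    """Return an S3 storage class name for a rough access pattern.

    The thresholds are illustrative assumptions, not AWS guidance.
    """
    if max_wait_hours < 1:
        # Data must come back right away, so archival storage is ruled out.
        # Frequent reads favor STANDARD; rare reads favor STANDARD_IA.
        return "STANDARD" if reads_per_month >= 1 else "STANDARD_IA"
    # The caller can tolerate a restore delay, so archive the data.
    return "GLACIER"


print(pick_storage_class(reads_per_month=500, max_wait_hours=0))
print(pick_storage_class(reads_per_month=0.1, max_wait_hours=0))
print(pick_storage_class(reads_per_month=0.01, max_wait_hours=24))
```

A real decision would also weigh object size, minimum storage duration, and retrieval pricing, which this sketch leaves out.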
Chris 5:15
So it sounds like there's a trade-off between cost and performance. We have to choose the storage class that meets our needs in terms of how often we need to access the data and how much we're willing to pay. Got it. So we've talked about the different storage classes, but there's a lot more to S3 than just storage. Let's talk about some of the other features that are important for the exam. One of the things that comes up a lot is security. Can you talk about some of the ways that you can secure your data in S3?
Kelly 5:47
So security is obviously super important when you're dealing with any kind of cloud storage, and S3 offers a lot of different features to help you secure your data. One of the most important is access control: you can control who has access to your buckets and your objects, using a variety of mechanisms. One is bucket policies, which allow you to define rules for who can access your bucket and what actions they can perform. You can also use access control lists, or ACLs, which are attached to individual objects and allow you to define more granular permissions.
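To make the "who can access the bucket and what actions they can perform" idea concrete, here is a minimal sketch that builds a read-only bucket policy as JSON. The bucket name and account ID are placeholders invented for this example, and the helper `read_only_policy` is hypothetical. The document layout (`Version`, `Statement`, `Effect`, `Principal`, `Action`, `Resource`) follows the standard IAM policy grammar.

```python
import json

BUCKET = "example-media-bucket"   # hypothetical bucket name
ACCOUNT_ID = "111122223333"       # placeholder AWS account ID


def read_only_policy(bucket: str, account_id: str) -> dict:
    """Build a policy letting one account read objects, and nothing else."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowAccountReadOnly",
                "Effect": "Allow",
                # Who: every identity in the given account.
                "Principal": {"AWS": f"arn:aws:iam::{account_id}:root"},
                # What: download objects only.
                "Action": ["s3:GetObject"],
                # Where: every object inside the bucket.
                "Resource": f"arn:aws:s3:::{bucket}/*",
            }
        ],
    }


policy = read_only_policy(BUCKET, ACCOUNT_ID)
policy_json = json.dumps(policy, indent=2)
print(policy_json)
```

The resulting JSON string is what you would attach to the bucket, for example through the console or an SDK call.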
Chris 6:28
So bucket policies are for the whole bucket, and ACLs are for individual objects. Okay, cool. And then another important security feature is encryption. Can you talk about that a little bit?
Kelly 6:40
Yeah, so encryption is basically just a way of scrambling your data so that it can't be read by anyone who doesn't have the key. S3 offers a variety of encryption options. You can use server-side encryption, which means that AWS manages the encryption keys for you, or you can use client-side encryption, which means that you manage the encryption keys yourself.
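For server-side encryption, the choice shows up as a couple of fields on the upload request. The sketch below builds those fields for the two common server-side variants, SSE-S3 and SSE-KMS. The helper `encryption_params` and the key ARN are hypothetical. The field names `ServerSideEncryption` and `SSEKMSKeyId`, and the values `AES256` and `aws:kms`, are the ones the S3 PutObject API uses. Nothing here contacts AWS.

```python
from typing import Optional


def encryption_params(kms_key_id: Optional[str] = None) -> dict:
    """Build the server-side encryption fields for an upload request."""
    if kms_key_id is None:
        # SSE-S3: S3 creates and manages the keys on your behalf.
        return {"ServerSideEncryption": "AES256"}
    # SSE-KMS: encrypt under a KMS key that you control.
    return {"ServerSideEncryption": "aws:kms", "SSEKMSKeyId": kms_key_id}


# Placeholder key ARN, for illustration only.
EXAMPLE_KEY = "arn:aws:kms:us-east-1:111122223333:key/EXAMPLE"

print(encryption_params())
print(encryption_params(EXAMPLE_KEY))
```

With an SDK such as boto3, a dictionary like this would typically be unpacked into the upload call alongside the bucket, key, and body.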
Chris 7:04
Got it. So there are a lot of different ways to secure your data in S3, and it's important to choose the right options for your specific needs. Okay, cool. Let's move on to another important feature, which is versioning. Can you explain what versioning is and why it's important?
Kelly 7:22
So versioning is basically like a time machine for your data. With versioning enabled on a bucket, every time you modify an object, S3 creates a new version of that object and keeps all of the previous versions, so you can always go back and retrieve an older version if you need to.
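The "time machine" behavior can be sketched with a toy in-memory model. This is not the S3 API: the `VersionedBucket` class and its integer version IDs are invented for illustration. It shows each write adding a new version while older ones stay retrievable. A delete is modeled as a marker placed on top of the history rather than an erasure, which is how versioned buckets behave.

```python
import itertools


class VersionedBucket:
    """Toy in-memory model of a versioning-enabled bucket (not the S3 API)."""

    _DELETED = object()  # stands in for a delete marker

    def __init__(self):
        self._versions = {}             # key -> list of (version_id, data)
        self._ids = itertools.count(1)  # simple increasing version IDs

    def put(self, key, data):
        """Store a new version on top of any existing ones."""
        version_id = next(self._ids)
        self._versions.setdefault(key, []).append((version_id, data))
        return version_id

    def delete(self, key):
        """Add a delete marker; earlier versions stay stored."""
        return self.put(key, self._DELETED)

    def get(self, key, version_id=None):
        """Return the latest version, or a specific one if requested."""
        history = self._versions.get(key, [])
        if version_id is None:
            candidates = history[-1:]
        else:
            candidates = [v for v in history if v[0] == version_id]
        if not candidates or candidates[0][1] is self._DELETED:
            raise KeyError(key)
        return candidates[0][1]


store = VersionedBucket()
v1 = store.put("notes.txt", b"first draft")
v2 = store.put("notes.txt", b"second draft")
store.delete("notes.txt")

# The latest view is "deleted", but the first draft is still retrievable.
print(store.get("notes.txt", version_id=v1))
```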
Chris 7:41
It's like undo for S3. Cool. And so why is versioning important?
Kelly 7:48
Well, it's important for a couple of reasons. First of all, it helps you protect your data from accidental deletion: if you accidentally delete an object, you can always go back and retrieve an older version. It also helps you protect your data from malicious modification. If someone were to tamper with your data, you would be able to see the changes that they made, and you could revert to the previous version. So it's a really important feature for data protection.
Chris 8:14
Yeah, it sounds like it. Okay, cool. So another feature that I wanted to talk about is lifecycle management. Can you explain what that is and why it's important?
Kelly 8:24
Lifecycle management is basically a way of automating the management of your data in S3. You can define rules that automatically transition your data between different storage classes, or that automatically expire your data after a certain period of time. This can help you save money by moving your data to cheaper storage classes when it's not being actively used, and it can also help you free up storage space by deleting data that you no longer need.
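A lifecycle rule of the kind described here might look like the sketch below. The rule ID, the `logs/` prefix, and the day counts are assumptions picked for illustration. The dictionary layout (`Rules`, `Filter`, `Transitions`, `Expiration`) mirrors the lifecycle configuration document S3 accepts. The helper `state_at_age` is hypothetical and only simulates locally what the rule would do to an object of a given age.

```python
lifecycle_config = {
    "Rules": [
        {
            "ID": "archive-then-expire-logs",   # hypothetical rule name
            "Status": "Enabled",
            "Filter": {"Prefix": "logs/"},      # applies to keys under logs/
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 90, "StorageClass": "GLACIER"},
            ],
            "Expiration": {"Days": 365},        # delete after one year
        }
    ]
}


def state_at_age(rule: dict, age_days: int) -> str:
    """Simulate where an object sits after age_days under this rule."""
    if age_days >= rule["Expiration"]["Days"]:
        return "EXPIRED"
    current = "STANDARD"  # assume objects start in S3 Standard
    for step in sorted(rule["Transitions"], key=lambda s: s["Days"]):
        if age_days >= step["Days"]:
            current = step["StorageClass"]
    return current


rule = lifecycle_config["Rules"][0]
for age in (10, 45, 200, 400):
    print(age, state_at_age(rule, age))
```

The simulation is only a reading aid; in practice S3 evaluates the rules itself once the configuration is attached to the bucket.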
Chris 8:56
Okay, cool. So we've talked about a lot of different features of S3, but there's one more that I want to touch on, which is integration with other AWS services. Can you give us some examples of how S3 integrates with other services?
Kelly 9:12
Absolutely. So S3 is a very versatile service, and it integrates with a lot of other AWS services. For example, you can use S3 to store data that's then processed by EC2 instances; we talked about that a little bit earlier. You can also use it to store data that's analyzed by services like EMR or Athena. You can use it to store backups from your databases, and you can use it to store logs from your applications. So it really integrates with a lot of different services.
Chris 9:46
It's kind of like the central data store for a lot of different AWS solutions. Okay, cool. So we've talked about a lot of different features of S3, but I think the best way to really learn about S3 is to actually use it. So let's jump into some exam-style scenarios and see how we can apply what we've learned. Let's say you're working for a company, and they need to store a massive amount of data, like terabytes or even petabytes, and they need to be able to access it very quickly, like within milliseconds. What kind of S3 solution would you recommend?
Kelly 10:24
In that case, you'd want to go with S3 Standard, because that offers the lowest latency and it's designed for frequently accessed data. But it's also the most expensive storage class, so you'd have to weigh the cost versus the performance.
Chris 10:40
That makes sense. Okay, so let's say we have a different scenario. This company has a lot of data that they need to store, but they don't need to access it very often, maybe once a year or even less. What kind of storage class would you recommend in that case?
Kelly 10:56
So for that, you'd probably want to go with S3 Glacier Deep Archive. That's the cheapest storage class, but it has the highest latency, so it can take hours or even days to retrieve your data.
Chris 11:08
Okay, so if we really need to access the data quickly, then Glacier Deep Archive is not the right choice. Okay, cool. So let's try another scenario. This company needs to store some sensitive data, and they need to make sure that it's encrypted. What are some of the options that they have for encrypting their data in S3?
Kelly 11:30
There are a couple of different options. One is server-side encryption, and there are a couple of different flavors of that. You can use SSE-S3, which is the simplest option: AWS manages the encryption keys for you. Or you can use SSE-KMS, which gives you more control over the encryption keys. And then there's also client-side encryption, which means that you manage the encryption keys yourself.
Chris 11:57
Got it. Okay, so there are a lot of different options for encryption, and it really depends on the company's security requirements. Okay, cool. So I think we've covered a lot of ground today. We've talked about what S3 is, why it's important, some of the key features, and how to prepare for the exam. So I hope you found this deep dive helpful, and good luck with your studies.