Don't just learn the cloud—BYTE it!
Byte the Cloud is your go-to, on-the-go podcast for mastering AWS, Azure, and Google Cloud certifications and exam prep!
Chris 0:00
Welcome back to the deep dive. Today we're going deep on Amazon S3 Glacier.
Kelly 0:08
Ooh, S3 Glacier.
Chris 0:09
Yeah, it's a service that, you know, might sound a little niche, right? But I think it's super relevant, yeah, for anyone working with cloud storage. Absolutely. Especially for you mid-level cloud engineers out there listening. Yeah, if
Kelly 0:22
you're prepping for those AWS exams, exactly, this is one you don't want to miss, for sure.
Chris 0:26
So we're going to break down exactly what it is, why it matters, and how to use it effectively. That's
Kelly 0:32
the goal. And then, of course, we'll wrap things up, yeah, with some targeted Exam Prep, definitely
Chris 0:36
got to make sure you're ready to ace those
Kelly 0:38
tests. Exactly. So in the world of cloud computing, you know, we're always trying to optimize costs, right? Without, you know, compromising security. Oh, for
Chris 0:48
sure. Durability, yeah. And that's where S3 Glacier comes in.
Kelly 0:51
That's it. Think of it as, like, your super secure, ultra durable, and incredibly cost-effective cold storage solution. Cold storage. Yes, for data you need to keep safe, but you don't really access very often. Okay,
Chris 1:05
so for data we don't access frequently. But, like, what kind of data are we talking? Yeah, what kind of data? Give us some real-world scenarios where S3 Glacier would be the perfect fit. All right,
Kelly 1:17
so let's say you work for a media company, okay, and they have like, years of video footage, yeah, tons of it, audio recordings, huge media library, huge media libraries Exactly. They need to keep it
Chris 1:31
for a long time, right? Preservation future use exactly, but they don't need to look at
Kelly 1:35
it every day. Makes sense. So S3 Glacier is perfect for that. Okay? You can store those massive files for way less money than, like, traditional storage, right? Yeah, that makes sense.
Speaker 1 1:47
So way cheaper, way cheaper. It's just as safe, just as safe, just as durable. And then what other examples you got? Let's
Kelly 1:53
say a healthcare organization, okay, needs to keep patient records. Oh,
Chris 1:57
yeah, for compliance, exactly,
Kelly 1:58
compliance, you know, HIPAA. Exactly. They don't need to access those records all the time, right, but they
Chris 2:05
need to keep them, but they got to keep them safe and be able to get them if they need them
Kelly 2:08
Exactly. And S3 Glacier is perfect for that
Chris 2:10
too. Okay, so we're talking long-term storage for compliance, archiving media files, things like that. Yeah, that's the idea. But what makes S3 Glacier so special? What makes it special? Yeah, like, what are the features and benefits that really set it apart?
Kelly 2:25
So one of the best things about S3 Glacier is its durability. Okay, durability. It's designed for 11 nines of durability. 11 nines? 11 nines. What does that even mean? It means your data is practically indestructible. Wow. Like losing one object out of 10 billion. Oh, wow. Okay, that's pretty good. It's pretty good. And it also has really strong security features. Like what? Like access control policies using IAM. Oh, yeah, IAM. You can encrypt your data too. Okay, so encryption as well. With server-side encryption options like SSE-KMS. Got it. And it has compliance certifications, like HIPAA. Oh, so
Chris 3:01
it's HIPAA compliant for healthcare data. Yeah. Okay, wow. 11 nines of durability. That's practically unheard of.
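To make the "11 nines" figure concrete, here is a small arithmetic sketch. It assumes the figure is read as an annual per-object survival rate of 99.999999999%, and the ten-million-object archive is a hypothetical example. Under that reading the chance of losing any single object in a year works out to one in 100 billion, so the spoken "one out of 10 billion" is best taken as a rough gloss.

```python
from fractions import Fraction

# Assumption: eleven nines means 99.999999999% of objects survive a given year.
DURABILITY = Fraction(99_999_999_999, 100_000_000_000)


def expected_annual_losses(object_count: int) -> Fraction:
    """Average number of objects lost per year at the assumed durability."""
    return object_count * (1 - DURABILITY)


def years_per_lost_object(object_count: int) -> Fraction:
    """Average number of years between single-object losses."""
    return 1 / expected_annual_losses(object_count)


stored = 10_000_000  # hypothetical archive of ten million objects
print("expected losses per year:", expected_annual_losses(stored))
print("years per lost object:", years_per_lost_object(stored))
```

Exact fractions are used instead of floats so that a value this close to 1 does not pick up rounding error.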
Kelly 3:07
It's amazing. And
Chris 3:07
you mentioned security and compliance, which are obviously super important. Oh yeah, mission critical. Especially for sensitive data, like we were talking about, healthcare records and things. Got it. So it sounds like S3 Glacier checks a lot of boxes. It does, a lot of boxes. In terms of security, durability, and cost effectiveness. Yeah, it's the trifecta. But are there any limitations, though, or situations where it might not be the best choice? Yeah,
Kelly 3:32
you're right. It's not one size fits all. It's made for infrequent access, right? So it's not ideal if you need to get to your data really quickly. Okay. Retrieval times can be anywhere from minutes to hours. Oh, wow, minutes to hours? Depending on the option you choose. Okay. So if you're dealing with what we call hot data, hot data you need right away, yeah, S3 Glacier might not be the best
Chris 3:53
fit. Okay, so it's all about choosing the right tool for the job. Yeah, know your data, right? And speaking of fitting in, yeah, how does S3 Glacier play with the rest of the AWS ecosystem? It plays really well. Okay, good. It integrates with a lot of different AWS services. Give me some examples. Okay, so you can use S3 lifecycle policies. Lifecycle policies. To move your data automatically. Okay. From S3 Standard or S3 Intelligent-Tiering to S3 Glacier. So it can move it automatically? Automatically, after a certain amount of time. Oh, that's cool. Yeah, let's say you have some data in S3 Standard that you know you won't need after 90 days. Got it. You can set up a lifecycle policy to move it to S3 Glacier automatically. Oh, that's awesome. And save you money. So I don't have to do anything? You don't have to do anything. Set it
Kelly 4:39
and forget it. Set it and forget it, exactly. Automation at its finest. The best. Love that. I'm all about efficiency. As you should be. So lifecycle policies can move data to S3 Glacier. What other integrations are important?
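The 90-day lifecycle example described above could look roughly like the sketch below. It only builds the rule as a plain dictionary in the shape that boto3's `put_bucket_lifecycle_configuration` accepts; the helper name `build_archive_rule`, the rule ID, and the bucket name in the comment are hypothetical, and nothing here talks to AWS.

```python
def build_archive_rule(days: int, storage_class: str = "GLACIER", prefix: str = "") -> dict:
    """Build one lifecycle rule that transitions objects after `days` days."""
    if days < 0:
        raise ValueError("days must be non-negative")
    return {
        "ID": f"archive-after-{days}-days",  # hypothetical rule name
        "Status": "Enabled",
        "Filter": {"Prefix": prefix},  # empty prefix applies to the whole bucket
        "Transitions": [{"Days": days, "StorageClass": storage_class}],
    }


# Move everything to the Glacier storage class after 90 days, as in the episode.
lifecycle_configuration = {"Rules": [build_archive_rule(90)]}
print(lifecycle_configuration)

# With boto3 installed and credentials configured, this could be applied with:
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="example-media-archive",  # hypothetical bucket name
#       LifecycleConfiguration=lifecycle_configuration,
#   )
```

Keeping the rule as data makes it easy to review or test before anything is applied to a real bucket.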
Chris 4:53
Another good one is AWS Snowball.
Kelly 4:56
Snowball. What's that?
Chris 4:57
It's a physical storage device. Device, a physical device. Yeah, you can request it from AWS, okay, transfer your data to the device and
Kelly 5:06
ship it back to AWS. Wow. And then they import your data right into S3 Glacier.
Chris 5:12
So it's like the ultimate data mover. Exactly. When you need to get a ton of data into S3 Glacier quickly and securely. Quickly and securely.
Kelly 5:19
That's it. Okay? Yeah, it's especially useful if you have massive data sets and limited bandwidth. Yeah, that makes sense. Or, you know, bad internet connection, right?
Chris 5:27
Yeah, I can see how that would be helpful. Very helpful, all right, so I think we've got a pretty good understanding now, yeah,
Kelly 5:33
I think so
Chris 5:34
of what S3 Glacier is all about. The basics, its strengths, its limitations. You hit all the high points. Yeah, for sure. Now let's put that knowledge to the test. Ooh, the test. Yeah, with some exam-style questions. Let's do it. What are some things our listeners should be prepared to answer about S3 Glacier? All right,
Kelly 5:52
so a very common question you might get is: what are the key features of Amazon S3 Glacier that make it suitable for data archiving and long-term backup?
Chris 6:03
Okay, that's a good starting point. It is a good one. It gets right to the heart of it. Straight to
Kelly 6:07
the heart. What
Chris 6:08
S3 Glacier is all about. Yeah, what is it really about? So how would you answer that question? How would I answer that? Yeah, hitting on all the most important points.
Kelly 6:16
Okay, so I would start by emphasizing S3 Glacier's durability. Okay, durability. Yeah, 11 nines. Exactly, 11 nines, which means your data is incredibly safe, right? Protected from loss. Yes, super safe. Super safe. Then I would talk about the security features. Okay. Specifically, access control policies using IAM, right, which let you decide who can access the data and what they can do with it. Yeah, that's super important. Super important. And of course, we've got to talk about encryption. Oh, yeah. Encryption is key, especially for sensitive data. For sure. So I'd mention the different server-side encryption (SSE) options. Like what? Like SSE-S3, SSE-KMS, and SSE-C. Okay, got it. And finally, I would wrap it up by talking about cost effectiveness. Oh, right, because it's super cheap. Super cheap. S3 Glacier is designed for infrequent access, right? Which makes it way cheaper than other options like S3 Standard. Okay, so you're
Chris 7:13
getting amazing durability, robust security, yep, and all of that at a really low cost. That's amazing. It is amazing.
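The three server-side encryption options named in that answer can be sketched as upload parameters. This is a minimal illustration, assuming the data is written through the S3 API with the Glacier storage class; the helper `upload_params` and the example bucket, key, and key ID are hypothetical, while the parameter names follow boto3's `put_object`. The caller would still add `Body` before uploading.

```python
from typing import Optional


def upload_params(bucket: str, key: str, mode: str,
                  kms_key_id: Optional[str] = None,
                  customer_key: Optional[bytes] = None) -> dict:
    """Return put_object-style keyword arguments for one encryption mode."""
    params = {"Bucket": bucket, "Key": key, "StorageClass": "GLACIER"}
    if mode == "SSE-S3":
        # Keys are managed entirely by S3.
        params["ServerSideEncryption"] = "AES256"
    elif mode == "SSE-KMS":
        # Keys are managed in AWS KMS; the key ID is optional.
        params["ServerSideEncryption"] = "aws:kms"
        if kms_key_id is not None:
            params["SSEKMSKeyId"] = kms_key_id
    elif mode == "SSE-C":
        # The customer supplies the key with every request.
        if customer_key is None:
            raise ValueError("SSE-C needs a customer-provided key")
        params["SSECustomerAlgorithm"] = "AES256"
        params["SSECustomerKey"] = customer_key
    else:
        raise ValueError(f"unknown encryption mode: {mode}")
    return params


# Hypothetical names, for illustration only.
print(upload_params("example-records", "patients/2020.zip", "SSE-KMS",
                    kms_key_id="alias/example-key"))
```

The practical difference between the modes is who holds the key: S3, KMS, or the caller.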
Kelly 7:22
Okay, so that's a great overview. That's the overview. I think that would definitely get you some points on the exam. Hopefully. Now let's get into a more specific question. Okay, hit me. How about this one? Oh, right. Explain the different retrieval options available in S3 Glacier, okay, and when you might use each one. All right, so this is where
Chris 7:39
it gets a little more technical. Okay, buckle up. Remember, S3 Glacier is optimized for infrequent access, right? So retrieval time is important. Okay. There are four main retrieval options. Okay, four options. Expedited, okay, standard, got it, and glacier. Okay, so four options there. Four options. Expedited retrieval is the fastest. Yeah, you can get your data back in minutes. Minutes. Which is great if you need something urgently. Okay, so if
Kelly 8:05
it's an emergency, exactly an
Chris 8:06
emergency, go for expedited. Expedited
Kelly 8:08
is your friend. Then we've got standard retrieval. Standard, which is a little slower. Usually it takes three to five hours. Okay, bulk retrieval. Bulk retrieval is great for large-scale retrieval jobs. Now, that can take five to 12 hours. Got it. But it's the cheapest per gigabyte. Oh, so
Chris 8:26
if you're retrieving a lot of data, a lot of data, go
Kelly 8:29
with bulk. Bulk is the way to go. Okay. And then lastly, we have glacier retrieval. Glacier retrieval, which is the slowest, slowest but the most economical option, okay, takes 12 to 48
Chris 8:41
hours. Wow, 12 to 48 hours. So
Kelly 8:43
if you're not in a hurry, right? If you can wait and you want to save money, yeah, Glacier retrieval is the way to go. Got it. So
Chris 8:50
we've got four options: expedited for speed. Speed. Standard for a balance of speed and cost. Balance. Bulk for large retrievals at the lowest cost per gigabyte. Lowest cost. And glacier for when cost is the main concern and speed isn't a factor. That's it. Okay. I think our listeners will appreciate having that breakdown. It's important. It's really about matching the retrieval option
Kelly 9:12
to the specific use case. Exactly know your use case. Now
Chris 9:16
for another common exam question. Hit me. How does S3 Glacier differ from Amazon S3,
Kelly 9:21
okay, in
Chris 9:22
terms of cost and access patterns.
Kelly 9:24
Ooh, that's a good one. Yeah, I
Chris 9:26
like this one. This
Kelly 9:27
is all about understanding the trade-offs. Trade-offs between cost and speed. Okay, so S3 Glacier, as we said, is designed for infrequent access, right? Which makes it way cheaper than Amazon S3. Okay, so cheaper but slower. Cheaper but slower. You can store tons of data, okay, for much less money. Got it. But the trade-off is retrieval time. There it is, the trade-off. With Amazon S3, you can access your data instantly, right? But with S3 Glacier, you gotta wait
Chris 9:54
minutes to hours, minutes to hours, depending on the retrieval option, exactly,
Kelly 9:57
okay. So it comes down to how quickly you need your data and how much you want to pay for that speed,
Chris 10:03
right? It's that classic balance, classic balance between cost and performance. You got
Kelly 10:07
it. Sometimes you need the data right away. Right now, you're
Chris 10:11
willing to pay a premium, premium for
Kelly 10:13
speed. Other
Chris 10:14
times, you can afford to wait a bit and save some money. I think we've covered a lot of ground here. We've covered a lot. Let's continue this exam prep in part two of our S3 Glacier deep dive. Sounds good. I'll
Kelly 10:26
see you there.
Chris 10:26
All right. We'll dive into some more specific questions and scenarios, more questions, more scenarios, to make sure our listeners are ready for anything those AWS exams might throw their way anything until then. Happy studying.
Kelly 10:40
Happy studying.
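As a recap of the retrieval options covered in this part, the "match the option to the use case" idea can be sketched as a small selection function. The wait times are the worst-case figures quoted in the episode, except the five-minute ceiling for expedited, which is an assumption standing in for "minutes". Only the three tiers named Expedited, Standard, and Bulk are modeled; the slowest 12-to-48-hour option is left out because this sketch does not assume a tier name for it. The dictionary returned by `restore_request` follows the shape of the `RestoreRequest` argument to boto3's `restore_object`.

```python
# (tier name, worst-case wait in hours), ordered cheapest first per the episode.
TIERS = [
    ("Bulk", 12.0),
    ("Standard", 5.0),
    ("Expedited", 5 / 60),  # assumption: "minutes" taken as at most five
]


def pick_tier(max_wait_hours: float) -> str:
    """Return the cheapest tier whose worst-case wait fits the deadline."""
    for name, worst_case_hours in TIERS:
        if worst_case_hours <= max_wait_hours:
            return name
    raise ValueError("no retrieval tier meets that deadline")


def restore_request(max_wait_hours: float, days_available: int = 7) -> dict:
    """Build a restore request that keeps the restored copy for `days_available` days."""
    return {
        "Days": days_available,
        "GlacierJobParameters": {"Tier": pick_tier(max_wait_hours)},
    }


print(pick_tier(24))         # a day of slack: cheapest tier is enough
print(pick_tier(6))          # needed within six hours
print(restore_request(0.5))  # disaster recovery: needed within half an hour
```

Ordering the list by cost means the first match is always the cheapest acceptable choice.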
Chris 10:41
Welcome back to the deep dive. We're picking up where we left off with Amazon S3 Glacier. And this time we're going even deeper into the technical details. Nitty gritty. Yeah, and how you can actually use this service in the real world. In the real world. Effectively. So in the last part, right, we talked about how S3 Glacier is incredibly durable and cost effective, yeah, for data you don't access every day. Right, infrequent access. Exactly. Now, let's unpack some of those finer points. Okay, the details. Yeah, that might actually trip you up. Ooh. On the AWS exams. Wanna watch out for those. One of the areas that I know I struggled with when I was first learning about S3 Glacier, okay, yeah, what's that, is this whole concept of, yeah, vaults and archives.
Kelly 11:24
vaults and archives, oh, yeah, that can be confusing.
Chris 11:27
Can you break that down for our listeners? Sure, maybe using, like an analogy to make it a little clearer.
Kelly 11:32
All right, so imagine a library. Okay, a library. The library itself is like an S3 bucket, okay? And within the library you have different sections, right? Different sections, like fiction, nonfiction, archives. Yeah. Those sections are like vaults in S3 Glacier.
Chris 11:48
Okay, so the vault is like a section. Like a section within our S3 library.
Kelly 11:52
Exactly. And each vault can have different access policies, okay, and retrieval settings. Got it. So you have granular control over who can access what and how. So
Chris 12:03
the vault is like a specialized section. Specialized section. And within each vault we have, we have archives.
Kelly 12:08
Archives. Think of those as the individual books. The books in a library section. Okay. Each archive is a single unit of data. Okay, a single unit of data that you store in S3 Glacier. Got it. And it can be up to 40 terabytes in size. Wow,
Chris 12:26
40 terabytes. That's a big book. This is a huge book, and
Kelly 12:30
each archive has a unique ID, okay, which is like the call number of a book. The call number, that's how you find it later. Oh, okay, that makes sense. You got your library, your sections, your books, your call numbers. I
Chris 12:41
love that analogy. It
Kelly 12:42
all comes together. It
Chris 12:42
makes so much more sense. Now wait to hear it, and I can see how understanding that distinction between vaults and archives is really crucial. Oh yeah, for using S3 glacier effectively, absolutely. And for acing those AWS exams, of course, got to pass those exams. All right, let's dive back into some exam style questions hit me with them. Imagine you get this question on the exam. Okay, you have a critical database backup, okay, critical database that needs to be stored in S3 glacier. Got it. You need to be able to retrieve this data within minutes, in
Kelly 13:12
minute, in case of a disaster, oh, a disaster scenario, yeah.
Chris 13:15
How would you configure your vault and retrieval setting? Okay, configuration to meet these requirements, meet
Kelly 13:21
those requirements. Okay, so first things first, yeah, strong access controls, of course, using IAM right, make sure only the right people can get to that backup data. Super important, super important. And then you need to enable expedited retrieval, expedited retrieval on the vault, okay, this lets you get your data back in minutes, minutes, just like you need for a disaster. So
Chris 13:45
expedited retrieval is like the express lane. Express. For getting your data back from S3 Glacier. Fast track. It might cost a little more. A little bit more. But in this
Kelly 13:53
case, yeah, it's worth it. Worth it for the peace of mind,
Chris 13:57
knowing that you can restore that critical database backup quickly. Quickly and reliably. Okay. What if, instead of needing the data immediately, okay, different scenario, you have a company that needs to store financial records. Financial records. For seven years, 10 years. Okay. For compliance purposes. Compliance. And they want to use the most cost-effective storage option. The cheapest option. In S3 Glacier. What would you recommend
Kelly 14:22
in that case? I would say S3 Glacier Deep Archive. Deep Archive. It's the cheapest storage class within S3 Glacier.
Chris 14:30
Okay, so even cheaper than regular S3 Glacier?
Kelly 14:33
even cheaper. Designed for data that needs to be kept for years, okay, but doesn't need to be accessed very often. Perfect
Chris 14:39
for those financial records. Perfect for that. They can store them for the required seven years. Seven years, no problem. Without breaking the bank. Breaking the bank. Now let's say you're working with a healthcare organization. Okay, healthcare. That needs to store sensitive patient data in S3 Glacier. Sensitive data, all right. What steps would you take to make sure they're complying with HIPAA regulations?
Kelly 14:59
Yes, ah, HIPAA compliance. Yeah, very important for patient data, of course. So S3 Glacier actually has the tools you need to meet those requirements. Okay. I would recommend server-side encryption. Okay, encrypt. Using AWS KMS managed keys, or SSE-KMS. SSE-KMS, got it. This keeps the data encrypted at rest, okay, which is a HIPAA requirement. So it's encrypted
Chris 15:26
even when it's just sitting there. Even when it's just sitting there in S3 Glacier.
Kelly 15:29
Exactly. And then strong access control policies with IAM.
Chris 15:34
With IAM, of course. So only authorized people can get to the data. That's the key. It's like a digital lock and key system. Exactly. For that sensitive patient information. Keeping it safe. So it's not just about choosing S3 Glacier, right? It's about configuring it correctly. Configuring it right. To meet those compliance standards.
Kelly 15:52
Absolutely got to follow the rules. That's a crucial takeaway
Chris 15:55
for our listeners, for sure. Okay, here's another one for you. All right, bring it on. A company is migrating a huge amount of data. Huge amount of data, okay. From their own servers on premises to S3 Glacier. To the cloud. What service would you recommend for a secure and efficient transfer?
Kelly 16:11
For that, I would recommend AWS Snowball. Snowball,
Chris 16:16
our trusty friends, that
Kelly 16:18
physical device, that physical device. So
Chris 16:21
we're shipping hard drives around. Basically,
Kelly 16:22
you request it from AWS, okay, transfer your data to the device, ship it back to AWS. Got it. They import the data right into S3 Glacier. So it's super secure. Super secure. And often much faster, much faster than transferring over the internet, especially if you have limited bandwidth, right? Yeah, makes sense. So Snowball is your friend for big migrations. It's
Chris 16:45
like a dedicated data courier service, making those large data transfers easy. Easy peasy. Okay, I think our listeners are starting to see, I hope so, how all these different AWS services, all the pieces, fit together to create these comprehensive solutions. It's all connected. So for our final question, okay, last one, how would you describe the role of Amazon S3 Glacier, okay, the big picture, in a modern data management strategy? In a modern world. So, like, what's its place? Where
Unknown Speaker 17:15
does it fit in?
Chris 17:16
Yeah, where does it fit in?
Kelly 17:17
So I would say S3 Glacier is a key part of a tiered storage approach. Tiered storage. You might have your frequently accessed data, your
Chris 17:26
hot data, your
Kelly 17:27
hot data in Amazon S3. Okay. Less frequently used data in S3 Glacier. Got it. And then your long-term archival data in S3 Glacier Deep Archive. Deep Archive. Okay. This lets you optimize storage costs, right? Pick the right tool for the job. Exactly. And make sure your data is available when you need it, but also saving money. But also saving money. That's the key. It's
Chris 17:50
all about choosing the right storage class for the right data at the right time. That's
Kelly 17:54
the key takeaway.
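The tiered approach summarized above can be sketched as a simple rule that maps how recently data was used to a storage class. The two thresholds are hypothetical and would depend on the workload; the returned strings are the S3 storage class identifiers for S3 Standard, S3 Glacier, and S3 Glacier Deep Archive.

```python
# Hypothetical thresholds; real values depend on access patterns and pricing.
HOT_DAYS = 90
ARCHIVE_DAYS = 365


def choose_storage_class(days_since_last_access: int) -> str:
    """Map data age to a storage tier: hot, infrequent, or long-term archive."""
    if days_since_last_access < 0:
        raise ValueError("days_since_last_access must be non-negative")
    if days_since_last_access < HOT_DAYS:
        return "STANDARD"      # hot data, needed right away
    if days_since_last_access < ARCHIVE_DAYS:
        return "GLACIER"       # less frequently used data
    return "DEEP_ARCHIVE"      # long-term archival data


for age in (7, 120, 2000):
    print(age, "days ->", choose_storage_class(age))
```

In practice a lifecycle policy would apply this kind of rule automatically rather than code deciding per object.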
Chris 17:56
I love that. That's a fantastic summary. Thanks. I think we covered a lot of ground in this segment. A lot of ground. Exploring vaults, archives, retrieval options, compliance considerations. We hit it all. Data migration strategies. All the good stuff. I think our listeners are well on their way to mastering S3 Glacier. I hope so. And acing those AWS exams. Fingers crossed.
Welcome back to the deep dive. We've been talking all about Amazon S3 Glacier, you know, exploring its features and benefits and limitations and how it fits into the AWS world. But now in this last part, let's shift gears a bit and talk about, all right, shifting gears, why this service is so valuable, like, in a practical sense. Real world. Yeah, real-world scenarios where S3 Glacier really shines. Okay, real-world
Kelly 18:43
examples. Yeah, give us some good ones. All right, so one area where S3 Glacier is really useful is in media asset management.
Chris 18:50
Media asset management, yeah, think about those big media companies,
Kelly 18:53
okay, with tons of video footage, audio recordings, images, Yeah, huge libraries, huge libraries, they need to archive this stuff, you know, for preservation, for preservation, exactly, long term, future use, future use, maybe.
Chris 19:07
But they don't need to access it all the time, not all the time, just every once in a while, exactly.
Kelly 19:10
So S3 Glacier is perfect for that. Okay. A cost-effective way to store all those massive media assets.
Chris 19:17
Makes sense. Keep those costs down. Keep those costs down without, you know, compromising security or reliability. You don't want to lose that stuff. What other industries is it good for?
Kelly 19:27
Another good example is scientific research. Oh, yeah, scientific research. Scientists work with huge data sets, lots of data from experiments, simulations, observations. They need to keep this data, of course, for future analysis, validation, collaboration, right? Got to preserve that knowledge. But they don't need to access it constantly. So S3 Glacier is a good fit there too. Perfect fit. Reliable, cost effective. It's like a digital time capsule.
Chris 19:54
I like that for all this valuable data, yeah, preserving our
Kelly 19:57
knowledge, scientific data, historical records, creative work. Works all the important stuff.
Chris 20:01
Now I'm curious, yeah, about how S3 Glacier integrates with other AWS services.
Kelly 20:08
Ah, integration. Yeah, give some examples of how those integrations can, like, unlock even more value. More value. Yeah, more power. Okay, so one really powerful integration is with AWS Lambda. Lambda, yeah, serverless compute. Okay. You can run code without managing servers. Serverless, I love it. So you can use Lambda to trigger automated workflows based on events in S3 Glacier. Give me an example. Let's say a new archive is added to your vault. You can use Lambda to automatically send a notification. Oh, a notification. To a specific team, or even trigger a data processing job. So it can kick off other processes? Yeah, to extract insights from the newly archived data. Wow, the possibilities,
Chris 20:50
the possibilities are endless. So S3 Glacier and Lambda together can really streamline things. Streamline
Kelly 20:56
and automate, free
Chris 20:58
up time for your teams. That's the goal. Working smarter, not harder. Absolutely. We've covered so much ground. We have covered a lot in this deep dive into Amazon S3 Glacier. A deep dive, indeed. But if you had to distill it down to one key takeaway, one
Kelly 21:14
key takeaway, what would it be? What's the most important thing to remember about S3 Glacier? What to remember about S3 Glacier? Yeah, what's
Chris 21:22
the big idea? I
Kelly 21:23
would say the most important thing is that S3 Glacier is more than just simple storage. More than just storage. It's a tool for building cost-effective, scalable, secure data management solutions. Okay? It's about understanding your data, right? How you access it. Your access patterns. Exactly. Choosing the right storage class. Right tool for the job. And using those integrations to unlock new possibilities.
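One of the integrations mentioned earlier, Lambda reacting to archive activity, might be sketched as the handler below. It assumes the trigger is an S3 event notification delivered to Lambda, and the watched event names (`ObjectRestore:` events and `LifecycleTransition`) are assumptions to check against the S3 event notification documentation. The handler only collects notification messages; actually sending them to a team is left out.

```python
# Assumed event-name prefixes for restore and lifecycle-transition activity.
WATCHED_PREFIXES = ("ObjectRestore:", "LifecycleTransition")


def handler(event: dict, context: object) -> dict:
    """Collect one message per archive-related record in an S3 event notification."""
    messages = []
    for record in event.get("Records", []):
        name = record.get("eventName", "")
        if not name.startswith(WATCHED_PREFIXES):
            continue  # ignore unrelated events such as plain uploads
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        messages.append(f"{name}: s3://{bucket}/{key}")
    return {"notifications": messages}


# Hypothetical sample event for a local dry run; no AWS access is needed.
sample_event = {
    "Records": [
        {"eventName": "ObjectRestore:Completed",
         "s3": {"bucket": {"name": "example-archive"}, "object": {"key": "backup.tar"}}},
        {"eventName": "ObjectCreated:Put",
         "s3": {"bucket": {"name": "example-archive"}, "object": {"key": "new.txt"}}},
    ]
}
print(handler(sample_event, None))
```

Returning the messages instead of sending them keeps the sketch testable without any AWS services.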
Chris 21:48
So it's all about managing data strategically, strategically
Kelly 21:51
for the long term, for the long term, preservation, security, accessibility.
Chris 21:56
It's not just about storing the data, right? It's about managing it intelligently. That's the key. And S3 Glacier is a key piece of that puzzle. A crucial piece. Well, there you have it, folks. That's a wrap. We've journeyed through the world of Amazon S3 Glacier. Deep into the glacier. Uncovering its secrets, exploring its depths. All the way down. And I think we're emerging, coming out the other side, with a newfound appreciation for this powerful service. It is a powerful service. To our listeners, yeah, we encourage you to keep exploring AWS. Keep learning. The cloud is constantly evolving. Always changing. There's always something new to discover. New services, new features. So embrace the challenge. Yeah, keep experimenting, and never stop learning. Never stop learning. Until next time, happy cloud computing.
Kelly 22:44
Happy cloud computing.