Mobycast

Summary

Just as the Twelve-Factor App methodology was born from real-world experience of deploying successful apps at Heroku, architects at AWS created the Well-Architected Framework to document best practices they observed when running workloads in the cloud.

Originally announced as a whitepaper in October 2015, the Well-Architected Framework got center stage treatment at re:Invent 2016 during Werner Vogels' keynote address. Since then, it has evolved to become an indispensable resource when building and running workloads in AWS.

The Well-Architected Framework is massive, though: 6 core whitepapers that total over 400 pages. It would be easy to dismiss it as just another boring set of documents, but doing so would be a big mistake. There is a lot of gold to be found if you are willing to do some digging.

In this episode of Mobycast, Jon and Chris kick off a three-part series where we grab our shovels and dig deep into the Well-Architected Framework and explain how you can best take advantage of this important resource.

Show Notes

In this episode, we cover the following topics:
  • AWS Well-Architected Framework
    • Provides consistent approach to evaluating systems against cloud best practices
    • Helps advise changes necessary to make specific architecture align with best practices
    • Composed of 3 components:
      • Design Principles
      • Pillars
        • Operational Excellence
        • Security
        • Reliability
        • Performance Efficiency
        • Cost Optimization
      • Questions
  • General design principles
    • Cloud-native has changed everything. In cloud, you can:
      • Stop guessing capacity needs
      • Test at scale
      • Automate all the things to make experimentation easier
      • Allow for evolutionary architectures (you are never stuck with a particular technology)
      • Drive architectures using data (allows you to make fact-based decisions on how to improve your workload)
      • Improve through game days
  • Pillars in depth
    • Operational Excellence
      • "Ability to run and monitor systems to deliver business value and to continuously improve supporting processes and procedures"
      • Design principles
        • Perform operations as code
        • Annotate documentation
        • Make frequent, small, reversible changes
        • Refine operations procedures frequently
        • Anticipate failure
        • Learn from all operational failures
      • Key service: CloudFormation
      • Focus areas
        • Prepare
          • Services: AWS Config, AWS Config Rules
        • Operate
          • Services: CloudWatch, X-Ray, CloudTrail, VPC Flow Logs
        • Evolve
          • Services: Amazon Elasticsearch Service (for searching log data to gain insights), CloudWatch Logs Insights
      • Best practices
        • Prepare
          • Implement telemetry for:
            • Application
            • Workload
            • User activity
            • Dependencies
          • Implement transaction traceability
        • Operate
          • Any event for which you raise an alert should have associated runbook
            • Runbook defines triggers for escalations
          • Users should be notified when system is impacted
          • Communicate status through dashboards
            • Provide dashboards to communicate the current operating status of the business and provide metrics of interest
        • Evolve
          • Feedback loops
            • Identify areas for improvement
            • Gauge impact of changes to the system (i.e., did the change make an improvement?)
            • Perform operations metrics reviews
              • Retrospective analysis of operations metrics
                • Use these reviews to identify opportunities for improvement, potential courses of action, and share lessons learned
      • Key points
        • Runbooks, playbooks
        • Document environments
        • Make small changes through automation
        • Monitor workload with business metrics
        • Exercise your response to failures
        • Have well-defined escalation management
  • In future episodes, we'll cover the remaining 4 pillars
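
The Operational Excellence notes above call for performing operations as code (with CloudFormation as the key service) and for every alert to have an associated runbook that defines escalation triggers. One way those two ideas might fit together is sketched below in Python. This is only an illustration under stated assumptions, not something taken from the episode or the whitepapers: the AlertDefinition class, the validate and to_cloudformation helpers, the runbook URL, and the threshold values are all hypothetical. The AWS::CloudWatch::Alarm resource type and its property names are real CloudFormation, but a usable alarm would normally also need Dimensions and AlarmActions, which are left out for brevity.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class AlertDefinition:
    """One alert, kept in version control alongside the workload (hypothetical shape)."""
    name: str                    # also used as the CloudFormation logical ID
    namespace: str               # CloudWatch metric namespace
    metric: str                  # CloudWatch metric name
    threshold: float             # illustrative value, not a recommendation
    runbook_url: str             # hypothetical link to the runbook for this alert
    escalate_after_minutes: int  # escalation trigger taken from the runbook


def validate(alerts):
    """Return a list of problems; an empty list means every alert is acceptable."""
    problems = []
    for alert in alerts:
        if not alert.name.isalnum():
            problems.append(f"{alert.name}: logical ID must be alphanumeric")
        if not alert.runbook_url.strip():
            problems.append(f"{alert.name}: missing runbook")
        if alert.escalate_after_minutes <= 0:
            problems.append(f"{alert.name}: missing escalation trigger")
    return problems


def to_cloudformation(alerts):
    """Render the alerts as a CloudFormation template, expressed as a plain dict."""
    problems = validate(alerts)
    if problems:
        raise ValueError("; ".join(problems))
    resources = {}
    for alert in alerts:
        resources[alert.name] = {
            "Type": "AWS::CloudWatch::Alarm",
            "Properties": {
                "AlarmName": alert.name,
                "AlarmDescription": (
                    f"Runbook: {alert.runbook_url} "
                    f"(escalate after {alert.escalate_after_minutes} min)"
                ),
                "Namespace": alert.namespace,
                "MetricName": alert.metric,
                "Statistic": "Average",
                "Period": 60,             # seconds per evaluation period
                "EvaluationPeriods": 5,   # breach must persist for 5 periods
                "Threshold": alert.threshold,
                "ComparisonOperator": "GreaterThanThreshold",
            },
        }
    return {"AWSTemplateFormatVersion": "2010-09-09", "Resources": resources}


if __name__ == "__main__":
    alerts = [
        AlertDefinition(
            name="HighCpu",
            namespace="AWS/EC2",
            metric="CPUUtilization",
            threshold=80.0,
            runbook_url="https://wiki.example.com/runbooks/high-cpu",  # hypothetical
            escalate_after_minutes=15,
        ),
    ]
    print(json.dumps(to_cloudformation(alerts), indent=2))
```

Because the template is generated from reviewed code, a change to an alert becomes a small, reversible commit, and an alert without a runbook fails validation before it is ever deployed.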

Links
Whitepapers

End Song:
30 Days & 30 Nights by Fortune Finder

For a full transcription of this episode, please visit the episode webpage.

What is Mobycast?

A Podcast About Cloud Native Software Development, AWS, and Distributed Systems