Operational Excellence
Questions
OPS01 (6 best practices)
- OPS01-BP01: Evaluate external customer needs
- OPS01-BP02: Evaluate internal customer needs
- OPS01-BP03: Evaluate governance requirements
- OPS01-BP04: Evaluate compliance requirements
- OPS01-BP05: Evaluate threat landscape
- OPS01-BP06: Evaluate tradeoffs while managing benefits and risks
OPS02 (6 best practices)
- OPS02-BP01: Resources have identified owners
- OPS02-BP02: Processes and procedures have identified owners
- OPS02-BP03: Operations activities have identified owners responsible for their performance
- OPS02-BP04: Mechanisms exist to manage responsibilities and ownership
- OPS02-BP05: Mechanisms exist to request additions, changes, and exceptions
- OPS02-BP06: Responsibilities between teams are predefined or negotiated
OPS03 (7 best practices)
- OPS03-BP01: Provide executive sponsorship
- OPS03-BP02: Team members are empowered to take action when outcomes are at risk
- OPS03-BP03: Escalation is encouraged
- OPS03-BP04: Communications are timely, clear, and actionable
- OPS03-BP05: Experimentation is encouraged
- OPS03-BP06: Team members are encouraged to maintain and grow their skill sets
- OPS03-BP07: Resource teams appropriately
OPS05 (10 best practices)
- OPS05-BP01: Use version control
- OPS05-BP02: Test and validate changes
- OPS05-BP03: Use configuration management systems
- OPS05-BP04: Use build and deployment management systems
- OPS05-BP05: Perform patch management
- OPS05-BP06: Share design standards
- OPS05-BP07: Implement practices to improve code quality
- OPS05-BP08: Use multiple environments
- OPS05-BP09: Make frequent, small, reversible changes
- OPS05-BP10: Fully automate integration and deployment
OPS07 (6 best practices)
- OPS07-BP01: Ensure personnel capability
- OPS07-BP02: Ensure a consistent review of operational readiness
- OPS07-BP03: Use runbooks to perform procedures
- OPS07-BP04: Use playbooks to investigate issues
- OPS07-BP05: Make informed decisions to deploy systems and changes
- OPS07-BP06: Create support plans for production workloads
OPS10 (7 best practices)
- OPS10-BP01: Use a process for event, incident, and problem management
- OPS10-BP02: Have a process per alert
- OPS10-BP03: Prioritize operational events based on business impact
- OPS10-BP04: Define escalation paths
- OPS10-BP05: Define a customer communication plan for service-impacting events
- OPS10-BP06: Communicate status through dashboards
- OPS10-BP07: Automate responses to events
OPS11 (9 best practices)
- OPS11-BP01: Have a process for continuous improvement
- OPS11-BP02: Perform post-incident analysis
- OPS11-BP03: Implement feedback loops
- OPS11-BP04: Perform knowledge management
- OPS11-BP05: Define drivers for improvement
- OPS11-BP06: Validate insights
- OPS11-BP07: Perform operations metrics reviews
- OPS11-BP08: Document and share lessons learned
- OPS11-BP09: Allocate time to make improvements
The Operational Excellence pillar includes the ability to support development and run workloads effectively, gain insight into their operations, and continuously improve supporting processes and procedures to deliver business value.
Key Areas
The Operational Excellence pillar includes the following key areas:
- Organization - How teams are structured and how they collaborate
- Prepare - Design for operations and understand workload health
- Operate - Understand workload health and achieve operational success
- Evolve - Learn, share, and continuously improve
Questions
The AWS Well-Architected Framework provides a set of questions that allows you to review an existing or proposed architecture. It also provides a set of AWS best practices for each pillar.
OPS01 - How do you determine what your priorities are?
OPS02 - How do you structure your organization to support your business outcomes?
OPS03 - How does your organizational culture support your business outcomes?
OPS04 - How do you design your workload so that you can understand its state?
OPS05 - How do you reduce defects, ease remediation, and improve flow into production?
OPS06 - How do you mitigate deployment risks?
OPS07 - How do you know that you are ready to support a workload?
OPS08 - How do you understand the health of your workload?
OPS09 - How do you understand the health of your operations?
OPS10 - How do you manage workload and operations events?
OPS11 - How do you evolve operations?
AWS Services for Operational Excellence
AWS CloudFormation
Provides a common language to model and provision AWS and third-party resources in your cloud environment.
AWS Config
Enables you to assess, audit, and evaluate the configurations of your AWS resources.
AWS CloudTrail
Enables governance, compliance, operational auditing, and risk auditing of your AWS account.
Amazon CloudWatch
Monitors your AWS resources and the applications you run on AWS in real time.
AWS Systems Manager
Gives you visibility and control of your infrastructure on AWS.
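As a rough illustration of what "a common language to model and provision" resources can look like for AWS CloudFormation, the sketch below builds a minimal template as a Python dictionary and prints it as JSON. The function name `build_template`, the logical ID `ExampleLogBucket`, and the choice of a versioned S3 bucket are assumptions made for this example only; they do not come from the framework text. The sketch only generates the template document and does not call AWS.

```python
import json


def build_template(bucket_logical_id: str = "ExampleLogBucket") -> dict:
    """Return a minimal CloudFormation template as a plain dictionary.

    The logical ID and the single S3 bucket resource are hypothetical,
    chosen only to show the general shape of a template.
    """
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Illustrative template: one versioned S3 bucket.",
        "Resources": {
            bucket_logical_id: {
                "Type": "AWS::S3::Bucket",
                "Properties": {
                    # Keep prior object versions so changes are reversible.
                    "VersioningConfiguration": {"Status": "Enabled"},
                },
            }
        },
    }


if __name__ == "__main__":
    # Serialize the template; the JSON could be saved to a file and kept
    # under version control alongside application code.
    print(json.dumps(build_template(), indent=2))
```

Keeping a template like this in version control is one way to connect the service to practices such as OPS05-BP01 (Use version control) and OPS05-BP09 (Make frequent, small, reversible changes), although the framework does not prescribe this particular layout.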