
OPS03-BP03 - Escalation is encouraged

Implementation Guidance

“Escalation is encouraged” aligns people, process, and communication so that operational execution stays predictable under pressure. Define explicitly who can escalate, to whom, and with what decision authority, and confirm that teams can execute these procedures during real events.

This best practice supports the question “How does your organizational culture support your business outcomes?” Define measurable outcomes, assign owners, and review execution regularly. Build the practice into delivery and operations processes so that improvements persist as workloads and requirements evolve.

Key Steps

  1. Define communication and ownership model:

    • Clarify who is responsible for putting “Escalation is encouraged” into practice
    • Document escalation paths and decision authority boundaries
    • Standardize communication templates for operational events
  2. Enable teams with repeatable practices:

    • Create runbooks, checklists, and onboarding materials
    • Train teams through drills, simulations, or tabletop exercises
    • Validate that procedures can be executed under time pressure
  3. Measure effectiveness and adapt:

    • Track response quality, handoff quality, and operational lead time
    • Address recurring coordination gaps with process updates
    • Share lessons learned and improvements across teams
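
The first step above (documenting escalation paths and decision authority boundaries) can be sketched as a small data structure. This is a minimal illustration, not a prescribed model: the roles, authority descriptions, and timings are assumptions, and `EscalationLevel`, `ESCALATION_PATH`, and `current_level` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EscalationLevel:
    """One rung of a hypothetical escalation path."""
    role: str                # who is notified at this level
    authority: str           # decisions this role is allowed to make
    escalate_after_min: int  # minutes without acknowledgement before moving up


# Illustrative path only; roles, authority, and timings are assumptions.
ESCALATION_PATH = [
    EscalationLevel("on-call engineer", "restart or roll back a service", 15),
    EscalationLevel("team lead", "pause deployments, pull in other teams", 30),
    EscalationLevel("incident manager", "declare a major incident", 60),
]


def current_level(minutes_unacknowledged: int) -> EscalationLevel:
    """Return the level that should own the event after the given wait."""
    elapsed = 0
    for level in ESCALATION_PATH:
        elapsed += level.escalate_after_min
        if minutes_unacknowledged < elapsed:
            return level
    # Past every threshold: stay with the highest level instead of dropping the event.
    return ESCALATION_PATH[-1]


print(current_level(10).role)
print(current_level(20).role)
print(current_level(200).role)
```

Writing the path down as data makes the authority boundary at each level explicit and lets a drill or tabletop exercise check that every level has a named owner.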

Risk / Impact

Level of risk if not implemented: Medium

Impact: Without this best practice, inefficiencies and execution drift tend to accumulate and raise the probability of failure over time. The resulting problems often surface during traffic spikes, major releases, or dependency failures.

Benefits of implementation:

  • More predictable operational and engineering outcomes
  • Better alignment between architecture decisions and business goals
  • Continuous improvement through measurable feedback loops
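
A feedback loop of this kind needs a few concrete measurements. The sketch below shows one possible way to summarize response times and escalation frequency from event records. The `EventRecord` fields and the `summarize` helper are hypothetical, and the sample timestamps are made up for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median


@dataclass
class EventRecord:
    """Hypothetical record of one operational event."""
    opened: datetime
    acknowledged: datetime
    resolved: datetime
    escalated: bool


def summarize(events: list[EventRecord]) -> dict[str, float]:
    """Median minutes to acknowledge and resolve, plus the share of escalated events."""
    if not events:
        raise ValueError("at least one event is required")
    ack = [(e.acknowledged - e.opened).total_seconds() / 60 for e in events]
    res = [(e.resolved - e.opened).total_seconds() / 60 for e in events]
    return {
        "median_minutes_to_acknowledge": median(ack),
        "median_minutes_to_resolve": median(res),
        "escalation_rate": sum(e.escalated for e in events) / len(events),
    }


# Made-up sample data for illustration.
t0 = datetime(2024, 1, 1, 9, 0)
sample = [
    EventRecord(t0, t0 + timedelta(minutes=4), t0 + timedelta(minutes=30), False),
    EventRecord(t0, t0 + timedelta(minutes=10), t0 + timedelta(minutes=90), True),
    EventRecord(t0, t0 + timedelta(minutes=6), t0 + timedelta(minutes=45), False),
]
print(summarize(sample))
```

An escalation rate that drops to zero is not necessarily good news; it may indicate that people hesitate to escalate, which is why the figure is worth reviewing alongside the response times.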

AWS Services to Consider

AWS Systems Manager Incident Manager

Coordinates incident response with predefined plans, contacts, and timelines.

Amazon CloudWatch

Collects metrics, logs, and alarms that support operational insight and performance management.

Amazon EventBridge

Routes events and triggers automation workflows for rapid operational response.
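
As an illustration of event routing, the sketch below builds an event pattern that matches CloudWatch alarms entering the `ALARM` state, which a rule could use to start a notification or escalation workflow. The `matches` helper is a hypothetical, simplified local check for exact values only; it is not how EventBridge evaluates patterns, and the sample event contains only the fields the pattern inspects.

```python
import json

# Event pattern for CloudWatch alarms that enter the ALARM state.
ALARM_PATTERN = {
    "source": ["aws.cloudwatch"],
    "detail-type": ["CloudWatch Alarm State Change"],
    "detail": {"state": {"value": ["ALARM"]}},
}


def matches(pattern: dict, event: dict) -> bool:
    """Simplified local check: exact-value matching only.

    EventBridge supports richer operators (prefix, numeric, anything-but)
    that this helper deliberately does not implement.
    """
    for key, expected in pattern.items():
        actual = event.get(key)
        if isinstance(expected, dict):
            # Nested pattern: the event must have a nested object that matches.
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            # Leaf pattern: the event value must be one of the listed values.
            return False
    return True


# Trimmed sample event; "example-alarm" is a placeholder name.
sample_event = {
    "source": "aws.cloudwatch",
    "detail-type": "CloudWatch Alarm State Change",
    "detail": {"alarmName": "example-alarm", "state": {"value": "ALARM"}},
}

print(json.dumps(ALARM_PATTERN, indent=2))
print(matches(ALARM_PATTERN, sample_event))
```

The printed JSON is the form a rule definition would take; the rule's target (for example, a notification topic or an incident response plan) is left out here because it depends on the environment.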

AWS Fault Injection Service

Runs controlled failure experiments to validate resilience and readiness.