PERF07

PERF07-BP02 - Use monitoring solutions to understand where performance is most critical

Implementation Guidance

Use monitoring data to base architecture and operational choices on evidence rather than assumptions. Define clear criteria for what counts as performance-critical, collect current-state data, and compare options against business outcomes and risk tolerance.

To answer the question “How do you monitor your resources to ensure they are performing?”, define measurable outcomes, assign owners, and review results on a regular schedule. Build this practice into delivery and operations processes so that improvements persist as workloads and requirements evolve.

Key Steps

  1. Define decision criteria and scope:

    • Document what monitoring must cover in your environment to show where performance is most critical
    • Set quantitative and qualitative criteria for comparing options
    • Identify stakeholders who approve final decisions
  2. Perform analysis with objective evidence:

    • Collect metrics, constraints, and dependency data from current operations
    • Compare alternatives against reliability, performance, and risk outcomes
    • Record tradeoffs and assumptions in an architecture decision log
  3. Operationalize and revisit decisions:

    • Implement selected decisions with explicit owner accountability
    • Define review triggers for demand, risk, or architecture changes
    • Update standards and patterns based on observed outcomes
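
As a rough illustration of steps 1 and 2, the sketch below ranks workload components by how far their observed 95th-percentile latency sits from a target. The component names, the `ComponentSamples` type, and the choice of p95 against a per-component target as the decision criterion are all assumptions made for this example; substitute whatever criteria your stakeholders agree on.

```python
from dataclasses import dataclass
from statistics import quantiles
from typing import List, Tuple


@dataclass
class ComponentSamples:
    name: str                  # hypothetical component name
    latencies_ms: List[float]  # observed request latencies (at least 2 samples)
    target_p95_ms: float       # assumed decision criterion from step 1


def p95(samples: List[float]) -> float:
    # quantiles(n=20) returns 19 cut points; the last one is the 95th percentile.
    return quantiles(samples, n=20, method="inclusive")[-1]


def rank_by_criticality(components: List[ComponentSamples]) -> List[Tuple[str, float, float]]:
    """Return (name, observed_p95, observed/target) rows, worst ratio first."""
    rows = []
    for c in components:
        observed = p95(c.latencies_ms)
        rows.append((c.name, observed, observed / c.target_p95_ms))
    return sorted(rows, key=lambda row: row[2], reverse=True)
```

A ratio above 1.0 means the component misses its target. The components at the top of the list are candidates for the first entries in the architecture decision log.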

Risk / Impact

Level of risk if not implemented: Medium

Impact: Without this best practice, performance problems in critical paths can go unnoticed, and workloads typically accumulate inefficiencies that increase failure probability over time. Problems often surface during traffic spikes, major releases, or dependency failures.

Benefits of implementation:

  • More predictable operational and engineering outcomes
  • Better alignment between architecture decisions and business goals
  • Continuous improvement through measurable feedback loops

AWS Services to Consider

Amazon CloudWatch

Collects metrics, logs, and alarms that support operational insight and performance management.
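
One way to turn a latency criterion into a CloudWatch alarm is sketched below. The function only builds the parameter dictionary, using parameter names from the CloudWatch `PutMetricAlarm` API. The alarm name, load balancer identifier, period, evaluation window, and threshold are placeholder assumptions, and the metric shown (`TargetResponseTime` for an Application Load Balancer) is just one example of a performance-critical signal.

```python
def latency_alarm_params(alarm_name: str, load_balancer: str, threshold_seconds: float) -> dict:
    """Build PutMetricAlarm parameters for a p99 latency alarm (values are assumptions)."""
    return {
        "AlarmName": alarm_name,
        "Namespace": "AWS/ApplicationELB",
        "MetricName": "TargetResponseTime",
        "Dimensions": [{"Name": "LoadBalancer", "Value": load_balancer}],
        "ExtendedStatistic": "p99",      # percentile statistic instead of Average
        "Period": 60,                    # seconds per datapoint (assumed)
        "EvaluationPeriods": 5,          # consecutive periods to evaluate (assumed)
        "Threshold": threshold_seconds,
        "ComparisonOperator": "GreaterThanThreshold",
        "TreatMissingData": "notBreaching",
    }


# With boto3 installed and credentials configured, the dictionary could be
# passed as keyword arguments:
#   boto3.client("cloudwatch").put_metric_alarm(**params)
params = latency_alarm_params("checkout-p99-latency", "app/example-lb/0123456789abcdef", 0.5)
```

Alarming on a percentile rather than the average tends to surface tail latency that affects a minority of requests but would otherwise be hidden.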

AWS X-Ray

Traces distributed requests to identify latency sources and dependency failures.
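
To show the kind of question trace data can answer, the sketch below walks a segment shaped like an X-Ray segment document (using its `name`, `start_time`, `end_time`, and `subsegments` fields) and reports the slowest leaf call. The sample trace and its names are invented for illustration, and picking the single slowest leaf is a deliberately simple heuristic, not how the X-Ray console analyzes traces.

```python
def slowest_leaf(segment: dict) -> tuple:
    """Return (name, duration_seconds) of the slowest leaf in a segment tree."""
    children = segment.get("subsegments", [])
    if not children:
        # A leaf has no nested calls, so its full duration is its own work.
        return segment["name"], segment["end_time"] - segment["start_time"]
    # Otherwise recurse and keep the slowest leaf found under any child.
    return max((slowest_leaf(child) for child in children), key=lambda r: r[1])


# Hypothetical trace: one request that calls a table and a downstream API.
trace = {
    "name": "checkout-service",
    "start_time": 0.0,
    "end_time": 0.5,
    "subsegments": [
        {"name": "orders-table", "start_time": 0.0, "end_time": 0.02},
        {"name": "payments-api", "start_time": 0.02, "end_time": 0.45},
    ],
}
name, duration = slowest_leaf(trace)
```

Running this over many sampled traces and counting which name appears most often gives a first indication of where latency concentrates.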

Amazon EventBridge

Routes events and triggers automation workflows for rapid operational response.
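
For example, an EventBridge rule can react when a CloudWatch alarm enters the `ALARM` state. The sketch below builds an event pattern for that case. The `source`, `detail-type`, and `detail` fields follow the CloudWatch alarm state change event format, while the alarm name is a placeholder assumption.

```python
import json


def alarm_state_pattern(alarm_names: list) -> str:
    """Return a JSON event pattern matching the given alarms entering ALARM."""
    pattern = {
        "source": ["aws.cloudwatch"],
        "detail-type": ["CloudWatch Alarm State Change"],
        "detail": {
            "alarmName": alarm_names,        # match any of these alarm names
            "state": {"value": ["ALARM"]},   # only transitions into ALARM
        },
    }
    return json.dumps(pattern)


# With boto3 installed and credentials configured, the string could be passed
# as the EventPattern argument of boto3.client("events").put_rule(...).
event_pattern = alarm_state_pattern(["checkout-p99-latency"])
```

The rule's target (for instance a runbook or notification topic) is left out here because it depends on how your team responds to performance events.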

AWS Systems Manager

Provides automation, inventory, and operational runbooks for day-2 management.