PERF02 - How do you select your compute solution?
Best Practices
This question includes the following best practices:
- PERF02-BP01: Select the best compute options for your workload
- PERF02-BP02: Understand the available compute configuration and features
- PERF02-BP03: Collect compute-related metrics
- PERF02-BP04: Configure and right-size compute resources
- PERF02-BP05: Scale your compute resources dynamically
- PERF02-BP06: Use optimized hardware-based compute accelerators
Key Concepts
Performance Architecture Fundamentals
Compute model selection: Choose among instances, containers, and serverless functions based on workload characteristics such as latency sensitivity, startup time, runtime dependencies, and how much operational ownership your team wants to retain.
Right sizing strategy: Match instance types and sizes to observed utilization rather than initial estimates. Review CPU, memory, and network telemetry regularly, and adjust capacity so resources track actual demand.
Elastic scaling: Add and remove compute capacity automatically in response to demand signals. Define scaling metrics, guardrails, and targets up front so the workload maintains performance during spikes without paying for idle capacity.
Optimization and Operations
Runtime optimization: Tune runtime-level settings such as garbage collection, thread pools, and language or JVM parameters so the application makes efficient use of the compute it runs on.
Placement strategy: Control where compute runs, such as placement groups, Availability Zone spread, and affinity or anti-affinity constraints, to balance latency, throughput, and resilience requirements.
Cost-performance tracking: Measure cost per unit of work alongside performance metrics, and review both against expected business outcomes so optimization decisions account for efficiency as well as speed.
Implementation Approach
1. Assess compute requirements
- Classify workloads as batch, latency-sensitive, or event-driven
- Identify OS and runtime dependencies
- Define startup time and scaling responsiveness targets
- Estimate CPU, memory, and network profiles
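The assessment in step 1 can be sketched as a simple classification rule. The profile fields, thresholds, and category names below are illustrative assumptions for this example, not a prescribed taxonomy:

```python
from dataclasses import dataclass

@dataclass
class WorkloadProfile:
    # Rough requirements gathered during assessment (all values are examples)
    p99_latency_ms: float      # latency target for requests
    invocation_pattern: str    # "steady", "bursty", or "scheduled"
    startup_budget_s: float    # how long a cold start may take

def classify(profile: WorkloadProfile) -> str:
    """Classify a workload as batch, latency-sensitive, or event-driven."""
    if profile.invocation_pattern == "scheduled":
        return "batch"                 # runs on a timetable, latency matters less
    if profile.p99_latency_ms <= 100:
        return "latency-sensitive"     # tight response-time target
    return "event-driven"              # reacts to events with looser latency needs
```

Classifying first keeps the later service comparison grounded in requirements rather than preferences.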
2. Choose compute services
- Compare EC2, containers, and serverless options
- Map workload constraints to service capabilities
- Select instance families optimized for workload type
- Use managed orchestration where possible
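The mapping from constraints to services in step 2 can be expressed as a small decision function. Lambda's 15-minute maximum execution time is a real service limit; the rest of the mapping is a simplified assumption for illustration:

```python
def choose_compute(classification: str, needs_custom_os: bool,
                   max_runtime_s: float) -> str:
    """Map workload constraints to a compute service (simplified sketch)."""
    if needs_custom_os:
        return "Amazon EC2"            # full OS control requires instances
    if classification == "event-driven" and max_runtime_s <= 900:
        return "AWS Lambda"            # short event-driven work fits serverless
    return "Containers (ECS/EKS)"      # managed orchestration for the rest
```

A real selection weighs more dimensions (cost model, concurrency limits, networking), but encoding the decision makes the trade-offs reviewable.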
3. Implement scaling controls
- Configure autoscaling policies from demand metrics
- Set minimum and maximum capacity guardrails
- Use warm pools or provisioned concurrency where needed
- Monitor scaling events and performance outcomes
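The guardrails in step 3 follow the proportional rule that target-tracking autoscalers use: scale capacity by the ratio of the observed metric to its target, then clamp to the configured minimum and maximum. A minimal sketch:

```python
import math

def desired_capacity(current: int, metric: float, target: float,
                     min_cap: int, max_cap: int) -> int:
    """Target-tracking style capacity calculation with min/max guardrails.

    desired = ceil(current * metric / target), clamped to [min_cap, max_cap].
    """
    desired = math.ceil(current * metric / target)
    return max(min_cap, min(desired, max_cap))
```

For example, 4 instances at 90% CPU with a 60% target scale out to 6; the same fleet at 20% CPU scales in only as far as the minimum allows.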
4. Optimize continuously
- Analyze rightsizing recommendations regularly
- Tune runtime parameters and JVM or language settings
- Review placement strategy and affinity constraints
- Adopt newer compute generations for better efficiency
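The continuous-optimization loop in step 4 can be sketched as a rule over utilization percentiles. The thresholds below are assumptions for illustration, not AWS Compute Optimizer's actual algorithm:

```python
def rightsize(p95_cpu_pct: float, p95_mem_pct: float) -> str:
    """Illustrative rightsizing rule over p95 utilization telemetry."""
    if p95_cpu_pct < 30 and p95_mem_pct < 40:
        return "downsize"   # sustained headroom on both dimensions
    if p95_cpu_pct > 80 or p95_mem_pct > 85:
        return "upsize"     # risk of saturation at peak
    return "keep"           # sized appropriately for current demand
```

Using a high percentile rather than the average avoids downsizing a fleet whose mean load hides short, frequent peaks.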
AWS Services to Consider
Amazon EC2
Offers flexible instance families so you can match CPU, memory, storage, and network characteristics to workload needs.
Amazon ECS
Runs containerized workloads with managed scheduling and scaling for efficient compute utilization.
Amazon EKS
Provides managed Kubernetes control planes for container orchestration with high availability options.
AWS Lambda
Runs event-driven code without managing servers, ideal for automation and on-demand operational workflows.
Amazon EC2 Auto Scaling
Adjusts compute capacity automatically based on demand and policies to keep latency and utilization in target ranges.
AWS Compute Optimizer
Analyzes usage telemetry and recommends resource sizing adjustments to improve performance and efficiency.
Common Challenges and Solutions
Challenge: Overprovisioned compute resources
Solution: Use telemetry-based rightsizing and autoscaling to align capacity with demand.
Challenge: Slow scale-out during spikes
Solution: Tune scaling signals and pre-warm capacity for predictable high-traffic windows.
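Pre-warming for a predictable window reduces to sizing for the expected peak plus a safety margin, rather than waiting for reactive scaling to catch up. The headroom fraction and per-instance throughput figure below are assumptions you would replace with measured values:

```python
import math

def prewarm_target(expected_peak_rps: float, rps_per_instance: float,
                   headroom: float = 0.2) -> int:
    """Instances to pre-warm before a predictable traffic spike.

    Provisions for the expected peak plus a headroom margin (20% by default).
    """
    return math.ceil(expected_peak_rps * (1 + headroom) / rps_per_instance)
```

For example, an expected peak of 1,000 requests per second at 100 requests per second per instance with 20% headroom yields 12 pre-warmed instances.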
Challenge: Complex operational burden
Solution: Prefer managed compute abstractions when they meet workload requirements.