PERF03 - How do you select your storage solution?
Best Practices
This question includes the following best practices:
- PERF03-BP01: Use a purpose-built data store that best supports your data access and storage requirements
- PERF03-BP02: Evaluate available configuration options for data store
- PERF03-BP03: Collect and record data store performance metrics
- PERF03-BP04: Implement strategies to improve query performance in data store
- PERF03-BP05: Implement data access patterns that utilize caching
Key Concepts
Performance Architecture Fundamentals
Access pattern analysis: Characterize how each data set is read and written (read/write ratio, request size, sequential versus random access, and concurrency) before choosing a store, because these traits determine which storage type and configuration fits.
Storage performance tiers: Match each data set to a tier that meets its latency, IOPS, and throughput targets, and avoid paying for high-performance storage where colder, cheaper tiers are sufficient.
Durability requirements: Decide how much data loss each data set can tolerate and how quickly it must be restored. These limits drive replication, backup, and storage class choices.
Optimization and Operations
Data lifecycle strategy: Define when data moves to cheaper tiers, when it is archived, and when it is deleted, and automate those transitions rather than relying on manual cleanup.
Caching and acceleration: Serve frequently read data from a cache close to the consumer to cut latency and offload the backing store. Set expiry and invalidation rules so stale reads stay within what the workload tolerates.
Throughput management: Track consumed IOPS and throughput against provisioned or burst limits, and adjust provisioning or parallelism before the workload is throttled.
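As an illustration of the caching and acceleration concept above, here is a minimal cache-aside sketch with a time-to-live. The `TTLCache` class, its method names, and the `load` callback are hypothetical and stand in for whatever cache and backing store a workload actually uses; a production system would more likely rely on a managed cache than on an in-process dictionary.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Tiny in-process cache-aside helper with per-entry expiry (illustrative only)."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: Dict[str, Tuple[float, Any]] = {}
        self.hits = 0
        self.misses = 0

    def get(self, key: str, load: Callable[[str], Any]) -> Any:
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            self.hits += 1
            return entry[1]
        # Miss or expired entry: read from the backing store and remember the result.
        self.misses += 1
        value = load(key)
        self._entries[key] = (now + self.ttl, value)
        return value

    def hit_ratio(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

Tracking the hit ratio alongside the cache makes it possible to judge whether the chosen TTL is actually offloading the backing store.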
Implementation Approach
1. Profile storage workload
- Classify data as object, block, or file workloads
- Measure read/write mix and IOPS requirements
- Identify throughput, latency, and consistency expectations
- Define retention and archival requirements
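The profiling step above can be sketched as a small calculation over I/O counters. The `IoSample` fields are hypothetical; in practice the counts would come from whatever monitoring source the workload already has, and the sample windows are assumed to be non-overlapping.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class IoSample:
    """One observation window; field names are illustrative assumptions."""
    seconds: float
    read_ops: int
    write_ops: int
    read_bytes: int
    write_bytes: int


def profile(samples: Iterable[IoSample]) -> Dict[str, float]:
    """Summarize read/write mix, IOPS, throughput, and average I/O size."""
    windows = [s for s in samples if s.seconds > 0]
    total_s = sum(s.seconds for s in windows)
    reads = sum(s.read_ops for s in windows)
    writes = sum(s.write_ops for s in windows)
    nbytes = sum(s.read_bytes + s.write_bytes for s in windows)
    ops = reads + writes
    if total_s <= 0 or ops == 0:
        raise ValueError("need at least one non-empty sample")
    return {
        "avg_iops": ops / total_s,
        # Peak is the busiest single window, not an instantaneous maximum.
        "peak_iops": max((s.read_ops + s.write_ops) / s.seconds for s in windows),
        "read_fraction": reads / ops,
        "avg_throughput_mib_s": nbytes / total_s / (1024 * 1024),
        "avg_io_size_kib": nbytes / ops / 1024,
    }
```

The gap between average and peak IOPS is often the more useful number, since it indicates whether burst capacity or provisioned performance is the better fit.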
2. Select fit-for-purpose storage
- Map workload needs to S3, EBS, EFS, or FSx options
- Choose storage classes and performance modes
- Plan replication and backup strategy for critical data
- Define encryption and compliance controls
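One way to make the mapping step above explicit is a small decision helper. This is a deliberately coarse starting heuristic, not a selection rule from AWS: the `DataKind` enum, the function name, and the two flags are assumptions made for illustration, and a real decision would also weigh latency, throughput, cost, and compliance needs.

```python
from enum import Enum


class DataKind(Enum):
    OBJECT = "object"
    BLOCK = "block"
    FILE = "file"


def suggest_storage(kind: DataKind, *, needs_smb: bool = False, hpc_scratch: bool = False) -> str:
    """Return a first-guess service family for a workload (illustrative heuristic)."""
    if kind is DataKind.OBJECT:
        return "Amazon S3"
    if kind is DataKind.BLOCK:
        return "Amazon EBS"
    # File workloads: choose by protocol and access style.
    if needs_smb:
        return "Amazon FSx for Windows File Server"
    if hpc_scratch:
        return "Amazon FSx for Lustre"
    return "Amazon EFS"
```

For example, `suggest_storage(DataKind.FILE, needs_smb=True)` points to FSx for Windows File Server, while a plain shared Linux file workload falls through to EFS.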
3. Optimize access paths
- Use caching layers for hot data access
- Tune mount options and client configurations
- Separate high-IO and archive data tiers
- Use multipart transfers and parallelism where appropriate
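For the multipart and parallelism point above, the following sketch splits an object into byte ranges that could be uploaded concurrently. The 5 MiB minimum part size and 10,000-part ceiling reflect the S3 multipart limits as commonly documented; treat them as assumptions to verify against current documentation. The function name and the 8 MiB default are hypothetical.

```python
import math
from typing import List, Tuple

MIN_PART = 5 * 1024 * 1024  # assumed S3 minimum for every part except the last
MAX_PARTS = 10_000          # assumed S3 limit on parts per upload


def plan_parts(object_size: int, preferred_part: int = 8 * 1024 * 1024) -> List[Tuple[int, int]]:
    """Return (offset, length) byte ranges that cover the whole object."""
    if object_size <= 0:
        raise ValueError("object_size must be positive")
    part = max(preferred_part, MIN_PART)
    # Grow the part size if the preferred size would exceed the part-count limit.
    part = max(part, math.ceil(object_size / MAX_PARTS))
    return [(off, min(part, object_size - off)) for off in range(0, object_size, part)]
```

Each range is independent, so a worker pool can transfer them in parallel; the right degree of parallelism depends on network and client capacity and is best found by measurement.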
4. Manage lifecycle and cost
- Automate data tiering and lifecycle transitions
- Monitor storage utilization and access trends
- Test restores regularly to confirm recovery time objectives are met
- Refine storage mix as workload behavior changes
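Automated tiering for object data can be expressed as a lifecycle rule. The sketch below only builds the configuration document, in the shape the S3 lifecycle configuration API accepts; the `logs/` prefix, the day counts, and the helper name are illustrative assumptions, not recommendations.

```python
import json


def lifecycle_rule(prefix: str, ia_days: int = 30, archive_days: int = 90,
                   expire_days: int = 365) -> dict:
    """Build one lifecycle rule: infrequent access, then archive, then expiry."""
    if not (0 < ia_days < archive_days < expire_days):
        raise ValueError("expected 0 < ia_days < archive_days < expire_days")
    return {
        "ID": f"tier-{prefix.strip('/') or 'all'}",
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        "Transitions": [
            {"Days": ia_days, "StorageClass": "STANDARD_IA"},
            {"Days": archive_days, "StorageClass": "GLACIER"},
        ],
        "Expiration": {"Days": expire_days},
    }


if __name__ == "__main__":
    # Hypothetical prefix; print the document for review before applying it.
    print(json.dumps({"Rules": [lifecycle_rule("logs/")]}, indent=2))
```

Assuming boto3 is installed and credentials are configured, the resulting dictionary could be passed as `LifecycleConfiguration` to the S3 client's `put_bucket_lifecycle_configuration` call.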
AWS Services to Consider
Amazon S3
Delivers highly durable object storage with storage classes and lifecycle controls for performance and cost optimization.
Amazon EBS
Provides block storage options tuned for latency-sensitive and throughput-intensive workloads.
Amazon EFS
Offers shared file storage with elastic scaling for Linux workloads across multiple instances.
Amazon FSx
Provides managed high-performance file systems for Windows File Server, Lustre, NetApp ONTAP, and OpenZFS workloads.
Amazon CloudWatch
Collects metrics and logs and provides alarms and dashboards, so teams can track storage IOPS, throughput, and latency, detect issues early, and measure operational outcomes.
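As a sketch of how storage metrics might feed an alarm, the helper below builds parameters for a CloudWatch alarm on EBS queue length. The threshold, period, and alarm name are placeholder assumptions to be tuned per workload; the function itself is hypothetical and makes no AWS calls.

```python
def ebs_queue_alarm(volume_id: str, threshold: float = 8.0) -> dict:
    """Build alarm parameters for sustained EBS queue depth (illustrative values)."""
    return {
        "AlarmName": f"ebs-queue-length-{volume_id}",
        "Namespace": "AWS/EBS",
        "MetricName": "VolumeQueueLength",
        "Dimensions": [{"Name": "VolumeId", "Value": volume_id}],
        "Statistic": "Average",
        "Period": 300,           # seconds per evaluation window
        "EvaluationPeriods": 3,  # require three consecutive breaches
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanThreshold",
        "TreatMissingData": "notBreaching",
    }
```

Assuming boto3 is installed and credentials are configured, these parameters could be applied with `boto3.client("cloudwatch").put_metric_alarm(**ebs_queue_alarm(volume_id))`.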
Common Challenges and Solutions
Challenge: Using one storage type for all workloads
Solution: Align storage choices to workload-specific access and durability needs.
Challenge: Unexpected storage latency
Solution: Benchmark with realistic patterns and tune client-side settings and throughput allocations.
Challenge: Uncontrolled data growth
Solution: Automate lifecycle policies and archival to control performance and cost over time.
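When benchmarking for the latency challenge above, tail percentiles usually say more than the mean. This minimal sketch summarizes latency samples using the nearest-rank method; the function names are hypothetical, and the samples are assumed to come from a benchmark that replays realistic access patterns.

```python
import math
from typing import Dict, Sequence


def percentile(sorted_values: Sequence[float], p: int) -> float:
    """Nearest-rank percentile of pre-sorted values, with p in 1..100."""
    if not sorted_values:
        raise ValueError("no samples")
    rank = max(1, math.ceil(p * len(sorted_values) / 100))
    return sorted_values[rank - 1]


def summarize(latencies_ms: Sequence[float]) -> Dict[str, float]:
    """Report median and tail latency for a set of benchmark samples."""
    ordered = sorted(latencies_ms)
    return {
        "p50": percentile(ordered, 50),
        "p95": percentile(ordered, 95),
        "p99": percentile(ordered, 99),
        "max": ordered[-1],
    }
```

Comparing these figures before and after a tuning change shows whether the change helped the slowest requests or only the typical ones.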