SUS04-BP03 - Use policies to manage the lifecycle of your datasets
Implementation Guidance
Managing dataset lifecycles with policies removes wasted effort, unused capacity, and inefficient patterns that degrade cost and performance. Tune these policies continuously against observed workload behavior rather than setting them once.
This practice supports the question “How do you take advantage of data access and usage patterns to support your sustainability goals?”. Define measurable outcomes, assign owners, and review execution regularly. Integrate the practice into delivery and operations processes so that improvements persist as workloads and requirements evolve.
Key Steps
- Identify optimization targets:
  - Profile the current state of the datasets and storage systems that lifecycle policies would cover
  - Prioritize bottlenecks and waste by business impact
  - Define target utilization and performance goals
- Apply targeted improvements:
  - Implement architectural, configuration, or code-level optimizations
  - Use staged rollout to verify gains and limit risk
  - Capture before-and-after metrics for each change
- Sustain gains over time:
  - Automate periodic review and regression detection
  - Retire ineffective optimizations and scale successful patterns
  - Continuously refine targets as workload characteristics evolve
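The steps above can be sketched as a small policy function. This is a minimal illustration, not a prescribed implementation: the `Dataset` fields, the 90-day and 365-day thresholds, and the sample inventory are all hypothetical and would need to come from your own access-pattern data and retention requirements.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical thresholds in days since last access; tune to observed patterns.
ARCHIVE_AFTER_DAYS = 90
DELETE_AFTER_DAYS = 365


@dataclass
class Dataset:
    name: str
    size_gb: float
    days_since_last_access: int
    must_retain: bool = False  # e.g. a retention hold; never delete automatically


def lifecycle_action(ds: Dataset) -> str:
    """Return 'delete', 'archive', or 'keep' for one dataset."""
    if ds.days_since_last_access >= DELETE_AFTER_DAYS and not ds.must_retain:
        return "delete"
    if ds.days_since_last_access >= ARCHIVE_AFTER_DAYS:
        return "archive"
    return "keep"


def summarize(datasets: list[Dataset]) -> dict[str, float]:
    """Total size in GB per action, usable as a before-and-after metric."""
    totals = {"keep": 0.0, "archive": 0.0, "delete": 0.0}
    for ds in datasets:
        totals[lifecycle_action(ds)] += ds.size_gb
    return totals


# Illustrative inventory; names and sizes are made up.
inventory = [
    Dataset("clickstream-raw", 1200.0, 400),
    Dataset("audit-logs", 300.0, 500, must_retain=True),
    Dataset("daily-reports", 50.0, 3),
]
plan = summarize(inventory)
print(plan)  # {'keep': 50.0, 'archive': 300.0, 'delete': 1200.0}
```

Running the same summary on a schedule gives a simple regression check: if the "archive" or "delete" totals grow between runs, the policy is not being applied as intended.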
Risk / Impact
Level of risk if not implemented: Medium
Impact: Without this best practice, workloads typically accumulate inefficiencies and execution drift that increase failure probability over time. Problems often surface during traffic spikes, major releases, or dependency failures.
Benefits of implementation:
- More predictable operational and engineering outcomes
- Better alignment between architecture decisions and business goals
- Continuous improvement through measurable feedback loops
AWS Services to Consider
Amazon S3
Delivers durable object storage with lifecycle controls for efficient data management.
AWS Glue
Automates data cataloging and ETL workflows for efficient data processing.
Amazon Athena
Queries data in S3 with serverless SQL for analytics and reporting.
Amazon EMR
Runs scalable big data processing frameworks for batch and streaming workloads.
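For Amazon S3, the lifecycle controls mentioned above are expressed as rules that transition or expire objects by age. The sketch below builds one such rule as a plain dictionary in the shape accepted by boto3's `put_bucket_lifecycle_configuration`. The rule ID, prefix, day counts, and bucket name are hypothetical. The validation is a simplified sanity check; S3 enforces its own constraints when the configuration is applied.

```python
import json


def build_lifecycle_rule(rule_id, prefix, ia_after_days, archive_after_days, expire_after_days):
    """Build one S3 lifecycle rule as a dictionary.

    Simplified check only: S3 documents a 30-day minimum before objects can
    transition to STANDARD_IA, and the stages here must be strictly ordered.
    """
    if not 30 <= ia_after_days < archive_after_days < expire_after_days:
        raise ValueError("expected 30 <= ia_after_days < archive_after_days < expire_after_days")
    return {
        "ID": rule_id,
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        "Transitions": [
            {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
            {"Days": archive_after_days, "StorageClass": "GLACIER"},
        ],
        "Expiration": {"Days": expire_after_days},
    }


# Hypothetical rule: raw logs move to colder storage, then expire after a year.
config = {"Rules": [build_lifecycle_rule("expire-raw-logs", "logs/raw/", 30, 90, 365)]}
print(json.dumps(config, indent=2))

# With boto3 installed and credentials configured, the dictionary could be applied
# to a bucket (name is a placeholder) with:
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="example-bucket", LifecycleConfiguration=config)
```

Keeping the rule as data makes it easy to review in version control and to roll out to one prefix at a time before applying it more broadly.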