SUS04

SUS04-BP05 - Remove unneeded or redundant data

Implementation Guidance

Removing unneeded or redundant data minimizes the storage resources required to hold your datasets, and it cuts the capacity and effort spent storing, backing up, and processing data that nobody uses. Treat cleanup as continuous tuning backed by observed data access patterns rather than a one-time purge.

To answer the question “How do you take advantage of data access and usage patterns to support your sustainability goals?”, define measurable outcomes, assign owners, and review progress regularly. Integrate this practice into delivery and operations processes so that improvements persist as workloads and requirements evolve.

Key Steps

  1. Identify optimization targets:

    • Profile current datasets to find data that is unneeded or redundant
    • Prioritize bottlenecks and waste by business impact
    • Define target utilization and performance goals
  2. Apply targeted improvements:

    • Implement architectural, configuration, or code-level optimizations
    • Use staged rollout to verify gains and limit risk
    • Capture before-and-after metrics for each change
  3. Sustain gains over time:

    • Automate periodic review and regression detection
    • Retire ineffective optimizations and scale successful patterns
    • Continuously refine targets as workload characteristics evolve
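
The first step above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed implementation: it assumes you already have an inventory of objects (key to content) and a record of when each key was last accessed. The function names `find_redundant` and `find_stale`, and the shape of the inputs, are hypothetical.

```python
import hashlib
from collections import defaultdict
from datetime import datetime, timedelta, timezone


def find_redundant(objects):
    """Return keys whose content duplicates an earlier key (by SHA-256 digest).

    `objects` is a hypothetical inventory mapping object key -> bytes.
    """
    by_digest = defaultdict(list)
    # Sort keys so the "original" kept in each group is deterministic.
    for key, payload in sorted(objects.items()):
        by_digest[hashlib.sha256(payload).hexdigest()].append(key)
    # Keep the first key of each group; the rest are redundant copies.
    return sorted(k for keys in by_digest.values() for k in keys[1:])


def find_stale(last_accessed, now, max_age_days):
    """Return keys whose last access is older than `max_age_days`.

    `last_accessed` is a hypothetical mapping object key -> datetime.
    """
    cutoff = now - timedelta(days=max_age_days)
    return sorted(k for k, ts in last_accessed.items() if ts < cutoff)
```

Running both functions on a schedule and recording the number and size of flagged objects gives a before-and-after metric for each cleanup. Flagged objects are candidates for review, not automatic deletion: retention and compliance requirements still apply.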

Risk / Impact

Level of risk if not implemented: Medium

Impact: Without this best practice, workloads typically accumulate unneeded and redundant data, along with the inefficiencies and execution drift that come from storing and processing it, which increases failure probability over time. Problems often surface during traffic spikes, major releases, or dependency failures.

Benefits of implementation:

  • More predictable operational and engineering outcomes
  • Better alignment between architecture decisions and business goals
  • Continuous improvement through measurable feedback loops

AWS Services to Consider

Amazon S3

Delivers durable object storage with lifecycle controls for efficient data management.
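
As an illustration of lifecycle controls, the sketch below builds an S3 lifecycle configuration that expires objects under a prefix, removes noncurrent versions, and aborts incomplete multipart uploads. The helper name `build_lifecycle_config`, the prefix, and the day counts are assumptions chosen for the example. The dictionary follows the structure accepted by the `LifecycleConfiguration` parameter of boto3's `put_bucket_lifecycle_configuration`.

```python
def build_lifecycle_config(prefix, expire_days, noncurrent_days, abort_upload_days):
    """Build a lifecycle configuration dict for one prefix (hypothetical helper)."""
    return {
        "Rules": [
            {
                "ID": f"expire-{prefix.strip('/')}",
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                # Delete current objects this many days after creation.
                "Expiration": {"Days": expire_days},
                # In versioned buckets, delete superseded versions as well.
                "NoncurrentVersionExpiration": {"NoncurrentDays": noncurrent_days},
                # Clean up parts left behind by uploads that never completed.
                "AbortIncompleteMultipartUpload": {
                    "DaysAfterInitiation": abort_upload_days
                },
            }
        ]
    }


# Example values are assumptions, not recommendations.
config = build_lifecycle_config("tmp/", expire_days=30, noncurrent_days=7,
                                abort_upload_days=3)
```

To apply it, pass `config` as `LifecycleConfiguration` to `put_bucket_lifecycle_configuration` on a boto3 S3 client, together with the bucket name. That call replaces any lifecycle configuration already on the bucket, so merge existing rules first if they need to be kept.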

AWS Glue

Automates data cataloging and ETL workflows for efficient data processing.

Amazon Athena

Queries data in S3 with serverless SQL for analytics and reporting.

Amazon EMR

Runs scalable big data processing frameworks for batch and streaming workloads.