
SUS04 - How do you take advantage of data access and usage patterns to support your sustainability goals?

Best Practices


Key Concepts

Sustainability Design Foundations

Data access optimization: Match storage placement and retrieval paths to how data is actually read and written, so the most frequently accessed data sits on the most efficient resources. Define measurable access targets, assign clear ownership, and review results regularly against expected business outcomes.

Storage lifecycle design: Plan how data moves through storage tiers from creation to archival or deletion, automating transitions so data is never kept on a tier more capable than its access pattern requires.

Data minimization: Collect and retain only the data the business needs, reducing the storage footprint and the energy consumed to keep, move, and protect it.

Operational Sustainability Controls

Query efficiency: Structure data and queries so each request scans only the data it needs, using partitioning, pruning, and efficient file formats to cut wasted compute.

Data movement reduction: Process data close to where it is stored and cache repeated reads, avoiding unnecessary transfers across networks, Regions, and environments.

Retention governance: Define, enforce, and audit retention and deletion policies so expired or unused data is removed rather than accumulating indefinitely.

Implementation Approach

1. Understand data usage

  • Classify data by access frequency and criticality
  • Identify hot, warm, and cold datasets
  • Map expensive data movement paths
  • Define retention and deletion requirements
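The classification step above can be sketched in a few lines. This is a minimal illustration, not a prescribed implementation; the 30- and 90-day thresholds and the dataset names are assumptions you would replace with your own access telemetry.

```python
from datetime import datetime, timedelta, timezone

# Illustrative thresholds (assumptions): tune these to your own telemetry.
HOT_DAYS = 30    # accessed within the last 30 days
WARM_DAYS = 90   # accessed within the last 90 days

def classify_dataset(last_accessed: datetime, now: datetime) -> str:
    """Bucket a dataset into hot/warm/cold by days since last access."""
    age = now - last_accessed
    if age <= timedelta(days=HOT_DAYS):
        return "hot"
    if age <= timedelta(days=WARM_DAYS):
        return "warm"
    return "cold"

now = datetime(2024, 6, 1, tzinfo=timezone.utc)
datasets = {
    "orders":      datetime(2024, 5, 28, tzinfo=timezone.utc),
    "clickstream": datetime(2024, 4, 1, tzinfo=timezone.utc),
    "audit_2019":  datetime(2019, 12, 31, tzinfo=timezone.utc),
}
tiers = {name: classify_dataset(ts, now) for name, ts in datasets.items()}
```

Feeding the resulting tiers into lifecycle policies closes the loop between measured access frequency and storage placement.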

2. Optimize storage and access

  • Place hot data on performant tiers and archive cold data
  • Use lifecycle policies for automated transitions
  • Cache repeated reads and precompute frequent aggregations
  • Reduce redundant data copies across environments
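As one way to express the lifecycle policies mentioned above, the sketch below builds a configuration in the shape accepted by the S3 `PutBucketLifecycleConfiguration` API (for example via boto3's `put_bucket_lifecycle_configuration`). The rule ID, prefix, and day counts are illustrative assumptions.

```python
# Sketch of an S3 lifecycle configuration: transition objects to cheaper,
# lower-energy tiers as they cool, then expire them. Rule ID, prefix, and
# day counts are hypothetical examples.
lifecycle_configuration = {
    "Rules": [
        {
            "ID": "tier-then-expire-logs",   # hypothetical rule name
            "Status": "Enabled",
            "Filter": {"Prefix": "logs/"},   # hypothetical prefix
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 90, "StorageClass": "GLACIER"},
            ],
            "Expiration": {"Days": 365},
        }
    ]
}
```

Automating transitions this way removes the manual step of moving cold data off performant tiers.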

3. Improve processing efficiency

  • Tune queries and partition strategies
  • Process data incrementally instead of rescanning full datasets
  • Schedule batch jobs in off-peak, energy-efficient windows
  • Use compression and efficient formats for analytics
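Incremental processing, one of the techniques above, can be sketched with a simple watermark: record the last timestamp processed and scan only newer rows. The in-memory table here is a stand-in assumption for a real data store.

```python
# Sketch of watermark-based incremental processing: instead of a full scan,
# process only records newer than the last watermark, then advance it.
def process_incrementally(records, watermark):
    """records: iterable of (timestamp, payload) tuples.
    Returns (new rows, advanced watermark)."""
    new_rows = [r for r in records if r[0] > watermark]
    new_watermark = max((r[0] for r in new_rows), default=watermark)
    return new_rows, new_watermark

# Hypothetical table; in practice this would be a partitioned data store.
table = [(1, "a"), (2, "b"), (3, "c"), (4, "d")]
batch, wm = process_incrementally(table, watermark=2)
```

Persisting the watermark between runs ensures each batch touches only the delta, cutting both compute and data scanned.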

4. Govern and refine continuously

  • Audit retention policy adherence
  • Monitor cost and performance of data workflows
  • Retire stale datasets and unused pipelines
  • Update access patterns as application behavior changes
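The retention-audit step above might be sketched as follows; the inventory entries and retention periods are illustrative assumptions, and a real audit would read this metadata from a catalog rather than a literal dict.

```python
from datetime import date, timedelta

# Sketch of a retention audit: flag datasets past their retention window
# for review or deletion. Names and retention periods are hypothetical.
def expired_datasets(inventory, today):
    """inventory: {name: (created, retention_days)} -> sorted names past retention."""
    return sorted(
        name
        for name, (created, retention_days) in inventory.items()
        if today - created > timedelta(days=retention_days)
    )

inventory = {
    "daily_metrics": (date(2024, 1, 1), 365),
    "tmp_export":    (date(2023, 1, 1), 30),
}
stale = expired_datasets(inventory, today=date(2024, 6, 1))
```

Running such an audit on a schedule turns retention policy from documentation into an enforced control.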

AWS Services to Consider

Amazon S3

Delivers highly durable object storage with storage classes and lifecycle controls for performance and cost optimization.

AWS Glue

Builds and automates data cataloging and ETL pipelines to improve data processing efficiency.

Amazon Athena

Runs serverless SQL queries on data in S3 for analytics and operational reporting.

Amazon EMR

Runs scalable big data frameworks for batch and streaming data workloads.

Amazon CloudWatch

Collects metrics, logs, alarms, and dashboards so teams can detect issues early and track operational outcomes.

Common Challenges and Solutions

Challenge: Cold data kept on high-performance tiers

Solution: Automate tiering and lifecycle policies based on access telemetry.

Challenge: Large repeated full-table scans

Solution: Adopt partitioning, pruning, and incremental processing techniques.
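To illustrate the pruning idea behind this solution, the sketch below lays data out by a partition key (a date, as one common choice) so a query for a single day reads only that partition instead of scanning every row. The layout and rows are illustrative assumptions.

```python
# Sketch of partition pruning: data is organized by partition key (date),
# so a point query touches one partition rather than the whole table.
partitions = {
    "2024-06-01": [{"id": 1}, {"id": 2}],
    "2024-06-02": [{"id": 3}],
}

def query_day(partitions, day):
    """Read only the partition for the requested day; empty list if absent."""
    return partitions.get(day, [])

rows = query_day(partitions, "2024-06-02")
```

Engines such as Athena apply the same principle automatically when queries filter on partition columns.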

Challenge: Data sprawl across environments

Solution: Use governance controls and retention enforcement to remove unnecessary copies.