SUS04 - How do you take advantage of data access and usage patterns to support your sustainability goals?
Best Practices
This question includes the following best practices:
- SUS04-BP01: Implement a data classification policy
- SUS04-BP02: Use technologies that support data access and storage patterns
- SUS04-BP03: Use policies to manage the lifecycle of your datasets
- SUS04-BP04: Use elasticity and automation to expand block storage or file system
- SUS04-BP05: Remove unneeded or redundant data
- SUS04-BP06: Use shared file systems or storage to access common data
- SUS04-BP07: Minimize data movement across networks
- SUS04-BP08: Back up data only when difficult to recreate
Key Concepts
Sustainability Design Foundations
Data access optimization: Match storage and retrieval mechanisms to how data is actually accessed, so infrequently read data does not occupy high-performance resources. Measure access frequency per dataset and review placement decisions against it regularly.
Storage lifecycle design: Plan each dataset's transitions from creation to deletion, moving data to colder, lower-impact storage tiers as access frequency declines. Define the transition points up front and automate them rather than relying on manual cleanup.
Data minimization: Store only the data the workload needs, for only as long as it is needed. A smaller footprint reduces the storage, backup, and replication resources the workload consumes.
Operational Sustainability Controls
Query efficiency: Structure queries and data layouts (partitioning, columnar formats, pruning) so each query scans only the data it needs. Track data scanned per query as an efficiency signal.
Data movement reduction: Process data close to where it is stored, and cache or co-locate shared datasets so the same bytes are not transferred across networks repeatedly.
Retention governance: Define retention periods per data class and enforce them with automated deletion, so expired data is removed rather than accumulating indefinitely.
Implementation Approach
1. Understand data usage
- Classify data by access frequency and criticality
- Identify hot, warm, and cold datasets
- Map expensive data movement paths
- Define retention and deletion requirements
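The classification step above can be sketched as a small function that buckets datasets into hot, warm, and cold tiers by days since last access. The thresholds and dataset names here are illustrative assumptions, not recommendations; in practice the last-access timestamps would come from your own access telemetry.

```python
from datetime import datetime, timedelta, timezone

# Illustrative thresholds: hot < 30 days since last access, warm < 90, else cold.
HOT_DAYS, WARM_DAYS = 30, 90

def classify(last_accessed: datetime, now: datetime) -> str:
    """Bucket a dataset by days since its last recorded access."""
    age_days = (now - last_accessed).days
    if age_days < HOT_DAYS:
        return "hot"
    if age_days < WARM_DAYS:
        return "warm"
    return "cold"

now = datetime(2024, 6, 1, tzinfo=timezone.utc)
datasets = {
    "orders": now - timedelta(days=3),          # hypothetical datasets
    "clickstream-2023": now - timedelta(days=200),
    "monthly-report": now - timedelta(days=45),
}
tiers = {name: classify(ts, now) for name, ts in datasets.items()}
```

The resulting tier map can then drive the placement and lifecycle decisions in the next step.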
2. Optimize storage and access
- Place hot data on performant tiers and archive cold data
- Use lifecycle policies for automated transitions
- Cache repeated reads and precompute frequent aggregations
- Reduce redundant data copies across environments
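As one way to automate the transitions above, an S3 lifecycle configuration can move objects to colder storage classes and expire them on a schedule. This sketch only builds the configuration dictionary; the prefix, day counts, and bucket name are illustrative assumptions, and applying the policy (commented out) requires boto3 and AWS credentials.

```python
def lifecycle_rules(prefix: str) -> dict:
    """Build an S3 lifecycle configuration for objects under a prefix.

    Day counts are illustrative: transition to STANDARD_IA at 30 days,
    to GLACIER at 90, and delete after one year.
    """
    return {
        "Rules": [
            {
                "ID": f"tier-{prefix.rstrip('/')}",
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": 30, "StorageClass": "STANDARD_IA"},
                    {"Days": 90, "StorageClass": "GLACIER"},
                ],
                "Expiration": {"Days": 365},
            }
        ]
    }

config = lifecycle_rules("logs/")

# Applying it (requires boto3 and credentials; bucket name is a placeholder):
# import boto3
# boto3.client("s3").put_bucket_lifecycle_configuration(
#     Bucket="example-bucket", LifecycleConfiguration=config)
```

Driving the day counts from measured access telemetry, rather than fixed guesses, keeps the tiering aligned with real usage.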
3. Improve processing efficiency
- Tune queries and partition strategies
- Process data incrementally rather than rescanning full datasets
- Run batch processing during efficient windows
- Use compression and efficient formats for analytics
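Incremental processing, mentioned above, can be sketched with a simple watermark: each run processes only records newer than the last high-water mark instead of rescanning the full dataset. The record shape and the doubling transformation are placeholders for illustration.

```python
def process_incremental(records, watermark):
    """Process only records newer than the watermark.

    records: list of (timestamp, value) pairs; returns (results, new_watermark).
    """
    fresh = [r for r in records if r[0] > watermark]
    results = [value * 2 for _, value in fresh]  # placeholder transformation
    new_watermark = max((ts for ts, _ in fresh), default=watermark)
    return results, new_watermark

records = [(1, 10), (2, 20), (3, 30)]
out1, wm = process_incremental(records, watermark=0)   # first run: all records
records.append((4, 40))
out2, wm = process_incremental(records, watermark=wm)  # next run: only the new one
```

The same idea scales up to partitioned analytics jobs, where the watermark selects which partitions to read at all.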
4. Govern and refine continuously
- Audit retention policy adherence
- Monitor cost and performance of data workflows
- Retire stale datasets and unused pipelines
- Update access patterns as application behavior changes
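A retention audit from step 4 can be sketched as a check that flags datasets older than their assigned retention period. Dataset names, retention values, and dates are illustrative assumptions; datasets with no assigned retention are flagged immediately here so they surface for review.

```python
from datetime import datetime

# Illustrative retention policy, in days per dataset.
retention_days = {"audit-logs": 365, "temp-exports": 7}

def expired(datasets, retention, now):
    """Return names of datasets whose age exceeds their retention period.

    datasets maps name -> creation date; unknown names default to 0 days
    of retention so they are flagged for review rather than kept silently.
    """
    return sorted(
        name for name, created in datasets.items()
        if (now - created).days > retention.get(name, 0)
    )

stale = expired(
    {"audit-logs": datetime(2023, 1, 1), "temp-exports": datetime(2024, 5, 29)},
    retention_days,
    datetime(2024, 6, 1),
)
```

Running such a check on a schedule, and alarming on its output, turns retention policy from documentation into an enforced control.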
AWS Services to Consider
Amazon S3
Delivers highly durable object storage with storage classes and lifecycle controls for performance and cost optimization.
AWS Glue
Builds and automates data cataloging and ETL pipelines to improve data processing efficiency.
Amazon Athena
Runs serverless SQL queries on data in S3 for analytics and operational reporting.
Amazon EMR
Runs scalable big data frameworks for batch and streaming data workloads.
Amazon CloudWatch
Collects metrics, logs, alarms, and dashboards so teams can detect issues early and track operational outcomes.
Common Challenges and Solutions
Challenge: Cold data kept on high-performance tiers
Solution: Automate tiering and lifecycle policies based on access telemetry.
Challenge: Large repeated full-table scans
Solution: Adopt partitioning, pruning, and incremental processing techniques.
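Partition pruning can be illustrated with a sketch that selects only the partition paths inside a query's date range, so everything outside the range is never read. The date-keyed layout and S3 paths are assumptions for illustration.

```python
def prune(partitions, start, end):
    """Return partition paths whose date key falls in [start, end].

    partitions maps 'YYYY-MM-DD' -> storage path; string comparison works
    because the key format sorts lexicographically by date.
    """
    return [path for day, path in sorted(partitions.items()) if start <= day <= end]

partitions = {
    "2024-06-01": "s3://logs/dt=2024-06-01/",
    "2024-06-02": "s3://logs/dt=2024-06-02/",
    "2024-06-09": "s3://logs/dt=2024-06-09/",
}
paths = prune(partitions, "2024-06-01", "2024-06-07")
```

Query engines such as Athena apply the same idea automatically when the partition key appears in the WHERE clause.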
Challenge: Data sprawl across environments
Solution: Use governance controls and retention enforcement to remove unnecessary copies.
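One mechanical way to surface the redundant copies described above is to group files by content hash and report any hash with more than one path. The file paths and contents here are illustrative; at scale the hashes would come from object metadata or an inventory report rather than reading every object.

```python
import hashlib

def find_duplicates(files):
    """Group files by content hash; return hash -> paths with more than one copy.

    files maps path -> bytes content.
    """
    by_hash = {}
    for path, data in files.items():
        by_hash.setdefault(hashlib.sha256(data).hexdigest(), []).append(path)
    return {h: sorted(paths) for h, paths in by_hash.items() if len(paths) > 1}

files = {
    "dev/report.csv": b"a,b\n1,2\n",   # hypothetical environment copies
    "prod/report.csv": b"a,b\n1,2\n",
    "prod/other.csv": b"x\n",
}
dups = find_duplicates(files)
```

Duplicate reports like this feed the governance process: each group of copies is either consolidated to a shared location or one copy is justified and documented.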