AWS Cost Saving Toolbox

When organizations move their workloads to the cloud, they often do it at least partially because of the Cloud Economics benefits that it can bring. However, at a later point in their cloud journey they often find that they’re paying more than they expected.

I’m writing a draft post on how that journey looks at times, but a component of that is a toolbox of cost saving techniques.

Overall Approach #

Before reaching for any of these tools, it’s a good idea to understand your cloud spending and workloads first. You could start with your cloud bill and explore it in AWS Cost Explorer.

As you explore the tools described below, estimate the impact each one would have and the effort it would take in your use case, and prioritize accordingly.

Compute #

Savings Plan #

This is the easiest tool to adopt, in the sense that it does not require teams to change anything, and it could result in savings of 10% or more.

A Savings Plan only needs an estimate of how much you typically spend on compute (in $/hour) and a commitment to spend that much. The commitment term is similar to Reserved Instances: 1 or 3 years.

Usage up to that commitment is billed at the reduced rate, and anything above it is billed at on-demand pricing. Note that you pay the committed amount every hour even when your actual usage falls below it, so it pays not to over-commit.
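To make the arithmetic concrete, here is a minimal sketch of one hour's bill under a Savings Plan. It assumes a single flat discount across all usage, which is a simplification (real discounts vary by instance type, region, and term), and the function and parameter names are made up for illustration.

```python
def hourly_compute_bill(on_demand_usage: float, commitment: float, discount: float) -> float:
    """Estimate one hour's compute bill under a Savings Plan.

    on_demand_usage: what the hour's usage would cost at on-demand rates ($).
    commitment: hourly commitment ($/hour), paid whether it is used or not.
    discount: assumed flat discount vs on-demand, e.g. 0.25 for 25%.
    """
    if not 0 <= discount < 1:
        raise ValueError("discount must be in [0, 1)")
    # The commitment is measured at discounted rates, so it covers more
    # on-demand-equivalent usage than its face value.
    covered_on_demand = commitment / (1 - discount)
    # Usage beyond what the commitment covers is billed at on-demand rates.
    overflow = max(0.0, on_demand_usage - covered_on_demand)
    return commitment + overflow
```

With a $7.50/hour commitment and an assumed 25% discount, the plan covers $10 of on-demand-equivalent usage per hour. An hour with $8 of usage still costs $7.50, while an hour with $12 of usage costs $7.50 plus $2 at on-demand rates.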

Savings Plans are a fair bit more flexible than Reserved Instances. For example, a Compute Savings Plan applies across EC2 instance types, and also covers some other compute services like Fargate and Lambda.

How much savings? #

It depends on your workload, but the AWS Console has a Savings Plan recommendation page that will give you a sense of how much you will save with various options.
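The same recommendation can also be pulled programmatically through the Cost Explorer API. The sketch below assumes boto3 is installed and credentials with Cost Explorer access are configured; the wrapper function and its defaults are my own, and the optional `client` parameter exists only so the call can be exercised without AWS access.

```python
def fetch_savings_plan_recommendation(term="ONE_YEAR",
                                      payment_option="NO_UPFRONT",
                                      lookback="THIRTY_DAYS",
                                      client=None):
    """Ask Cost Explorer for a Compute Savings Plan purchase recommendation."""
    if client is None:
        # Imported lazily so the module loads even where boto3 is absent.
        import boto3
        client = boto3.client("ce")
    response = client.get_savings_plans_purchase_recommendation(
        SavingsPlansType="COMPUTE_SP",   # Compute Savings Plan
        TermInYears=term,                # "ONE_YEAR" or "THREE_YEARS"
        PaymentOption=payment_option,    # e.g. "NO_UPFRONT", "ALL_UPFRONT"
        LookbackPeriodInDays=lookback,   # "SEVEN_DAYS", "THIRTY_DAYS", "SIXTY_DAYS"
    )
    return response["SavingsPlansPurchaseRecommendation"]
```

Running it for both terms and a couple of payment options gives a quick side-by-side view of the options the console page shows.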

Reserved Instances (RIs) #

This is similar to the Savings Plan, but has a little less flexibility, depending on the type you choose.

There are 3 types: Standard, Convertible, and Scheduled.

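The Standard option ties you to a specific instance family, whereas the Convertible one lets you exchange the reservation for a different configuration, but at the price of a lower discount.

Scheduled RIs are, as you may guess, suitable for workloads that only need to run for e.g. a few hours a week / month / year, but at predictable times. Check current availability before planning around them, as AWS documentation indicates that Scheduled RIs can no longer be purchased.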

Spot Instances #

This has the largest potential impact, but depending on how your infrastructure is set up, it may take quite a bit of effort to take advantage of.

Spot Instances are instances that AWS makes available to you at a steep discount, with the caveat that AWS may need to reclaim them. When it does, you are given a 2-minute notice to wrap up before the instance is reclaimed.
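A workload can watch for that notice by polling the instance metadata path `/latest/meta-data/spot/instance-action`, which returns 404 until an interruption is scheduled and a small JSON document afterwards. The sketch below covers only the parsing step and leaves the HTTP fetch out (with IMDSv2 it needs a session token first); the function name is hypothetical.

```python
import json
from datetime import datetime, timezone
from typing import Optional


def seconds_until_interruption(body: Optional[str], now: datetime) -> Optional[float]:
    """Return the seconds left before a Spot interruption, or None if none is pending.

    body: response body from the instance-action metadata path, or None
          when the endpoint answered 404 (no interruption scheduled).
    now:  current time as a timezone-aware datetime.
    """
    if body is None:
        return None
    # Example payload: {"action": "terminate", "time": "2017-09-18T08:22:00Z"}
    notice = json.loads(body)
    deadline = datetime.strptime(notice["time"], "%Y-%m-%dT%H:%M:%SZ")
    deadline = deadline.replace(tzinfo=timezone.utc)
    return (deadline - now).total_seconds()
```

A worker could poll this every few seconds and, once it returns a number, stop accepting new work, checkpoint, and drain.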

The large potential impact comes from the fact that most modern workloads are stateless and fault-tolerant, i.e. if they get pre-empted, they can pick back up right where they left off. Microservices are mostly stateless, and so are worker nodes of big data platforms like Spark and Hadoop, as well as CI/CD build agents.

The effort required #

The effort required depends on how your infrastructure is set up. For example, if you operate an EKS cluster that multiple teams use, you could add spot instances to the cluster with little effect on those teams. Otherwise, if teams use Auto Scaling Groups (ASGs) for example, they will need to configure their own ASGs to utilize spot instances (typically in conjunction with some on-demand ones).
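For the ASG case, the on-demand/spot mix is expressed through the group's mixed instances policy. As a rough sketch, the helper below builds the `MixedInstancesPolicy` structure that boto3's `create_auto_scaling_group` and `update_auto_scaling_group` accept. The helper name, the launch template name, and the instance types are placeholders, and `capacity-optimized` is just one of the available spot allocation strategies.

```python
def mixed_instances_policy(launch_template_name, instance_types,
                           on_demand_base, on_demand_pct_above_base):
    """Build a MixedInstancesPolicy dict combining on-demand and spot capacity."""
    if not 0 <= on_demand_pct_above_base <= 100:
        raise ValueError("on_demand_pct_above_base must be between 0 and 100")
    return {
        "LaunchTemplate": {
            "LaunchTemplateSpecification": {
                "LaunchTemplateName": launch_template_name,
                "Version": "$Latest",
            },
            # Several interchangeable instance types give spot more pools to draw from.
            "Overrides": [{"InstanceType": t} for t in instance_types],
        },
        "InstancesDistribution": {
            # Always keep this many on-demand instances as a stable floor.
            "OnDemandBaseCapacity": on_demand_base,
            # Above the floor, this percentage is on-demand and the rest is spot.
            "OnDemandPercentageAboveBaseCapacity": on_demand_pct_above_base,
            "SpotAllocationStrategy": "capacity-optimized",
        },
    }


policy = mixed_instances_policy("web-template", ["m5.large", "m5a.large"], 2, 25)
```

Here two instances always run on-demand, and beyond that roughly a quarter of the capacity is on-demand and the rest is spot.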

Storage #

Right-sizing block storage (EBS) #

Unlike most cloud services that bill by usage, EBS bills you by how much you provision, regardless of how much you use. Did you overprovision your EBS volumes?

Decreasing EBS volume size is not straightforward

Decreasing the size of an EBS volume is not supported. You have to create a smaller volume and migrate your data to it.

In contrast, increasing the size of the EBS volume is supported and pretty straightforward to do.

Take this into account when deciding how much volume to provision.
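To get a feel for how much overprovisioning costs, a back-of-the-envelope estimate can help. The sketch below assumes you have already collected provisioned and used sizes per volume (e.g. from your monitoring agent). The `Volume` type, the 20% headroom, and the price per GiB-month are illustrative assumptions, so substitute the rate for your volume type and region.

```python
from dataclasses import dataclass


@dataclass
class Volume:
    volume_id: str
    provisioned_gib: float
    used_gib: float


def monthly_overprovision_cost(volumes, price_per_gib_month, headroom=0.2):
    """Estimate monthly spend on capacity beyond used space plus headroom."""
    total = 0.0
    for v in volumes:
        # Size we would actually want: current usage plus a safety margin.
        target_gib = v.used_gib * (1 + headroom)
        # Only capacity above the target counts as overprovisioned.
        excess_gib = max(0.0, v.provisioned_gib - target_gib)
        total += excess_gib * price_per_gib_month
    return total


volumes = [
    Volume("vol-a", provisioned_gib=500, used_gib=100),  # heavily overprovisioned
    Volume("vol-b", provisioned_gib=100, used_gib=90),   # sized about right
]
wasted = monthly_overprovision_cost(volumes, price_per_gib_month=0.08)
```

Volumes with a large excess are the candidates worth the migration effort described above.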

Move cold data to cold storage #

A common anti-pattern: cold data is placed in storage designed for data that is accessed frequently (multiple times a day). This is typically fine to start with, but it gets expensive relatively quickly as the data grows.

AWS’s file and object storage services offer reduced rates for data that’s infrequently accessed, e.g. S3 Intelligent-Tiering, S3 Standard-IA (Infrequent Access), S3 Glacier, and EFS IA.
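For S3, moving data to colder tiers can be automated with a lifecycle rule. The sketch below builds a lifecycle configuration in the shape that boto3's `put_bucket_lifecycle_configuration` expects. The helper name, rule ID, prefix, and day thresholds are assumptions to adapt to your own access patterns.

```python
def cold_data_lifecycle(prefix, ia_after_days=30, glacier_after_days=90):
    """Build an S3 lifecycle configuration that tiers objects down as they age."""
    if glacier_after_days <= ia_after_days:
        raise ValueError("glacier_after_days must be greater than ia_after_days")
    return {
        "Rules": [
            {
                "ID": "cold-data-tiering",
                "Filter": {"Prefix": prefix},  # only objects under this prefix
                "Status": "Enabled",
                "Transitions": [
                    # First move to Standard-IA, then on to Glacier.
                    {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
                    {"Days": glacier_after_days, "StorageClass": "GLACIER"},
                ],
            }
        ]
    }


config = cold_data_lifecycle("logs/")
# To apply it (requires boto3 and credentials; bucket name is a placeholder):
# boto3.client("s3").put_bucket_lifecycle_configuration(
#     Bucket="my-bucket", LifecycleConfiguration=config)
```

Keep in mind that the colder tiers typically charge for retrieval and have minimum storage durations, so this suits data that really is cold.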

Cost Awareness #

While not strictly a cost saving technique, cost awareness is a significant win that cloud providers enable, although it requires a fair bit of effort and even some culture change. Done well, teams have enough information to factor cost into their system designs, and the business gains a better understanding of its unit costs (e.g. ‘cost per search’ for an e-commerce site), which lets it direct its investments more efficiently.

The main tools for this are:

  • tagging
  • cost allocation / budgeting
  • familiarity with cost-related tools on AWS (e.g. the bill, AWS Cost Explorer, custom reports / dashboards)
  • teams having accountability for their spending
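As a small illustration of how tagging feeds into allocation and unit costs, the sketch below groups billing line items by a tag and derives a cost per unit of business activity. The line-item shape, the `team` tag, and all the numbers are made up for the example. In practice the data would come from a Cost Explorer export or the Cost and Usage Report.

```python
from collections import defaultdict


def cost_by_tag(line_items, tag_key, untagged="(untagged)"):
    """Sum cost per value of the given tag; items without the tag are grouped together."""
    totals = defaultdict(float)
    for item in line_items:
        owner = item.get("tags", {}).get(tag_key, untagged)
        totals[owner] += item["cost"]
    return dict(totals)


def unit_cost(total_cost, units):
    """Cost per business unit, e.g. dollars per search."""
    if units <= 0:
        raise ValueError("units must be positive")
    return total_cost / units


line_items = [
    {"service": "EC2", "cost": 900.0, "tags": {"team": "search"}},
    {"service": "S3", "cost": 300.0, "tags": {"team": "search"}},
    {"service": "EC2", "cost": 500.0, "tags": {"team": "checkout"}},
    {"service": "EBS", "cost": 120.0, "tags": {}},
]
per_team = cost_by_tag(line_items, "team")
cost_per_search = unit_cost(per_team["search"], units=4_000_000)
```

The size of the untagged bucket is itself a useful signal: the larger it is, the less of the bill any team can be held accountable for.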