Cost Optimisation tips for AWS

Cost Optimisation is one of the AWS Well-Architected Framework Pillars, and AC3 have been applying this discipline to our AWS accounts.

Where to start

Cost Explorer in the AWS console will give you a fair bit of detail, but you will need to dig around to find out what you need to know.

Select a few weeks of data and the daily view, then use the Group by feature to break the costs down by service. This will show you the AWS services that are costing you the most and start you off in the right direction.

AWS Cost Explorer, grouping by Service
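
If you would rather pull the same breakdown with a script, here is a minimal sketch using the Cost Explorer API via boto3. The function names (totals_by_group, fetch_daily_costs) and the 21-day default are my own choices for illustration, and it assumes your credentials are allowed to call ce:GetCostAndUsage. Note that AWS charges a small fee per Cost Explorer API request, so don't run it in a tight loop.

```python
from collections import defaultdict
from datetime import date, timedelta


def totals_by_group(results_by_time):
    """Sum the UnblendedCost of each group across all returned days."""
    totals = defaultdict(float)
    for day in results_by_time:
        for group in day.get("Groups", []):
            key = group["Keys"][0]
            totals[key] += float(group["Metrics"]["UnblendedCost"]["Amount"])
    # Most expensive first
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


def fetch_daily_costs(days=21, group_key="SERVICE"):
    """Fetch daily costs grouped by one dimension, e.g. SERVICE or USAGE_TYPE.

    Sketch only: pagination (NextPageToken) is ignored for brevity.
    """
    import boto3  # imported here so totals_by_group works without AWS access

    end = date.today()
    start = end - timedelta(days=days)
    ce = boto3.client("ce")
    response = ce.get_cost_and_usage(
        TimePeriod={"Start": start.isoformat(), "End": end.isoformat()},
        Granularity="DAILY",
        Metrics=["UnblendedCost"],
        GroupBy=[{"Type": "DIMENSION", "Key": group_key}],
    )
    return response["ResultsByTime"]
```

Calling totals_by_group(fetch_daily_costs()) should give you the services sorted by total cost; pass group_key="USAGE_TYPE" for the Usage Type view.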

Hopefully most of these services will not be a surprise to you, but you may wonder about the EC2-Other line item. This grouping typically covers EC2-related charges such as EBS volumes and snapshots, NAT Gateways, data transfer and idle Elastic IPs (Load Balancers are reported separately, under Elastic Load Balancing). To check what’s in the total, filter your report to the EC2-Other service category and group by Usage Type. The “Others” category can sometimes also be significant, so scroll through the tabular view to see what’s contributing to it.

Grouping your usage by Usage Type in Cost Explorer can help you identify costs being generated across all the AWS regions you have resources in. Here you can see we had forgotten to release some unassociated Elastic IPs in the EU (Ireland) and us-west-2 regions.

The Usage Type grouping is handy for finding things in other regions.

Dates in Cost Explorer

Note that Cost Explorer (like AWS billing) works in UTC, so depending on where you are, the dates on the graphs will not line up with your local timezone.
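
To see what a UTC billing day looks like on your own clock, here is a small sketch. The fixed UTC+10 offset and the example date are assumptions for illustration only; swap in your own timezone.

```python
from datetime import date, datetime, time, timedelta, timezone


def utc_day_in_local_time(billing_day, local_tz):
    """Return the local start and end of one UTC billing day."""
    start_utc = datetime.combine(billing_day, time.min, tzinfo=timezone.utc)
    end_utc = start_utc + timedelta(days=1)
    return start_utc.astimezone(local_tz), end_utc.astimezone(local_tz)


# Assumption for illustration: a fixed UTC+10 offset with no daylight saving
local = timezone(timedelta(hours=10))
start, end = utc_day_in_local_time(date(2024, 3, 1), local)
print(start.isoformat(), "to", end.isoformat())
# prints: 2024-03-01T10:00:00+10:00 to 2024-03-02T10:00:00+10:00
```

In other words, anything you switch off before 10am local time in this example still lands in the previous day's bar on the graph.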

Also note that billing data is not real time. It is only refreshed periodically (AWS documents at least once every 24 hours for Cost Explorer), so you may have to wait a while to see the cost reductions you achieve appearing in the console.

Sneaky Services

Here are some services that might be adding to your bill without you noticing:

Unassociated Elastic IPs
Features you aren’t using (like RDS Performance Insights)
Unused KMS keys
CloudWatch logging you aren’t making use of
Unnecessary provisioned concurrency in Lambda

Many of these things are individually inexpensive and easy to miss, but they all add up! Sometimes the only way to know if they are costing you money is to go look — a clue might be an RDS instance that is costing you more than you think it should.
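
Taking the first item on that list as an example, here is a rough sketch of how you might go looking for unassociated Elastic IPs with boto3. The function names are hypothetical, and it assumes VPC addresses, where an address that is attached to something carries an AssociationId in the describe_addresses response.

```python
def unassociated_addresses(addresses):
    """Keep only the Elastic IPs that are not attached to anything."""
    return [a for a in addresses if "AssociationId" not in a]


def find_unassociated_eips(regions):
    """Check each region in turn and return {region: [public IPs]}."""
    import boto3  # imported here so the filter above works without AWS access

    found = {}
    for region in regions:
        ec2 = boto3.client("ec2", region_name=region)
        idle = unassociated_addresses(ec2.describe_addresses()["Addresses"])
        if idle:
            found[region] = [a["PublicIp"] for a in idle]
    return found
```

Something like find_unassociated_eips(["eu-west-1", "us-west-2"]) would cover the two regions that caught us out above.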

Do you really need those NAT Gateways?

NAT Gateways and Elastic Load Balancers are very easy to create and then forget about. By default, Control Tower will create NAT Gateways in every AZ in every region you set up — ouch!

Evaluate whether you are actually using the NAT Gateways you find (e.g. check their CloudWatch metrics to see if they are routing any traffic) and delete them (and release their Elastic IPs) if not. If your workload is serverless or you’re not using private subnets, you may be paying for nothing.
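
One way to do that metrics check from a script is sketched below, using the BytesOutToDestination metric in the AWS/NATGateway CloudWatch namespace. The function names and the 14-day window are my own assumptions; a total of zero bytes over the window suggests the gateway is a candidate for deletion, but check with whoever owns the VPC first.

```python
from datetime import datetime, timedelta, timezone


def total_bytes(datapoints):
    """Add up the Sum statistic across all returned datapoints."""
    return sum(point["Sum"] for point in datapoints)


def nat_gateway_bytes_out(nat_gateway_id, region, days=14):
    """Total bytes a NAT Gateway sent out over the last `days` days."""
    import boto3  # imported here so total_bytes works without AWS access

    cloudwatch = boto3.client("cloudwatch", region_name=region)
    end = datetime.now(timezone.utc)
    response = cloudwatch.get_metric_statistics(
        Namespace="AWS/NATGateway",
        MetricName="BytesOutToDestination",
        Dimensions=[{"Name": "NatGatewayId", "Value": nat_gateway_id}],
        StartTime=end - timedelta(days=days),
        EndTime=end,
        Period=86400,  # one datapoint per day
        Statistics=["Sum"],
    )
    return total_bytes(response["Datapoints"])
```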

If you need NAT but it doesn’t need to be highly available (HA), you can route traffic from all your AZs through a single NAT Gateway and delete the others to save some money.

With Load Balancers, consider whether you can consolidate them (e.g. Elastic Beanstalk can now share an Application Load Balancer between multiple environments), or whether you really need HA for your workload. Some might decide they would rather keep the USD20/month and accept a small outage if their instance dies (obviously you would choose differently if this were an important production workload).

Evaluate your Purchase Options

If you are seeing stable usage of on-demand resources that you intend to keep running, you can switch to discounted Reserved Instances (RIs). This is a great way to save up to 72% compared with on-demand pricing, and RIs are available for services including EC2 and RDS.
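
The reason stable usage matters is that you pay for an RI whether or not the instance is running. A quick back-of-the-envelope sketch of that trade-off is below. The hourly rates in the example are made up for illustration and are not real AWS prices, and reserved_hourly is assumed to be the effective hourly cost of the RI over its term.

```python
HOURS_IN_YEAR = 8760


def break_even_utilisation(on_demand_hourly, reserved_hourly):
    """Fraction of the term the instance must run for the RI to be cheaper."""
    return reserved_hourly / on_demand_hourly


def annual_saving(on_demand_hourly, reserved_hourly, hours_running):
    """On-demand cost for the hours actually used, minus the RI cost for the full year."""
    on_demand_cost = on_demand_hourly * hours_running
    reserved_cost = reserved_hourly * HOURS_IN_YEAR
    return on_demand_cost - reserved_cost


# Hypothetical rates: USD0.10/hour on demand, USD0.06/hour effective RI rate
print(round(break_even_utilisation(0.10, 0.06), 2))
print(round(annual_saving(0.10, 0.06, HOURS_IN_YEAR), 2))       # running all year
print(round(annual_saving(0.10, 0.06, HOURS_IN_YEAR // 2), 2))  # running half the year
```

With these made-up numbers the RI only pays off if the instance runs more than about 60% of the time; at half-time usage the saving goes negative.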

Even though you can buy RIs for a 3-year term, we recommend only committing for 1 year: things change too fast in AWS and you don’t want to be locked into particular instance families for too long. A case in point is the new Graviton2 instances. They aren’t available in every region yet, but when they are we will switch some workloads to them instead of locking in current-generation instances.

To get up to 90% off EC2 on-demand running costs, you can switch to Spot Instances, which run on spare EC2 capacity. In theory, your Spot Instance could be reclaimed at any time, but if you take a look at the Spot Instance Advisor for the region you’re using, you’ll likely see plenty of instance types that are unlikely to be interrupted.
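
If you launch instances directly with boto3, requesting Spot capacity is mostly a matter of adding InstanceMarketOptions to the run_instances call. The sketch below is one possible way to do it, not how our own instances are set up; the helper names are hypothetical, and the image ID and instance type are whatever you would normally pass. Be aware that calling launch_spot_instance starts a real, billable instance.

```python
def spot_launch_params(image_id, instance_type):
    """Build run_instances arguments for a single one-time Spot Instance."""
    return {
        "ImageId": image_id,
        "InstanceType": instance_type,
        "MinCount": 1,
        "MaxCount": 1,
        "InstanceMarketOptions": {
            "MarketType": "spot",
            "SpotOptions": {"SpotInstanceType": "one-time"},
        },
    }


def launch_spot_instance(image_id, instance_type, region):
    """Launch the instance. This creates a real, billable resource."""
    import boto3  # imported here so spot_launch_params works without AWS access

    ec2 = boto3.client("ec2", region_name=region)
    return ec2.run_instances(**spot_launch_params(image_id, instance_type))
```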

Unless you are running a workload that really cannot tolerate any interruption, the savings you can get from Spot are well worth considering. In our case, we have Spot Instances that have been running for 7 months now without interruption.

Finally, if you have a mix of EC2 and Lambda and maybe ECS Fargate, Savings Plans are a great way to get discounts across these services in return for a committed hourly spend, without being constrained to a single instance family.

Cleaning up

My clean-up saved me about USD180 per month. Not too shabby!

To achieve that, I released unassociated Elastic IPs, engineered a Load Balancer out of my architecture, got rid of a NAT Gateway, removed some unused KMS keys and switched to an RI for one of my RDS instances.

Have we inspired you to do a round of optimising? Do you have any tips for optimising costs on AWS? Let me know how much you end up saving.