One of our customers recently asked us to help with an AWS cloud cost issue. Their overall AWS bill had been exploding over the previous few weeks, with no obvious explanation for the drastic increase.
When we started to take a look, we quickly realized what the customer was referring to: the AWS cost had increased 5x during August and 10x in September, and October was already on track to top that again.
Finding the root cause
We started to dig a bit deeper into the cost structure with the help of the AWS Cost Explorer and identified that the majority of the cost increase was caused by the Usage Type DataTransferOut within Amazon S3. That was a good start, but unfortunately that’s as far as we could go with the Cost Explorer alone. At that point in time, none of the customer’s S3 buckets had any tags associated with them. Furthermore, grouping and filtering by something like the bucket name wasn’t possible.
The AWS Cost and Usage Report is the most comprehensive set of AWS cost and usage data available, so it was the next thing we looked at. We downloaded a Usage Report for S3 and got a detailed CSV file containing a line item for each Usage Type and Operation at hourly granularity for the previous billing period.
The CSV file wasn’t that big, so we could load it into Excel and do some general filtering and aggregation. That gave us a pretty good overview, and we quickly identified one bucket that contributed by far the most (around 97%) to the overall DataTransfer-Out-Bytes usage.
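The same aggregation can also be scripted instead of done in Excel. The following is a minimal Python sketch, not the analysis we actually ran. It assumes the usage report CSV has the columns UsageType, Resource (the bucket name) and UsageValue, and that usage types may carry a region prefix; the function name is hypothetical.

```python
import csv
import io
from collections import defaultdict


def transfer_out_share_by_bucket(report_csv, usage_type="DataTransfer-Out-Bytes"):
    """Return {bucket: share of total transfer-out bytes}, largest first.

    Assumed columns: UsageType, Resource (bucket name), UsageValue (bytes).
    """
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(report_csv)):
        row_type = row["UsageType"].strip()
        # Accept the bare usage type or a region-prefixed one, e.g. "EU-<type>".
        if row_type == usage_type or row_type.endswith("-" + usage_type):
            totals[row["Resource"].strip()] += float(row["UsageValue"])

    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return {bucket: value / grand_total for bucket, value in ranked}
```

Calling it with the text of the downloaded file, for example `transfer_out_share_by_bucket(open("report.csv").read())`, yields the buckets ordered by their share of outbound transfer, so a single dominant bucket shows up as the first entry.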
Equipped with our findings, we went back to the customer and explained that this specific bucket must be responsible for the massive increase in their cost. It took the customer only a few hours to figure out which part of their business logic was dealing with that bucket. This helped them discover an unexpected and previously undetected issue in the application code. A bugfix release was quickly on its way into production.
Implementing a Tagging Strategy
In the meantime, we took the chance to think about how we could improve the current situation and increase transparency and visibility of the AWS Cost Structure for the customer.
The most obvious thing was to develop and implement a tagging strategy for all S3 buckets. We came up with a straightforward one to start with:
Adding the bucket name as a tag to each bucket via the AWS Management Console doesn’t scale. So, we wrote a few lines of shell wrapper around the AWS CLI to do this job:
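The idea of tagging every bucket with its own name can be sketched in Python as follows. This is an illustration rather than the shell wrapper itself. It assumes the tag key is `name` and that `s3_client` behaves like a boto3 S3 client, offering `list_buckets` and `put_bucket_tagging`; the function names are hypothetical.

```python
def name_tag_set(bucket_name, tag_key="name"):
    """Build the TagSet that tags a bucket with its own name."""
    return [{"Key": tag_key, "Value": bucket_name}]


def tag_buckets_with_name(s3_client, tag_key="name"):
    """Tag every bucket with its own name and return the tagged bucket names.

    `s3_client` is assumed to be boto3-compatible, e.g. boto3.client("s3").
    """
    tagged = []
    for bucket in s3_client.list_buckets()["Buckets"]:
        name = bucket["Name"]
        # Caution: put_bucket_tagging replaces the bucket's entire tag set.
        s3_client.put_bucket_tagging(
            Bucket=name,
            Tagging={"TagSet": name_tag_set(name, tag_key)},
        )
        tagged.append(name)
    return tagged
```

Passing the client in as a parameter keeps the function easy to try against a stub before pointing it at a real account.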
This solution has one downside, though: whenever it is executed, it overwrites all existing tags on each bucket. One way to handle that is to run it only before adding any other tags. Another is to extend the script with a further AWS CLI command that fetches the existing tags from each bucket and adds them to the TagSet list before invoking the final put-bucket-tagging call.
Adding the same Tag Key:Value pair to a group of S3 buckets in bulk can easily be done with the Tag Editor in the AWS Management Console. With the Tag Editor, you first search and filter for the resources you want to tag and then select from the results the ones you want to tag in bulk:
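The fetch-and-merge approach can be sketched in Python like this. It is a sketch under assumptions, not the extended script: `s3_client` is assumed to be boto3-compatible (`get_bucket_tagging`, `put_bucket_tagging`), an untagged bucket is assumed to raise an error carrying the code `NoSuchTagSet`, and all function names are hypothetical.

```python
def merge_tag_sets(existing, new):
    """Merge two TagSet lists; tags in `new` win when keys collide."""
    merged = {tag["Key"]: tag["Value"] for tag in existing}
    merged.update({tag["Key"]: tag["Value"] for tag in new})
    return [{"Key": key, "Value": value} for key, value in merged.items()]


def existing_tag_set(s3_client, bucket_name):
    """Return the bucket's current TagSet, or [] if it has no tags."""
    try:
        return s3_client.get_bucket_tagging(Bucket=bucket_name)["TagSet"]
    except Exception as exc:
        # Assumption: untagged buckets raise an error whose response
        # carries the code "NoSuchTagSet" (as botocore's ClientError does).
        code = getattr(exc, "response", {}).get("Error", {}).get("Code")
        if code == "NoSuchTagSet":
            return []
        raise


def add_tags_preserving_existing(s3_client, bucket_name, new_tags):
    """Write `new_tags` to the bucket without dropping its existing tags."""
    tag_set = merge_tag_sets(existing_tag_set(s3_client, bucket_name), new_tags)
    s3_client.put_bucket_tagging(Bucket=bucket_name, Tagging={"TagSet": tag_set})
    return tag_set
```

Letting the new tags win on a key collision means a rerun refreshes the `name` tag while leaving every other tag untouched.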
The next step was to activate these tags as Cost Allocation Tags. You have to wait roughly one day before newly created tags become available and can be activated in the Billing Dashboard’s Cost Allocation Tags section.
From now on, all cost data will be enriched by the Cost Allocation Tags and can be used within the AWS Cost Explorer after another day of waiting. The picture below shows the Daily Cost grouped by Tag:name and a cost breakdown per bucket:
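The same per-bucket breakdown can also be pulled programmatically. The sketch below is a hedged illustration, not something we ran for the customer. It assumes `ce_client` behaves like a boto3 Cost Explorer client (`boto3.client("ce")`) with `get_cost_and_usage`, that the activated tag key is `name`, and that group keys come back in the form `name$<value>`. It ignores result pagination and does not filter by service; the function name is hypothetical.

```python
def daily_cost_by_name_tag(ce_client, start, end, tag_key="name"):
    """Return {date: {tag value: cost}} for start <= date < end.

    `start` and `end` are "YYYY-MM-DD" strings; `end` is exclusive.
    """
    response = ce_client.get_cost_and_usage(
        TimePeriod={"Start": start, "End": end},
        Granularity="DAILY",
        Metrics=["UnblendedCost"],
        GroupBy=[{"Type": "TAG", "Key": tag_key}],
    )
    costs = {}
    for day in response["ResultsByTime"]:
        per_tag = {}
        for group in day["Groups"]:
            # Assumed key format: "name$my-bucket"; untagged cost is "name$".
            value = group["Keys"][0].split("$", 1)[-1] or "(untagged)"
            per_tag[value] = float(group["Metrics"]["UnblendedCost"]["Amount"])
        costs[day["TimePeriod"]["Start"]] = per_tag
    return costs
```

The result is a plain dictionary keyed by day, which is easy to feed into a report or an alert once a bucket's daily cost crosses a threshold.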
With detailed information about which bucket was contributing how much to the cost increase, our customer quickly identified a previously undiscovered and unexpected behavior in their application code and immediately implemented a fix for it. As a result, cost savings of several hundred dollars per day were realized practically overnight.
The new tagging strategy and Cost Allocation Tags enable quick and direct cost analyses down to the bucket name via the AWS Cost Explorer and Cost and Usage Reports. This is a significant improvement for the customer in terms of cost transparency and will make future issues directly visible.