Savings Plans vs Reserved Instances
Back in November 2019, AWS introduced Savings Plans, and with them came a belief that Reserved Instances were dead.
People thought that the simplicity and flexibility of some types of Savings Plans meant Reserved Instances wouldn’t have a place anymore. That’s just not true. While Standard Reserved Instances have largely been superseded, there are still a very small number of cases where they make sense. Convertible Reserved Instances, on the other hand, have a really important place in any cost optimization strategy.
Standard Reserved Instances vs Convertible Reserved Instances
There are some significant differences between Standard and Convertible Reserved Instances. When you buy a Standard Reserved Instance it’s a commitment to run that particular instance type in that location for a certain period of time. With Convertible Reserved Instances, you’re able to change the reservation to match different instance type families.
Most organisations have a legacy installed base of both Standard Reserved Instances and Convertible Reserved Instances. Managing that legacy base is really important. To do this it’s critical to track your Reserved Instances, and make sure that they’re currently delivering the savings that they were originally bought for, while also ensuring that they’re matched against actual workloads.
You can track your reservations using AWS Cost Explorer, cloud management tools like Cloudability and CloudHealth, and a whole bunch of other tools. The important thing is that you have a process in place and that you check your reservations regularly. It’s also important that during that tracking process you identify individual reservations that aren’t matched to a workload anymore. These tools will point those out to you so you can take action.
Coverage & Utilization Metrics
Across your whole estate of instances and reservations, you’ll usually track two metrics. The first is coverage, which is the percentage of instances that are matched to reservations. If your coverage is 90%, for example, then 90% of your instances are allocated to reservations and 10% are running on-demand. That 10% costs you a lot more per hour, but pushing coverage higher isn’t automatically better, because there’s risk involved. If you set your coverage at 100% and your usage dips, you’ll have reservations that are no longer matched to actual workloads, and you’ll be paying for something that you’re not using. Because of this, most people run at a coverage level well below 100%.
Where you set your coverage level depends on your ability to manage it and on how unpredictable the workloads are. Coverage of 100% also reduces your flexibility. If you did a rightsizing exercise, or otherwise became more efficient and reduced your workloads, you’d still be paying for the reservations that covered them, so the effort wouldn’t save you anything. So it’s very rare that you would want your coverage up at 100%, unless you have some automation in place or another strategy to give you extra flexibility.
The other metric, in addition to coverage, is utilization rate, which is the percentage of reservations that are matched to instances. You always want that to be 100%. You never want reservations that are unmatched, because again you’re paying for something that you’re not actually using, which is just raw overhead.
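To make the two metrics concrete, here is a minimal sketch in Python. It assumes a deliberately simplified estate where every instance hour and every reserved hour is of the same type, so matching is just the smaller of the two numbers. The function name and the figures are illustrative only, not something taken from Cost Explorer or any other tool.

```python
def coverage_and_utilization(running_hours, reserved_hours):
    """Return (coverage, utilization) as fractions between 0 and 1.

    running_hours:  instance hours actually run in the period
    reserved_hours: instance hours paid for through reservations
    Simplifying assumption: all hours are interchangeable (same type, same region).
    """
    matched = min(running_hours, reserved_hours)
    coverage = matched / running_hours if running_hours else 0.0
    utilization = matched / reserved_hours if reserved_hours else 0.0
    return coverage, utilization


# 1,000 hours running against 900 reserved hours:
# 90% covered, and every reservation is in use.
print(coverage_and_utilization(1000, 900))

# Usage dips to 800 hours: coverage reaches 100%,
# but some reserved hours are now paid for and unused.
print(coverage_and_utilization(800, 900))
```

The second call shows the risk described above: once usage drops below what you have reserved, coverage looks perfect while utilization falls below 100%.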
Managing Your Reserved Instance Portfolio
The next thing to talk about is how you take action on the reporting and analysis you’re getting. There’s no point in knowing you’ve got reservations that are not being used if you don’t do something about it. And the way that you act is different, depending on whether you’ve got Standard Reserved Instances, Convertible Reserved Instances, or a mix.
Bear in mind that Reserved Instances are confined to regions, so you’ve got no flexibility to move reservations between regions. For example, you might be overcovered in one region, with reservations going unused, while in another region you’re not completely covered. Unfortunately, you cannot move reservations between those regions, so you need to be specific when you purchase the reservations in the first place.
Standard Reserved Instances
In terms of the actions that you take with Standard Reserved Instances, you’ve got to match the workload to the instance. That means if you purchased standard reservations for m5.large instances, and your team has decided that they want to use c5.large instead, then you are paying a lot of extra money because the reservation that you’ve purchased for the m5 is never going to get used. In that situation, typically, you’d move the c5 back to an m5, even changing the instance size if necessary just to make it work. This is one of the reasons why standard reservations are very inflexible.
You’ve got more flexibility with Linux because you’ve got instance size flexibility. For example, one m5.xlarge reservation would allow you to run two m5.large instances against that same reservation. That’s not the case with Windows, however. So standard reservations are very difficult to take action on, as you’ve got to match them.
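The m5.xlarge example can be worked through with the normalization factors AWS publishes for instance size flexibility. The sketch below is only an illustration: the factor table is our understanding of the AWS documentation (check it before relying on it), and the helper name `instances_covered` is made up for this example.

```python
# Normalization factors per instance size, as we understand the AWS
# documentation on instance size flexibility (verify before relying on them).
NORMALIZATION_FACTOR = {
    "nano": 0.25,
    "micro": 0.5,
    "small": 1,
    "medium": 2,
    "large": 4,
    "xlarge": 8,
    "2xlarge": 16,
}


def instances_covered(reserved_size, running_size):
    """How many running instances of one size a single reservation covers.

    Hypothetical helper: assumes both sizes are in the same instance
    family and that size flexibility applies to the reservation.
    """
    return NORMALIZATION_FACTOR[reserved_size] / NORMALIZATION_FACTOR[running_size]


# One m5.xlarge reservation (factor 8) against m5.large instances (factor 4).
print(instances_covered("xlarge", "large"))  # 2.0
```

The same arithmetic runs the other way: a single large reservation covers only half of an xlarge instance, with the other half billed on-demand.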
Convertible Reserved Instances
For Convertible Reserved Instances it’s a whole different story. You get a lot more flexibility because you can change the instance family, as well as the instance size, to match the workload. This is a manual process as far as AWS are concerned, but you do have full flexibility to do it. You have to be careful how you do so, because there are issues around making changes: AWS never allow you to reduce your total commitment level. This means that if you’re paying $100 for a reservation and you want to change it to a reservation that’s worth $101, you can’t just pay the extra $1. You actually end up getting two reservations, and paying $200. The way to avoid that is to break your convertible reservations down into very small atomic units, like .nano sizes, and then convert those up to the instance type that you want, using the small size as an intermediary. That way you avoid creating this overhang problem and increasing your commitment with AWS.
The other thing that you need to do is make sure you are regularly checking your Convertible Reserved Instances to see whether you need to make these conversions. One of the big advantages is that because of this flexibility, your ops team or your dev team are free to work without any constraints in terms of the instance families which they are using.
You may find these instances change regularly. How that works depends upon the platform which you’re running the workload on, but you do need to check regularly and make those changes as quickly as possible. If the instance type changes and the reservation is not converted, then the entire period between when it changes and when it’s converted is wasted money, because you’re paying for the reservation and you’re also paying for the new instance at on-demand rates. That’s a big cost if you don’t check regularly.
The other big advantage of using Convertible Reserved Instances over Standard Reserved Instances is that you can term optimize them. You can be very clever about how you make these conversions, and you can therefore squash down your commitment over a longer period of time or reduce the commitment to a short period of time to cope with peaks and troughs in usage. That’s a very powerful feature of Convertible Reserved Instances, and that’s the main reason why they still have a strong place in any discount strategy.
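To put a rough number on the cost of a late conversion described above, here is a small Python sketch. The hourly rates are hypothetical, and it assumes the converted reservation would have cost the same per hour as the unconverted one, so the avoidable part is simply the on-demand charge paid during the delay.

```python
def delayed_conversion_cost(reserved_rate, on_demand_rate, hours_unconverted):
    """Return (total_paid, avoidable) for the period before conversion.

    reserved_rate:     hourly cost of the reservation that no longer matches
    on_demand_rate:    hourly on-demand cost of the new instance type
    hours_unconverted: hours between the instance change and the conversion
    Assumption: a prompt conversion would have kept the same reserved_rate.
    """
    total_paid = (reserved_rate + on_demand_rate) * hours_unconverted
    avoidable = on_demand_rate * hours_unconverted
    return total_paid, avoidable


# Hypothetical rates: $0.25/hour reserved, $0.50/hour on-demand, 48 hours late.
total_paid, avoidable = delayed_conversion_cost(0.25, 0.5, 48)
print(total_paid, avoidable)  # 36.0 24.0
```

Multiply that across every instance that changed type and every day it goes unnoticed, and the case for checking regularly makes itself.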
One of the problems with all of this is that it’s a manual process and it can take a lot of time and energy, and never be 100% efficient if you’re handling it manually. So the automation of both the tracking of these Reserved Instances, and the action process behind it, is highly beneficial.
All of this means that the original question, ‘how do you check your Reserved Instance purchase status?’, doesn’t actually ever come into play, because an automated system will constantly check for you. It also means that the system can spot changes and make the conversions efficiently, using the method described above of converting through smaller reservations to reduce the overhang problem. Those automation tools can do all of that in real-time to maximize the savings that you can make, and they will also automate the purchasing of new reservations, according to the strategy that you set. So from a term optimization point of view, it can be optimized to the nth degree. We use our partner ProsperOps to do this automation, and we wouldn’t dream of doing this on a manual basis, unless it was a very small estate or an incredibly stable workload.