The move to the cloud has enabled tech leaders to modernize their infrastructure and improve application availability, scalability, and performance. However, if they plan to continue getting approvals from the CFO for additional cloud spend, they must bring their expenses under control. That’s easier said than done, as optimizing cloud costs can be a challenge, requiring not only identifying sources of overspending but also gaining the support to implement a sustainable cloud cost management strategy across the organization.
As the world edges toward a possible global recession, with the tech industry in particular feeling the heat from layoffs and financial restructuring, cloud cost optimization must become a top priority.
In this definitive guide for any tech leaders looking to gain control of their cloud bill during 2023, we demonstrate several ways your organization can achieve quick wins to reduce cloud spend as well as develop long-term strategies to achieve ongoing cost reductions.
Cloud Cost Optimization Definition
Before we get into how organizations can keep cloud spend at bay, let’s first provide a cloud cost optimization definition. Simply put, cloud cost optimization means designing and managing cloud architecture as efficiently as possible. It involves selecting cloud providers and service offerings, and managing cloud infrastructure, so that costs are reduced while the architecture still meets the needs of cloud-hosted applications.
Best Practices for Cloud Cost Optimization
The reality is that companies are overspending on their cloud deployments. Underutilized and idle resources, poor application optimization, and cloud mismanagement can all drive up the cost of the cloud. By developing a cloud cost management strategy and implementing it across the organization, companies can dramatically reduce their cloud costs.
Here are the top ten best practices for optimizing the costs of the cloud.
#1. Purchase Instances using Savings Plans & Reserved Instances
Cloud service providers offer a range of cost-cutting options for corporate resources. Often, these are attempts by the provider to ensure that their resources are fully utilized at any point in time.
With Reserved Instances, an organization commits to using a particular resource for an extended period, typically one or three years, in exchange for a reduced rate. Savings Plans follow the same idea but commit the organization to a level of spend rather than to a specific resource. Choosing Reserved Instances over On Demand ones can cut costs by up to 70%. As a result, Reserved Instances are a great way to cut costs for an organization’s baseline resource requirements, i.e. the minimum that will be used at any point in time.
To take full advantage of Reserved Instances, an organization needs a clear understanding of its baseline cloud resource needs. Visual tools (see #4) can help to identify usage patterns and determine the organization’s minimum, sustained level of resource utilization. Committing to Reserved Instances for this minimum utilization reduces cloud costs for baseline usage, while On Demand or Spot Instances can be used to meet surges in demand.
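As a rough illustration of sizing a commitment to baseline usage, the sketch below takes a low percentile of hourly instance counts as the baseline and compares the blended hourly cost with and without a reservation. The function names, the percentile choice, and the rates are assumptions made for illustration, not provider pricing.

```python
from math import floor


def baseline_instances(hourly_counts, percentile=5):
    """Return a low-percentile instance count to use as the committed baseline."""
    if not hourly_counts:
        raise ValueError("hourly_counts must not be empty")
    ordered = sorted(hourly_counts)
    index = floor(len(ordered) * percentile / 100)
    return ordered[min(index, len(ordered) - 1)]


def blended_hourly_cost(hourly_counts, reserved, on_demand_rate, reserved_rate):
    """Average hourly cost when `reserved` instances are committed and bursts run On Demand."""
    total = 0.0
    for count in hourly_counts:
        burst = max(count - reserved, 0)  # demand above the commitment
        # Reserved capacity is paid for every hour, whether it is used or not.
        total += reserved * reserved_rate + burst * on_demand_rate
    return total / len(hourly_counts)


# Hypothetical instance counts observed over eight hours.
usage = [4, 4, 5, 6, 10, 12, 8, 4]
baseline = baseline_instances(usage)
with_reservation = blended_hourly_cost(usage, baseline, on_demand_rate=1.0, reserved_rate=0.5)
all_on_demand = blended_hourly_cost(usage, 0, on_demand_rate=1.0, reserved_rate=0.5)
```

Committing above the baseline would mean paying for reserved capacity during hours when it sits idle, which is why the sketch uses a low percentile rather than the average.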
#2. Locate and Remove Unused or Inactive Resources
Companies may have inactive or unused cloud resources for a variety of reasons. For example, a cloud instance spun up to test software during the development process may not have been deprovisioned when testing was complete, leaving it up and running. Alternatively, the storage associated with a terminated instance may not have been properly removed. Cloud service providers bill for every resource that an organization has provisioned, whether or not it is actively used. As a result, these oversights lead to inflated cloud bills for resources that are no longer in use.
To reduce waste due to inactive and unused resources, an organization should regularly scan for unused or inactive resources. Any that are discovered should be promptly deactivated to eliminate further costs. Ideally, this process of managing unused resources should be automated. This can include code to automatically deprovision resources at the end of testing processes or the use of tools that automatically identify inactive resources and flag them for deprovisioning. For example, automated tools can search for unused Elastic IP addresses in AWS for deprovisioning or ensure that storage attached to deleted instances is properly removed as well.
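A minimal sketch of such a scan, for storage left behind by deleted instances, might look like the following. It assumes a dictionary shaped like the response of the boto3 EC2 `describe_volumes` call (a `Volumes` list whose entries carry `VolumeId`, `State`, and `Attachments`); fetching the data and deleting anything are deliberately left out, so the function only flags candidates for review.

```python
def find_unattached_volumes(describe_volumes_response):
    """Return the IDs of volumes that are not attached to any instance."""
    unattached = []
    for volume in describe_volumes_response.get("Volumes", []):
        # A volume in the "available" state with no attachments is not in use.
        if volume.get("State") == "available" and not volume.get("Attachments"):
            unattached.append(volume["VolumeId"])
    return unattached


# Hypothetical sample data in the assumed response shape.
sample = {
    "Volumes": [
        {"VolumeId": "vol-aaa", "State": "in-use",
         "Attachments": [{"InstanceId": "i-123"}]},
        {"VolumeId": "vol-bbb", "State": "available", "Attachments": []},
    ]
}
candidates = find_unattached_volumes(sample)
```

Flagging rather than deleting keeps a human or a separate approval step in the loop, which is usually safer for storage that may still hold needed data.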
#3. Identify Underused Resources (and Act)
Unused and inactive resources can be a source of wasted cloud spending, but they’re not the only one. Active resources can also be a source of waste if they are underused or idle. For example, a cloud instance may offer a maximum amount of CPU and memory for a flat fee. If an organization uses only a fraction of these resources while paying the flat fee, then it is wasting money: a cloud instance with 10% CPU utilization is wasting 90% of its available CPU.
Identifying and consolidating underutilized cloud resources can help an organization to reduce its number of cloud instances and its cloud costs. If traffic to a resource spikes, the organization can leverage auto-scaling, on-demand instances, and load balancing to handle the excess load. In the normal case, however, consolidation saves money.
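One way to surface consolidation candidates is to average CPU samples per instance and flag anything below a threshold. The sketch below assumes utilization samples have already been exported from a monitoring tool; the instance names and the 10% threshold are illustrative only.

```python
from statistics import mean


def find_underused(cpu_samples_by_instance, threshold_percent=10.0):
    """Return {instance_id: average_cpu} for instances averaging below the threshold."""
    underused = {}
    for instance_id, samples in cpu_samples_by_instance.items():
        if not samples:
            continue  # no data: cannot judge this instance either way
        average = mean(samples)
        if average < threshold_percent:
            underused[instance_id] = average
    return underused


# Hypothetical CPU utilization samples, in percent.
samples = {
    "web-1": [5, 7, 6],
    "web-2": [55, 60, 65],
    "batch-1": [2, 3, 4],
}
candidates = find_underused(samples)
```

Average CPU is only one signal; memory, network, and peak load would also need checking before merging workloads.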
#4. Analyze System Usage with Visual Tools
People generally find it much easier to spot anomalies in a visualization than in a list of numbers. Visual tools can therefore be invaluable for identifying potential opportunities for cloud cost optimization. For example, a visualization can help trace the root cause of added costs that seem out of line with your usual cloud spend.
Visual displays can help with identifying these wasted resources and performing resource planning. For example, a heatmap can be used to determine how fully various resources are being used within an organization’s cloud deployment. This allows the IT team to more quickly identify and decommission unused resources.
Alternatively, graphs and other visual tools can be used to map an organization’s cloud resource utilization over time. This can provide insight into the organization’s baseline level of resource utilization and opportunities for cuts. For example, an organization’s cloud infrastructure may be overprovisioned during nights and weekends if most of its customers use its tools during the working day. By contrast, global eCommerce companies serve customers across many time zones and need to plan capacity accordingly.
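To make the idea of a utilization profile concrete, the sketch below averages utilization by hour of day and renders it as a simple text bar chart. It is a stand-in for a real dashboard, and the input format (pairs of hour and utilization percent) is an assumption.

```python
from collections import defaultdict


def average_by_hour(samples):
    """Average (hour_of_day, utilization_percent) pairs into {hour: average}."""
    totals = defaultdict(lambda: [0.0, 0])  # hour -> [running sum, sample count]
    for hour, value in samples:
        totals[hour][0] += value
        totals[hour][1] += 1
    return {hour: total / count for hour, (total, count) in sorted(totals.items())}


def text_heatmap(profile, width=20):
    """Render one bar per hour; longer bars mean higher utilization."""
    lines = []
    for hour, value in profile.items():
        bar = "#" * round(value / 100 * width)
        lines.append(f"{hour:02d}:00 {bar} {value:.0f}%")
    return "\n".join(lines)


# Hypothetical samples: quiet at 03:00, busy at 14:00.
profile = average_by_hour([(3, 10), (3, 20), (14, 80), (14, 60)])
print(text_heatmap(profile))
```

Hours with consistently short bars are the ones where scheduled scale-down is worth investigating.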
#5. Take Advantage of Spot Instances
At the other end of the spectrum from Reserved Instances are Spot Instances, which are leftover capacity available for purchase at the last minute. These resources are not always available and can be terminated with little warning (generally 30 seconds to 2 minutes).
These instances are great for low-priority batch jobs that can be terminated quickly if a Spot Instance expires. For example, development teams may wish to execute a large volume of tests for edge cases or error conditions in an application. Unreliable cloud resources may be acceptable for these tests, and using Spot Instances can dramatically reduce the price of testing.
Spot Instances have limitations, making them suitable only for certain purposes. However, these limitations also mean that these resources come with a greatly reduced price tag, enabling companies to reduce cloud spending if they take full advantage of these offerings when they meet business needs.
Automating cloud infrastructure management with auto-scaling can help organizations to overcome the limitations of Spot Instances and optimize their cloud spend. For example, companies can plan to deploy a certain percentage of their infrastructure on Spot Instances, with the ability to automatically fall back to On Demand Instances if access to Spot Instances is interrupted or unavailable.
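The fallback logic can be sketched as a small planning function: target a fraction of capacity on Spot, and cover any shortfall with On Demand. This is an illustration of the decision only, not of any provider’s auto-scaling API; `plan_capacity` and its parameters are hypothetical names.

```python
from math import floor


def plan_capacity(desired, spot_fraction, spot_available):
    """Split `desired` instances between Spot and On Demand, falling back when Spot is short."""
    if not 0 <= spot_fraction <= 1:
        raise ValueError("spot_fraction must be between 0 and 1")
    spot_target = floor(desired * spot_fraction)
    spot = min(spot_target, spot_available)  # never plan for more Spot than exists
    on_demand = desired - spot               # On Demand covers the remainder
    return {"spot": spot, "on_demand": on_demand}


normal = plan_capacity(desired=10, spot_fraction=0.5, spot_available=10)
shortage = plan_capacity(desired=10, spot_fraction=0.5, spot_available=2)
```

Total capacity stays at the desired level in both cases; only the mix, and therefore the cost, changes when Spot capacity dries up.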
#6. Release Idle Elastic IP Addresses
Elastic IP Addresses in AWS are designed to help ensure the availability of cloud-based resources. If one cloud instance goes down, the IP address can be remapped to another instance, providing quick failover.
By default, an AWS account can hold up to five Elastic IP Addresses per Region. A running instance can have a single Elastic IP Address associated with it for free. However, additional Elastic IP Addresses mapped to the same instance, and Elastic IP Addresses that are unused or pointing to a stopped instance, incur charges.
Monitoring cloud accounts for unused Elastic IP Addresses can help to reduce cloud spend on unused resources. Ideally, monitoring should be automated so that Elastic IP Addresses are removed as soon as they are idle or unused, minimizing the cost to the organization.
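A minimal sketch of that check follows. It assumes a dictionary shaped like the response of the boto3 EC2 `describe_addresses` call, where an address that is associated with something carries an `AssociationId` key; the sample data is hypothetical, and releasing the addresses is left to a separate, reviewed step.

```python
def find_idle_addresses(describe_addresses_response):
    """Return identifiers of Elastic IP addresses that are not associated with anything."""
    idle = []
    for address in describe_addresses_response.get("Addresses", []):
        if "AssociationId" not in address:
            # Prefer the allocation ID; fall back to the public IP if it is absent.
            idle.append(address.get("AllocationId", address.get("PublicIp")))
    return idle


# Hypothetical sample data in the assumed response shape.
sample = {
    "Addresses": [
        {"PublicIp": "203.0.113.10", "AllocationId": "eipalloc-aaa",
         "AssociationId": "eipassoc-111", "InstanceId": "i-123"},
        {"PublicIp": "203.0.113.11", "AllocationId": "eipalloc-bbb"},
    ]
}
idle = find_idle_addresses(sample)
```

This sketch does not catch addresses attached to stopped instances; that would require also looking up the state of each associated instance.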
#7. Automate Infrastructure Rightsizing During Provisioning
When provisioning cloud infrastructure, companies face a wide array of potential options. Configuration options include memory, storage capacity, database access, networking capabilities, and more. Different applications have different needs, and a “one size fits all” approach to cloud resource provisioning results in overspending on cloud resources. A better approach is to tailor resources to the unique needs of each application and use case. However, this approach can be time-consuming and unscalable if performed manually.
Cloud resource rightsizing tools can provide recommendations about which types of instances to use and the proper configurations to meet business needs. These recommendations can then be incorporated into infrastructure as code (IaC) tools, such as AWS CloudFormation, to automate the provisioning and rightsizing of cloud resources during deployment. By automatically rightsizing cloud resources, an organization minimizes cloud costs while meeting business needs.
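At its simplest, a rightsizing recommendation is a search for the cheapest option that still satisfies an application’s requirements. The sketch below shows that selection over a small catalog; the instance names, sizes, and prices are invented for illustration and do not correspond to any provider’s offerings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstanceType:
    name: str
    vcpus: int
    memory_gib: float
    hourly_usd: float


def cheapest_fit(catalog, required_vcpus, required_memory_gib):
    """Return the cheapest instance type meeting both requirements, or None if none fits."""
    candidates = [
        t for t in catalog
        if t.vcpus >= required_vcpus and t.memory_gib >= required_memory_gib
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda t: t.hourly_usd)


# Hypothetical catalog; names and prices are placeholders.
CATALOG = [
    InstanceType("small", vcpus=2, memory_gib=4, hourly_usd=0.05),
    InstanceType("medium", vcpus=4, memory_gib=16, hourly_usd=0.20),
    InstanceType("large", vcpus=8, memory_gib=32, hourly_usd=0.40),
]
choice = cheapest_fit(CATALOG, required_vcpus=3, required_memory_gib=8)
```

The selected name could then be passed as a parameter to an IaC template so that the instance size is never hard-coded.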
#8. Identify & Optimize Software Licensing Spend
Licensing costs are commonly one of the biggest contributors to software and cloud computing costs. Often, they are also one of the areas of greatest waste. Software license tracking is commonly decentralized and manual, which makes it difficult for an organization to achieve comprehensive visibility into its current license usage. As a result, companies will often pay for untracked licenses that are not being used.
In the cloud, companies can struggle to track their usage of Amazon Machine Images (AMIs) and other cloud resources. If these untracked resources become unused or inactive, a company may be unknowingly paying for unused capacity. Agile development practices make manual tracking of software licenses and cloud resource utilization unscalable and ineffective. Cloud cost optimization requires automated tracking of cloud resource utilization to eliminate unused and wasted resources.
#9. Optimize Cloud Costs at Each Stage of the SDLC
Agile development practices mean that an organization’s cloud resource requirements can change rapidly. Minimizing cloud spend requires identifying and managing anticipated cloud spending as early in the software development lifecycle (SDLC) as possible.
The SDLC is a multi-stage process. Some steps that companies can take to manage cloud spend throughout this process include:
- Requirements: During the requirements and planning stage of the SDLC, anticipated cloud costs should be calculated for the proposed application. This enables the development team to architect the application in a way that optimizes cloud resource utilization.
- Development: During the coding and testing process, the development team should use cost-effective cloud resources — such as Spot Instances — whenever possible. Testing should also be used to collect data on anticipated cloud resource utilization, which can be used to inform and adjust estimates for cloud spend once the system reaches production.
- Deployment: During the deployment process, rightsizing tools and IaC should be used to appropriately scale and configure cloud resources. Resource sizing should be based on historical use data and resource requirements collected during development and testing.
- Monitoring: After an application is deployed to the cloud, its resource utilization should be monitored on an ongoing basis. This allows the organization to adjust resource utilization to optimize costs. For example, underutilized resources can be combined, or the company can invest in Reserved Instances to support baseline resource utilization by the application.
The earlier in the SDLC that cloud cost optimization is considered, the greater the potential savings for the organization. For example, an application that is architected to optimize the use of cloud resources can deliver greater overall savings than optimization efforts that start later in the process, when fewer tools and options are available.
Actively shifting cloud cost conversations left in the SDLC is essential to reducing cloud spend. However, DevOps participation and buy-in are essential to its sustained success. FinOps can help build DevOps engagement by performing careful planning to streamline the process and free up DevOps teams to assist.
#10. Build a Culture of Cost Awareness
An effective cloud cost optimization strategy requires buy-in across the organization. Every department in an organization has a valid use case for cloud computing and may individually deploy and manage its cloud infrastructure. If cloud management is siloed, then cloud cost management is siloed as well, likely leading to waste due to underutilized and redundant resources.
Cloud cost optimization efforts should have executive support and standardized practices adopted across the organization. By building a strong FinOps culture and implementing visibility and reporting enterprise-wide, a company can identify waste and opportunities for resource consolidation to optimize its cloud spending.
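Enterprise-wide visibility often starts with attributing spend to an owner. The sketch below groups billing line items by a team tag and collects untagged spend in its own bucket; the record layout (`cost_usd`, `tags`) and the `team` tag key are assumptions, not a provider’s billing export format.

```python
from collections import defaultdict


def cost_by_team(line_items, tag_key="team"):
    """Sum cost per owning team; items without the tag are grouped under 'untagged'."""
    totals = defaultdict(float)
    for item in line_items:
        owner = item.get("tags", {}).get(tag_key, "untagged")
        totals[owner] += item["cost_usd"]
    return dict(totals)


# Hypothetical billing line items.
items = [
    {"resource": "db-1", "cost_usd": 100.0, "tags": {"team": "platform"}},
    {"resource": "web-1", "cost_usd": 50.0, "tags": {"team": "platform"}},
    {"resource": "test-vm", "cost_usd": 25.0, "tags": {}},
]
report = cost_by_team(items)
```

A large "untagged" total is itself a useful finding, since spend without an owner is the hardest to review or reduce.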
How GlobalDots Reduced a Client’s Cloud Bill by $1.5 Million Annually
A giant eCommerce group – one of the largest retailers in South Asia – recently transitioned to cloud-based infrastructure; however, the process was not guided by a clear transition strategy or governed by a FinOps culture. As a result, cloud deployment and management were siloed across dozens of AWS accounts and organizational units, and cloud resources were wasted with many cloud instances utilizing less than 5% of available CPU.
The company was referred to GlobalDots, which performed extensive monitoring and developed a cost optimization strategy that saved the company $1.5 million annually. This 20% reduction in cloud spend was delivered via a combination of utilizing reservations, improving monitoring, and rearchitecting cloud resources to remove inefficiencies. Supported by an effective cloud management strategy, the eCommerce group was able to devote resources to growing the business, enabling 30% scaling due to improved access to corporate websites and the deployment of additional features in customer-facing applications.
Cloud Costs FAQ
What is cost optimization in the cloud?
Cloud cost optimization involves designing and managing cloud resources to minimize cloud spend while still meeting the needs of cloud-hosted applications.
Why is cloud cost optimization important?
Poor cloud management can drive up the price of the cloud, eating up budgets and reducing corporate profits. Cloud cost optimization is important because it enables companies to maximize the return on investment of their cloud infrastructure.
What is cloud cost management?
Cloud cost management is the practice of actively and intentionally monitoring and managing an organization’s cloud costs. For example, organizations can leverage cost-saving offerings to reduce the price tag of their cloud infrastructure.
How are enterprises optimizing cloud costs?
Enterprises are optimizing cloud costs by managing resource utilization and taking advantage of cloud savings opportunities. For example, companies reduce costs by actively eliminating wasted resources and taking advantage of discounted Reserved and Spot Instances.
How should cloud spending be prioritized?
Cloud spending should be prioritized based on business needs. Companies should start by identifying functional, cost-effective solutions for critical systems and work down the priority list.