Azure FinOps: Optimizing Costs and Best Practices

Nesh (Steven Puddephatt) Senior Solutions Engineer @ GlobalDots
11 Min read

In 2023, Microsoft’s total annual revenue exceeded $211 billion for the first time ever. A key driving force was its Intelligent Cloud segment, which has retained its position as Microsoft’s powertrain since 2020. There’s no disputing that Azure has emerged as one of the most sought-after cloud offerings on the market. However, while that segment bolsters Microsoft’s revenue to the tune of $88 billion a year, from the customer’s perspective, Azure’s pay-as-you-go model has gradually grown into a budget black hole.

Forced in part by global economic uncertainty, a new lens is helping even established cloud clients reduce spend for the first time. Many teams are now delving into the layers of cost reduction that Azure offers – either natively or via third-party tools. While Azure continues to grant DevOps unparalleled scalability, teams continue to be left in the dark about the budgetary realities of their day-to-day decisions. Unlocking this requires a complete shift in how your development teams view their own role in cloud cost optimization. With a new outlook and toolkit, FinOps (a blend of finance and DevOps) allows organizations to view, allocate, and forecast expenditures accurately, finally bringing IT spending back into line with financial plans.

What is FinOps in Azure?

While FinOps is a relatively new addition to many organizations’ toolkits, its foundations are rooted in the highly iterative, adaptive approach of DevOps. Collaboration and data define the FinOps method – but while DevOps places a focus on high-speed innovation, FinOps adds a strong emphasis on cost.

The focus has been placed predominantly on cloud innovation – in the shadows, however, one of the most fundamental shifts has taken place. In on-prem environments, the typical budget allocation process takes the form of an IT manager collaborating with the CFO to define the following year’s worth of expenditure. This would cover any initial investments in new hardware and infrastructure, alongside maintenance and power requirements. For a scaling company, on-prem practically necessitates over-provisioning. Every server stack has an upper limit to the requests it can handle – but as hardware is purchased in set quantities, there’s no way to closely align on-prem capacity with demand. At the same time, being under-provisioned represents a critical problem. As a result, IT spending would almost always be slightly bloated.

Then cloud changed everything. Tools such as Azure offer stellar compute power without the infrastructure demands: devs were suddenly able to provision as many resources as they desired. Scalability and agility are now staples of almost every thriving organization. While a boon for time-to-market, this has overshadowed the growing issue of devs being thrown into a position of budget allocation. Ironically, this shift has happened faster than many organizations can keep pace with: cost visibility is at near-zero for the employees making the biggest budgetary decisions. As a result, organizations waste an average of 32% of all cloud spend.

Ultimately, not all cloud spend is created equal: some resources enable faster product releases and greater customer growth; others are a hangover from static provisioning methods. Azure services allow on-the-ground teams to understand the difference – and directly take responsibility. Azure’s architecture revolves around the concept of responsibility, but many organizations take it only as far as the subscription level, issuing every department its own subscription. While this is a first stepping-stone, Azure offers a plethora of tools for embedding FinOps throughout the organization. Knowing how to leverage your current Azure setup is one of the most important indicators of success.

First, you’ll need three key groups onboard:

  • Finance: First and foremost are the people overseeing budget requests. Cloud spending forecasts help build an overarching picture of FinOps maturity, as finance teams define what cost is necessary and who’s responsible. This in turn defines how every team member approaches their own cost impact. 
  • Managers: Engineers need to be supported in a cohesive way, with every effort best-placed for maximum impact. This is where managers play a vital role in fostering a collaborative and iterative atmosphere. It’s crucial for these business decision makers to understand cloud spending, in order to foster the best spending results.
  • App teams: Last but far from least are the engineers that issue cloud resources on a day-to-day basis. These teams not only need the flexibility to deliver the most value for their budgets – but also the visibility to see the outcome of every cost decision. 

FinOps-driven teams analyze cloud usage patterns, collaboratively identify cloud cost optimization opportunities, and recommend resource optimizations. This can involve rightsizing virtual machines, leveraging reserved instances, or adopting serverless architectures to reduce costs without sacrificing performance. To accurately keep a handle on each of these processes, there are three more overarching conditions:

  1. Be prepared with the proper tools for success.
  2. Be accountable for costs.
  3. Take appropriate action to optimize spending.

Achieving these three requires a thorough understanding of the FinOps best practices governing your Azure architecture.

Azure FinOps Best Practices

Maximizing the value of your cloud investments while minimizing costs is a fundamental goal for any organization. To achieve this, it’s imperative to implement strong financial governance practices that revolve around three core principles: visibility, accountability, and optimization. Keep in mind that the effectiveness of any tool or strategy hinges on the foundation it’s built upon. Below, we’ll delve into the key best practices that allow your Azure-hosted apps to do more with less.

Implement and Automate Tags

Azure’s resource tagging allows plain-text key-value pairs to be added to in-production resources. For simple single-departmental resources, this process is straightforward. However, when spinning up multi-departmental servers, a lack of policy can leave engineers in the dark. Likewise, you may encounter scenarios where web applications or environments, like Testing or Production, draw resources from multiple subscriptions managed by different teams. Keep everyone on the same page from the get-go by defining what your new tagging requirements will be – and how you’ll support your engineers.

Azure’s metadata tags allow you to include more information about the resource. From a FinOps perspective, these can be fed into a real-time cost analysis, helping distinguish the true business value of every app and service. For instance, deployment-related resources can be tracked by adding an environment key. With a key-value pair of environment = production, you’re able to see the specific cost drivers within your production resources. Best practice dictates that these tags be included when the resource is deployed; to support this, many organizations rely on policies that require a minimum set of tags before a dev can provision the resource.
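To make the idea of a minimum-tag policy concrete, here is a minimal sketch in plain Python. It does not call any Azure API: the required keys (environment, cost-center, owner) and both function names are hypothetical, chosen purely for illustration. Keys are compared case-insensitively, on the assumption that your policy treats Environment and environment as the same tag.

```python
# Hypothetical policy: tag keys every resource must carry before deployment.
REQUIRED_TAGS = ("environment", "cost-center", "owner")


def missing_tags(resource_tags, required=REQUIRED_TAGS):
    """Return the required tag keys that are absent or empty, in policy order."""
    present = {
        key.lower()
        for key, value in resource_tags.items()
        if str(value).strip()  # an empty value counts as "not tagged"
    }
    return [key for key in required if key not in present]


def can_provision(resource_tags, required=REQUIRED_TAGS):
    """Mimic a deny-style policy: allow deployment only when nothing is missing."""
    return not missing_tags(resource_tags, required)


if __name__ == "__main__":
    request = {"environment": "production", "owner": ""}
    print(missing_tags(request))   # ['cost-center', 'owner']
    print(can_provision(request))  # False
```

Returning the list of missing keys, rather than a bare yes or no, gives the engineer something actionable when a deployment is rejected.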

However, keep in mind that the two most prevalent issues associated with Azure tags are either not employing them at all or adopting an excessive approach, where an abundance of tags is applied to every resource. In the former case, the organization is left with a half-formed picture of its own cloud. When dozens of tags are required, however, the administrative burden begins to erode any potential resource management benefits. All too often, policies can forget that devs are human too – highly demanding tag policies can result in a mess of copy-pasted noise. Striking a balance is essential. One good answer is to demand tags mainly at the resource group level. Since resource groups act as logical containers for resources with shared life cycles, tagging these logical entities rather than individual resources can streamline governance without the need for overly complex controls.

Furthermore, automatic tag copying allows tags to be inherited from the overarching resource group. This can support devs in making rapid changes without compromising tag accuracy. Of course, keep in mind that some scenarios may demand tags only for a specific resource type, so make sure to define what resources will still need individual tagging.
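The inheritance behaviour described above can be sketched as a simple merge. This is an illustration only, not Azure’s implementation: the function name is hypothetical, and the rule that a tag set directly on the resource takes precedence over the group’s tag is an assumption you may want to reverse in your own policy.

```python
def effective_tags(group_tags, resource_tags):
    """Fill in tags from the resource group; tags set on the resource itself win."""
    merged = dict(group_tags)      # start from the inherited tags
    merged.update(resource_tags)   # resource-level tags override them
    return merged


if __name__ == "__main__":
    group = {"environment": "production", "cost-center": "cc-100"}
    resource = {"owner": "data-team", "cost-center": "cc-200"}
    print(effective_tags(group, resource))
    # {'environment': 'production', 'cost-center': 'cc-200', 'owner': 'data-team'}
```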

Dig Deeper into Your Costs

The FinOps Foundation separates cost maturity into three main phases: crawl, walk, and run. Organizations at the crawl phase have a far more surface-level understanding of cloud cost, with only a comparatively small field of view. As a result, forecasted spend differs greatly from actual spend. This makes efficient provisioning essentially impossible.

With a foundation of accessible tags in place, organizations can begin to enter the walk – and even run – phases, as an increasing amount of cost can be allocated and understood. To achieve this, however, the newly-discovered data needs to be put to work. 

FinOps culture demands an accessible financial image of day-to-day decisions. Azure’s cost management page allows organizations to conduct a thorough examination of their expenses, allowing for a detailed breakdown of costs. Regularly addressing some common questions can enhance your awareness and support team-focused fiscal responsibility. 

  • What total expenditure has this month incurred – does this align with budgetary expectations? 
  • Have monthly costs stayed within a reasonable range of monthly usage?
  • Now that a clear understanding of these charges is in place, how should these expenses be allocated or distributed within the organization? What is the most efficient breakdown?

Azure enables you to answer each question via a view. This is a customizable report that takes all recent cost data and provides a cross-section across resource levels. Azure offers a number of pre-set smart views which can help visualize this data as usable insights such as KPIs that accurately summarize cost, expandable details of the top contributors, and intelligent anomaly detection. 
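As a rough illustration of what such a view computes, the sketch below groups cost records by a tag and answers the first question from the list above (does this month’s total align with the budget?). The record layout, resource names, and figures are invented for the example; a real export from cost management will have a different shape.

```python
from collections import defaultdict


def cost_by_tag(records, tag_key, untagged_label="(untagged)"):
    """Sum cost per value of one tag, largest contributor first."""
    totals = defaultdict(float)
    for record in records:
        label = record.get("tags", {}).get(tag_key, untagged_label)
        totals[label] += record["cost"]
    return dict(sorted(totals.items(), key=lambda item: item[1], reverse=True))


def within_budget(records, monthly_budget):
    """True when the total of all records does not exceed the budget."""
    return sum(record["cost"] for record in records) <= monthly_budget


if __name__ == "__main__":
    # Hypothetical month-to-date cost records.
    records = [
        {"resource": "vm-web", "cost": 420.0, "tags": {"environment": "production"}},
        {"resource": "sql-main", "cost": 310.0, "tags": {"environment": "production"}},
        {"resource": "vm-test", "cost": 95.5, "tags": {"environment": "testing"}},
        {"resource": "disk-old", "cost": 40.0, "tags": {}},
    ]
    print(cost_by_tag(records, "environment"))
    # {'production': 730.0, 'testing': 95.5, '(untagged)': 40.0}
    print(within_budget(records, monthly_budget=800.0))  # False
```

Keeping an explicit "(untagged)" bucket is deliberate: the size of that bucket is a quick measure of how much spend still cannot be allocated to anyone.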

With these numbers defined, the finance team is able to accurately build a picture of unit costs. As cost per customer metrics mature, your cloud expenses can define how much you charge customers, further increasing cloud efficiency.
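The unit-cost arithmetic is simple enough to show directly. The sketch below assumes the most basic definition, total cloud cost divided by active customers; the function names and figures are illustrative, and real unit economics usually allocate shared costs more carefully.

```python
def cost_per_customer(total_cloud_cost, active_customers):
    """Average cloud cost attributable to one customer for the period."""
    if active_customers <= 0:
        raise ValueError("active_customers must be positive")
    return total_cloud_cost / active_customers


def margin_per_customer(price, total_cloud_cost, active_customers):
    """What is left of the customer's price after their share of cloud cost."""
    return price - cost_per_customer(total_cloud_cost, active_customers)


if __name__ == "__main__":
    print(cost_per_customer(12_000, 400))          # 30.0
    print(margin_per_customer(49.0, 12_000, 400))  # 19.0
```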

Enable Cost Controls

With a new view of your cloud cost, it becomes possible to begin the optimization process. The first element of this is to begin setting limits for yourself and your teams. Fundamental elements within the FinOps framework, budgets act as a compass that directs the organization’s cloud spending with precision and forward-thinking. Visualize these tools as dynamic blueprints that harmoniously align your ambitions with fiscal realities, ensuring informed and strategic decision-making.

Budgets give you the ability to set either a cost-based or usage-based budget with multiple thresholds and alerts. Make sure to review the budgets that you create regularly to see your budget burn-down progress and make changes as needed. Budget alerts can also trigger automation through action groups, for example a runbook that shuts down VMs when a given threshold is reached. If this is too high-risk, it’s also possible to shift the over-budget infrastructure to a different pricing tier.
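The threshold logic behind budget alerts can be sketched in a few lines. This is a simplified model, not Azure’s alerting engine: the threshold percentages, the function names, and the straight-line forecast are all assumptions made for illustration.

```python
def crossed_thresholds(spend, budget, thresholds=(0.5, 0.8, 1.0)):
    """Return every alert threshold (as a fraction of budget) already reached."""
    if budget <= 0:
        raise ValueError("budget must be positive")
    used = spend / budget
    return [level for level in sorted(thresholds) if used >= level]


def projected_month_end(spend, day_of_month, days_in_month):
    """Naive straight-line forecast: assume the daily burn rate stays constant."""
    return spend / day_of_month * days_in_month


if __name__ == "__main__":
    print(crossed_thresholds(spend=8_500, budget=10_000))  # [0.5, 0.8]
    print(projected_month_end(8_500, day_of_month=20, days_in_month=30))  # 12750.0
```

In this example the 100% alert has not fired yet, but the forecast already lands above the budget, which is the point at which a review is cheaper than a shutdown.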

Make Use of Serverless 

As an increasing understanding of cost permeates an organization, employee hours start to become an increasingly important factor in cloud efficiency. Azure’s serverless computing offers a highly efficient and cost-effective approach to running small code snippets designed for specific functions. These functions run on existing servers shared among numerous users, eliminating the need for infrastructure maintenance – even the virtual kind. Consequently, the costs associated with serverless computing can be significantly lower than those of traditional cloud services. Notably, many server operation costs, such as those related to access authorization and security, are absorbed by the platform rather than by your team in the serverless model.

When re-working code that’s being moved over to the cloud, various considerations like operating systems, hardware resources and dependencies need to come into play. Serverless computing simplifies this process by focusing solely on code compatibility with Azure’s own offerings. As a result, serverless functions also require less coding expertise, making development more accessible to individuals with varying skill levels and paving the way to even greater savings. 

Don’t Forget Containers

Containers are, quite simply, a way to package software. Container orchestration tools such as Kubernetes are a popular way for engineers to deploy containers while keeping everything manageable.

However, Kubernetes is notorious for its opaque cost structure. Relying on K8s has traditionally been a major hurdle for FinOps due to the blind spot it leaves in cost analyses. Azure’s compatibility with third-party tools offers a major FinOps step forward, however, as cloud innovation now grants seamless visibility into your cloud spend – containerized or not.

Rightsize Resources

Rightsizing presents a strategic avenue for cost optimization without compromising performance. This strategy entails a meticulous evaluation of the computing power, memory, and storage requirements for individual applications or services, steering clear of excessive resource allocation (over-provisioning) and the waste that comes from underutilized resources. Azure’s virtual machine auto-shutdown feature complements this by shutting VMs down at a scheduled time each day, so machines that sit unused outside working hours, such as dev and test VMs, stop accruing compute charges.
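A rightsizing evaluation of this kind might look like the following sketch. Everything specific in it is an assumption for illustration: the size ladder, the 20% and 80% CPU thresholds, and the choice to downsize only when even the peak stays low, which is a deliberately cautious rule. A real evaluation would also look at memory, disk, and network.

```python
# Hypothetical size ladder, ordered from smallest to largest.
SIZE_LADDER = ["2-vcpu", "4-vcpu", "8-vcpu", "16-vcpu"]


def recommend_size(current_size, cpu_samples, low=20.0, high=80.0):
    """Suggest one step down, one step up, or no change from CPU % samples."""
    if not cpu_samples:
        raise ValueError("cpu_samples must not be empty")
    average = sum(cpu_samples) / len(cpu_samples)
    peak = max(cpu_samples)
    index = SIZE_LADDER.index(current_size)

    # Downsize only when even the busiest sample is below the low threshold.
    if peak < low and index > 0:
        return SIZE_LADDER[index - 1]
    # Upsize when the machine is busy on average, not just in short bursts.
    if average > high and index < len(SIZE_LADDER) - 1:
        return SIZE_LADDER[index + 1]
    return current_size


if __name__ == "__main__":
    print(recommend_size("8-vcpu", [5, 7, 4, 12, 6]))  # 4-vcpu
    print(recommend_size("8-vcpu", [85, 90, 88]))      # 16-vcpu
    print(recommend_size("8-vcpu", [30, 45, 60]))      # 8-vcpu
```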

By adopting this best practice, organizations can realize significant cost reductions almost immediately. Consequently, this leads to improved returns on investment, and supports greater FinOps implementation further down the line.

Optimize Compute with Machine Learning 

Azure compute is often purchased via an on-demand model. While fantastic for flexibility, if every compute requirement is served by on-demand instances, the price of such high-availability compute can swiftly become extortionate. 

Reserved Instances (RIs) allow organizations to make a commitment to a particular instance configuration for a specified duration, in return for significant cost reductions when compared to on-demand pricing. Conversely, Savings Plans offer a different level of flexibility by granting discounts on usage spanning a diverse range of instance types. This flexibility provides a valuable advantage as it allows organizations to adapt to evolving workloads with ease.

Azure offers a simple form of automated compute optimization via the reservation recommendations in its Consumption API and Azure Advisor. A fantastic first choice for initial resource allocation attempts, these recommendations are calculated from the hourly resource usage over the last few weeks. This usage data is fed into an analysis engine that simulates cost with and without reservations – the mix that best maximizes your savings is then recommended.
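The simulation idea can be illustrated with a toy model. This is not Azure’s recommendation engine: the hourly rates, the usage series, and the function names are invented, and the model assumes a reserved instance is paid for every hour whether used or not, with any overflow billed at the on-demand rate.

```python
def cost_with_reservations(hourly_usage, reserved, on_demand_rate, reserved_rate):
    """Total cost when `reserved` instances are committed for the whole period."""
    # Reserved capacity is paid for every hour, even when it sits idle.
    reserved_cost = reserved * reserved_rate * len(hourly_usage)
    # Anything above the reserved count is billed at the on-demand rate.
    overflow_hours = sum(max(used - reserved, 0) for used in hourly_usage)
    return reserved_cost + overflow_hours * on_demand_rate


def best_reservation(hourly_usage, on_demand_rate, reserved_rate):
    """Try every reservation count from 0 to peak usage; return the cheapest."""
    candidates = range(max(hourly_usage) + 1)
    return min(
        candidates,
        key=lambda count: cost_with_reservations(
            hourly_usage, count, on_demand_rate, reserved_rate
        ),
    )


if __name__ == "__main__":
    usage = [2, 2, 3, 5, 8, 8, 4, 2]  # instances running in each hour
    count = best_reservation(usage, on_demand_rate=1.0, reserved_rate=0.75)
    print(count)                                          # 2
    print(cost_with_reservations(usage, 0, 1.0, 0.75))    # 34.0
    print(cost_with_reservations(usage, count, 1.0, 0.75))  # 30.0
```

The result shows why reserving for peak usage is rarely optimal: only the baseline that runs in almost every hour is worth committing to.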

RIs are only the first layer of compute discount, however. The next layer of cost effectiveness is provided via spot instances. Spot VMs offer spare Azure compute capacity at a significant discount on pay-as-you-go rates (up to 90%). When Azure wants to claim back that capacity, a 30-second notice is given and the VM is evicted: depending on its eviction policy, it is either deallocated or deleted, so any workload that must keep running needs to fall back to pay-as-you-go capacity. This innate volatility makes spot VMs a fantastic choice for interruptible tasks such as batch jobs, rendering and analytics. Managing this ever-shifting field of compute allocation would seem like a nightmare – which is where automation steps in.
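To see how the discount and the volatility trade off, here is a back-of-the-envelope sketch. The rework fraction, meaning the share of work that has to be repeated after evictions, is a modelling assumption introduced for this example, as are the rates and the function name; actual spot discounts vary by region and VM size.

```python
def spot_job_cost(hours, on_demand_rate, discount, rework_fraction=0.0):
    """Estimated cost of a batch job on spot capacity, padded for redone work."""
    if not 0 <= discount < 1:
        raise ValueError("discount must be in [0, 1)")
    spot_rate = on_demand_rate * (1 - discount)
    # Evictions force some work to be repeated, which stretches the runtime.
    return hours * (1 + rework_fraction) * spot_rate


if __name__ == "__main__":
    on_demand = 100 * 0.5  # the same 100-hour job at pay-as-you-go rates
    spot = spot_job_cost(100, on_demand_rate=0.5, discount=0.75, rework_fraction=0.25)
    print(on_demand)  # 50.0
    print(spot)       # 15.625
```

Even with a quarter of the work repeated, the spot run in this example costs under a third of the pay-as-you-go price, which is why interruptible batch work is the natural fit.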

GlobalDots’ focus on innovation has uncovered third-party Azure integrations that automatically find and provision the optimal balance of spot, reserved and on-demand resources. This combines to form the most cost-effective and available cloud compute possible, fine-tuned to your own infrastructural requirements. 

By this stage in your FinOps journey, you’ve come a long way from resource tagging.

FinOps Adoption Success Stories

Starting can be the most daunting part: that’s where GlobalDots can get the ball rolling. One client – a prominent eCommerce conglomerate – had embarked on an ambitious cloud migration journey. Unfortunately, this transformation was lacking in structure. Consequently, the client grappled with disproportionate cloud expenditure and a sheer blind spot around their own resources.

Take, for example, the client’s 16 verticals. Each operated as a distinct business unit, and together they managed a total of 74 AWS accounts with no centralized optimization or governing entity. Furthermore, the absence of a FinOps culture meant that engineers were occasionally utilizing just 5% of CPU capacity, exacerbating the cost issue. GlobalDots stepped in and proposed substantial annual savings of $1.5 million, focusing on both short-term and long-term cost optimization strategies. From doubling the number of machines that run on a reservation plan to meticulously rightsizing VMs, this FinOps case study is typical for cost optimization. With $250,000 in savings realized during the first 16 weeks, our client was able to shed new light on the financial decisions made by every team member.
