Cloud Cost Management Part I - Orientation

As Cloud Native consultants, we often get asked about managing cloud costs. The business that doesn’t ever want to save money is a rare one indeed. However, we have found that during the times of plenty of the Zero Interest Rate Policy (ZIRP) era, businesses were more focussed on delivery and less on profit than they normally would be. Now that interest rates have returned to normal, investors are asking whether their money might return more in a bank than it does when spent on rented IT infrastructure.

A recent estimate of cloud spend in tech-related industries claimed that 32% was being wasted, driving many CFOs to recover the costs and significantly reduce that waste.

There are a bewildering number of paths one can take when looking to save money on cloud costs. This can lead to confusing conversations with clients, where the subject of conversation can jump from VM resource utilisation percentages, to automatically tagging resources, to whether applications or even business lines are needed at all.

In this series of posts we seek to outline our approach to this topic. This first post looks at the first step: orientation.

The Three Levels

At its core, Container Solutions’ approach divides cost management work into three categories:

  • Quick wins
  • Cost Optimisation
  • Finops

These are analogous to the three horizons framework we often use when consulting in Cloud Native.

Each of these levels implies different levels of maturity, and allows the client to map where they are at the moment. Once the client is able to place themselves on this map, they can decide either where they would like to be, or where they need to be to make the savings they want. Once these positions are clear, we can proceed to tangible actions.

Level One: Quick Wins

Quick Wins are the simplest to identify and action. In this stage, clients are looking for fast ways to reduce cost without changing their business or technology in any meaningful way.

Typical ways to get Quick Wins boil down to answering the question: ‘is that resource needed?’ The resource might be compute, data, or feature-rich products, or licences that simply aren’t required for the business. Another rich ‘Quick Win’ area is non-production costs (such as development or staging environments), which can be very high if unchecked.

Quick Wins are characterised by requiring little or no knowledge of the business to identify, and little effect on the business to action. Typically, they are identified by looking at the biggest areas of cloud spend through standard tooling (such as AWS Cost Explorer) and then identifying what costs are ripe for reduction.
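As a rough illustration of that first pass, the sketch below totals cost line items per service and ranks them, largest first. It assumes the billing data has already been exported into simple (service, cost) pairs; the function name and the sample figures are hypothetical, and it does not call any cloud provider API.

```python
from collections import defaultdict


def top_cost_areas(line_items, n=3):
    """Return the n services with the highest total cost, largest first.

    line_items is an iterable of (service_name, cost) pairs, for example
    rows taken from an exported billing report (an assumed format).
    """
    totals = defaultdict(float)
    for service, cost in line_items:
        totals[service] += cost
    # Sort by total cost, descending, and keep only the top n entries.
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n]


if __name__ == "__main__":
    # Made-up monthly figures, purely for illustration.
    sample = [
        ("compute", 700),
        ("storage", 200),
        ("compute", 500),
        ("licences", 300),
    ]
    for service, total in top_cost_areas(sample, n=2):
        print(f"{service}: {total:.2f}")
```

The services at the top of such a list are where the ‘is that resource needed?’ question is usually worth asking first.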

When we talk to clients about costs, most have already looked into these areas, and are looking for more things they might have missed. Unfortunately, there are typically not that many ways to cut costs with Quick Wins, and most of them are more or less commonsensical to anyone familiar with the cloud. The exceptions to this we’ll look at more closely in Part Two.

Level Two: Cost Optimisation

Cost Optimisation is here defined as the activity of seeking ways to change your cloud behaviour in a way that doesn’t materially change the way you do business or technology. It’s different from Quick Wins in that it might involve changes to your technology use that require engineers to get involved.

To take one common example: you may have a fleet of VMs that are somewhat elastic in their consumption. Workloads that test systems’ functionality or performance might be one example of this, or some kind of processing task that should generally be quickly available but can wait a little at busy times. If you can, divide these resources into necessary ‘core’ compute that you always need available, and ‘non-core’ resources that need not be generally available. Once divided, the ‘core’ compute can be made cheaper by using ‘reserved’ instances, and the ‘non-core’ can be provisioned using spot instances.

This change would likely require some meaningful work from the developers to manage the corner cases where spot instances are not available, and/or to write code to provision the spot instances. As such it’s not a ‘quick win’, since it needs some non-fundamental change to workflows, process, or technology. This is where the best ‘bang for the buck’ is often seen when we’re brought in to help.
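To make the core/non-core split concrete, here is a minimal sketch of the arithmetic. It assumes you have an hourly count of the instances actually needed, treats the minimum of that series as the ‘core’ (one possible proxy, not the only one), and compares an all on-demand bill with a reserved-plus-spot bill. The rates are invented cents per instance-hour, not real provider prices, and the sketch ignores spot interruptions entirely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HourlyRates:
    """Hypothetical prices in cents per instance-hour."""
    on_demand: int
    reserved: int
    spot: int


def estimate_split_costs(hourly_demand, rates):
    """Compare an all on-demand bill with a reserved 'core' plus spot 'non-core'.

    hourly_demand is a list with the number of instances needed in each hour.
    """
    if not hourly_demand:
        raise ValueError("hourly_demand must contain at least one hour")

    # Assumption: the core is the number of instances needed in every hour.
    core = min(hourly_demand)
    hours = len(hourly_demand)
    # Instance-hours above the core, which could be served by spot capacity.
    burst_instance_hours = sum(demand - core for demand in hourly_demand)

    all_on_demand_cost = sum(hourly_demand) * rates.on_demand
    split_cost = core * hours * rates.reserved + burst_instance_hours * rates.spot

    return {
        "core_instances": core,
        "burst_instance_hours": burst_instance_hours,
        "all_on_demand_cost": all_on_demand_cost,
        "split_cost": split_cost,
    }


if __name__ == "__main__":
    rates = HourlyRates(on_demand=10, reserved=6, spot=3)
    print(estimate_split_costs([4, 4, 6, 10, 6, 4], rates))
```

A comparison like this only estimates the prize; the engineering effort described above is what it takes to actually claim it.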

Another area of optimisation is to rearchitect your applications to make them more Cloud Native, reducing extra costs you might have in managing resources that are better taken care of by the cloud service providers. For example, you might replace an application running on Kubernetes with a serverless solution. This requires deeper technical changes (so arguably could be a separate level), but still comes under the ‘optimisation’ heading because there is no business-level change required.

Level Three: Finops

Finops (a portmanteau of Finance and DevOps) is a broad concept that emphasises collaboration in cloud cost management between technical, financial and business functions within the business.

While the first two levels only require technical domain knowledge and capability to identify and implement, in our model this level is distinguished by requiring some change to business or technical process. The FinOps Foundation describes it as a ‘cultural practice’.

A Finops initiative might seek to identify cloud costs in a structured way by ensuring all resources are tagged with a reference to the team that is responsible for the cost. Those costs can then be ‘charged back’ to the owning technical team or division, rather than borne by a central technical function. This might sound straightforward in principle, but will require agreement, coordination and collaboration between financial, technical, and business functions. This triangle of interests rarely aligns on anything!
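The mechanical part of chargeback is small, as the sketch below suggests. It assumes each resource is represented as a dictionary with an id, a monthly cost, and a tags mapping, and that the owning team lives under a tag key called "team"; all of these names and figures are hypothetical. Resources without an owner tag are reported separately, since in practice they tend to be where the organisational conversations start.

```python
from collections import defaultdict


def chargeback_by_team(resources, tag_key="team"):
    """Sum monthly cost per owning team and list resources with no owner tag.

    Each resource is a dict with "id", "monthly_cost" and an optional "tags"
    dict (an assumed shape, not any provider's export format).
    """
    totals = defaultdict(float)
    untagged = []
    for resource in resources:
        tags = resource.get("tags") or {}
        owner = tags.get(tag_key)
        if owner:
            totals[owner] += resource["monthly_cost"]
        else:
            # No owner recorded: the cost cannot be charged back yet.
            untagged.append(resource["id"])
    return dict(totals), untagged


if __name__ == "__main__":
    # Made-up inventory, purely for illustration.
    inventory = [
        {"id": "vm-1", "monthly_cost": 100, "tags": {"team": "payments"}},
        {"id": "db-1", "monthly_cost": 250, "tags": {"team": "payments"}},
        {"id": "vm-2", "monthly_cost": 80, "tags": {"team": "search"}},
        {"id": "bucket-1", "monthly_cost": 40, "tags": {}},
    ]
    totals, untagged = chargeback_by_team(inventory)
    print(totals)
    print("untagged:", untagged)
```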

Finops is a huge topic, but the busy budget holder can think of it as the way to truly get control of cloud spending. This control does not come without effort, however. The path to Finops nirvana can be long and organisationally challenging if not planned carefully. It can also encompass changes that turn the endeavour into a broader transformational journey, one that introduces other Cloud Native concepts, such as increasing velocity by giving teams power over (and corresponding responsibility for) their cloud spend.

Why Does This Matter?

If you’re reading this, you might be wondering why this is important: just show me the money-saving tips!

We have found that cost-saving discussions, if not structured in this way, can quickly confuse the client by leaping across ideas from the three different levels. By mapping and auditing where the business currently sits on the maturity levels, you can begin to determine which levels you need to focus on first, and which level you ultimately have the appetite for.

Once that’s done you can then proceed to the fun stuff: identifying the tools and techniques you can apply to get that bill down. Stay tuned for more on those.
