This is a comprehensive guide to multi-cloud billing and cost management.
Whether you are a CIO or CFO, work in IT financial management or DevOps, are a cloud architect, or are just interested in cloud billing and cost management, this guide is for you.
We'll look at how to establish an automated end-to-end cost management process as part of an organization's cloud governance.
In this new guide you will learn:
- What cloud cost management is
- Why cloud cost management is becoming more important
- What strategies you can use to manage cloud cost
- The ins and outs of chargeback and showback
- About the challenges of cloud metering and pay-per-use in the private cloud
- What to expect and where to go on your cost management journey
Let's get started with the basics:
What Is Cloud Cost Management?
The cloud promises a simple, cost-effective, highly scalable alternative to running your own servers in a data center. But as cloud infrastructure becomes more complex - a lot of companies use multi-cloud architectures - the associated costs become difficult to track and evaluate.
Cloud cost management, or cloud cost optimization, is the effort to gain valuable insights into the costs of cloud usage within the enterprise and to find effective ways to maximize cloud usage and efficiency.
Multi-Cloud Cost: 4 Expensive Factors That Are Easy To Miss
Without proper cloud cost management in place, things can get out of hand and more expensive than they need to be.
Here are 4 factors that drive cloud cost and are easily overlooked:
Un- or underused cloud resources
Provisioning cloud resources may or may not be difficult in your organization - but de-provisioning is easily forgotten. With the public clouds' pay-per-use model, costs can spiral out of control.
Third-party cloud services
The usage of complementary cloud services like cloud monitoring or backend connectivity is necessary for DevOps teams. The cost of these services is often not attributed or monitored closely enough to detect the potential for significant savings.
Provisioning and organizational support
For developers, it's not always easy to get the cloud environments they need for their projects. Manual processes for approving and provisioning their requests are time-consuming and costly.
Projects exceeding budget
Budgeting cloud projects is common. Without the right cost management, it might not be possible to get an early warning on projects that might run out of budget. Shutting down cloud resources on a project because it's exceeding its budget is not an option, and so it drives cloud costs up.
Why Cloud Cost Management is so Important
Cloud usage is an ever-increasing factor in most enterprises. The advantages seem clear - and they are - but the organization needs to adapt to and keep up with these advantages: Rapid scalability, pay-per-use, automation, overall agility - all this needs proper cloud governance to come into effect without exploding costs.
DevOps teams that consume cloud resources as if they were free, multi-cloud architectures that add extra complexity, and private cloud components that don't follow the pay-per-use model make effective cloud cost management an absolute necessity.
At the beginning of an organization's cloud transformation, the topic might not seem as pressing - but that's actually the right time to lay the organizational and technical foundation for cost-effective growth to avoid costly surprises further down the road.
Is Cloud Cost Management an Issue for Businesses?
A recent survey revealed that cutting cloud costs is the top priority for companies' cloud strategy. They face growing invoices from their cloud service providers and the lack of insight into those costs has considerable financial consequences. Most companies expected their cloud budget to increase by 10 to 25 percent in 2020 while considering the applications they deploy to the cloud as "mission-critical". That makes it critical for business success to put the budget increase to good use and not waste it on hidden and avoidable costs in their cloud organization.
Why a lot of Companies are Struggling with Cloud Costs
Not utilizing the advantages of the cloud means real competitive disadvantages and poses a risk to business success - that's for sure. But going on that journey to the cloud, transforming the organization, and moving workload out of data centers is not an easy task and must not be rushed.
Some companies rushed to the cloud, and with time came the shocking surprise of mounting costs.
Here are some reasons for that (and pitfalls for you to avoid):
- No proper cloud governance
- Not adopting the cloud-native mindset
- Overlooking the risks of the pay-per-use model
- Poor visibility of cloud costs (build a cloud cost dashboard)
6 Levers to Pull to Reduce Cloud Costs
There are a number of levers businesses can pull to manage cloud costs.
Here are 6 of the most promising:
Rightsizing:
Ensure that the public cloud instances you choose are the right fit for your organization’s needs.
Automatic scaling:
This allows organizations to scale up resources when needed and scale down the rest of the time, rather than planning for maximum utilization at all times (which can be needlessly expensive).
Reserved instances:
Reserved Instances offer a significant discount over on-demand instance pricing (for example, up to 72% on AWS). To make full use of them, you have to know what you actually need.
Power scheduling:
Not all instances need to be used 24/7. Scheduling non-essential instances to shut down overnight or on weekends is more cost-effective than keeping them running constantly.
Removing unused instances:
If you’re not using an instance, there’s no need to keep it around (and pay for it!). Removing unused instances is also important for security since unused resources can create vulnerabilities.
Discount instances:
Since discount instances usually do not guarantee availability, they’re not appropriate for business-critical workloads that must run constantly, but for occasional use they can result in a significant cost reduction.
The most effective lever is the organizational transformation to fit the requirements of the cloud: We'll get to the why and how later in this guide.
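To get a feeling for what scheduling alone can do, here is a minimal back-of-the-envelope sketch. It assumes a flat hourly on-demand rate and an instance that only needs to run on weekdays; the rate, the function names, and the 12-hour working day are illustrative assumptions, not figures from any provider.

```python
# Assumption: the instance is billed at a flat hourly rate while it runs
# and costs nothing while it is shut down.
HOURS_PER_WEEK = 7 * 24


def weekly_cost(hourly_rate: float, running_hours: float) -> float:
    """Cost of one instance for a week, given how many hours it runs."""
    return hourly_rate * running_hours


def scheduling_savings(hourly_rate: float, hours_on_per_weekday: float) -> float:
    """Fraction saved by running only Monday to Friday for a fixed number of hours."""
    always_on = weekly_cost(hourly_rate, HOURS_PER_WEEK)
    scheduled = weekly_cost(hourly_rate, 5 * hours_on_per_weekday)
    return 1 - scheduled / always_on


# Hypothetical rate of 0.10 per hour, instance needed 12 hours per weekday.
savings = scheduling_savings(hourly_rate=0.10, hours_on_per_weekday=12)
print(f"Scheduled instance saves {savings:.0%} compared to running 24/7")
```

Note that the hourly rate cancels out: the saving depends only on the share of the week the instance is switched off. Storage that stays attached while the instance is stopped is ignored in this sketch.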
Multi-Cloud Billing: Showback vs. Chargeback
Strategic considerations aside, showback and chargeback are pretty similar on an operational level. They share the same structure - but of course, there are essential differences:
- The showback approach does not contain bills or invoices that need to be paid. IT reports costs and usage, while also paying for the cloud service provider invoices, services, and so on. There is no charging of the actual source of these costs. A DevOps team won't be charged for a VM or a third-party cloud service it uses.
- In the chargeback approach, each consumer of cloud resources and services is actually billed for what they used. That can mean that money flows from one department to another within a company, or that there is just an accounting entry charging the consuming department.
The Strategic Differences of Showback and Chargeback Models
On a strategic level, showback and chargeback are quite different. The question here is: why should you use one over the other? What implications does that have on usage behavior and accountability?
Showback does not have any enforcement mechanisms to guide usage behavior. It is basically providing cloud cost transparency and visibility - which is great and useful - without encouraging users to pay attention and be proactive about the costs they produce.
In a chargeback model, users are incentivized to consider the cost of the cloud resources they request and use.
The Pros and Cons of Showback vs. Chargeback
In terms of required detail, effort, and cost, showback demands less and is easier and faster to execute.
A chargeback model requires more detail and effort and can therefore be more expensive.
Both models expose levers to reduce cloud costs and optimize usage. They are both fit to demonstrate the value that the cloud brings to the business.
Chargeback offers total accountability and allows IT to recover costs across all DevOps teams and departments that use IT services.
Metering in the Private Cloud
The private cloud needs a little extra treatment when it comes to cloud billing and cost management. There is no large bill at the end of the month that lists the cloud costs on a pay-per-use basis. To get that you have to make or buy a metering implementation that does this for you.
Metering is the process of collecting and calculating cloud resource usage. It involves pricing resource usage to calculate the cost.
Cloud platforms record events and other information about deployed cloud resources. Some of these events are relevant for metering. For example, starting and stopping a virtual machine may generate a corresponding stream of events that describes how long the virtual machine was running. This data from the cloud platform can be used to calculate how many RAM-hours and vCPU-hours a virtual machine consumed in a given period.
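As a rough illustration of that calculation, the sketch below turns a start/stop event stream into running hours and then into vCPU-hours and RAM-hours. The event format, the VM dimensions, and the function name are assumptions made for this example; real platforms expose this data in their own formats.

```python
from datetime import datetime

# Hypothetical event stream for one VM: (timestamp, event type).
events = [
    (datetime(2024, 1, 1, 8, 0), "start"),
    (datetime(2024, 1, 1, 18, 0), "stop"),
    (datetime(2024, 1, 2, 9, 0), "start"),
    (datetime(2024, 1, 2, 12, 30), "stop"),
]


def running_hours(events, period_start, period_end):
    """Sum the hours between each start and the following stop, clipped to the period."""
    total = 0.0
    started_at = None
    for timestamp, kind in sorted(events):
        if kind == "start" and started_at is None:
            started_at = timestamp
        elif kind == "stop" and started_at is not None:
            begin = max(started_at, period_start)
            end = min(timestamp, period_end)
            if end > begin:
                total += (end - begin).total_seconds() / 3600
            started_at = None
    # A VM that is still running at the end of the period is metered up to period_end.
    if started_at is not None and period_end > started_at:
        begin = max(started_at, period_start)
        total += (period_end - begin).total_seconds() / 3600
    return total


# Assumed VM dimensions: 4 vCPUs and 16 GB of RAM.
hours = running_hours(events, datetime(2024, 1, 1), datetime(2024, 2, 1))
vcpu_hours = 4 * hours
ram_gb_hours = 16 * hours
print(hours, vcpu_hours, ram_gb_hours)
```

Clipping to the metering period matters in practice: a VM started in one month and stopped in the next must only contribute the hours that fall into the period being billed.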
Cloud resources have many different traits; a virtual machine, for example, has RAM and vCPU. In the public cloud, it is common to provide t-shirt sizes for cloud resources. You can build the same for the private cloud and offer VMs in S, M and L instead of giving exact quantities for RAM or vCPU.
A product catalog defines which of these traits are relevant for metering and how their usage is calculated. Typically, usage is the product of a quantity and a duration, e.g. a single vCPU used for an hour. But there may be other usage units as well that consist only of a quantity (e.g. bytes transferred over the network) or a duration (e.g. resource usage hours).
A product catalog also contains pricing rules that determine the cost for particular resource usage.
This way, private cloud usage can be accounted for in a pay-per-use model, as is common with the large public cloud providers.
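A product catalog of this kind can be sketched in a few lines. Everything below is a hypothetical example: the usage units, the t-shirt sizes, and the prices are made-up values, and a real catalog would likely use a decimal type and a currency instead of rounded floats.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PricingRule:
    usage_unit: str    # name of the metered unit, e.g. "vcpu_hours"
    unit_price: float  # price for one unit of usage


# Hypothetical catalog: which traits are metered and what one unit costs.
CATALOG = {
    "vcpu_hours": PricingRule("vcpu_hours", 0.02),      # quantity x duration
    "ram_gb_hours": PricingRule("ram_gb_hours", 0.005),  # quantity x duration
    "network_gb": PricingRule("network_gb", 0.01),       # quantity only
}

# Hypothetical t-shirt sizes mapped to exact quantities.
VM_SIZES = {
    "S": {"vcpu": 1, "ram_gb": 2},
    "M": {"vcpu": 2, "ram_gb": 8},
    "L": {"vcpu": 4, "ram_gb": 16},
}


def vm_usage(size, hours, network_gb=0.0):
    """Translate a VM size and its running hours into metered usage units."""
    spec = VM_SIZES[size]
    return {
        "vcpu_hours": spec["vcpu"] * hours,
        "ram_gb_hours": spec["ram_gb"] * hours,
        "network_gb": network_gb,
    }


def price(usage):
    """Apply the catalog's pricing rules to each usage unit."""
    return {unit: round(qty * CATALOG[unit].unit_price, 2) for unit, qty in usage.items()}


costs = price(vm_usage("M", hours=100, network_gb=50))
total = round(sum(costs.values()), 2)
print(costs, total)
```

Keeping sizes, metered units, and prices in separate tables means a price change or a new VM size does not touch the metering logic.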
Cloud Billing and Cost Management Maturity Model
At meshcloud, we came up with a six-stage maturity model for cloud billing and cost management. We see these stages when we look at where our customers are on their cloud journey - but not necessarily in that order:
Private cloud metering
The private cloud does not come with a pay-per-use-based invoice at the end of the month, as is standard with the public cloud providers. A private cloud metering system has to be built and implemented to get to the pay-by-consumption model.
One large bill per cloud provider
Every month or quarter there will be one large bill for all cloud services that are used across the organization. There is no way of getting around that - a black box.
Per project cloud consumption
Allocating cost to the consumption in each project is a big leap for cost transparency and visibility. Projects exceeding their budget can be identified early on and adjusted accordingly.
Tenant fee for cloud governance
A cloud foundation team or a cloud center of excellence takes care of cloud governance: Security and IAM are centralized and the team aims to deliver a good user experience for the DevOps teams. They can focus on their actual work and pay a tenant fee for the services of the cloud foundation team.
Charge for cloud services
Since not only cloud-native services are used in development and operations, additional cloud services (monitoring, CI/CD tooling, connectivity) are provided by the cloud foundation team and other DevOps teams, which can then charge for them.
Cooperation and cost allocation with external partners
The internal transformation into a service organization with full chargeback makes it possible to offer the same services to external partners. This is the peak of the cloud billing mountain everybody strives to climb.
The 7 Steps to Multi-Cloud Cost Management Perfection
Let's take our maturity model and see how to get from where you are now in terms of cloud billing and cost management to where you would like to be.
We have 6 stages in our model, but we'll start with step #0 to get the very basics covered:
Step #0 - Laying the Foundation for Cloud Billing and Cost Management
This step is all about making the right choices and laying the foundation. It has nothing to do with cloud billing and cost management directly.
There are two things you need to consider:
Account structure: Projects, folders, subscriptions, resource groups, accounts - all these entities you are confronted with in the cloud need to be mapped to your organizational structure with teams, departments, and products. For example, map an IT product (an organizational construct) to a customer, and an application stage like dev, QA, or production to a tenant (Azure subscription, AWS account, or GCP project).
At meshcloud, the meshProject is the central entity that ties together assigned project users and team leads, cloud tenants with landing zones, the service marketplace, and the approved budget and chargeback.
Metadata and tagging: Tags or labels are custom to every organization. Common tags are data classification, cost center, and environment. Defining a tagging schema helps with scaling and automating cloud usage. Resource tags and labels can be used to control and analyze costs.
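A tagging schema only helps if it is checked. The following is a minimal sketch of such a check, assuming the three common tags named above; the tag keys, the allowed values, and the function name are illustrative choices, and every organization will define its own.

```python
# Hypothetical tagging schema: required tag keys and, where it makes sense,
# the set of allowed values. None means "any non-empty value".
REQUIRED_TAGS = {
    "cost_center": None,
    "environment": {"dev", "qa", "prod"},
    "data_classification": {"public", "internal", "confidential"},
}


def tag_violations(tags: dict) -> list:
    """Return a list of human-readable problems for one resource's tags."""
    problems = []
    for key, allowed in REQUIRED_TAGS.items():
        value = tags.get(key)
        if not value:
            problems.append(f"missing tag: {key}")
        elif allowed is not None and value not in allowed:
            problems.append(f"invalid value for {key}: {value}")
    return problems


good = {"cost_center": "CC-100", "environment": "prod", "data_classification": "internal"}
bad = {"environment": "production"}

print(tag_violations(good))
print(tag_violations(bad))
```

A check like this could run when a resource is requested, so that untagged resources never reach the bill in the first place.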
Step #1 - Splitting up the Cloud Bill
This step is about getting from one big bill to costs allocated per project.
Central to this step is proper showback to increase transparency and visibility for cloud costs.
Why is that important?
For DevOps teams:
- be aware of costs
- rightsizing of cloud resources
- staying in budget
- identifying undesired use of cloud resources (Tesla had unauthorized cryptocurrency miners in their cloud environments)
- avoiding zombie resources: It's easy to provision resources, but deprovisioning can be forgotten.
For central IT:
- negotiating better contract conditions with cloud providers
- defining service portfolio and seeing if certain cloud services drive costs to find alternatives
- taking cost optimization steps
meshcloud offers per-project usage reports with the corresponding costs to increase transparency.
Step #2 - Charging the Right People
After raising awareness with proper showback, chargeback is the next logical step.
The large bill that goes to central IT is something you can't avoid. In a worst-case scenario, it is allocated manually to teams and projects: A model that is not sustainable.
Automation is key when it comes to allocating cloud costs:
Aggregate cost per customer → create monthly chargeback statements per project → provide detailed tenant usage reports per platform. This can then be exported to SAP or other tools. The tags and labels from step #0 come into play here. Without them, automation is not possible.
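The allocation itself can be sketched as a simple grouping over billing line items. The line-item structure, the `project` tag key, and the `unallocated` bucket below are assumptions for illustration; provider billing exports look different and first need to be normalized into one format and currency.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical, already normalized line items from several cloud platforms.
line_items = [
    {"platform": "aws", "cost": Decimal("120.50"), "tags": {"project": "shop", "cost_center": "CC-100"}},
    {"platform": "azure", "cost": Decimal("80.00"), "tags": {"project": "shop", "cost_center": "CC-100"}},
    {"platform": "gcp", "cost": Decimal("45.25"), "tags": {"project": "analytics", "cost_center": "CC-200"}},
    {"platform": "aws", "cost": Decimal("10.00"), "tags": {}},  # untagged resource
]


def chargeback_statements(items):
    """Group line items by project tag and total them overall and per platform."""
    statements = {}
    for item in items:
        # Without a project tag the cost cannot be allocated automatically.
        project = item["tags"].get("project", "unallocated")
        statement = statements.setdefault(
            project, {"total": Decimal("0"), "by_platform": defaultdict(Decimal)}
        )
        statement["total"] += item["cost"]
        statement["by_platform"][item["platform"]] += item["cost"]
    return statements


statements = chargeback_statements(line_items)
for project, statement in statements.items():
    print(project, statement["total"], dict(statement["by_platform"]))
```

The `unallocated` bucket makes the dependency on step #0 visible: every untagged resource ends up there and has to be sorted out by hand.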
Step #3 - Staying within Budgets
This step is about keeping projects in budget by mapping budgets to projects.
For that, you need an end-to-end integration: A project is created and approved with a budget before a single cloud resource is used. As cloud resource consumption can vary greatly over the lifecycle of a project, it is not easy to plan a budget that fits exactly and keep it in check.
A common setup we see: a DevOps team lead requests a cloud account, and the cost center manager and central IT approve it. The estimated budget is mapped to the project as it is created. The resources are tagged and tracked as the team works on the project and consumes cloud resources.
All stages of the product - e.g. development, staging, and production - have to share the overall budget for the project. Since you can't cut off a project when the budget is exceeded, you need to implement processes that regulate the costs beforehand.
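One way to regulate costs beforehand is an early warning based on a forecast. The sketch below assumes a monthly budget period, a simple linear projection of the spend so far, and an 80 percent warning threshold; all three are illustrative assumptions, as are the stage names and amounts.

```python
def forecast_month_end(spend_to_date: float, day_of_month: int, days_in_month: int) -> float:
    """Linear projection: assume the average daily spend so far continues."""
    return spend_to_date / day_of_month * days_in_month


def budget_status(spend_to_date, budget, day_of_month, days_in_month, warn_at=0.8):
    """Classify a project as ok, warning, forecast to exceed, or exceeded."""
    if spend_to_date >= budget:
        return "exceeded"
    if forecast_month_end(spend_to_date, day_of_month, days_in_month) > budget:
        return "forecast to exceed"
    if spend_to_date >= warn_at * budget:
        return "warning"
    return "ok"


# All stages share one project budget, so their spend is summed first.
stage_spend = {"development": 300.0, "staging": 200.0, "production": 700.0}
project_spend = sum(stage_spend.values())

status = budget_status(project_spend, budget=3000.0, day_of_month=10, days_in_month=30)
print(project_spend, status)
```

In this example the project has spent only 40 percent of its budget, but at the current rate it would end the month at 3600, so the team gets a warning on day 10 instead of a surprise on day 30.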
Excursion: Multi-Cloud Organization Done Right
To utilize the cloud to the fullest, enterprises have to transform their internal organization to fit the new cloud-native approach. What we at meshcloud see most with our customers is a transformation from a silo-like organization to a centralized cloud foundation team or cloud center of excellence that takes care of cloud or multi-cloud governance.
They integrate platforms, define landing zones, organize provisioning, ensure continuous compliance and provide cross-cloud transparency. All this is centralized in a tool and highly automated. Manual processes are not scalable when DevOps grows. Self-service is implemented to decouple DevOps from cloud governance and the cloud foundation team. That accelerates time-to-cloud for DevOps and time to market for the business.
Step #4 - Cloud Governance to Boost Development
Traditionally, DevOps teams have a lot of non-functional things to take care of (compliance, security, general organization, and governance) if these are not provided by a cloud foundation team. The cloud foundation team should take care of all of that, and it can charge for the value it provides with what we call a tenant fee.
This model transforms the cloud foundation team from a cost to a profit center because they charge internally and there is a detailed cost allocation.
Step #5 - The Service Organization
Cloud-native resources from the cloud service providers need to be complemented by other building blocks like backend connectivity, CI/CD-tooling, or multi-cloud monitoring. These cloud services can be provided by the cloud foundation team, other DevOps teams, or by third parties as managed services.
To integrate these services you need cloud-native provisioning (self-service on-demand) and cloud-native billing (pay-per-use) to not destroy the advantages provided by the cloud. The goal is to avoid manual processes.
A service marketplace is a solution to keep the cloud-native processes and bring service owners and cloud users together. Provisioning of cloud services can be done with a service marketplace with plans and pricing as a basic setup to keep track of usage and costs. The service marketplace can also include third-party cloud services like Datadog or a managed MongoDB.
A system like this incentivizes teams to provide services for others as well as to pay attention to their own consumption.
The goal here is: No emails, no phone calls but a fully automated provisioning with costs showing up on the monthly chargeback statement that also contains the cloud costs.
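To make that concrete, here is a minimal sketch of a monthly chargeback statement that combines cloud usage, the tenant fee from step #4, and booked marketplace services. The plan names, prices, and the flat per-tenant fee are hypothetical values chosen for this example, not an actual pricing model.

```python
from decimal import Decimal

# Hypothetical marketplace plans: a flat monthly price per booked service.
PLANS = {
    ("monitoring", "basic"): Decimal("50.00"),
    ("managed-mongodb", "small"): Decimal("120.00"),
}

# Hypothetical flat fee the cloud foundation team charges per cloud tenant.
TENANT_FEE = Decimal("30.00")


def monthly_statement(cloud_costs, tenants, bookings):
    """Build statement lines for one project and return them with the total."""
    lines = [
        ("cloud usage", cloud_costs),
        ("tenant fee", TENANT_FEE * tenants),
    ]
    for service, plan in bookings:
        lines.append((f"{service} ({plan})", PLANS[(service, plan)]))
    total = sum((amount for _, amount in lines), Decimal("0"))
    return lines, total


lines, total = monthly_statement(
    cloud_costs=Decimal("200.50"),
    tenants=3,
    bookings=[("monitoring", "basic"), ("managed-mongodb", "small")],
)
for label, amount in lines:
    print(f"{label}: {amount}")
print(f"total: {total}")
```

Because service charges are just additional statement lines, a service owner can be credited with the same amounts that the consuming project is charged.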
Step #6 - Running IT Like a Business
This is the last step to fully integrated end-to-end cloud billing and cost management. With the service economy set up internally, it is now possible to offer and purchase cloud services to and from external organizations like start-ups or cooperation partners.
The service economy built in this step is based on the foundations you lay in step #0 to get proper isolation and to comply with regulatory and internal rules.
With a cloud ecosystem like this, you have everything you need to boost business success in the cloud-native way!