Financials dashboard in meshStack showing cloud spend per month, payment methods, costs per platform and open issues

Mastering Cloud Management: The 7 KPIs Every Enterprise Architect Should Track


Cloud computing has become an integral part of modern business, providing organizations with the ability to scale and adapt to changing demands. However, managing cloud usage can be a complex task, and it’s often up to a small team of enterprise architects and platform engineers to drive and enable cloud transformation.

For these teams – often referred to as cloud foundation teams or cloud centers of excellence – it’s important to keep an eye on certain key performance indicators (KPIs) to ensure the organization is getting the most out of its cloud investment. In this article, we’ll discuss seven KPIs that enterprise architects and platform engineers should focus on when managing their organization’s use of the cloud.

So if you are part of a Cloud Foundation Team, Cloud Competence Center or Cloud Center of Excellence, this post is for you:

  1. Cloud Spend per Month:

    This KPI measures the amount of money your organization spends on cloud services each month. Keeping an eye on this KPI can help you identify areas where you may be overspending and make adjustments to your cloud usage to reduce costs. If your organization, like most, has a multi-cloud approach, you know that collecting this data from each private and public cloud is tedious. With meshStack - our cloud foundation platform - you get a single view of cloud spend per month across all cloud platforms.
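Once billing exports from all platforms are normalized, the aggregation itself is simple. A minimal sketch, assuming line items have been normalized into records with a month, a platform, and a cost (the field names are illustrative, not a meshStack API):

```python
from collections import defaultdict

def monthly_spend(line_items):
    """Aggregate normalized billing line items into spend per (month, platform)."""
    totals = defaultdict(float)
    for item in line_items:
        totals[(item["month"], item["platform"])] += item["cost"]
    return dict(totals)

items = [
    {"month": "2023-01", "platform": "azure", "cost": 1200.0},
    {"month": "2023-01", "platform": "aws", "cost": 800.0},
    {"month": "2023-01", "platform": "azure", "cost": 300.0},
]
print(monthly_spend(items))  # azure: 1500.0 and aws: 800.0 for 2023-01
```

The hard part in practice is not this loop, but collecting and normalizing the billing data from every private and public cloud in the first place.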

  2. Cloud Adoption Ratio:

    This KPI measures the ratio of on-premise to cloud projects within your organization. Tracking this metric can help you understand how your organization is leveraging the cloud and identify areas where you may need to shift more resources to the cloud. This is especially important for organizations with tight cloud adoption targets.

    For example, if you find that the majority of your projects are still on premise, it indicates that your organization is not taking full advantage of the benefits of cloud computing, such as scalability and flexibility. It can also indicate that your organization is not well equipped to handle cloud projects, so you may need to invest in additional resources such as cloud-skilled staff and a cloud foundation platform, or work on your cloud strategy to guide your organization toward cloud adoption.

    On the other hand, if you find that the majority of your projects are in the cloud, this suggests that your organization has fully embraced cloud computing and is taking advantage of its benefits. However, it's important to monitor this KPI over time to ensure that your organization is not over-investing in cloud projects, which can lead to increased costs.

    The meshStack API allows you to collect all cloud usage data such as number of users and number of projects from all cloud platforms (private and public).
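Given such usage data, the ratio itself is a one-line calculation. A hedged sketch (the project records and their "location" field are assumptions for illustration):

```python
def cloud_adoption_ratio(projects):
    """Fraction of projects that run in the cloud rather than on-premise."""
    cloud = sum(1 for p in projects if p["location"] == "cloud")
    return cloud / len(projects)

projects = [
    {"location": "cloud"},
    {"location": "cloud"},
    {"location": "on-premise"},
    {"location": "cloud"},
]
print(cloud_adoption_ratio(projects))  # 0.75
```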

  3. Cloud Adoption Rate:

    This KPI is closely related to Cloud Adoption Ratio: It measures the rate at which your organization is adopting cloud services. Monitoring this metric helps you understand how quickly your organization is able to take advantage of cloud computing and identify areas where additional resources are required to accelerate adoption.

  4. Deployment Frequency:

    This KPI measures how often new applications and updates are deployed to the cloud. Monitoring this KPI can help you understand how quickly your organization is able to deploy new capabilities and identify areas where deployment times can be improved. It helps you understand the agility of software development in your organization and the benefits the cloud can provide.

    For example, a high deployment frequency indicates that your organization is able to quickly and efficiently deploy new applications and updates to the cloud. This is a sign of a well-organized and efficient development process, and can also indicate that your organization is able to take advantage of new technologies and capabilities as they become available.

    On the other hand, a low deployment frequency is often a sign that your organization is struggling to keep up with the demands of cloud deployment, which can be a symptom of problems in your development process, such as inadequate testing or a lack of automation.

    In addition to monitoring deployment frequency, it's also important to track the time it takes to deploy new applications and updates. This helps you identify bottlenecks in your deployment process and take steps to address them.
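As a minimal sketch of how this KPI can be computed, assuming you have a list of deployment dates (e.g. from your CI/CD tooling; the data shape is an assumption for illustration):

```python
from datetime import date

def deployments_per_week(deploy_dates, start, end):
    """Average number of deployments per week within the window [start, end]."""
    weeks = (end - start).days / 7
    in_window = [d for d in deploy_dates if start <= d <= end]
    return len(in_window) / weeks

dates = [date(2023, 1, 2), date(2023, 1, 9), date(2023, 1, 10), date(2023, 1, 23)]
print(deployments_per_week(dates, date(2023, 1, 1), date(2023, 1, 29)))  # 1.0
```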

  5. Security Vulnerabilities per Month:

    This KPI measures the number of security vulnerabilities identified and resolved each month. Monitoring this KPI allows you to understand the effectiveness of your organization's security measures and identify areas where additional security controls may be needed.

    For example, if you see a high number of vulnerabilities each month, it suggests that your organization's cloud environment is not properly secured or that there are gaps in your security protocols. This is a sign that your organization needs to invest in additional security measures, such as a cloud security posture management system, prevent misconfigurations through policies in your cloud landing zones, or provide employees with additional training on cloud security best practices.

    On the other hand, a low number of vulnerabilities each month indicates that your organization's cloud environment is relatively secure and that your security protocols are working as intended.

    It's important to note that vulnerabilities can come from a variety of sources, including outdated software, misconfigurations, and human error.

    Implementing cloud landing zones can positively impact the "vulnerabilities per month" KPI by providing a secure foundation for your cloud environment that is designed to minimize security risks.

    One of the key benefits of cloud landing zones is that they provide a consistent, repeatable, and automated way to provision and configure cloud resources. This reduces the likelihood of misconfigurations and human error, which are common sources of security vulnerabilities. Cloud landing zones also provide clear segregation of duties and access controls, helping to ensure that only authorized users have access to sensitive data and resources.

    With meshStack, you get policy violation notifications and reports, as well as an easy, integrated way to deploy cloud landing zones.

  6. Net Promoter Score:

    This KPI helps you understand how satisfied internal customers are with the cloud services provided by the organization. A high NPS indicates that internal customers are satisfied with the cloud services and are likely to recommend them to others, while a low NPS indicates that internal customers are not satisfied with the services and may look for alternatives. In this context, that means rejecting the cloud or creating shadow IT that undermines cost management, security and compliance efforts.

    To improve NPS, it's important to understand the factors that drive customer satisfaction or dissatisfaction. This can be done through customer feedback surveys, interviews or focus groups. Once you have identified the issues, you can take steps to address them, such as improving service quality, increasing transparency, or providing more training and support.

    With meshStack as the Cloud Foundation Platform, it is easy to address common issues: Cloud environments can be self-provisioned in minutes, application teams choose the cloud landing zone and cloud platforms they need for their project, the budget they've been allocated, and they're ready to go. They have native access to all the cloud services they need, while maintaining compliance in the cloud.

    It's also important to note that NPS should be monitored over time to track changes in customer satisfaction; this will help identify trends and take action if necessary.
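For reference, the NPS itself is computed from 0-10 survey ratings: the percentage of promoters (9-10) minus the percentage of detractors (0-6). A minimal sketch:

```python
def net_promoter_score(ratings):
    """NPS = % promoters (ratings 9-10) minus % detractors (ratings 0-6)."""
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return round(100 * (promoters - detractors) / len(ratings))

ratings = [10, 9, 9, 8, 7, 6, 3]  # 3 promoters, 2 detractors out of 7
print(net_promoter_score(ratings))  # 14
```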

  7. Time to Cloud:

    This KPI measures the time it takes from the decision to build an application to the provisioning of the required cloud resources. Monitoring this KPI can help you understand how quickly your organization is able to provision the resources needed for new projects and identify areas where the process can be streamlined. Self-service provisioning of cloud environments in meshStack has reduced our customers' time-to-cloud from days or weeks to minutes.

    A shorter time-to-cloud indicates that an organization is able to quickly and efficiently provision cloud resources and deploy applications, which can provide a competitive advantage. This can be achieved by automating the provisioning process, establishing clear governance and approval processes, and providing developers with easy access to cloud resources.

    A longer time-to-cloud, on the other hand, can indicate that an organization is struggling to provision cloud resources and deploy applications efficiently. This may be due to a lack of automation, unclear governance, or lack of access to cloud resources.

    To improve time to cloud, organizations should focus on automating the provisioning process, establishing clear governance and approval processes, and providing developers with easy access to cloud resources. This can help accelerate application deployment and improve overall efficiency.
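Because time-to-cloud is often skewed by a few outliers, the median is a robust way to track it. A hedged sketch, assuming each request records when the environment was requested and when it was provisioned (the record shape is illustrative):

```python
from datetime import datetime
from statistics import median

def median_time_to_cloud(requests):
    """Median hours between requesting and provisioning a cloud environment."""
    return median(
        (r["provisioned"] - r["requested"]).total_seconds() / 3600
        for r in requests
    )

requests = [
    {"requested": datetime(2023, 1, 2, 9), "provisioned": datetime(2023, 1, 2, 10)},
    {"requested": datetime(2023, 1, 3, 9), "provisioned": datetime(2023, 1, 5, 9)},
    {"requested": datetime(2023, 1, 4, 9), "provisioned": datetime(2023, 1, 4, 12)},
]
print(median_time_to_cloud(requests))  # 3.0 (hours)
```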

In summary, enterprise architects and platform engineers play a critical role in managing their organization’s cloud usage. By tracking these seven KPIs, they can ensure that their organization is getting the most out of its cloud investments, while also identifying areas for improvement. By monitoring cloud spend, cloud adoption ratio, cloud adoption rate, deployment frequency, security vulnerabilities, net promoter score, and time-to-cloud, organizations can make data-driven decisions to optimize their cloud usage and ensure they are getting the best value for their investment.


Many cables coming out of a rack

Automated Self-Service for Azure Networking Services

Summary: Many cloud and networking teams struggle to provide their organization’s application teams with the Azure networking services they need. Why is that? As organizations strive to move to the cloud, the high demand for networking services meets manual processes. The result is a real bottleneck that slows down cloud adoption. Read this article to learn how to offer Azure networking in fully automated self-service with a solution developed by meshcloud. Additionally, we offer a free demo.

The Problem: High demand for integrating applications into Azure meets manual network processes, creating a cloud adoption bottleneck

Organizations that move to the cloud, and to Azure specifically, encounter a high demand for integrating applications into their virtual networks. To offer on-prem or secure cloud connectivity, they might build a hub-and-spoke architecture and have to create the spokes manually every time. This is a real bottleneck for the organization's cloud transformation, not a scalable solution.

The solutions offered by Azure are complex and lack central transparency and control. Organizations need to be able to offer Azure connectivity in a scalable and secure way that allows central control.

Most organizations either don’t have an automated solution implemented at all or don’t know how to realize the architecture for Azure networking services and offer them to their application teams.

Often, network engineers use an ITSM tool or even email and phone to grant application teams the access to virtual networks they need. They set up networks manually for each request they get.

In a better scenario, they receive pull requests for a Terraform deployment from application teams and approve them to roll out the networking architecture.

The Solution: meshStack and UniPipe Service Broker let you provide Azure networking in fully automated self-service

With meshStack - meshcloud’s Cloud Foundation Platform - and UniPipe - meshcloud’s GitOps-based Service Broker solution - it is possible to offer Azure networking in fully automated self-service. The central cloud team and the network engineers can set up an Azure networking service broker in less than an hour. With this in place the application teams can book and use the integration of their application into the organization’s virtual network within minutes.

How it works

Central Cloud Team & Network engineers

  1. Create a Git repository that will be used for GitOps by UniPipe Service Broker. Data about all requested services and their status is stored in it.
  2. Add the Azure Networking Service Terraform Module provided by meshcloud to this repository. If needed you can modify the Terraform modules to your needs.
  3. Deploy and, with only a few steps, configure the UniPipe Service Broker and UniPipe Terraform Runner Docker containers.
    You can use Azure IPAM to automatically assign unique IP ranges.
    OR you can add a manual step to the Service processing in which a network operator assigns an IP range.
  4. Register the new Service Broker in meshStack, so its services appear in the marketplace.
  5. Add the service as a required service to your meshLandingZone, which will be applied after creating a new project.
  6. Enjoy your running and integrated Azure Networking Service

Application Team

  1. Create a new project in meshStack.

  2. During project creation you can pick a plan (On-Prem connectivity or Cloud-only connectivity), provide information about the vNet size and the target location in Azure.

  3. After project creation, the network is deployed into your Azure subscription and you can start using it within your application.

Conclusion: High demand needs automated supply to keep your cloud adoption goals in focus

meshcloud’s solution solves your scalability challenges when it comes to integrating applications into your organization’s Azure networking. Automating the provisioning of Azure networking services removes a common bottleneck and enables networking and application teams to reach their cloud adoption goals.


The Ultimate Resource for Building a Cloud Foundation

With cloudfoundation.org, meshcloud is launching documentation for all cloud and enterprise architects looking to establish a cloud foundation and take cloud adoption to the next level.

What is a Cloud Foundation?

One central piece of successful cloud transformation is the creation of a central cloud team: Cloud Foundation, Cloud Competence Center or Cloud Center of Excellence - many names, one core concept:

Centralizing cloud governance to enable more productive, more agile, and more innovative use of cloud.

The Cloud Foundation Maturity Model

Building cloud foundations is our daily business at meshcloud. With meshStack, we provide a technical platform for cloud foundation teams.

With our Cloud Foundation Maturity Model, we offer a framework for organizations to assess the state of their cloud adoption, validate their strategy, and plan a road map.

Your Go-To Resource for Building a Cloud Foundation

With our new website cloudfoundation.org, we make the Cloud Foundation Maturity Model available and open it up for debate to develop and refine it further.

Cloud Foundation Building Blocks on interactive Maturity Model
Explore the Cloud Foundation Maturity Model interactively on our website.

In over 50 actionable building blocks – covering Security and Compliance, IAM, Cost Management, Tenant Management, and the Service Ecosystem – our website describes the capabilities needed to mature a Cloud Foundation.


Cloud Tagging and Labeling on Azure, AWS and GCP - (Cheat Sheet 2022)

Are you looking for Azure tag requirements or AWS tagging documentation, or do you want to know how to use GCP labels? You have come to the right place!

In this post, we want to give you an overview of how different cloud platforms handle tags or labels.

You can see this as a cheat sheet to help you navigate the cloud tagging or labeling specifics of Azure, AWS, and GCP.

This post will go into detail on questions like:

  • "How many characters can a tag have in Azure?",
  • "How many tags can be assigned to one resource in GCP?" or
  • "What characters are not supported for tags in AWS?".

So let's dive right in!


Using Tags in Azure

Here's what you need to know to get started with tagging in Azure:

  • A resource does not inherit tags hierarchically from the respective resource group
  • You can assign up to 50 tags to a single resource. If you need more, there is a little trick: Creating tags with multiple values is a valid workaround.
  • The maximum key length in Azure is 512. For values it's 256.
  • Tag keys in Azure are case-insensitive; tag values are case-sensitive
  • Tag keys must not contain the characters: < > % & \ / ?

How tags work with AWS

These are the AWS tagging specifications you need to follow:

  • You can assign 50 tags per resource
  • Tag keys must be unique for each resource and can only have one value
  • The character limit for keys is 128, for values it's 256
  • You can use the allowed characters across all AWS services: letters, numbers, and spaces representable in UTF-8, and the following characters: + - = . _ : / @
  • EC2 allows for any character in its tags
  • Tag keys and values are case-sensitive
  • The aws: prefix is reserved for AWS use. If a tag has a tag key with this prefix, then you can't edit or delete the tag's key or value. Tags with the aws: prefix do not count against your tags per resource limit.


Tagging with GCP

First: Google calls its tags in GCP "labels" - they are still tags though.

Let's see what requirements and restrictions GCP labels have:

  • A resource can have up to 64 labels assigned to it
  • Both keys and values have a maximum length of 63 characters
  • Keys and values can contain: lowercase letters, numeric characters, underscores and hyphens
  • You are able to use international characters
  • Label keys must start with a lowercase letter
  • Label keys cannot be empty


Cloud Tagging At a Glance

| Constraint | Azure | AWS | GCP |
| --- | --- | --- | --- |
| Max. # of tags per resource | 50 | 50 | 64 |
| Max. tag name length | 512 | 128 | 63 |
| Max. tag value length | 256 | 256 | 63 |
| Case-sensitive? | keys: no, values: yes | yes | yes |
| Allowed characters | all except < > % & \ / ? | letters, numbers, spaces, and + - = . _ : / @ | lowercase letters, numbers, underscores, hyphens |
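These constraints lend themselves to automated checks. Here is a hedged sketch of a validator built only on the limits listed in this cheat sheet; it does not cover service-specific exceptions such as EC2's more permissive character rules:

```python
import re

# Length limits per platform, taken from the cheat sheet above.
RULES = {
    "azure": {"key_len": 512, "value_len": 256},
    "aws": {"key_len": 128, "value_len": 256},
    "gcp": {"key_len": 63, "value_len": 63},
}
AZURE_FORBIDDEN = set("<>%&\\?/")

def validate_tag(platform, key, value):
    """Return the list of cheat-sheet rules a single tag violates."""
    rules = RULES[platform]
    errors = []
    if len(key) > rules["key_len"]:
        errors.append(f"key longer than {rules['key_len']} characters")
    if len(value) > rules["value_len"]:
        errors.append(f"value longer than {rules['value_len']} characters")
    if platform == "azure" and set(key) & AZURE_FORBIDDEN:
        errors.append("key contains a forbidden character")
    if platform == "aws" and key.lower().startswith("aws:"):
        errors.append("the aws: prefix is reserved")
    if platform == "gcp" and not re.fullmatch(r"[a-z][a-z0-9_-]*", key):
        errors.append("key must start with a lowercase letter and use only "
                      "lowercase letters, digits, _ and -")
    return errors

print(validate_tag("gcp", "CostCenter", "team-a"))  # violates the lowercase rule
```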



meshcloud offers global management of tags in multi-cloud architectures and makes sure they comply with the specific requirements of each platform.

To learn more about the meshcloud platform, please get in touch with our sales team or book a demo with one of our product experts. We look forward to getting in touch with you.


aks kubernetes usage report

Kubernetes Chargeback (splitting the bill made easy!)

Kubernetes (K8s) has become an extremely popular technology, and with it, Kubernetes chargeback and cost management have become pressing challenges: A recent survey by the CNCF shows that there are 5.6 million Kubernetes developers today, a 67% increase compared to 12 months ago. 31% of all backend developers now use Kubernetes. This wide adoption of Kubernetes as a container orchestration technology makes challenges apparent, especially in larger organizations: Automated Kubernetes billing and cost management become more important.

What is Kubernetes exactly?

Kubernetes - aka K8s for short - is an orchestration technology for containers. You can use it for automating deployment, scaling, and management of containers. It works declaratively in a master and worker node setup. The master node checks the status and health of the worker nodes that run the actual containers. Everything is checked against a defined desired state, and K8s makes sure this desired state is reached and maintained.

K8s Challenges for Cloud Governance

Cost management for Kubernetes clusters is a pressing topic for many operations departments. Enabling cloud teams to work with the latest technology like Kubernetes is an important part of leveraging the cloud. But this has cloud governance implications: Cost management, security and compliance.

When you run Kubernetes in a scenario in which clusters are shared, you run into billing issues. An example would be running Azure’s Kubernetes Service (AKS) as a multi-tenant cluster: There is no standard way to split up the bill for the resource consumption of the cluster.

Onboarding internal customers and giving them a way to manage their Kubernetes namespaces, quotas and access rights is another cloud governance issue you might run into.

That’s quite a lot to tackle in terms of cloud management.

So here’s a short overview:

  • Kubernetes cost management
  • Security and compliance of shared K8s clusters
  • Kubernetes namespace management
  • Quota management
  • Managing access rights

Splitting the K8s Bill with meshcloud

With the cloud foundation platform meshStack, meshcloud makes it easy to split the bill from shared K8s clusters. Many of our customers use cloud-native K8s services like Azure’s Kubernetes Service (AKS): They get one big bill from Azure every month and have not been able to fairly split the bill and charge their application teams based on actual usage. To do this, you can define a pricing catalog for AKS resource usage (based, for example, on Persistent Volume Claims or the number of created Pods) that is used to calculate per-month pricing for individual customers.
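The idea behind such a pricing catalog can be sketched as follows (the metric names and prices are made up for illustration, not meshStack's actual catalog format): the priced usage determines each team's weight, and the shared cluster bill is split proportionally.

```python
def split_bill(total_bill, usage, catalog):
    """Split a shared cluster bill across teams in proportion to priced usage.

    usage:   {team: {metric: amount}}
    catalog: {metric: price per unit}
    """
    weights = {
        team: sum(catalog[metric] * amount for metric, amount in metrics.items())
        for team, metrics in usage.items()
    }
    total_weight = sum(weights.values())
    return {team: round(total_bill * w / total_weight, 2)
            for team, w in weights.items()}

catalog = {"pvc_gb_hours": 0.01, "pods": 0.5}
usage = {
    "team-a": {"pvc_gb_hours": 1000, "pods": 20},  # weight 20.0
    "team-b": {"pvc_gb_hours": 2000, "pods": 40},  # weight 40.0
}
print(split_bill(900.0, usage, catalog))  # {'team-a': 300.0, 'team-b': 600.0}
```

Splitting proportionally to a weight (rather than charging the catalog prices directly) guarantees that the per-team charges always add up to the actual cloud bill.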


meshcloud makes accurate Kubernetes metering and billing possible and much more:

  • Managing multiple projects,
  • authorizing users on projects,
  • charging the cost for projects to different teams, and
  • setting usage quotas for projects.

Application developers want to (and should) develop applications

Application teams should do what they’re good at: Develop new applications and innovate. Administrative tasks often hold them back. DevOps teams want a managed K8s service where they don't have to set up the cluster themselves. They just want to develop applications. Typically we see OpenShift used for this type of scenario - OpenShift metering and management is also fully supported by meshcloud.

With meshcloud it is possible to reduce the administrative effort even further. An OpenShift or AKS cluster can easily be shared across multiple customers for more effective resource usage. Project namespaces and access rights are set up automatically, so your teams are able to start deploying pods within minutes.


Log4Shell: meshcloud NOT affected by Log4J Vulnerability

Researchers have found a critical zero-day vulnerability in the Apache Log4J library. Our solution meshStack is NOT affected. Our engineers checked the meshStack services and dependencies and confirmed that our solution does not include affected Log4J modules.

What is the Problem with Log4J?

Apache Log4J is a widely used library for logging errors.

The recently discovered vulnerability CVE-2021-44228 - dubbed Log4Shell or LogJam - was given a CVSS severity level of 10 out of 10. It's a vulnerability that seems to be easy to exploit and enables attackers to execute code remotely.

The German Federal Office for Information Security (BSI) warns of the Log4J vulnerability with the highest warning level "4/Red". It goes on to say that "the extent of the threat situation cannot currently be conclusively determined." The BSI also warns that the vulnerability is already being actively exploited, for example for cryptomining or other malware.

When are you affected by Log4Shell?

The first important point is that the vulnerability is located in the log4j-core module. Other libraries like Spring Boot only use log4j-api by default and are therefore not affected by this vulnerability (see Spring Boot's blog post on it).

The vulnerability can only be exploited if your application logs input provided by users in any way (e.g. via an API or a UI). Attackers can craft messages that lead Log4J to execute remote code, which can be used to access your server and run malware on it. As long as neither you nor the libraries you use log any user-provided input, you should not be affected by the vulnerability. But especially when other libraries are involved, it can be hard to judge whether you are actually affected. So if you use Log4J for logging, you should follow the recommendations below, regardless of whether you think user-provided messages are logged.

Current Recommendations

The Apache Foundation recommends updating the library to version 2.15.0 or, if that is not possible, following the instructions on their Log4J security vulnerabilities page to mitigate the risk.
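As a starting point for an inventory of your own deployments, here is a hedged sketch that flags vulnerable log4j-core jars by filename. Note that this heuristic misses shaded or repackaged jars, so treat it as a first pass only, not a substitute for a proper dependency audit:

```python
import re
from pathlib import Path

def find_vulnerable_log4j(root):
    """List log4j-core jars below version 2.15.0 found under root.

    Purely a filename heuristic: it inspects jar names, not their contents.
    """
    hits = []
    for jar in Path(root).rglob("log4j-core-*.jar"):
        match = re.search(r"log4j-core-(\d+)\.(\d+)\.(\d+)", jar.name)
        if match and tuple(map(int, match.groups())) < (2, 15, 0):
            hits.append(str(jar))
    return hits
```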


meshcloud is ISO 27001 certified

We at meshcloud are very pleased to announce that our organization is now ISO 27001 certified.

Our customers use meshcloud's cloud governance solution to manage thousands of their projects in the public and private cloud: Companies like Volkswagen, EnBW and Commerzbank trust meshcloud to provide a solid and secure platform for their cloud governance needs.

It's been very important to us that we can back up this trust with an ISO 27001 certification of our Information Security Management System.

What is ISO 27001?

The international ISO 27001 standard provides requirements and describes best practices for an Information Security Management System (ISMS). It is the only standard that sets out the specifications for an ISMS and an accredited certification demonstrates that an organization has identified its risks and has eliminated them or implemented appropriate countermeasures.

meshcloud's ISO 27001 certification

For us at meshcloud, it has always been of the highest importance to live up to the trust placed in us by our customers.

The ISO certification was the next logical step for us as a company. We partnered up with ISO experts and aligned our operations with the high standards of ISO 27001:

  • Developing internal ISMS competence
  • Conducting risk assessment
  • Implementing required controls
  • Creating thorough ISMS documentation
  • Training staff regularly

The audit of our ISMS included:

  • Inspection of documents
  • Observation and on-site inspection
  • Interviews with staff members

ISO 27001 requires re-certification checks every year.


To learn more about the meshcloud platform, please get in touch with our sales team or book a demo with one of our product experts. We look forward to getting in touch with you.


Multi-Cloud Security and Compliance: The Comprehensive Guide 2021

This is a comprehensive guide to cloud security and compliance in 2021.

If you work as a CISO, Enterprise Architect, Cloud (Security) Architect, in DevOps or you are just interested in cloud security, this guide is for you.

In this new guide you will learn:

  • What cloud security and compliance is
  • Who the stakeholders are – inside and outside of your organization
  • What threats jeopardize your cloud environment
  • How to address security challenges on an organizational and technical level
  • How to create a security concept that combines organizational and technical security aspects
  • How to improve the interface between central IT and compliance, governance and regulatory departments

If you want to dive deeper, this guide will go into more detail on:

  • The concept of shared responsibility
  • IAM and tenant isolation
  • Shadow IT
  • Best practices on landing zone configuration
  • Security specifics of AWS, GCP and Azure
  • Certifications like ISO 27001 or CSA

 

Chapter 1: Cloud Security and Compliance Fundamentals

Let’s get started with a chapter on the basics of cloud security and compliance.

Specifically, in this chapter we are going to cover why cloud security and compliance are extremely important!

This first chapter will also show what cloud security and compliance is.

Let’s dive right in!

What is Cloud Security?

Cloud security is a strategy and concept to protect cloud computing environments, applications and data. Important elements of cloud security are organizational and technical security, as well as an overarching strategy. Although it might never be possible to prevent every variety of attack – a well-designed cloud security strategy vastly reduces the risks to data, infrastructure and applications.

The CIA and Zero Trust in Information Security

In our case CIA has nothing to do with THE CIA - it stands for the three key components of information security: Confidentiality, integrity and availability.

Cloud security is no exception - every security strategy should aim to achieve these goals.

Arguably the most relevant goal in current security considerations is integrity.

Today's enterprise IT architectures have become so complex that traditional perimeter security thinking won’t get you very far anymore. It is very probable that your organization has no clear picture of who is doing what within its enterprise IT.

That is a great threat to the integrity of data, applications and resources.

The modern approach to information and cloud security is the zero trust approach: Assume that your firewalls have been breached, assume that the enterprise network is just as hostile as the internet - and construct your security strategy around that.

Here are the three guiding principles to the zero trust approach:

  1. Verify explicitly
  2. Use least privileged access
  3. Assume breach

Why is cloud security so important?

You can have the best applications and the most satisfied customers.

But what if sensitive data is exposed – due to misconfiguration?

Exposing sensitive customer data greatly reduces your customers' trust and may result in churn.

What if an attacker gains access to your infrastructure? Imagine an airline that loses control of the airline app backend. The resulting downtime leaves customers unable to book flights and passengers unable to check in.

Competitors that have access to your company secrets can pose an existential threat!

Those scenarios will cost reputation and in the end a lot of money:

According to a report from IBM and the Ponemon Institute, the average cost of a data breach is just short of 4 million dollars.

High profile cases of cloud security incidents include Facebook, Capital One and Docker Hub.

What are the main cloud security risks?

Most cloud security incidents can be mapped to one of these 4 categories:

  • Data is exposed or leaked due to a lack of protection
  • An unauthorized user from outside the organization has access to internal data
  • An internal, authorized user has too much access to internal data
  • A malicious attack incapacitates cloud infrastructure

Cloud security measures aim at reducing the risks posed by these threats by protecting data, managing user authentication and access, and keeping resources and services running.

Infobox: Cloud computing vs. on-premise computing

"We wish for the same security measures and capabilities on-prem as they are available in the public clouds"

We hear this a lot when talking to our customers.

Cloud security risks can seem daunting. But they overlap a lot with traditional IT security risks. Cloud computing is often more secure than on-premises computing. Most cloud providers have more resources to implement comprehensive security concepts than individual businesses do, which lets cloud providers like Amazon, Microsoft or Google keep infrastructure up to date and patch vulnerabilities as soon as possible. For example, 3,500 security experts work on keeping Microsoft's Azure Service secure. Even the CIA decided to go all-in on the Cloud with a private AWS account.

How can you improve the security of your clouds?

Achieving a higher level of security in your clouds takes a combination of organizational and technical measures.

To get an overview of the current state of your cloud governance capabilities, have a look at our cloud governance assessment.

Let’s have a look at what that means and what you need to take into account:

  • Shared responsibility
  • Stakeholders in your organization
  • Internal regulations
  • Identity and Access Management
  • Tenant Isolation
  • Shadow IT
  • Encryption at rest and in transit
  • Credential management
  • Network architecture
  • Monitoring
  • Audit logging
  • Regional restrictions
  • Landing zones
  • Transparency
  • Certification
  • Exit Strategy

And that is likely not an exhaustive list.

Luckily this guide covers all of these aspects and a few more!

Chapter 2: Shared responsibility and stakeholders

First of all, it is important to know who is responsible for which aspect of cloud security:

Is it the cloud provider, the cloud foundation or DevOps team in your company, or – most importantly – you? Clarifying ownership throughout your organization and enabling the departments and teams concerned with cloud security is a vital step in implementing a cloud security concept and strategy.

This chapter will shed light on responsibilities and relevant stakeholders.

Let’s get to know the people concerned with your cloud security!

What is behind the concept of shared responsibility?

A very basic concern in cloud security is preventing unauthorized physical access to the servers. Does that mean you have to lock server room doors at Amazon, Microsoft or Google? Of course not: The responsibility for all the different aspects of security is shared. You don’t have to worry about locked doors or even patching the underlying software of the clouds you use.

Example: Pooled Audit

However, there is another side to the coin. A lot of companies, especially in regulated environments such as the financial services industry, are required to ensure unrestricted audit rights when they outsource material IT workloads. This means they would need access to the service provider’s premises when conducting an audit – a showstopper for a lot of organizations within the European Union. In 2017, pioneering companies like Deutsche Börse initiated a new concept called the “Collaborative Cloud Audit Group”, which enables financial institutions to group together to perform such audits, reducing the effort for both sides, the cloud providers and the financial institutions.

In turn, that means there are other areas for which the cloud providers don’t take responsibility. For example:

  • How you set up your organizational structure, e.g. IAM processes
  • How you configure and secure your infrastructure, e.g. network setup, firewalls, load balancers
  • How you secure your applications  

A general rule is that the cloud provider is responsible for the security of the cloud, while you as their customer are responsible for the security in the cloud. However, there are other aspects that affect the shared responsibility model, e.g. the service model you are using (IaaS/PaaS/SaaS).

Here is an example: If you use a SaaS service like Google Docs, you don't have to take care of the VM images or databases this application is running on. If you deploy your own MySQL cluster to AWS EC2 instances, however, you are responsible for encrypting the connection via SSL or TLS.

Who are the stakeholders in the cloud security business?

Put simply, the providers are responsible for the security “of” the cloud – whereas you and your organization, as their customer, are responsible for the security “in” the cloud.

Let’s have a look at what that means in a little more detail, as you’ll have to figure out how to distribute responsibilities within your organization as well:

Provider: The provider usually makes sure that your infrastructure built within its platform is secure and reliable. This responsibility includes physical security of hosts, network and datacenter as well as the software security of storage, databases and networking.

CISO: The CISO cares about the information and data security of a whole organization. That includes - but is not limited to - the cloud. The CISO takes executive responsibility for security operations, data loss and fraud prevention, identity and access management and the overall security architecture.

SOCs: A SOC (Security Operations Center) is a central organizational unit responsible for protecting IT infrastructure across the organization. Monitoring IT systems, identifying security risks and responding to security incidents when they occur are part of the team’s core responsibilities.

DevOps: DevOps teams are responsible for secure engineering, secure deployment and operations, availability management, backups, separation of duties and security evaluations.

Cloud Foundation: Most large organizations aim to transform their IT environment towards cloud-native technologies to achieve more agility in their software delivery. They often set up a dedicated team to manage cloud infrastructure and provide secure cloud environments to DevOps teams across the organization. These teams are often called "Cloud Center of Excellence" or "Cloud Foundation" as they lay the foundation for the use of cloud infrastructure. This foundation may cover security and compliance aspects and relieves DevOps teams from security or compliance requirements that are independent of the application, e.g. the geographic restriction of cloud data centers.

Chapter 3: Cloud Compliance

This guide has been talking about cloud security a lot without touching on the important topic of cloud compliance so far. The second your organization decides to move data and applications to the cloud, compliance becomes a huge issue!

Time to take a look at what you need to know about compliance and regulations when it comes to the cloud.

What is cloud compliance?

The term cloud compliance refers to the requirement that cloud-delivered systems be in line with internal and external regulations. A common example is the European General Data Protection Regulation (GDPR), which concerns virtually every organization. But there are also very specific regulations, for example in the financial or healthcare sector, that companies need to comply with. This compliance must be transparent and auditable for regulators.

Which rules are there for cloud compliance?

The first step towards achieving cloud compliance is to be aware of the standards and regulations that apply within your industry and specifically within your organization. Standards and regulations may apply to certain:

  • Industries
  • Geographies

Depending on your industry, there may be different regulations in place. Here are some examples; this list is not exhaustive:

  • KRITIS for critical infrastructure (national)
  • BAIT and VAIT (Supervisory Requirements for IT in Financial Institutions/Insurance Undertakings)  (national by the BaFin)
  • ISO/IEC 2700x (international, by the International Organization for Standardization and the International Electrotechnical Commission)

What happens if you fail to be compliant?

Organizations are held responsible for meeting a large variety of regulations. Failing to do so can result in fines and a negative impact on your trustworthiness and reputation.

A widely discussed example is the GDPR, which focuses on data protection and privacy. 

Even if your organization is based outside the EU – as long as you do business in the EU, GDPR compliance is required. The possible fines are enormous: according to enforcementtracker.com, the highest fines were due to insufficient technical and organizational measures to ensure information security. Marriott International was fined more than 110 million euros and British Airways almost twice that amount for information security violations.

Cloud security and cloud compliance is a shared responsibility between cloud providers and organizations. In the past years, cloud providers have invested a lot in providing better transparency and better tools to be an eligible infrastructure provider, even for very sensitive industries or governments.

How do you stay compliant when working with the cloud?

A great challenge to staying compliant is the ever-changing environment and requirements of internal and external regulations. That brings us to our list of 4 aspects you need to take into account:

  1. Be aware of regulations and guidelines
  2. Control access
  3. Classify the data and document where it resides
  4. Encrypt the data you are entrusted with

But there is a lot more to ensuring continuous compliance - especially across multiple environments and vendors: With a declarative approach you can fully utilize the advantages the cloud offers by automating the efforts of enforcing your compliance policies across your multi-cloud environments.
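Such automated policy enforcement can be sketched in a few lines. The example below runs a declarative compliance check over tenant metadata; the field names, the rule set and the EU-only region list are illustrative assumptions, not a specific meshcloud or cloud-provider API:

```python
# Minimal sketch of an automated compliance check over tenant metadata.
# Rules and region list are assumptions for illustration.
ALLOWED_REGIONS = {"eu-central-1", "eu-west-1"}  # assumed EU-only policy

def check_tenant(tenant: dict) -> list[str]:
    """Return a list of compliance violations for one tenant."""
    violations = []
    if tenant.get("data_classification") == "confidential" and not tenant.get("encrypted_at_rest"):
        violations.append("confidential data must be encrypted at rest")
    if tenant.get("region") not in ALLOWED_REGIONS:
        violations.append(f"region {tenant.get('region')} outside allowed geographies")
    return violations

tenant = {"name": "billing-prod", "region": "us-east-1",
          "data_classification": "confidential", "encrypted_at_rest": False}
print(check_tenant(tenant))
```

Running checks like this continuously against every tenant turns the policy list into something auditors can verify, rather than a document that drifts out of date.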

Chapter 4: Organizational Security in Cloud Computing

Let's talk about organizational security.

This chapter will tell you what organizational security is. It will also show why it is an important part of any cloud security strategy. And if you want to learn about what aspects you should definitely consider for your organization - you came to the right place!

Let's go!

What is organizational security?

Organizational security is everything you do on an organizational (as opposed to technical) level to improve the security of your cloud.

How can you improve the organizational security level?

In this case, you don't have to evaluate who is responsible for organizational security measures depending on the cloud operating model: organizational security is for everyone!

(Sidenote: Clear responsibilities are important! That way, an organization stays responsive during a security incident.)

  • Implementing the principle of least privilege
  • Isolating tenants in a multi-tenant environment
  • Practicing the four-eyes principle
  • Fighting shadow IT
  • Preventing accidental misconfiguration
  • Classifying data

What is the principle of least privilege?

The principle of least privilege (PoLP) is the concept of granting access to only the resources that are absolutely necessary to do the assigned tasks. It is pretty similar to what you might know from movies about secret agents: they only know what is necessary to accomplish the mission. That way, they can't endanger the whole operation in case of failure.

But we're not the CIA or MI5 so let's see what the principle of least privilege means in terms of cloud security!

4 Tips to implement the principle of least privilege

From developer onboarding to long-term management of user and permission lifecycles, managing access to cloud infrastructure is complex and security-critical. Authorizations should be granted as sparingly as possible (principle of least privilege) in order to reduce security risks. At the same time, the productivity of developers should not be restricted by lacking access rights or tedious approval processes. A simple and transparent process for assigning access rights is therefore essential.

  1. Avoid excessive use of broad primitive roles
  2. Assign roles to groups, not individuals
  3. Reduce risk and control access to your project by using networking features
  4. Consider using managed platforms and services
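To make tip 1 concrete, here is what a narrowly scoped policy can look like. The sketch builds an AWS IAM policy document that grants read-only access to a single S3 bucket instead of a broad primitive role; the bucket name is a made-up example:

```python
import json

# Least-privilege sketch: read-only access to one (hypothetical) S3 bucket,
# instead of a broad primitive role like AdministratorAccess.
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": ["s3:GetObject", "s3:ListBucket"],
        "Resource": [
            "arn:aws:s3:::example-app-data",     # bucket itself (for ListBucket)
            "arn:aws:s3:::example-app-data/*",   # objects inside it (for GetObject)
        ],
    }],
}
print(json.dumps(policy, indent=2))
```

Attached to a group rather than individual users (tip 2), a policy like this keeps access rights both minimal and manageable.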

Quarantine your applications and environments

A relatively easy but very important step in securing your cloud workloads is tenant isolation.

By that we mean 2 things:

  1. Every application needs to run in its own tenant.
  2. Different development environments – e.g. development, staging and production – of each of your applications should run within their own tenants.

A common setup among our customers is three cloud tenants for every application moved to the cloud: development, staging and production. Depending on the use case, the number of stages ranges from 2 up to 6.
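This tenant structure is easy to make systematic. The sketch below derives one tenant identifier per application and stage; the naming scheme is just an example, not a meshcloud convention:

```python
def tenant_ids(app: str, stages=("dev", "staging", "prod")) -> list[str]:
    """One isolated tenant per application and stage (example naming scheme)."""
    return [f"{app}-{stage}" for stage in stages]

# Every new application gets its own set of isolated tenants.
print(tenant_ids("webshop"))
```

A consistent scheme like this also makes it obvious in cost reports and audit logs which application and stage a resource belongs to.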

Let's have a look at a company that did not follow this principle – Tesla:

In 2018, security experts discovered that hackers had gained access to Tesla's cloud resources – not to steal company secrets, but to use them to mine cryptocurrencies. All of that happened on a tenant that was used for several applications and environments, which made it relatively easy for the attackers to hide their malware in the general activity on this tenant. It’s much harder to identify deviations from regular activity within a shared environment. One team responsible for a single app would probably have identified high expenses or unknown resources much earlier.

The dark truth about Shadow IT

According to a survey by the Cloud Security Alliance, only 8% of CIOs believe they know about the secret digital infrastructure in their company. This shadow IT – or stealth IT, as it is more aptly called – hides from the radar of IT managers. Eco, the Association for the Internet Economy, asked 580 experts from German medium-sized companies for its IT security report. The result is clear and worrying: three quarters of those surveyed assume that shadow IT exists in their company, and nearly 25% fear a "considerable extent".

Even without an actual security breach, shadow IT costs your organization money right now. It is uncontrolled and may contain unnecessary workloads, and organization-wide discounts from cloud providers, e.g. for reserved instances, are not taken into account. There are examples of cloud costs being billed as travel expenses on company credit cards, making cloud cost controlling nearly impossible.

Chapter 5: Technical Security in Cloud Computing

Now that we have covered the ins and outs of organizational security, let's turn to its even more powerful brother: Technical security!

This chapter will cover what technical security is when it comes to keeping your cloud secure.

You will learn about important aspects of technical security, like encryption at rest, credential management and audit logging.

What is technical security?

In the realm of IT – or in our case more specifically – cloud security, the term technical security refers to technical actions that can be taken to implement and enforce security measures.

How can you improve your technical security level?

Here's a list of aspects and concepts you definitely have to take into account when thinking about technical cloud security:

Once again, make sure you understand who is responsible for each aspect. The cloud service provider? If not, who in your organization is it?

  • Encryption
  • Credential management
  • Network architecture
  • Monitoring
  • Audit logging
  • Regional restrictions

Even if you make use of cloud-native service offerings, you have to evaluate whether the service is compliant with your internal and external regulations. To give you an example: with AWS Key Management Service (KMS), AWS offers a managed service to generate encryption keys. Some organizations, however, are required to have exclusive control of their HSM (hardware security module, the device that manages digital keys and performs encryption and decryption functions), which the KMS service doesn’t fulfill. To address this need, AWS launched additional services for this specific use case.

Encryption at rest: Protecting data, even if it has been stolen

Simply put, data encryption is the process of translating one form of data into another form that unauthorized users can’t decrypt. For example, say you saved a copy of a paid invoice on your server with a customer’s credit card information. You definitely don’t want that to fall into the wrong hands. By encrypting data at rest, you’re essentially converting your customer’s sensitive data into another form of data. This usually happens through an algorithm that makes it practically impossible for somebody without the encryption key to decode it. Only authorized personnel will have access to these files, thus ensuring that your data stays secure.

Encryption in transit: Get the armored vehicle for your data

Data that is being moved from one place to another is vulnerable to attackers. Unencrypted data transfer puts this data at risk. To protect it against eavesdropping or a man-in-the-middle attack, you need to enforce your defined encryption requirements. It’s also a good idea to authenticate network communications using Transport Layer Security (TLS) or IPsec.
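On the application side, enforcing encryption in transit can be as simple as configuring a strict TLS context. The Python standard-library sketch below requires TLS 1.2 or newer, with certificate and hostname verification enabled:

```python
import ssl

# Default contexts already verify certificates and hostnames;
# additionally refuse legacy protocol versions below TLS 1.2.
ctx = ssl.create_default_context()
ctx.minimum_version = ssl.TLSVersion.TLSv1_2

print(ctx.verify_mode == ssl.CERT_REQUIRED)  # certificate validation is on
print(ctx.check_hostname)                    # hostname verification is on
```

Any socket wrapped with this context will refuse plaintext or outdated TLS connections instead of silently falling back.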

Credential management

Credential management is key to securing any kind of system – from the traffic on our roads to the traffic in our data centers. The management of driver’s licenses and IT credentials has a lot in common. Both are:

  1. Generated,
  2. stored,
  3. backed up,
  4. used,
  5. audited,
  6. changed,
  7. and eventually revoked and deleted.

Those 7 aspects need to be taken into account when managing credentials. That’s about as far as our little metaphor will get us. Let’s dive into multi-cloud credential management:

When we say credentials, we mean passwords, tokens or keys that grant access to your workload. Manage credentials and authentication mechanisms in a way that reduces the risk of accidental or malicious use.

Here are a few tips, tricks and best practices:

  • Define IAM configurations to meet your organizational, legal, and compliance requirements.
  • Integrate with the centralized federation provider of your organization to reduce complexity. In that way, all users are authenticated in a centralized place.
  • Enforce password requirements to protect against password attacks like rainbow tables or brute force.
  • Enforce multi-factor authentication (MFA) to provide an additional layer of access control.
  • Lock physical credentials away. That includes hardware MFA tokens.
  • Rotate credentials regularly to avoid unauthorized use of old credentials.
  • Audit credentials from time to time.
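To make the rotation tip concrete, here is a small sketch of credential generation and an age-based rotation check. The 90-day maximum age is an assumed policy, and a real setup would keep credentials in a secrets manager rather than an in-memory dict:

```python
from datetime import datetime, timedelta, timezone
import secrets

MAX_AGE = timedelta(days=90)  # assumed rotation policy

def new_credential() -> dict:
    """Generate a random token and record its creation time."""
    return {"token": secrets.token_urlsafe(32),
            "created_at": datetime.now(timezone.utc)}

def needs_rotation(cred: dict) -> bool:
    """Flag credentials older than the rotation policy allows."""
    return datetime.now(timezone.utc) - cred["created_at"] > MAX_AGE

cred = new_credential()
print(needs_rotation(cred))            # freshly generated
cred["created_at"] -= timedelta(days=120)
print(needs_rotation(cred))            # now older than the 90-day policy
```

A periodic job running such a check across all stored credentials covers both the "rotate regularly" and the "audit from time to time" tips.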

Network architecture

Most organizations have a hybrid infrastructure, with parts in the cloud and parts on-premises. Sensitive data is often kept on-prem, for security reasons, but also because large amounts of data in the cloud lead to an immense lock-in effect, as it is very easy to get the data in there, but costly to take it out again. That's why a lot of applications running in the cloud need access to on-prem infrastructure.

All 3 public cloud providers offer Virtual Private Clouds (VPCs) that enable you to build virtual network topologies that you can fully control.

Here are some VPC Best-Practices to improve network security:

  • Use multiple availability zones for high availability
  • Use public subnets for external-facing resources and private subnets for internal resources
  • Use ACLs (access control lists) to limit the traffic between components to the minimum
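Network ACLs are evaluated rule by rule, first match wins, with an implicit default deny. The sketch below mimics that evaluation logic on an illustrative rule set (the rules themselves are made up):

```python
import ipaddress

# Illustrative first-match ACL: allow HTTPS from one subnet, deny SSH broadly.
ACL = [
    {"port": 443, "source": "10.0.1.0/24", "action": "allow"},
    {"port": 22,  "source": "10.0.0.0/16", "action": "deny"},
]

def evaluate(port: int, source_ip: str) -> str:
    """Return the action of the first matching rule; deny by default."""
    for rule in ACL:
        if port == rule["port"] and ipaddress.ip_address(source_ip) in ipaddress.ip_network(rule["source"]):
            return rule["action"]
    return "deny"  # implicit default: deny everything not explicitly allowed

print(evaluate(443, "10.0.1.17"))  # matches the allow rule
print(evaluate(22, "10.0.5.9"))    # matches the deny rule
```

The default-deny fallback is the important design choice: traffic that nobody thought about is blocked, not silently permitted.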

It can make sense to hand out new cloud tenants with standard network components already deployed (via Landing Zones) to relieve DevOps teams and avoid insecure configurations.

4 Tips for monitoring cloud security

Cloud Monitoring is another critical aspect of keeping your workloads secure. Correct monitoring will tell you if your cloud infrastructure functions as intended while minimizing the risk of data breaches.

To do that there are a few guidelines to follow:

  • Your monitoring tools need to be scalable to your growing cloud infrastructure and data volumes
  • Aim for constant and instant monitoring of new or modified components
  • Don't rely on what you get from your cloud service provider alone - you need transparency in every layer of your cloud infrastructure
  • Make sure you get enough context with your monitoring alerts to help you understand what is going on

You can and should monitor on different layers (e.g. network, application performance) and there are different tools for doing this. SIEM (Security Information and Event Management) tools collect data from various sources. They process this data to identify and report on security-related incidents and send out alerts whenever a potential risk has been identified.

Audit Logging

With audit logging you document changes applied to your cloud tenants: Has a new user been added to your AWS account? Were access rights granted in your Azure subscription? Or who logged in when and for how long into your Google Cloud Project?

Audit logs are an absolute necessity when it comes to cloud security and compliance! And there are three main reasons why:

  1. Compliance auditing: Audit logs are official records that can be used to prove compliance to an auditor.
  2. Security analysis: Audit logs let you trace malicious behaviour and potential attacks.
  3. Operational troubleshooting: Audit logs help you find what is wrong with your tenants.

Closing remarks on this topic: disk space is cheap, so there is absolutely no reason not to keep audit logs.
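For security analysis, audit logs are typically filtered for sensitive operations. The sketch below filters a simplified, CloudTrail-style event list (not the full CloudTrail schema) for IAM changes:

```python
import json

# Simplified audit events; real CloudTrail records carry many more fields.
log = json.loads("""[
  {"eventName": "CreateUser",       "userIdentity": "admin", "eventTime": "2023-01-05T10:00:00Z"},
  {"eventName": "GetObject",        "userIdentity": "app",   "eventTime": "2023-01-05T10:01:00Z"},
  {"eventName": "AttachUserPolicy", "userIdentity": "admin", "eventTime": "2023-01-05T10:02:00Z"}
]""")

# Keep only events that change identities or permissions.
iam_changes = [e for e in log if e["eventName"] in {"CreateUser", "AttachUserPolicy"}]
for e in iam_changes:
    print(e["eventTime"], e["userIdentity"], e["eventName"])
```

The same filtering idea, applied continuously by a SIEM, is what turns raw audit logs into actionable security alerts.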

Infobox: The most common audit logging services

At AWS user activity and API usage can be tracked with AWS CloudTrail. CloudTrail stores event logs in the CloudTrail console, Amazon S3 buckets and (optionally) in Amazon CloudWatch logs.

Azure provides a whole range of logging, auditing and monitoring tools. Audit logs can be retrieved from the Azure Active Directory portal.

Google Cloud Platform offers Cloud Audit Logs, which maintains three audit logs for each project, folder and organization.

Regional Restrictions

Regional restrictions in the context of this guide are mainly a compliance concern that can be addressed technically. Many European companies want to, or need to, make sure that their data is stored on servers within EU jurisdiction. The same goes for managed services: they often need to be delivered from a certain geography.

With AWS, for example, you can disable entire regions (though not all of them) and set up IAM policies that restrict access to certain geographies.
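Such a geographic restriction can be expressed as an IAM policy using the aws:RequestedRegion condition key. The sketch below denies all actions outside two EU regions; the region list is an example, not a recommendation:

```python
import json

# Sketch of an IAM policy enforcing a regional restriction.
# The allowed-region list is an illustrative assumption.
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "DenyOutsideEU",
        "Effect": "Deny",
        "Action": "*",
        "Resource": "*",
        "Condition": {
            "StringNotEquals": {"aws:RequestedRegion": ["eu-central-1", "eu-west-1"]}
        },
    }],
}
print(json.dumps(policy, indent=2))
```

Attached at the organization or account level, a deny policy like this turns the compliance requirement into a technical guarantee rather than a guideline.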

Chapter 6: The combined approach to cloud security

Achieving security in all aspects of cloud computing is a multilayered quest for every organization. Single measures of organizational and technical security are important but not enough.

This chapter will discuss the overarching organizational and technical security measures you need to know about and implement within your organization.

Learn about metadata for applications, the danger of configuration drift and how landing zones greatly improve the security and compliance of your tenants.

Metadata

Maintaining organizational metadata – or context information – for applications, such as application IDs, cost centers or security contacts, is a way to establish a connection between the organization and the actual implementation of an application. It also integrates cloud infrastructure with the surrounding IT landscape, including CMDBs (configuration management databases), SIEMs and accounting systems like SAP.

Let’s have a look at two examples:

  • Providing a cost center when creating a new cloud tenant enables cloud foundation teams or management systems like meshcloud to map the occurring costs of an application to the corresponding department.
  • SOCs scan infrastructure for vulnerabilities. If a vulnerability is detected, it is essential to know in which application the affected infrastructure is used and who is responsible for its security – and therefore for fixing the breach.

Declarative model vs. Workflow-centric approaches

We’ve almost reached the end, and by now it’s pretty clear that cloud security and compliance are a complex endeavour with many different technical and organizational aspects to take into account. The complexity increases if you consider that all these aspects have to be set up and maintained not just once, but for a heterogeneous application landscape throughout the application lifecycle. In the longer term, this creates the risk of a phenomenon called configuration drift.

What is configuration drift?

Configuration drift describes a deviation of configurations from their initial setup due to frequent changes in hardware and software.
Within a complex cloud landscape you’ll have to think about how to treat configuration drift, from detecting it, to managing and correcting it. 

A common way to speed up slow manual processes is to automate the workflow. For example, instead of having an Azure admin manually create and configure a subscription for a DevOps team, a script automates the workflow to reduce the time needed.

But what happens if the DevOps team lead goes ahead and changes the configuration to better suit the application’s needs? Right, configuration drift, and no one will be aware of it.

A superior approach is to define a desired state. To stick with the Azure example, this could be an Azure subscription with access permissions for a DevOps team lead and one of their team members. This desired-state definition can be continuously compared to the actual state. If no subscription or permissions exist yet, they will be set up initially. If the DevOps team lead changes the configuration, this will be detected. If the change is intended, the desired state can be updated; if not, the action can be undone to get back to the desired configuration.

Infobox: How vs. What

Workflow-centric approaches focus on “how” to achieve a desired outcome, while declarative approaches provide a clear definition of “what” is to be achieved. A declarative approach has the benefit that it enables a continuous validation of the actual state against the defined desired state (re-certification) and provides a single source of truth to avoid configuration drift.
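The desired-state comparison described above can be sketched in a few lines. The data model below – users mapped to roles on one tenant – is hypothetical, but it shows how drift is detected and corrective actions are derived:

```python
# Declarative reconciliation sketch: compare desired vs. actual permissions
# and compute corrective actions. The data model is a hypothetical example.
desired = {"alice": "admin", "bob": "editor"}
actual  = {"alice": "admin", "bob": "admin", "mallory": "editor"}

def reconcile(desired: dict, actual: dict) -> list[str]:
    """Return the actions needed to bring actual state back to desired state."""
    actions = []
    for user, role in desired.items():
        if actual.get(user) != role:            # missing user or wrong role
            actions.append(f"set {user} -> {role}")
    for user in actual.keys() - desired.keys():  # users not in the desired state
        actions.append(f"remove {user}")
    return actions

print(sorted(reconcile(desired, actual)))
```

Run continuously, this loop is what makes the declarative model self-correcting: drift shows up as a non-empty action list instead of going unnoticed.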

Landing Zones

Configuring a new tenant to be secure and compliant can be quite the hassle – especially if you have to do a basic set of tasks over and over again. This is where landing zones come in. Landing zones allow you to quickly set up a multi-tenant environment with a baseline of identity and access management, data security, governance and logging already in place.

The basic purpose of a landing zone is to build and secure the airport before an application lands in the cloud.

But landing zones are not "fire and forget": A proper landing zone lifecycle management is an important part to keep your environments secure and compliant.

The big cloud service providers have all implemented the concept of landing zones in some way.

Exit Strategy

Moving your workloads to the cloud brings many advantages – like on-demand infrastructure and elastically scalable services. While most cloud users love the feeling of innovation and progress that comes with it, many don't think much about how to get out again – why would they? But having a solid exit strategy in place is essential! In some industries, like banking, it is a regulatory requirement, as stated in the EBA Guidelines on outsourcing arrangements. The goal of an exit strategy is to ensure business continuity under changing circumstances. What if the service provider terminates the contract? What if the services do not meet the defined quality standards? Being able to handle these scenarios without interrupting critical business functionality is part of a comprehensive cloud transformation strategy.

So here are 4 aspects you will have to have an eye on when building your cloud exit strategy:

  1. Most importantly: Take inventory! Knowing your assets is essential. Exit strategies often apply to critical business functions only. So it’s important to know what you have running in which cloud – an up to date cloud inventory is of great help.
  2. Open-source infrastructure is key. Open-source infrastructure components like Kubernetes clusters or open-source databases can make a move between clouds much easier. The more proprietary services you use, the harder it will be to adapt your application to running in a new cloud environment.
  3. Go multi-cloud from the beginning. Contract negotiations between enterprises and cloud providers can take a while. It’s too late to start the process when it’s actually time to move.
  4. Watch out for organizational lock-in. Even if, from a technical perspective, your application can easily be moved to a different cloud provider, there’s more to it. If you are running cloud applications at scale, setting up the corresponding cloud environments and transferring permissions and configurations comes with massive complexity. Use a centralized governance system like meshcloud to keep your organizational structures independent from specific providers.

To learn more about the meshcloud platform, please get in touch with our sales team or book a demo with one of our product experts. We're looking forward to getting in touch with you.


6 Things to Watch out for when Starting Your Cloud Journey

Enterprises plan their cloud transformation carefully and thoroughly. And that's exactly what they need to do in order to set their cloud journey up for success.

But the truth is that many organizations don't have a lot of experience when it comes to migrating to the cloud. They are up for a steep learning curve.

That's why we've compiled a list of 6 aspects you need to keep in mind when embarking on your cloud journey:

  1. Breaking up silo structures
  2. Assessing the technical expertise of your teams
  3. Understanding cloud vendor lock-in costs
  4. Understanding the shared responsibilities in the cloud
  5. Considering Managed Services
  6. Developing an agile, cloud-native way of working

Let's get to it:

1. Breaking up silo structures

Moving to the cloud requires a change in the organizational structure. Just signing a contract with AWS, GCP or Azure is not enough. Infrastructure silos focusing on databases, networks, and so on are not ideal, to say the least. Everybody working on an application has to communicate with those silos.

Developing and running applications in this scenario puts a lot of overhead responsibilities on the shoulders of DevOps teams. And it grows with each cloud platform they add.

Optimizing in silos can make each silo run their cloud platform perfectly but it won't remove inefficiencies in the overall cloud transformation effort.

A cloud foundation team that sees itself as an enabler for DevOps is the best practice. The cloud foundation can optimize for applications and go-to-market.

2. Assessing the technical expertise of your teams

You have decided on one or more cloud platforms - like AWS, Azure, or GCP - to migrate to and build on. It is now important to focus on assessing the technical expertise in your organization and upskilling your teams to enable them to work with these cloud platforms.

Migrating to the cloud will most likely - and this is often overlooked and not talked about - automate certain positions out of existence. But keeping skilled and qualified IT staff on board should be a priority: Identifying and reskilling people in these positions and offering them new and valuable opportunities within the organization is the way to go.

A cloud foundation team can offer consulting and training to support the ramp up.

3. Understanding cloud vendor lock-in costs

Enterprises must review and fully understand the costs that come with choosing a cloud service provider. The cost reduction promised by the cloud can only be achieved if the cloud transformation is done right and all costs are made explicit.

Going all-in with one cloud vendor leads to a strong dependence on their proprietary technologies. Switching costs are high and may prohibit the move to competing vendors further down the road.

Make sure to have a viable cloud exit strategy in place and go with a cloud governance solution that makes the organizational and technical transition to another vendor economically feasible.

In addition, being credibly able to switch providers gives you strong leverage in negotiations.

4. Understanding the shared responsibilities in the cloud

A general rule is that the cloud provider is responsible for the security of the cloud, while you as their customer are responsible for security in the cloud. However, other aspects affect the shared responsibility model, e.g. the service model you are using (IaaS/PaaS/SaaS).
Here is an example: If you use a SaaS service like Google Docs, you don't have to take care of the VM images or databases the application runs on. If you deploy your own MySQL cluster to AWS EC2 instances, however, you are responsible for encrypting the connection via SSL or TLS.
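To make that customer-side responsibility concrete, here is a minimal Python sketch using the standard library's `ssl` module to build a TLS context for a connection to a self-managed database. The function name and the CA-file parameter are illustrative assumptions; the point is that with IaaS, turning on and not weakening transport encryption is your job, not the provider's:

```python
import ssl


def database_tls_context(ca_file=None):
    """Build a TLS context for connecting to a self-managed database.

    With IaaS (e.g. your own MySQL cluster on EC2), encrypting the
    connection is the customer's responsibility, not the provider's.
    """
    # create_default_context() enables certificate validation and
    # hostname checking by default -- do not weaken these settings.
    ctx = ssl.create_default_context(cafile=ca_file)
    return ctx


ctx = database_tls_context()
# Certificate validation and hostname checking are on by default.
print(ctx.verify_mode == ssl.CERT_REQUIRED, ctx.check_hostname)
```

A real client library (e.g. a MySQL driver) would accept such a context or equivalent TLS options when opening the connection.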


Another important factor is to assign responsibilities clearly to the cloud foundation team and the DevOps teams. The cloud foundation can offer a security baseline with predefined cloud landing zones and take care of organizational overhead. The DevOps teams have more freedom when working with the cloud - compared to the data center approach - and with that freedom comes the responsibility to take care of application security. The Cloud Foundation Maturity Model provides more insights on how to structure shared responsibility in a Cloud Foundation context.

5. Considering managed services

Migrating to the cloud is a major task in terms of organization, technology, and operations. Wanting to do everything in-house is understandable, but already very busy IT teams just might not have the capacity or skill set to take on every project.

Making use of higher-level managed services may be the right choice to keep the cloud migration on track and within budget. You may want more than just infrastructure-as-a-service (IaaS) and more than just one cloud service provider. That is also why an abstraction layer that unifies all clouds brings no value to your cloud transformation: it reduces every platform to the lowest common denominator and cuts you off from exactly those higher-level managed services.

Even if you start off with a pilot project that your organization can handle capacity- and expertise-wise: The challenges will build up as you move on and broaden the scope of your cloud journey. That is a development we see quite often in the market - companies wasting time and money and then turning to external partners a good way down the road.

The same goes for intra-organizational services: Not every team should have to solve problems that other teams have already successfully overcome. Teams should be enabled to offer their solutions and services to other teams - via a cloud service marketplace - to push innovation and speed up development.

6. Developing an agile, cloud-native way of working

Going with a cloud strategy is only part of making full use of the competitive advantage the cloud can offer. Without an agile and cloud-native way of working, that potential will not be fully realized. It is the prerequisite for moving the actual workload to the cloud and taking advantage of the scalability, flexibility, and speed the cloud can provide.

A cloud foundation or a cloud competence center should take care of the organizational overhead and enable developers to fully focus on their products.

A DevOps team lead should be able to provision a cloud account and deploy applications without intervention from central IT. Offering self-service cloud account creation requires a high degree of automation. This reduces manual workload and with it the "time-to-cloud" for the developers. Routing cloud resource provisioning through an existing ITSM tool seriously limits the usefulness of the cloud.
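As a rough sketch of what such self-service automation might look like, here is a minimal Python example. All names here - `provision_cloud_account`, the tag keys, the landing-zone identifier - are hypothetical illustrations, not an actual meshStack or cloud-provider API; in a real setup the resulting payload would be handed to an automation pipeline or an account-creation API:

```python
def provision_cloud_account(team, environment, landing_zone):
    """Compose a provisioning request that a DevOps team lead could
    trigger in self-service, without filing a ticket with central IT.

    Validates the input and returns the request payload that an
    automation pipeline would act on.
    """
    allowed_environments = {"dev", "staging", "prod"}
    if environment not in allowed_environments:
        raise ValueError(f"unknown environment: {environment}")
    return {
        "account_name": f"{team}-{environment}",
        # A landing zone applies the security baseline automatically.
        "landing_zone": landing_zone,
        "tags": {"team": team, "environment": environment},
    }


request = provision_cloud_account("payments", "dev", "baseline-v2")
print(request["account_name"])  # payments-dev
```

The key design point is that the guardrails (allowed environments, naming, landing zone) are encoded in the automation itself, so self-service does not mean ungoverned.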

Moving to the cloud is a deep-rooted transformation in an IT organization and means fundamental changes in how things are done. A cloud foundation team needs to evangelize the use of the cloud and empower the teams on their way. It cannot be expected that everybody is on board with the cloud strategy right away. Some applications will have to be refactored, which is a lot of work; the transformation will only be successful if there are communication efforts to show that it's worth it.


To learn more about the meshcloud platform, please get in touch with our sales team or book a demo with one of our product experts. We're looking forward to getting in touch with you.


The Cloud Foundation - Key to Cloud Excellence

Organizing your IT the cloud-native way: Read why the creation of a central cloud team needs to be a central piece in your cloud strategy.

A cloud journey is as much an organizational journey as it is a technological one. The establishment of a Cloud Foundation, also known as the Cloud Competence Center or Cloud Center of Excellence (CCoE), is a best practice to leverage the advantages of the cloud.

In this post on the Cloud Foundation, we want to introduce you to the core concept of such a team and why it makes such a big difference.

So let’s dive right in:

What is a Cloud Foundation?

A Cloud Foundation is a multi-disciplinary team of enterprise architects, developers and operators, network and security engineers, and system and database administrators. The team governs and enables the organization's cloud transformation process.

Enterprises that follow a cloud strategy to reduce costs and become more agile need to take on organizational transformation to leverage the cloud to the fullest.

One central piece of this puzzle is the creation of a central cloud team: Cloud Foundation, Cloud Competence Center or Cloud Center of Excellence - many names, one core concept: Centralizing cloud governance to enable more productive, more agile, and more innovative DevOps.

It is effectively the team that pioneers and paves the way DevOps teams use to safely travel to and navigate the cloud.

The Advantages of a Cloud Foundation over Cloud Silos

Many IT departments in enterprises are still organized in so-called silos. When it comes to the cloud that may mean there is a division for Azure, one for AWS, and one for GCP. Every one of these silos has to build up know-how and tooling for proper cloud governance.

Developing and running applications in this scenario puts a lot of overhead responsibilities on the shoulders of DevOps teams. And it grows with each cloud platform they add.

Optimizing in silos can make each silo run their cloud platform perfectly, but it won't remove inefficiencies in the overall cloud transformation effort.

Global optimization is not possible in a silo structure: With a Cloud Foundation on the other hand you can optimize the entire cloud journey of your organization. The Cloud Foundation centralizes cloud governance competencies to enable and drive the cloud journey.

Cloud governance is not platform-specific - and so it does not make sense to reinvent the cloud governance wheel for every platform in every silo. In a Cloud Foundation team, boundaries and best practices can be shared better and faster, leading to better platform-specific implementations.

Done well, this enablement function will achieve the following outcomes:

  • Accelerate cloud adoption across the organization
  • Enable and encourage innovation
  • Optimize costs (based on FinOps practices)
  • Minimize the risks involved (based on continuous compliance and compliance as code practices)
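The "compliance as code" practice from the last bullet can be sketched in a few lines of Python. This is an illustrative toy, not a real policy engine; the policy rules, tag names, and allowed regions are invented for the example:

```python
def check_resource(resource):
    """Minimal compliance-as-code sketch: evaluate a cloud resource
    description against organization-wide policies and return a list
    of violations (empty list means compliant).
    """
    violations = []
    if not resource.get("tags", {}).get("cost-center"):
        violations.append("missing cost-center tag")
    if resource.get("region") not in {"eu-central-1", "eu-west-1"}:
        violations.append("region not in allowed list")
    if not resource.get("encrypted", False):
        violations.append("storage not encrypted at rest")
    return violations


print(check_resource({"region": "us-east-1", "tags": {}}))
```

Because such checks are code, they can run continuously against every cloud account instead of being enforced through manual reviews.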

Done badly, however, a Cloud Foundation can actually end up acting as a barrier to consuming cloud within the organization.

It is important to have a value mindset and focus on what value the cloud can bring to the teams: If it is just a top-down company initiative, there will be resistance and you will lose speed.

The Main Tasks of a Cloud Foundation

Let's talk about more specific tasks for the Cloud Foundation team.

To achieve the goals we've talked about in the previous section, a Cloud Foundation has to take on the following eight tasks:

  1. Implement cloud transformation and strategy
  2. Manage cloud cost and billing
  3. Manage and report cloud transformation
  4. Implement and enforce organization-wide cloud journey policies
  5. Provide guidance and training
  6. Manage cloud identities and access for DevOps teams
  7. Keep up with the latest cloud technologies
  8. Manage cloud security and compliance risks

How to build a Cloud Foundation Team

Ideally, you start building a Cloud Foundation team well before the first workloads are migrated to the cloud. But that doesn't mean it's ever too late to start: It is never too late to tackle the organizational transformation needed to fully capitalize on the competitive advantages of the cloud.

Let's say you have a small team with a cloud-native use case, and they are the first lighthouse team going into the cloud. After this move, the organization learns a lot from actually doing instead of planning for a long time and never implementing anything. So sometimes the better approach is just to try, because it puts you on a steeper learning curve. The cloud foundation team could be a team of architects accompanying the first five teams and then evaluating what went well, what went badly, and how they can support others in building a solid practice on their cloud journey.

Regardless of the size of the business or the extent of its presence in the cloud, a Cloud Foundation team should start small. Build the team from developers, system administrators, network engineers, IT operations, and database administrators: start small, learn fast, and grow big!

Building a solid foundation first and then the house on top of it is definitely the right sequence of events. However, it is absolutely worthwhile to equip the building that houses your business success with a stable foundation, even after you have come to realize it might be built on softer ground.


To learn more about the meshcloud platform, please get in touch with our sales team or book a demo with one of our product experts. We're looking forward to getting in touch with you.