Designing meshStack Building Blocks

In this post I want to provide some technical background on meshStack’s new building blocks design that our team is actively working on. As outlined in our previous post, the key challenges we set out to solve are publishing building blocks for self-service, resource orchestration using familiar tooling, incremental automation and enabling composition of building blocks.

Defining the Problem Space for Building Blocks

Equipped with a lot of learnings from helping cloud foundation teams build hundreds of landing zones and delivering them to thousands of application teams, we defined the problem space we want to design for:

  • Automation is a must, though for many building blocks it becomes crucial only once you hit a scale > 50 cloud tenants (→ obligatory xkcd).
  • Building blocks ought to represent meaningful high-level capabilities – as such, application teams usually compose only a rather small number of them. Extrapolating from our existing experience with marketplace services, we expect most applications will have < 20 building blocks per cloud tenant. We think this is fundamentally different from orchestrating on the cloud resource level, where it’s common to have hundreds or thousands of resources to deploy an application on.
  • Cloud foundation teams will leverage building blocks to design and deliver modular landing zones. The lifecycle of building blocks like VPCs or firewall rules will consequently be more closely tied to the lifecycle of the cloud tenant than to the lifecycle of an application deployment.
  • The key value proposition of a cloud foundation platform like meshStack is integrating all essential governance functions like tenant management, IAM, cost management, security and compliance in a single application. It’s thus important that building blocks seamlessly integrate with every governance function available via meshStack.
  • Most smaller cloud foundation teams lack the capacity to implement all cloud foundation capabilities by themselves. The ability to tap into an established ecosystem offering a starting point in the form of reusable building blocks is an immense value add.

Orchestrating Building Blocks

We found it helpful to phrase the underlying problem for implementing building blocks in meshStack as an orchestration problem. Platform engineers and enterprise architects already interact with many different incarnations of this problem on a day-to-day basis, in tools like Terraform, Kubernetes, or CI/CD tools like GitHub Actions.

Adding and reconciling building blocks

Our design takes inspiration from these examples, combining the relative strengths of these orchestration solutions in a way that makes sense for our problem space. The figure below describes the key elements of our design:

Building Blocks Interaction Diagram

  1. Application teams can select building blocks from a catalog. Building blocks can have inputs of different types, like manually specified inputs (entered by the application team or platform engineers), outputs from other building blocks (creating a dependency between blocks) or metadata derived from other meshObjects like meshTags.
  2. meshStack validates and updates the building block graph. The building block graph is a DAG (directed acyclic graph) representing the dependencies between building blocks. meshStack always maintains a single source of truth for the desired state of this graph.
  3. block-runners are independent processes that can reconcile a certain building block implementation type. Runners connect to meshStack via the meshObject API, polling for runnable blocks. Building blocks are runnable when they have all their inputs satisfied.
  4. meshStack passes the desired state of the building block to the runner, including the desired lifecycle state of the block as well as all of its inputs.
  5. The block-runner reconciles the building block, for example by executing a terraform apply in case of a terraform building block.
  6. The block-runner collects the resulting output and returns it to meshStack. This information can also include detailed execution logs to help debugging.
  7. Application teams can at any time inspect the status of their cloud tenants and the associated building block graph.
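
Steps 2 and 3 above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not meshStack's implementation: the block names, the `GRAPH` dictionary and both helper functions are hypothetical, and the graph is modeled simply as a mapping from each block to the blocks whose outputs it consumes.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical tenant graph: block name -> names of the blocks whose outputs it consumes.
GRAPH = {
    "vpc": set(),
    "firewall-rule": {"vpc"},
    "on-prem-connectivity": {"vpc", "firewall-rule"},
}


def is_valid_dag(graph):
    """Return True if the dependency graph contains no cycle."""
    try:
        TopologicalSorter(graph).prepare()  # raises CycleError on a cycle
        return True
    except CycleError:
        return False


def runnable_blocks(graph, reconciled):
    """Return unreconciled blocks whose dependencies have all been reconciled."""
    return sorted(
        name
        for name, deps in graph.items()
        if name not in reconciled and deps <= reconciled
    )


print(is_valid_dag(GRAPH))              # True
print(runnable_blocks(GRAPH, set()))    # ['vpc']
print(runnable_blocks(GRAPH, {"vpc"}))  # ['firewall-rule']
```

A runner polling for work would only ever see the blocks returned by something like `runnable_blocks`, which is what keeps dependency handling out of the runner itself.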

Benefits of meshStack's Building Block Design

This design offers a number of interesting properties relevant to our problem space.

Desired state reconciliation - a typical meshStack customer will manage thousands of building blocks with meshStack across multiple cloud providers. Desired state reconciliation of each individual block makes management robust in the face of inevitable cloud failures and curbs configuration drift.
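
To make the idea of desired state reconciliation concrete, here is a minimal Python sketch of the decision a reconciler has to make for a single block. The `BlockState` type, the lifecycle values "present" and "absent", and the action names are assumptions made for this example and are not taken from meshStack.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class BlockState:
    lifecycle: str  # hypothetical values: "present" or "absent"
    inputs: dict = field(default_factory=dict)


def next_action(desired: BlockState, actual: Optional[BlockState]) -> str:
    """Compare desired and actual state; `actual` is None if nothing is deployed."""
    if desired.lifecycle == "absent":
        return "noop" if actual is None else "destroy"
    if actual is None:
        return "create"
    if actual.inputs != desired.inputs:
        return "update"  # covers changed inputs as well as configuration drift
    return "noop"


desired = BlockState("present", {"cidr": "10.0.0.0/16"})
print(next_action(desired, None))                                          # create
print(next_action(desired, BlockState("present", {"cidr": "10.1.0.0/16"})))  # update
print(next_action(desired, desired))                                       # noop
```

Because the function only looks at the difference between the two states, running it again after a failed or partial run yields the same answer, which is the property that makes repeated reconciliation safe.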

Autonomous reconciliation - for managing a huge number of building blocks it’s important to intelligently balance reconciling building blocks with changed inputs against detecting configuration drift, while carefully observing cloud API rate limits and automatically recovering from transient error conditions.

Predictable execution - when failures occur, it’s important that cloud engineers are able to quickly troubleshoot what’s gone wrong. With a single source of truth for block graphs and a central coordinator, reconciliation failures always occur at a defined point in the execution plan and do not propagate further.
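
One way to picture this property is a coordinator that walks the graph in dependency order and refuses to run anything downstream of a failed block. The following Python sketch is an assumption-laden illustration: the status names, the `reconcile_graph` function and the example graph are hypothetical, and a real coordinator would hand blocks to runners rather than call a function directly.

```python
from graphlib import TopologicalSorter


def reconcile_graph(graph, run):
    """Run blocks in dependency order; blocks downstream of a failure are not run."""
    status = {}
    for block in TopologicalSorter(graph).static_order():
        if any(status[dep] != "succeeded" for dep in graph.get(block, ())):
            status[block] = "blocked"  # a dependency failed or was itself blocked
            continue
        try:
            run(block)
            status[block] = "succeeded"
        except Exception:
            status[block] = "failed"
    return status


# Hypothetical graph: block name -> blocks it depends on.
GRAPH = {
    "vpc": set(),
    "dns-zone": set(),
    "firewall-rule": {"vpc"},
    "app-gateway": {"firewall-rule"},
}


def flaky_run(block):
    if block == "firewall-rule":
        raise RuntimeError("simulated cloud API failure")


status = reconcile_graph(GRAPH, flaky_run)
print(status["firewall-rule"], status["app-gateway"], status["dns-zone"])
# failed blocked succeeded
```

The failure is pinned to exactly one block, its dependents are marked as blocked instead of failing in confusing ways, and unrelated blocks still reconcile.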

Simple Composition - the types of composition we need to enable between building blocks are mostly simple input/output relations. For example, a firewall rule building block may need a VPC id specified as an output of a VPC building block. This means that we will start our design from a simple 1:1 mapping from inputs to outputs. There will be different input types. Building block implementations can perform more complex computations from inputs with familiar and more suitable tools like HCL (for a terraform building block), pushing this complexity to the edges of the system.
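
The firewall rule example can be sketched as a plain lookup in Python. Everything named here is hypothetical (the `resolve_inputs` function, the shape of the mapping, the output name `id`); the point is only that a 1:1 mapping needs no expression language in the coordinator.

```python
def resolve_inputs(input_mapping, outputs, manual_inputs):
    """Resolve a block's inputs from manual values and other blocks' outputs.

    input_mapping: input name -> (source block, output name), a 1:1 mapping.
    outputs:       block name -> outputs reported so far.
    Returns None while a mapped output is still missing (block not runnable yet).
    """
    resolved = dict(manual_inputs)
    for input_name, (source_block, output_name) in input_mapping.items():
        block_outputs = outputs.get(source_block)
        if block_outputs is None or output_name not in block_outputs:
            return None
        resolved[input_name] = block_outputs[output_name]
    return resolved


# A firewall rule block takes its VPC id from the output of a VPC block.
mapping = {"vpc_id": ("vpc", "id")}
manual = {"port": 443}

print(resolve_inputs(mapping, {}, manual))
# None
print(resolve_inputs(mapping, {"vpc": {"id": "vpc-123"}}, manual))
# {'port': 443, 'vpc_id': 'vpc-123'}
```

Anything more involved than copying a value, such as deriving a subnet range from the VPC id, would live inside the block implementation itself.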

Swappable Block Definitions - it’s important that platform engineers can reuse various existing automation technologies to implement building blocks. In that sense, building block definitions act like an interface for which platform engineers can seamlessly swap the implementation.
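
The interface analogy can be illustrated with a small Python sketch. The `BlockImplementation` protocol and the two example classes are hypothetical and only show the shape of the idea: callers depend on the definition (inputs in, outputs out), so a manual implementation can later be replaced by an automated one without changing anything that consumes the block.

```python
from typing import Protocol


class BlockImplementation(Protocol):
    """Hypothetical contract of a building block definition: inputs in, outputs out."""

    def reconcile(self, inputs: dict) -> dict: ...


class ManualVpc:
    """An operator creates the VPC by hand and records the resulting id."""

    def __init__(self, recorded_vpc_id: str):
        self.recorded_vpc_id = recorded_vpc_id

    def reconcile(self, inputs: dict) -> dict:
        return {"vpc_id": self.recorded_vpc_id}


class AutomatedVpc:
    """Stand-in for an automated implementation, e.g. one wrapping an IaC module."""

    def reconcile(self, inputs: dict) -> dict:
        return {"vpc_id": f"vpc-{inputs['tenant']}"}


def run_block(implementation: BlockImplementation, inputs: dict) -> dict:
    # The caller only relies on the definition, not on how it is implemented.
    return implementation.reconcile(inputs)


inputs = {"tenant": "team-a"}
print(run_block(ManualVpc("vpc-0042"), inputs))  # {'vpc_id': 'vpc-0042'}
print(run_block(AutomatedVpc(), inputs))         # {'vpc_id': 'vpc-team-a'}
```

Both implementations produce the same output name, so a dependent firewall rule block would not notice the swap.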

Extensible Runner model - simple things should be simple, so meshStack will include out-of-the-box runners for common scenarios like terraform modules. Similar to GitHub Actions, our runners will be open source and easy to self-host. This enables advanced scenarios like deploying runners with access to sensitive environments like on-premises or special secrets, runners implementing custom automations and so on.

The Migration Path for Marketplace Services based on Open Service Broker API

When we designed the first version of meshStack’s marketplace almost five years ago in 2018, we wanted to provide a capable platform that enabled private cloud and public cloud use cases alike. The problem space we had in mind was to provide application teams with a PaaS experience as pioneered by platforms like Heroku and Cloud Foundry. Approaching the challenge from that angle, OSB API shines with great support for service catalogs to aid application teams’ discovery of compatible services as well as metering service usage for internal chargeback.

From the perspective of a cloud foundation team, however, implementing building blocks on top of OSB API poses a few challenges. Implementing a service broker requires implementing a conformant API – a software engineering problem. While especially bigger cloud foundation teams are willing and able to develop and operate custom service brokers, we learned that smaller teams are keenly aware of their limited bandwidth.

To deal with this, we sought to enable cloud platform engineers by building on workflows they are already experts at, like writing simple scripts and IaC. With the unipipe open source project we tried to make OSB API more accessible by transparently translating it to a GitOps workflow. Despite all of our efforts, the resulting experience still fell short of our ambitions while also critically lacking desirable composition capabilities, as outlined in our previous post on modular landing zones.

Building blocks will offer a clean migration path for customers already leveraging the OSB API marketplace - OSB API service instances will be just another type of building block implementation. We will be tackling OSB API integration with an OSB API compatible block runner in a later milestone, but our current plan is to support the migration as much as possible out of the box without requiring changes to existing service brokers, service definitions and service instances.

Enabling Custom Platforms through Building Blocks

One of the design areas we are also actively looking into is exposing existing meshStack tenant replication capabilities as built-in building blocks. This will give platform engineers fine-grained control over how their modular landing zones apply to cloud tenants.

Modular Landing Zones – The next frontier for Cloud Foundations

Modular landing zones enable cloud foundation teams to deliver cloud tenants to application teams that these teams can flexibly extend and configure with optional building blocks like virtual networks, on-premise connectivity or vertically integrated DevOps toolchains. The capability to build and deliver modular landing zones is essential for delivering use-case tailored landing zones for the variety of workloads most organizations have, from traditional lift & shift deployments and container platforms to cloud-native workloads.

Our cloud foundation platform meshStack helps cloud foundation teams deliver modular landing zones at scale with full self-service for application teams. We see a clear trend that application teams expect these landing zones to serve as internal platforms for their workloads. This means that the landing zone should not only deliver a secure cloud tenant but also come with “batteries included” building blocks that accelerate application deployment and reduce operational overhead.

Building landing zones and internal platforms is challenging. In this post I want to share our plans and vision for improving the experience for enterprise architects and platform engineers designing and building these landing zones. I will also be covering how meshStack’s current marketplace features will evolve to align with this vision.

Key Challenges delivering Modular Landing Zones

Based on our learnings helping cloud foundation teams build more than one hundred landing zones, we identified four key challenges faced by enterprise architects and platform engineers who want to deliver a modular landing zone.

  1. Publishing building blocks for self-service
    Cloud foundation teams make building blocks available to application teams via a publishing process that enables staging and versioning. Application teams must be able to discover the building blocks available for their landing zone in self-service. Self-service makes it possible to add, modify and remove building blocks without manual interaction by the cloud foundation team and ensures there’s a single source of truth for the configuration of every cloud tenant.
  2. Resource orchestration using familiar tooling
    Building blocks typically have to orchestrate cloud resources into a configuration that provides a reusable capability like a secured object storage bucket or virtual network. Platform engineers already use infrastructure as code tools like terraform very successfully to automate resource configuration. It must thus be possible to easily define building blocks from existing automation like terraform modules instead of adding additional complexity with yet another tool.
  3. Incremental automation
    Many landing zone capabilities like on-premise connectivity have dependencies on legacy infrastructure like on-premise firewalls or IP address management systems that are difficult to automate. It’s thus crucial that cloud foundation teams can take an incremental approach to automating building block implementation, maybe even starting from a fully manual fulfillment approach backed by old-school ITSM and then moving on to a semi-automated workflow. Incremental automation allows cloud foundation teams to focus their efforts on those building blocks where automation provides the best return on investment.
  4. Enable composition of building blocks
    Building blocks have to compose seamlessly with landing zones, cloud tenants and other building blocks. This enables use cases like modeling the dependency between a “firewall rule” building block and a “default VPC” building block added to a cloud tenant by mandatory landing zone configuration.

Modular Landing Zones Support in meshStack

meshStack has historically supported the capabilities to build and deliver modular landing zones using an internal service marketplace, which we appropriately enough called “service marketplace”. The service marketplace has been an integral part of meshStack since the earliest inception of the product and many of our customers have developed a strong and successful service ecosystem for their landing zones. In fact, the number of service instances managed by our marketplace far outweighs the number of cloud tenants.

While we think that meshStack’s service marketplace fundamentally provides the right kind of self-service experience to application teams, we are highly aware of some major shortcomings in its current technical design when it comes to empowering cloud foundation teams to deliver modular landing zones.

  1. Leveraging familiar tooling like terraform requires a lot of plumbing with unipipe and a GitOps workflow, adding a lot of complexity to simple automation use cases like “just deploy this terraform module”.
  2. Starting with a manual service implementation as part of an incremental automation strategy is not possible directly from meshStack’s user interface, instead requiring unipipe and manual GitOps operations via its companion cli tooling.
  3. Service instances are independent of cloud tenants by default. It requires additional steps to connect them to a cloud tenant, increasing complexity of standard landing zone use cases like a Virtual Network Service that adds a default virtual network to a cloud tenant.
  4. Service instances are also “flat” and do not compose with each other, requiring clever workarounds to model dependencies and interactions between different services.

After a lot of discussions with our customers and cloud foundation stakeholders we have decided to fundamentally reboot our approach to building and delivering modular landing zones with meshStack. Which brings us to…

Building Blocks – Modular Landing Zones with Ease

Going forward, building blocks will become the new universal primitive for assembling landing zones and cloud tenants in meshStack. Each building block represents an encapsulated piece of functionality provided to an application team. Explained in a single picture below, application teams can flexibly assemble building blocks on the landing zone’s “baseplate” as required to support their use case.

To design landing zones, cloud foundation teams can designate building blocks as mandatory (pink) or optional (blue), giving application teams a great deal of flexibility while retaining essential control.

Key Design Elements of Building Blocks

On a high level, here are the key design elements of meshStack’s new building blocks.

Building blocks:

  • are individually reconciled, receiving inputs and producing outputs
  • can depend on other building blocks, forming a directed acyclic graph (DAG)
  • can attach directly to cloud tenants (meshTenants), with each tenant having its own independent building block graph
  • are mandatory or optional components of a modular landing zone, delivered to application teams in a self-service experience similar to meshStack’s current service marketplace
  • provide swappable implementation options:
    • “manual” building blocks enable incremental automation starting from a GUI-based manual process
    • meshStack will include out-of-the-box support for deploying terraform modules as building blocks
    • platform engineers can implement custom blocks using an external block-runner API
  • are aware of their desired and actual state

How Building Blocks address Key Challenges of Cloud Foundation Teams

The new building block design directly addresses the key challenges faced by cloud foundation teams:

  • out-of-the-box support for popular IaC tooling like terraform for implementing building blocks leverages widely available skills and enables plug&play reuse of existing automation assets instead of requiring complex GitOps pipeline setup
  • incremental automation becomes a first class concept supported by meshStack’s GUI, enabling cloud foundation teams to first focus on overall landing zone design before investing into automation to solve operational challenges
  • a conceptually simple yet powerful composition of building blocks enables advanced scenarios without requiring challenging API integrations or hidden backchannels between services to coordinate their functionality

Moving forward with Building Blocks

I will be sharing some more in-depth insights about the technical design of building blocks in an upcoming post. Our vision is that building blocks will ultimately supersede the existing tenant replication and marketplace functionality – unifying both in a single design that is conceptually less complex yet more flexible. This will make meshStack considerably more useful across a wider array of platform use cases.

Building on building blocks as a foundation, we plan to empower cloud foundation teams to define custom cloud platforms and landing zones more easily, for example to integrate internal developer platforms and specialized cloud providers. We will enable this by making more of the meshObject model available as APIs so that cloud foundation teams can tap into the same concepts meshStack uses to deliver out-of-the-box capabilities for building AWS, Azure, GCP and other cloud platforms.

We are very excited about these changes and will ship our first MVP of building blocks this week. As part of the MVP we will first enable “manual building blocks”, followed by supporting building blocks based on terraform modules. We will be sharing more updates about our planned and upcoming features soon, including how building blocks will integrate with meshStack’s other capabilities like cost management, security and compliance.