At meshcloud we are huge fans of infrastructure-as-code, GitOps and declarative approaches in general. This means we manage a lot of declarative files in our Git repository and these files get processed automatically when they change. Ideally we want our CI system to make sure that changes we’re making to these files won’t cause any problems once they reach a staging or production environment.
Some tools like Terraform allow you to run validation on your declarative definitions but it’s very likely that you’re also dealing with some JSON or YAML files (e.g. CI pipeline definitions, application configurations, Kubernetes manifests, etc.) that are required by an application that does not provide any way to check these files beforehand.
Let’s look at how we can use Dhall to solve this problem and even overcome some of the limitations of formats like JSON or YAML.
Syntactic Correctness in Dhall
As a first step we need to make sure that any files we check in are syntactically correct. A JSON/YAML linter can do this too, but if we keep everything as Dhall files we get by with one tool and gain some added benefits: sharing code via imports, defining constants, and using functions.
Let’s look at an example. Here is a YAML file containing configuration for an application:
roles:
  - name: admins
    permissions: admin
    priority: 100
  - name: developers
    permissions: editor
    priority: 10
Okay, there is a list of roles, and each role has a name, permissions and a priority value. Now let’s make a change: we’ll add another role.
roles:
  - name: admins
    permissions: admin
    priority: 100
  - name: developers
    permissions: editor
    priority: 10
  - name: guests
   permissions: viewer
    priority: 1
Can you spot the mistake? By removing a single space before permissions we’ve created an invalid YAML file, so our application won’t be able to parse it and will crash!
Let’s write the same config in Dhall instead:
{ roles =
  [ { name = "admins", permissions = "admin", priority = 100 }
  , { name = "developers", permissions = "editor", priority = 10 }
  , { name = "guests", permissions = "viewer", priority = 1 }
  ]
}
Running this file through Dhall (dhall --file config.dhall) will let us know about any syntactic errors. If everything is correct, we can be sure that dhall-to-yaml will generate a valid YAML file for our application later (e.g. as part of our CD process).
Type Correctness in Dhall
So far so good, but what about other kinds of mistakes, for example:
{ rolos =
  [ { name = "admins", permissions = "admin", priority = 100 }
  , { name = "developers", permissions = "editor", priority = 10 }
  , { name = "guests", permissions = "viewer", priority = 1 }
  ]
}
Looks like we misspelled roles but Dhall does not complain since everything is syntactically correct!
To ensure that this does not happen we need to tell Dhall about the shape of our configuration. We do this by specifying a type; in other languages this may be called a schema or a spec. It looks like this:
{ roles : List { name : Text
               , permissions : Text
               , priority : Natural
               }
}
Alright, instead of assigning values with = we’re now specifying types with a colon (:). The roles field is a List with an inner type for the list members, and this inner type has two text fields, name and permissions, and a field priority which should contain a natural number.
We can clean this up a bit by pulling the definition for the inner type into a let binding (basically a constant). By convention, types are given upper case names, so let’s call the inner type Role:
let Role = { name : Text
           , permissions : Text
           , priority : Natural
           }

in { roles : List Role }
This is the same definition as before just a bit more readable.
Now that we have our type written out we can combine it with our actual value. There are different ways to go about this but we’ll keep it simple and add a type annotation to our values directly:
let Role = { name : Text
           , permissions : Text
           , priority : Natural
           }

let Config = { roles : List Role }

in { roles =
       [ { name = "admins", permissions = "admin", priority = 100 }
       , { name = "developers", permissions = "editor", priority = 10 }
       , { name = "guests", permissions = "viewer", priority = 1 }
       ]
   }
   : Config
First we move our type definition into its own let binding and call it Config. Then we add the same config values as before, followed by the annotation : Config, which tells Dhall which type this value should conform to.
Running it through dhall-to-yaml will yield the same result as before:
$ dhall-to-yaml --file config.dhall
roles:
  - name: admins
    permissions: admin
    priority: 100
  - name: developers
    permissions: editor
    priority: 10
  - name: guests
    permissions: viewer
    priority: 1
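Annotating the whole record is only one of several ways to attach a type. Another option, sketched here with the same Role and Config definitions, is to annotate an individual let binding, which moves the check closer to the value it describes. The binding name roles is chosen for this illustration only.

```dhall
let Role = { name : Text, permissions : Text, priority : Natural }

let Config = { roles : List Role }

-- The annotation after the binding name is checked against the list below.
let roles
    : List Role
    = [ { name = "admins", permissions = "admin", priority = 100 }
      , { name = "developers", permissions = "editor", priority = 10 }
      , { name = "guests", permissions = "viewer", priority = 1 }
      ]

in  { roles = roles } : Config
```

Keeping the final : Config annotation as well means a misspelled field name in the outer record is still caught.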
Introducing the same mistake as before will result in an error though:
let Role = { name : Text
           , permissions : Text
           , priority : Natural
           }

let Config = { roles : List Role }

in { rolos =
       [ { name = "admins", permissions = "admin", priority = 100 }
       , { name = "developers", permissions = "editor", priority = 10 }
       , { name = "guests", permissions = "viewer", priority = 1 }
       ]
   }
   : Config
$ dhall --file config.dhall
Use "dhall --explain" for detailed errors
Error: Expression doesn't match annotation
{ - roles : …
, + rolos : …
}
8│ { rolos =
9│ [ { name = "admins", permissions = "admin", priority = 100 }
10│ , { name = "developers", permissions = "editor", priority = 10 }
11│ , { name = "guests", permissions = "viewer", priority = 1 }
12│ ]
13│ }
14│ : Config
config.dhall:8:7
As we expected, the expression (our value) doesn’t match the annotation (our type): - roles tells us that the roles field is missing (hence the minus sign) and + rolos tells us that there was an unexpected additional field called rolos.
Great, we’ve achieved type correctness: our configuration now has exactly the fields we declared, and every field has the declared type!
Semantic Correctness
We’re not done yet, though. What happens when we do something like this:
let Role = { name : Text
           , permissions : Text
           , priority : Natural
           }

let Config = { roles : List Role }

in { roles =
       [ { name = "admins", permissions = "admin", priority = 100 }
       , { name = "developers", permissions = "editor", priority = 10 }
       , { name = "guests", permissions = "read-only", priority = 1 }
       ]
   }
   : Config
We’ve changed the guest role permissions to "read-only". It’s still a text value, so our types are in order, but when we feed this to our application we get an error:
$ my-app config.yml
ERROR: unknown permissions "read-only" expected one of "admin", "editor", "viewer"
Oh no, permissions isn’t actually a text value, it’s an enum! There is no way to encode this in a YAML file, so our application reads the text and tries to map it to an enum value of the same name. Even though all our types are in order, we’ve failed to provide correct data because the underlying format (YAML) is not expressive enough to restrict a field to a fixed set of allowed values.
Luckily it’s very easy to cover this exact case with Dhall thanks to union types. A union type describes a value that can be one of several alternatives. The simplest way to use one is to list a fixed set of named alternatives, which gives us an enum.
We create a new Permissions union type as a let binding and use it in our definition of Role:
let Permissions = < admin | editor | viewer >

let Role = { name : Text
           , permissions : Permissions
           , priority : Natural
           }
Between the angle brackets < ... > we have the different values we want to allow (called constructors), separated by pipes (|).
Since we’re no longer working with text values we also need to update our values. Instead of using text values like "admin" we write Permissions.admin. Here’s the full example:
let Permissions = < admin | editor | viewer >

let Role = { name : Text
           , permissions : Permissions
           , priority : Natural
           }

let Config = { roles : List Role }

in { roles =
       [ { name = "admins", permissions = Permissions.admin, priority = 100 }
       , { name = "developers", permissions = Permissions.editor, priority = 10 }
       , { name = "guests", permissions = Permissions.viewer, priority = 1 }
       ]
   }
   : Config
By writing it in the form Type.constructor we can differentiate between alternatives of the same name; for example, another union type could also define its own admin alternative.
If we use a non-existent alternative like Permissions.read-only we’re greeted with an appropriate error:
$ dhall --file config.dhall
Use "dhall --explain" for detailed errors
Error: Missing constructor: read-only
11│ Permissions.read-only
config.dhall:11:44
Dhall error messages can be a bit hard to read, but in this case the --explain flag does a pretty nice job:
$ dhall --explain --file config.dhall
Permissions : Type
Role : Type
Config : Type
Error: Missing constructor: read-only
Explanation: You can access constructors from unions, like this:
┌───────────────────┐
│ < Foo | Bar >.Foo │ This is valid ...
└───────────────────┘
... but you can only access constructors if they match an union alternative of
the same name.
For example, the following expression is not valid:
┌───────────────────┐
│ < Foo | Bar >.Baz │
└───────────────────┘
⇧
Invalid: the union has no ❰Baz❱ alternative
You tried to access a constructor named:
↳ read-only
... but the constructor is missing because the union only defines the following
alternatives:
↳ < admin | editor | viewer >
────────────────────────────────────────────────────────────────────────────────
11│ Permissions.read-only
config.dhall:11:44
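One further consequence of using a union, shown as a sketch rather than as part of the configuration above: a function that consumes Permissions with merge has to provide a handler for every alternative, so adding a new permission later forces every such function to be updated. The function name priorityFor and the idea of deriving the priority from the permissions are assumptions made for this example.

```dhall
let Permissions = < admin | editor | viewer >

-- Hypothetical helper: one handler per alternative. A missing or unused
-- handler is rejected by the type checker.
let priorityFor =
      \(p : Permissions) -> merge { admin = 100, editor = 10, viewer = 1 } p

in  { roles =
      [ { name = "admins"
        , permissions = Permissions.admin
        , priority = priorityFor Permissions.admin
        }
      , { name = "guests"
        , permissions = Permissions.viewer
        , priority = priorityFor Permissions.viewer
        }
      ]
    }
```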
Now we’ve done pretty much all we can do to ensure that our configuration is as correct as possible and by using Dhall our CI system can inform us about such problems before they turn into failing deployments or runtime errors. Of course we can’t catch all possible errors before we actually run an application (e.g. we specify a wrong but still valid email address) but we should try to catch problems as early as possible.
Where to Go From Here
We’ve only scratched the surface!
- Union types are much more expressive: each alternative can carry a value of another type, not just a fixed name.
- Dhall can automatically embed type information in generated JSON/YAML.
- Using packages and default values.
- You can bypass more limitations of JSON/YAML by using Dhall to write your configuration in a way that makes sense semantically and then using a function to generate one or multiple output files.
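As a small taste of two of these points, the following sketch combines a union alternative that carries a value with default values filled in through Dhall's record completion operator (::). The custom alternative, the "auditors" role and the "read-logs" entry are hypothetical and only serve the illustration; whether the application could consume the resulting output is outside the scope of this sketch.

```dhall
-- `custom` carries a list of text values instead of being a bare name.
let Permissions = < admin | editor | viewer | custom : List Text >

-- A record with `Type` and `default` fields can be used with `::`.
let Role =
      { Type = { name : Text, permissions : Permissions, priority : Natural }
      , default = { permissions = Permissions.viewer, priority = 1 }
      }

in  { roles =
      [ -- Only the name is given, the rest comes from Role.default.
        Role::{ name = "guests" }
      , Role::{
        , name = "auditors"
        , permissions = Permissions.custom [ "read-logs" ]
        }
      ]
    }
```

Fields without a default, such as name here, still have to be provided explicitly, otherwise the completed record does not match Role.Type.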