Platform services model classification – be aware of what you need

Platform services play an increasingly important role in cloud infrastructures. They enable application operators to quickly assemble the dependencies they need to run their applications. For example, if your application needs a database, you simply request a service instance for the database of your choice and connect it to your application. Done.

Specifications like the Open Service Broker API provide a standard interface to provide backing services such as databases or analytics to applications. However, the nature of a service offered by someone else to your application is not always that clearly defined and therefore you should pay attention to the exact contract the service offers.

At meshcloud we work with customers to integrate Open Service Broker API compatible services into a private multi-cloud marketplace. Based on that experience, we provide a compact check matrix for clear communication and understanding what type of service you receive when requesting a service instance. The following abstract scheme might not necessarily be complete (please comment if you have anything to add), but it gives a first idea which questions to ask and to ensure there is no misunderstanding between service owner and service consumer.

Service Model classification matrix

[Table: service model classification matrix]

The gold standard and typical understanding of a platform service is certainly the Managed Blackbox. However, there are cases in which the other service models make sense, e.g. for highly customizable systems such as a Kubernetes cluster. Providing a K8s cluster as an unmanaged whitebox service would mean you get a fully provisioned K8s cluster and take over from there for further configuration and maintenance. You still save the time it takes to set up and provision a cluster on your own, but you don't have to bear the costs of a fully managed K8s cluster.

In any case, there should be no misunderstanding between service vendor and consumer as to what the level of support really is. Especially when procuring services becomes fast and easy and takes only a few clicks, simply assuming the vendor will take care of everything might lead to unpleasant surprises. Make sure you know the exact service conditions, which are hopefully communicated transparently and are easy to access.


Transferring large Datasets to Swift Object Storage through a CLI

Once you have discovered the possibilities of object storage, you may want to migrate your apps and services to it. When migrating a service, you have to move all its data into the new storage. While there is a Swift CLI, it has limitations on file and folder sizes, which shouldn't be any larger than 10 GB. To avoid running into this limitation, you need to write your own upload script based on the Swift CLI. Follow this tutorial to see how to do that:

Setup

First of all, we need the Swift CLI. Since it's written in Python, and messing around with Python versions and libraries is kind of annoying, I prefer to use a tool like "Virtualenv" for an isolated Python environment. Make sure that you've installed pip before following the next steps.

pip install virtualenv
virtualenv ENV_DIRECTORY_YOU_WANT_YOUR_ENV_IN
source ENV_DIRECTORY_YOU_WANT_YOUR_ENV_IN/bin/activate

The last three commands set up our Python environment. Now we can install and set up the Swift client. Just run these three commands and you'll be set. They are run without sudo so that the packages end up inside the activated virtualenv rather than in the system Python.

pip install --upgrade setuptools
pip install python-swiftclient
pip install python-keystoneclient

Having installed the CLI, you now have to authenticate against the server. There are different methods to do so; we'll choose the easiest one, configuration through environment variables of the shell. If you want to authenticate via HTTP request, feel free to read this OpenStack doc. For the meshcloud OpenStack authentication you'll need to use the Keystone V3 API, as V2 is deprecated and not supported by meshcloud.

First of all, you have to get access to your OpenStack credentials. If you are a meshcloud customer, log on to the meshPanel and choose your project and the datacenter you want to store your data in. To access the credentials needed for the Keystone API, click on the last item in the sidebar, called "Service User". Here you can create a service user by typing in a description, choosing "OpenStack" as the platform and hitting the "Plus" button. After you create a service user, a download starts automatically, providing you with everything you need to authenticate. The downloaded file contains a Bash script section for your operating system. Just copy all instructions of this section and paste them into the terminal you started Virtualenv in. That's it. The Swift CLI should now be able to authenticate.
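
The downloaded file is specific to your project, so it is not reproduced here. As a rough sketch of what a Keystone V3 environment tends to look like: the variable names below are the standard OS_* names read by the OpenStack clients, every value is a placeholder, and the require_env helper is a hypothetical addition for checking the setup.

```shell
# Placeholder values only; use the ones from your downloaded credentials file.
export OS_AUTH_URL="https://keystone.example.com:5000/v3"
export OS_IDENTITY_API_VERSION="3"
export OS_USERNAME="service-user"
export OS_PASSWORD="change-me"
export OS_PROJECT_NAME="my-project"
export OS_USER_DOMAIN_NAME="Default"
export OS_PROJECT_DOMAIN_NAME="Default"

# Hypothetical helper: report any of the given variables that are unset or empty.
require_env() {
  missing=0
  for name in "$@"; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

require_env OS_AUTH_URL OS_USERNAME OS_PASSWORD OS_PROJECT_NAME && echo "environment looks complete"
```

Running the check before the first swift command makes a forgotten variable visible immediately instead of surfacing as an authentication error later.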

Before you leave the meshcloud Panel, make sure you have created a Swift bucket. To do so, click on "Objects" in the sidepanel and enter a name of your choice for your container.

Since this is an upload script and uploading is a great task for parallelisation, we use the GNU tool parallel to gain performance. The installation of parallel is pretty straightforward.

(wget -O - pi.dk/3 || curl pi.dk/3/ || fetch -o - http://pi.dk/3) | bash

The Script

Now, with a fresh Swift CLI setup, you can play around a bit to check whether everything is working as intended. Next we have to think about what our script should do. We need to upload every single file in the directories and all their child directories by starting a Swift client instance for every file.

To do so, we create a shell script written as follows:

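#!/bin/bash

# Collect one "swift upload" command per file in commands.txt.
upload(){
  URI=$1
  BUCKET=$2
  # Read find's output line by line so that paths containing spaces stay intact.
  find "$URI" -type f -not -name '.*' | while IFS= read -r file
  do
    echo "swift upload \"$BUCKET\" \"$file\"" >> commands.txt
  done
}
upload "$1" "$2"

# Run all collected commands in parallel, then remove the command list.
parallel < commands.txt
rm commands.txt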

So let's step through the script to understand what's happening there. First of all, we define a function called upload which takes two parameters. The first parameter is the path of the folder the script should upload, and the second one is the Swift bucket (e.g. the one you created before) the files should be stored in. Then we iterate over the set of files provided by this neat chunk of code: find $URI -type f -not -name '.*'. The find command searches through a folder hierarchy; we pass our file path to it and tell it to look for files only with -type f, with -not -name '.*' excluding dotfiles. If you need other filters, feel free to read the man page of find.

In the body of the loop, we write the string "swift upload $BUCKET $file" to a file called commands.txt. After running through all files, this file contains the commands to upload every single file to the Swift Object Storage. We do that instead of running the commands directly so that we can parallelize the uploads with GNU parallel. This tool keeps a constant pool of concurrently running jobs and saves us from hacking things together with the very poor synchronisation mechanisms of the shell.

After the function definition, the script executes its first instructions. Firstly, we call the function we just declared with the shell parameters $1 and $2. Then it's time to process all the newly created commands in our .txt file. We do so by passing the commands to GNU parallel, which runs them as a pool of jobs of a consistent size. If you want a custom number of jobs running at the same time, you can specify it with the -j $amount option. If you want more control over the jobs, parallel offers a lot of further options; just read through the man page.
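
As a small illustration of the -j option, the wrapper below runs a command file with a chosen number of parallel jobs. The function name run_commands and its default of 4 jobs are assumptions made for this sketch.

```shell
# Hypothetical wrapper around GNU parallel: run every line of a command
# file as a separate job, with at most $2 jobs at a time (default: 4).
run_commands() {
  commands_file=$1
  job_count=${2:-4}
  parallel -j "$job_count" < "$commands_file"
}

# Usage (requires GNU parallel and an existing commands.txt):
#   run_commands commands.txt 8
```

A higher job count mostly helps with many small files; for a few large files the network link is likely to be the limiting factor.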

After uploading all the files, we delete the file containing the instructions and the shell script exits. Now save the script (as upload.sh in this example) and make it executable, e.g. with chmod +x upload.sh, just like every other script.

You can run it now with

./upload.sh YOUR_FILE_PATH YOUR_BUCKET_NAME

CAUTION: Swift does not know directories. Many tools use the object name to store the complete path. The Swift CLI names each uploaded object after the path it was given as input. If you need relative paths in your object names, make sure to call the script with exactly that relative path.
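
To see in advance which paths a given invocation would hand to the CLI, it can help to list them without uploading anything. The helper name preview_object_names is hypothetical, and the sketch assumes, as described above, that object names are derived from exactly these paths.

```shell
# Hypothetical helper: print the paths the upload script would hand to
# "swift upload" for the given directory, without uploading anything.
preview_object_names() {
  find "$1" -type f -not -name '.*' | sort
}

# Called with an absolute path, the helper prints absolute paths:
#   preview_object_names /home/me/data
# Changing into the parent directory first yields relative paths:
#   (cd /home/me && preview_object_names data)
```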

Evaluation of the Upload

When transferring a huge amount of application data of an active service, you want to make sure that every file reached its destination safely. Sadly, there was a huge mismatch between the size reported by the object store and the size on my HFS+ filesystem, which nearly caused a heart attack when I first saw it.

Since size is not a valid criterion, the first and easiest check is to compare the number of files in the bucket and in the directory. To count the number of files in the directory and every underlying directory, you can use the find command from our script and pipe it into wc.

find $URI -type f -not -name '.*' | wc -l

Or, if you want to rely on another command, you can use this one, which uses the recursive mode of ls, strips out directories via grep, removes empty lines via sed and finally counts the remaining lines.

ls -pR /Users/jannikheyl/Downloads/cf.eu-de-darz.msh.host-cc-packages/ | grep -v / | sed '/^$/d' | wc -l

The number of files in the bucket can be found in the meshPanel at the bottom of the bucket overview, or on the command line by typing:

swift stat YOURBUCKET
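
The comparison of the two counts can also be scripted. The sketch below assumes that swift stat prints a line of the form "Objects: N" for a container; the function names are hypothetical.

```shell
# Count local files the same way the upload script selects them.
local_file_count() {
  find "$1" -type f -not -name '.*' | wc -l | tr -d ' '
}

# Extract the object count from "swift stat BUCKET".
# Assumption: the output contains a line like "Objects: 42".
bucket_object_count() {
  swift stat "$1" | awk '$1 == "Objects:" { print $2 }'
}

# Print both counts and fail if they differ.
compare_counts() {
  local_count=$(local_file_count "$1")
  bucket_count=$(bucket_object_count "$2")
  echo "local: $local_count, bucket: $bucket_count"
  [ "$local_count" = "$bucket_count" ]
}

# Usage: compare_counts YOUR_FILE_PATH YOUR_BUCKET_NAME
```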

Even though Swift uses MD5 hashes during upload and exposes them as ETags, there is no proper way to retrieve all hashes of a bucket at once. If you want to be 100% sure that your data is healthy, download the data you uploaded again, put it into a folder and run it through this script.

#!/bin/bash

# Queue one "md5 -q" command per file of the original folder.
origins(){
  URI=$1
  rm -f files.txt hash_origin.txt
  find "$URI" -type f -not -name '.*' | while IFS= read -r file
  do
    echo "md5 -q \"$file\" >> hash_origin.txt" >> files.txt
  done
}

# Queue one "md5 -q" command per file of the downloaded folder.
downloaded(){
  URI=$1
  rm -f hash_downloaded.txt
  find "$URI" -type f -not -name '.*' | while IFS= read -r file
  do
    echo "md5 -q \"$file\" >> hash_downloaded.txt" >> files.txt
  done
}
origins "$1"
downloaded "$2"

# Compute all hashes in parallel, then sort both lists into the same order.
parallel -j 20 < files.txt
sort -d hash_origin.txt > hash_origin1.txt
sort -d hash_downloaded.txt > hash_downloaded1.txt

rm hash_downloaded.txt
rm hash_origin.txt
# diff prints nothing if both hash lists are identical.
diff hash_origin1.txt hash_downloaded1.txt

It takes two folders as input: your original folder and the folder downloaded from your Swift storage. It computes an MD5 hash of every file (in parallel), writes those hashes to one file per folder, sorts both files into the same order and prints a diff. If the diff is empty, the hashes are identical and you can be sure that your files are feeling warm and fuzzy in their new home.
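
One limitation of comparing bare hashes is that a mismatch does not tell you which file is affected, and md5 -q only exists on macOS. As a possible variation, not the script used above, the sketch below records each hash together with its relative path and uses md5sum where it is available. It hashes sequentially rather than in parallel, and all function names are hypothetical.

```shell
# Print the MD5 hash of one file, using md5sum (Linux) or md5 (macOS).
file_md5() {
  if command -v md5sum > /dev/null 2>&1; then
    md5sum < "$1" | cut -d ' ' -f 1
  else
    md5 -q "$1"
  fi
}

# Print "hash  relative/path" for every non-hidden file below a directory,
# sorted by path so that two manifests can be compared with diff.
hash_manifest() (
  cd "$1" || exit 1
  find . -type f -not -name '.*' | sort | while IFS= read -r file; do
    printf '%s  %s\n' "$(file_md5 "$file")" "$file"
  done
)

# Compare an original folder with its downloaded copy.
# Prints the differing manifest lines; returns 0 only if there are none.
compare_folders() {
  original_manifest=$(mktemp)
  downloaded_manifest=$(mktemp)
  hash_manifest "$1" > "$original_manifest"
  hash_manifest "$2" > "$downloaded_manifest"
  diff "$original_manifest" "$downloaded_manifest"
  status=$?
  rm -f "$original_manifest" "$downloaded_manifest"
  return "$status"
}

# Usage: compare_folders ORIGINAL_FOLDER DOWNLOADED_FOLDER
```

Because each manifest line carries the path, a changed, missing or extra file shows up by name in the diff output.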

Troubleshooting

If you run into the problem stated below, run pip install 'requests[security]' to resolve it.

/usr/local/lib/python2.7/dist-packages/requests/packages/urllib3/util/ssl_.py:79: InsecurePlatformWarning: A true SSLContext object is not
available. This prevents urllib3 from configuring SSL appropriately and
may cause certain SSL connections to fail. For more information, see
https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.

Federated Authentication with the OpenStack CLI

Multi-Cloud applications are a core business for us here at meshcloud. Therefore, we also put some effort into the integration of several OpenStack locations using federation. After all, meshcloud is a federation of public clouds.

Recently, we re-assessed federated authentication with the OpenStack CLI client, using v3.12.0 against a Mitaka OpenStack environment. On the identity provider (IdP) side, we use Keycloak's OIDC capabilities.

Password vs. Token Auth

We look at two options for federated authentication at the command line:

v3oidcpassword

Auth Method: User/Password credentials of Identity Provider

Pros:

  • Static RC file, no refresh of token necessary
  • RC file itself does not provide any access without password

Cons:

  • Password necessary at CLI
  • Manual Keystone Token Issuing
  • Risk of using password at CLI

v3oidcaccesstoken

Auth Method: OIDC JSON Web Token (JWT) issued by Identity Provider

Pros:

  • Direct bearer usage, no password necessary (=> automation)
  • Keystone token automatically issued

Cons:

  • Access not secured if RC file is leaked

So, depending on your use case, you might prefer one over the other. In any case, you prepare an appropriate RC file (i.e. a shell script setting the environment variables that configure the OpenStack CLI).

v3oidcpassword

Environment

If you want to use the OpenStack command line using your Identity Provider's user/password credentials, create an RC file like this:

# clear unnecessary settings
unset OS_PROJECT_NAME
unset OS_REGION_NAME
# set needed settings
export OS_INTERFACE="public"
# insert here Keystone auth url for the specific cloud
export OS_AUTH_URL="https://keystone.example.com:5000/v3"
export OS_IDENTITY_PROVIDER="keycloak-idp"
export OS_PROTOCOL="oidc"
export OS_CLIENT_ID="meshfed-oidc"
# use any value for client secret if we use a public oidc client
export OS_CLIENT_SECRET="ac2aa84b7685-5ffb-9f1d"
export OS_DISCOVERY_ENDPOINT="https://idp.example.org/auth/realms/meshfed/.well-known/openid-configuration"
export OS_IDENTITY_API_VERSION="3"
export OS_AUTH_TYPE="v3oidcpassword"
# insert here the user name to authenticate
export OS_USERNAME="username"
# this is the local openstack project id
export OS_PROJECT_ID="27a7e59d391d55c6cf4ead12227da57e"
# set password by querying user
export OS_PASSWORD=""
echo "Please enter your Meshcloud Password: "
read -sr OS_PASSWORD_INPUT
export OS_PASSWORD="$OS_PASSWORD_INPUT"

OS_AUTH_URL refers to the Keystone auth endpoint of the cloud you want to access. OS_IDENTITY_PROVIDER is the label you chose with openstack identity provider create when creating the identity provider within OpenStack. OS_CLIENT_ID is the name of the OIDC client configured at the IdP. It is important that you set OS_CLIENT_SECRET even if you have not configured a confidential client at the IdP, because the OpenStack CLI expects it to be present. If you use a "public" OIDC client configuration, just put in a dummy value. OS_DISCOVERY_ENDPOINT sets the metadata endpoint of the IdP server. It saves you from setting a lot of other config options (it's already more than enough, isn't it?).

Of course, we set OS_AUTH_TYPE to v3oidcpassword. The username of your IdP account goes into OS_USERNAME, and OS_PROJECT_ID needs the local project id of the OpenStack project you want to access. See below if you do not know the project id. The remainder of the script asks you for your password on the command line, but you can also set OS_PASSWORD directly.

Command line usage

Now get to your prompt and source the RC file:

➜ source v3oidcpassword.sh
Please enter your Meshcloud Password:

After entering your password, issue a Keystone token; then you can start issuing commands:

➜ cli-test openstack token issue
+------------+----------------------------------+
| Field      | Value                            |
+------------+----------------------------------+
| expires    | 2017-08-23T13:37:33+0000         |
| id         | d6d222ccfba24bfa9c85d5baa039f110 |
| project_id | 5edf2f36ac334618a614731a146a60ec |
| user_id    | 7ca9d31f46da4e52b44ac263745e4a77 |
+------------+----------------------------------+
(openstack) router create test
(openstack) router list
+--------------+------+--------+-------+-------------+-------+----------------------------------+
| ID           | Name | Status | State | Distributed | HA    | Project                          |
+--------------+------+--------+-------+-------------+-------+----------------------------------+
| d1907a5b-... | test | ACTIVE | UP    | False       | False | 5edf2f36ac334618a614731a146a60ec |
+--------------+------+--------+-------+-------------+-------+----------------------------------+

If you would like to see the projects you have access to, e.g. to get the right project id for the RC file, issue:

openstack federation project list

And you will see a list of projects you can access with your federated account. Pretty cool, huh? :-)

v3oidcaccesstoken

If you don't want to deal with passwords at the command line, v3oidcaccesstoken is an alternative that authenticates with OpenStack using the JWT issued by the IdP. A JWT is an encoded JSON data set containing identity and authorization information. You can decode one using jwt.io.
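
If you would rather not paste a token into a website, the payload can also be decoded locally, since a JWT consists of three base64url-encoded parts separated by dots. The function name jwt_payload is hypothetical, and the sketch only decodes the token; it does not verify the signature.

```shell
# Hypothetical helper: print the decoded payload (second part) of a JWT.
jwt_payload() {
  # Take the second dot-separated part and map base64url to plain base64.
  part=$(printf '%s' "$1" | cut -d . -f 2 | tr '_-' '/+')
  # base64url drops the trailing "=" padding, so restore it.
  case $(( ${#part} % 4 )) in
    2) part="${part}==" ;;
    3) part="${part}=" ;;
  esac
  printf '%s' "$part" | base64 --decode
}

# Usage: jwt_payload "$OS_ACCESS_TOKEN"
```

The exp claim in the decoded JSON tells you until when the token can still be used.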

Environment

export OS_INTERFACE="public"
export OS_IDENTITY_API_VERSION=3
export OS_AUTH_TYPE="v3oidcaccesstoken"
export OS_AUTH_URL="https://keystone.example.com:5000/v3"
export OS_IDENTITY_PROVIDER="keycloak-idp"
export OS_PROTOCOL="oidc"
export OS_ACCESS_TOKEN="eyJhbGciOiJSUzI1NiIsIn..........9uFum6TWK_69OAbM3RjFbjiDvg"
export OS_PROJECT_ID="27a7e59d391d55c6cf4ead12227da57e"

As you can see, the configuration is much simpler, as lots of information is contained in the JWT access token (it usually is a very long string; we just cut it here for display purposes). Sourcing this file enables you to issue a Keystone token (and hence do work in the project) as long as the OIDC JWT is valid.

Command line usage

Source the file, then start working:

➜ source v3oidcaccesstoken.sh
➜ openstack server list
+--------------------------------------+------------------+--------+---------------------------------------------+------------------------------------------+------------+
| ID                                   | Name             | Status | Networks                                    | Image                                    | Flavor     |
+--------------------------------------+------------------+--------+---------------------------------------------+------------------------------------------+------------+
| 6447040b-cc8c-46f3-91a8-949aa1744981 | flinktest        | ACTIVE | test=192.168.106.25, 212.56.234.218         | ubuntu-xenial-16.04_softwareconfig_0.0.2 | gp1.medium |
...

As you can see, this method takes fewer steps and does not require you to enter a password.

Using CLI access with meshcloud

To make multi-cloud access easier, meshcloud already prepares the necessary RC files for your project. Just go to the Panel and select the project and location you want to access. Choose "CLI Access" and you find the RC files for download. Source them into your shell and you're ready to go.

Outlook

The v3oidcpassword and v3oidcaccesstoken methods are very helpful for CLI access to federated OpenStack clouds. However, for automated access (e.g. scripts) neither of them is perfect: either you have to store a password or you only have a time-limited access token. The OIDC standard has a solution called Offline Tokens. Those are refresh tokens (i.e. tokens used to obtain a new access token without re-authenticating with credentials) that never expire. They are intended for automated procedures that need access to protected resources. Upon access, the bearer requests a fresh (short-lived) access token using the Offline Token. The major advantage is that Offline Tokens can be revoked on the IdP side. A revoked Offline Token cannot be used to acquire new access tokens, and hence access is disabled. So you still have a central veto when providing decentralized access without passwords.
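
As a rough sketch of that flow, the function below exchanges a stored offline token for a fresh access token using the standard OAuth 2.0 refresh_token grant. The token endpoint path follows Keycloak's usual layout and, like the variable names and the client id, is an assumption to adapt to your IdP.

```shell
# Assumed values; replace with the token endpoint and client of your IdP.
TOKEN_ENDPOINT="https://idp.example.org/auth/realms/meshfed/protocol/openid-connect/token"
OIDC_CLIENT_ID="meshfed-oidc"

# Request a new access token with an offline (refresh) token.
# Prints the raw JSON response of the token endpoint.
refresh_access_token() {
  curl --silent --fail \
    --data-urlencode "grant_type=refresh_token" \
    --data-urlencode "client_id=$OIDC_CLIENT_ID" \
    --data-urlencode "refresh_token=$1" \
    "$TOKEN_ENDPOINT"
}

# Usage: refresh_access_token "$OFFLINE_TOKEN"
```

The access_token field of the response could then be exported as OS_ACCESS_TOKEN before running the OpenStack CLI.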

Soon, meshcloud will provide an integrated CLI client that makes accessing multiple clouds and projects more convenient and helps you speed up with your open-source multi-cloud experience. Stay tuned :).

Questions

Any questions left? How are you using federated authentication with OpenStack? Let us know. We're looking forward to hearing from you.


Deploying Concourse CI on OpenStack using Docker

At meshcloud, we use a continuous delivery process to deliver cloud infrastructure and software updates. Since we operate multiple cloud platforms on a variety of hardware configurations managed by our partner providers, we need a continuous integration platform that enables us to test updates in a large number of configurations before we roll them out. The continuous integration server concourse.ci is a perfect match for these requirements with its immutable and flexible pipeline model. This pipeline model, which can execute build jobs in arbitrary docker containers, sets it apart from the more rigid pipelines offered by other continuous integration servers like Jenkins or TeamCity.

It's not a coincidence that Concourse was developed by Pivotal Software to meet the demands of the Cloud Foundry PaaS project. In fact, meshcloud also operates Cloud Foundry PaaS as one of the services on our open cloud federation. The easiest way to deploy Concourse is to use the docker images provided by the Concourse team. So without further ado, let's get right to it.

Virtual Network Setup

This tutorial assumes you have an OpenStack project with at least one floating (=public) IP. You should have created a private network called concourse-net in your OpenStack project that is configured with DHCP and DNS. This network needs to be attached to a router that has public internet access. You will also need these security groups set up in OpenStack:

  • ssh (TCP port 22)
  • docker (TCP ports 2375, 2376)

The provisioning of this network structure is beyond the scope of this tutorial, but you can read an excellent introduction at the OpenStack Superuser Blog.

Provisioning a Docker Host

The next thing we will need is a docker host that will execute the containers that make up Concourse. We will provision this host from our OpenStack cloud using docker-machine from the command line. Source your OpenStack credentials environment file (typically called openrc.sh) to load your OpenStack credentials: $ source openrc.sh.

After that, we will create a VM in OpenStack using docker-machine.

docker-machine create --driver openstack \
  --openstack-ssh-user ubuntu \
  --openstack-net-name 'concourse-net' \
  --openstack-image-name 'ubuntu-16.04' \
  --openstack-flavor-name gp.large \
  --openstack-floating-ip-pool public00 \
  --openstack-sec-groups default,ssh,docker \
  concourse

Tune these parameters to match your OpenStack environment, e.g. if you don't want to use Ubuntu or your floating-ip pool has a different name (usually the floating-ip-pool has the same name as your public network in OpenStack that you use to connect to the internet). Provisioning the machine may take a few minutes. Once complete, you should be able to do a docker-machine ls and see your freshly provisioned docker host running:

$ docker-machine ls
NAME        ACTIVE   DRIVER    STATE     URL                  SWARM   DOCKER    ERRORS
concourse   -        generic   Running   tcp://x.x.x.x:2376           v1.12.1

So now that we have a freshly minted docker host, let's deploy Concourse.

Concourse Configuration

A minimal Concourse setup consists of three components: a PostgreSQL database, the Concourse UI and one or more Concourse workers that run the actual builds. We will deploy these three components together on the same host using a docker-compose.yml file. Before we can do that, we need to generate RSA keys that the Concourse components use to communicate with each other.

RSA Keys

Create a "concourse-deploy" directory and run the following commands inside it to generate those keys:

mkdir -p keys/web keys/worker

ssh-keygen -t rsa -f ./keys/web/tsa_host_key -N ''
ssh-keygen -t rsa -f ./keys/web/session_signing_key -N ''

ssh-keygen -t rsa -f ./keys/worker/worker_key -N ''

cp ./keys/worker/worker_key.pub ./keys/web/authorized_worker_keys
cp ./keys/web/tsa_host_key.pub ./keys/worker

The next step is to create the actual docker-compose file. While the Concourse docs have a minimum example, their configuration is not "production ready" as it misses a few critical things. We have extended their configuration to fix the following issues.

Configuring HTTPS

In the minimum example, the Concourse UI is not protected by HTTPS. This matters when using HTTP basic auth, because the credentials would otherwise be sent unencrypted. To fix this, we deploy an nginx reverse proxy in front of the Concourse UI for SSL termination. This proxy uses an SSL certificate from Let's Encrypt. We use Docker Let's Encrypt Companion to provide the nginx and Let's Encrypt scaffolding for us.

Limit Docker Log Size

Docker will by default collect all log output by a container and never roll these logs over. This will very quickly fill up your disk with logs if you're not careful. We hence limit log output per container using:

log_driver: json-file
log_opt: # limit log file size to prevent indefinite growth
  max-size: "10m"

Restart containers with host

Since Concourse runs on a virtual machine in the cloud, we have to expect that this VM can be terminated and restarted at any time. Docker therefore needs to restart the containers automatically after the host reboots, so we set restart: always on each container.
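
For reference, the log limit and the restart policy can also be expressed as flags of docker run. This is only meant to illustrate the two settings on a single container, not to replace the compose file; the function name and the image name in the usage line are placeholders.

```shell
# Hypothetical helper: start a detached container with capped json-file
# logs (10 MB per file) and automatic restart after a host reboot.
run_with_limits() {
  docker run -d \
    --log-driver json-file \
    --log-opt max-size=10m \
    --restart always \
    "$@"
}

# Usage: run_with_limits --name web example/image:latest
```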

Deploy Concourse

You can find our full docker-compose.yml file in this gist. Save this file to the root of your "concourse-deploy" directory. Now target the Concourse docker-host created earlier:

eval $(docker-machine env concourse)

We need to fill in a few parameters via environment variables. You'll need to remember the generated passwords.

export POSTGRES_PASS=XXX # Insert a random database password
export CONCOURSE_DOMAIN=example.com # the domain you'll use to host Concourse
export LETSENCRYPT_MAIL=test@example.com # email to verify your let's encrypt account, must match CONCOURSE_DOMAIN
export CONCOURSE_PASS=XXX # Insert a random password for the Concourse main team.
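
The two passwords can be generated on the command line. The helper below is one possible way, reading random bytes from /dev/urandom; its name and the default length of 32 characters are arbitrary choices.

```shell
# Hypothetical helper: print a random alphanumeric password (default 32 chars).
random_password() {
  LC_ALL=C tr -dc 'A-Za-z0-9' < /dev/urandom | head -c "${1:-32}"
  echo
}

export POSTGRES_PASS="$(random_password)"
export CONCOURSE_PASS="$(random_password)"
```

Store the generated values somewhere safe, since you will need them again later.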

Now we can finally deploy Concourse:

docker-compose up -d

That's it, you should now have a working Concourse installation at your configured domain.