Automatically generate PKI certificates with Vault

A while ago I wrote a post on how to set up a secure Vault with Consul as backend, and it's time to do something with Vault again. In this blogpost we will set up Vault with the PKI backend. With the PKI backend we can generate and revoke short-lived SSL certificates with Vault.

The goal of this blogpost is to create an intermediate CA certificate, configure Vault and generate certificates via the command line and via the API. The reason we use an intermediate CA certificate is that if something happens with the certificate/key, it's much easier to revoke it and create a new intermediate certificate. If this would happen with the actual ROOT CA, you'd have a lot of trouble and work to fix it again. So keep the ROOT CA files in a safe place!

Preparations

We will create an intermediate certificate that Vault will be using to create and sign certificate requests. We have to create a new key and the certificate needs to be signed by the ROOT CA. First we create the key:

openssl genrsa -out private/intermediate_ca.key.pem 4096

And now we need to create a certificate signing request:

openssl req -config intermediate/openssl.cnf -new -sha256 \
-key private/intermediate_ca.key.pem -out \
csr/intermediate_ca.csr.pem

We have to make sure that we fill in the same information as the original CA, but in this case we use a slightly different Organisation Unit name so we can verify that a certificate is signed by this intermediate CA instance. Once we have filled in all data, we have to sign the request with the ROOT CA to create the actual certificate:

openssl ca -keyfile private/cakey.pem -cert \
dj-wasabi.local.pem -extensions v3_ca -notext -md \
sha256 -in csr/intermediate_ca.csr.pem -out \
certs/intermediate_ca.crt.pem
Using configuration from /etc/pki/tls/openssl.cnf
Check that the request matches the signature
Signature ok
Certificate Details:
        Serial Number: 18268543712502854739 (0xfd86e7b7336db453)
        Validity
            Not Before: Aug 23 13:56:08 2017 GMT
            Not After : Aug 21 13:56:08 2027 GMT
        Subject:
            countryName               = NL
            stateOrProvinceName       = Utrecht
            organizationName          = dj-wasabi
            organizationalUnitName    = Vault CA
            commonName                = dj-wasabi.local
            emailAddress              = ikben@werner-dijkerman.nl
        X509v3 extensions:
            X509v3 Subject Key Identifier: 
                93:46:3D:69:24:32:C7:11:C4:B7:27:66:89:67:FB:1F:8E:1B:50:97
            X509v3 Authority Key Identifier: 
                keyid:60:63:7E:0F:54:5E:7D:A5:37:A8:6F:BD:27:BF:73:15:56:B2:89:31

            X509v3 Basic Constraints: 
                CA:TRUE
Certificate is to be certified until Aug 21 13:56:08 2027 GMT (3650 days)
Sign the certificate? [y/n]:y

1 out of 1 certificate requests certified, commit? [y/n]y
Write out database with 1 new entries
Data Base Update

With the -keyfile and -cert options we provide the key and crt file of the root CA to sign the new intermediate SSL certificate. Ok, 10 years might be a little bit too long, but this is just for my local environment and my setup probably won't last that long. 🙂

We are almost done with the preparations; one thing remains before we can configure Vault. We have to combine both CA certificates and the intermediate private key into a single file, before we can upload it to Vault.

cat certs/intermediate_ca.crt.pem dj-wasabi.local.pem \
private/intermediate_ca.key.pem > certs/ca_bundle.pem

First we print the contents of the newly created crt file, then the ROOT CA crt file and finally the intermediate private key, and we place it all in a single file called ca_bundle.pem.
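
Before we upload the bundle, we can do a quick sanity check to verify that the intermediate certificate is really signed by our ROOT CA. A small sketch with openssl, assuming dj-wasabi.local.pem is the ROOT CA certificate file used above:

openssl verify -CAfile dj-wasabi.local.pem certs/intermediate_ca.crt.pem

If everything is fine, this prints certs/intermediate_ca.crt.pem: OK.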

Vault

Now we are ready to continue with the Vault part. We open a terminal to the host/container running Vault and before we can do anything, we have to authenticate ourselves first. I use the root token for authenticating:

export VAULT_TOKEN=<_my_root_token_>

The pki backend is disabled by default, so we have to enable it before we can use it. You can enable it multiple times, and each enabled backend can be used for a specific domain. In this post we only use one domain, but let's pretend we need to create a lot more after this, so we don't use "defaults" in paths and naming.

We will mount the pki plugin for the dj-wasabi.local domain, so let's use the path dj-wasabi. We give it a small description, specify the pki backend and hit enter.

vault mount -path=dj-wasabi -description="dj-wasabi Vault CA" pki

There are some more options we don't use in this example, but if you want some more control you can see them by executing the command vault mount --help.
We can verify that we have mounted the pki backend by executing the vault mounts command:

bash-4.3$ vault mounts
Path        Type       Accessor            Plugin  Default TTL  Max TTL    Force No Cache  Replication Behavior  Description
cubbyhole/  cubbyhole  cubbyhole_2540c354  n/a     n/a          n/a        false           local                 per-token private secret storage
dj-wasabi/  pki        pki_6e5dc562        n/a     system       system     false           replicated            dj-wasabi Vault CA
secret/     generic    generic_fb0527dd    n/a     system       system     false           replicated            generic secret storage
sys/        system     system_347beff9     n/a     n/a          n/a        false           replicated            system endpoints used for control, policy and debugging

Now it's time to upload the intermediate bundle file. I have temporarily placed the file in the config directory of Vault (it's a host mount, so it was easier to copy the file to the container) and now we have to upload it to our dj-wasabi backend. We have to upload our CA bundle file to the path we used earlier to mount the pki backend, <mount_path>/config/ca, in my case dj-wasabi/config/ca:

vault write dj-wasabi/config/ca \
pem_bundle="@/vault/config/ca_bundle.pem"
Success! Data written to: dj-wasabi/config/ca

If you get an error now, it probably means something went wrong with either creating the ca bundle file or validating the intermediate certificate.

Now we need to set the correct URLs. These URLs are placed in the certificates that are generated and allow browsers/applications to do some validations. We will set the following URLs:

  • issuing_certificates: The endpoint on which browsers/3rd party tools can request information about the CA;
  • crl_distribution_points: The endpoint on which the Certificate Revocation List is available. This is a list of revoked certificates;
  • ocsp_servers: The URL on which the OCSP service is available. OCSP stands for Online Certificate Status Protocol and is used to determine the state of a certificate. You can see it as a better version of the Certificate Revocation List;

Let's configure the URLs:

vault write dj-wasabi/config/urls \
issuing_certificates="https://vault.service.dj-wasabi.local:8200/v1/dj-wasabi/ca" \
crl_distribution_points="https://vault.service.dj-wasabi.local:8200/v1/dj-wasabi/crl" \
ocsp_servers="https://vault.service.dj-wasabi.local:8200/v1/dj-wasabi/ocsp"
Success! Data written to: dj-wasabi/config/urls

We will come back to this later in the blogpost. 🙂

Before we can generate certificates, we need to create a role in Vault. With this role we map a name to a policy. This policy describes the configuration that is needed for generating the certificates. For example, we have to configure for which domain we may create certificates, whether we can create subdomains and, most important, what the TTL of a certificate is.

vault write dj-wasabi/roles/dj-wasabi-dot-local allowed_domains="dj-wasabi.local" allow_subdomains="true" max_ttl="72h"
Success! Data written to: dj-wasabi/roles/dj-wasabi-dot-local

We are all set now, so let's create a certificate.

We specify the role we just created and at minimum we have to provide the common_name (in this case small-test.dj-wasabi.local). You can find all the options you can provide when generating a certificate here. The command looks like this:

vault write dj-wasabi/issue/dj-wasabi-dot-local common_name=small-test.dj-wasabi.local
Key             	Value
---             	-----
ca_chain        	[-----BEGIN CERTIFICATE-----
MIIFtTCCA52gAwIBAgIJAP2G57czbbRTMA0GCSqGSIb3DQEBCwUAMFcxCzAJBgNV
...
-----END CERTIFICATE-----
issuing_ca      	-----BEGIN CERTIFICATE-----
MIIFtTCCA52gAwIBAgIJAP2G57czbbRTMA0GCSqGSIb3DQEBCwUAMFcxCzAJBgNV
...
-----END CERTIFICATE-----
private_key     	-----BEGIN RSA PRIVATE KEY-----
MIIEpQIBAAKCAQEAsFSmpBCFN945+Chyz/YqsB2a/T73kdst4v7qm2ZLK50RxCj0
...
-----END RSA PRIVATE KEY-----
private_key_type	rsa
serial_number   	03:f2:bb:f5:27:16:81:20:76:0d:91:6f:fd:10:05:2d:a6:e1:59:e3

The command returns a lot of information and I have removed some of it to not fill a whole page with unreadable data. It provides all the data you'll need for a service that needs SSL certificates. As you see, it provides the certificate and the private_key, but also the ca_chain.
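
If you want to store this data directly in files, something like the following might help. This is just a sketch: it assumes you have jq installed and uses the -format=json option of the vault command; the file names are examples:

vault write -format=json dj-wasabi/issue/dj-wasabi-dot-local \
common_name=small-test.dj-wasabi.local > small-test.json

# Extract the certificate, private key and issuing CA into separate files
jq -r .data.certificate small-test.json > small-test.dj-wasabi.local.crt
jq -r .data.private_key small-test.json > small-test.dj-wasabi.local.key
jq -r .data.issuing_ca small-test.json > small-test.dj-wasabi.local.ca.crt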

API

Let's generate an SSL certificate via the API.

curl -XPOST -k -H 'X-Vault-Token: <_my_root_token_>' \
-d '{"common_name": "blog.dj-wasabi.local"}' \
https://vault.service.dj-wasabi.local:8200/v1/dj-wasabi/issue/dj-wasabi-dot-local

We do a POST, and as a minimum we only provide the common_name (in this case blog.dj-wasabi.local). We send the X-Vault-Token, which in my case is the ROOT token, as a header and we post to the URL https://vault.service.dj-wasabi.local:8200/v1/dj-wasabi/issue/dj-wasabi-dot-local. If you remember, dj-wasabi-dot-local is the name of the role, so this role provides the correct TTL etc.

Let's execute it; once the certificate is created, a lot of output is returned in JSON format:

curl -XPOST -k -H 'X-Vault-Token: <_my_root_token_>' \
-d '{"common_name": "blog.dj-wasabi.local"}' \
https://vault.service.dj-wasabi.local:8200/v1/dj-wasabi/issue/dj-wasabi-dot-local
{"request_id":"e1d0f686-d0d8-d1d8-d7ab-428c7322229b","lease_id":"","renewable":false,"lease_duration":0,"data":{"ca_chain":["-----BEGIN CERTIFICATE-----asas-----END CERTIFICATE-----","-----BEGIN CERTIFICATE----asas------END CERTIFICATE-----"],"certificate":"-----BEGIN CERTIFICATE-----asas-----END CERTIFICATE-----","issuing_ca":"-----BEGIN CERTIFICATE-----asas-----END CERTIFICATE-----","private_key":"-----BEGIN RSA PRIVATE KEY-----asas-----END RSA PRIVATE KEY-----","private_key_type":"rsa","serial_number":"11:42:ba:66:94:b4:c9:5c:e5:1a:77:da:76:2e:57:5d:b5:64:f5:c3"},"wrap_info":null,"warnings":null,"auth":null}

Again I removed a lot of unreadable data from the example. You'll see the private_key, certificate and the ca_chain, which can be used with a service like nginx.
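
As a sketch of how this could look in nginx (the file names are assumptions: I saved the certificate followed by the ca_chain in the .pem file, since nginx expects the certificate and the chain concatenated, and the private_key in the .key file):

server {
    listen 443 ssl;
    server_name blog.dj-wasabi.local;

    # The certificate followed by the ca_chain, concatenated into one file
    ssl_certificate     /etc/nginx/ssl/blog.dj-wasabi.local.pem;
    # The private_key as returned when the certificate was issued
    ssl_certificate_key /etc/nginx/ssl/blog.dj-wasabi.local.key;
}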

Let's get an overview of all certificates stored in our Vault:

curl -H 'X-Vault-Token: <_my_root_token_>' \
--request LIST https://vault.service.dj-wasabi.local:8200/v1/dj-wasabi/certs
{"request_id":"fb5e7060-0d02-211a-ae25-50507a334706","lease_id":"","renewable":false,"lease_duration":0,"data":{"keys":["03-f2-bb-f5-27-16-81-20-76-0d-91-6f-fd-10-05-2d-a6-e1-59-e3","11-42-ba-66-94-b4-c9-5c-e5-1a-77-da-76-2e-57-5d-b5-64-f5-c3"]},"wrap_info":null,"warnings":null,"auth":null}

We see that there are 2 certificates stored in Vault: the "keys" list has 2 values. These keys are the serial numbers of the certificates. We have to use this serial number if we want to revoke a certificate or just want to retrieve it. An example of getting a certificate:

curl -XGET -H 'X-Vault-Token: df80e726-d3f0-8344-3782-fec19fe7a745' \
https://vault.service.dj-wasabi.local:8200/v1/dj-wasabi/cert/11-42-ba-66-94-b4-c9-5c-e5-1a-77-da-76-2e-57-5d-b5-64-f5-c3
{"request_id":"ae6e63f9-c04e-ac4c-d8a8-254347284771","lease_id":"","renewable":false,"lease_duration":0,"data":{"certificate":"-----BEGIN CERTIFICATE-----asasas-----END CERTIFICATE-----\n","revocation_time":0},"wrap_info":null,"warnings":null,"auth":null}

Again I removed some data from the example. You can only get the certificate, not the private key. I've copied the contents of the certificate into a file called blog.dj-wasabi.local.crt on my Mac, so when I run the openssl x509 command, it will show some information about this certificate:

openssl x509 -in blog.dj-wasabi.local.crt -noout -text
Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number:
            11:42:ba:66:94:b4:c9:5c:e5:1a:77:da:76:2e:57:5d:b5:64:f5:c3
    Signature Algorithm: sha256WithRSAEncryption
        Issuer: C=NL, ST=Utrecht, O=dj-wasabi, OU=Vault CA, CN=dj-wasabi.local/emailAddress=ikben@werner-dijkerman.nl
        Validity
            Not Before: Aug 23 16:51:36 2017 GMT
            Not After : Aug 26 16:52:05 2017 GMT
        Subject: CN=blog.dj-wasabi.local
 ...
            Authority Information Access: 
                OCSP - URI:https://vault.service.dj-wasabi.local:8200/v1/dj-wasabi/ocsp
                CA Issuers - URI:https://vault.service.dj-wasabi.local:8200/v1/dj-wasabi/ca

            X509v3 Subject Alternative Name: 
                DNS:blog.dj-wasabi.local
            X509v3 CRL Distribution Points: 

                Full Name:
                  URI:https://vault.service.dj-wasabi.local:8200/v1/dj-wasabi/crl
 ...

The output shows that the certificate is only valid (Validity) for 3 days (72 hours). If you take a look at the "Authority Information Access", you'll see the URLs (OCSP and the CA Issuers) we set earlier. And a little bit further we see the CRL Distribution Points, a URL we also set with the set urls command.

Keep in mind: the private key is only returned when the certificate is generated. If you lose the private key, revoke the certificate and generate a new one.

As the last command in this blogpost, we revoke a certificate. We have to do a POST and send the serial_number to the revoke endpoint.

curl -XPOST -k -H 'X-Vault-Token: <_my_root_token_>' \
-d '{"serial_number":"03-f2-bb-f5-27-16-81-20-76-0d-91-6f-fd-10-05-2d-a6-e1-59-e3"}' \
https://vault.service.dj-wasabi.local:8200/v1/dj-wasabi/revoke
{"request_id":"ea8a7132-231f-7075-f42b-f81b272cc9cd","lease_id":"","renewable":false,"lease_duration":0,"data":{"revocation_time":1503506236,"revocation_time_rfc3339":"2017-08-23T16:37:16.755130614Z"},"wrap_info":null,"warnings":null,"auth":null}

It returns json output with a key named revocation_time. This is the time since epoch at which the certificate was revoked, or 0 if the certificate isn't revoked.
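
We can also verify that the certificate ended up on the Certificate Revocation List we configured earlier. The pki backend serves the CRL on the <mount_path>/crl endpoint (the /crl/pem variant returns it in PEM format), so we can fetch it and inspect it with openssl; the revoked serial number should show up under "Revoked Certificates":

curl -k -XGET https://vault.service.dj-wasabi.local:8200/v1/dj-wasabi/crl/pem \
-o dj-wasabi.crl.pem
openssl crl -in dj-wasabi.crl.pem -noout -text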

So, that was it! Have fun!

Monitoring Consul with statsd exporter and Prometheus

My tool of choice for monitoring is currently Prometheus. With Prometheus you can easily gather metrics of applications and/or databases to see the actual performance of the application/database. When you have a tool like Zabbix or Nagios, you'll need to write one or multiple scripts to gather all metrics, and you have to see how much you can store in your database without losing performance of your monitoring tool. The why of Prometheus instead of Zabbix or another monitoring tool is a subject for maybe another blogpost.

One interesting application to monitor is Consul. When you search for monitoring Consul on Google, you'll find a lot of pages showing that you can use Consul as a monitoring tool, but not many on how you can monitor Consul itself. In this blogpost I'll describe the steps I have taken to monitor Consul. Please keep in mind this is just a start and it is incomplete, so if you have suggestions to improve it, please let me know.

In this blogpost we will do the following:

  • Configure Consul
  • Configure statsd exporter
  • Create some graphs

Configure Consul

Consul has a way of exposing metrics, called Telemetry. With Telemetry you can configure Consul to send performance metrics to external tools/applications to monitor the performance of Consul. You can find more information about configuring Consul for Telemetry on this page: https://www.consul.io/docs/agent/options.html#telemetry. In this blogpost we will use the "statsd_address" option. To make this happen, we have to update our Consul configuration on the Consul Servers and add the following configuration:

    "telemetry": {
        "statsd_address": "192.168.1.202:9125"
    },

The IP address is from the host itself, and in this case we have to send the metrics to port 9125. Once we have configured this on all the Consul Servers, we restart them one by one so we keep the Consul cluster running.

Configure statsd-exporter

When you use Prometheus, you'll use exporters for your applications or databases to expose metrics for Prometheus. Prometheus will scrape these metrics every 15 seconds (well, you can configure that) and store them in the database. Consul doesn't have an endpoint available to expose these metrics, so we have to make use of the "statsd-exporter". We already configured the Consul Servers to send metrics to a statsd server, so we only have to make sure we start one on each host running a Consul Server.

Before we start a statsd-exporter, we first have to do some configuration. We need to make sure we have a statsd mapper file. With this file we map statsd fields to fields for Prometheus, and we can add labels per metric. On this page I have configured almost all mapping entries: https://gist.github.com/dj-wasabi/d9b31c4b74e561c72512f4edbdfe6927

Let's explain what an entry looks like:

consul.*.runtime.*
name="consul_runtime"
type="$2"
host="{{ inventory_hostname }}"

The first line in this mapping construction is the name of the statsd field. You'll see asterisks; these are wildcards, and their values can be referenced in the fields below. The first asterisk can be used as $1, the second as $2, etc. The "name" is the name of the metric field in Prometheus, in this case consul_runtime. Prometheus doesn't accept dots in names, so we have to use underscores.

We then create a label named “type” and we assign the value $2. The original statsd field that Consul has sent to the statsd-exporter looks like this:

consul.b139924a6f44.runtime.num_goroutines

With this mapping construction, $1 gets the value b139924a6f44 and $2 the value num_goroutines. The last label, "host", is something I add with Ansible. I use Ansible to deploy this statsd mapper file (and all other monitoring related configuration) to all my Consul servers, so I can filter in Prometheus or a graphing tool like Grafana which metrics belong to which host.
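
For completeness, a minimal sketch of what such an Ansible task could look like (the template name and destination path are assumptions based on my setup):

- name: "Deploy the statsd mapper file"
  template:
    src: statsd-exporter.conf.j2
    dest: /data/statsd-exporter.conf
    owner: root
    mode: 0644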

I use the Docker container for the statsd-exporter. I place the statsd mapper file at /data/statsd-exporter.conf and start the container with the following command:

docker run --name statsd-exporter \
-v /data/statsd-exporter.conf:/tmp/statsd-exporter.conf:ro \
-p 9102:9102 -p 9125:9125/udp prom/statsd-exporter \
-statsd.mapping-config=/tmp/statsd-exporter.conf \
-statsd.add-suffix=false

I mount the statsd mapper file as ro (read only), open 2 ports and configure the statsd-exporter tool to use the mapper file. In this case 2 ports are opened: one port on which statsd is available for receiving performance metrics (9125) and the other port (9102) is used by Prometheus to scrape these metrics.

Prometheus

At this moment, I have added the following into the Prometheus configuration to let Prometheus scrape the statsd-exporter metrics:

scrape_configs:
  - job_name: 'consul'
    static_configs:
      - targets: ['192.168.1.202:9102']
        labels:  {'host': 'vserver-202'}
      - targets: ['192.168.1.203:9102']
        labels:  {'host': 'vserver-203'}
      - targets: ['192.168.1.204:9102']
        labels:  {'host': 'vserver-204'}

This works for now because I use Ansible to generate the Prometheus configuration, but I'll probably switch to a consul_sd_config in the near future so I won't have to add all kinds of static configuration.
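
A minimal sketch of what such a consul_sd_config could look like (the service name statsd-exporter is an assumption; it depends on how the exporter is registered in Consul):

scrape_configs:
  - job_name: 'consul'
    consul_sd_configs:
      - server: 'vserver-202.dc1.dj-wasabi.local:8500'
        services: ['statsd-exporter']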

Once we have restarted Prometheus and started the statsd-exporter containers, I can see the following metrics appear in Prometheus:

consul_runtime{host="vserver-204",type="free_count"} 2.3117552e+08
consul_runtime{host="vserver-204",type="heap_objects"} 22853
consul_runtime{host="vserver-204",type="num_goroutines"} 82

(And much more, but the above 3 are examples which are used as an explanation in the previous paragraphs.)
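
As a quick example of how these metrics can be queried, a simple PromQL expression for the number of goroutines on one host looks like this (we will create the actual graphs with Grafana in the next section):

consul_runtime{host="vserver-204",type="num_goroutines"}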

Create some graphs

Now we have the metrics in Prometheus, we need to create some graphs. We use Grafana for this. Grafana can be used for creating graphs to show the actual performance of Consul. I've created a dashboard and uploaded it to grafana.com: https://grafana.com/dashboards/2351

Grafana Dashboard for Consul

Some of the following can be found on the dashboard:

  • Who is the Consul Leader;
  • How many Consul Servers are running;
  • Some CPU idle utilisation and load information (you'll need the node-exporter for this);
  • Performance of writing information on the Consul leader to disk or to the other nodes;
  • etc.

This dashboard is not finished yet and is a mixed combination of Consul Leader data and Consul Server specific data. So some graphs show information specific to the selected Consul server (dropdown at the top of the page) and some graphs show data specific to the Consul Leader.

If you have suggestions to improve the current situation, by either suggesting a better statsd mapper configuration file or improvements for the dashboard, please let me know. I hope we can all benefit from each other to improve the availability and performance of Consul with this.

Setting up a secure Vault with a Consul backend

With this blogpost we continue working with a secure Consul environment: We are configuring a secure Vault setup with Consul as backend. YMMV, but this is what I needed to configure to make it work.

Environment

We should have a working Consul cluster environment. If you don't have one, please take a look here for creating one. In this blogpost we expect a secure Consul cluster with SSL certificates and ACLs enabled.

In this blogpost we make use of the wdijkerman/vault container. This container is created by myself and runs Vault (at the moment of writing release 0.6.4) on Alpine (3.5). Vault runs as user 'vault' and the container can be configured to use SSL certificates.

Prerequisites

We have to create SSL certificates for the vault service. In this blogpost we use the domain 'dj-wasabi.local', as Consul is already running with this domain configuration, so we have to create SSL certificates for the FQDN 'vault.service.dj-wasabi.local'.

On my host where my OpenSSL CA configuration is stored, I execute the following commands:

openssl genrsa -out private/vault.service.dj-wasabi.local.key 4096

Generate the key.

openssl req -new -extensions usr_cert -sha256 -subj "/C=NL/ST=Utrecht/L=Nieuwegin/O=dj-wasabi/CN=vault.service.dj-wasabi.local" -key private/vault.service.dj-wasabi.local.key -out csr/vault.service.dj-wasabi.local.csr

Create a signing request file and then sign it with the CA.

openssl ca -batch -config /etc/pki/tls/openssl.cnf -notext -in csr/vault.service.dj-wasabi.local.csr -out certs/vault.service.dj-wasabi.local.crt

We copy the 'vault.service.dj-wasabi.local.key', 'vault.service.dj-wasabi.local.crt' and the CA root certificate file to the hosts that will be running the Vault container, into the directory /data/vault/ssl. Hashicorp advises to run Vault on hosts where Consul Agents are running, not Consul Servers. This probably has to do with the fact that in most use cases they see, Consul is part of a large network and thus the servers handle a lot of requests (high load). As the Consul Servers will be very busy, it would be wise to not run anything else on those servers.

But this is my own very small environment (with 10 machines), so I will run Vault on the hosts running the Consul Server.

ACL

Before we do anything on these hosts, we create an ACL in Consul. We have to make sure that Vault can create keys in the key/value store and we have to allow Vault to create a service in Consul named vault.

So our (Client) ACL will look like this:

key "vault/" {
  policy = "write"
}
service "vault" {
  policy = "write"
}

We use this in the ui on the Consul Server and create the ACL. In my case, the ACL is created with id '94c507b4-6be8-9132-ea15-3fc5b196ea29'. This ID is needed later on when we configure Vault. Also check your ACL for the 'Anonymous token'. Please make sure you have set the following rule if the Consul default policy is set to deny:

service "vault" {
  policy = "read"
}

With this, we make sure the service is resolvable via DNS. In my case this is 'vault.service.dj-wasabi.local'.
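
As a side note: instead of clicking through the ui, the ACL could also be created via the Consul HTTP API, using the /v1/acl/create endpoint of the legacy ACL system this Consul version uses. A sketch, assuming the master token and the client certificates from the previous blogposts (verify_incoming is enabled on the Consul Server):

curl --cacert /data/consul/config/ssl/dj-wasabi.local.pem \
--cert /data/consul/config/ssl/vserver-202.dc1.dj-wasabi.local.crt \
--key /data/consul/config/ssl/vserver-202.dc1.dj-wasabi.local.key \
-XPUT -d '{"Name": "vault", "Type": "client", "Rules": "key \"vault/\" { policy = \"write\" } service \"vault\" { policy = \"write\" }"}' \
"https://vserver-202.dc1.dj-wasabi.local:8500/v1/acl/create?token=<_my_master_token_>"

It returns the ID of the newly created ACL token.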

Configuration

We have to configure the vault docker container. We have to create a directory that will be mounted in the container. First we create a user on the host, then we create the directory /data/vault/config and set its ownership to the user we just created.

useradd -u 994 vault
mkdir /data/vault/config
chown vault:vault /data/vault/config

The container uses a user named vault with UID 994, so we have to make sure that user names and ids are in sync on the host. Now we create a config.hcl file in the directory mentioned earlier:

backend "consul" {
  address = "vserver-202.dc1.dj-wasabi.local:8500"
  check_timeout = "5s"
  path = "vault/"
  token = "94c507b4-6be8-9132-ea15-3fc5b196ea29"
  scheme = "https"
  tls_skip_verify = 0
  tls_key_file = "/vault/ssl/vault.service.dj-wasabi.local.key"
  tls_cert_file = "/vault/ssl/vault.service.dj-wasabi.local.crt"
  tls_ca_file = "/vault/ssl/dj-wasabi.local.pem"
}

listener "tcp" {
  address = "0.0.0.0:8200"
  tls_disable = 0
  tls_key_file = "/vault/ssl/vault.service.dj-wasabi.local.key"
  tls_cert_file = "/vault/ssl/vault.service.dj-wasabi.local.crt"
  cluster_address = "0.0.0.0:8201"
}

disable_mlock = false

First we configure a backend for Vault. As we use Consul, we use the Consul backend. Because Consul is running on https and is using certificates, we have to use the FQDN of the Consul node as the address (the same as we did when configuring Registrator in this post). We also have to configure the options 'tls_key_file', 'tls_cert_file' and 'tls_ca_file'; these are the SSL certificates needed for accessing the secure Consul via SSL. Because of this, we have to set the 'scheme' to 'https', and we have to specify the token of the ACL we created earlier as the value of the token option.

Next we configure the listener for Vault. We configure the listener that it listens on all ips on port 8200. We also make sure we configure the earlier created SSL certificates by using them in the ‘tls_key_file’ and ‘tls_cert_file’ options.

The last option makes sure that Vault cannot swap data to the local disk.

Starting Vault

Now we are ready to start the docker container. We use the following command for this:

docker run -d -h vserver-202 --name vault \
--dns=172.17.0.2 --dns-search=service.dj-wasabi.local \
--cap-add IPC_LOCK -p 8200:8200 -p 8201:8201 \
-v /data/vault/ssl:/vault/ssl:ro \
-e VAULT_ADDR=https://vault.service.dj-wasabi.local:8200 \
-e VAULT_CLUSTER_ADDR=https://192.168.1.202:8200 \
-e VAULT_REDIRECT_ADDR=https://192.168.1.202:8200 \
-e VAULT_ADVERTISE_ADDR=https://192.168.1.202:8200 \
-e VAULT_CACERT=/vault/ssl/dj-wasabi.local.pem \
wdijkerman/vault

We have the SSL certificates stored in /data/vault/ssl and we mount them read only on /vault/ssl. With VAULT_ADDR we specify the URL on which the vault service is available; this is the URL which Consul provides, like for any other service. With VAULT_CACERT we specify the location of the CA certificate file of our domain. The other 3 environment variables are needed for a High Available Vault environment and make sure other vault instances can contact this one.

When Vault is started, we will see something like this with the docker logs vault command:

==> Vault server configuration:

Backend: consul (HA available)
Cgo: disabled
Cluster Address: https://192.168.1.202:8200
Listener 1: tcp (addr: "0.0.0.0:8200", cluster address: "0.0.0.0:8201", tls: "enabled")
Log Level: info
Mlock: supported: true, enabled: true
Redirect Address: https://192.168.1.202:8200
Version: Vault v0.6.4
Version Sha: f4adc7fa960ed8e828f94bc6785bcdbae8d1b263

==> Vault server started! Log data will stream in below:

But we are not done yet. When Vault is started, it is in a sealed state, and because this is the first vault in the cluster, we have to initialise it too. Also, when you check the ui of Consul, you'll see that the vault service is in an error state. Why? When Vault starts, it automatically creates a service in Consul and adds health checks. These health checks verify whether a vault instance is sealed or not.
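
You can also see this via the Consul API: querying the health endpoint for the vault service will show the sealed health check as critical until we unseal. A sketch, again using the client certificates because verify_incoming is enabled:

curl --cacert /data/consul/config/ssl/dj-wasabi.local.pem \
--cert /data/consul/config/ssl/vserver-202.dc1.dj-wasabi.local.crt \
--key /data/consul/config/ssl/vserver-202.dc1.dj-wasabi.local.key \
https://vserver-202.dc1.dj-wasabi.local:8500/v1/health/service/vault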

Initialise

As vault is running in the container, we open a terminal to the container:

docker exec -it vault bash

Now we have a bash shell running and we are going to initialise vault. First we have to make sure we set the 'VAULT_ADDR' to this container, by executing the following command:

export VAULT_ADDR='https://127.0.0.1:8200'

Every time we want to do something with this vault instance, we have to set the 'VAULT_ADDR' to localhost. If we don't do that, we will send the commands directly to the cluster.

As this is the first vault instance in the environment, we have to initialise it and we do that by executing the following command:

vault init -tls-skip-verify
Unseal Key 1: hemsIyJD+KQSWtKp0fQ0r109fOv8TUBnugGUKVl5zjAB
Unseal Key 2: lIiIaKI1F6pJ11Jw/g1CiLyZurpfhCM9AYIylrG/SKUC
Unseal Key 3: 298bn4H8bLbJRsPASOl3R+RPuDKIt6i5fYzqxQ3wL4ED
Unseal Key 4: W4RUiOU3IzQSZ8GD2z8jBEg2wK/q17ldr3zJipFjzKQE
Unseal Key 5: FNPHf8b+WCiS9lAzbdsWyxDgwic95DLZ03IR2S0sq4AF
Initial Root Token: ed220674-24da-d446-375d-bbd0334bcb31

Vault initialized with 5 keys and a key threshold of 3. Please
securely distribute the above keys. When the Vault is re-sealed,
restarted, or stopped, you must provide at least 3 of these keys
to unseal it again.

Vault does not store the master key. Without at least 3 keys,
your Vault will remain permanently sealed.

As we set the 'VAULT_ADDR' to 'https://127.0.0.1:8200', we have to add the '-tls-skip-verify' option to the vault command. If we don't do that, it will complain that it cannot validate the certificate, which matches the configured url 'vault.service.dj-wasabi.local'.

After executing the command, some output appears. This output is very important and needs to be saved somewhere in a secure location. The output provides us 5 unseal keys and the root token. Every time a vault instance is (re)started, the instance will be in a sealed state and needs to be unsealed. 3 of the 5 keys need to be used when you need to unseal a vault instance.

bash-4.3$ vault unseal -tls-skip-verify
Key (will be hidden):
Sealed: true
Key Shares: 5
Key Threshold: 3
Unseal Progress: 1
bash-4.3$ vault unseal -tls-skip-verify
Key (will be hidden):
Sealed: true
Key Shares: 5
Key Threshold: 3
Unseal Progress: 2
bash-4.3$ vault unseal -tls-skip-verify
Key (will be hidden):
Sealed: false
Key Shares: 5
Key Threshold: 3
Unseal Progress: 0

We have executed the unseal command 3 times and now this Vault instance is unsealed. You can see the 'Unseal Progress' change after we enter an unseal key. We can verify the state of the vault instance by executing the vault status command:

bash-4.3$ vault status -tls-skip-verify
Sealed: false
Key Shares: 5
Key Threshold: 3
Unseal Progress: 0
Version: 0.6.4
Cluster Name: vault-cluster-7e01e371
Cluster ID: b9446acf-4551-e4c2-fa5f-03bd1bcf872f

High-Availability Enabled: true
Mode: active
Leader: https://192.168.1.202:8200

We see that this vault instance is not sealed and that the mode of this node is active. You can also see that the leader of the vault cluster is, in my case, the current host (not strange, as this is the first Vault instance of the environment). If we want to add a 2nd instance or more, we execute the same commands as before, with the exception of the vault init command, as we already have an initialised environment.

As we are still logged in on the node, let's create a simple entry.

bash-4.3$ export VAULT_TOKEN=ed220674-24da-d446-375d-bbd0334bcb31
bash-4.3$ vault write secret/password value=secret
Success! Data written to: secret/password

We first set the 'VAULT_TOKEN' variable; the value of this variable is the value of the 'Initial Root Token'. After that, we created a simple entry in the database: key 'secret/password' is created and has the value 'secret'.
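
Reading the entry back is just as easy (we still need -tls-skip-verify because 'VAULT_ADDR' points to localhost):

bash-4.3$ vault read -tls-skip-verify secret/password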

It took some time to investigate how to set up a High Available Vault environment with Consul; not much information can be found on the internet. So maybe this page will help you setting one up yourself. If you have improvements, please let me know.

Configuring Access Control Lists in Consul

This is the 2nd post about securing Consul, and it is about using ACLs in Consul. In the first post (this one) we configured a Consul cluster using gossip encryption and SSL/TLS certificates. Now we cover the basics of Consul ACLs (Access Control Lists) and configuring them in our cluster.

Master Token

First we have to create a master token. This is the token that has all rights (that's why it's called the master), sort of the 'root' token. We have to generate it first and we can use the uuidgen command on Linux (or Mac) for this. We use the output of the uuidgen command and place it in the following file: /data/consul/config/master-token.json

{
  "acl_master_token":"d9f1928e-1f84-407c-ab50-9579de563df5",
  "acl_datacenter":"dc1",
  "acl_default_policy":"deny",
  "acl_down_policy":"deny"
}

We have to store/configure this file on all Consul Servers. You'll see that we set the default policy to "deny", so we block everything and only enable the things we want. When we have created the file, we have to restart all Consul Servers to make the ACLs active.

If you recall how we configured the Consul Server in the previous blogpost, we configured the Consul Servers with this property:

"verify_incoming": true,

We have to open the ui on the Consul Server, and because we have the property above configured, we need to load an SSL client certificate in our browser. (Or, for now, you can also remove the property and restart Consul. But make sure you add it again when you are done!)

Now open the ui on the server and click on the right button (Settings). You’ll see something like this:

consul_settings

We enter the token we placed in the file into the field we see in our browser. Now we click on the button "ACL" (the token is saved automatically in your browser) and we see something like this:

consul_acl

This is an overview of all tokens available in Consul. You'll see that 2 tokens exist in Consul right now:

  • Anonymous Token
  • Master Token

Anonymous Token

The anonymous token is used when you didn't configure a token in the settings page or didn't supply one when using 3rd party software. You'll only see the "consul" service, but won't see anything else. If we would create a key in the key/value store, it would fail because the anonymous token can't do anything (because of the property "acl_default_policy":"deny").
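
You can easily see this behaviour with a quick test against the key/value store. Without a token, Consul answers a write with a 403 Permission denied. A sketch, using the client certificates from the previous blogpost because 'verify_incoming' is enabled:

curl --cacert /data/consul/config/ssl/dj-wasabi.local.pem \
--cert /data/consul/config/ssl/vserver-202.dc1.dj-wasabi.local.crt \
--key /data/consul/config/ssl/vserver-202.dc1.dj-wasabi.local.key \
-XPUT -d 'bar' \
https://vserver-202.dc1.dj-wasabi.local:8500/v1/kv/foo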

Master Token

The master token is the token we just filled in on the settings tab, the one configured in the json file at the beginning of this blogpost; it is sort of the root token. The one token to rule them all.

So what do you need when you want to create an ACL? There are 3 types of policies that can be used:

  • read
  • write
  • deny

It might be obvious: the "read" policy is for reading data, the "write" policy is for reading and writing data, and "deny" is for NOT reading or writing data to Consul.

ACLs are written in HCL (HCL stands for HashiCorp Configuration Language) and we will create an ACL via the ui. You can also do that via the Consul API and automatically maintain them with for example Ansible, but that is out of scope for this blogpost. In the ui we see on the right side of the page "New ACL".

In the “name” field we enter for now “test” and select “client” as type. In the “Rules” field we enter the following:

key "" {
  policy = "read"
}
key "foo/" {
  policy = "write"
}

When we click on "create", the ACL will be created. For this ACL we chose the type "client" instead of the "management" type. When you select "management" as the ACL type, the users/services which use this ACL can also create/update/delete this and other ACLs in the cluster. As we don't want that, we select the "client" type.

We created 2 rules, both for the key/value store. The first "key" rule specifies that all keys in the key/value store can be read with this ACL. With the 2nd "key" rule we specify that all keys in the "foo/" directory can be read and written. When we use this ACL, we can create the key "foo/bar", but not the key "foobar".

Next to "key" rules, you can also configure "service", "event" and "query" rules. They have the same format as the "key" example above and use the same policies. With this you can easily give each application (or user) the correct rights.
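
For example, a client ACL that lets an application register its own service, read its own part of the key/value store and fire a deploy event could look like this (the names are just examples):

key "myapp/" {
  policy = "read"
}
service "myapp" {
  policy = "write"
}
event "deploy" {
  policy = "write"
}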

Registrator

With registrator we can easily add docker containers as services to Consul. Now that we have set the default ACL policy to "deny", we have to update the configuration for registrator. Registrator will attempt to send the data to Consul for creating the services and will think this is done, but Consul will deny the request because of the default policy. We can create an ACL specific to registrator.

Let's create one via the UI. We enter the name "Registrator" and select the "client" type. There are 2 possibilities regarding the "Rules":

We can add a rule that will be used for all services registrator will add:

service "" {
  policy = "write"
}

Or we mention each service separately:

service "kibana" {
  policy = "write"
}
service "jenkins" {
  policy = "write"
}

Both have their pros and cons. With the first rule we allow registrator to add all services to Consul; it requires little maintenance, but it is a little bit too "open". The 2nd rule requires more maintenance because we have to add every service, but it is more secure. With this, not all containers are added automatically and thus no rogue containers will be available in Consul.

We click on "create" to create the ACL. Now we have a token id and we use that token in our docker run command. Our command to start registrator now looks like this:

docker run -h vserver-201 \
-v /var/run/docker.sock:/tmp/docker.sock \
-v /data/consul/config/ssl:/consul:ro \
-e CONSUL_CACERT=/consul/dj-wasabi.local.pem \
-e CONSUL_TLSCERT=/consul/vserver-201.dc1.dj-wasabi.local.crt \
-e CONSUL_TLSKEY=/consul/vserver-201.dc1.dj-wasabi.local.key \
-e CONSUL_HTTP_TOKEN=5c7d6559-cd90-d244-bbed-14d459a74bd2 \
gliderlabs/registrator:master \
-ip=192.168.1.201 consul-tls://vserver-201.dc1.dj-wasabi.local:8500

We had to add the -e CONSUL_HTTP_TOKEN variable with the token id as its value. When I start the "kibana" container, it will be added to Consul and we see the service is created.

We covered the basics of creating and using ACLs in Consul. Using ACLs will help secure Consul further by only allowing the actions that are needed for each container's purpose. Hopefully this will help you configuring ACLs in your environment.

Setting up a secure Consul cluster with docker

This post is the first of 2 blog items about setting up a secure Consul environment.

In the first post, which is this one, we will discuss how we set up a secure Consul environment. We will use a docker container and configure it with SSL certificates to secure the traffic from and to Consul. In the 2nd post (this one), we will dive into ACLs and how we can make use of ACLs in Consul.

We will use the 'wdijkerman/consul' docker container to set up a secure environment. For now we create a Consul cluster with 2 hosts, named 'vserver-201' and 'vserver-202'. 'vserver-201' will be the Consul Agent and 'vserver-202' will be the Consul Server. There is no specific need to use this container; you can also make this work with other Consul (containers) or installations.

Before we are going to setup the environment, we will briefly discuss the used docker container first.

wdijkerman/consul

This is a docker container created by myself which has Consul installed and configured. This container holds some basic Consul configuration and we can easily add new configuration options by either supplying them on the command line or by creating a configuration json file. The container is running Consul 0.7.2 (the latest version at the moment of writing) on Alpine 3.5 (also the latest version at the moment of writing). Most important is that Consul isn't running as user root, but as user 'consul' (with a fixed UID).

Before we do anything with the container, we're going to add a user with that UID on the hosts running Consul.

useradd -u 995 consul

After this, we have to create 2 directories on the hosts running Consul. We use the following 2 directories:

mkdir -p /data/consul/data /data/consul/config
chown consul /data/consul/data /data/consul/config

The first directory is where Consul will store its data and is only needed on the host running the Consul Server. The 2nd directory is where Consul will look for configuration files; we will create some files there further in this post. On the host running the Consul Agent (in my case the host 'vserver-201') we only have to create the /data/consul/config directory. After creating the directories, we make sure they are owned by the consul user we created earlier.

Before we create some configuration files, take a look at the following json file. This json file is already present in the Consul docker container (so we don't have to create it ourselves) and is the default configuration of Consul:

{
  "data_dir": "/consul/data",
  "ui_dir": "/consul/ui",
  "log_level": "INFO",
  "client_addr": "0.0.0.0",
  "ports": {
    "dns": 53
  },
  "recursor": "8.8.8.8",
  "disable_update_check": true
}

As you see, this is a very basic configuration and we need to add some options to make it secure.

encrypt

We are going to expand our configuration by adding a new file in the /data/consul/config directory. With this file we encrypt all of our internal Consul gossip traffic. This file should be placed on all of the hosts running Consul that are or will be part of this cluster.

Let's create a key string with the following command:

docker run --rm --entrypoint consul wdijkerman/consul keygen

We use the output of this command and place it in the following file: /data/consul/config/encrypt.json

{
  "encrypt": "iuwMf/cScjTvKUKDC77kJA=="
}

We make sure that the rights of the file are set to 0400 and that it is owned by the user consul.

chown consul:consul /data/consul/config/encrypt.json
chmod 0400 /data/consul/config/encrypt.json

All of the Consul nodes (Server and Agent) need this file, so make sure your Ansible (or Puppet, Chef or Saltstack) is configured to place this file on all of your nodes.

ssl

As all requests to and from Consul are done via http, we need to configure Consul to listen on https instead of http. Before we do anything with Consul, we need access to an SSL crt, key and CA file first.

Before we execute an openssl command, we have to make sure that our CA SSL configuration is correct. Consul (well, actually the Go language: https://github.com/golang/go/issues/7423) requires some extra configuration, specifically for using extensions in certificates. We have to add (or update) the property 'extendedKeyUsage' in the SSL CA configuration file so that the following values are added:

serverAuth,clientAuth

The usr_cert configuration in the CA openssl configuration file will look something like this:

[ usr_cert ]

basicConstraints=CA:FALSE
nsComment = "OpenSSL Generated Certificate"
subjectKeyIdentifier=hash
authorityKeyIdentifier=keyid,issuer
extendedKeyUsage = critical,timeStamping,serverAuth,clientAuth

(I have no idea why critical and timeStamping are there, so I just keep them there. :-))

We have to create the certificates now. The FQDN format for this is:

<name_of_node>.<datacenter>.<domain>

In my case, my nodes are 'vserver-201' and 'vserver-202', my domain is 'dj-wasabi.local' and I have the default 'dc1' as datacenter. I need to create a crt and key for the hosts 'vserver-201.dc1.dj-wasabi.local' and 'vserver-202.dc1.dj-wasabi.local'.

So on the host where my 'dj-wasabi.local' CA is configured, I need to execute the following set of commands:

cd /etc/pki/CA
openssl genrsa -out private/vserver-202.dc1.dj-wasabi.local.key 4096

We first generate the SSL key.

openssl req -new -extensions usr_cert -sha256 -subj "/C=NL/ST=Utrecht/L=Nieuwegin/O=dj-wasabi/CN=vserver-202.dc1.dj-wasabi.local" -key private/vserver-202.dc1.dj-wasabi.local.key -out csr/vserver-202.dc1.dj-wasabi.local.csr

We generate the csr file from the earlier created key.

openssl ca -batch -config /etc/pki/tls/openssl.cnf -notext -in csr/vserver-202.dc1.dj-wasabi.local.csr -out certs/vserver-202.dc1.dj-wasabi.local.crt

And now we will create a crt by signing the csr via the OpenSSL CA.

(And I do the same for host vserver-201.dc1.dj-wasabi.local)

Now we have to copy these files (including the CA certificate file) to the servers and make sure they are stored in the /data/consul/config directory, owned by and only readable for user consul. I create an ssl directory and place all the ssl files in that directory.

Now we have to create a configuration file so Consul knows that it has SSL certificates. First we configure the Consul Server, which in my case is running on the 'vserver-202' host. We create the file /data/consul/config/ssl.json with the following content:

{
  "ca_file": "/consul/config/ssl/dj-wasabi.local.pem",
  "cert_file": "/consul/config/ssl/vserver-202.dc1.dj-wasabi.local.crt",
  "key_file": "/consul/config/ssl/vserver-202.dc1.dj-wasabi.local.key",
  "verify_incoming": true,
  "verify_outgoing": true
}

(Keep in mind that /data/consul/config is mounted in the container as /consul/config).

With 'verify_incoming' and 'verify_outgoing' we make sure that all traffic to and from the Server is encrypted. If we would start the container right now, you could only access the ui if you have created client SSL certificates and loaded them in your browser.

For the Consul Agent, we use the same ssl.json configuration file as mentioned above, but without the 'verify_incoming' option.

ports

Before we start the container, we have to do 1 small thing. With the default configuration we currently have, port 8500 is used for http. We create a new configuration file and assign the http listener a different port number, so we can configure port 8500 to be https.

We create the file: /data/consul/config/ports.json with the following content:

{
  "ports": {
    "http": 8501,
    "https": 8500
  }
}

We have to specify the http port and give it a port number, otherwise it will default to 8500. When we start the container in the next step, we only open port 8500 and not port 8501, and thus we have an https-only Consul container.

Start Consul

Now we are able to start the Consul Server on the host 'vserver-202'. We execute the following command:

docker run -h vserver-202 --name consul \
-v /data/consul/data:/consul/data \
-v /data/consul/config:/consul/config \
-p 8300:8300 -p 8301:8301 -p 8301:8301/udp \
-p 8302:8302 -p 8302:8302/udp -p 8400:8400 \
-p 8500:8500 -p 8600:53/udp wdijkerman/consul \
-server -ui -ui-dir /consul/ui -bootstrap-expect=1 \
-advertise 192.168.1.202 -domain dj-wasabi.local \
-recursor=8.8.8.8 -recursor=8.8.4.4

The following output appears:

[root@vserver-202 config]# docker logs consul
==> WARNING: BootstrapExpect Mode is specified as 1; this is the same as Bootstrap mode.
==> WARNING: Bootstrap mode enabled! Do not enable unless necessary
==> Starting Consul agent...
==> Starting Consul agent RPC...
==> Consul agent running!
Version: 'v0.7.2'
Node name: 'vserver-202'
Datacenter: 'dc1'
Server: true (bootstrap: false)
Client Addr: 0.0.0.0 (HTTP: 8501, HTTPS: 8500, DNS: 53, RPC: 8400)
Cluster Addr: 192.168.1.202 (LAN: 8301, WAN: 8302)
Gossip encrypt: true, RPC-TLS: true, TLS-Incoming: true
Atlas: <disabled>

==> Log data will now stream in as it occurs:

Most important in this output are these 2 lines:

Client Addr: 0.0.0.0 (HTTP: 8501, HTTPS: 8500, DNS: 53, RPC: 8400)
Gossip encrypt: true, RPC-TLS: true, TLS-Incoming: true

On the first line we can see that port 8500 is used for HTTPS and port 8501 for HTTP.
On the 2nd line we see that gossip encryption is active (encrypt is set to true) and that both 'verify_incoming' and 'verify_outgoing' are set to true.
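
We can quickly test this from the host. Port 8501 (http) is not published by the docker run command, and port 8500 requires a client certificate because of 'verify_incoming', so a request with the certificates we created earlier should work; a sketch, assuming the ssl files are in /data/consul/config/ssl as described above:

curl --cacert /data/consul/config/ssl/dj-wasabi.local.pem \
--cert /data/consul/config/ssl/vserver-202.dc1.dj-wasabi.local.crt \
--key /data/consul/config/ssl/vserver-202.dc1.dj-wasabi.local.key \
https://vserver-202.dc1.dj-wasabi.local:8500/v1/status/leader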

Now we can start Consul on 'vserver-201' (the Consul Agent):

docker run -h vserver-201 --name consul \
-v /data/consul/config:/consul/config \
-p 8300:8300 -p 8301:8301 -p 8301:8301/udp \
-p 8302:8302 -p 8302:8302/udp -p 8400:8400 \
-p 8500:8500 -p 8600:53/udp wdijkerman/consul \
-join 192.168.1.202 -advertise 192.168.1.201 \
-domain dj-wasabi.local

The Consul Agent will connect to the Consul Server and we can open the ui on the Agent with the URL https://vserver-201.dc1.dj-wasabi.local:8500. In my case it complains that the certificate is not validated (I'm using a self-signed CA certificate), but I'm able to access the ui and see the service 'consul'. I do have an issue with opening the ui on the Consul Server. Why?

We have added the following property in the file /data/consul/config/ssl.json

"verify_incoming": true,

This means that ALL traffic to the Consul Server must be authenticated with SSL client certificates. If we really want to access the ui on the Consul Server (and we do want that, ACLs ;-)), we have to create a client SSL certificate, load it in the browser and try opening the ui again.

Registrator

I use registrator in my environment and have to make sure that it can work with SSL too. For registrator, we have to configure 3 environment variables which hold the locations of the SSL crt, key and CA file. To do this, we also have to mount the ssl directory in the registrator container so it has access to these files.

Next, we have to use consul-tls:// instead of consul:// when starting registrator.
Our command now looks like this:

docker run -h vserver-201 \
-v /var/run/docker.sock:/tmp/docker.sock \
-v /data/consul/config/ssl:/consul:ro \
-e CONSUL_CACERT=/consul/dj-wasabi.local.pem \
-e CONSUL_TLSCERT=/consul/vserver-201.dc1.dj-wasabi.local.crt \
-e CONSUL_TLSKEY=/consul/vserver-201.dc1.dj-wasabi.local.key \
gliderlabs/registrator:master \
-ip=192.168.1.201 consul-tls://vserver-201.dc1.dj-wasabi.local:8500

After executing the above command, new docker containers will automatically be added as services in Consul via TLS.

We successfully created a secure Consul environment where all traffic from and to Consul is encrypted. Even with the registrator tool we add new services via TLS connections.

In the next blog item we will discuss ACLs in Consul, to make sure that not everyone can create/update/delete keys in the k/v store and/or create/add/delete services.