Last week, the excellent Adrian Mouat, Docker captain and author of “Using Docker”, gave a webinar on how to use Docker to secure your microservice containers. The webinar was a teaser for a two-day training session by Adrian and Sam Newman (author of “Building Microservices”) in Amsterdam on 31 July and in London on 30 August. I’ll include booking links below; there are still a couple of spaces spare on both courses, which should be very good indeed.
So what was the gist? Well, Adrian covered so much I’m going to break this into a two-parter. In the first part I’ll talk about the basics of a healthy Dockerfile and in the second part I’ll talk about safe deployment.
What Does Good Security Look Like?
According to Adrian, security done properly will be almost invisible to users. The only difference between a secure site and an insecure one is the things that don’t happen: secure sites are not defaced and do not lose sensitive customer data, for example.
We know you can never realistically be 100% secure. Who knows if your spouse married you to get your AWS credentials? I’m constantly suspicious on that front. However, like Sam Newman said in his own webinar, most of us don’t need to be 100% safe. We just need to make ourselves high effort for hackers so they go elsewhere.
What Does Bad Security Look Like?
Although it’s hard to describe good security, it’s pretty easy to describe bad security and that’s what Adrian did using the example of a bad Dockerfile. Specifically, there are 4 things we commonly see in Dockerfiles that are bad for security:
- No version numbers
- No User (the container runs as root)
- No verification of downloads
- No metadata
I’ll describe these next, but what we need to grok is that there are lots of good Docker security features for containers and we should use them. Unfortunately, the default behaviour is not to.
Aside - actually, maybe having default insecurity is a good thing? You really want there to be easy pickings (i.e. rookie containers) out there for hackers so they don’t bother with your more secure ones. As long as you don’t accidentally build on one of those rookie containers, of course…
1. Set Version Numbers
“Latest” is not your friend.
When we specify any base images in a Dockerfile (FROM <image>[:<tag>] [AS <name>]) we should ALWAYS provide a version number tag (e.g. FROM alpine:3.4).
There are 2 reasons for this:
- Repeatability. We generally want our containers to be the same every time they run unless we specifically choose otherwise. If we use the “latest” version then we have no control over what’s going to download and execute next time.
- Provenance. When we’re diagnosing issues (particularly security ones) we want to know where the code we are executing came from. If we use the “latest” tag on our base images we make it difficult or impossible to trace the exact versions and sources of the software we are running.
Of course, Adrian points out, it’s not quite so easy to specify an exact version (when is stuff ever easy?). How specific should you be?
When using semantic versioning, we can define a version to three levels of specificity: MAJOR.MINOR.PATCH. If you specify all three you’ve got good repeatability and provenance, but you won’t automatically pick up new security patches. If you just specify the MAJOR number then you’re at risk of breaking changes (whilst minor versions should be backwards compatible, there can be large changes to codebases within them). Adrian reckons a good compromise is often to use MAJOR.MINOR as the version number for a Dockerfile image or installed package (e.g. via apt, pip or npm). That way we shouldn’t pick up a breaking change, but we should pick up fixes and security patches.
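As a sketch of what MAJOR.MINOR pinning can look like in practice (the base image and package versions here are illustrative, not recommendations), pip’s `~=` “compatible release” operator expresses exactly this policy:

```dockerfile
# Base image pinned to a specific MINOR release of Python
FROM python:3.6-slim

# requests~=2.18.0 allows 2.18.x patch releases but not 2.19,
# so we pick up security fixes without breaking changes
RUN pip install 'requests~=2.18.0'
```

apt and npm have their own version-constraint syntaxes that can express the same idea.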
Another aside - when I Googled for examples of Dockerfiles, everything shown to me had no tag or used “latest”! So I infer there are a lot of rookie containers out there… This matters most for images that run in production; latest is probably fine for dev tooling and tests, but bad habits are hard to break.
A final aside from Adrian - we’ve said repeatability is good but in most cases it’s actually pretty unattainable (Google do it but I say “let’s face it, they’re barely human”). It’s hard because you can’t really guarantee your sub-dependencies won’t change under your feet. If you desperately need repeatability you’ll have to run your own package mirror and you should also take a look at the new Bazel tool from the Google gang.
2. Set a User
If you don’t specify a user in your Dockerfile, your container will run as the default user: root (!!). That isn’t good. It means that if some baddie gets control of your contained process they’ll have root access inside your container, and if they manage to break out of the container they’ll be root on the host too (by default Docker doesn’t use user namespaces, so root in a container is root on the host). That is, in fact, bad. Particularly as it’s easily rectified with a tiny effort.
You can specify a user for your container in your Dockerfile: USER <user>[:<group>] or USER <UID>[:<GID>]
Just set a user who is less privileged than root and you’ll already be miles safer.
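For example, a minimal sketch (the user and group names here are made up for illustration):

```dockerfile
FROM alpine:3.4

# Create an unprivileged system user and group, then switch to it;
# everything after the USER line runs as appuser, not root
RUN addgroup -S appgroup && adduser -S -G appgroup appuser
USER appuser
```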
Of course, as Adrian points out, this is not always so easy (again!). You might need root permissions to start the container but not afterwards. In that case, you can start as root and downgrade your user permissions later: create the user in the Dockerfile but only switch to it in the entrypoint or cmd script. See github for an example.
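One common shape for that start-as-root-then-drop pattern looks roughly like this (a sketch with illustrative names; su-exec is one small privilege-dropping tool for Alpine, gosu is a popular alternative):

```dockerfile
FROM alpine:3.4

# su-exec lets a root process re-exec as another user
RUN apk add --no-cache su-exec && adduser -S appuser
COPY entrypoint.sh /entrypoint.sh

# The container starts as root so the entrypoint can do privileged setup
ENTRYPOINT ["/entrypoint.sh"]

# entrypoint.sh (shown as a comment for completeness):
#   #!/bin/sh
#   chown -R appuser /data        # root-only setup, e.g. fixing volume ownership
#   exec su-exec appuser "$@"     # ...then drop to the unprivileged user
```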
3. Verify Downloads
If we are being careful we can also check that the images we’re downloading from our registry are actually the images we expected - i.e. that they haven’t been hacked or replaced. We can do this by using a hash, or digest, in the FROM command, for example:
FROM debian@sha256:bla...
This means the Dockerfile will always inherit from exactly the same image: you are guaranteed it cannot have been tampered with, and the base image won’t change unexpectedly and break things. As with tight version specification, however, the disadvantage is that you won’t get any updates, including essential security updates.
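Digests cover base images, but the same principle applies to anything we fetch with RUN inside a Dockerfile: check it against a known checksum before using it. A minimal sketch of the pattern, using a locally created file to stand in for a download (in a real Dockerfile the expected hash would be a known constant, not computed on the spot):

```shell
# A local file standing in for a downloaded artefact
printf 'pretend-artefact\n' > artefact.tar.gz

# In practice this hash comes from the publisher, pinned in your Dockerfile
known_hash=$(sha256sum artefact.tar.gz | awk '{print $1}')

# sha256sum -c fails with a non-zero exit code if the file doesn't match,
# which aborts the docker build at that step
echo "${known_hash}  artefact.tar.gz" | sha256sum -c -
```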
4. Use Metadata
Dockerfiles have excellent support for metadata using the LABEL command. You can add useful metadata, like the git repo the image was built from, thus allowing anyone debugging the container to locate the source code.
The Dockerfile LABEL command is a great resource for improving container provenance and it’s easy to use. For some examples see http://label-schema.org
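A sketch of what that can look like, following the label-schema.org naming conventions (the repo URL and commit hash here are placeholders):

```dockerfile
FROM alpine:3.4

# Provenance metadata: where this image's source lives and which commit built it
LABEL org.label-schema.vcs-url="https://github.com/example/myapp" \
      org.label-schema.vcs-ref="abc1234" \
      org.label-schema.schema-version="1.0"
```

You can then read these back with `docker inspect` when debugging a running container.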
What next?
Right, we’ve covered the very basics of Dockerfile security. In the next post we’ll go over what Adrian said about deployment security for microservices in Docker containers.
In the meantime, if you are interested in containers and security, here are some things you should do:
- Attend the Secure Microservices workshops from the experts Sam Newman and Adrian Mouat in Amsterdam or London
- Read the free “Docker Security” O’Reilly mini-book from Adrian
- Read part 2!
Read more about our work in The Cloud Native Attitude.