All Hail the New Docker Swarm

Unfortunately, I'm not able to attend DockerCon US this year, but I will be keeping up with the announcements. As part of the Docker Captains program, I was given a preview of Docker 1.12, including the new Swarm integration, which is Docker's native clustering/orchestration solution (also known as SwarmKit, but that's really the repo/library name). And it's certainly a big change. In this post I'll try to highlight the changes and why they're important.

The first and most obvious change is the move into Docker core; starting a Docker Swarm is now as simple as running docker swarm init on the manager node and docker swarm join $IP:PORT on the worker nodes, where $IP:PORT is the address of the leader. You can then use the top-level node command to get more information on the swarm, e.g.:
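
Concretely, bootstrapping the two-node swarm used below looks something like this (a sketch; the manager address and port are illustrative, and the exact join syntax may differ between preview builds):

```shell
# On the node that will become the manager (do-sw01):
docker swarm init

# On each worker (do-sw02), pointing at the manager's address
# (2377 is the default swarm management port):
docker swarm join 10.0.0.1:2377
```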

root@do-sw01:~# docker node ls
ID                           HOSTNAME  MEMBERSHIP  STATUS  AVAILABILITY  MANAGER STATUS
2yvkbxqnrct6cwtalrpmfceyy *  do-sw01   Accepted    Ready   Active        Leader
cwjj5fht6f9gtf7rskhzkd15u    do-sw02   Accepted    Ready   Active

There is no need to download further images or binaries; everything you need is right there in your Docker install. In this post I'll stick to covering the technology, but it would seem naive not to point out that there are some large political ramifications from this decision; other orchestration frameworks such as Kubernetes and Mesos/Marathon which currently use Docker for running containers are unlikely to be happy with having to distribute Swarm alongside their own technology. As a side effect of this decision, I would expect to see other orchestrators move towards rkt or wrapping Docker's underlying runc runtime with their own tooling.

Unlike the previous version of Swarm, there is also no dependency on an external store such as Consul or etcd. This has been achieved by pulling the Raft consensus implementation out of etcd and integrating it into the SwarmKit code. Compared to getting a Kubernetes or Mesos cluster running, standing up a Docker Swarm is a snap. This is a big bonus for developers who just want to try things out.

Launching containers to run on the Swarm is achieved through the service command rather than the normal docker run syntax:

root@do-sw01:~# docker service create --name redis --network mynet redis:3
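
The --network flag above assumes an overlay network called mynet already exists; if it doesn't, it can be created first (a sketch, using the multi-host overlay driver that Swarm mode uses for cross-node networking):

```shell
# On a manager node: create a multi-host overlay network.
# The name "mynet" matches the one used by the service above.
docker network create --driver overlay mynet
```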

We can use the ls and tasks subcommands to find out a bit more about the service:

root@do-sw01:~# docker service ls
ID            NAME   REPLICAS  IMAGE    COMMAND
1kcw0f0s1105  redis  1/1       redis:3
root@do-sw01:~# docker service tasks redis
ID                         NAME     SERVICE  IMAGE    LAST STATE          DESIRED STATE  NODE
coe6ha9pw2p6jx3yh8b1dxjfs  redis.1  redis    redis:3  Running 29 minutes  Running        do-sw01

And we can of course still find the container using the normal Docker commands:

root@do-sw01:~# docker ps
CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS               NAMES
d9883a9f25f3        redis:3             "docker-entrypoint.sh"   31 minutes ago      Up 31 minutes       6379/tcp            redis.1.coe6ha9pw2p6jx3yh8b1dxjfs

This is a fundamental departure from the way Swarm used to work, which just re-used the normal Docker API. Whilst re-using the Docker API made it easy for developers to move from running on a single engine to a cluster, it also limited what was possible in Swarm and made it difficult or impossible to express certain concepts - for example it wasn't possible to have a group of containers ("pod" if you must) scheduled as a single atomic unit. Notably, as can be seen above, the interface refers to "tasks" rather than containers. Whilst focused on containers currently, Docker intend to add support for other schedulable units such as VMs, unikernels and pods.

The service abstraction feels a lot like Kubernetes, which has clearly been a source of inspiration. For example, to increase the number of running tasks for a service:

root@do-sw01:~# docker service update --replicas 5 redis
root@do-sw01:~# docker service tasks redis
ID                         NAME     SERVICE  IMAGE    LAST STATE          DESIRED STATE  NODE
coe6ha9pw2p6jx3yh8b1dxjfs  redis.1  redis    redis:3  Running 40 minutes  Running        do-sw01
5cy0585s5b29z1jyftktuj2tq  redis.2  redis    redis:3  Running 19 seconds  Running        do-sw02
30dvjnmvnxw7u6x2nnbyv6kvs  redis.3  redis    redis:3  Running 19 seconds  Running        do-sw01
7qqyqafks6n8jqgcpmim3jog7  redis.4  redis    redis:3  Running 19 seconds  Running        do-sw01
1v79qukwd2xqrrbssp7k98n34  redis.5  redis    redis:3  Running 19 seconds  Running        do-sw02

This even mirrors the Kubernetes terminology ("replicas"). Docker have also paid attention to the "service" abstraction in Kubernetes and built similar functionality into Swarm. It's easiest to describe this with an example. We'll use a second service to test our existing redis service (we can't use a regular container started with docker run, as it would be unable to connect to the Swarm network):

root@do-sw01:~# docker service create --name redis2 --network mynet redis:3

We can then exec into the container created by this service (after connecting to the appropriate host and identifying the correct task):

root@do-sw02:~# docker exec -it 934f85433def sh

Now we can use the "redis-cli" tool available inside the container to connect to the redis service. This is a simple utility for connecting to Redis servers and setting/getting data. By using the hostname "redis" when connecting, we should find that Docker connects us to one of the 5 Redis containers:

# redis-cli -h redis
redis:6379> set name 1
redis:6379> exit
# redis-cli -h redis
redis:6379> get name
redis:6379> set name 2
redis:6379> exit
...
# redis-cli -h redis
redis:6379> set name 5
redis:6379> exit
# redis-cli -h redis
redis:6379> get name

So we get DNS-based load-balancing for free. But that's not all. There are a lot of cases where you don't want to use DNS-based load-balancing (clients and runtimes often cache DNS lookups, defeating the balancing), so Swarm also provides a virtual IP for our service:

root@do-sw01:~# docker service inspect -f '{{.Endpoint.VirtualIPs}}' redis

The IP is "floating" or "virtual" - it hasn't been assigned to a particular container. If we hit it (substituting the virtual IP from the inspect output for $VIP), we should find that we are also load-balanced across the Redis instances:

# redis-cli -h $VIP
$VIP:6379> get name
"2"
$VIP:6379> exit
# redis-cli -h $VIP
$VIP:6379> get name
"3"
$VIP:6379> exit
# redis-cli -h $VIP
$VIP:6379> get name
"4"
$VIP:6379> exit
# redis-cli -h $VIP
$VIP:6379> get name
"5"
$VIP:6379> exit
# redis-cli -h $VIP
$VIP:6379> get name

If you have any experience with Kubernetes, this should seem familiar to you (but note that Docker have conflated the service and replication controller functionality of Kubernetes into a single service command). The lack of this functionality in the previous version of Swarm meant users had to turn to tools such as consul-template or Flux from Weaveworks. Now Swarm comes with everything you need to stand up and properly load-balance a simple service.

Other built-in features include support for rolling updates and TLS encryption of communication between Swarm nodes (Swarm does the hard work of automatically setting up the TLS certificates) for added security.
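
As a sketch of what the rolling-update support looks like (flag names as in the 1.12 preview; the redis:3.2 image tag is just an illustrative newer version):

```shell
# Roll the redis service over to a newer image, replacing
# one task at a time with a 10-second pause between replacements.
docker service update \
  --image redis:3.2 \
  --update-parallelism 1 \
  --update-delay 10s \
  redis
```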

Of course, for such a new technology, it's not all roses. It's surprisingly annoying to move between the Swarm and container abstractions. For example, it would be nice to be able to `docker exec` into a task, but this requires figuring out which container represents the task and which node it's running on. You can't currently use Compose with Swarm, although I'm sure this will change quickly (I'd guess Compose will move to core at the same time). There doesn't seem to be a way to switch between the spread and binpack scheduling strategies, but that's probably just a documentation issue. Overall the functionality and maturity are naturally behind Kubernetes, but Swarm has made large inroads in a very short period of time and it will be interesting to see how quickly they close the gap.
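
For now, the workaround for exec'ing into a task is manual: find the node from the task listing, then locate the container by its generated name (a sketch; the task name redis.1 is taken from the example above, and <container-id> stands in for whatever ID the filter returns):

```shell
# On the manager: see which node each task is running on.
docker service tasks redis

# On that node: swarm-created containers are named
# <service>.<slot>.<task-id>, so filter on the prefix.
docker ps --filter "name=redis.1" --format "{{.ID}} {{.Names}}"

# Then exec into the matching container ID as usual.
docker exec -it <container-id> sh
```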

I've given several presentations comparing Swarm, Mesos/Marathon and Kubernetes. During these I would describe the disadvantages of Swarm as the overloading of the Docker API and the need to use third-party tools for even simple systems. This new release addresses both those concerns in one fell swoop and sets up Swarm to be a serious competitor in the orchestration space.
