Today and yesterday Pini, Quinten and I were busy with one of our favourite clients here on the Swiss-German border. The problem we were trying to solve was how to continuously deliver an application from development machines all the way through to staging and production where the target system is Mesos and our services are in Docker containers. We did this by locking ourselves in an old water tower for two days. The water tower had wifi.
CD with Docker and Mesos is not a problem for the faint-hearted. Firstly, you need to know how Docker and Mesos play together. Secondly, when you are doing continuous delivery you have to understand the Mesos API. Was the old node killed? That is the kind of information we need in order to do system testing. We also need to kill nodes ourselves to see whether the system responds as expected; that is part of the system test too. To do this, you can use the Mesos events or other methods from the REST API. Thirdly, you have to put developers and operations people in the same room and have a real go at getting them to play together. If people are unwilling, then it won’t work. Fourthly, you actually have to understand what continuous delivery is and how it works.
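To make the "was the old node killed?" question a little more concrete, here is a rough Python sketch of how a system test might poll for it. It is not what we built in the water tower. It assumes the Mesos master serves a JSON task list at `/tasks` with `id` and `state` fields (check this against your Mesos version; older releases expose similar data in the master state JSON). The master address, the task IDs and the helper names `fetch_tasks`, `task_state` and `wait_for_state` are all made up for illustration.

```python
import json
import time
from typing import Callable, List, Optional
from urllib.request import urlopen

# Hypothetical master address; replace with your own.
MASTER_URL = "http://mesos-master.example:5050"


def fetch_tasks(master_url: str = MASTER_URL) -> List[dict]:
    """Fetch the task list from the master.

    Assumption: GET <master>/tasks returns {"tasks": [{"id": ..., "state": ...}, ...]}.
    """
    with urlopen(f"{master_url}/tasks") as response:
        return json.load(response).get("tasks", [])


def task_state(tasks: List[dict], task_id: str) -> Optional[str]:
    """Return the reported state of task_id, or None if the task is not listed."""
    for task in tasks:
        if task.get("id") == task_id:
            return task.get("state")
    return None


def wait_for_state(
    task_id: str,
    wanted: str,
    fetch: Callable[[], List[dict]] = fetch_tasks,
    timeout: float = 60.0,
    interval: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until task_id reports the wanted state; return False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        if task_state(fetch(), task_id) == wanted:
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval)
```

In a system test you would kill a node (or deploy a new version) and then call something like `wait_for_state("web.1", "TASK_KILLED")`, failing the test if it returns `False`. The `fetch` and `sleep` parameters are injectable so the polling logic can be exercised without a running cluster.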
Thus there were four pretty big things to get right.
So what did we learn?
Conclusion
It is quite clear to me that Docker used in conjunction with Mesos is becoming a common pattern. What was less clear was whether we could continuously deploy systems using Mesos and Docker. The answer is yes, you can. However, it’s extremely difficult, not just technically but also because all the moving parts demand great organisational skills. It also requires knowledge of the tools, which is hard to come by.
What would I do differently next time? We waited too long this time to introduce the developers to Mesos. Next time we can take a lot of the sting out of the project by teaching the developers Mesos before the project starts. As we speak, Quinten is putting together a workshop, ‘CD with Docker and Mesos’. The other thing I'd do differently is get the team together earlier. This is a classic mistake, but one we keep making. It matters even more now because of all the moving parts.
I always say, mainly to deaf ears, that there are no technical problems, only managerial ones. This adage of mine was never truer than it is now, when we are building massively distributed systems that must scale, must never be down, and that we must deploy to quickly and without error. The revolution may very well be containerised, but it also has to be very well managed if we are to avoid disappointment and break out of the patterns of the past.