It is an interesting feature of today’s software development landscape that, while some consider Continuous Delivery a done deal, many teams still either don’t see its benefits or struggle to realise them. Releasing small increments of functionality often seems like such a straightforward way to ease the pain of large, risky releases that we gloss over the very real obstacles in the way.
The most prevalent software development method today is Scrum. Scrum advocates having your application ready to release at the end of every (roughly two-week-long) sprint. I’m not singling out Scrum teams here, however: any method where several features are bundled together into a release before they get deployed in a production(-ish) environment is problematic. Large, bundled releases hurt in three ways:
- They bundle together many features and fixes. If any one of them breaks in testing, it delays the whole bundle. If any one of them breaks in production, the whole release has to be rolled back.
- They increase the risk of something breaking, tests failing, etc., simply because there is more stuff. Think of a big release as lots of small risks glued together into one huge ball of risk, which you then throw at your production environment.
- They increase work in progress: work that seems to be done (coding) but is really awaiting a next step (testing, production deployment), which might send it back to the previous stage in your pipeline.
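To put rough numbers on the "ball of risk", here is a minimal sketch. It assumes each change fails independently with the same probability; the 5% figure and the function names are made up for illustration, not measurements from a real team.

```python
def p_bundle_fails(n_changes: int, p_fail: float) -> float:
    """Probability that at least one of n independent changes fails."""
    return 1 - (1 - p_fail) ** n_changes


def expected_delayed_bundled(n_changes: int, p_fail: float) -> float:
    """Expected number of delayed changes when one failure blocks the whole bundle."""
    return n_changes * p_bundle_fails(n_changes, p_fail)


def expected_delayed_independent(n_changes: int, p_fail: float) -> float:
    """Expected number of delayed changes when a failure blocks only itself."""
    return n_changes * p_fail


if __name__ == "__main__":
    n, p = 10, 0.05  # hypothetical: 10 changes, each with a 5% chance of breaking
    print(f"bundle fails with probability {p_bundle_fails(n, p):.2f}")
    print(f"expected delayed changes, bundled:     {expected_delayed_bundled(n, p):.2f}")
    print(f"expected delayed changes, independent: {expected_delayed_independent(n, p):.2f}")
```

Under these assumptions, a bundle of ten changes has roughly a 40% chance of containing at least one failure, and every failure holds back all ten changes. Shipped one by one, the same changes break just as often, but each failure delays only itself.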
Given that large releases are risky and error-prone, why do so many teams still cling to them? I see two recurring issues that work against adoption of Continuous Delivery (CD): breaking old habits and misplaced expectations.
How does CD break old habits?
CD shifts around hidden bottlenecks in the release process. For developers, handovers to QA and Ops are a sort of blessing. Delaying the inevitable messiness of running in a real environment feels like increased productivity. It also looks pretty productive from the point of view of the team's manager. During most of the sprint, developers seem super-busy writing code. What happens when the release date approaches? There is a heroic effort to test, to release, to fix the build and test again until all is done and the lights in the office can be turned off.
Many managers and developers accept this alternation between seemingly high productivity and error-prone heroic releases as a fact of software development. Doing Continuous Delivery looks counter-intuitive from this perspective: if releases are stressful, error-prone events, why have more of them?
Misplaced expectations
A second obstacle to CD adoption is having the wrong expectations. We all watch talks and read blogs about companies doing CD and deploying to production 50 times a day. Amazing, right? But we don't feel like it's important for us: our customers are completely fine (maybe even more comfy) with seeing a new release every once in a while. So CD is not for us; it's for those other companies who have to deliver new stuff all the time for some reason.
What is usually lost in these success stories is that it's not (just) about the customer. It's about the way software is developed inside the company. About increasing productivity and managing risk in small increments. It's a very agile idea, even though many agile teams manage to ignore it.
The importance of limiting Work In Progress
So how does delivering small releases improve productivity compared to delivering large releases? The key is work in progress. Every item of work in progress slows a team down and a feature is not ready until it is in production. Any change in the application or infrastructure needs to pass through several stages. The last stage is the change landing on a user’s computer in some form. Once the application fulfills its purpose for the user and nothing was broken by the change, the item can be ticked off and it doesn’t pose any risks anymore.
Not all environments allow pushing each feature independently all the way to users. Public web apps are the best at this, while enterprise applications delivered by a third party are usually the worst. If your environment doesn’t allow pushing to end users every day, try to get as close to them as possible. Grab a few business analysts and treat them as real users (migrate their data, only communicate through official channels, etc.).
Think of each feature that hasn’t made it into the hands of your users yet as a ticking bomb ready to blow up. Work in progress is both a distraction and a risk.
Let’s illustrate how work in progress can bite you. Your team at CatsRUs™ is busy finishing the sprint while business is eager to deliver the new ranking button, which will allow users to give cat photos 1 to 5 stars. You have a sizable team of 8 developers, 4 SETs (Software Engineers in Test) and 2 SREs (Site Reliability Engineers).
As the sprint is two weeks long, you want to get as many features into the release as possible. 8 developers can comfortably start the sprint working on 6 different issues and might even get 10 done in two weeks. One of those will be the ranking button, which the sales department promised by signing the contract with their own blood.
Work on the ranking button goes really well and it is QA tested after three days. The team feels relieved and focuses its energy on getting the rest of the features done. Improvements to user management will require a database schema change, but that should be handled just fine. Finally, by the end of the second week, all features are either done or dropped from the sprint. The team is ready to bundle it all and do the release. SREs deploy to the staging environment and notice that the application doesn’t function properly: users can’t log in. It turns out that another feature, finished on the last day before the release, was not tested with the new user database schema.
So the release needs to be rolled back to fix the problem. It will take 2 days of fixing and regression testing everything that might have been affected. Management is nervous and the team doesn’t sleep, but finally a new release is out and gets successfully deployed to production. Delivery of an important piece of functionality was unexpectedly delayed by an unrelated change.
The elegant solution to the above problem is Continuous Delivery. Push that ranking feature all the way to production without bundling it with anything else. It might not be as efficient as creating a big release on a sunny day when nothing breaks, because regression testing and deployment need to be done many more times. However, the reduced risk and increased control more than make up for that. And of course both regression testing and deployments can be automated more and more if you put effort into it.
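The delay in the CatsRUs story can be worked out in a few lines. This is only a back-of-the-envelope sketch: it assumes the two-week sprint has 10 working days and ignores the time a single deployment takes; the constant and function names are invented for illustration.

```python
SPRINT_DAYS = 10        # assumption: a two-week sprint has 10 working days
BUTTON_READY_DAY = 3    # from the story: ranking button QA tested after three days
FIX_DAYS = 2            # from the story: two days of fixing and regression testing


def bundled_ship_day(sprint_days: int, fix_days: int) -> int:
    """Day the feature reaches production when it waits for the whole bundle."""
    return sprint_days + fix_days


def independent_ship_day(ready_day: int) -> int:
    """Day the feature reaches production when it ships as soon as it is ready."""
    return ready_day


if __name__ == "__main__":
    bundled = bundled_ship_day(SPRINT_DAYS, FIX_DAYS)
    independent = independent_ship_day(BUTTON_READY_DAY)
    print(f"bundled release:     day {bundled}")
    print(f"independent release: day {independent}")
    print(f"days spent as work in progress: {bundled - independent}")
```

With these numbers the ranking button sits finished but unreleased for nine working days, two of them caused by a change that had nothing to do with it.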
Conclusions
Continuous Delivery is the most important improvement you can make to your delivery process after adopting agile methods of working. Leaving work in progress by batching features for large(r) releases will ultimately increase your deployment risks and lower team productivity. Try CD for a few weeks and you will see that your delivery process even becomes simpler: much of project management revolves around keeping track of in-progress work items, and there will be fewer of them.
For a truly performant CD process, automation in all stages of the pipeline is critical. A flexible application architecture, such as one organised around independent microservices, also helps with safely deploying individual changes. These concerns are outside the scope of this blog post, but be on the lookout for increased pressure on your team in these areas once you adopt Continuous Delivery.