In Kubernetes there are a few different ways to release an application, it is necessary to choose the right strategy to make your infrastructure reliable during an application update.
Choosing the right deployment procedure depends on the needs, we listed below some of the possible strategies to adopt:
- recreate: terminate the old version and release the new one
- ramped: release a new version on a rolling update fashion, one after the other
- blue/green: release a new version alongside the old version then switch traffic
- canary: release a new version to a subset of users, then proceed to a full rollout
- a/b testing: release a new version to a subset of users in a precise way (HTTP headers, cookie, weight, etc.). A/B testing is really a technique for making business decisions based on statistics but we will briefly describe the process. This doesn’t come out of the box with Kubernetes, it implies extra work to setup a more advanced infrastructure (Istio, Linkerd, Traefik, custom nginx/haproxy, etc).
You can experiment with each of these strategies using Minikube, the manifests and steps to follow are explained in this repository: https://github.com/ContainerSolutions/k8s-deployment-strategies
Let’s take a look at each strategy and see what type of application would fit best for it.
Recreate - best for development environment
A deployment defined with a strategy of type Recreate will terminate all the running instances then recreate them with the newer version.
spec: replicas: 3 strategy: type: Recreate
A full example and steps to deploy can be found at https://github.com/ContainerSolutions/k8s-deployment-strategies/tree/master/recreate
- application state entirely renewed
- downtime that depends on both shutdown and boot duration of the application
Ramped - slow rollout
A ramped deployment updates pods in a rolling update fashion, a secondary ReplicaSet is created with the new version of the application, then the number of replicas of the old version is decreased and the new version is increased until the correct number of replicas is reached.
spec: replicas: 3 strategy: type: RollingUpdate rollingUpdate: maxSurge: 2 # how many pods we can add at a time maxUnavailable: 0 # maxUnavailable define how many pods can be unavailable # during the rolling update
A full example and steps to deploy can be found at https://github.com/ContainerSolutions/k8s-deployment-strategies/tree/master/ramped
When setup together with horizontal pod autoscaling it can be handy to use a percentage based value instead of a number for maxSurge and maxUnavailable.
If you trigger a deployment while an existing rollout is in progress, the deployment will pause the rollout and proceed to a new release by overriding the rollout.
- version is slowly released across instances
- convenient for stateful applications that can handle rebalancing of the data
- rollout/rollback can take time
- supporting multiple APIs is hard
- no control over traffic
Blue/Green - best to avoid API versioning issues
A blue/green deployment differs from a ramped deployment because the "green" version of the application is deployed alongside the "blue" version. After testing that the new version meets the requirements, we update the Kubernetes Service object that plays the role of load balancer to send traffic to the new version by replacing the version label in the selector field.
apiVersion: v1 kind: Service metadata: name: my-app labels: app: my-app spec: type: NodePort ports: - name: http port: 8080 targetPort: 8080 # Note here that we match both the app and the version. # When switching traffic, we update the label “version” with # the appropriate value, ie: v2.0.0 selector: app: my-app version: v1.0.0
A full example and steps to deploy can be found at https://github.com/ContainerSolutions/k8s-deployment-strategies/tree/master/blue-green
- instant rollout/rollback
- avoid versioning issue, change the entire cluster state in one go
- requires double the resources
- proper test of the entire platform should be done before releasing to production
- handling stateful applications can be hard
Canary - let the consumer do the testing
A canary deployment consists of routing a subset of users to a new functionality. In Kubernetes, a canary deployment can be done using two Deployments with common pod labels. One replica of the new version is released alongside the old version. Then after some time and if no error is detected, scale up the number of replicas of the new version and delete the old deployment.
Using this ReplicaSet technique requires spinning-up as many pods as necessary to get the right percentage of traffic. That said, if you want to send 1% of traffic to version B, you need to have one pod running with version B and 99 pods running with version A. This can be pretty inconvenient to manage so if you are looking for a better managed traffic distribution, look at load balancers such as HAProxy or service meshes like Linkerd, which provide greater controls over traffic.
In the following example we use two ReplicaSets side by side, version A with three replicas (75% of the traffic), version B with one replica (25% of the traffic).
Truncated deployment manifest version A:
spec: replicas: 3
Truncated deployment manifest version B, note that we only start one replica of the application:
spec: replicas: 1
A full example and steps to deploy can be found at https://github.com/ContainerSolutions/k8s-deployment-strategies/tree/master/canary
- version released for a subset of users
- convenient for error rate and performance monitoring
- fast rollback
- slow rollout
- fine tuned traffic distribution can be expensive (99% A/ 1%B = 99 pod A, 1 pod B)
The procedure used above is Kubernetes native, we adjust the number of replicas managed by a ReplicaSet to distribute the traffic amongst the versions.
If you are not confident about the impact that the release of a new feature might have on the stability of the platform, a canary release strategy is suggested.
A/B testing - best for feature testing on a subset of users
A/B testing is really a technique for making business decisions based on statistics, rather than a deployment strategy. However, it is related and can be implemented using a canary deployment so we will briefly discuss it here.
In addition to distributing traffic amongst versions based on weight, you can precisely target a given pool of users based on a few parameters (cookie, user agent, etc.). This technique is widely used to test conversion of a given feature and only rollout the version that converts the most.
Istio, like other service meshes, provides a finer-grained way to subdivide service instances with dynamic request routing based on weights and/or HTTP headers.
Below is an example of rules setup using Istio, as Istio is still in heavy development the following example rule may change in the future:
route: - tags: version: v1.0.0 weight: 90 - tags: version: v2.0.0 weight: 10
A full example and steps to deploy can be found at https://github.com/ContainerSolutions/k8s-deployment-strategies/tree/master/ab-testing
Other tools like Linkerd, Traefik, NGINX, HAProxy, also allow you to do this.
- requires intelligent load balancer
- several versions run in parallel
- full control over the traffic distribution
- hard to troubleshoot errors for a given session, distributed tracing becomes mandatory
- not straightforward, you need to setup additional tools
To sum up
There are different ways to deploy an application, when releasing to development/staging environments, a recreate or ramped deployment is usually a good choice. When it comes to production, a ramped or blue/green deployment is usually a good fit, but proper testing of the new platform is necessary. If you are not confident with the stability of the platform and what could be the impact of releasing a new software version, then a canary release should be the way to go. By doing so, you let the consumer test the application and its integration to the platform. Last but not least, if your business requires testing of a new feature amongst a specific pool of users, for example all users accessing the application using a mobile phone are sent to version A, all users accessing via desktop go to version B. Then you may want to use the A/B testing technique which, by using a Kubernetes service mesh or a custom server configuration lets you target where a user should be routed depending on some parameters.
I hope this was useful, if you have any questions/feedback feel free to comment below.