Fully automated canary deployments in Kubernetes

In a previous article, we described how you can do blue/green deployments in Codefresh using a declarative step in your Codefresh Pipeline.

Blue/Green deployments are very powerful when it comes to easy rollbacks, but they are not the only approach for updating your Kubernetes application.

Another deployment strategy is using Canaries (a.k.a. incremental rollouts). With canaries, the new version of the application is gradually deployed to the Kubernetes cluster while getting a very small amount of live traffic (i.e. a subset of live users are connecting to the new version while the rest are still using the previous version).

The small subset of live traffic to the new version acts as an early warning for potential problems that might be present in the new code. As our confidence increases, more canaries are created and more users connect to the updated version. Eventually, all live traffic goes to the canaries, and the canary version becomes the new “production version”.

The big advantage of using canaries is that deployment issues can be detected very early while they still affect only a small subset of all application users. If something goes wrong with a canary, the production version is still present and all traffic can simply be reverted to it.

While a canary is active, you can use it for additional verification (for example, running smoke tests) to further increase your confidence in the stability of each new version.

Unlike blue/green deployments, canary releases are based on the following assumptions:

  1. Multiple versions of your application can exist together at the same time, getting live traffic.
  2. If you don’t use some kind of sticky session mechanism, some customers might hit a production server in one request and a canary server in another.

If you cannot guarantee these two points, then blue/green deployments are a much better approach for safe deployments.

Canaries with/without Istio

The gradual confidence offered by canary releases is a major selling point, and lots of organizations are looking for ways to adopt canaries as their main deployment method. Codefresh recently released a comprehensive webinar that shows how you can perform canary updates in Kubernetes using Helm and Istio.

The webinar shows the recommended way to do canaries using Istio. Istio is a service mesh that can be used in your Kubernetes cluster to shape your traffic according to your own rules. Istio is a perfect solution for doing canaries as you can point any percentage of your traffic to the canary version regardless of the number of pods that serve it.

In a Kubernetes cluster without Istio, the number of canary pods directly affects the traffic they get at any given point in time.

Traffic switching without Istio

So if, for example, you need your canary to get 10% of the traffic, you need at least 9 production pods. With Istio there is no such restriction: the number of pods serving the canary version and the traffic they get are unrelated. All possible combinations that you might think of are valid. Here are some examples of what you can achieve with Istio:

Traffic switching with Istio
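With Istio, such a split is expressed declaratively and is completely independent of replica counts. Below is a minimal sketch (not taken from the webinar or the example application; the host, subset names, and percentages are illustrative assumptions) of an Istio VirtualService that sends 10% of traffic to a canary subset, with the subsets themselves assumed to be defined in a matching DestinationRule:

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: my-demo-app            # hypothetical name, for illustration only
spec:
  hosts:
    - my-demo-app
  http:
    - route:
        - destination:
            host: my-demo-app
            subset: production # subset defined in a matching DestinationRule
          weight: 90
        - destination:
            host: my-demo-app
            subset: canary
          weight: 10           # 10% of traffic, regardless of how many canary pods exist

Changing the weights (say, from 90/10 to 50/50) shifts traffic without touching the number of pods behind each subset.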

This is why we recommend using Istio. Istio has several other interesting capabilities such as rate limiting, circuit breakers, A/B testing etc.

The webinar also uses Helm for deployments. Helm is a package manager for Kubernetes that groups multiple manifests together, allowing you to deploy an application along with its dependencies.

At Codefresh, we have several customers who wanted to use canary deployments in their pipelines but chose to wait until Istio reached version 1.0 before actually using it in production.

Even though we fully recommend Istio for doing canary deployments, we also developed a Codefresh plugin (i.e. a Docker image) that allows you to take advantage of canary deployments even on plain Kubernetes clusters (without Istio installed).

We are open-sourcing this Docker image today for everybody to use and we will explain how you can integrate it in a Codefresh pipeline with only declarative syntax.

Canary deployments with a declarative syntax

In a similar manner to the blue/green deployment plugin, the canary plugin takes care of all the kubectl invocations needed behind the scenes. To use it, you can simply insert it in a Codefresh pipeline as below:

canaryDeploy:
  title: "Deploying new version ${{CF_SHORT_REVISION}}"
  image: codefresh/k8s-canary:master
  environment:
    - WORKING_VOLUME=.
    - SERVICE_NAME=my-demo-app
    - DEPLOYMENT_NAME=my-demo-app
    - TRAFFIC_INCREMENT=20
    - NEW_VERSION=${{CF_SHORT_REVISION}}
    - SLEEP_SECONDS=40
    - NAMESPACE=canary
    - KUBE_CONTEXT=myDemoAKSCluster

Notice the complete lack of kubectl commands. The Docker image k8s-canary contains a single executable that takes the following parameters as environment variables:

| Environment Variable | Description |
| --- | --- |
| KUBE_CONTEXT | Name of your cluster in the Codefresh dashboard |
| WORKING_VOLUME | A folder for saving temp/debug files |
| SERVICE_NAME | Existing K8s service |
| DEPLOYMENT_NAME | Existing K8s deployment |
| TRAFFIC_INCREMENT | Percentage of pods to convert to canaries at each stage |
| NEW_VERSION | Docker tag for the next version of the app |
| SLEEP_SECONDS | How many seconds each canary stage will wait. After that, the new pods will be checked for restarts |
| NAMESPACE | K8s namespace where deployments happen |

Prerequisites

The canary deployment step makes the following assumptions:

  • An initial service and the respective deployment should already exist in your cluster.
  • The name of each deployment should contain its version.
  • The service should have a metadata label that shows what the “production” version is.

These requirements allow each canary deployment to finish in a state that allows the next one to run in the same manner.

You can use anything you want as a “version”, but the recommended approach is to use git hashes and tag your Docker images with them. In Codefresh this is very easy because the built-in variable CF_SHORT_REVISION gives you the git hash of the commit that was pushed. A rough sketch of what the service and initial deployment could look like is shown below.
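For illustration only (this is not the actual example application; the names, ports, registry, and git hash are assumptions), a service and initial deployment satisfying the prerequisites could look roughly like this:

apiVersion: v1
kind: Service
metadata:
  name: my-demo-app
  labels:
    version: 473bf38          # git hash of the version currently "in production" (hypothetical)
spec:
  selector:
    app: my-demo-app          # selects pods by app only, so a canary deployment can share it later
  ports:
    - port: 80
      targetPort: 8080
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-demo-app-473bf38   # deployment name contains the version
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-demo-app
      version: 473bf38
  template:
    metadata:
      labels:
        app: my-demo-app
        version: 473bf38
    spec:
      containers:
        - name: trivial-web
          image: myregistry/trivial-web:473bf38   # image tagged with the git hash
          ports:
            - containerPort: 8080

The important parts in this sketch are the version label on the service metadata, the version embedded in the deployment name, and a service selector that does not include the version. The real example application linked further down has the exact labels the plugin expects.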

The build step that creates the Docker image used by the canary step is a standard Codefresh build step that tags the image with the git hash.

BuildingDockerImage:
  title: Building Docker Image
  type: build
  image_name: trivial-web
  working_directory: ./example/
  tag: '${{CF_SHORT_REVISION}}'
  dockerfile: Dockerfile

For more details, you can look at the example application that also contains a service and deployment with the correct labels as well as the full codefresh.yml file.

How to perform Canary deployments

When you run a deployment in Codefresh, the pipeline step will print messages with its progress:

Canary Logs

First, the canary plugin will read the Kubernetes service and extract the “version” metadata label to find out which version is running “in production”. Then it will read the respective deployment and find the Docker image currently getting live traffic. It will also read the current number of replicas for that deployment.

Then it will create a second deployment using the new Docker image tag. This second deployment uses the same pod labels as the first one, so the existing service will serve BOTH deployments at the same time. A single pod of the new version is deployed, and it instantly gets a share of live traffic proportional to the total number of pods. For example, if you have 3 pods in production and the new version pod is created, it will instantly get 25% of the traffic (1 canary pod out of 4 total).
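Conceptually, the generated canary deployment looks something like the following sketch (again, the names, registry, and git hash are illustrative assumptions, not the plugin’s exact output); note how the pod labels match the service selector from the earlier sketch:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-demo-app-85cc927   # new deployment, named after the new version (hypothetical hash)
spec:
  replicas: 1                 # starts with a single canary pod
  selector:
    matchLabels:
      app: my-demo-app
      version: 85cc927
  template:
    metadata:
      labels:
        app: my-demo-app      # same app label as the production pods, so the existing
        version: 85cc927      #   service load-balances across both deployments
    spec:
      containers:
        - name: trivial-web
          image: myregistry/trivial-web:85cc927
          ports:
            - containerPort: 8080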

Once the first pod is created, the script runs in a loop where each iteration does the following:

  1. Increases the number of canaries according to the predefined percentage. For example, a percentage of 33% means that 3 phases of canaries will be performed. With 25%, you will see 4 canary iterations and so on. The algorithm used is pretty basic and for a very low number of pods, you will see a lot of rounding happening.
  2. Waits for some seconds until the pods have time to start (the time is configurable).
  3. Checks for pod restarts. If there are none, it assumes that everything is ok and the next iteration happens.

This goes on until only canaries get live traffic. The previous deployment is destroyed and the new one is marked as “production” in the service.

If at any point there are problems with canaries (or restarts), all canary instances are destroyed and all live traffic goes back to the production version.

You can see all this happening in real-time, either using direct kubectl commands or looking at the Codefresh Kubernetes dashboard. While canaries are active, you will see two Docker image versions in the Images column.

We are working on more ways of health-checking in addition to looking at pod restarts. The canary image is available on Docker Hub.

New to Codefresh? Create Your Free Account Today!
