Deploying Istio on a Kubernetes Cluster using Rancher 2.0

 


Service mesh is a new technology stack aimed at solving the connectivity problem between cloud native applications. If you want to build a cloud native application, you need a service mesh. One of the big players in the service mesh world is Istio. Istio is best described on its own About page. It’s a very promising service mesh solution, based on Envoy Proxy, with multiple tech giants contributing to it.

Below is an overview of how you can deploy Istio using Rancher 2.0.

Istio currently works best with Kubernetes, although support for other platforms is in progress. So to deploy Istio and demonstrate some of its capabilities, you need a Kubernetes cluster, and that is pretty easy to get using Rancher 2.0.

Prerequisites

To perform this demo, you will need the following:

  • a Google Cloud Platform account; the free tier is more than enough;
  • one Ubuntu 16.04 instance (this is where the Rancher instance will be running);
  • a Kubernetes cluster deployed to Google Cloud Platform, using Google Kubernetes Engine. This demo uses version 1.10.5-gke.2, the latest available at the time of writing;
  • Istio version 0.8.0, the latest available at the time of writing.

Normally the steps provided should be valid with newer versions, too.

Starting a Rancher 2.0 instance

To begin, start a Rancher 2.0 instance. There’s a very intuitive getting started guide for this purpose here. Just to be sure you’ll get the information you need, the steps will be outlined below as well.

This example will use Google Cloud Platform, so let’s start an Ubuntu instance there and allow HTTP and HTTPS traffic to it, either via the Console or the CLI. Here are example commands to achieve this:

gcloud compute --project=rancher-20 instances create rancher-20 \
  --zone=europe-west2-a --machine-type=n1-standard-1 \
  --tags=http-server,https-server --image=ubuntu-1604-xenial-v20180627 \
  --image-project=ubuntu-os-cloud

gcloud compute --project=rancher-20 firewall-rules create default-allow-http \
  --direction=INGRESS --priority=1000 --network=default --action=ALLOW \
  --rules=tcp:80 --source-ranges=0.0.0.0/0 --target-tags=http-server

gcloud compute --project=rancher-20 firewall-rules create default-allow-https \
  --direction=INGRESS --priority=1000 --network=default --action=ALLOW \
  --rules=tcp:443 --source-ranges=0.0.0.0/0 --target-tags=https-server

Make sure you have at least 1 vCPU and about 4GB of RAM available for the Rancher instance.

The next step is to ssh into the instance and install Docker.
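
A minimal sketch of installing Docker on the Ubuntu 16.04 instance, using Docker's convenience script (good enough for a demo; for production you may prefer a pinned package version):

curl -fsSL https://get.docker.com | sudo sh
# Optional: allow running docker without sudo (requires logging out and back in)
sudo usermod -aG docker $USER

Once Docker is installed, start Rancher and verify that it’s running: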

$ sudo docker run -d --restart=unless-stopped -p 80:80 -p 443:443 rancher/rancher
Unable to find image 'rancher/rancher:latest' locally
latest: Pulling from rancher/rancher
6b98dfc16071: Pull complete
4001a1209541: Pull complete
6319fc68c576: Pull complete
b24603670dc3: Pull complete
97f170c87c6f: Pull complete
c5880aba2145: Pull complete
de3fa5ee4e0d: Pull complete
c973e0300d3b: Pull complete
d0f63a28838b: Pull complete
b5f0c036e778: Pull complete
Digest: sha256:3f042503cda9c9de63f9851748810012de01de380d0eca5f1f296d9b63ba7cd5
Status: Downloaded newer image for rancher/rancher:latest
2f496a88b82abaf28e653567d8754b3b24a2215420967ed9b817333ef6d6c52f
$ sudo docker ps
CONTAINER ID        IMAGE               COMMAND                  CREATED              STATUS              PORTS                                      NAMES
2f496a88b82a        rancher/rancher     "rancher --http-list…"   About a minute ago   Up 59 seconds       0.0.0.0:80->80/tcp, 0.0.0.0:443->443/tcp   elegant_volhard

Get the public IP address of the instance and point your browser to it:

$ gcloud compute instances describe rancher-20 --project=rancher-20 --format="value(networkInterfaces[0].accessConfigs[0].natIP)"
35.189.72.39

You should be redirected to an HTTPS page of Rancher, and your browser will show a warning because Rancher uses a self-signed certificate. Ignore the warning, since this is the instance you just started (never do that on untrusted sites!), and proceed to set up Rancher 2.0 by setting the admin password and server URL. That’s it – you now have Rancher 2.0 running. Now it’s time to start your Kubernetes cluster.

Starting a Kubernetes Cluster

To start a Kubernetes cluster, you’ll need a Google Cloud Service Account with the following Roles attached to it: Compute Viewer, Kubernetes Engine Admin, Service Account User, Project Viewer. Afterwards you need to generate service account keys, as described here.

Now get your service account keys (it’s safe to use the default Compute Engine service account); you will need your service account keys to start a Kubernetes cluster using Rancher 2.0:

gcloud iam service-accounts keys create ./key.json \
  --iam-account <SA-NAME>@developer.gserviceaccount.com

Note the <SA-NAME>@developer.gserviceaccount.com value; you will need it later.

Now you’re ready to start your cluster. Go to the Rancher dashboard and click on Add Cluster. Make sure you do the following:
* select Google Container Engine as the hosted Kubernetes provider;
* give your cluster a name, for example rancher-demo;
* import or copy/paste the service account key details from the key.json file (generated above) into the Service Account field.

Proceed with the Configure Nodes option and select the following:
* for Kubernetes Version, it should be safe to select the latest available version; this test was done with version 1.10.5-gke.2;
* select the zone that is closest to you;
* Machine Type needs to be at least n1-standard-1;
* for the Istio demo, the Node Count should be at least 4.

Once these are selected, your setup would look like the image below:

Rancher add cluster

Click with confidence on Create.

After several minutes you should see your cluster as Active in the Rancher dashboard. Remember that <SA-NAME>@developer.gserviceaccount.com value? You need it now, to grant cluster admin permissions to the current user (admin permissions are required to create the necessary RBAC rules for Istio). To do that, click on the rancher-demo cluster name in the Rancher dashboard; this takes you to the rancher-demo cluster dashboard, which should look similar to the image below:

rancher-demo Cluster Dashboard

Now click Launch kubectl; this opens a kubectl command line for this particular cluster. You can also export the Kubeconfig File to use with your locally installed kubectl, but for this purpose the command line provided by Rancher is enough. Once you have the command line open, run the following command there:

> kubectl create clusterrolebinding cluster-admin-binding \
  --clusterrole=cluster-admin \
  --user=<SA-NAME>@developer.gserviceaccount.com

clusterrolebinding "cluster-admin-binding" created
>

Deploying Istio on Rancher

Istio has a Helm package, and Rancher can consume that Helm package to install Istio. To get the official Istio Helm package, it’s best to add Istio’s repository to the Rancher Apps Catalog. To do that, go to the Rancher Global View, then to the Catalogs option, and select Add Catalog. Fill it in as follows:
* for Name, let’s use istio-github;
* in Catalog URL, paste the following URL: https://github.com/istio/istio.git (Rancher works with anything git clone can handle);
* in the Branch field, set the branch name to master.
It should look like the screenshot below:

Rancher add Istio Helm Catalog

Hit Create

At this stage, you should be able to deploy Istio from Rancher’s catalog. To do that, go to the Default project of the rancher-demo cluster and select Catalog Apps there. Once you click Launch, you will be presented with a number of default available applications. As this demo is about Istio, select from All Catalogs the istio-github catalog that you’ve just created. This presents you with two options: istio and istio-remote. Select View Details for the istio one. You’ll be presented with options for deploying Istio. Select the following:
* let’s set the name to istio-demo;
* leave the template version at 0.8.0;
* the default namespace used for Istio is istio-system, so set the namespace to istio-system;
* by default, Istio doesn’t encrypt traffic between its components. That’s a very nice feature to have, so let’s enable it. Likewise, Istio’s Helm chart doesn’t deploy Grafana by default, which is very useful to have, so let’s add that too. Both are done by setting the global.controlPlaneSecurityEnabled and grafana.enabled variables to true. To do this:
  - click Add Answer;
  - enter the variable name global.controlPlaneSecurityEnabled;
  - set its Value to true;
  - do the same for grafana.enabled;

All of the above should look like the screenshot below:

Deploy Istio from Rancher Catalog

Everything looks good, so click on Launch.
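
As a point of reference, the answers above map one-to-one to Helm values. Outside of Rancher, a roughly equivalent install with the Helm CLI would look like the sketch below (the chart path is an assumption based on the layout of the Istio 0.8.0 release; Rancher runs the equivalent of this for you when you hit Launch):

git clone -b 0.8.0 https://github.com/istio/istio.git
helm install istio/install/kubernetes/helm/istio \
  --name istio-demo \
  --namespace istio-system \
  --set global.controlPlaneSecurityEnabled=true \
  --set grafana.enabled=true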

Now if you look at the Workloads tab, you should see all of Istio’s components spinning up in your cluster. Make sure all of the workloads are green. Also check the Load Balancing tab: you should have istio-ingress and istio-ingressgateway there, both in Active state.

In case istio-ingressgateway is in Pending state, you need to apply the istio-ingressgateway service once again. To do that:
* click on Import Yaml;
* for Import Mode, select Cluster: Direct import of any resources into this cluster;
* copy/paste the istio-demo-ingressgateway.yaml Service into the Import Yaml editor and hit Import.

This step should solve the Pending problem with istio-ingressgateway.
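
You can also confirm the fix from Rancher's kubectl prompt; a value under EXTERNAL-IP (instead of <pending>) means the Google load balancer was provisioned:

kubectl -n istio-system get service istio-ingressgateway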

You should now check that all Istio’s Workloads, Load Balancing and Service Discovery parts are green in Rancher Dashboard.

One last thing: so that the Istio sidecar container is injected automatically into your pods, run the following kubectl command (you can launch kubectl from inside Rancher, as described above) to add the istio-injection label to your default namespace:

> kubectl label namespace default istio-injection=enabled
namespace "default" labeled
> kubectl get namespace -L istio-injection
NAME STATUS AGE ISTIO-INJECTION
cattle-system Active 1h
default Active 1h enabled
istio-system Active 37m
kube-public Active 1h
kube-system Active 1h
>

This label will make sure that the Istio sidecar injector automatically injects Envoy containers into your application pods.
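
Later, once an application is deployed to the default namespace, a quick way to verify that injection works is to list the pods: each application pod should report two containers (the application container plus the injected istio-proxy sidecar):

kubectl -n default get pods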

Deploying Bookinfo sample app

Now you can deploy a test application and test the power of Istio. To do that, let’s deploy the Bookinfo sample application. The interesting part of this application is that it runs three versions of the reviews service at the same time, which is where we can see some of Istio’s features in action.
Go to the rancher-demo Default project workloads to deploy the Bookinfo app:
* click on Import Yaml;
* download the following bookinfo.yaml to your local computer;
* upload it to Rancher by using the Read from file option, after you enter the Import Yaml menu;
* for the Import Mode select Cluster: Direct import of any resources into this cluster;
* click on Import

This should add 6 more workloads to your rancher-demo Default project. Just like in the screenshot below:

Rancher Bookinfo Workloads
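
As an alternative to the Import Yaml dialog, the same manifest can be applied from Rancher's kubectl prompt. A sketch, assuming the raw file from the Istio 0.8 release on GitHub (the URL reflects the 0.8 repository layout and may need adjusting to wherever you downloaded bookinfo.yaml):

kubectl -n default apply -f https://raw.githubusercontent.com/istio/istio/release-0.8/samples/bookinfo/kube/bookinfo.yaml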

Now, to expose the Bookinfo app via Istio, you need to apply the bookinfo-gateway.yaml the same way as the bookinfo.yaml.
At this point, you can access the Bookinfo app in your browser. Get the external IP address of the istio-ingressgateway load balancer. There are several ways to find this IP address. From Rancher, you can go to Load Balancing and, from the right-hand side menu, select View in API, just like in the screenshot below:

View Load Balancer in API

It opens in a new browser tab; search there for publicEndpoints -> addresses and you should see the public IP address.
Another way is via kubectl:

> export INGRESS_HOST=$(kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
> echo $INGRESS_HOST

Point your browser to http://$INGRESS_HOST/productpage and you should see the Bookinfo app. If you refresh the page multiple times, you should see three different versions of the Book Reviews section:
- the first with no stars;
- the second with black stars;
- the third with red stars.

Using Istio, you can limit traffic to route only to the first version of the app. To do that, import the route-rule-all-v1.yaml into Rancher, wait for a couple of seconds, and then refresh the page multiple times. You should no longer see any stars on the reviews.

Another example is to route traffic only for a specific set of users. If you import route-rule-reviews-test-v2.yaml into Rancher and log in to the Bookinfo app with the username jason (no password needed), you should see only version 2 of the reviews (the one with the black stars). Logging out will again show you only version 1 of the reviews app.

The power provided by Istio can already be seen. Of course, there are many more possibilities with Istio. With this setup in place, you can play around with the tasks provided in Istio’s documentation.

Istio’s telemetry

Now it’s time to dive into the even more useful features of Istio – the metrics provided by default.

Let’s start with Grafana. The grafana.enabled variable that we set to true when deploying Istio created a Grafana instance, configured to collect Istio’s metrics and display them in several dashboards. By default Grafana’s service isn’t exposed publicly, so to view the metrics you first need to expose it on a public IP address. There’s also the option to expose the service using a NodePort, but that would require opening the NodePort on all of the nodes in the Google Cloud Platform firewall, which is one more task to deal with, so it’s simpler to just expose it via a public IP address.

To do this, go to the Workloads under the rancher-demo Default project and select the Service Discovery tab. After all the work already done on the cluster, there should be about 5 services in the default namespace and 12 services in the istio-system namespace, all in Active state. Select the grafana service and, from the right-hand side menu, select View/Edit YAML, just like in the image below:

Rancher change grafana service

Find the line that says type: ClusterIP, change it to type: LoadBalancer, and confidently click Save. This provisions a load balancer in Google Cloud Platform and exposes Grafana there, on its default port 3000. To get the public IP address of Grafana, repeat the process used to find the IP address for the Bookinfo example: either view the grafana service in the API, where you can find the IP address, or get it via kubectl:

export GRAFANA_HOST=$(kubectl -n istio-system get service grafana -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
echo $GRAFANA_HOST
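
If you prefer the command line over editing YAML in the Rancher UI, the same type change can be made with kubectl patch; a sketch of the equivalent operation:

kubectl -n istio-system patch service grafana -p '{"spec": {"type": "LoadBalancer"}}'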

Point your browser to http://$GRAFANA_HOST:3000/ and select one of the dashboards, for example the Istio Service Dashboard. With the previously applied configuration, we limited traffic to only version 1 of the reviews app. To see that in the graphs, select reviews.default.svc.cluster.local from the Service dropdown. Now generate some traffic from Rancher’s kubectl, using the following commands:

export INGRESS_HOST=$(kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
# any loop count that generates a few minutes of traffic works; print the HTTP status code of each request
for i in $(seq 1 1000); do curl -o /dev/null -s -w "%{http_code}\n" http://$INGRESS_HOST/productpage; sleep 0.2; done

Wait about 5 minutes for the generated traffic to show up in Grafana; after that, the dashboard should look like this:

Grafana Istio Service Dashboard

If you scroll down a little on the dashboard, under SERVICE WORKLOADS you can clearly see on the Incoming Requests by Destination And Response Code graph that requests for the reviews app end up only on the v1 endpoint. If you generate some requests to version 2 of the app with the following commands (remember that the user jason has access to v2 of the reviews app):

export INGRESS_HOST=$(kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
for i in $(seq 1 1000); do curl -o /dev/null -s -w "%{http_code}\n" --cookie "user=jason" http://$INGRESS_HOST/productpage; sleep 0.2; done

You should see requests appearing for the v2 app too, just like in the below screenshot:

Grafana Istio Services Graph for v1 and v2

In the same manner, you can expose and view the other telemetry add-ons Istio provides by default, like Prometheus, Tracing and ServiceGraph.
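
If you'd rather not expose those services on public IP addresses, kubectl port-forward is a lightweight alternative. A sketch, assuming the service names and ports created by the 0.8.0 chart (these may differ in other versions):

# Prometheus on http://localhost:9090 (Ctrl-C to stop forwarding)
kubectl -n istio-system port-forward svc/prometheus 9090:9090

# ServiceGraph on http://localhost:8088
kubectl -n istio-system port-forward svc/servicegraph 8088:8088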

Some final thoughts

As you have already seen, Istio is a very powerful and useful service mesh platform. It will surely become an essential tool in the cloud native world. The main problem, at the moment, is that it is not yet production ready. To quote the one and only @kelseyhightower – “Don’t run out of here and deploy it in production. You’ll be on the news.” Anyway, you should definitely keep an eye on it, as it won’t take long until it becomes production ready.

As for Rancher 2.0, it is very useful for seeing the whole Kubernetes cluster state: all the workloads, services and pods. It provides an easy way to manage the cluster via the web UI and to deploy apps via Helm charts, even for someone who isn’t very familiar with Kubernetes. With Rancher 2.0 you have everything you need to manage a Kubernetes cluster and get a great overview of its state, and I’m sure the folks at Rancher will keep adding more and more useful features.

Roman Doroschevici


Kubernetes Wins the 2018 OSCON Most Impact Award


Authors: Brian Grant (Principal Engineer, Google) and Tim Hockin (Principal Engineer, Google)

We are humbled to be recognized by the community with this award.

We had high hopes when we created Kubernetes. We wanted to change the way cloud applications were deployed and managed. Whether we’d succeed or not was very uncertain. And look how far we’ve come in such a short time.

The core technology behind Kubernetes was informed by lessons learned from Google’s internal infrastructure, but nobody can deny the enormous role of the Kubernetes community in the success of the project. The community, of which Google is a part, now drives every aspect of the project: the design, development, testing, documentation, releases, and more. That is what makes Kubernetes fly.

While we actively sought partnerships and community engagement, none of us anticipated just how important the open-source community would be, how fast it would grow, or how large it would become. Honestly, we really didn’t have much of a plan.

We looked to other open-source projects for inspiration and advice: Docker (now Moby), other open-source projects at Google such as Angular and Go, the Apache Software Foundation, OpenStack, Node.js, Linux, and others. But it became clear that there was no clear-cut recipe we could follow. So we winged it.

Rather than rehashing history, we thought we’d share two high-level lessons we learned along the way.

First, in order to succeed, community health and growth needs to be treated as a top priority. It’s hard, and it is time-consuming. It requires attention to both internal project dynamics and outreach, as well as constant vigilance to build and sustain relationships, be inclusive, maintain open communication, and remain responsive to contributors and users. Growing existing contributors and onboarding new ones is critical to sustaining project growth, but that takes time and energy that might otherwise be spent on development. These things have to become core values in order for contributors to keep them going.

Second, start simple with how the project is organized and operated, but be ready to adapt to more scalable approaches as it grows. Over time, Kubernetes has transitioned from what was effectively a single team and git repository to many subgroups (Special Interest Groups and Working Groups), sub-projects, and repositories. From manual processes to fully automated ones. From informal policies to formal governance.

We certainly didn’t get everything right or always adapt quickly enough, and we constantly struggle with scale. At this point, Kubernetes has more than 20,000 contributors and is approaching one million comments on its issues and pull requests, making it one of the fastest moving projects in the history of open source.

Thank you to all our contributors and to all the users who’ve stuck with us on the sometimes bumpy journey. This project would not be what it is today without the community.


Couchbase on OpenShift and Kubernetes – Jetstack Blog

By Matthew Bates

Jetstack are pleased to open source a proof-of-concept sidecar for deployment of managed Couchbase clusters on OpenShift. The project is the product of a close engineering collaboration with Couchbase, Red Hat and Amadeus, and a demo was presented at the recent Red Hat Summit in Boston, MA.

This project provides a sidecar container that can be used alongside official Couchbase images to provide a scalable and flexible Couchbase deployment for OpenShift and Kubernetes. The sidecars manage cluster lifecycle, including registering new nodes into the Couchbase cluster, automatically triggering cluster rebalances, and handling migration of data given a scale-down or node failure event.

Couchbase Server is a NoSQL document database with a distributed architecture for performance, scalability, and availability. It enables developers to build applications easier and faster by leveraging the power of SQL with the flexibility of JSON.

In recent versions of OpenShift (and the upstream Kubernetes project), there has been significant advancement in a number of the building blocks required for deployment of distributed applications. Notably:

  • StatefulSet: (née PetSet, and now in technical preview as of OpenShift 3.5) provides unique and stable identity and storage to pods, and guarantees deployment order and scaling. This is in contrast to a Deployment or ReplicaSet, where pod replicas do not maintain identity across restart/rescheduling nor keep the same volume storage properties – hence, these resources are suited to stateless applications.
  • Dynamic volume provisioning: first introduced in technology preview in 3.1.1, and now GA in 3.3, this feature enables storage to be
    dynamically provisioned ‘on-demand’ in a supported cloud environment (e.g. AWS, GCP, OpenStack). The StatefulSet controller automatically creates requests for storage (PersistentVolumeClaim – PVC) per pod, and the storage is provisioned (PersistentVolume – PV). The unique 1-to-1 binding between PV and PVC ensures a pod is always reunited with its same volume, even if scheduled on another node in the instance of failure.
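
To make those two building blocks concrete, here is a minimal, generic sketch (not the Couchbase deployment itself) of a StatefulSet whose volumeClaimTemplates cause one PersistentVolumeClaim, and therefore one dynamically provisioned PersistentVolume, to be created per pod:

kubectl apply -f - <<'EOF'
apiVersion: apps/v1          # use whichever apps API version your cluster serves
kind: StatefulSet
metadata:
  name: demo
spec:
  serviceName: demo          # headless service that gives pods stable network identity
  replicas: 3
  selector:
    matchLabels:
      app: demo
  template:
    metadata:
      labels:
        app: demo
    spec:
      containers:
      - name: demo
        image: nginx
        volumeMounts:
        - name: data
          mountPath: /data
  volumeClaimTemplates:      # one PVC per pod: data-demo-0, data-demo-1, data-demo-2
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 1Gi
EOF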

Whilst OpenShift (or Kubernetes), by utilising its generic concepts of StatefulSet and dynamic volume provisioning, will make sure the right pods are scheduled and running, it cannot account for Couchbase-specific requirements in its decision making process. For example, registering new nodes when scaling up, rebalancing and also handling migration of data at a scale-down or on node failure. The pod and node events are well-known to OpenShift/Kubernetes, but the actions required are very much database-specific.

In this PoC, we’ve codified the main Couchbase cluster lifecycle operations into a sidecar container that sits alongside a standard Couchbase container in a pod. The sidecar uses the APIs of OpenShift/Kubernetes and Couchbase to determine cluster state, and it will safely and appropriately respond to Couchbase cluster events, such as scale-up/down and node failure.

For instance, the sidecar can respond to the following events:

  • Scale-up: the sidecar determines the node is new to the cluster and it is initialized and joined to the cluster. This prompts a rebalance.
  • Scale-down: a preStop hook (pre-container shutdown) is executed and the sidecar safely removes the node from the cluster, rebalancing as necessary.
  • Readiness: the sidecar connects to the local Couchbase container and determines its health. The result of the readiness check is used to determine service availability in OpenShift.
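
The hooks themselves are standard pod-spec features. A generic sketch (not the actual Couchbase sidecar configuration) of how a preStop hook and a readiness probe are declared on a container:

kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: lifecycle-demo
spec:
  containers:
  - name: app
    image: nginx
    lifecycle:
      preStop:                  # runs before the container is stopped, e.g. on scale-down
        exec:
          command: ["/bin/sh", "-c", "echo draining; sleep 5"]
    readinessProbe:             # its result decides whether the pod receives service traffic
      httpGet:
        path: /
        port: 80
      initialDelaySeconds: 5
      periodSeconds: 10
EOF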

Experiment with the open source sidecar

The proof-of-concept sidecar has now been open sourced at https://github.com/jetstack-experimental/couchbase-sidecar. At this repository, find instructions on how to get started with OpenShift (and Kubernetes too with a Helm chart). Feedback and contributions are welcome, but please note that this is strictly a proof-of-concept and should not be used in production. We look forward to future versions, in which the sidecar will be improved and extended, and battle-tested at scale, in a journey to a production release. Let us know what you think!


A Journey from Cattle to Kubernetes!


The past few years I have been developing and enhancing Cattle, which is the default container orchestration and scheduling framework for Rancher 1.6.

Cattle is used extensively by Rancher users to create and manage applications based on Docker containers. One of the key reasons for its extensive adoption is its compatibility with standard Docker Compose syntax.

With the release of Rancher 2.0, we shifted from Cattle as the base orchestration platform to Kubernetes. Kubernetes introduces its own terminologies and yaml specs for deploying services and pods that differs from the Docker Compose syntax.

I must say it really is a big learning curve for Cattle developers like me and our users to find ways to migrate apps to the Kubernetes-based 2.0 platform.

In this blog series, we will explore how various features supported using Cattle in Rancher 1.6 can be mapped to their Kubernetes equivalents in Rancher 2.0.

Who Moved My Stack? 🙂

In Rancher 1.6, you could easily deploy services running Docker images in one of two ways: using either the Rancher UI or the Rancher Compose Tool, which extends the popular Docker Compose.

With Rancher 2.0, we’ve introduced new grouping boundaries and terminologies to align with Kubernetes. So what happens to your Cattle-based environments and stacks in a 2.0 environment? How can a Cattle user transition their stacks and services to Rancher 2.0?

To solve this problem, let’s identify parallels between the two versions.

Some of the key terms around application deployment in 1.6 are:

  • Container: The smallest deployment unit. Containers are a lightweight, stand-alone, executable package of software that includes everything required to run it. (https://www.docker.com/what-container)
  • Service: A group of one or more containers running an identical Docker image.
  • Stack: Services that belong to an application can be grouped together under a stack, which bundles your applications into logical groups.
  • Compose config: Rancher allows users to view/export config files for the entire stack. These files, named docker_compose.yml and rancher_compose.yml, include all services and can be used to replicate the same application stack from a different Rancher setup.

Equivalent key terms for Rancher 2.0 are below. You can find more information about them in the Rancher 2.0 Documentation.

  • Pod: In Kubernetes, a pod is the smallest unit of deployment. A pod consists of one or more containers running a specific image. Pods are roughly equivalent to containers in 1.6. An application service consists of one or more running pods. If a Rancher 1.6 service has sidekicks, the pod equivalent would have more than one container: one container launched per sidekick.
  • Workload: The term service used in 1.6 maps to the term workload in 2.0. A workload object defines the specs and deployment rules for a set of pods that comprise the application. However, unlike services in 1.6, workloads are divided into different categories. The workload category most similar to a stateless service from 1.6 is the deployment category.
  • Namespace: The term stack from 1.6 maps to the Kubernetes concept of a namespace in 2.0. After launching a Kubernetes cluster in Rancher 2.0, workloads are deployed to the default namespace, unless you explicitly define a namespace yourself. This functionality is similar to the default stack in 1.6.
  • Kubernetes YAML: This file type is similar to a Docker Compose file. It specifies Kubernetes objects in YAML format. Just as the Docker Compose tool can digest Compose files to deploy specific container services, kubectl is the cli tool that processes Kubernetes YAML as input, which is then used to provision Kubernetes objects. For more information, see the Kubernetes Documentation.

How Do I Move a Simple Application from Rancher 1.6 to 2.0?

After learning the parallels between Cattle and Kubernetes, I began investigating options for transitioning a simple application from Rancher 1.6 to 2.0.

For this exercise, I used the LetsChat app, which is formed from a couple of services. I deployed these services to a stack in 1.6 using Cattle. Here is the docker-compose.yml file for the services in my stack:

docker-compose.yml for the LetsChat stack
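
Since the original screenshot of that file isn't reproduced here, below is a minimal sketch of what such a Compose file typically looks like; the image names, tag and ports are assumptions for illustration, not the exact file from the original stack:

cat > docker-compose.yml <<'EOF'
version: '2'
services:
  chat:
    image: sdelements/lets-chat   # assumed LetsChat image
    ports:
      - "9890:8080"               # public port 9890 -> container port 8080
    links:
      - mongo                     # lets the chat service reach MongoDB by name
  mongo:
    image: mongo:3                # assumed MongoDB image and tag
EOF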

Along with provisioning the service containers, Cattle facilitates service discovery between the services in my stack. This service discovery allows the LetsChat service to talk to the Mongo service.

Is provisioning and configuring service discovery in Rancher 2.0 as easy as it was in 1.6?

A Cluster and Project on Rancher 2.0

First, I needed to create a Rancher 2.0 Kubernetes cluster. You can find instructions for this process in our Quick Start Guide.

In Rancher 1.6, I’m used to deploying my stacks within a Cattle Environment that has some compute resources assigned.

After inspecting the UI in Rancher 2.0, I recognized that workloads are deployed in a project within the Kubernetes Cluster that I created. It seems that a 2.0 Cluster and a Project together are equivalent to a Cattle environment from 1.6!

Rancher 2.0 cluster and project view

However, there are some important differences to note:

  • In 1.6, Cattle environments have a set of compute nodes assigned to them, and the Rancher Server is the global control plane backed by mysql DB, which provides storage for each environment. In 2.0, each Kubernetes cluster has its own set of compute nodes, nodes running the cluster control plane, and nodes running etcd for storage.
  • In 1.6, all Cattle environment users could access any host in the environment. In Rancher 2.0, this access model has changed. You can now restrict users to specific projects. This model allows for multi-tenancy since hosts are owned by the cluster, and the cluster can be further divided into multiple projects where users can manage their apps.

Deploying Workloads from Rancher 2.0 UI

With my new Kubernetes cluster in place, I was set to launch my applications the 2.0 way!

I navigated to the Default project under my cluster. From the Workloads tab, I launched a deployment for the LetsChat and Mongo Docker images.

Deploying the LetsChat and Mongo workloads from the Workloads tab

For my LetsChat deployment, I exposed container port 8080 by selecting the HostPort option for port mapping. Then I entered my public port 9890 as the listening port.

I selected HostPort because Kubernetes exposes the specified port for each host that the workload (and its pods) are deployed to. This behavior is similar to exposing a public port on Cattle.

While Rancher provisioned the deployments, I monitored the status from the Workloads view. I could drill down to the deployed Kubernetes pods and monitor the logs. This experience was very similar to launching services using Cattle and drilling down to the service containers!

Workloads view with pod logs
Once the workloads were provisioned, Rancher provided a convenient link to the public endpoint of my LetsChat app. Upon clicking the link, voilà!

The LetsChat application served at the public endpoint

Docker Compose to Kubernetes Yaml

If you’re migrating multiple application stacks from Rancher 1.6 to 2.0, manually migrating by UI is not ideal. Instead, use a Docker Compose config file to speed things up.

If you are a Rancher 1.6 user, you’re probably familiar with launching services by calling a Compose file from Rancher CLI. Similarly, Rancher 2.0 provides a CLI to launch the Kubernetes resources.

So our next step is to convert our docker-compose.yml file to Kubernetes YAML specs and use the CLI.

Converting my Compose file to the Kubernetes YAML specs manually didn’t inspire confidence. I’m unfamiliar with Kubernetes YAML, and it’s confusing compared to the simplicity of Docker Compose. A quick Google search led me to this conversion tool—Kompose.

Converting docker-compose.yml with Kompose

Kompose generated two files per service in the docker-compose.yml:

  • a deployment YAML
  • a service YAML

Why is a separate service spec required?

A Kubernetes service is a REST object that abstracts access to the pods in the workload. A service provides a static endpoint to the pods. Therefore, even if the pods change IP address, the public endpoint remains unchanged. A service object points to its corresponding deployment (workload) by using selector labels.

When a service in Docker Compose exposes public ports, Kompose translates that to a service YAML spec for Kubernetes, along with a deployment YAML spec.

Let’s see how the Compose and Kubernetes YAML specs compare:

Side-by-side comparison of docker-compose.yml and chat-deployment.yaml

As highlighted above, everything under the chat service in docker-compose.yml is mapped to spec.containers in the Kubernetes chat-deployment.yaml file.

  • The service name in docker-compose.yml is placed under spec.containers.name
  • image in docker-compose.yml maps to spec.containers.image
  • ports in docker-compose.yml maps to spec.containers.ports.containerPort
  • Any Labels present in docker-compose.yml are placed as metadata.annotations

Note that the separate chat-service.yaml file contains the public port mapping of the deployment, and it points to the deployment using a selector (io.kompose.service: chat), which is a label on the chat-deployment object.
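
To make the mapping concrete, here is a trimmed, hand-written sketch of roughly what Kompose generates for the chat service (normally produced by kompose convert; field values are illustrative, not the exact generated files):

cat > chat-deployment.yaml <<'EOF'
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: chat
  labels:
    io.kompose.service: chat
spec:
  replicas: 1
  template:
    metadata:
      labels:
        io.kompose.service: chat
    spec:
      containers:
      - name: chat
        image: sdelements/lets-chat   # assumed image, matching the Compose sketch above
        ports:
        - containerPort: 8080
EOF

cat > chat-service.yaml <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: chat
  labels:
    io.kompose.service: chat
spec:
  ports:
  - port: 9890
    targetPort: 8080
  selector:
    io.kompose.service: chat          # selects the pods created by chat-deployment.yaml
EOF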

To deploy these files to my cluster namespace, I downloaded and configured the Rancher CLI tool.

The workloads launched fine, but…

The chat workload without a public endpoint

There was no public endpoint for the chat workload. After some troubleshooting, I noticed that the file generated by Kompose was missing the HostPort spec in chat-deployment.yaml! I manually added the missing spec and re-imported the YAML to publicly expose the LetsChat workload.
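
If you hit the same issue, the missing field can also be added from the CLI with a JSON patch instead of re-importing the edited file; a sketch, assuming the deployment is named chat and the port is the first one in the container spec:

kubectl -n default patch deployment chat --type='json' \
  -p='[{"op": "add", "path": "/spec/template/spec/containers/0/ports/0/hostPort", "value": 9890}]'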

chat-deployment.yaml with the HostPort spec added

Troubleshooting successful! I could access the application at Host-IP:HostPort.

The LetsChat application reachable at Host-IP:HostPort

Finished

There you have it! Rancher users can successfully port their application stacks from 1.6 to 2.0 using either the UI or Compose-to-Kubernetes YAML conversion.

Although the complexity of Kubernetes is still apparent, with the help of Rancher 2.0, I found the provisioning flow just as simple and intuitive as Cattle.

This article looked at the bare minimum flow of transitioning simple services from Cattle to Rancher 2.0. However, there are more challenges you’ll face when migrating to Rancher 2.0: you’ll need to understand the changes in Rancher 2.0 around scheduling, load balancing, service discovery, and service monitoring. Let’s dig deeper in upcoming articles!

In the next article, we will explore various options for exposing a workload publicly via port mapping options on Kubernetes.

Prachi Damle


Principal Software Engineer


The History of Kubernetes, the Community Behind It

Authors: Brendan Burns (Distinguished Engineer, Microsoft)

oscon award

It is remarkable to me to return to Portland and OSCON to stand on stage with members of the Kubernetes community and accept this award for Most Impactful Open Source Project. It was scarcely three years ago, that on this very same stage we declared Kubernetes 1.0 and the project was added to the newly formed Cloud Native Computing Foundation.

To think about how far we have come in that short period of time and to see the ways in which this project has shaped the cloud computing landscape is nothing short of amazing. The success is a testament to the power and contributions of this amazing open source community. And the daily passion and quality contributions of our endlessly engaged, world-wide community is nothing short of humbling.

Congratulations @kubernetesio for winning the “most impact” award at #OSCON I’m so proud to be a part of this amazing community! @CloudNativeFdn pic.twitter.com/5sRUYyefAK

— Jaice Singer DuMars (@jaydumars) July 19, 2018

👏 congrats @kubernetesio community on winning the #oscon Most Impact Award, we are proud of you! pic.twitter.com/5ezDphi6J6

— CNCF (@CloudNativeFdn) July 19, 2018

At a meetup in Portland this week, I had a chance to tell the story of Kubernetes’ past, its present and some thoughts about its future, so I thought I would write down some pieces of what I said for those of you who couldn’t be there in person.

It all began in the fall of 2013, with three of us: Craig McLuckie, Joe Beda and I were working on public cloud infrastructure. If you cast your mind back to the world of cloud in 2013, it was a vastly different place than it is today. Imperative bash scripts were only just starting to give way to declarative configuration of IaaS with systems. Netflix was popularizing the idea of immutable infrastructure but doing it with heavy-weight full VM images. The notion of orchestration, and certainly container orchestration existed in a few internet scale companies, but not in cloud and certainly not in the enterprise.

Docker changed all of that. By popularizing a lightweight container runtime and providing a simple way to package, distribute and deploy applications onto a machine, the Docker tooling and experience popularized a brand-new cloud native approach to application packaging and maintenance. Were it not for Docker’s shifting of the cloud developer’s perspective, Kubernetes simply would not exist.

I think that it was Joe who first suggested that we look at Docker in the summer of 2013, when Craig, Joe and I were all thinking about how we could bring a cloud native application experience to a broader audience. And for all three of us, the implications of this new tool were immediately obvious. We knew it was a critical component in the development of cloud native infrastructure.

But as we thought about it, it was equally obvious that Docker, with its focus on a single machine, was not the complete solution. While Docker was great at building and packaging individual containers and running them on individual machines, there was a clear need for an orchestrator that could deploy and manage large numbers of containers across a fleet of machines.

As we thought about it some more, it became increasingly obvious to Joe, Craig and me that not only was such an orchestrator necessary, it was also inevitable, and it was equally inevitable that this orchestrator would be open source. This realization crystallized for us in the late fall of 2013, and thus began the rapid development of first a prototype, and then the system that would eventually become known as Kubernetes. As 2013 turned into 2014 we were lucky to be joined by some incredibly talented developers including Ville Aikas, Tim Hockin, Dawn Chen, Brian Grant and Daniel Smith.

Happy to see k8s team members winning the “most impact” award. #oscon pic.twitter.com/D6mSIiDvsU

— Bridget Kromhout (@bridgetkromhout) July 19, 2018

Kubernetes won the O’Reilly Most Impact Award. Thanks to our contributors and users! pic.twitter.com/T6Co1wpsAh

— Brian Grant (@bgrant0607) July 19, 2018

The initial goal of this small team was to develop a “minimally viable orchestrator.” From experience we knew that the basic feature set for such an orchestrator was:

  • Replication to deploy multiple instances of an application
  • Load balancing and service discovery to route traffic to these replicated containers
  • Basic health checking and repair to ensure a self-healing system
  • Scheduling to group many machines into a single pool and distribute work to them

Along the way, we also spent a significant chunk of our time convincing executive leadership that open sourcing this project was a good idea. I’m endlessly grateful to Craig for writing numerous whitepapers and to Eric Brewer, for the early and vocal support that he lent us to ensure that Kubernetes could see the light of day.

In June of 2014 when Kubernetes was released to the world, the list above was the sum total of its basic feature set. As an early stage open source community, we then spent a year building, expanding, polishing and fixing this initial minimally viable orchestrator into the product that we released as a 1.0 in OSCON in 2015. We were very lucky to be joined early on by the very capable OpenShift team which lent significant engineering and real world enterprise expertise to the project. Without their perspective and contributions, I don’t think we would be standing here today.

Three years later, the Kubernetes community has grown exponentially, and Kubernetes has become synonymous with cloud native container orchestration. There are more than 1700 people who have contributed to Kubernetes, there are more than 500 Kubernetes meetups worldwide and more than 42000 users have joined the #kubernetes-dev channel. What’s more, the community that we have built works successfully across geographic, language and corporate boundaries. It is a truly open, engaged and collaborative community, and in and of itself an amazing achievement. Many thanks to everyone who has helped make it what it is today. Kubernetes is a commodity in the public cloud because of you.

But if Kubernetes is a commodity, then what is the future? Certainly, there are an endless array of tweaks, adjustments and improvements to the core codebase that will occupy us for years to come, but the true future of Kubernetes are the applications and experiences that are being built on top of this new, ubiquitous platform.

Kubernetes has dramatically reduced the complexity to build new developer experiences, and a myriad of new experiences have been developed or are in the works that provide simplified or targeted developer experiences like Functions-as-a-Service, on top of core Kubernetes-as-a-Service.

The Kubernetes cluster itself is being extended with custom resource definitions (which I first described to Kelsey Hightower on a walk from OSCON to a nearby restaurant in 2015); these new resources allow cluster operators to enable new plugin functionality that extends and enhances the APIs available to their users.

By embedding core functionality like logging and monitoring in the cluster itself and enabling developers to take advantage of such services simply by deploying their application into the cluster, Kubernetes has reduced the learning necessary for developers to build scalable reliable applications.

Finally, Kubernetes has provided a new, common vocabulary for expressing the patterns and paradigms of distributed system development. This common vocabulary means that we can more easily describe and discuss the common ways in which our distributed systems are built, and furthermore we can build standardized, re-usable implementations of such systems. The net effect of this is the development of higher quality, reliable distributed systems, more quickly.

It’s truly amazing to see how far Kubernetes has come, from a rough idea in the minds of three people in Seattle to a phenomenon that has redirected the way we think about cloud native development across the world. It has been an amazing journey, but what’s truly amazing to me, is that I think we’re only just now scratching the surface of the impact that Kubernetes will have. Thank you to everyone who has enabled us to get this far, and thanks to everyone who will take us further.

Brendan


Configuring Horizontal Pod Autoscaling on Running Services on Kubernetes.


Introduction

One of the nicer features of Kubernetes is the ability to configure autoscaling on your running services. Without autoscaling, it’s difficult to accommodate deployment scaling and meet SLAs. This feature is called the Horizontal Pod Autoscaler (HPA) on Kubernetes clusters.

Why use HPA

Using HPA, you can automatically scale your deployments up and down based on resource use and/or custom metrics, so that deployment scale matches the real-time load on your services.

HPA produces two direct improvements to your services:

  1. Use compute and memory resources when needed, releasing them if not required.
  2. Increase/decrease performance as needed to meet SLAs.

How HPA works

HPA automatically scales the number of pods (between a defined minimum and maximum number of pods) in a replication controller, deployment or replica set, based on observed CPU/memory utilization (resource metrics) or on custom metrics provided by a third-party metrics application like Prometheus, Datadog, etc. HPA is implemented as a control loop, with a period controlled by the Kubernetes controller manager --horizontal-pod-autoscaler-sync-period flag (default value 30s).

HPA schema

HPA definition

HPA is an API resource in the Kubernetes autoscaling API group. The current stable version is autoscaling/v1, which only includes support for CPU autoscaling. To get additional support for scaling on memory and custom metrics, use the beta version, autoscaling/v2beta1.

Read more info about the HPA API object.

HPA is supported in a standard way by kubectl. It can be created, managed and deleted using kubectl:

  • Creating HPA
    • With a manifest: kubectl create -f <HPA_MANIFEST>
    • Without a manifest (CPU support only): kubectl autoscale deployment hello-world --min=2 --max=5 --cpu-percent=50
  • Getting HPA info
    • Basic: kubectl get hpa hello-world
    • Detailed description: kubectl describe hpa hello-world
  • Deleting HPA
    • kubectl delete hpa hello-world

Here’s an example HPA manifest definition:

apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: hello-world
spec:
  scaleTargetRef:
    apiVersion: extensions/v1beta1
    kind: Deployment
    name: hello-world
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      targetAverageUtilization: 50
  - type: Resource
    resource:
      name: memory
      targetAverageValue: 100Mi

  • Using the autoscaling/v2beta1 version in order to use CPU and memory metrics
  • Controlling autoscaling of the hello-world deployment
  • Defined minimum number of replicas of 1
  • Defined maximum number of replicas of 10
  • Scaling up when:
    • CPU use is more than 50%
    • memory use is more than 100Mi
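
Once such an HPA is in place, you can watch it react by putting load on the target. A sketch, assuming the hello-world deployment is reachable inside the cluster through a ClusterIP service named hello-world:

# In one terminal, watch the autoscaler's current/target metrics and replica count
kubectl get hpa hello-world -w

# In another, generate continuous load from a throwaway busybox pod
kubectl run -it --rm load-generator --image=busybox --restart=Never -- \
  /bin/sh -c "while true; do wget -q -O- http://hello-world > /dev/null; done"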

Installation

Before HPA can be used in your Kubernetes cluster, some elements have to be installed and configured in your system.

Requirements

Be sure that your Kubernetes cluster services are running with at least these flags:
- kube-api: requestheader-client-ca-file
- kubelet: read-only-port at 10255
- kube-controller: optional, only needed if values different from the defaults are required:
  - horizontal-pod-autoscaler-downscale-delay: "5m0s"
  - horizontal-pod-autoscaler-upscale-delay: "3m0s"
  - horizontal-pod-autoscaler-sync-period: "30s"

For an RKE Kubernetes cluster definition, be sure to add these lines in the services section. To do it in the Rancher v2.0.x UI, open "Cluster options" – "Edit as YAML" and add these definitions:

services:
  kube-api:
    extra_args:
      requestheader-client-ca-file: "/etc/kubernetes/ssl/kube-ca.pem"
  kube-controller:
    extra_args:
      horizontal-pod-autoscaler-downscale-delay: "5m0s"
      horizontal-pod-autoscaler-upscale-delay: "1m0s"
      horizontal-pod-autoscaler-sync-period: "30s"
  kubelet:
    extra_args:
      read-only-port: 10255

In order to deploy metrics services, you must have your Kubernetes cluster configured and deployed properly.

Note: For deploy and test examples, Rancher v2.0.6 and k8s v1.10.1 cluster are being used.

Resource metrics

For HPA to use resource metrics, the metrics-server package needs to be installed in the kube-system namespace of the Kubernetes cluster.

To accomplish this, follow these steps:

  1. Configure kubectl to connect to the proper Kubernetes cluster.
  2. Clone the GitHub metrics-server repo: git clone https://github.com/kubernetes-incubator/metrics-server
  3. Install the metrics-server package (assuming Kubernetes is at least version 1.8): kubectl create -f metrics-server/deploy/1.8+/
  4. Check that metrics-server is running properly. Check the service pod and logs in the kube-system namespace:

    # kubectl get pods -n kube-system
    NAME READY STATUS RESTARTS AGE

    metrics-server-6fbfb84cdd-t2fk9 1/1 Running 0 8h

    # kubectl -n kube-system logs metrics-server-6fbfb84cdd-t2fk9
    I0723 08:09:56.193136 1 heapster.go:71] /metrics-server --source=kubernetes.summary_api:''
    I0723 08:09:56.193574 1 heapster.go:72] Metrics Server version v0.2.1
    I0723 08:09:56.194480 1 configs.go:61] Using Kubernetes client with master "https://10.43.0.1:443" and version
    I0723 08:09:56.194501 1 configs.go:62] Using kubelet port 10255
    I0723 08:09:56.198612 1 heapster.go:128] Starting with Metric Sink
    I0723 08:09:56.780114 1 serving.go:308] Generated self-signed cert (apiserver.local.config/certificates/apiserver.crt, apiserver.local.config/certificates/apiserver.key)
    I0723 08:09:57.391518 1 heapster.go:101] Starting Heapster API server…
    [restful] 2018/07/23 08:09:57 log.go:33: [restful/swagger] listing is available at https:///swaggerapi
    [restful] 2018/07/23 08:09:57 log.go:33: [restful/swagger] https:///swaggerui/ is mapped to folder /swagger-ui/
    I0723 08:09:57.394080 1 serve.go:85] Serving securely on 0.0.0.0:443

  5. Check that the metrics API is accessible from kubectl:
    • If you are accessing the Kubernetes cluster directly, use the server URL from the kubectl config, such as https://<K8s_URL>:6443

      # kubectl get --raw /apis/metrics.k8s.io/v1beta1
      {"kind":"APIResourceList","apiVersion":"v1","groupVersion":"metrics.k8s.io/v1beta1","resources":[{"name":"nodes","singularName":"","namespaced":false,"kind":"NodeMetrics","verbs":["get","list"]},{"name":"pods","singularName":"","namespaced":true,"kind":"PodMetrics","verbs":["get","list"]}]}

    • If you are accessing the Kubernetes cluster through Rancher, the server URL in the kubectl config looks like https://<RANCHER_URL>/k8s/clusters/<CLUSTER_ID>. You also need to add the prefix /k8s/clusters/<CLUSTER_ID> to the API path.

      # kubectl get --raw /k8s/clusters/<CLUSTER_ID>/apis/metrics.k8s.io/v1beta1
      {"kind":"APIResourceList","apiVersion":"v1","groupVersion":"metrics.k8s.io/v1beta1","resources":[{"name":"nodes","singularName":"","namespaced":false,"kind":"NodeMetrics","verbs":["get","list"]},{"name":"pods","singularName":"","namespaced":true,"kind":"PodMetrics","verbs":["get","list"]}]}

Custom Metrics (Prometheus)

Custom metrics can be provided by many third-party applications as a source. We are going to use Prometheus for our demonstration, assuming that Prometheus is deployed on your Kubernetes cluster and collecting proper metrics from pods, nodes, namespaces, and so on. We’ll use the Prometheus URL http://prometheus.mycompany.io, exposed on port 80.

Prometheus is available for deployment in the Rancher v2.0 catalog. Deploy it from the Rancher catalog if it isn’t already running on your Kubernetes cluster.

For HPA to use custom metrics from Prometheus, the k8s-prometheus-adapter package is needed in the kube-system namespace of the Kubernetes cluster. To facilitate the k8s-prometheus-adapter installation, we are going to use the Helm chart available at banzai-charts.

To use this chart, follow these steps:

  1. Initialize Helm on the k8s cluster:

    kubectl -n kube-system create serviceaccount tiller
    kubectl create clusterrolebinding tiller --clusterrole cluster-admin --serviceaccount=kube-system:tiller
    helm init --service-account tiller

  2. Clone the GitHub banzai-charts repo:

    git clone https://github.com/banzaicloud/banzai-charts

  3. Install the prometheus-adapter chart, specifying the Prometheus URL and port:

    helm install --name prometheus-adapter banzai-charts/prometheus-adapter --set prometheus.url="http://prometheus.mycompany.io",prometheus.port="80" --namespace kube-system

  4. Check that prometheus-adapter is running properly. Check the service pod and logs in the kube-system namespace:

    # kubectl get pods -n kube-system
    NAME READY STATUS RESTARTS AGE

    prometheus-adapter-prometheus-adapter-568674d97f-hbzfx 1/1 Running 0 7h

    # kubectl logs prometheus-adapter-prometheus-adapter-568674d97f-hbzfx -n kube-system

    I0724 10:18:45.696679 1 round_trippers.go:436] GET https://10.43.0.1:443/api/v1/namespaces/default/pods?labelSelector=app%3Dhello-world 200 OK in 2 milliseconds
    I0724 10:18:45.696695 1 round_trippers.go:442] Response Headers:
    I0724 10:18:45.696699 1 round_trippers.go:445] Date: Tue, 24 Jul 2018 10:18:45 GMT
    I0724 10:18:45.696703 1 round_trippers.go:445] Content-Type: application/json
    I0724 10:18:45.696706 1 round_trippers.go:445] Content-Length: 2581
    I0724 10:18:45.696766 1 request.go:836] Response Body: {“kind”:”PodList”,”apiVersion”:”v1″,”metadata”:{“selfLink”:”/api/v1/namespaces/default/pods”,”resourceVersion”:”6237″},”items”:[{“metadata”:{“name”:”hello-world-54764dfbf8-q6l82″,”generateName”:”hello-world-54764dfbf8-“,”namespace”:”default”,”selfLink”:”/api/v1/namespaces/default/pods/hello-world-54764dfbf8-q6l82″,”uid”:”484cb929-8f29-11e8-99d2-067cac34e79c”,”resourceVersion”:”4066″,”creationTimestamp”:”2018-07-24T10:06:50Z”,”labels”:{“app”:”hello-world”,”pod-template-hash”:”1032089694″},”annotations”:{“cni.projectcalico.org/podIP”:”10.42.0.7/32″},”ownerReferences”:[{“apiVersion”:”extensions/v1beta1″,”kind”:”ReplicaSet”,”name”:”hello-world-54764dfbf8″,”uid”:”4849b9b1-8f29-11e8-99d2-067cac34e79c”,”controller”:true,”blockOwnerDeletion”:true}]},”spec”:{“volumes”:[{“name”:”default-token-ncvts”,”secret”:{“secretName”:”default-token-ncvts”,”defaultMode”:420}}],”containers”:[{“name”:”hello-world”,”image”:”rancher/hello-world”,”ports”:[{“containerPort”:80,”protocol”:”TCP”}],”resources”:{“requests”:{“cpu”:”500m”,”memory”:”64Mi”}},”volumeMounts”:[{“name”:”default-token-ncvts”,”readOnly”:true,”mountPath”:”/var/run/secrets/kubernetes.io/serviceaccount”}],”terminationMessagePath”:”/dev/termination-log”,”terminationMessagePolicy”:”File”,”imagePullPolicy”:”Always”}],”restartPolicy”:”Always”,”terminationGracePeriodSeconds”:30,”dnsPolicy”:”ClusterFirst”,”serviceAccountName”:”default”,”serviceAccount”:”default”,”nodeName”:”34.220.18.140″,”securityContext”:{},”schedulerName”:”default-scheduler”,”tolerations”:[{“key”:”node.kubernetes.io/not-ready”,”operator”:”Exists”,”effect”:”NoExecute”,”tolerationSeconds”:300},{“key”:”node.kubernetes.io/unreachable”,”operator”:”Exists”,”effect”:”NoExecute”,”tolerationSeconds”:300}]},”status”:{“phase”:”Running”,”conditions”:[{“type”:”Initialized”,”status”:”True”,”lastProbeTime”:null,”lastTransitionTime”:”2018-07-24T10:06:50Z”},{“type”:”Ready”,”status”:”True”,”lastProbeTime”:null,”lastTransitionTime”:”2018-07-24T10:06:54Z”},{“type”:”PodScheduled”,”status”:”True”,”lastProbeTime”:null,”lastTransitionTime”:”2018-07-24T10:06:50Z”}],”hostIP”:”34.220.18.140″,”podIP”:”10.42.0.7″,”startTime”:”2018-07-24T10:06:50Z”,”containerStatuses”:[{“name”:”hello-world”,”state”:{“running”:{“startedAt”:”2018-07-24T10:06:54Z”}},”lastState”:{},”ready”:true,”restartCount”:0,”image”:”rancher/hello-world:latest”,”imageID”:”docker-pullable://rancher/[email protected]:4b1559cb4b57ca36fa2b313a3c7dde774801aa3a2047930d94e11a45168bc053″,”containerID”:”docker://cce4df5fc0408f03d4adf82c90de222f64c302bf7a04be1c82d584ec31530773″}],”qosClass”:”Burstable”}}]}
    I0724 10:18:45.699525 1 api.go:74] GET http://prometheus-server.prometheus.34.220.18.140.xip.io/api/v1/query?query=sum%28rate%28container_fs_read_seconds_total%7Bpod_name%3D%22hello-world-54764dfbf8-q6l82%22%2Ccontainer_name%21%3D%22POD%22%2Cnamespace%3D%22default%22%7D%5B5m%5D%29%29+by+%28pod_name%29&time=1532427525.697 200 OK
    I0724 10:18:45.699620 1 api.go:93] Response Body: {“status”:”success”,”data”:{“resultType”:”vector”,”result”:[{“metric”:{“pod_name”:”hello-world-54764dfbf8-q6l82″},”value”:[1532427525.697,”0″]}]}}
    I0724 10:18:45.699939 1 wrap.go:42] GET /apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pods/%2A/fs_read?labelSelector=app%3Dhello-world: (12.431262ms) 200 [[kube-controller-manager/v1.10.1 (linux/amd64) kubernetes/d4ab475/system:serviceaccount:kube-system:horizontal-pod-autoscaler] 10.42.0.0:24268]
    I0724 10:18:51.727845 1 request.go:836] Request Body: {“kind”:”SubjectAccessReview”,”apiVersion”:”authorization.k8s.io/v1beta1″,”metadata”:{“creationTimestamp”:null},”spec”:{“nonResourceAttributes”:{“path”:”/”,”verb”:”get”},”user”:”system:anonymous”,”group”:[“system:unauthenticated”]},”status”:{“allowed”:false}}

  5. Check that the metrics API is accessible from kubectl:
    • Accessing the Kubernetes cluster directly, with the server URL in your kubectl config pointing at the cluster itself, such as https://<K8s_URL>:6443

      # kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1
      {“kind”:”APIResourceList”,”apiVersion”:”v1″,”groupVersion”:”custom.metrics.k8s.io/v1beta1″,”resources”:[{“name”:”pods/fs_usage_bytes”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/memory_rss”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/spec_cpu_period”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/cpu_cfs_throttled”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/fs_io_time”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/fs_read”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/fs_sector_writes”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/cpu_user”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/last_seen”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/tasks_state”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/spec_cpu_quota”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/start_time_seconds”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/fs_limit_bytes”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/fs_write”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/memory_cache”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/memory_usage_bytes”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/cpu_cfs_periods”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/cpu_cfs_throttled_periods”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/fs_reads_merged”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/memory_working_set_bytes”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/network_udp_usage”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/fs_inodes_free”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/fs_inodes”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/fs_io_time_weighted”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/memory_failures”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/memory_swap”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/spec_cpu_shares”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/spec_memory_swap_limit_bytes”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/cpu_usage”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/fs_io_current”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/fs_writes”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/memory_failcnt”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”
verbs”:[“get”]},{“name”:”pods/fs_reads”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/fs_writes_bytes”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/fs_writes_merged”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/network_tcp_usage”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/memory_max_usage_bytes”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/spec_memory_limit_bytes”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/spec_memory_reservation_limit_bytes”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/cpu_load_average_10s”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/cpu_system”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/fs_reads_bytes”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/fs_sector_reads”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]}]}

  • Accessing the Kubernetes cluster through Rancher, with the server URL in your kubectl config like https://<RANCHER_URL>/k8s/clusters/<CLUSTER_ID> (note that you need to add the /k8s/clusters/<CLUSTER_ID> prefix to raw API paths)

# kubectl get --raw /k8s/clusters/<CLUSTER_ID>/apis/custom.metrics.k8s.io/v1beta1
{“kind”:”APIResourceList”,”apiVersion”:”v1″,”groupVersion”:”custom.metrics.k8s.io/v1beta1″,”resources”:[{“name”:”pods/fs_usage_bytes”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/memory_rss”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/spec_cpu_period”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/cpu_cfs_throttled”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/fs_io_time”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/fs_read”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/fs_sector_writes”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/cpu_user”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/last_seen”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/tasks_state”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/spec_cpu_quota”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/start_time_seconds”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/fs_limit_bytes”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/fs_write”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/memory_cache”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/memory_usage_bytes”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/cpu_cfs_periods”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/cpu_cfs_throttled_periods”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/fs_reads_merged”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/memory_working_set_bytes”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/network_udp_usage”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/fs_inodes_free”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/fs_inodes”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/fs_io_time_weighted”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/memory_failures”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/memory_swap”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/spec_cpu_shares”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/spec_memory_swap_limit_bytes”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/cpu_usage”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/fs_io_current”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/fs_writes”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/memory_failcnt”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”
:[“get”]},{“name”:”pods/fs_reads”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/fs_writes_bytes”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/fs_writes_merged”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/network_tcp_usage”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/memory_max_usage_bytes”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/spec_memory_limit_bytes”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/spec_memory_reservation_limit_bytes”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/cpu_load_average_10s”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/cpu_system”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/fs_reads_bytes”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]},{“name”:”pods/fs_sector_reads”,”singularName”:””,”namespaced”:true,”kind”:”MetricValueList”,”verbs”:[“get”]}]}
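
As an additional check (not part of the original write-up), you can query a single custom metric directly; the path below follows the pattern visible in the custom-metrics adapter logs above, and the /k8s/clusters/<CLUSTER_ID> prefix is needed when going through Rancher:

# kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pods/*/cpu_system?labelSelector=app%3Dhello-world"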

ClusterRole and ClusterRoleBinding

By default, HPA reads metrics (resource and custom) as the user system:anonymous. To grant that user read access to the metrics, you need to define view-resource-metrics and view-custom-metrics ClusterRoles and bind them to system:anonymous with ClusterRoleBindings.

To accomplish this, follow these steps:

  1. Configure kubectl to connect to the proper k8s cluster.
  2. Copy the ClusterRole and ClusterRoleBinding manifests for:
    • resource metrics: apiGroup metrics.k8s.io
```
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: view-resource-metrics
rules:
- apiGroups:
  - metrics.k8s.io
  resources:
  - pods
  - nodes
  verbs:
  - get
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: view-resource-metrics
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: view-resource-metrics
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: User
  name: system:anonymous
```
    • custom metrics: ApiGroups custom.metrics.k8s.io

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: view-custom-metrics
rules:
- apiGroups:
  - custom.metrics.k8s.io
  resources:
  - "*"
  verbs:
  - get
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: view-custom-metrics
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: view-custom-metrics
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: User
  name: system:anonymous

  3. Create them in your Kubernetes cluster (the custom metrics manifest is only needed if you want to use custom metrics):

    # kubectl create -f <RESOURCE_METRICS_MANIFEST>
    # kubectl create -f <CUSTOM_METRICS_MANIFEST>
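
    As an optional sanity check (not part of the original steps), you can ask the API server whether the anonymous user is now allowed to read the metrics APIs:

    # kubectl auth can-i get pods.metrics.k8s.io --as="system:anonymous"
    # kubectl auth can-i get pods.custom.metrics.k8s.io --subresource=cpu_system --as="system:anonymous"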

Service deployment

For HPA to work properly, service deployments must define resource requests for their containers.

Let's walk through a hello-world example to test whether HPA is working properly.

To do this, follow these steps:
1. Configure kubectl to connect to the proper k8s cluster.
2. Copy the hello-world deployment and service manifest.

```
apiVersion: apps/v1beta2
kind: Deployment
metadata:
  labels:
    app: hello-world
  name: hello-world
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: hello-world
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: hello-world
    spec:
      containers:
      - image: rancher/hello-world
        imagePullPolicy: Always
        name: hello-world
        resources:
          requests:
            cpu: 500m
            memory: 64Mi
        ports:
        - containerPort: 80
          protocol: TCP
      restartPolicy: Always
---
apiVersion: v1
kind: Service
metadata:
  name: hello-world
  namespace: default
spec:
  ports:
  - port: 80
    protocol: TCP
    targetPort: 80
  selector:
    app: hello-world
```

  3. Deploy it to the k8s cluster:

    # kubectl create -f <HELLO_WORLD_MANIFEST>

  4. Copy the HPA manifest for resource or custom metrics:
    • resource metrics
```
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: hello-world
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: extensions/v1beta1
    kind: Deployment
    name: hello-world
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      targetAverageUtilization: 50
  - type: Resource
    resource:
      name: memory
      targetAverageValue: 1000Mi
```
    • custom metrics (same as resource but adding custom cpu_system metric)

apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: hello-world
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: extensions/v1beta1
    kind: Deployment
    name: hello-world
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      targetAverageUtilization: 50
  - type: Resource
    resource:
      name: memory
      targetAverageValue: 100Mi
  - type: Pods
    pods:
      metricName: cpu_system
      targetAverageValue: 20m
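
Whichever variant you choose, create the HPA in the cluster the same way as the other manifests (the filename here is just a placeholder):

    # kubectl create -f <HPA_MANIFEST>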

  5. Get the HPA info and description, and check that resource metrics data are shown:
    • resource metrics

    # kubectl get hpa
    NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
    hello-world Deployment/hello-world 1253376 / 100Mi, 0% / 50% 1 10 1 6m
    # kubectl describe hpa
    Name: hello-world
    Namespace: default
    Labels: <none>
    Annotations: <none>
    CreationTimestamp: Mon, 23 Jul 2018 20:21:16 +0200
    Reference: Deployment/hello-world
    Metrics: ( current / target )
    resource memory on pods: 1253376 / 100Mi
    resource cpu on pods (as a percentage of request): 0% (0) / 50%
    Min replicas: 1
    Max replicas: 10
    Conditions:
    Type Status Reason Message
    —- —— —— ——-
    AbleToScale True ReadyForNewScale the last scale time was sufficiently old as to warrant a new scale
    ScalingActive True ValidMetricFound the HPA was able to successfully calculate a replica count from memory resource
    ScalingLimited False DesiredWithinRange the desired count is within the acceptable range
    Events: <none>

    • custom metrics

    kubectl describe hpa
    Name: hello-world
    Namespace: default
    Labels: <none>
    Annotations: <none>
    CreationTimestamp: Tue, 24 Jul 2018 18:36:28 +0200
    Reference: Deployment/hello-world
    Metrics: ( current / target )
    resource memory on pods: 3514368 / 100Mi
    “cpu_system” on pods: 0 / 20m
    resource cpu on pods (as a percentage of request): 0% (0) / 50%
    Min replicas: 1
    Max replicas: 10
    Conditions:
    Type Status Reason Message
    —- —— —— ——-
    AbleToScale True ReadyForNewScale the last scale time was sufficiently old as to warrant a new scale
    ScalingActive True ValidMetricFound the HPA was able to successfully calculate a replica count from memory resource
    ScalingLimited False DesiredWithinRange the desired count is within the acceptable range
    Events: <none>

  6. Generate load against the service to test autoscaling up and down. Any tool can be used at this point, but we used https://github.com/rakyll/hey to generate HTTP requests to our hello-world service and observe whether autoscaling works properly; a sample invocation is sketched below.
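
    For reference, a minimal hey invocation might look like the following; the duration, concurrency and URL are illustrative and not taken from the original test:

    # hey -z 3m -c 50 http://<HELLO_WORLD_SERVICE_OR_INGRESS_URL>/
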
  7. Observe autoscaling up and down:
    • Resource metrics
    • Autoscale up to 2 pods when cpu usage goes above target:

      # kubectl describe hpa
      Name: hello-world
      Namespace: default
      Labels: <none>
      Annotations: <none>
      CreationTimestamp: Mon, 23 Jul 2018 22:22:04 +0200
      Reference: Deployment/hello-world
      Metrics: ( current / target )
      resource memory on pods: 10928128 / 100Mi
      resource cpu on pods (as a percentage of request): 56% (280m) / 50%
      Min replicas: 1
      Max replicas: 10
      Conditions:
      Type Status Reason Message
      —- —— —— ——-
      AbleToScale True SucceededRescale the HPA controller was able to update the target scale to 2
      ScalingActive True ValidMetricFound the HPA was able to successfully calculate a replica count from cpu resource utilization (percentage of request)
      ScalingLimited False DesiredWithinRange the desired count is within the acceptable range
      Events:
      Type Reason Age From Message
      —- —— —- —- ——-
      Normal SuccessfulRescale 13s horizontal-pod-autoscaler New size: 2; reason: cpu resource utilization (percentage of request) above target

# kubectl get pods
NAME READY STATUS RESTARTS AGE
hello-world-54764dfbf8-k8ph2 1/1 Running 0 1m
hello-world-54764dfbf8-q6l4v 1/1 Running 0 3h

Autoscale up to 3 pods when cpu usage stays above target, scaling again after every horizontal-pod-autoscaler-upscale-delay (3 minutes by default):

# kubectl describe hpa
Name: hello-world
Namespace: default
Labels: <none>
Annotations: <none>
CreationTimestamp: Mon, 23 Jul 2018 22:22:04 +0200
Reference: Deployment/hello-world
Metrics: ( current / target )
resource memory on pods: 9424896 / 100Mi
resource cpu on pods (as a percentage of request): 66% (333m) / 50%
Min replicas: 1
Max replicas: 10
Conditions:
Type Status Reason Message
—- —— —— ——-
AbleToScale True SucceededRescale the HPA controller was able to update the target scale to 3
ScalingActive True ValidMetricFound the HPA was able to successfully calculate a replica count from cpu resource utilization (percentage of request)
ScalingLimited False DesiredWithinRange the desired count is within the acceptable range
Events:
Type Reason Age From Message
—- —— —- —- ——-
Normal SuccessfulRescale 4m horizontal-pod-autoscaler New size: 2; reason: cpu resource utilization (percentage of request) above target
Normal SuccessfulRescale 16s horizontal-pod-autoscaler New size: 3; reason: cpu resource utilization (percentage of request) above target

# kubectl get pods
NAME READY STATUS RESTARTS AGE
hello-world-54764dfbf8-f46kh 0/1 Running 0 1m
hello-world-54764dfbf8-k8ph2 1/1 Running 0 5m
hello-world-54764dfbf8-q6l4v 1/1 Running 0 3h

Autoscale down to 1 pod when all metrics are below target for horizontal-pod-autoscaler-downscale-delay (5 minutes by default):

kubectl describe hpa
Name: hello-world
Namespace: default
Labels: <none>
Annotations: <none>
CreationTimestamp: Mon, 23 Jul 2018 22:22:04 +0200
Reference: Deployment/hello-world
Metrics: ( current / target )
resource memory on pods: 10070016 / 100Mi
resource cpu on pods (as a percentage of request): 0% (0) / 50%
Min replicas: 1
Max replicas: 10
Conditions:
Type Status Reason Message
—- —— —— ——-
AbleToScale True SucceededRescale the HPA controller was able to update the target scale to 1
ScalingActive True ValidMetricFound the HPA was able to successfully calculate a replica count from memory resource
ScalingLimited False DesiredWithinRange the desired count is within the acceptable range
Events:
Type Reason Age From Message
—- —— —- —- ——-
Normal SuccessfulRescale 10m horizontal-pod-autoscaler New size: 2; reason: cpu resource utilization (percentage of request) above target
Normal SuccessfulRescale 6m horizontal-pod-autoscaler New size: 3; reason: cpu resource utilization (percentage of request) above target
Normal SuccessfulRescale 1s horizontal-pod-autoscaler New size: 1; reason: All metrics below target

kubectl get pods
NAME READY STATUS RESTARTS AGE
hello-world-54764dfbf8-q6l4v 1/1 Running 0 3h

  • custom metrics
    • Autoscale up to 2 pods when cpu usage goes above target:

kubectl describe hpa
Name: hello-world
Namespace: default
Labels: <none>
Annotations: <none>
CreationTimestamp: Tue, 24 Jul 2018 18:01:11 +0200
Reference: Deployment/hello-world
Metrics: ( current / target )
resource memory on pods: 8159232 / 100Mi
“cpu_system” on pods: 7m / 20m
resource cpu on pods (as a percentage of request): 64% (321m) / 50%
Min replicas: 1
Max replicas: 10
Conditions:
Type Status Reason Message
—- —— —— ——-
AbleToScale True SucceededRescale the HPA controller was able to update the target scale to 2
ScalingActive True ValidMetricFound the HPA was able to successfully calculate a replica count from cpu resource utilization (percentage of request)
ScalingLimited False DesiredWithinRange the desired count is within the acceptable range
Events:
Type Reason Age From Message
—- —— —- —- ——-
Normal SuccessfulRescale 16s horizontal-pod-autoscaler New size: 2; reason: cpu resource utilization (percentage of request) above target

kubectl get pods
NAME READY STATUS RESTARTS AGE
hello-world-54764dfbf8-5pfdr 1/1 Running 0 3s
hello-world-54764dfbf8-q6l82 1/1 Running 0 6h

Autoscale up to 3 pods when cpu_system usage goes above target:

kubectl describe hpa
Name: hello-world
Namespace: default
Labels: <none>
Annotations: <none>
CreationTimestamp: Tue, 24 Jul 2018 18:01:11 +0200
Reference: Deployment/hello-world
Metrics: ( current / target )
resource memory on pods: 8374272 / 100Mi
“cpu_system” on pods: 27m / 20m
resource cpu on pods (as a percentage of request): 71% (357m) / 50%
Min replicas: 1
Max replicas: 10
Conditions:
Type Status Reason Message
—- —— —— ——-
AbleToScale True SucceededRescale the HPA controller was able to update the target scale to 3
ScalingActive True ValidMetricFound the HPA was able to successfully calculate a replica count from cpu resource utilization (percentage of request)
ScalingLimited False DesiredWithinRange the desired count is within the acceptable range
Events:
Type Reason Age From Message
—- —— —- —- ——-
Normal SuccessfulRescale 3m horizontal-pod-autoscaler New size: 2; reason: cpu resource utilization (percentage of request) above target
Normal SuccessfulRescale 3s horizontal-pod-autoscaler New size: 3; reason: pods metric cpu_system above target

kubectl get pods
NAME READY STATUS RESTARTS AGE
hello-world-54764dfbf8-5pfdr 1/1 Running 0 3m
hello-world-54764dfbf8-m2hrl 1/1 Running 0 1s
hello-world-54764dfbf8-q6l82 1/1 Running 0 6h

Autoscale up to 4 pods when cpu usage stays above target, scaling again after every horizontal-pod-autoscaler-upscale-delay (3 minutes by default):

kubectl describe hpa
Name: hello-world
Namespace: default
Labels: <none>
Annotations: <none>
CreationTimestamp: Tue, 24 Jul 2018 18:01:11 +0200
Reference: Deployment/hello-world
Metrics: ( current / target )
resource memory on pods: 8374272 / 100Mi
“cpu_system” on pods: 27m / 20m
resource cpu on pods (as a percentage of request): 71% (357m) / 50%
Min replicas: 1
Max replicas: 10
Conditions:
Type Status Reason Message
—- —— —— ——-
AbleToScale True SucceededRescale the HPA controller was able to update the target scale to 3
ScalingActive True ValidMetricFound the HPA was able to successfully calculate a replica count from cpu resource utilization (percentage of request)
ScalingLimited False DesiredWithinRange the desired count is within the acceptable range
Events:
Type Reason Age From Message
—- —— —- —- ——-
Normal SuccessfulRescale 5m horizontal-pod-autoscaler New size: 2; reason: cpu resource utilization (percentage of request) above target
Normal SuccessfulRescale 3m horizontal-pod-autoscaler New size: 3; reason: pods metric cpu_system above target
Normal SuccessfulRescale 4s horizontal-pod-autoscaler New size: 4; reason: cpu resource utilization (percentage of request) above target

kubectl get pods
NAME READY STATUS RESTARTS AGE
hello-world-54764dfbf8-2p9xb 1/1 Running 0 5m
hello-world-54764dfbf8-5pfdr 1/1 Running 0 2m
hello-world-54764dfbf8-m2hrl 1/1 Running 0 1s
hello-world-54764dfbf8-q6l82 1/1 Running 0 6h

Autoscale down to 1 pod when all metrics are below target for horizontal-pod-autoscaler-downscale-delay (5 minutes by default):

kubectl describe hpa
Name: hello-world
Namespace: default
Labels: <none>
Annotations: <none>
CreationTimestamp: Tue, 24 Jul 2018 18:01:11 +0200
Reference: Deployment/hello-world
Metrics: ( current / target )
resource memory on pods: 8101888 / 100Mi
“cpu_system” on pods: 8m / 20m
resource cpu on pods (as a percentage of request): 0% (0) / 50%
Min replicas: 1
Max replicas: 10
Conditions:
Type Status Reason Message
—- —— —— ——-
AbleToScale True SucceededRescale the HPA controller was able to update the target scale to 1
ScalingActive True ValidMetricFound the HPA was able to successfully calculate a replica count from memory resource
ScalingLimited False DesiredWithinRange the desired count is within the acceptable range
Events:
Type Reason Age From Message
—- —— —- —- ——-
Normal SuccessfulRescale 10m horizontal-pod-autoscaler New size: 2; reason: cpu resource utilization (percentage of request) above target
Normal SuccessfulRescale 8m horizontal-pod-autoscaler New size: 3; reason: pods metric cpu_system above target
Normal SuccessfulRescale 5m horizontal-pod-autoscaler New size: 4; reason: cpu resource utilization (percentage of request) above target
Normal SuccessfulRescale 13s horizontal-pod-autoscaler New size: 1; reason: All metrics below target

kubectl get pods
NAME READY STATUS RESTARTS AGE
hello-world-54764dfbf8-q6l82 1/1 Running 0 6h

Conclusion

We've seen how Kubernetes HPA can be used on Rancher to autoscale your deployments up and down. It's a very useful feature for matching a deployment's scale to the real service load and for meeting service SLAs.

We've also seen how horizontal-pod-autoscaler-downscale-delay (5m by default) and horizontal-pod-autoscaler-upscale-delay (3m by default) can be configured on the kube-controller-manager to adjust how quickly scaling up and down reacts; a configuration sketch follows below.
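
On an RKE-provisioned cluster, a minimal sketch of passing those flags to the kube-controller-manager through cluster.yml could look like the snippet below; the flag names match the controller-manager of this Kubernetes generation (later versions replace them with other HPA settings), so treat it as illustrative rather than definitive:

services:
  kube-controller:
    extra_args:
      # scale-up re-evaluation interval (default 3m0s)
      horizontal-pod-autoscaler-upscale-delay: "3m0s"
      # scale-down cool-down interval (default 5m0s)
      horizontal-pod-autoscaler-downscale-delay: "5m0s"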

For our custom metric we used cpu_system as the example, but you could use any metric that is exported to Prometheus and is meaningful for your service's performance, like http_request_number, http_response_time, etc.

To facilitate HPA use, we are working to integrate metrics-server as an addon to RKE cluster deployments. It's already included in RKE v0.1.9-rc2 for testing, but not officially supported yet. It will be officially supported in RKE v0.1.9.

Raul Sanchez

DevOps Lead

Source

Feature Highlight: CPU Manager – Kubernetes

Authors: Balaji Subramaniam (Intel), Connor Doyle (Intel)

This blog post describes the CPU Manager, a beta feature in Kubernetes. The CPU manager feature enables better placement of workloads in the Kubelet, the Kubernetes node agent, by allocating exclusive CPUs to certain pod containers.

[Figure: CPU Manager]

Sounds Good! But Does the CPU Manager Help Me?

It depends on your workload. A single compute node in a Kubernetes cluster can run many pods and some of these pods could be running CPU-intensive workloads. In such a scenario, the pods might contend for the CPU resources available in that compute node. When this contention intensifies, the workload can move to different CPUs depending on whether the pod is throttled and the availability of CPUs at scheduling time. There might also be cases where the workload could be sensitive to context switches. In all the above scenarios, the performance of the workload might be affected.

If your workload is sensitive to such scenarios, then CPU Manager can be enabled to provide better performance isolation by allocating exclusive CPUs for your workload.

CPU manager might help workloads with the following characteristics:

  • Sensitive to CPU throttling effects.
  • Sensitive to context switches.
  • Sensitive to processor cache misses.
  • Benefits from sharing processor resources (e.g., data and instruction caches).
  • Sensitive to cross-socket memory traffic.
  • Sensitive to, or requires, hyperthreads from the same physical CPU core.

Ok! How Do I use it?

Using the CPU manager is simple. First, enable CPU manager with the Static policy in the Kubelet running on the compute nodes of your cluster. Then configure your pod to be in the Guaranteed Quality of Service (QoS) class. Request whole numbers of CPU cores (e.g., 1000m, 4000m) for containers that need exclusive cores. Create your pod in the same way as before (e.g., kubectl create -f pod.yaml). And voilà, the CPU manager will assign exclusive CPUs to each container in the pod according to its CPU requests.
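
As a rough sketch (not from the original post), the kubelet settings involved look like the flags below; the static policy also expects a non-zero CPU reservation for system and Kubernetes daemons, and switching policies requires removing the old cpu_manager_state file before restarting the kubelet:

--feature-gates=CPUManager=true   # beta and enabled by default since Kubernetes 1.10
--cpu-manager-policy=static       # the policy discussed in this post; the default is "none"
--kube-reserved=cpu=500m          # example reservation; the static policy needs some CPU reserved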

apiVersion: v1
kind: Pod
metadata:
  name: exclusive-2
spec:
  containers:
  - image: quay.io/connordoyle/cpuset-visualizer
    name: exclusive-2
    resources:
      # Pod is in the Guaranteed QoS class because requests == limits
      requests:
        # CPU request is an integer
        cpu: 2
        memory: "256M"
      limits:
        cpu: 2
        memory: "256M"

Pod specification requesting two exclusive CPUs.

Hmm … How Does the CPU Manager Work?

For Kubernetes, and the purposes of this blog post, we will discuss three kinds of CPU resource controls available in most Linux distributions. The first two are CFS shares (what’s my weighted fair share of CPU time on this system) and CFS quota (what’s my hard cap of CPU time over a period). The CPU manager uses a third control called CPU affinity (on what logical CPUs am I allowed to execute).

By default, all the pods and the containers running on a compute node of your Kubernetes cluster can execute on any available cores in the system. The total amount of allocatable shares and quota are limited by the CPU resources explicitly reserved for kubernetes and system daemons. However, limits on the CPU time being used can be specified using CPU limits in the pod spec. Kubernetes uses CFS quota to enforce CPU limits on pod containers.

When CPU manager is enabled with the “static” policy, it manages a shared pool of CPUs. Initially this shared pool contains all the CPUs in the compute node. When a container with integer CPU request in a Guaranteed pod is created by the Kubelet, CPUs for that container are removed from the shared pool and assigned exclusively for the lifetime of the container. Other containers are migrated off these exclusively allocated CPUs.

All non-exclusive-CPU containers (Burstable, BestEffort and Guaranteed with non-integer CPU) run on the CPUs remaining in the shared pool. When a container with exclusive CPUs terminates, its CPUs are added back to the shared CPU pool.
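
One quick, hedged way to observe the resulting pinning from inside a container (assuming its image ships grep) is to read the CPU affinity of its init process; for the exclusive-2 pod shown above, Cpus_allowed_list should contain exactly two CPU IDs:

kubectl exec exclusive-2 -- grep Cpus_allowed_list /proc/1/status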

More Details Please …

[Figure: anatomy of the CPU Manager]

The figure above shows the anatomy of the CPU manager. The CPU Manager uses the Container Runtime Interface’s UpdateContainerResources method to modify the CPUs on which containers can run. The Manager periodically reconciles the current State of the CPU resources of each running container with cgroupfs.

The CPU Manager uses Policies to decide the allocation of CPUs. There are two policies implemented: None and Static. By default, the CPU manager is enabled with the None policy from Kubernetes version 1.10.

The Static policy allocates exclusive CPUs to pod containers in the Guaranteed QoS class which request integer CPUs. On a best-effort basis, the Static policy tries to allocate CPUs topologically in the following order:

  1. Allocate all the CPUs in the same processor socket if available and the container requests at least an entire socket worth of CPUs.
  2. Allocate all the logical CPUs (hyperthreads) from the same physical CPU core if available and the container requests an entire core worth of CPUs.
  3. Allocate any available logical CPU, preferring to acquire CPUs from the same socket.

How is Performance Isolation Improved by CPU Manager?

With CPU manager static policy enabled, the workloads might perform better due to one of the following reasons:

  1. Exclusive CPUs can be allocated for the workload container but not the other containers. These containers do not share the CPU resources. As a result, we expect better performance due to isolation when an aggressor or a co-located workload is involved.
  2. There is a reduction in interference between the resources used by the workload since we can partition the CPUs among workloads. These resources might also include the cache hierarchies and memory bandwidth and not just the CPUs. This helps improve the performance of workloads in general.
  3. CPU Manager allocates CPUs in a topological order on a best-effort basis. If a whole socket is free, the CPU Manager will exclusively allocate the CPUs from the free socket to the workload. This boosts the performance of the workload by avoiding any cross-socket traffic.
  4. Containers in Guaranteed QoS pods are subject to CFS quota. Very bursty workloads may get scheduled, burn through their quota before the end of the period, and get throttled. During this time, there may or may not be meaningful work to do with those CPUs. Because of how the resource math lines up between CPU quota and number of exclusive CPUs allocated by the static policy, these containers are not subject to CFS throttling (quota is equal to the maximum possible cpu-time over the quota period).

Ok! Ok! Do You Have Any Results?

Glad you asked! To understand the performance improvement and isolation provided by enabling the CPU Manager feature in the Kubelet, we ran experiments on a dual-socket compute node (Intel Xeon CPU E5-2680 v3) with hyperthreading enabled. The node consists of 48 logical CPUs (24 physical cores each with 2-way hyperthreading). Here we demonstrate the performance benefits and isolation provided by the CPU Manager feature using benchmarks and real-world workloads for three different scenarios.

How Do I Interpret the Plots?

For each scenario, we show box plots that illustrate the normalized execution time and its variability when running a benchmark or real-world workload with and without the CPU Manager enabled. The execution time of the runs is normalized to the best-performing run (1.00 on the y-axis represents the best performing run; lower is better). The height of the box plot shows the variation in performance. For example, if the box plot is a line, then there is no variation in performance across runs. In the box, the middle line is the median, the upper line is the 75th percentile and the lower line is the 25th percentile. The height of the box (i.e., the difference between the 75th and 25th percentiles) is defined as the interquartile range (IQR). Whiskers show data outside that range and the points show outliers. The outliers are defined as any data 1.5x IQR below or above the lower or upper quartile respectively. Every experiment is run ten times.

Protection from Aggressor Workloads

We ran six benchmarks from the PARSEC benchmark suite (the victim workloads) co-located with a CPU stress container (the aggressor workload) with and without the CPU Manager feature enabled. The CPU stress container is run as a pod in the Burstable QoS class requesting 23 CPUs with the --cpus 48 flag. The benchmarks are run as pods in the Guaranteed QoS class requesting a full socket worth of CPUs (24 CPUs on this system). The figure below plots the normalized execution time of running a benchmark pod co-located with the stress pod, with and without the CPU Manager static policy enabled. We see improved performance and reduced performance variability when static policy is enabled for all test cases.

[Figure: normalized execution time with and without the CPU Manager static policy]

Performance Isolation for Co-located Workloads

In this section, we demonstrate how CPU manager can be beneficial to multiple workloads in a co-located workload scenario. In the box plots below we show the performance of two benchmarks (Blackscholes and Canneal) from the PARSEC benchmark suite run in the Guaranteed (Gu) and Burstable (Bu) QoS classes co-located with each other, with and without the CPU manager static policy enabled.

Starting from the top left and proceeding clockwise, we show the performance of Blackscholes in the Bu QoS class (top left), Canneal in the Bu QoS class (top right), Canneal in the Gu QoS class (bottom right) and Blackscholes in the Gu QoS class (bottom left), respectively. In each case, they are co-located with Canneal in the Gu QoS class (top left), Blackscholes in the Gu QoS class (top right), Blackscholes in the Bu QoS class (bottom right) and Canneal in the Bu QoS class (bottom left), respectively. For example, the Bu-blackscholes-Gu-canneal plot (top left) shows the performance of Blackscholes running in the Bu QoS class when co-located with Canneal running in the Gu QoS class. In each case, the pod in the Gu QoS class requests cores worth a whole socket (i.e., 24 CPUs) and the pod in the Bu QoS class requests 23 CPUs.

There is better performance and less performance variation for both the co-located workloads in all the tests. For example, consider the case of Bu-blackscholes-Gu-canneal (top left) and Gu-canneal-Bu-blackscholes (bottom right). They show the performance of Blackscholes and Canneal run simultaneously with and without the CPU manager enabled. In this particular case, Canneal gets exclusive cores due to CPU manager since it is in the Gu QoS class and requesting integer number of CPU cores. But Blackscholes also gets exclusive set of CPUs as it is the only workload in the shared pool. As a result, both Blackscholes and Canneal get some performance isolation benefits due to the CPU manager.

[Figure: performance comparison of co-located workloads]

Performance Isolation for Stand-Alone Workloads

This section shows the performance improvement and isolation provided by the CPU manager for stand-alone real-world workloads. We use two workloads from the TensorFlow official models: wide and deep and ResNet. We use the census and CIFAR10 datasets for the wide and deep and ResNet models respectively. In each case the pods (wide and deep, ResNet) request 24 CPUs, which corresponds to a whole socket worth of cores. As shown in the plots, the CPU manager enables better performance isolation in both cases.

[Figure: performance comparison of stand-alone workloads]

Limitations

  • Users might want to get CPUs allocated on the socket near to the bus which connects to an external device, such as an accelerator or high-performance network card, in order to avoid cross-socket traffic. This type of alignment is not yet supported by the CPU manager.
  • Since the CPU manager provides a best-effort allocation of CPUs belonging to a socket and physical core, it is susceptible to corner cases and might lead to fragmentation.
  • The CPU manager does not take the isolcpus Linux kernel boot parameter into account, although this is reportedly common practice for some low-jitter use cases.

Acknowledgements

We thank the members of the community who have contributed to this feature or given feedback including members of WG-Resource-Management and SIG-Node.
cmx.io (for the fun drawing tool).

Notices and Disclaimers

Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more information go to www.intel.com/benchmarks.

Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Performance varies depending on system configuration. No computer system can be absolutely secure. Check with your system manufacturer or retailer or learn more at intel.com.

Workload Configuration:
https://gist.github.com/balajismaniam/fac7923f6ee44f1f36969c29354e3902
https://gist.github.com/balajismaniam/7c2d57b2f526a56bb79cf870c122a34c
https://gist.github.com/balajismaniam/941db0d0ec14e2bc93b7dfe04d1f6c58
https://gist.github.com/balajismaniam/a1919010fe9081ca37a6e1e7b01f02e3
https://gist.github.com/balajismaniam/9953b54dd240ecf085b35ab1bc283f3c

System Configuration:
CPU
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 48
On-line CPU(s) list: 0-47
Thread(s) per core: 2
Core(s) per socket: 12
Socket(s): 2
NUMA node(s): 2
Vendor ID: GenuineIntel
Model name: Intel® Xeon® CPU E5-2680 v3
Memory
256 GB
OS/Kernel
Linux 3.10.0-693.21.1.el7.x86_64

Intel, the Intel logo, Xeon are trademarks of Intel Corporation or its subsidiaries in the U.S. and/or other countries.
*Other names and brands may be claimed as the property of others.
© Intel Corporation.

Source

Roundup – @JetstackHQ's Tuesday Twitter Tips for Kubernetes

By Matt Bates

Last year we ran a successful series of Kubernetes tips shared via Twitter: it was called Tuesday Tips. Following a bit of a hiatus, we'd like to bring it back. We're starting with a roundup of our previous tips (those that are still valid, anyway!).

This blog post compiles a summary of them and ranks them according to popularity. Looking back, it's amazing how much the project has changed, so we're exploring new features and running another series.

First time around the top tip was:

#1 Software engineers love shell auto-completion because it saves time and keystrokes – this tweet shows how to enable it for the kubectl command.

Add kubectl shell auto-completion for bash/zsh in 1.3+ by sourcing kubectl completion . #tuesdaytip #k8s https://t.co/bb5s6J9NZN

— Jetstack (@JetstackHQ) July 26, 2016

#2 You don’t have to do anything special to get your service distributed across nodes.

Create a service prior to a RC/RS and pods will spread across nodes. The default scheduler has service anti-affinity. #tuesdaytip

— Jetstack (@JetstackHQ) June 21, 2016

#3 We showed you a new and easy way to spin up a Job.

Use kubectl run with '--restart=Never' and it'll spin up a Job (vs Deployment+RS with default restart policy of Always) #tuesdaytip

— Jetstack (@JetstackHQ) June 7, 2016

#4 You need to be able to access certain types of pod using a predictable network identity – you can declare the DNS entry using PodSpec. This was a two-part tip – we gave you the annotations to achieve this in the previous version too.

As of #k8s v1.3, you can modify a pod’s hostname and subdomain via new field in the PodSpec: https://t.co/Gw8br7Y1Dg #tuesdaytip

— Jetstack (@JetstackHQ) July 12, 2016

To achieve the same behaviour in 1.2 you can use the https://t.co/TbFWey17Ha + https://t.co/V1zOJ2SA4S annotations #tuesdaytip

— Jetstack (@JetstackHQ) July 12, 2016

#5 A bash one-liner for copying resources from one namespace to another. This deserved to place higher.

Use kubectl’s standard streams to easily copy resources across namespaces #K8s #TuesdayTip https://t.co/zDZCPUjkeG pic.twitter.com/kKr3VRN4t2

— Jetstack (@JetstackHQ) August 2, 2016

#6 DaemonSets run on all nodes – even those where scheduling is disabled for maintenance.

Nodes that are marked with “SchedulingDisabled” will still run pods from DaemonSets #TuesdayTip #Kubernetes

— Jetstack (@JetstackHQ) August 9, 2016

#7 Add a record of what has been done to your resource annotation.

kubectl has a --record flag to store create/update commands as a resource annotation, useful for introspection #tuesdaytip #kubernetes

— Jetstack (@JetstackHQ) August 23, 2016

#8 This is an important one – use kubectl drain to decommission nodes prior to maintenance (but see #6!).

Use kubectl drain to decommission a #k8s node prior to upgrade/maintenance; cordons the node (unschedulable) + deletes all pods #tuesdaytip

— Jetstack (@JetstackHQ) June 14, 2016

#9 A guest tweet and a valuable one.

This week’s #Kubernetes #TuesdayTip, courtesy of @asynchio!! https://t.co/jwnGItvf74

— Jetstack (@JetstackHQ) August 16, 2016

#10 Last, but certainly not least, as it’s still really useful for keeping track of your infrastructure.

Add your own custom labels to nodes using kubelet param --node-labels= Eg use a node-label for role (master/worker) #tuesdaytip #kubernetes

— Jetstack (@JetstackHQ) July 5, 2016

Source

Building a CI/CD Pipeline with Kubernetes using Auto Devops, Rancher, and Gitlab

Build a CI/CD Pipeline with Kubernetes and Rancher 2.0

Recorded Online Meetup of best practices and tools for building pipelines with containers and kubernetes.

Watch the training

This blog coincides with the Rancher Labs Online Meetup of August 2018 covering Continuous Integration/Continuous Delivery (CI/CD). This topic is becoming increasingly important in a world where services are getting smaller and updated more frequently. CI/CD allows companies to completely automate the building, testing and deployment of code, ensuring that it is done in a consistent, repeatable way.

There are several different CI/CD tools available, and a number of them integrate with Kubernetes natively.

This blog is going to cover CI/CD using the hosted GitLab.com solution, but the Kubernetes integrations covered here are generic and should work with any CI/CD provider that interfaces directly with Kubernetes using a service account.

Prerequisites

  1. Rancher 2.0 cluster for deploying workloads to.
  2. gitlab.com login.

For the purposes of this blog we are going to use one of the templates that GitLab provides. The first step is to log in to gitlab.com via https://gitlab.com/users/sign_in

Create a Project

The first step is to create a project:

  • Click New project.
  • Choose the Create from template tab.
  • Click use template under Ruby on Rails
  • Set Project name.
  • Click Create Project.
  • Wait for the project to finish importing.

Add your Kubernetes endpoint to your project

Under Operations choose Kubernetes

Click Kubernetes cluster, then choose Add existing cluster

All of the above fields will need to be filled in; the following sections detail how to fill them in.

API URL

The API URL is the URL that GitLab will use to talk to the Kubernetes API in your cluster that is going to be used to deploy your workloads. Depending on where your Kubernetes cluster is running you will need to ensure that the port is open to allow for communication from gitlab.com to the <address>:<port> of the Kubernetes cluster.

In order to retrieve the API URL, we are going to run a script on the Rancher Server that controls your Kubernetes cluster. This will generate a kubeconfig file that contains the information we need to configure the Kubernetes settings in GitLab.

  • Log onto the server that is running your Rancher Server container.
  • Download the content of get_kubeconfig_custom_cluster_rancher2.sh from https://gist.github.com/superseb/f6cd637a7ad556124132ca39961789a4
  • Create a file on the server and paste the contents into it.
  • Make the file executable chmod +x <filename>.
  • Run the script as ./<filename> <name_of_cluster_to_deploy_to>
  • This will generate a kubeconfig file in the local directory.
  • Run cat kubeconfig | grep server:
  • The https value is what needs to be populated into the API URL field on the Add existing cluster form.

CA Certificate

The CA certificate is required because these are often custom certificates that aren't in the certificate store of the GitLab server; providing it allows the connection to be secured.

From the folder containing the kubeconfig generated in the API URL instructions:

  • Run cat kubeconfig | grep certificate-authority-data.

This gives you a base64-encoded certificate string, but the field in GitLab requires it in PEM format.

  • Save the contents of the encoded string to a file, e.g. cert.base64.
  • Run base64 -d cert.base64 > cert.pem.
  • This will give you the certificate in PEM format, which you can then copy and paste into the CA Certificate field in GitLab.
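
If you prefer a single command, the same conversion can be done in one line (this assumes the kubeconfig generated earlier is in the current directory and a GNU base64 that accepts -d is available):

cat kubeconfig | grep certificate-authority-data | awk '{print $2}' | base64 -d > cert.pem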

Token

For the gitlab.com instance to be able to talk to the cluster we are going to create a service account that it will use. We are also going to create a namespace for GitLab to deploy the apps to.

To simplify this, I've put all of it into a single manifest file, which can be viewed at http://x.co/rm082018

For this to create the necessary prerequisites you will need to run the following command:

kubectl apply -f http://x.co/rm082018 (optionally add --kubeconfig <kubeconfig> if you want to use a cluster other than the default one specified in your .kube/config file)

This will create a service account and create a token for it, which is the token that we need to specify in the GitLab Kubernetes configuration pane.
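
The linked file is not reproduced here, but broadly speaking a manifest like it needs a namespace, a service account, a binding that gives that account enough rights to deploy, and a service-account token secret. The sketch below is purely illustrative: apart from gitlab-managed-apps and gitlab-secret, which appear later in this post, every name and the exact permissions are assumptions, and the file behind the link remains authoritative.

apiVersion: v1
kind: Namespace
metadata:
  name: gitlab-managed-apps
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: gitlab                        # assumed name
  namespace: gitlab-managed-apps
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: gitlab-admin                  # assumed name
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin                 # assumed permission level
subjects:
- kind: ServiceAccount
  name: gitlab
  namespace: gitlab-managed-apps
---
apiVersion: v1
kind: Secret
metadata:
  name: gitlab-secret                 # matches the secret queried below
  namespace: gitlab-managed-apps
  annotations:
    kubernetes.io/service-account.name: gitlab
type: kubernetes.io/service-account-token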

To obtain the token, execute:

kubectl describe secrets/gitlab-secret -n gitlab-managed-apps | grep token:

Copy the token and paste that into the GitLab configuration

Project Namespace

If you have followed this blog so far and have applied my Kubernetes manifest file, then you will need to set the Project Namespace to gitlab-managed-apps. If you have modified the manifest, then you will need to set this to reflect the namespace that you have chosen.

Rancher Server 2.0 setup

As part of the GitLab template projects, a PostgreSQL pod is deployed. This means that you will need a dynamic storage provisioner; if you don't have one set up, go into the catalog on the cluster that you are going to deploy to and launch the Library NFS provisioner. This isn't recommended for production use, but it will get the Auto DevOps function working.

Enabling Auto DevOps

In the GitLab UI, go into the Settings – CI/CD and expand Auto DevOps

Click the Enable Auto DevOps radio button

In the Domain section it requires you to specify the DNS name that will be used to reach the service that is going to be deployed. The DNS name should point to the ingress on the cluster that the service will be deployed to. For testing you can just use <host-ip>.nip.io which will resolve to the host ip that is specified.

This will not give resilience and will only allow http, not https, but again this is enough to show it working; if you want to use https then you would need to add a wildcard certificate to the ingress controller.

Click Save changes and this should automatically launch you a pipeline and start a job running.

You can go into CI/CD – Pipelines to see the progress.

At the end of the production phase you should see the http address that you can access your application on.

Hopefully this has allowed you to successfully deploy a nice demonstration CI/CD pipeline. As I stated at the start the Kubernetes piece should work for the majority of CI/CD Kubernetes integrations.

Chris Urwin

UK Technical Lead

Source

KubeVirt: Extending Kubernetes with CRDs for Virtualized Workloads

Author: David Vossel (Red Hat)

What is KubeVirt?

KubeVirt is a Kubernetes addon that provides users the ability to schedule traditional virtual machine workloads side by side with container workloads. Through the use of Custom Resource Definitions (CRDs) and other Kubernetes features, KubeVirt seamlessly extends existing Kubernetes clusters to provide a set of virtualization APIs that can be used to manage virtual machines.

Why Use CRDs Over an Aggregated API Server?

Back in the middle of 2017, those of us working on KubeVirt were at a crossroads. We had to make a decision whether or not to extend Kubernetes using an aggregated API server or to make use of the new Custom Resource Definitions (CRDs) feature.

At the time, CRDs lacked much of the functionality we needed to deliver our feature set. The ability to create our own aggregated API server gave us all the flexibility we needed, but it had one major flaw. An aggregated API server significantly increased the complexity involved with installing and operating KubeVirt.

The crux of the issue for us was that aggregated API servers required access to etcd for object persistence. This meant that cluster admins would have to either accept that KubeVirt needs a separate etcd deployment which increases complexity, or provide KubeVirt with shared access to the Kubernetes etcd store which introduces risk.

We weren’t okay with this tradeoff. Our goal wasn’t to just extend Kubernetes to run virtualization workloads, it was to do it in the most seamless and effortless way possible. We felt that the added complexity involved with an aggregated API server sacrificed the part of the user experience involved with installing and operating KubeVirt.

Ultimately we chose to go with CRDs and trust that the Kubernetes ecosystem would grow with us to meet the needs of our use case. Our bets were well placed. At this point there are either solutions in place or solutions under discussion that solve every feature gap we encountered back in 2017 when we were evaluating CRDs vs an aggregated API server.

Building Layered “Kubernetes like” APIs with CRDs

We designed KubeVirt’s API to follow the same patterns users are already familiar with in the Kubernetes core API.

For example, in Kubernetes the lowest level unit that users create to perform work is a Pod. Yes, Pods do have multiple containers but logically the Pod is the unit at the bottom of the stack. A Pod represents a mortal workload. The Pod gets scheduled, eventually the Pod’s workload terminates, and that’s the end of the Pod’s lifecycle.

Workload controllers such as the ReplicaSet and StatefulSet are layered on top of the Pod abstraction to help manage scale out and stateful applications. From there we have an even higher level controller called a Deployment, which is layered on top of ReplicaSets to help manage things like rolling updates.

In KubeVirt, this concept of layering controllers is at the very center of our design. The KubeVirt VirtualMachineInstance (VMI) object is the lowest level unit at the very bottom of the KubeVirt stack. Similar in concept to a Pod, a VMI represents a single mortal virtualized workload that executes once until completion (powered off).

Layered on top of VMIs we have a workload controller called a VirtualMachine (VM). The VM controller is where we really begin to see the differences between how users manage virtualized workloads vs containerized workloads. Within the context of existing Kubernetes functionality, the best way to describe the VM controller’s behavior is to compare it to a StatefulSet of size one. This is because the VM controller represents a single stateful (immortal) virtual machine capable of persisting state across both node failures and multiple restarts of its underlying VMI. This object behaves in the way that is familiar to users who have managed virtual machines in AWS, GCE, OpenStack or any other similar IaaS cloud platform. The user can shutdown a VM, then choose to start that exact same VM up again at a later time.

In addition to VMs, we also have a VirtualMachineInstanceReplicaSet (VMIRS) workload controller which manages scale out of identical VMI objects. This controller behaves nearly identically to the Kubernetes ReplicaSet controller, the primary difference being that the VMIRS manages VMI objects and the ReplicaSet manages Pods. Wouldn't it be nice if we could come up with a way to use the Kubernetes ReplicaSet controller to scale out CRDs?

Each one of these KubeVirt objects (VMI, VM, VMIRS) are registered with Kubernetes as a CRD when the KubeVirt install manifest is posted to the cluster. By registering our APIs as CRDs with Kubernetes, all the tooling involved with managing Kubernetes clusters (like kubectl) have access to the KubeVirt APIs just as if they are native Kubernetes objects.
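
To make that concrete, here is an illustrative VirtualMachineInstance manifest. The API version and field names follow the KubeVirt release of roughly this era and may differ in later releases, so treat it as a sketch rather than a reference:

apiVersion: kubevirt.io/v1alpha2
kind: VirtualMachineInstance
metadata:
  name: testvmi
spec:
  domain:
    resources:
      requests:
        memory: 64M
    devices:
      disks:
      - name: registrydisk
        volumeName: registryvolume              # must reference a volume defined below
        disk:
          bus: virtio
  volumes:
  - name: registryvolume
    registryDisk:
      image: kubevirt/cirros-registry-disk-demo # demo image name, an assumption

Once posted with kubectl create -f, the object can be listed and inspected with the usual tooling, for example kubectl get vmis.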

Dynamic Webhooks for API Validation

One of the responsibilities of the Kubernetes API server is to intercept and validate requests prior to allowing objects to be persisted into etcd. For example, if someone tries to create a Pod using a malformed Pod specification, the Kubernetes API server immediately catches the error and rejects the POST request. This all occurs before the object is persistent into etcd preventing the malformed Pod specification from making its way into the cluster.

This validation occurs during a process called admission control. Until recently, it was not possible to extend the default Kubernetes admission controllers without altering code and compiling/deploying an entirely new Kubernetes API server. This meant that if we wanted to perform admission control on KubeVirt’s CRD objects while they are posted to the cluster, we’d have to build our own version of the Kubernetes API server and convince our users to use that instead. That was not a viable solution for us.

Using the new Dynamic Admission Control feature that first landed in Kubernetes 1.9, we now have a path for performing custom validation on the KubeVirt API through the use of a ValidatingAdmissionWebhook. This feature allows KubeVirt to dynamically register an HTTPS webhook with Kubernetes at KubeVirt install time. After registering the custom webhook, all requests related to KubeVirt API objects are forwarded from the Kubernetes API server to our HTTPS endpoint for validation. If our endpoint rejects a request for any reason, the object will not be persisted into etcd and the client receives our response outlining the reason for the rejection.
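Registration happens through a ValidatingWebhookConfiguration object, roughly along the lines of the sketch below. The webhook name matches the one shown in the error output further down, while the service name, namespace, and path are assumptions made for illustration.

apiVersion: admissionregistration.k8s.io/v1beta1
kind: ValidatingWebhookConfiguration
metadata:
  name: virtualmachine-validator.kubevirt.io
webhooks:
- name: virtualmachine-validator.kubevirt.io
  # Reject the request if the webhook endpoint cannot be reached.
  failurePolicy: Fail
  clientConfig:
    # The validating endpoint runs as an in-cluster service (names illustrative).
    service:
      namespace: kube-system
      name: virt-api
      path: /virtualmachines-validate
    caBundle: "" # base64-encoded CA certificate for the serving cert (omitted here)
  rules:
  - apiGroups: ["kubevirt.io"]
    apiVersions: ["v1alpha2"]
    operations: ["CREATE", "UPDATE"]
    resources: ["virtualmachines", "virtualmachineinstances"]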

For example, if someone posts a malformed VirtualMachine object, they’ll receive an error indicating what the problem is.

$ kubectl create -f my-vm.yaml
Error from server: error when creating "my-vm.yaml": admission webhook "virtualmachine-validator.kubevirt.io" denied the request: spec.template.spec.domain.devices.disks[0].volumeName 'registryvolume' not found.

In the example output above, that error response is coming directly from KubeVirt’s admission control webhook.

CRD OpenAPIv3 Validation

In addition to the validating webhook, KubeVirt also uses the ability to provide an OpenAPIv3 validation schema when registering a CRD with the cluster. While the OpenAPIv3 schema does not let us express some of the more advanced validation checks that the validation webhook provides, it does offer the ability to enforce simple validation checks involving things like required fields, max/min value lengths, and verifying that values are formatted in a way that matches a regular expression string.
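As a sketch, the schema slots into the CRD registration shown earlier under a validation stanza; the required field and the memory pattern below are purely illustrative, not KubeVirt's actual schema.

apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: virtualmachineinstances.kubevirt.io
spec:
  group: kubevirt.io
  version: v1alpha2
  scope: Namespaced
  names:
    kind: VirtualMachineInstance
    plural: virtualmachineinstances
  validation:
    openAPIV3Schema:
      properties:
        spec:
          # Simple structural checks only; richer logic stays in the webhook.
          required:
          - domain
          properties:
            domain:
              properties:
                resources:
                  properties:
                    requests:
                      properties:
                        memory:
                          type: string
                          pattern: '^[0-9]+(Mi|Gi|M|G)$'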

Dynamic Webhooks for “PodPreset Like” Behavior

The Kubernetes Dynamic Admission Control feature is not limited to validation logic; it also provides the ability for applications like KubeVirt to both intercept and mutate requests as they enter the cluster. This is achieved through the use of a MutatingAdmissionWebhook object. In KubeVirt, we are looking to use a mutating webhook to support our VirtualMachinePreset (VMPreset) feature.

A VMPreset acts in a similar way to a PodPreset. Just like a PodPreset allows users to define values that should automatically be injected into pods at creation time, a VMPreset allows users to define values that should be injected into VMs at creation time. Through the use of a mutating webhook, KubeVirt can intercept a request to create a VM, apply VMPresets to the VM spec, and then validate the resulting VM object. This all occurs before the VM object is persisted into etcd, which allows KubeVirt to immediately notify the user of any conflicts at the time the request is made.
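Registering the mutating hook looks almost identical to the validating configuration sketched earlier; only the kind and the endpoint change. As before, the names and path are illustrative assumptions rather than the exact ones KubeVirt registers.

apiVersion: admissionregistration.k8s.io/v1beta1
kind: MutatingWebhookConfiguration
metadata:
  name: virtualmachine-mutator.kubevirt.io
webhooks:
- name: virtualmachine-mutator.kubevirt.io
  clientConfig:
    service:
      namespace: kube-system
      name: virt-api
      path: /virtualmachines-mutate   # endpoint that applies VMPresets to the incoming spec
    caBundle: "" # base64-encoded CA certificate (omitted here)
  rules:
  - apiGroups: ["kubevirt.io"]
    apiVersions: ["v1alpha2"]
    operations: ["CREATE"]
    resources: ["virtualmachines"]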

Subresources for CRDs

When comparing the use of CRDs to an aggregated API server, one of the features CRDs lack is the ability to support subresources. Subresources are used to provide additional resource functionality. For example, the pod/logs and pod/exec subresource endpoints are used behind the scenes to provide the kubectl logs and kubectl exec command functionality.

Just like Kubernetes uses the pod/exec subresource to provide access to a pod’s environment, in KubeVirt we want subresources to provide serial-console, VNC, and SPICE access to a virtual machine. By adding virtual machine guest access through subresources, we can leverage RBAC to provide access control for these features.
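For instance, an RBAC Role along the lines of the sketch below could grant a user serial-console and VNC access without granting any other rights on the VMI objects; the resource names anticipate the aggregated subresource API group discussed in the next section, and the exact names are assumptions for illustration.

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: vmi-guest-access
  namespace: my-namespace
rules:
# Grant access only to the guest-access subresources, not to the VMI objects themselves.
- apiGroups: ["subresources.kubevirt.io"]
  resources: ["virtualmachineinstances/console", "virtualmachineinstances/vnc"]
  verbs: ["get"]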

So, given that the KubeVirt team decided to use CRDs instead of an aggregated API server for custom resource support, how can we have subresources for CRDs when the CRD feature explicitly does not support subresources?

We created a workaround for this limitation by implementing a stateless aggregated API server that exists only to serve subresource requests. With no state, we don’t have to worry about any of the issues we identified earlier with regards to access to etcd. This means the KubeVirt API is actually supported through a combination of both CRDs for resources and an aggregated API server for stateless subresources.

This isn't a perfect solution for us. Both aggregated API servers and CRDs require us to register an API GroupName with Kubernetes. This API GroupName field essentially namespaces the API's REST path in a way that prevents API naming conflicts with other third party applications. Because CRDs and aggregated API servers can't share the same GroupName, we have to register two separate GroupNames. One is used by our CRDs and the other is used by the aggregated API server for subresource requests.

Having two GroupNames in our API is slightly inconvenient because it means the REST paths for the endpoints that serve KubeVirt subresource requests have a slightly different base path than the resources.

For example, the endpoint to access a VMI object named my-vm is as follows.

/apis/kubevirt.io/v1alpha2/namespaces/my-namespace/virtualmachineinstances/my-vm

However, the subresource endpoint to access graphical VNC looks like this.

/apis/subresources.kubevirt.io/v1alpha2/namespaces/my-namespace/virtualmachineinstances/my-vm/vnc

Notice that the first request uses kubevirt.io and the second request uses subresources.kubevirt.io. We don't like that, but that's how we've managed to combine CRDs with a stateless aggregated API server for subresources.

One thing worth noting is that in Kubernetes 1.10 a very basic form of CRD subresource support was added in the form of the /status and /scale subresources. This support does not help us deliver the virtualization features we want subresources for. However, there have been discussions about exposing custom CRD subresources as webhooks in a future Kubernetes version. If this functionality lands, we will gladly transition away from our stateless aggregated API server workaround to use a subresource webhook feature.

CRD Finalizers

A CRD finalizer is a feature that lets us provide a pre-delete hook in order to perform actions before allowing a CRD object to be removed from persistent storage. In KubeVirt, we use finalizers to guarantee a virtual machine has completely terminated before we allow the corresponding VMI object to be removed from etcd.
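In practice, the finalizer is just an entry in the VMI's metadata that KubeVirt adds when the object is created and removes once teardown is complete, which is what finally allows the delete to go through. The fragment below is a sketch, and the finalizer name is an assumption for illustration.

apiVersion: kubevirt.io/v1alpha2
kind: VirtualMachineInstance
metadata:
  name: my-vm
  finalizers:
  # Deletion of this object is blocked until the KubeVirt controller confirms
  # the virtual machine has fully terminated and removes this entry.
  - foregroundDeleteVirtualMachine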

API Versioning for CRDs

The Kubernetes core APIs have the ability to support multiple versions for a single object type and perform conversions between those versions. This gives the Kubernetes core APIs a path for advancing the v1alpha1 version of an object to a v1beta1 version and so forth.

Prior to Kubernetes 1.11, CRDs did not have support for multiple versions. This meant that when we wanted to progress a CRD from kubevirt.io/v1alpha1 to kubevirt.io/v1beta1, the only path available to us was to back up our CRD objects, delete the registered CRD from Kubernetes, register a new CRD with the updated version, convert the backed-up CRD objects to the new version, and finally post the migrated CRD objects back to the cluster.

That strategy was not exactly a viable option for us.

Fortunately, thanks to some recent work to rectify this issue in Kubernetes, the latest Kubernetes v1.11 now supports CRDs with multiple versions. Note however that this initial multi-version support is limited. While a CRD can now have multiple versions, the feature does not currently contain a path for performing conversions between versions. In KubeVirt, the lack of conversion makes it difficult for us to evolve our API as we progress versions. Luckily, support for conversions between versions is underway and we look forward to taking advantage of that feature once it lands in a future Kubernetes release.
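With Kubernetes 1.11, the CRD spec gains a versions list, so a registration with two versions looks roughly like the sketch below. The v1beta1 entry here is hypothetical, and as noted above the objects are simply served under both versions with no conversion performed between them.

apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: virtualmachineinstances.kubevirt.io
spec:
  group: kubevirt.io
  scope: Namespaced
  names:
    kind: VirtualMachineInstance
    plural: virtualmachineinstances
  versions:
  - name: v1alpha2
    served: true
    storage: true    # objects are persisted in etcd in this version
  - name: v1beta1    # hypothetical future version
    served: true
    storage: false   # served to clients, but not converted; schemas must match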
